Most Cited 2025 "covariance structure" Papers
Focusing on Tracks for Online Multi-Object Tracking
Kyujin Shim, Kangwook Ko, YuJin Yang et al.
Synthetic Prior for Few-Shot Drivable Head Avatar Inversion
Wojciech Zielonka, Stephan J. Garbin, Alexandros Lattas et al.
DiffFNO: Diffusion Fourier Neural Operator
Xiaoyi Liu, Hao Tang
Hypergraph Vision Transformers: Images are More than Nodes, More than Edges
Joshua Fixelle
How new data permeates LLM knowledge and how to dilute it
Chen Sun, Renat Aksitov, Andrey Zhmoginov et al.
Offline-to-Online Hyperparameter Transfer for Stochastic Bandits
Dravyansh Sharma, Arun Suggala
Not All Data Are Unlearned Equally
Aravind Krishnan, Siva Reddy, Marius Mosbach
Can a MISL Fly? Analysis and Ingredients for Mutual Information Skill Learning
Chongyi Zheng, Jens Tuyls, Joanne Peng et al.
Towards Unifying Evaluation of Counterfactual Explanations: Leveraging Large Language Models for Human-Centric Assessments
Marharyta Domnich, Julius Välja, Rasmus Moorits Veski et al.
Noise Calibration and Spatial-Frequency Interactive Network for STEM Image Enhancement
Hesong Li, Ziqi Wu, Ruiwen Shao et al.
Self-Discriminative Modeling for Anomalous Graph Detection
Jinyu Cai, Yunhe Zhang, Jicong Fan
DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability
Xirui Hu, Jiahao Wang, Hao Chen et al.
A General Framework for Producing Interpretable Semantic Text Embeddings
Yiqun Sun, Qiang Huang, Yixuan Tang et al.
DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models
Hyeonwoo Kim, Sangwon Baik, Hanbyul Joo
Conformal Prediction Sets Can Cause Disparate Impact
Jesse Cresswell, Bhargava Kumar, Yi Sui et al.
Unsupervised Audio-Visual Segmentation with Modality Alignment
Swapnil Bhosale, Haosen Yang, Diptesh Kanojia et al.
Solving Inverse Problems with FLAIR
Julius Erbach, Dominik Narnhofer, Andreas Dombos et al.
Unveiling Differences in Generative Models: A Scalable Differential Clustering Approach
Jingwei Zhang, Mohammad Jalali, Cheuk Ting Li et al.
OpenViewer: Openness-Aware Multi-View Learning
Shide Du, Zihan Fang, Yanchao Tan et al.
Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens
Ting-Ji Huang, Jia-Qi Yang, Chunxu Shen et al.
Data Unlearning in Diffusion Models
Silas Alberti, Kenan Hasanaliyev, Manav Shah et al.
Unlocking the Power of Function Vectors for Characterizing and Mitigating Catastrophic Forgetting in Continual Instruction Tuning
Gangwei Jiang, Caigao Jiang, Zhaoyi Li et al.
LoRA-One: One-Step Full Gradient Could Suffice for Fine-Tuning Large Language Models, Provably and Efficiently
Yuanhe Zhang, Fanghui Liu, Yudong Chen
PRE-Mamba: A 4D State Space Model for Ultra-High-Frequent Event Camera Deraining
Ciyu Ruan, Ruishan Guo, Zihang Gong et al.
Efficient Reward Poisoning Attacks on Online Deep Reinforcement Learning
Yinglun Xu, Qi Zeng, Gagandeep Singh
Federated Domain Generalization with Data-free On-server Matching Gradient
Binh Nguyen, Minh-Duong Nguyen, Jinsun Park et al.
Direct Alignment with Heterogeneous Preferences
Ali Shirali, Arash Nasr-Esfahany, Abdullah Alomar et al.
Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo
Shengyu Feng, Xiang Kong, Shuang Ma et al.
Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants
Lixiong Qin, Shilong Ou, Miaoxuan Zhang et al.
(Almost Full) EFX for Three (and More) Types of Agents
Pratik Ghosal, Vishwa Prakash HV, Prajakta Nimbhorkar et al.
Online Guidance Graph Optimization for Lifelong Multi-Agent Path Finding
Hongzhi Zang, Yulun Zhang, He Jiang et al.
AutoAdvExBench: Benchmarking Autonomous Exploitation of Adversarial Example Defenses
Nicholas Carlini, Edoardo Debenedetti, Javier Rando et al.
Re2LLM: Reflective Reinforcement Large Language Model for Session-based Recommendation
Ziyan Wang, Yingpeng Du, Zhu Sun et al.
UniPCGC: Towards Practical Point Cloud Geometry Compression via an Efficient Unified Approach
Kangli Wang, Wei Gao
Free Hunch: Denoiser Covariance Estimation for Diffusion Models Without Extra Costs
Severi Rissanen, Markus Heinonen, Arno Solin
Incomplete Multi-view Deep Clustering with Data Imputation and Alignment
Jiyuan Liu, Xinwang Liu, Xinhang Wan et al.
Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice
Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths
LLMs Encode Harmfulness and Refusal Separately
Jiachen Zhao, Jing Huang, Zhengxuan Wu et al.
PanTS: The Pancreatic Tumor Segmentation Dataset
Wenxuan Li, Xinze Zhou, Qi Chen et al.
A Closer Look at Multimodal Representation Collapse
Abhra Chaudhuri, Anjan Dutta, Tu Bui et al.
Ada-K Routing: Boosting the Efficiency of MoE-based LLMs
Zijia Zhao, Longteng Guo, Jie Cheng et al.
Diffusion-based Synthetic Data Generation for Visible-Infrared Person Re-Identification
Wenbo Dai, Lijing Lu, Zhihang Li
Compositional simulation-based inference for time series
Manuel Gloeckler, Shoji Toyota, Kenji Fukumizu et al.
Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting
Anand Bhattad, Konpat Preechakul, Alexei Efros
COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation
Xueqing Deng, Linjie Yang, Qihang Yu et al.
FAVOR-Bench: A Comprehensive Benchmark for Fine-Grained Video Motion Understanding
Chongjun Tu, Lin Zhang, Pengtao Chen et al.
An Item Is Worth a Prompt: Versatile Image Editing with Disentangled Control
Aosong Feng, Weikang Qiu, Jinbin Bai et al.
Gaussian Mixture Flow Matching Models
Hansheng Chen, Kai Zhang, Hao Tan et al.
RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training
Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.
MLZero: A Multi-Agent System for End-to-end Machine Learning Automation
Haoyang Fang, Boran Han, Nick Erickson et al.
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training
Mengru Wang, Xingyu Chen, Yue Wang et al.
A transfer learning framework for weak to strong generalization
Seamus Somerstep, Felipe Maia Polo, Moulinath Banerjee et al.
Deep Weight Factorization: Sparse Learning Through the Lens of Artificial Symmetries
Chris Kolb, Tobias Weber, Bernd Bischl et al.
Learning normalized image densities via dual score matching
Florentin Guth, Zahra Kadkhodaie, Eero Simoncelli
MoEQuant: Enhancing Quantization for Mixture-of-Experts Large Language Models via Expert-Balanced Sampling and Affinity Guidance
Zhixuan Chen, Xing Hu, Dawei Yang et al.
Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function
Maria-Florina Balcan, Anh Nguyen, Dravyansh Sharma
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers
Zeyuan Allen-Zhu
De-mark: Watermark Removal in Large Language Models
Ruibo Chen, Yihan Wu, Junfeng Guo et al.
Re-HOLD: Video Hand Object Interaction Reenactment via adaptive Layout-instructed Diffusion Model
Yingying Fan, Quanwei Yang, Kaisiyuan Wang et al.
STAR: Synthesis of Tailored Architectures
Armin Thomas, Rom Parnichkun, Alexander Amini et al.
Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation
François Rozet, Ruben Ohana, Michael McCabe et al.
Why Masking Diffusion Works: Condition on the Jump Schedule for Improved Discrete Diffusion
Alan Amin, Nate Gruver, Andrew Wilson
Sensor-Invariant Tactile Representation
Harsh Gupta, Yuchen Mo, Shengmiao Jin et al.
DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry
Jing Li, Yihang Fu, Falai Chen
INST-IT: Boosting Instance Understanding via Explicit Visual Prompt Instruction Tuning
Wujian Peng, Lingchen Meng, Yitong Chen et al.
Fair Submodular Cover
Wenjing Chen, Shuo Xing, Samson Zhou et al.
SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets
Yuhang Yang, Fengqi Liu, Yixing Lu et al.
VIoTGPT: Learning to Schedule Vision Tools Towards Intelligent Video Internet of Things
Yaoyao Zhong, Mengshi Qi, Rui Wang et al.
Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation
Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.
DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery
Jiadong Tang, Yu Gao, Dianyi Yang et al.
Bayesian Optimization of Antibodies Informed by a Generative Model of Evolving Sequences
Alan Amin, Nate Gruver, Yilun Kuang et al.
Adaptive Learn-then-Test: Statistically Valid and Efficient Hyperparameter Selection
Matteo Zecchin, Sangwoo Park, Osvaldo Simeone
Solving Token Gradient Conflict in Mixture-of-Experts for Large Vision-Language Model
Longrong Yang, Dong Shen, Chaoxiang Cai et al.
ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models
Heng Yin, Yuqiang Ren, Ke Yan et al.
SonicSim: A customizable simulation platform for speech processing in moving sound source scenarios
Kai Li, Wendi Sang, Chang Zeng et al.
CaPo: Cooperative Plan Optimization for Efficient Embodied Multi-Agent Cooperation
Jie Liu, Pan Zhou, Yingjun Du et al.
X-Fusion: Introducing New Modality to Frozen Large Language Models
Sicheng Mo, Thao Nguyen, Xun Huang et al.
MUNBa: Machine Unlearning via Nash Bargaining
Jing Wu, Mehrtash Harandi
Generating Physically Stable and Buildable Brick Structures from Text
Ava Pun, Kangle Deng, Ruixuan Liu et al.
TG-LLaVA: Text Guided LLaVA via Learnable Latent Embeddings
Dawei Yan, Pengcheng Li, Yang Li et al.
From Experts to a Generalist: Toward General Whole-Body Control for Humanoid Robots
Yuxuan Wang, Ming Yang, Gang Ding et al.
VIKI-R: Coordinating Embodied Multi-Agent Cooperation via Reinforcement Learning
Li Kang, Xiufeng Song, Heng Zhou et al.
BRiTE: Bootstrapping Reinforced Thinking Process to Enhance Language Model Reasoning
Han Zhong, Yutong Yin, Shenao Zhang et al.
HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts
Neil He, Rishabh Anand, Hiren Madhu et al.
The Effectiveness of Curvature-Based Rewiring and the Role of Hyperparameters in GNNs Revisited
Floriano Tori, Vincent Holst, Vincent Ginis
Injecting Universal Jailbreak Backdoors into LLMs in Minutes
Zhuowei Chen, Qiannan Zhang, Shichao Pei
PICO: Reconstructing 3D People In Contact with Objects
Alpár Cseke, Shashank Tripathi, Sai Kumar Dwivedi et al.
Mask-Adapter: The Devil is in the Masks for Open-Vocabulary Segmentation
Yongkang Li, Tianheng Cheng, Bin Feng et al.
VideoOrion: Tokenizing Object Dynamics in Videos
Yicheng Feng, Yijiang Li, Wanpeng Zhang et al.
Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics
Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.
From One to More: Contextual Part Latents for 3D Generation
Shaocong Dong, Lihe Ding, Xiao Chen et al.
How do Transformers Learn Implicit Reasoning?
Jiaran Ye, Zijun Yao, Zhidian Huang et al.
Expected Sliced Transport Plans
Xinran Liu, Rocio Diaz Martin, Yikun Bai et al.
Motion-adaptive Transformer for Event-based Image Deblurring
Senyan Xu, Zhijing Sun, Mingchen Zhong et al.
Geometry-aware RL for Manipulation of Varying Shapes and Deformable Objects
Tai Hoang, Huy Le, Philipp Becker et al.
Finer-CAM: Spotting the Difference Reveals Finer Details for Visual Explanation
Ziheng Zhang, Jianyang Gu, Arpita Chowdhury et al.
Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking
Paria Rashidinejad, Yuandong Tian
Dehaze-RetinexGAN: Real-World Image Dehazing via Retinex-based Generative Adversarial Network
Xinran Wang, Guang Yang, Tian Ye et al.
DELIFT: Data Efficient Language model Instruction Fine-Tuning
Ishika Agarwal, Krishnateja Killamsetty, Lucian Popa et al.
GaussMark: A Practical Approach for Structural Watermarking of Language Models
Adam Block, Alexander Rakhlin, Ayush Sekhari
Trace3D: Consistent Segmentation Lifting via Gaussian Instance Tracing
Hongyu Shen, Junfeng Ni, Weishuo Li et al.
VISION-XL: High Definition Video Inverse Problem Solver using Latent Image Diffusion Models
Taesung Kwon, Jong Ye
From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes
Long Ma, Zhiyuan Yan, Jin Xu et al.
NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping
Tianyi Wang, Shuaicheng Niu, Harry Cheng et al.
MammAlps: A Multi-view Video Behavior Monitoring Dataset of Wild Mammals in the Swiss Alps
Valentin Gabeff, Haozhe Qi, Brendan Flaherty et al.
Fast and Robust: Task Sampling with Posterior and Diversity Synergies for Adaptive Decision-Makers in Randomized Environments
Yun Qu, Cheems Wang, Yixiu Mao et al.
LIFe-GoM: Generalizable Human Rendering with Learned Iterative Feedback Over Multi-Resolution Gaussians-on-Mesh
Jing Wen, Alex Schwing, Shenlong Wang
Multi-clue Consistency Learning to Bridge Gaps Between General and Oriented Object in Semi-supervised Detection
Chenxu Wang, Chunyan Xu, Xiang Li et al.
Combining Cost Constrained Runtime Monitors for AI Safety
Tim Hua, James Baskerville, Henri Lemoine et al.
GTR: Guided Thought Reinforcement Prevents Thought Collapse in RL-based VLM Agent Training
Tong Wei, Yijun Yang, Junliang Xing et al.
Secant Line Search for Frank-Wolfe Algorithms
Deborah Hendrych, Sebastian Pokutta, Mathieu Besançon et al.
RealisHuman: A Two-Stage Approach for Refining Malformed Human Parts in Generated Images
Benzhi Wang, Jingkai Zhou, Jingqi Bai et al.
DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data
Ruiqi Wu, Xinjie Wang, Liu.Liu et al.
Modality-Specialized Synergizers for Interleaved Vision-Language Generalists
Zhiyang Xu, Minqian Liu, Ying Shen et al.
Fast and Slow Streams for Online Time Series Forecasting Without Information Leakage
Ying-yee Ava Lau, Zhiwen Shao, Dit-Yan Yeung
Learning Chaos In A Linear Way
Xiaoyuan Cheng, Yi He, Yiming Yang et al.
Effective and Efficient Masked Image Generation Models
Zebin You, Jingyang Ou, Xiaolu Zhang et al.
Do Visual Imaginations Improve Vision-and-Language Navigation Agents?
Akhil Perincherry, Jacob Krantz, Stefan Lee
PhysSplat: Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting
Haoyu Zhao, Hao Wang, Xingyue Zhao et al.
Data Taggants: Dataset Ownership Verification Via Harmless Targeted Data Poisoning
Wassim Bouaziz, Nicolas Usunier, El-Mahdi El-Mhamdi
ZIM: Zero-Shot Image Matting for Anything
Beomyoung Kim, Chanyong Shin, Joonhyun Jeong et al.
Temporal Heterogeneous Graph Generation with Privacy, Utility, and Efficiency
Xinyu He, Dongqi Fu, Hanghang Tong et al.
Autonomous LLM-Enhanced Adversarial Attack for Text-to-Motion
Honglei Miao, Fan Ma, Ruijie Quan et al.
Bundle Neural Network for message diffusion on graphs
Jacob Bamberger, Federico Barbero, Xiaowen Dong et al.
Differentially Private Steering for Large Language Model Alignment
Anmol Goel, Yaxi Hu, Iryna Gurevych et al.
AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer
Jin Lyu, Tianyi Zhu, Yi Gu et al.
Soft Reasoning: Navigating Solution Spaces in Large Language Models through Controlled Embedding Exploration
Qinglin Zhu, Runcong Zhao, Hanqi Yan et al.
HC-LLM: Historical-Constrained Large Language Models for Radiology Report Generation
Tengfei Liu, Jiapu Wang, Yongli Hu et al.
DataMan: Data Manager for Pre-training Large Language Models
Ru Peng, Kexin Yang, Yawen Zeng et al.
Can Textual Gradient Work in Federated Learning?
Minghui Chen, Ruinan Jin, Wenlong Deng et al.
DiC: Rethinking Conv3x3 Designs in Diffusion Models
Yuchuan Tian, Jing Han, Chengcheng Wang et al.
CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
Matan Rusanovsky, Or Hirschorn, Shai Avidan
Incomplete Modality Disentangled Representation for Ophthalmic Disease Grading and Diagnosis
Chengzhi Liu, Zile Huang, Zhe Chen et al.
LangSplatV2: High-dimensional 3D Language Gaussian Splatting with 450+ FPS
Wanhua Li, Yujie Zhao, Minghan Qin et al.
SfM-Free 3D Gaussian Splatting via Hierarchical Training
Bo Ji, Angela Yao
ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention
Bencheng Liao, Xinggang Wang, Lianghui Zhu et al.
Adversarial Generative Flow Network for Solving Vehicle Routing Problems
Ni Zhang, Jingfeng Yang, Zhiguang Cao et al.
Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models
Reza Shirkavand, Peiran Yu, Shangqian Gao et al.
DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding
Henry Zheng, Hao Shi, Qihang Peng et al.
SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent
Yandan Yang, Baoxiong Jia, Shujie Zhang et al.
Micro-macro Wavelet-based Gaussian Splatting for 3D Reconstruction from Unconstrained Images
Yihui Li, Chengxin Lv, Hongyu Yang et al.
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
Yang Qin, Chao Chen, Zhihang Fu et al.
Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection
Fanhu Zeng, Zhen Cheng, Fei Zhu et al.
Instruction-based Image Manipulation by Watching How Things Move
Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.
DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution
Zhengxue Wang, Zhiqiang Yan, Jinshan Pan et al.
MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities
Kunxi Li, Tianyu Zhan, Kairui Fu et al.
Feature Denoising Diffusion Model for Blind Image Quality Assessment
Xudong Li, Yan Zhang, Yunhang Shen et al.
Efficient stagewise pretraining via progressive subnetworks
Abhishek Panigrahi, Nikunj Saunshi, Kaifeng Lyu et al.
ForestFormer3D: A Unified Framework for End-to-End Segmentation of Forest LiDAR 3D Point Clouds
Binbin Xiang, Maciej Wielgosz, Stefano Puliti et al.
vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation
Bastian Wittmann, Yannick Wattenberg, Tamaz Amiranashvili et al.
Rashomon Sets for Prototypical-Part Networks: Editing Interpretable Models in Real-Time
Jon Donnelly, Zhicheng Guo, Alina Jade Barnett et al.
Beyond Sequence: Impact of Geometric Context for RNA Property Prediction
Junjie Xu, Artem Moskalev, Tommaso Mansi et al.
Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects
Weimin Qiu, Jieke Wang, Meng Tang
AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models
Run He, Kai Tong, Di Fang et al.
VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning
Ji Soo Lee, Jongha Kim, Jeehye Na et al.
EdgeTAM: On-Device Track Anything Model
Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.
Scaling Laws for Gradient Descent and Sign Descent for Linear Bigram Models under Zipf’s Law
Frederik Kunstner, Francis Bach
PhysX-3D: Physical-Grounded 3D Asset Generation
Ziang Cao, Zhaoxi Chen, Liang Pan et al.
LoR-VP: Low-Rank Visual Prompting for Efficient Vision Model Adaptation
Can Jin, Ying Li, Mingyu Zhao et al.
Near, far: Patch-ordering enhances vision foundation models' scene understanding
Valentinos Pariza, Mohammadreza Salehi, Gertjan J Burghouts et al.
Decoding Game: On Minimax Optimality of Heuristic Text Generation Strategies
Sijin Chen, Omar Hagrass, Jason Klusowski
Energy Matching: Unifying Flow Matching and Energy-Based Models for Generative Modeling
Michal Balcerak, Tamaz Amiranashvili, Antonio Terpin et al.
DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback
Zaid Khan, Elias Stengel-Eskin, Jaemin Cho et al.
On the Expressiveness of Rational ReLU Neural Networks With Bounded Depth
Gennadiy Averkov, Christopher Hojny, Maximilian Merkert
Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son, William Bankes, Sayak Ray Chowdhury et al.
UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion
Zixuan Chen, Yujin Wang, Xin Cai et al.
CONTRA: Conformal Prediction Region via Normalizing Flow Transformation
Zhenhan Fang, Aixin Tan, Jian Huang
StateSpaceDiffuser: Bringing Long Context to Diffusion World Models
Nedko Savov, Naser Kazemi, Deheng Zhang et al.
Information-Driven Design of Imaging Systems
Henry Pinkard, Leyla Kabuli, Eric Markley et al.
Chain-of-region: Visual Language Models Need Details for Diagram Analysis
Xue Li, Yiyou Sun, Wei Cheng et al.
RAD: Region-Aware Diffusion Models for Image Inpainting
Sora Kim, Sungho Suh, Minsik Lee
Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks
Lukas Braun, Erin Grant, Andrew Saxe
As large as it gets – Studying Infinitely Large Convolutions via Neural Implicit Frequency Filters
Margret Keuper, Julia Grabinski, Janis Keuper
Show and Segment: Universal Medical Image Segmentation via In-Context Learning
Yunhe Gao, Di Liu, Zhuowei Li et al.
Efficient Multi-modal Large Language Models via Progressive Consistency Distillation
Zichen Wen, Shaobo Wang, Yufa Zhou et al.
GeoSplatting: Towards Geometry Guided Gaussian Splatting for Physically-based Inverse Rendering
Kai Ye, Chong Gao, Guanbin Li et al.
MANTRA: The Manifold Triangulations Assemblage
Rubén Ballester, Ernst Roell, Daniel Bin Schmid et al.
BANet: Bilateral Aggregation Network for Mobile Stereo Matching
Gangwei Xu, Jiaxin Liu, Xianqi Wang et al.
Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models
Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.
DynamicVL: Benchmarking Multimodal Large Language Models for Dynamic City Understanding
Weihao Xuan, Junjue Wang, Heli Qi et al.
Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition
Chengxiang Huang, Yake Wei, Zequn Yang et al.
Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning
Joey Hong, Anca Dragan, Sergey Levine
V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer
Hangzhou He, Lei Zhu, Xinliang Zhang et al.
Orchid: Image Latent Diffusion for Joint Appearance and Geometry Generation
Akshay Krishnan, Xinchen Yan, Vincent Casser et al.
Quantum-PEFT: Ultra parameter-efficient fine-tuning
Toshiaki Koike-Akino, Francesco Tonin, Yongtao Wu et al.
Sparse2DGS: Geometry-Prioritized Gaussian Splatting for Surface Reconstruction from Sparse Views
Jiang Wu, Rui Li, Yu Zhu et al.
SAM-R1: Leveraging SAM for Reward Feedback in Multimodal Segmentation via Reinforcement Learning
Jiaqi Huang, Zunnan Xu, Jun Zhou et al.
TopoDiffusionNet: A Topology-aware Diffusion Model
Saumya Gupta, Dimitris Samaras, Chao Chen
SmartRAG: Jointly Learn RAG-Related Tasks From the Environment Feedback
Jingsheng Gao, Linxu Li, Ke Ji et al.
Joint Out-of-Distribution Filtering and Data Discovery Active Learning
Sebastian Schmidt, Leonard Schenk, Leo Schwinn et al.
Generative Zero-Shot Composed Image Retrieval
Lan Wang, Wei Ao, Vishnu Naresh Boddeti et al.
Interactive Speculative Planning: Enhance Agent Efficiency through Co-design of System and User Interface
Wenyue Hua, Mengting Wan, Jagannath Vadrevu et al.
Scaling Embedding Layers in Language Models
Da Yu, Edith Cohen, Badih Ghazi et al.
Whole-Body Conditioned Egocentric Video Prediction
Yutong Bai, Danny Tran, Amir Bar et al.
LOMA: Language-assisted Semantic Occupancy Network via Triplane Mamba
Yubo Cui, Zhiheng Li, Jiaqiang Wang et al.
SeqGrowGraph: Learning Lane Topology as a Chain of Graph Expansions
Mengwei Xie, Shuang Zeng, Xinyuan Chang et al.
SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning
Seokju Yun, Seunghye Chae, Dongheon Lee et al.
Federated Residual Low-Rank Adaption of Large Language Models
Yunlu Yan, Chun-Mei Feng, Wangmeng Zuo et al.
CoopTrack: Exploring End-to-End Learning for Efficient Cooperative Sequential Perception
Jiaru Zhong, Jiahao Wang, Jiahui Xu et al.
Revisiting Random Walks for Learning on Graphs
Jinwoo Kim, Olga Zaghen, Ayhan Suleymanzade et al.