Most Cited 2025 "causal perspective" Papers

ICCV 2025 arXiv:2503.21851

#11002

On Large Multimodal Models as Open-World Image Classifiers

Alessandro Conti, Massimiliano Mancini, Enrico Fini et al.

NEURIPS 2025 arXiv:2502.06536

#11003

Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions

Hidde Fokkema, Tim van Erven, Sara Magliacane

CVPR 2025 arXiv:2405.16226

#11004

Detecting Adversarial Data Using Perturbation Forgery

Qian Wang, Chen Li, Yuchen Luo et al.

NEURIPS 2025 oral arXiv:2510.20685

#11005

C-NAV: Towards Self-Evolving Continual Object Navigation in Open World

MingMing Yu, Fei Zhu, Wenzhuo Liu et al.

CVPR 2025 arXiv:2503.15211

#11006

GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector

Zechuan Li, Hongshan Yu, Yihao Ding et al.

NEURIPS 2025 arXiv:2411.13730

#11007

Replicable Online Learning

Saba Ahmadi, Siddharth Bhandari, Avrim Blum

NEURIPS 2025 arXiv:2505.14512

#11008

Just One Layer Norm Guarantees Stable Extrapolation

Juliusz Ziomek, George Whittle, Michael A Osborne

NEURIPS 2025 arXiv:2505.17257

#11009

JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model

Qihao Duan, Bingding Huang, Zhenqiao Song et al.

CVPR 2025 highlight arXiv:2502.20161

#11010

Balanced Rate-Distortion Optimization in Learned Image Compression

Yichi Zhang, Zhihao Duan, Yuning Huang et al.

ICCV 2025 arXiv:2506.23581

#11011

PBCAT: Patch-Based Composite Adversarial Training against Physically Realizable Attacks on Object Detection

Xiao Li, Yiming Zhu, Yifan Huang et al.

NEURIPS 2025 arXiv:2501.11866

#11012

Evaluating multiple models using labeled and unlabeled data

Divya Shanmugam, Shuvom Sadhuka, Manish Raghavan et al.

NEURIPS 2025 oral arXiv:2412.03054

#11013

TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception

Runjian Chen, Hyoungseob Park, Bo Zhang et al.

NEURIPS 2025 arXiv:2512.02914

#11014

Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning

Zhonghao He, Tianyi (Alex) Qiu, Hirokazu Shirado et al.

ICCV 2025 highlight arXiv:2507.23284

#11015

Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval

Dohwan Ko, Ji Soo Lee, Minhyuk Choi et al.

CVPR 2025 arXiv:2504.14860

#11016

Bridge the Gap: From Weak to Full Supervision for Temporal Action Localization with PseudoFormer

Ziyi Liu, Yangcen Liu

ICCV 2025 arXiv:2503.17358

#11017

Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image

Jerred Chen, Ronald Clark

CVPR 2025 highlight arXiv:2503.02745

#11018

ArcPro: Architectural Programs for Structured 3D Abstraction of Sparse Points

Qirui Huang, Runze Zhang, Kangjun Liu et al.

NEURIPS 2025 oral arXiv:2506.15507

#11019

Over-squashing in Spatiotemporal Graph Neural Networks

Ivan Marisca, Jacob Bamberger, Cesare Alippi et al.

NEURIPS 2025 oral arXiv:2505.16602

#11020

MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation

Bohan Zhou, Yi Zhan, Zhongbin Zhang et al.

CVPR 2025 arXiv:2503.21140

#11021

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025 highlight arXiv:2412.00932

#11022

FIction: 4D Future Interaction Prediction from Video

Kumar Ashutosh, Georgios Pavlakos, Kristen Grauman

ICCV 2025 arXiv:2408.06569

#11023

Social Debiasing for Fair Multi-modal LLMs

Harry Cheng, Yangyang Guo, Qingpei Guo et al.

CVPR 2025 arXiv:2505.20764

#11024

ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

Eric Xing, Pranavi Kolouju, Robert Pless et al.

ICCV 2025 arXiv:2506.17213

#11025

Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation

Xiuyu Yang, Shuhan Tan, Philipp Kraehenbuehl

NEURIPS 2025 arXiv:2508.18413

#11026

Parallelizing MCMC Across the Sequence Length

David Zoltowski, Skyler Wu, Xavier Gonzalez et al.

#11027

Mechanism and Emergence of Stacked Attention Heads in Multi-Layer Transformers

Tiberiu Mușat

ICCV 2025 arXiv:2507.13260

#11028

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy

Yiting Yang, Hao Luo, Yuan Sun et al.

NEURIPS 2025 arXiv:2506.01420

#11029

Self-Refining Language Model Anonymizers via Adversarial Distillation

Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

NEURIPS 2025 arXiv:2505.24848

#11030

Reading Recognition in the Wild

Charig Yang, Samiul Alam, Shakhrul Iman Siam et al.

CVPR 2025 arXiv:2505.16971

#11031

UniPhy: Learning a Unified Constitutive Model for Inverse Physics Simulation

Himangi Mittal, Peiye Zhuang, Hsin-Ying Lee et al.

NEURIPS 2025 arXiv:2410.04039

#11032

BlockScan: Detecting Anomalies in Blockchain Transactions

Jiahao Yu, Xian Wu, Hao Liu et al.

CVPR 2025 arXiv:2404.00916

#11033

Gyro-based Neural Single Image Deblurring

Heemin Yang, Jaesung Rim, Seungyong Lee et al.

ICCV 2025 arXiv:2507.03304

#11034

Bridging Domain Generalization to Multimodal Domain Generalization via Unified Representations

Hai Huang, Yan Xia, Sashuai Zhou et al.

ICCV 2025 arXiv:2505.14414

#11035

Diving into the Fusion of Monocular Priors for Generalized Stereo Matching

Chengtang Yao, Lidong Yu, Zhidan Liu et al.

ICCV 2025 arXiv:2506.18520

#11036

Enhancing Image Restoration Transformer via Adaptive Translation Equivariance

JiaKui Hu, Zhengjian Yao, Lujia Jin et al.

ICCV 2025 highlight arXiv:2503.16616

#11037

Progressive Test Time Energy Adaptation for Medical Image Segmentation

Xiaoran Zhang, Byung-Woo Hong, Hyoungseob Park et al.

NEURIPS 2025 spotlight arXiv:2505.18698

#11038

MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention

Can Yaras, Alec Xu, Pierre Abillama et al.

NEURIPS 2025 arXiv:2506.10963

#11039

MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning

Yuxuan Luo, Ryan Yuan, Junwen Chen et al.

CVPR 2025 arXiv:2502.10060

#11040

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier et al.

ICCV 2025 arXiv:2503.11167

#11041

Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction

Haonan Wang, Qixiang Zhang, Lehan Wang et al.

NEURIPS 2025 arXiv:2502.13119

#11042

STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models

Narun Raman, Taylor Lundy, Thiago Amin et al.

CVPR 2025 arXiv:2506.03512

#11043

EDCFlow: Exploring Temporally Dense Difference Maps for Event-based Optical Flow Estimation

Daikun Liu, Lei Cheng, Teng Wang et al.

CVPR 2025 arXiv:2503.12124

#11044

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Yingying Deng, Xiangyu He, Fan Tang et al.

NEURIPS 2025 arXiv:2506.04224

#11045

Seeing in the Dark: Benchmarking Egocentric 3D Vision with the Oxford Day-and-Night Dataset

Zirui Wang, Wenjing Bian, Xinghui Li et al.

ICLR 2025 arXiv:2402.14648

#11046

Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off

Futa Waseda, Ching-Chun Chang, Isao Echizen

ICCV 2025 arXiv:2507.04769

#11047

From Imitation to Innovation: The Emergence of AI's Unique Artistic Styles and the Challenge of Copyright Protection

Zexi Jia, Chuanwei Huang, Hongyan Fei et al.

#11048

Parameterized Blur Kernel Prior Learning for Local Motion Deblurring

Zhenxuan Fang, Fangfang Wu, Tao Huang et al.

ICCV 2025 highlight arXiv:2507.19875

#11049

ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking

Xiaokun Feng, Shiyu Hu, Xuchen Li et al.

NEURIPS 2025 arXiv:2505.15145

#11050

CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation

Xinran Wang, Songyu Xu, Shan Xiangxuan et al.

#11051

MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations

Shaochen Zhong, Yifan (Louie) Lu, Lize Shao et al.

NEURIPS 2025 spotlight arXiv:2506.12421

#11052

Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints

Dongjie Yang, Chengqiang Lu, Qimeng Wang et al.

#11053

How To Make Your Cell Tracker Say "I dunno!"

Richard D Paul, Johannes Seiffarth, David Rügamer et al.

CVPR 2025 arXiv:2503.19232

#11054

HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting

Xinpeng Liu, Zeyi Huang, Fumio Okura et al.

NEURIPS 2025 arXiv:2509.16456

#11055

GPO: Learning from Critical Steps to Improve LLM Reasoning

Jiahao Yu, Zelei Cheng, Xian Wu et al.

NEURIPS 2025 arXiv:2505.14884

#11056

Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity

Susav Shrestha, Bradley Settlemyer, Nikoli Dryden et al.

ICCV 2025 arXiv:2507.12998

#11057

Differential-informed Sample Selection Accelerates Multimodal Contrastive Learning

Zihua Zhao, Feng Hong, Mengxi Chen et al.

ICCV 2025 arXiv:2509.12145

#11058

Open-ended Hierarchical Streaming Video Understanding with Vision Language Models

Hyolim Kang, Yunsu Park, Youngbeom Yoo et al.

ICCV 2025 arXiv:2412.02837

#11059

BATCLIP: Bimodal Online Test-Time Adaptation for CLIP

Sarthak Kumar Maharana, Baoming Zhang, Leonid Karlinsky et al.

CVPR 2025 arXiv:2502.19937

#11060

Image Referenced Sketch Colorization Based on Animation Creation Workflow

Dingkun Yan, Xinrui Wang, Zhuoru Li et al.

ICLR 2025 oral arXiv:2410.15910

#11061

Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning

Hanlin Yang, Jian Yao, Weiming Liu et al.

CVPR 2025 arXiv:2505.03116

#11062

TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion

Haoyue Liu, Jinghan Xu, Yi Chang et al.

NEURIPS 2025 arXiv:2412.05718

#11063

RLZero: Direct Policy Inference from Language Without In-Domain Supervision

Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.

NEURIPS 2025 arXiv:2509.16756

#11064

Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

Yuchen Liang, Yingbin Liang, Lifeng Lai et al.

ICCV 2025 arXiv:2507.17661

#11065

Monocular Semantic Scene Completion via Masked Recurrent Networks

Xuzhi Wang, Xinran Wu, Song Wang et al.

CVPR 2025 arXiv:2502.21130

#11066

Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning

Jiuyang Dong, Junjun Jiang, Kui Jiang et al.

NEURIPS 2025 oral arXiv:2505.12737

#11067

Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning

Hongjoon Ahn, Heewoong Choi, Jisu Han et al.

#11068

UNICL-SAM: Uncertainty-Driven In-Context Segmentation with Part Prototype Discovery

Dianmo Sheng, Dongdong Chen, Zhentao Tan et al.

ICLR 2025 arXiv:2502.00745

#11069

BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

ICCV 2025 arXiv:2508.07747

#11070

Grouped Speculative Decoding for Autoregressive Image Generation

Junhyuk So, Juncheol Shin, Hyunho Kook et al.

NEURIPS 2025 arXiv:2510.12489

#11071

CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling

Beibu Li, Qichao Shentu, Yang Shu et al.

NEURIPS 2025 arXiv:2505.19986

#11072

Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach

Swetha Ganesh, Vaneet Aggarwal

NEURIPS 2025 arXiv:2506.21924

#11073

SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding

Zhao Jin, Rong-Cheng Tu, Jingyi Liao et al.

ICCV 2025 arXiv:2509.04582

#11074

Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping

Jingyi Lu, Kai Han

NEURIPS 2025 arXiv:2510.11675

#11075

FACE: Faithful Automatic Concept Extraction

Dipkamal Bhusal, Michael Clifford, Sara Rampazzi et al.

NEURIPS 2025 arXiv:2502.19648

#11076

Spectral Analysis of Representational Similarity with Limited Neurons

Hyunmo Kang, Abdulkadir Canatar, SueYeon Chung

NEURIPS 2025 arXiv:2510.25970

#11077

SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing

Sung-Hoon Yoon, Minghan Li, Gaspard Beaudouin et al.

CVPR 2025 arXiv:2412.16778

#11078

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

Zhipeng Huang, Wangbo Yu, Xinhua Cheng et al.

ICCV 2025 arXiv:2509.04833

#11079

PropVG: End-to-End Proposal-Driven Visual Grounding with Multi-Granularity Discrimination

Ming Dai, Wenxuan Cheng, Jiedong Zhuang et al.

ICCV 2025 arXiv:2507.19481

#11080

HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars

Byungjun Kim, Shunsuke Saito, Giljoo Nam et al.

ICCV 2025 arXiv:2405.13337

#11081

Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

ICCV 2025 arXiv:2503.14530

#11082

SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders

Jiahui Geng, Qing Li

NEURIPS 2025 arXiv:2505.13893

#11083

InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion

Yuanyi Wang, Zhaoyi Yan, Yiming Zhang et al.

CVPR 2025 arXiv:2412.19331

#11084

CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models

Kiet A. Nguyen, Adheesh Juvekar, Tianjiao Yu et al.

ICCV 2025 highlight arXiv:2506.23897

#11085

PriOr-Flow: Enhancing Primitive Panoramic Optical Flow with Orthogonal View

Longliang Liu, Miaojie Feng, Junda Cheng et al.

ICCV 2025 arXiv:2411.00144

#11086

Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis

Chen Zhao, Xuan Wang, Tong Zhang et al.

CVPR 2025 highlight arXiv:2503.06467

#11087

SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts

Shijia Zhao, Qiming Xia, Xusheng Guo et al.

NEURIPS 2025 spotlight arXiv:2506.19248

#11088

Inference-Time Reward Hacking in Large Language Models

Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling et al.

ICCV 2025 arXiv:2507.16240

#11089

Scale Your Instructions: Enhance the Instruction-Following Fidelity of Unified Image Generation Model by Self-Adaptive Attention Scaling

Chao Zhou, Tianyi Wei, Nenghai Yu

CVPR 2025 arXiv:2505.23290

#11090

Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation

Hao Li, Ju Dai, Xin Zhao et al.

NEURIPS 2025 arXiv:2508.09874

#11091

Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models

Jiaqi Cao, Jiarui Wang, Rubin Wei et al.

NEURIPS 2025 oral arXiv:2505.17478

#11092

Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression

Yuning Shen, Lihao Wang, Huizhuo Yuan et al.

NEURIPS 2025 oral arXiv:2505.19281

#11093

A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

Yuzheng Hu, Fan Wu, Haotian Ye et al.

NEURIPS 2025 arXiv:2502.00657

#11094

LLM Safety Alignment is Divergence Estimation in Disguise

Rajdeep Haldar, Ziyi Wang, Guang Lin et al.

NEURIPS 2025 arXiv:2506.18322

#11095

Escaping the SpuriVerse: Can Large Vision-Language Models Generalize Beyond Seen Spurious Correlations?

Yiwei Yang, Chung Peng Lee, Shangbin Feng et al.

NEURIPS 2025 arXiv:2506.09417

#11096

ODG: Occupancy Prediction Using Dual Gaussians

Yunxiao Shi, Yinhao Zhu, Herbert Cai et al.

CVPR 2025 highlight arXiv:2503.21099

#11097

Learning Class Prototypes for Unified Sparse-Supervised 3D Object Detection

Yun Zhu, Le Hui, Hang Yang et al.

CVPR 2025 arXiv:2503.15300

#11098

SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes

Weixiao Gao, Liangliang Nan, Hugo Ledoux

CVPR 2025 arXiv:2411.16760

#11099

LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions

Faridoun Mehri, Mahdieh Baghshah, Mohammad Taher Pilehvar

NEURIPS 2025 arXiv:2506.04970

#11100

Bringing SAM to new heights: leveraging elevation data for tree crown segmentation from drone imagery

Mélisande Teng, Arthur Ouaknine, Etienne Laliberté et al.

#11101

Your Scale Factors are My Weapon: Targeted Bit-Flip Attacks on Vision Transformers via Scale Factor Manipulation

Jialai Wang, Yuxiao Wu, Weiye Xu et al.

NEURIPS 2025 arXiv:2507.11847

#11102

Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update

Yu-Jie Zhang, Sheng-An Xu, Peng Zhao et al.

NEURIPS 2025 arXiv:2506.11252

#11103

Anti-Aliased 2D Gaussian Splatting

Mae Younes, Adnane Boukhayma

NEURIPS 2025 arXiv:2507.07580

#11104

COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation

Uliana Parkina, Maxim Rakhuba

ICCV 2025 arXiv:2411.09572

#11105

Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation

Zhenjun Yu, Wenqiang Xu, Pengfei Xie et al.

NEURIPS 2025 arXiv:2505.15311

#11106

Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning

Yurun Yuan, Fan Chen, Zeyu Jia et al.

NEURIPS 2025 arXiv:2411.06568

#11107

Meta-Learning Objectives for Preference Optimization

Carlo Alfano, Silvia Sapora, Jakob Foerster et al.

CVPR 2025 highlight arXiv:2412.16604

#11108

OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities

Suyoung Lee, Jaeyoung Chung, Kihoon Kim et al.

CVPR 2025 arXiv:2502.20685

#11109

EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching

Dongki Jung, Jaehoon Choi, Yonghan Lee et al.

CVPR 2025 arXiv:2506.03714

#11110

FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Shuai Liu, Mingyue Cui, Boyang Li et al.

NEURIPS 2025 arXiv:2503.07539

#11111

XIFBench: Evaluating Large Language Models on Multilingual Instruction Following

Zhenyu Li, Kehai Chen, Yunfei Long et al.

CVPR 2025 arXiv:2504.00660

#11112

Learning to Normalize on the SPD Manifold under Bures-Wasserstein Geometry

Rui Wang, Shaocheng Jin, Ziheng Chen et al.

NEURIPS 2025 arXiv:2506.21710

#11113

FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering

Liangyu Zhong, Fabio Philipp Rosenthal, Joachim Sicking et al.

NEURIPS 2025 spotlight arXiv:2505.16969

#11114

3D Equivariant Visuomotor Policy Learning via Spherical Projection

Boce Hu, Dian Wang, David Klee et al.

NEURIPS 2025 spotlight arXiv:2504.15863

#11115

DERD-Net: Learning Depth from Event-based Ray Densities

Diego de Oliveira Hitzges, Suman Ghosh, Guillermo Gallego

NEURIPS 2025 arXiv:2510.18707

#11116

OmniCast: A Masked Latent Diffusion Model for Weather Forecasting Across Time Scales

Tung Nguyen, Tuan Pham, Troy Arcomano et al.

#11117

Gradient Multi-Normalization for Efficient LLM Training

Meyer Scetbon, Chao Ma, Wenbo Gong et al.

NEURIPS 2025

NEURIPS 2025 arXiv:2505.05226

#11118

Put CASH on Bandits: A Max K-Armed Problem for Automated Machine Learning

Amir Rezaei Balef, Claire Vernade, Katharina Eggensperger

CVPR 2025 arXiv:2406.05704

#11119

Hierarchical Features Matter: A Deep Exploration of Progressive Parameterization Method for Dataset Distillation

Xinhao Zhong, Hao Fang, Bin Chen et al.

NEURIPS 2025 arXiv:2505.18585

#11120

RvLLM: LLM Runtime Verification with Domain Knowledge

Yedi Zhang, Sun Emma, Annabelle En et al.

#11121

PS-Diffusion: Photorealistic Subject-Driven Image Editing with Disentangled Control and Attention

Weicheng Wang, Guoli Jia, Zhongqi Zhang et al.

NEURIPS 2025 arXiv:2506.05568

#11122

Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning

Arian Raje, Baris Askin, Divyansh Jhunjhunwala et al.

NEURIPS 2025 arXiv:2507.03279

#11123

Conformal Information Pursuit for Interactively Guiding Large Language Models

Kwan Ho Ryan Chan, Yuyan Ge, Edgar Dobriban et al.

#11124

A machine learning approach that beats Rubik's cubes

Alexander Chervov, Kirill Khoruzhii, Nikita Bukhal et al.

NEURIPS 2025 spotlight

NEURIPS 2025 oral arXiv:2505.20744

#11125

MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition

Hao Zhang, Zhan Zhuang, Xuehao Wang et al.

NEURIPS 2025 spotlight arXiv:2507.13383

#11126

Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models

Charvi Rastogi, Tian Huey Teh, Pushkar Mishra et al.

NEURIPS 2025 arXiv:2506.13464

#11127

Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study

Zhengyu Hu, Jianxun Lian, Zheyuan Xiao et al.

CVPR 2025 arXiv:2503.07597

#11128

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Yuhong Zhang, Guanlin Wu, Ling-Hao Chen et al.

ICCV 2025 arXiv:2503.09447

#11129

Online Language Splatting

Saimouli Katragadda, Cho-Ying Wu, Yuliang Guo et al.

NEURIPS 2025 oral arXiv:2509.10687

#11130

Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation

Hao Zhang, Chun-Han Yao, Simon Donné et al.

NEURIPS 2025 oral arXiv:2505.18110

#11131

Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM

Zinuo Li, Xian Zhang, Yongxin Guo et al.

#11132

FeedEdit: Text-Based Image Editing with Dynamic Feedback Regulation

Fengyi Fu, Lei Zhang, Mengqi Huang et al.

NEURIPS 2025 arXiv:2506.17252

#11133

Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization

Zixuan Huang, Yikun Ban, Lean Fu et al.

NEURIPS 2025 spotlight arXiv:2507.07860

#11134

THUNDER: Tile-level Histopathology image UNDERstanding benchmark

Pierre Marza, Leo Fillioux, Sofiène Boutaj et al.

CVPR 2025 arXiv:2407.18914

#11135

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang et al.

NEURIPS 2025 spotlight arXiv:2506.02444

#11136

SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios

Lingwei Dang, Ruizhi Shao, Hongwen Zhang et al.

NEURIPS 2025 spotlight arXiv:2510.24709

#11137

Does Object Binding Naturally Emerge in Large Pretrained Vision Transformers?

Yihao Li, Saeed Salehi, Lyle Ungar et al.

#11138

GG-SSMs: Graph-Generating State Space Models

Nikola Zubic, Davide Scaramuzza

NEURIPS 2025 arXiv:2501.19215

#11139

Strassen Attention, Split VC Dimension and Compositionality in Transformers

Alexander Kozachinskiy, Felipe Urrutia, Hector Orellana et al.

CVPR 2025 arXiv:2408.08568

#11140

DV-Matcher: Deformation-based Non-rigid Point Cloud Matching Guided by Pre-trained Visual Features

Zhangquan Chen, Puhua Jiang, Ruqi Huang

NEURIPS 2025 arXiv:2510.21003

#11141

Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation

Enshu Liu, Qian Chen, Xuefei Ning et al.

NEURIPS 2025 arXiv:2509.15185

#11142

Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation

Xiaoyu Yue, ZiDong Wang, Yuqing Wang et al.

ICCV 2025 arXiv:2507.02813

#11143

LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Fangfu Liu, Hao Li, Jiawei Chi et al.

CVPR 2025 arXiv:2505.16376

#11144

DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos

Zijia Lu, ASM Iftekhar, Gaurav Mittal et al.

NEURIPS 2025 arXiv:2507.01513

#11145

SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism

Beitao Chen, Xinyu Lyu, Shengming Yuan et al.

NEURIPS 2025 arXiv:2502.03429

#11146

On Fairness of Unified Multimodal Large Language Model for Image Generation

Ming Liu, Hao Chen, Jindong Wang et al.

NEURIPS 2025 arXiv:2502.13177

#11147

KL Penalty Control via Perturbation for Direct Preference Optimization

Sangkyu Lee, Janghoon Han, Hosung Song et al.

CVPR 2025 arXiv:2503.22725

#11148

Uncertainty Weighted Gradients for Model Calibration

Jinxu Lin, Linwei Tao, Minjing Dong et al.

NEURIPS 2025 arXiv:2506.04193

#11149

Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness

Stephen Pfohl, Natalie Harris, Chirag Nagpal et al.

NEURIPS 2025 arXiv:2509.23492

#11150

Orientation-anchored Hyper-Gaussian for 4D Reconstruction from Casual Videos

Junyi Wu, Jiachen Tao, Haoxuan Wang et al.

NEURIPS 2025 arXiv:2506.14763

#11151

RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills

Chunru Lin, Haotian Yuan, Yian Wang et al.

CVPR 2025 arXiv:2406.04746

#11152

PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction

Eduard Poesina, Adriana Valentina Costache, Adrian-Gabriel Chifu et al.

NEURIPS 2025 oral arXiv:2501.13457

#11153

Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks

Ruijia Liu, Ancheng Hou, Xiao Yu et al.

CVPR 2025 arXiv:2411.16503

#11154

Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis

Boming Miao, Chunxiao Li, Xiaoxiao Wang et al.

NEURIPS 2025 arXiv:2502.01609

#11155

Adaptive Distraction: Probing LLM Contextual Robustness with Automated Tree Search

Yanbo Wang, Zixiang Xu, Yue Huang et al.

NEURIPS 2025 arXiv:2507.01649

#11156

GradMetaNet: An Equivariant Architecture for Learning on Gradients

Yoav Gelberg, Yam Eitan, Aviv Navon et al.

CVPR 2025 arXiv:2407.01330

#11157

A Lightweight UDF Learning Framework for 3D Reconstruction Based on Local Shape Functions

Jiangbei Hu, Yanggeng Li, Fei Hou et al.

ICCV 2025 arXiv:2412.01562

#11158

Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle

Miroslav Purkrabek, Jiri Matas

NEURIPS 2025 arXiv:2503.20117

#11159

Exact and Linear Convergence for Federated Learning under Arbitrary Client Participation is Attainable

Bicheng Ying, Zhe Li, Haibo Yang

ICCV 2025 arXiv:2508.21222

#11160

Generalizable Object Re-Identification via Visual In-Context Prompting

Zhizhong Huang, Xiaoming Liu

CVPR 2025 arXiv:2503.00495

#11161

Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture

Xuanchen Li, Jianyu Wang, Yuhao Cheng et al.

NEURIPS 2025 arXiv:2506.13274

#11162

AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining

Hongyuan Dong, Dingkang Yang, Xiao Liang et al.

ICLR 2025 arXiv:2402.03664

#11163

Partial Gromov-Wasserstein Metric

Yikun Bai, Rocio Diaz Martin, Abihith Kothapalli et al.

CVPR 2025 arXiv:2503.16068

#11164

PoseTraj: Pose-Aware Trajectory Control in Video Diffusion

Longbin Ji, Lei Zhong, Pengfei Wei et al.

NEURIPS 2025 oral arXiv:2510.04908

#11165

How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning

Haotian Gao, Zheng Dong, Jiawei Yong et al.

CVPR 2025 highlight arXiv:2503.02261

#11166

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

Zelin Li, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 highlight arXiv:2504.02697

#11167

Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation

Xingguang Zhang, Nicholas M Chimitt, Xijun Wang et al.

NEURIPS 2025 arXiv:2502.06597

#11168

Continual Release Moment Estimation with Differential Privacy

Nikita Kalinin, Jalaj Upadhyay, Christoph Lampert

NEURIPS 2025 arXiv:2510.24940

#11169

SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens

Yinhan He, Wendy Zheng, Yaochen Zhu et al.

NEURIPS 2025 arXiv:2512.10403

#11170

BRACE: A Benchmark for Robust Audio Caption Quality Evaluation

Tianyu Guo, Hongyu Chen, Hao Liang et al.

NEURIPS 2025 arXiv:2505.15602

#11171

Deep learning for continuous-time stochastic control with jumps

Patrick Cheridito, Jean-Loup Dupret, Donatien Hainaut

NEURIPS 2025 arXiv:2506.08001

#11172

Reparameterized LLM Training via Orthogonal Equivalence Transformation

Zeju Qiu, Simon Buchholz, Tim Xiao et al.

ICLR 2025 arXiv:2502.01692

#11173

Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation

Kim Yong Tan, Yueming Lyu, Ivor Tsang et al.

#11174

Task Descriptors Help Transformers Learn Linear Models In-Context

Ruomin Huang, Rong Ge

CVPR 2025 arXiv:2504.15616

#11175

SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction

Kai Chen, Xiaodong Zhao, Yujie Huang et al.

NEURIPS 2025 arXiv:2506.16802

#11176

Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation

Riccardo Corvi, Davide Cozzolino, Ekta Prashnani et al.

NEURIPS 2025 arXiv:2508.20326

#11177

Stochastic Gradients under Nuisances

Facheng Yu, Ronak Mehta, Alex Luedtke et al.

#11178

Triad: Empowering LMM-based Anomaly Detection with Expert-guided Region-of-Interest Tokenizer and Manufacturing Process

Yuanze Li, Shihao Yuan, Haolin Wang et al.

ICCV 2025 arXiv:2506.21549

#11179

SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark

Alex Costanzino, Pierluigi Zama Ramirez, Luigi Lella et al.

ICCV 2025 arXiv:2507.23784

#11180

SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions

Jessica Bader, Leander Girrbach, Stephan Alaniz et al.

ICCV 2025 arXiv:2506.05342

#11181

Refer to Any Segmentation Mask Group With Vision-Language Prompts

Shengcao Cao, Zijun Wei, Jason Kuen et al.

ICCV 2025 arXiv:2503.12764

#11182

Decouple to Reconstruct: High Quality UHD Restoration via Active Feature Disentanglement and Reversible Fusion

Yidi Liu, Dong Li, Yuxin Ma et al.

ICCV 2025 highlight arXiv:2507.23771

#11183

Consensus-Driven Active Model Selection

Justin Kay, Grant Horn, Subhransu Maji et al.

ICCV 2025 arXiv:2410.00204

#11184

OpenAnimals: Revisiting Person Re-Identification for Animals Towards Better Generalization

Saihui Hou, Panjian Huang, Zengbin Wang et al.

#11185

Backdoor Attacks on Neural Networks via One-Bit Flip

Xiang Li, Lannan Luo, Qiang Zeng

ICCV 2025 arXiv:2507.04547

#11186

FB-Diff: Fourier Basis-guided Diffusion for Temporal Interpolation of 4D Medical Imaging

Xin You, Runze Yang, Chuyan Zhang et al.

#11187

Describe, Don’t Dictate: Semantic Image Editing with Natural Language Intent

En Ci, Shanyan Guan, Yanhao Ge et al.

#11188

Enhance Multi-View Classification Through Multi-Scale Alignment and Expanded Boundary

Yuena Lin, Yiyuan Wang, Gengyu Lyu et al.

ICCV 2025 arXiv:2506.03594

#11189

SplArt: Articulation Estimation and Part-Level Reconstruction with 3D Gaussian Splatting

Shengjie Lin, Jiading Fang, Muhammad Zubair Irshad et al.

ICCV 2025 arXiv:2507.09209

#11190

Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models

Xiao Liang, Di Wang, Zhicheng Jiao et al.

ICCV 2025 arXiv:2506.21017

#11191

Multimodal Prompt Alignment for Facial Expression Recognition

Fuyan Ma, Yiran He, Bin Sun et al.

ICCV 2025 arXiv:2411.17845

#11192

CABLD: Contrast-Agnostic Brain Landmark Detection with Consistency-Based Regularization

Soorena Salari, Arash Harirpoush, Hassan Rivaz et al.

ICCV 2025 arXiv:2507.09168

#11193

Stable Score Distillation

Haiming Zhu, Yangyang Xu, Chenshu Xu et al.

ICCV 2025 arXiv:2506.22099

#11194

BézierGS: Dynamic Urban Scene Reconstruction with Bézier Curve Gaussian Splatting

Zipei Ma, Junzhe Jiang, Yurui Chen et al.

ICCV 2025 arXiv:2503.21410

#11195

Diffusion Image Prior

Hamadi Chihaoui, Paolo Favaro

ICCV 2025 arXiv:2502.20045

#11196

Text2VDM: Text to Vector Displacement Maps for Expressive and Interactive 3D Sculpting

Hengyu Meng, Duotun Wang, Zhijing Shao et al.

ICCV 2025 arXiv:2509.20807

#11197

Federated Domain Generalization with Domain-specific Soft Prompts Generation

Jianhan Wu, Xiaoyang Qu, Zhangcheng Huang et al.

ICCV 2025 arXiv:2508.07877

#11198

Selective Contrastive Learning for Weakly Supervised Affordance Grounding

WonJun Moon, Hyun Seok Seong, Jae-Pil Heo

ICCV 2025 highlight arXiv:2508.00359

#11199

CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective

Zongheng Tang, Yi Liu, Yifan Sun et al.

ICCV 2025 arXiv:2508.09000

#11200

UniConvNet: Expanding Effective Receptive Field while Maintaining Asymptotically Gaussian Distribution for ConvNets of Any Scale

Yuhao Wang, Wei Xi