Most Cited 2025 "deep generative model" Papers

22,274 papers found • Page 44 of 112

#8601

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Joya Chen, Yiqi Lin, Ziyun Zeng et al.

CVPR 2025 • arXiv:2504.16030
4
citations
#8602

Simplification Is All You Need against Out-of-Distribution Overconfidence

Keke Tang, Chao Hou, Weilong Peng et al.

CVPR 2025
4
citations
#8603

Token Cropr: Faster ViTs for Quite a Few Tasks

Benjamin Bergner, Christoph Lippert, Aravindh Mahendran

CVPR 2025 • arXiv:2412.00965
4
citations
#8604

SafeVid: Toward Safety Aligned Video Large Multimodal Models

Yixu Wang, Jiaxin Song, Yifeng Gao et al.

NEURIPS 2025 • arXiv:2505.11926
4
citations
#8605

KORGym: A Dynamic Game Platform for LLM Reasoning Evaluation

Jiajun Shi, Jian Yang, Jiaheng Liu et al.

NEURIPS 2025 • spotlight • arXiv:2505.14552
4
citations
#8606

CADRef: Robust Out-of-Distribution Detection via Class-Aware Decoupled Relative Feature Leveraging

Zhiwei Ling, Yachen Chang, Hailiang Zhao et al.

CVPR 2025 • arXiv:2503.00325
4
citations
#8607

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning

Kelin Yu, Sheng Zhang, Harshit Soora et al.

ICCV 2025 • arXiv:2508.11049
4
citations
#8608

JointDiT: Enhancing RGB-Depth Joint Modeling with Diffusion Transformers

Kwon Byung-Ki, Qi Dai, Lee Hyoseok et al.

ICCV 2025 • arXiv:2505.00482
4
citations
#8609

GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection

Jeffri Erwin Murrugarra Llerena, José Henrique Marques, Claudio Jung

CVPR 2025 • arXiv:2502.01565
4
citations
#8610

DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing

Yufei Huang, Bangyan Liao, Yuqi Hu et al.

CVPR 2025
4
citations
#8611

On the Out-Of-Distribution Generalization of Large Multimodal Models

Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.

CVPR 2025
4
citations
#8612

Neural Shell Texture Splatting: More Details and Fewer Primitives

Xin Zhang, Anpei Chen, Jincheng Xiong et al.

ICCV 2025 • arXiv:2507.20200
4
citations
#8613

ViewSRD: 3D Visual Grounding via Structured Multi-View Decomposition

Ronggang Huang, Haoxin Yang, Yan Cai et al.

ICCV 2025 • arXiv:2507.11261
4
citations
#8614

Spectral Convolutional Conditional Neural Process

Peiman Mohseni, Nick Duffield

NEURIPS 2025
4
citations
#8615

Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning

Fanrui Zhang, Dian Li, Qiang Zhang et al.

NEURIPS 2025 • arXiv:2505.16836
4
citations
#8616

MARS: A Malignity-Aware Backdoor Defense in Federated Learning

Wei Wan, Ning Yuxuan, Zhicong Huang et al.

NEURIPS 2025 • arXiv:2509.20383
4
citations
#8617

SPRINT: Enabling Interleaved Planning and Parallelized Execution in Reasoning Models

Emil Biju, Shayan Talaei, Zhemin Huang et al.

NEURIPS 2025 • arXiv:2506.05745
4
citations
#8618

Permissioned LLMs: Enforcing Access Control in Large Language Models

Bargav Jayaraman, Virendra Marathe, Hamid Mozaffari et al.

NEURIPS 2025 • arXiv:2505.22860
4
citations
#8619

TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration

Yuwei Du, Jie Feng, Jie Zhao et al.

NEURIPS 2025 • arXiv:2410.20445
4
citations
#8620

Advanced Sign Language Video Generation with Compressed and Quantized Multi-Condition Tokenization

Cong Wang, Zexuan Deng, Zhiwei Jiang et al.

NEURIPS 2025 • oral • arXiv:2506.15980
4
citations
#8621

DRoP: Distributionally Robust Data Pruning

Artem Vysogorets, Kartik Ahuja, Julia Kempe

ICLR 2025 • arXiv:2404.05579
4
citations
#8622

LARGO: Latent Adversarial Reflection through Gradient Optimization for Jailbreaking LLMs

Ran Li, Hao Wang, Chengzhi Mao

NEURIPS 2025 • arXiv:2505.10838
4
citations
#8623

CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design

Weitao Feng, Hang Zhou, Jing Liao et al.

CVPR 2025 • highlight • arXiv:2504.19478
4
citations
#8624

AttentionPredictor: Temporal Patterns Matter for KV Cache Compression

Qingyue Yang, Jie Wang, Xing Li et al.

NEURIPS 2025 • oral • arXiv:2502.04077
4
citations
#8625

ProtInvTree: Deliberate Protein Inverse Folding with Reward-guided Tree Search

Mengdi Liu, Xiaoxue Cheng, Zhangyang Gao et al.

NEURIPS 2025 • spotlight • arXiv:2506.00925
4
citations
#8626

MaintainCoder: Maintainable Code Generation Under Dynamic Requirements

Zhengren Wang, Rui Ling, Chufan Wang et al.

NEURIPS 2025 • arXiv:2503.24260
4
citations
#8627

Flexible Language Modeling in Continuous Space with Transformer-based Autoregressive Flows

Ruixiang Zhang, Shuangfei Zhai, Jiatao Gu et al.

NEURIPS 2025 • arXiv:2507.00425
4
citations
#8628

Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining

Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao et al.

CVPR 2025 • arXiv:2503.18703
4
citations
#8629

End-to-End Vision Tokenizer Tuning

Wenxuan Wang, Fan Zhang, Yufeng Cui et al.

NEURIPS 2025 • arXiv:2505.10562
4
citations
#8630

ReRAW: RGB-to-RAW Image Reconstruction via Stratified Sampling for Efficient Object Detection on the Edge

Radu Berdan, Beril Besbinar, Christoph Reinders et al.

CVPR 2025 • arXiv:2503.03782
4
citations
#8631

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

Ziqin Huang, Gu Wang, Chenyangguang Zhang et al.

CVPR 2025 • arXiv:2503.15110
4
citations
#8632

Towards Effective Federated Graph Foundation Model via Mitigating Knowledge Entanglement

Yinlin Zhu, Xunkai Li, Jishuo Jia et al.

NEURIPS 2025 • arXiv:2505.12684
4
citations
#8633

NSD-Imagery: A Benchmark Dataset for Extending fMRI Vision Decoding Methods to Mental Imagery

Reese Kneeland, Paul Scotti, Ghislain St-Yves et al.

CVPR 2025 • highlight • arXiv:2506.06898
4
citations
#8634

LayerD: Decomposing Raster Graphic Designs into Layers

Tomoyuki Suzuki, Kang-Jun Liu, Naoto Inoue et al.

ICCV 2025 • arXiv:2509.25134
4
citations
#8635

Multiscale guidance of protein structure prediction with heterogeneous cryo-EM data

Rishwanth Raghu, Axel Levy, Gordon Wetzstein et al.

NEURIPS 2025 • arXiv:2506.04490
4
citations
#8636

Statistical inference for Linear Stochastic Approximation with Markovian Noise

Sergey Samsonov, Marina Sheshukova, Eric Moulines et al.

NEURIPS 2025 • arXiv:2505.19102
4
citations
#8637

Diffusion Model is Effectively Its Own Teacher

Xinyin Ma, Runpeng Yu, Songhua Liu et al.

CVPR 2025
4
citations
#8638

Certifying Language Model Robustness with Fuzzed Randomized Smoothing: An Efficient Defense Against Backdoor Attacks

Bowei He, Lihao Yin, Huiling Zhen et al.

ICLR 2025 • arXiv:2502.06892
4
citations
#8639

Knowledge Distillation with Multi-granularity Mixture of Priors for Image Super-Resolution

Simiao Li, Yun Zhang, Wei Li et al.

ICLR 2025 • arXiv:2404.02573
4
citations
#8640

Effortless, Simulation-Efficient Bayesian Inference using Tabular Foundation Models

Julius Vetter, Manuel Gloeckler, Daniel Gedon et al.

NEURIPS 2025 • arXiv:2504.17660
4
citations
#8641

BetaConform: Efficient MAP Estimation of LLM Ensemble Judgment Performance with Prior Transfer

Huaizhi Qu, Inyoung Choi, Zhen Tan et al.

NEURIPS 2025
4
citations
#8642

Boosting Multimodal Learning via Disentangled Gradient Learning

Shicai Wei, Chunbo Luo, Yang Luo

ICCV 2025 • arXiv:2507.10213
4
citations
#8643

LightSwitch: Multi-view Relighting with Material-guided Diffusion

Yehonathan Litman, Fernando De la Torre, Shubham Tulsiani

ICCV 2025 • arXiv:2508.06494
4
citations
#8644

3D-MOOD: Lifting 2D to 3D for Monocular Open-Set Object Detection

Yung-Hsu Yang, Luigi Piccinelli, Mattia Segu et al.

ICCV 2025 • arXiv:2507.23567
4
citations
#8645

PRM: Photometric Stereo based Large Reconstruction Model

Wenhang Ge, Jiantao Lin, Guibao Shen et al.

ICCV 2025 • highlight • arXiv:2412.07371
4
citations
#8646

Gene Regulatory Network Inference in the Presence of Selection Bias and Latent Confounders

Gongxu Luo, Haoyue Dai, Longkang Li et al.

NEURIPS 2025 • arXiv:2501.10124
4
citations
#8647

GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning

Haonan Yuan, Qingyun Sun, Junhua Shi et al.

NEURIPS 2025 • arXiv:2511.05592
4
citations
#8648

DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation

Amin Karimi, Charalambos Poullis

CVPR 2025 • arXiv:2503.04006
4
citations
#8649

Characterizing the Expressivity of Fixed-Precision Transformer Language Models

Jiaoda Li, Ryan Cotterell

NEURIPS 2025 • oral • arXiv:2505.23623
4
citations
#8650

ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis

Yun Chang, Leonor Fermoselle, Duy Ta et al.

CVPR 2025 • arXiv:2504.06553
4
citations
#8651

Convergence and Implicit Bias of Gradient Descent on Continual Linear Classification

Hyunji Jung, Hanseul Cho, Chulhee Yun

ICLR 2025 • arXiv:2504.12712
4
citations
#8652

HORT: Monocular Hand-held Objects Reconstruction with Transformers

Zerui Chen, Rolandos Alexandros Potamias, Shizhe Chen et al.

ICCV 2025 • arXiv:2503.21313
4
citations
#8653

AltLoRA: Towards Better Gradient Approximation in Low-Rank Adaptation with Alternating Projections

Xin Yu, Yujia Wang, Jinghui Chen et al.

NEURIPS 2025 • arXiv:2505.12455
4
citations
#8654

Learning to Discover Regulatory Elements for Gene Expression Prediction

Xingyu Su, Haiyang Yu, Degui Zhi et al.

ICLR 2025 • arXiv:2502.13991
4
citations
#8655

What You Have is What You Track: Adaptive and Robust Multimodal Tracking

Yuedong Tan, Jiawei Shao, Eduard Zamfir et al.

ICCV 2025 • arXiv:2507.05899
4
citations
#8656

Conditional Panoramic Image Generation via Masked Autoregressive Modeling

Chaoyang Wang, Xiangtai Li, Lu Qi et al.

NEURIPS 2025 • arXiv:2505.16862
4
citations
#8657

PlayerOne: Egocentric World Simulator

Yuanpeng Tu, Hao Luo, Xi Chen et al.

NEURIPS 2025 • oral • arXiv:2506.09995
4
citations
#8658

Praxis-VLM: Vision-Grounded Decision Making via Text-Driven Reinforcement Learning

Zhe Hu, Jing Li, Zhongzhu Pu et al.

NEURIPS 2025 • arXiv:2503.16965
4
citations
#8659

Unbiased Video Scene Graph Generation via Visual and Semantic Dual Debiasing

Yanjun Li, Zhaoyang Li, Honghui Chen et al.

CVPR 2025 • arXiv:2503.00548
4
citations
#8660

Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features

Yuanbo Xiangli, Ruojin Cai, Hanyu Chen et al.

CVPR 2025 • highlight • arXiv:2412.05826
4
citations
#8661

SViM3D: Stable Video Material Diffusion for Single Image 3D Generation

Andreas Engelhardt, Mark Boss, Vikram Voleti et al.

ICCV 2025 • arXiv:2510.08271
4
citations
#8662

Focus-N-Fix: Region-Aware Fine-Tuning for Text-to-Image Generation

Xiaoying Xing, Avinab Saha, Junfeng He et al.

CVPR 2025 • highlight • arXiv:2501.06481
4
citations
#8663

One-Step Diffusion-Based Image Compression with Semantic Distillation

Naifu Xue, Zhaoyang Jia, Jiahao Li et al.

NEURIPS 2025 • arXiv:2505.16687
4
citations
#8664

EgoThinker: Unveiling Egocentric Reasoning with Spatio-Temporal CoT

Baoqi Pei, Yifei Huang, Jilan Xu et al.

NEURIPS 2025 • oral • arXiv:2510.23569
4
citations
#8665

Training-Free Dataset Pruning for Instance Segmentation

Yalun Dai, Lingao Xiao, Ivor Tsang et al.

ICLR 2025 • arXiv:2503.00828
4
citations
#8666

Knowledge-based Visual Question Answer with Multimodal Processing, Retrieval and Filtering

Yuyang Hong, Jiaqi Gu, Yang Qi et al.

NEURIPS 2025 • arXiv:2510.14605
4
citations
#8667

Sharp Matrix Empirical Bernstein Inequalities

Hongjian Wang, Aaditya Ramdas

NEURIPS 2025 • arXiv:2411.09516
4
citations
#8668

A Closer Look at Model Collapse: From a Generalization-to-Memorization Perspective

Lianghe Shi, Meng Wu, Huijie Zhang et al.

NEURIPS 2025 • spotlight • arXiv:2509.16499
4
citations
#8669

FineMotion: A Dataset and Benchmark with both Spatial and Temporal Annotation for Fine-grained Motion Generation and Editing

Bizhu Wu, Jinheng Xie, Meidan Ding et al.

ICCV 2025 • arXiv:2507.19850
4
citations
#8670

Embracing Contradiction: Theoretical Inconsistency Will Not Impede the Road of Building Responsible AI Systems

Gordon Dai, Yunze Xiao

NEURIPS 2025 • oral • arXiv:2505.18139
4
citations
#8671

Contractive Dynamical Imitation Policies for Efficient Out-of-Sample Recovery

Amin Soleimani Abyaneh, Mahrokh Boroujeni, Hsiu-Chin Lin et al.

ICLR 2025 • arXiv:2412.07544
4
citations
#8672

VOVTrack: Exploring the Potentiality in Raw Videos for Open-Vocabulary Multi-Object Tracking

Zekun Qian, Ruize Han, Junhui Hou et al.

ICCV 2025
4
citations
#8673

Test-time Adaptation for Foundation Medical Segmentation Model Without Parametric Updates

Kecheng Chen, Xinyu Luo, Tiexin Qin et al.

ICCV 2025 • highlight • arXiv:2504.02008
4
citations
#8674

Self-supervised Learning of Hybrid Part-aware 3D Representations of 2D Gaussians and Superquadrics

Zhirui Gao, Renjiao Yi, Yuhang Huang et al.

ICCV 2025 • arXiv:2408.10789
4
citations
#8675

Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models

Namhyuk Ahn, KiYoon Yoo, Wonhyuk Ahn et al.

CVPR 2025 • arXiv:2412.11423
4
citations
#8676

Chain-of-Zoom: Extreme Super-Resolution via Scale Autoregression and Preference Alignment

Bryan Sangwoo Kim, Jeongsol Kim, Jong Chul Ye

NEURIPS 2025 • spotlight • arXiv:2505.18600
4
citations
#8677

SceneMI: Motion In-betweening for Modeling Human-Scene Interaction

Inwoo Hwang, Bing Zhou, Young Min Kim et al.

ICCV 2025 • highlight • arXiv:2503.16289
4
citations
#8678

LISAt: Language-Instructed Segmentation Assistant for Satellite Imagery

Jerome Quenum, Wen-Han Hsieh, Tsung-Han (Patrick) Wu et al.

NEURIPS 2025 • arXiv:2505.02829
4
citations
#8679

On the Emergence of Linear Analogies in Word Embeddings

Daniel Korchinski, Dhruva Karkada, Yasaman Bahri et al.

NEURIPS 2025 • arXiv:2505.18651
4
citations
#8680

SCAN: Self-Denoising Monte Carlo Annotation for Robust Process Reward Learning

Yuyang Ding, Xinyu Shi, Juntao Li et al.

NEURIPS 2025 • arXiv:2509.16548
4
citations
#8681

Precise Event Spotting in Sports Videos: Solving Long-Range Dependency and Class Imbalance

Sanchayan Santra, Vishal Chudasama, Pankaj Wasnik et al.

CVPR 2025 • arXiv:2503.00147
4
citations
#8682

Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations

Haitong Liu, Kuofeng Gao, Yang Bai et al.

CVPR 2025 • arXiv:2503.21824
4
citations
#8683

Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications

Tong Bu, Maohua Li, Zhaofei Yu

CVPR 2025 • arXiv:2409.03368
4
citations
#8684

Balanced Image Stylization with Style Matching Score

Yuxin Jiang, Liming Jiang, Shuai Yang et al.

ICCV 2025 • arXiv:2503.07601
4
citations
#8685

Measure gradients, not activations! Enhancing neuronal activity in deep reinforcement learning

Jiashun Liu, Zihao Wu, Johan Obando Ceron et al.

NEURIPS 2025 • arXiv:2505.24061
4
citations
#8686

Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Zekun Li et al.

CVPR 2025 • arXiv:2503.16997
4
citations
#8687

ATP: Adaptive Threshold Pruning for Efficient Data Encoding in Quantum Neural Networks

Mohamed Afane, Gabrielle Ebbrecht, Ying Wang et al.

CVPR 2025 • arXiv:2503.21815
4
citations
#8688

GENIUS: A Generative Framework for Universal Multimodal Search

Sungyeon Kim, Xinliang Zhu, Xiaofan Lin et al.

CVPR 2025 • arXiv:2503.19868
4
citations
#8689

PersonaBooth: Personalized Text-to-Motion Generation

Boeun Kim, Hea In Jeong, JungHoon Sung et al.

CVPR 2025 • arXiv:2503.07390
4
citations
#8690

dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis

Luyuan Xie, Tianyu Luan, Wenyuan Cai et al.

CVPR 2025 • arXiv:2503.10412
4
citations
#8691

Incentivizing LLMs to Self-Verify Their Answers

Fuxiang Zhang, Jiacheng Xu, Chaojie Wang et al.

NEURIPS 2025 • arXiv:2506.01369
4
citations
#8692

Towards All-in-One Medical Image Re-Identification

Yuan Tian, Kaiyuan Ji, Rongzhao Zhang et al.

CVPR 2025 • arXiv:2503.08173
4
citations
#8693

Continuous Diffusion Model for Language Modeling

Jaehyeong Jo, Sung Ju Hwang

NEURIPS 2025 • arXiv:2502.11564
4
citations
#8694

Vision Function Layer in Multimodal LLMs

Cheng Shi, Yizhou Yu, Sibei Yang

NEURIPS 2025 • arXiv:2509.24791
4
citations
#8695

Stop the Nonconsensual Use of Nude Images in Research

Princessa Cintaqia, Arshia Arya, Elissa Redmiles et al.

NEURIPS 2025 • oral • arXiv:2510.22423
4
citations
#8696

TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction

Dadong Jiang, Zhi Hou, Zhihui Ke et al.

ICCV 2025 • arXiv:2411.11941
4
citations
#8697

MALinZero: Efficient Low-Dimensional Search for Mastering Complex Multi-Agent Planning

Sizhe Tang, Jiayu Chen, Tian Lan

NEURIPS 2025 • arXiv:2511.06142
4
citations
#8698

HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis

Timo Teufel, Xilong Zhou, Umar Iqbal et al.

ICCV 2025 • arXiv:2508.09137
4
citations
#8699

Efficient Pre-Training of LLMs via Topology-Aware Communication Alignment on More Than 9600 GPUs

Guoliang He, Youhe Jiang, Wencong Xiao et al.

NEURIPS 2025 • arXiv:2509.15940
4
citations
#8700

DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image

Hyeongjin Nam, Donghwan Kim, Jeongtaek Oh et al.

CVPR 2025 • arXiv:2503.19373
4
citations
#8701

Proxy Target: Bridging the Gap Between Discrete Spiking Neural Networks and Continuous Control

Zijie Xu, Tong Bu, Zecheng Hao et al.

NEURIPS 2025 • arXiv:2505.24161
4
citations
#8702

CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects

Huaijin Pi, Zhi Cen, Zhiyang Dou et al.

NEURIPS 2025 • arXiv:2505.21437
4
citations
#8703

Better Estimation of the Kullback-Leibler Divergence Between Language Models

Afra Amini, Tim Vieira, Ryan Cotterell

NEURIPS 2025 • arXiv:2504.10637
4
citations
#8704

Codifying Character Logic in Role-Playing

Letian Peng, Jingbo Shang

NEURIPS 2025 • oral • arXiv:2505.07705
4
citations
#8705

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Rui Zhao, Weijia Mao, Mike Zheng Shou

CVPR 2025 • arXiv:2503.03651
4
citations
#8706

Articulate3D: Holistic Understanding of 3D Scenes as Universal Scene Description

Anna-Maria Halacheva, Yang Miao, Jan-Nico Zaech et al.

ICCV 2025 • arXiv:2412.01398
4
citations
#8707

BridgeDepth: Bridging Monocular and Stereo Reasoning with Latent Alignment

Tongfan Guan, Jiaxin Guo, Chen Wang et al.

ICCV 2025 • highlight • arXiv:2508.04611
4
citations
#8708

Multi-Label Prototype Visual Spatial Search for Weakly Supervised Semantic Segmentation

Songsong Duan, Xi Yang, Nannan Wang

CVPR 2025 • highlight
4
citations
#8709

Balanced Token Pruning: Accelerating Vision Language Models Beyond Local Optimization

Kaiyuan Li, Xiaoyue Chen, Chen Gao et al.

NEURIPS 2025 • arXiv:2505.22038
4
citations
#8710

USP-Gaussian: Unifying Spike-based Image Reconstruction, Pose Correction and Gaussian Splatting

Kang Chen, Jiyuan Zhang, Zecheng Hao et al.

CVPR 2025 • highlight • arXiv:2411.10504
4
citations
#8711

Auto-Vocabulary Semantic Segmentation

Osman Ülger, Maksymilian Kulicki, Yuki Asano et al.

ICCV 2025 • arXiv:2312.04539
4
citations
#8712

GECKO: Gigapixel Vision-Concept Contrastive Pretraining in Histopathology

Saarthak Kapse, Pushpak Pati, Srikar Yellapragada et al.

ICCV 2025 • highlight • arXiv:2504.01009
4
citations
#8713

ResGS: Residual Densification of 3D Gaussian for Efficient Detail Recovery

Yanzhe Lyu, Kai Cheng, Kang Xin et al.

ICCV 2025 • arXiv:2412.07494
4
citations
#8714

Lost in Transmission: When and Why LLMs Fail to Reason Globally

Tobias Schnabel, Kiran Tomlinson, Adith Swaminathan et al.

NEURIPS 2025 • spotlight • arXiv:2505.08140
4
citations
#8715

A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning

Xin Wen, Bingchen Zhao, Yilun Chen et al.

CVPR 2025 • arXiv:2503.06960
4
citations
#8716

VoteFlow: Enforcing Local Rigidity in Self-Supervised Scene Flow

Yancong Lin, Shiming Wang, Liangliang Nan et al.

CVPR 2025 • arXiv:2503.22328
4
citations
#8717

Robust Simulation-Based Inference under Missing Data via Neural Processes

Yogesh Verma, Ayush Bharti, Vikas Garg

ICLR 2025 • arXiv:2503.01287
4
citations
#8718

Sampling from multi-modal distributions with polynomial query complexity in fixed dimension via reverse diffusion

Adrien Vacher, Omar Chehab, Anna Korba

NEURIPS 2025 • arXiv:2501.00565
4
citations
#8719

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Shengqiong Wu, Hao Fei, Jingkang Yang et al.

CVPR 2025 • highlight • arXiv:2503.15019
4
citations
#8720

Understanding protein function with a multimodal retrieval-augmented foundation model

Timothy Truong Jr, Tristan Bepler

NEURIPS 2025
4
citations
#8721

GAP: Gaussianize Any Point Clouds with Text Guidance

Weiqi Zhang, Junsheng Zhou, Haotian Geng et al.

ICCV 2025 • arXiv:2508.05631
4
citations
#8722

Pursuing Better Decision Boundaries for Long-Tailed Object Detection via Category Information Amount

Yanbiao Ma, Wei Dai, Jiayi Chen

ICLR 2025 • arXiv:2502.03852
4
citations
#8723

VideoLLaMB: Long Streaming Video Understanding with Recurrent Memory Bridges

Yuxuan Wang, Yiqi Song, Cihang Xie et al.

ICCV 2025 • arXiv:2409.01071
4
citations
#8724

EquiTabPFN: A Target-Permutation Equivariant Prior Fitted Network

Michael Arbel, David Salinas, Frank Hutter

NEURIPS 2025 • arXiv:2502.06684
4
citations
#8725

Brain-Like Processing Pathways Form in Models With Heterogeneous Experts

Jack Cook, Danyal Akarca, Rui Costa et al.

NEURIPS 2025 • arXiv:2506.02813
4
citations
#8726

PolarFree: Polarization-based Reflection-Free Imaging

Mingde Yao, Menglu Wang, King Man Tam et al.

CVPR 2025 • arXiv:2503.18055
4
citations
#8727

Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models

Eunseop Yoon, Hee Suk Yoon, Mark Hasegawa-Johnson et al.

ICLR 2025 • arXiv:2507.04976
4
citations
#8728

A Simple Linear Patch Revives Layer-Pruned Large Language Models

Xinrui Chen, Haoli Bai, Tao Yuan et al.

NEURIPS 2025 • arXiv:2505.24680
4
citations
#8729

Is Limited Participant Diversity Impeding EEG-based Machine Learning?

Philipp Bomatter, Henry Gouk

NEURIPS 2025 • arXiv:2503.13497
4
citations
#8730

LUNA: Efficient and Topology-Agnostic Foundation Model for EEG Signal Analysis

Berkay Döner, Thorir Mar Ingolfsson, Luca Benini et al.

NEURIPS 2025 • oral • arXiv:2510.22257
4
citations
#8731

Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-Index Models

Ilias Diakonikolas, Giannis Iakovidis, Daniel Kane et al.

NEURIPS 2025 • spotlight • arXiv:2505.21475
4
citations
#8732

DREAM: Drafting with Refined Target Features and Entropy-Adaptive Cross-Attention Fusion for Multimodal Speculative Decoding

Yunhai Hu, Tianhua Xia, Zining Liu et al.

NEURIPS 2025 • arXiv:2505.19201
4
citations
#8733

Know What You Don't Know: Uncertainty Calibration of Process Reward Models

Young-Jin Park, Kristjan Greenewald, Kaveh Alimohammadi et al.

NEURIPS 2025 • arXiv:2506.09338
4
citations
#8734

BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation

Ruotong Wang, Mingli Zhu, Jiarong Ou et al.

ICCV 2025 • arXiv:2504.16907
4
citations
#8735

GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data

Zhiteng Li, Lele Chen, Jerone Andrews et al.

ICLR 2025
4
citations
#8736

Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images

Yasamin Medghalchi, Moein Heidari, Clayton Allard et al.

CVPR 2025 • arXiv:2412.09910
4
citations
#8737

Introducing FOReCAst: The Future Outcome Reasoning and Confidence Assessment Benchmark

Zhangdie Yuan, Zifeng Ding, Andreas Vlachos

NEURIPS 2025
4
citations
#8738

Riemannian Flow Matching for Brain Connectivity Matrices via Pullback Geometry

Antoine Collas, Ce Ju, Nicolas Salvy et al.

NEURIPS 2025 • arXiv:2505.18193
4
citations
#8739

WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression

Yu Mao, Jun Wang, Nan Guan et al.

CVPR 2025 • arXiv:2503.18074
4
citations
#8740

VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness

SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim et al.

CVPR 2025 • arXiv:2503.16406
4
citations
#8741

SAD Neural Networks: Divergent Gradient Flows and Asymptotic Optimality via o-minimal Structures

Julian Kranz, Davide Gallon, Steffen Dereich et al.

NEURIPS 2025 • arXiv:2505.09572
4
citations
#8742

Beyond Value Functions: Single-Loop Bilevel Optimization under Flatness Conditions

Liuyuan Jiang, Quan Xiao, Lisha Chen et al.

NEURIPS 2025 • arXiv:2507.20400
4
citations
#8743

MeToken: Uniform Micro-environment Token Boosts Post-Translational Modification Prediction

Cheng Tan, Zhenxiao Cao, Zhangyang Gao et al.

ICLR 2025 • arXiv:2411.01856
4
citations
#8744

iManip: Skill-Incremental Learning for Robotic Manipulation

Zexin Zheng, Jia-Feng Cai, Xiao-Ming Wu et al.

ICCV 2025 • arXiv:2503.07087
4
citations
#8745

Generative Zoo

Tomasz Niewiadomski, Anastasios Yiannakidis, Hanz Cuevas Velasquez et al.

ICCV 2025 • arXiv:2412.08101
4
citations
#8746

Towards Robust Parameter-Efficient Fine-Tuning for Federated Learning

Xiuwen Fang, Mang Ye

NEURIPS 2025
4
citations
#8747

H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Zhanbo Huang, Xiaoming Liu, Yu Kong

CVPR 2025 • highlight • arXiv:2504.10676
4
citations
#8748

Self-supervised contrastive learning performs non-linear system identification

Rodrigo Gonzalez Laiz, Tobias Schmidt, Steffen Schneider

ICLR 2025 • oral • arXiv:2410.14673
4
citations
#8749

KVCOMM: Online Cross-context KV-cache Communication for Efficient LLM-based Multi-agent Systems

Hancheng Ye, Zhengqi Gao, Mingyuan Ma et al.

NEURIPS 2025 • arXiv:2510.12872
4
citations
#8750

Position: Towards Bidirectional Human-AI Alignment

Hua Shen, Tiffany Knearem, Reshmi Ghosh et al.

NEURIPS 2025 • oral • arXiv:2406.09264
4
citations
#8751

BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting

Jeongwan On, Kyeonghwan Gwak, Gunyoung Kang et al.

CVPR 2025 • arXiv:2504.09097
4
citations
#8752

MultiMorph: On-demand Atlas Construction

Mazdak Abulnaga, Andrew Hoopes, Neel Dey et al.

CVPR 2025 • arXiv:2504.00247
4
citations
#8753

Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation

Xiang Li, Zixuan Huang, Anh Thai et al.

CVPR 2025 • highlight • arXiv:2411.17763
4
citations
#8754

Learning Neural Exposure Fields for View Synthesis

Michael Niemeyer, Fabian Manhardt, Marie-Julie Rakotosaona et al.

NEURIPS 2025 • arXiv:2510.08279
4
citations
#8755

Reasoning to Attend: Try to Understand How <SEG> Token Works

Rui Qian, Xin Yin, Dejing Dou

CVPR 2025 • arXiv:2412.17741
4
citations
#8756

Universal Sequence Preconditioning

Annie Marsden, Elad Hazan

NEURIPS 2025 • spotlight • arXiv:2502.06545
4
citations
#8757

End-to-End Implicit Neural Representations for Classification

Alexander Gielisse, Jan van Gemert

CVPR 2025 • arXiv:2503.18123
4
citations
#8758

3D-RAD: A Comprehensive 3D Radiology Med-VQA Dataset with Multi-Temporal Analysis and Diverse Diagnostic Tasks

Xiaotang Gai, Jiaxiang Liu, Yichen Li et al.

NEURIPS 2025 • oral • arXiv:2506.11147
4
citations
#8759

Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?

Renshuai Tao, Haoyu Wang, Yuzhe Guo et al.

CVPR 2025 • arXiv:2411.18082
4
citations
#8760

Safe and Stable Control via Lyapunov-Guided Diffusion Models

Xiaoyuan Cheng, Xiaohang Tang, Yiming Yang

NEURIPS 2025 • arXiv:2509.25375
4
citations
#8761

Understanding Multi-Task Activities from Single-Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2025 • highlight
4
citations
#8762

InternScenes: A Large-scale Simulatable Indoor Scene Dataset with Realistic Layouts

Weipeng Zhong, Peizhou Cao, Yichen Jin et al.

NEURIPS 2025 • arXiv:2509.10813
4
citations
#8763

Better Training Data Attribution via Better Inverse Hessian-Vector Products

Andrew Wang, Elisa Nguyen, Runshi Yang et al.

NEURIPS 2025 • arXiv:2507.14740
4
citations
#8764

Improving Time Series Forecasting via Instance-aware Post-hoc Revision

Zhiding Liu, Mingyue Cheng, Guanhao Zhao et al.

NEURIPS 2025 • arXiv:2505.23583
4
citations
#8765

Secret Lies in Color: Enhancing AI-Generated Images Detection with Color Distribution Analysis

Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.

CVPR 2025
4
citations
#8766

SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity

Ke Ma, Jiaqi Tang, Bin Guo et al.

CVPR 2025 • highlight • arXiv:2503.20354
4
citations
#8767

CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation

Jungsoo Lee, Debasmit Das, Munawar Hayat et al.

CVPR 2025 • arXiv:2503.18244
4
citations
#8768

A Large-Scale 3D Face Mesh Video Dataset via Neural Re-parameterized Optimization

Kim Youwang, Lee Hyun, Kim Sung-Bin et al.

ICLR 2025 • arXiv:2310.03205
4
citations
#8769

SRA-CL: Semantic Retrieval Augmented Contrastive Learning for Sequential Recommendation

Ziqiang Cui, Yunpeng Weng, Xing Tang et al.

NEURIPS 2025
4
citations
#8770

JAMUN: Bridging Smoothed Molecular Dynamics and Score-Based Learning for Conformational Ensemble Generation

Ameya Daigavane, Bodhi Vani, Darcy Davidson et al.

NEURIPS 2025
4
citations
#8771

Plug-and-Play Context Feature Reuse for Efficient Masked Generation

Xuejie Liu, Anji Liu, Guy Van den Broeck et al.

NEURIPS 2025 • arXiv:2505.19089
4
citations
#8772

Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs

Lucas Ventura, Antoine Yang, Cordelia Schmid et al.

CVPR 2025 • arXiv:2504.00072
4
citations
#8773

Region-based Cluster Discrimination for Visual Representation Learning

Yin Xie, Kaicheng Yang, Xiang An et al.

ICCV 2025 • highlight • arXiv:2507.20025
4
citations
#8774

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025 • arXiv:2506.05563
4
citations
#8775

SpecEdge: Scalable Edge-Assisted Serving Framework for Interactive LLMs

Jinwoo Park, Seunggeun Cho, Dongsu Han

NEURIPS 2025 • spotlight • arXiv:2505.17052
4
citations
#8776

LATTE-MV: Learning to Anticipate Table Tennis Hits from Monocular Videos

Daniel Etaat, Dvij Rajesh Kalaria, Nima Rahmanian et al.

CVPR 2025 • arXiv:2503.20936
4
citations
#8777

WALL-E: World Alignment by NeuroSymbolic Learning improves World Model-based LLM Agents

Siyu Zhou, Tianyi Zhou, Yijun Yang et al.

NEURIPS 2025
4
citations
#8778

Multi-Token Prediction Needs Registers

Anastasios Gerontopoulos, Spyridon Gidaris, Nikos Komodakis

NEURIPS 2025 • arXiv:2505.10518
4
citations
#8779

MuTri: Multi-view Tri-alignment for OCT to OCTA 3D Image Translation

Zhuangzhuang Chen, Hualiang Wang, Chubin Ou et al.

CVPR 2025 • arXiv:2504.01428
4
citations
#8780

REVE: A Foundation Model for EEG - Adapting to Any Setup with Large-Scale Pretraining on 25,000 Subjects

Yassine El Ouahidi, Jonathan Lys, Philipp Thölke et al.

NEURIPS 2025 • oral • arXiv:2510.21585
4
citations
#8781

Go With the Flow: Fast Diffusion for Gaussian Mixture Models

George Rapakoulias, Ali Reza Pedram, Fengjiao Liu et al.

NEURIPS 2025 • spotlight • arXiv:2412.09059
4
citations
#8782

Dynamic Multimodal Prototype Learning in Vision-Language Models

Xingyu Zhu, Shuo Wang, Beier Zhu et al.

ICCV 2025 • arXiv:2507.03657
4
citations
#8783

Memory Mosaics at scale

Jianyu Zhang, Leon Bottou

NEURIPS 2025 • oral • arXiv:2507.03285
4
citations
#8784

Neural Networks Generalize on Low Complexity Data

Sourav Chatterjee, Timothy Sudijono

NEURIPS 2025 • arXiv:2409.12446
4
citations
#8785

Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia

Chandler Smith, Marwa Abdulhai, Manfred Díaz et al.

NEURIPS 2025 • oral • arXiv:2512.03318
4
citations
#8786

Sequential Gaussian Avatars with Hierarchical Motion Context

Wangze Xu, Yifan Zhan, Zhihang Zhong et al.

ICCV 2025 • arXiv:2411.16768
4
citations
#8787

Scaling Laws For Scalable Oversight

Joshua Engels, David Baek, Subhash Kantamneni et al.

NEURIPS 2025 • spotlight • arXiv:2504.18530
4
citations
#8788

UniGist: Towards General and Hardware-aligned Sequence-level Long Context Compression

Chenlong Deng, Zhisong Zhang, Kelong Mao et al.

NEURIPS 2025 • arXiv:2509.15763
4
citations
#8789

SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction

Yutao Tang, Yuxiang Guo, Deming Li et al.

CVPR 2025 • arXiv:2411.12592
4
citations
#8790

UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning

Ye Liu, Zongyang Ma, Junfu Pu et al.

NEURIPS 2025 • arXiv:2509.18094
4
citations
#8791

Controllable Weather Synthesis and Removal with Video Diffusion Models

Chih-Hao Lin, Zian Wang, Ruofan Liang et al.

ICCV 2025 • arXiv:2505.00704
4
citations
#8792

Acknowledging Focus Ambiguity in Visual Questions

Chongyan Chen, Yu-Yun Tseng, Zhuoheng Li et al.

ICCV 2025 • arXiv:2501.02201
4
citations
#8793

EmoNet-Face: An Expert-Annotated Benchmark for Synthetic Emotion Recognition

Christoph Schuhmann, Robert Kaczmarczyk, Gollam Rabby et al.

NEURIPS 2025 • arXiv:2505.20033
4
citations
#8794

DINGO: Constrained Inference for Diffusion LLMs

Tarun Suresh, Debangshu Banerjee, Shubham Ugare et al.

NEURIPS 2025 • arXiv:2505.23061
4
citations
#8795

DistinctAD: Distinctive Audio Description Generation in Contexts

Bo Fang, Wenhao Wu, Qiangqiang Wu et al.

CVPR 2025 • highlight • arXiv:2411.18180
4
citations
#8796

Non-Asymptotic Guarantees for Average-Reward Q-Learning with Adaptive Stepsizes

Zaiwei Chen

NEURIPS 2025 • arXiv:2504.18743
4
citations
#8797

Scalable and Cost-Efficient de Novo Template-Based Molecular Generation

Piotr Gaiński, Oussama Boussif, Andrei Rekesh et al.

NEURIPS 2025 • arXiv:2506.19865
4
citations
#8798

Hints of Prompt: Enhancing Visual Representation for Multimodal LLMs in Autonomous Driving

Hao Zhou, Zhanning Gao, Zhili Chen et al.

ICCV 2025 • arXiv:2411.13076
4
citations
#8799

DuoLoRA: Cycle-consistent and Rank-disentangled Content-Style Personalization

Aniket Roy, Shubhankar Borse, Shreya Kadambi et al.

ICCV 2025 • arXiv:2504.13206
4
citations
#8800

Pause Tokens Strictly Increase the Expressivity of Constant-Depth Transformers

Charles London, Varun Kanade

NEURIPS 2025 • arXiv:2505.21024
4
citations