Most Cited 2025 "private estimators" Papers

22,274 papers found • Page 32 of 112

#6201

Fix-CLIP: Dual-Branch Hierarchical Contrastive Learning via Synthetic Captions for Better Understanding of Long Text

Bingchao Wang, Zhiwei Ning, Jianyu Ding et al.

ICCV 2025arXiv:2507.10095
7
citations
#6202

Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems

Song Xia, Yi Yu, Wenhan Yang et al.

CVPR 2025highlightarXiv:2503.00383
7
citations
#6203

Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning

Jian Liu, Jing Xu, Song Guo et al.

NEURIPS 2025spotlightarXiv:2505.16761
7
citations
#6204

Space Group Equivariant Crystal Diffusion

Rees Chang, Angela Pak, Alex Guerra et al.

NEURIPS 2025arXiv:2505.10994
7
citations
#6205

GRAPHGPT-O: Synergistic Multimodal Comprehension and Generation on Graphs

Yi Fang, Bowen Jin, Jiacheng Shen et al.

CVPR 2025arXiv:2502.11925
7
citations
#6206

Hyperbolic Dataset Distillation

Wenyuan Li, Guang Li, Keisuke Maeda et al.

NEURIPS 2025arXiv:2505.24623
7
citations
#6207

Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution

Zhanyi Sun, Shuran Song

NEURIPS 2025spotlightarXiv:2508.05941
7
citations
#6208

AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos

Felix Wimbauer, Weirong Chen, Dominik Muhle et al.

CVPR 2025arXiv:2503.23282
7
citations
#6209

LittleBit: Ultra Low-Bit Quantization via Latent Factorization

Banseok Lee, Dongkyu Kim, Youngcheon You et al.

NEURIPS 2025arXiv:2506.13771
7
citations
#6210

FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion

Haosen Yang, Adrian Bulat, Isma Hadji et al.

CVPR 2025arXiv:2411.18552
7
citations
#6211

FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Alice Heiman, Xiaoman Zhang, Emma Chen et al.

CVPR 2025arXiv:2411.18672
7
citations
#6212

Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation

Zihan Wang, Seungjun Lee, Gim Hee Lee

NEURIPS 2025oralarXiv:2505.11383
7
citations
#6213

HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene

Jianing Chen, Zehao Li, Yujun Cai et al.

NEURIPS 2025oralarXiv:2506.09518
7
citations
#6214

On Extending Direct Preference Optimization to Accommodate Ties

Jinghong Chen, Guangyu Yang, Weizhe Lin et al.

NEURIPS 2025arXiv:2409.17431
7
citations
#6215

Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness

Thomas Pethick, Wanyun Xie, Mete Erdogan et al.

NEURIPS 2025oralarXiv:2506.01913
7
citations
#6216

Repurposing Stable Diffusion Attention for Training-Free Unsupervised Interactive Segmentation

Markus Karmann, Onay Urfalioglu

CVPR 2025arXiv:2411.10411
7
citations
#6217

NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI

Cosmin Bercea, Jun Li, Philipp Raffler et al.

NEURIPS 2025oral
7
citations
#6218

BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models

Evan Antoniuk, Shehtab Zaman, Tal Ben-Nun et al.

NEURIPS 2025arXiv:2505.01912
7
citations
#6219

SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images

Yichi Zhang, Le Xue, Wenbo zhang et al.

ICCV 2025arXiv:2502.14351
7
citations
#6220

CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension

Rui Li, Zeyu Zhang, Xiaohe Bo et al.

NEURIPS 2025arXiv:2510.05520
7
citations
#6221

Value-Guided Search for Efficient Chain-of-Thought Reasoning

Kaiwen Wang, Jin Zhou, Jonathan Chang et al.

NEURIPS 2025arXiv:2505.17373
7
citations
#6222

FirePlace: Geometric Refinements of LLM Common Sense Reasoning for 3D Object Placement

Ian Huang, Yanan Bao, Karen Truong et al.

CVPR 2025highlightarXiv:2503.04919
7
citations
#6223

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

Yunheng Li, Jing Cheng, Shaoyong Jia et al.

NEURIPS 2025oralarXiv:2509.18056
7
citations
#6224

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.

NEURIPS 2025arXiv:2505.07782
7
citations
#6225

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Kai Liu, Jungang Li, Yuchong Sun et al.

NEURIPS 2025oralarXiv:2512.22905
7
citations
#6226

Scene Map-based Prompt Tuning for Navigation Instruction Generation

Sheng Fan, Rui Liu, Wenguan Wang et al.

CVPR 2025
7
citations
#6227

Dense SAE Latents Are Features, Not Bugs

Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.

NEURIPS 2025arXiv:2506.15679
7
citations
#6228

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection

Yu Li, Xingyu Qiu, Yuqian Fu et al.

NEURIPS 2025arXiv:2506.05872
7
citations
#6229

Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs

Zhangyin Feng, Qianglong Chen, Ning Lu et al.

NEURIPS 2025arXiv:2505.11227
7
citations
#6230

Geometric Knowledge-Guided Localized Global Distribution Alignment for Federated Learning

Yanbiao Ma, Wei Dai, Wenke Huang et al.

CVPR 2025arXiv:2503.06457
7
citations
#6231

From Bytes to Ideas: Language Modeling with Autoregressive U-Nets

Mathurin VIDEAU, Badr Youbi Idrissi, Alessandro Leite et al.

NEURIPS 2025arXiv:2506.14761
7
citations
#6232

Generative Pre-trained Autoregressive Diffusion Transformer

Yuan Zhang, Jiacheng Jiang, Guoqing Ma et al.

NEURIPS 2025arXiv:2505.07344
7
citations
#6233

LEDiff: Latent Exposure Diffusion for HDR Generation

Chao Wang, Zhihao Xia, Thomas Leimkuehler et al.

CVPR 2025arXiv:2412.14456
7
citations
#6234

Training-Free Constrained Generation With Stable Diffusion Models

Stefano Zampini, Jacob K Christopher, Luca Oneto et al.

NEURIPS 2025spotlightarXiv:2502.05625
7
citations
#6235

Keeping an Eye on LLM Unlearning: The Hidden Risk and Remedy

Jie Ren, Zhenwei Dai, Xianfeng Tang et al.

NEURIPS 2025arXiv:2506.00359
7
citations
#6236

Seq2Time: Sequential Knowledge Transfer for Video LLM Temporal Grounding

Andong Deng, Zhongpai Gao, Anwesa Choudhuri et al.

CVPR 2025arXiv:2411.16932
7
citations
#6237

A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking

Zixiang Zhao, Haowen Bai, Bingxin Ke et al.

NEURIPS 2025oralarXiv:2505.19858
7
citations
#6238

HotSpot: Signed Distance Function Optimization with an Asymptotically Sufficient Condition

Zimo Wang, Cheng Wang, Taiki Yoshino et al.

CVPR 2025highlightarXiv:2411.14628
7
citations
#6239

SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing

Xueting Li, Ye Yuan, Shalini De Mello et al.

CVPR 2025arXiv:2412.09545
7
citations
#6240

VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling

Hyojun Go, Byeongjun Park, Hyelin Nam et al.

ICCV 2025arXiv:2503.15855
7
citations
#6241

Panorama Generation From NFoV Image Done Right

Dian Zheng, Cheng Zhang, Xiao-Ming Wu et al.

CVPR 2025highlightarXiv:2503.18420
7
citations
#6242

Dynamic Updates for Language Adaptation in Visual-Language Tracking

Xiaohai Li, Bineng Zhong, Qihua Liang et al.

CVPR 2025arXiv:2503.06621
7
citations
#6243

EEE-Bench: A Comprehensive Multimodal Electrical And Electronics Engineering Benchmark

Ming Li, Jike Zhong, Tianle Chen et al.

CVPR 2025arXiv:2411.01492
7
citations
#6244

Binarized Mamba-Transformer for Lightweight Quad Bayer HybridEVS Demosaicing

Shiyang Zhou, Haijin Zeng, Yunfan Lu et al.

CVPR 2025arXiv:2503.16134
7
citations
#6245

VideoMage: Multi-Subject and Motion Customization of Text-to-Video Diffusion Models

Chi-Pin Huang, Yen-Siang Wu, Hung-Kai Chung et al.

CVPR 2025arXiv:2503.21781
7
citations
#6246

Perceptually Accurate 3D Talking Head Generation: New Definitions, Speech-Mesh Representation, and Evaluation Metrics

Lee Chae-Yeon, Oh Hyun-Bin, Han EunGi et al.

CVPR 2025highlightarXiv:2503.20308
7
citations
#6247

Zebra-Llama: Towards Extremely Efficient Hybrid Models

Mingyu Yang, Mehdi Rezagholizadeh, Guihong Li et al.

NEURIPS 2025arXiv:2505.17272
7
citations
#6248

Building Vision Models upon Heat Conduction

Zhaozhi Wang, Yue Liu, Yunjie Tian et al.

CVPR 2025arXiv:2405.16555
7
citations
#6249

It’s a (Blind) Match! Towards Vision-Language Correspondence without Parallel Data

Dominik Schnaus, Nikita Araslanov, Daniel Cremers

CVPR 2025arXiv:2503.24129
7
citations
#6250

Not Just Text: Uncovering Vision Modality Typographic Threats in Image Generation Models

Hao Cheng, Erjia Xiao, Jiayan Yang et al.

CVPR 2025arXiv:2412.05538
7
citations
#6251

Turbo3D: Ultra-fast Text-to-3D Generation

Hanzhe Hu, Tianwei Yin, Fujun Luan et al.

CVPR 2025arXiv:2412.04470
7
citations
#6252

Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?

Yanbo Wang, Jiyang Guan, Jian Liang et al.

CVPR 2025arXiv:2504.10000
7
citations
#6253

Statistically Valid Post-Deployment Monitoring Should Be Standard for AI-Based Digital Health

Pavel Dolin, Weizhi Li, Gautam Dasarathy et al.

NEURIPS 2025arXiv:2506.05701
7
citations
#6254

SCAP: Transductive Test-Time Adaptation via Supportive Clique-based Attribute Prompting

Chenyu Zhang, Kunlun Xu, Zichen Liu et al.

CVPR 2025arXiv:2503.12866
7
citations
#6255

Causal LLM Routing: End-to-End Regret Minimization from Observational Data

Asterios Tsiourvas, Wei Sun, Georgia Perakis

NEURIPS 2025arXiv:2505.16037
7
citations
#6256

Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning

Bardia Safaei, Faizan Siddiqui, Jiacong Xu et al.

CVPR 2025highlightarXiv:2503.07591
7
citations
#6257

Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models

Kartik Thakral, Tamar Glaser, Tal Hassner et al.

CVPR 2025arXiv:2503.19783
7
citations
#6258

Exploiting Temporal State Space Sharing for Video Semantic Segmentation

Hesham Syed, Yun Liu, Guolei Sun et al.

CVPR 2025arXiv:2503.20824
7
citations
#6259

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting

Chengyou Jia, Changliang Xia, Zhuohang Dang et al.

CVPR 2025arXiv:2411.17176
7
citations
#6260

Anchored Diffusion Language Model

Litu Rout, Constantine Caramanis, Sanjay Shakkottai

NEURIPS 2025arXiv:2505.18456
7
citations
#6261

Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild

Damien Teney, Liangze Jiang, Florin Gogianu et al.

CVPR 2025arXiv:2503.10065
7
citations
#6262

SEAL: Semantic Attention Learning for Long Video Representation

Lan Wang, Yujia Chen, Wen-Sheng Chu et al.

CVPR 2025arXiv:2412.01798
7
citations
#6263

Relative Pose Estimation through Affine Corrections of Monocular Depth Priors

Yifan Yu, Shaohui Liu, Rémi Pautrat et al.

CVPR 2025highlightarXiv:2501.05446
7
citations
#6264

A Unified Model for Compressed Sensing MRI Across Undersampling Patterns

Armeet Singh Jatyani, Jiayun Wang, Aditi Chandrashekar et al.

CVPR 2025arXiv:2410.16290
7
citations
#6265

Learning Bijective Surface Parameterization for Inferring Signed Distance Functions from Sparse Point Clouds with Grid Deformation

Takeshi Noda, Chao Chen, Junsheng Zhou et al.

CVPR 2025arXiv:2503.23670
7
citations
#6266

Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections

Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.

NEURIPS 2025arXiv:2506.16685
7
citations
#6267

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Wang Zhao, Yan-Pei Cao, Jiale Xu et al.

CVPR 2025arXiv:2412.15200
7
citations
#6268

Real-time High-fidelity Gaussian Human Avatars with Position-based Interpolation of Spatially Distributed MLPs

Youyi Zhan, Tianjia Shao, Yin Yang et al.

CVPR 2025highlightarXiv:2504.12909
7
citations
#6269

HyGen: Efficient LLM Serving via Elastic Online-Offline Request Co-location

Ting Sun, Penghan Wang, Fan Lai

NEURIPS 2025arXiv:2501.14808
7
citations
#6270

ArtiScene: Language-Driven Artistic 3D Scene Generation Through Image Intermediary

Zeqi Gu, Yin Cui, Max Li et al.

CVPR 2025arXiv:2506.00742
7
citations
#6271

Guiding Human-Object Interactions with Rich Geometry and Relations

Mengqing Xue, Yifei Liu, Ling Guo et al.

CVPR 2025arXiv:2503.20172
7
citations
#6272

ModeSeq: Taming Sparse Multimodal Motion Prediction with Sequential Mode Modeling

Zikang Zhou, Hengjian Zhou, Haibo Hu et al.

CVPR 2025arXiv:2411.11911
7
citations
#6273

Balancing Multimodal Training Through Game-Theoretic Regularization

Konstantinos Kontras, Thomas Strypsteen, Christos Chatzichristos et al.

NEURIPS 2025spotlightarXiv:2411.07335
7
citations
#6274

Sherlock: Self-Correcting Reasoning in Vision-Language Models

Yi Ding, Ruqi Zhang

NEURIPS 2025arXiv:2505.22651
7
citations
#6275

Image is All You Need to Empower Large-scale Diffusion Models for In-Domain Generation

Pu Cao, Feng Zhou, Lu Yang et al.

CVPR 2025arXiv:2312.08195
7
citations
#6276

ResearchCodeBench: Benchmarking LLMs on Implementing Novel Machine Learning Research Code

Tianyu Hua, Harper Hua, Violet Xiang et al.

NEURIPS 2025spotlightarXiv:2506.02314
7
citations
#6277

FlipSketch: Flipping Static Drawings to Text-Guided Sketch Animations

Hmrishav Bandyopadhyay, Yi-Zhe Song

CVPR 2025arXiv:2411.10818
7
citations
#6278

Federated Learning with Domain Shift Eraser

Zheng Wang, Zihui Wang, Zheng Wang et al.

CVPR 2025arXiv:2503.13063
7
citations
#6279

Modeling Cell Dynamics and Interactions with Unbalanced Mean Field Schrödinger Bridge

Zhenyi Zhang, Zihan Wang, Yuhao Sun et al.

NEURIPS 2025arXiv:2505.11197
7
citations
#6280

Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations

Xiang Xu, Lingdong Kong, Song Wang et al.

ICCV 2025arXiv:2507.05260
7
citations
#6281

Small Singular Values Matter: A Random Matrix Analysis of Transformer Models

Max Staats, Matthias Thamm, Bernd Rosenow

NEURIPS 2025arXiv:2410.17770
7
citations
#6282

UNIC-Adapter: Unified Image-instruction Adapter with Multi-modal Transformer for Image Generation

Lunhao Duan, Shanshan Zhao, Wenjun Yan et al.

CVPR 2025arXiv:2412.18928
7
citations
#6283

Exploring Historical Information for RGBE Visual Tracking with Mamba

Chuanyu Sun, Jiqing Zhang, Yang Wang et al.

CVPR 2025
7
citations
#6284

Grounding 3D Object Affordance with Language Instructions, Visual Observations and Interactions

He Zhu, Quyu Kong, Kechun Xu et al.

CVPR 2025arXiv:2504.04744
7
citations
#6285

AgentRecBench: Benchmarking LLM Agent-based Personalized Recommender Systems

Yu Shang, Peijie Liu, Yuwei Yan et al.

NEURIPS 2025spotlightarXiv:2505.19623
7
citations
#6286

GauSTAR: Gaussian Surface Tracking and Reconstruction

Chengwei Zheng, Lixin Xue, Juan Jose Zarate et al.

CVPR 2025arXiv:2501.10283
7
citations
#6287

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Yuanqi Yao, Siao Liu, Haoming Song et al.

CVPR 2025arXiv:2504.00420
7
citations
#6288

Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models

Zekai Zhao, Qi Liu, Kun Zhou et al.

NEURIPS 2025spotlightarXiv:2505.17697
7
citations
#6289

Gaussian Splatting for Efficient Satellite Image Photogrammetry

Luca Savant Aira, Gabriele Facciolo, Thibaud Ehret

CVPR 2025arXiv:2412.13047
7
citations
#6290

Privacy amplification by random allocation

Moshe Shenfeld, Vitaly Feldman

NEURIPS 2025spotlightarXiv:2502.08202
7
citations
#6291

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction

Dubing Chen, Huan Zheng, Jin Fang et al.

CVPR 2025arXiv:2504.12959
7
citations
#6292

QuartDepth: Post-Training Quantization for Real-Time Depth Estimation on the Edge

Xuan Shen, Weize Ma, Jing Liu et al.

CVPR 2025arXiv:2503.16709
7
citations
#6293

Simultaneous Swap Regret Minimization via KL-Calibration

Haipeng Luo, Spandan Senapati, Vatsal Sharan

NEURIPS 2025spotlightarXiv:2502.16387
7
citations
#6294

3D Gaussian Head Avatars with Expressive Dynamic Appearances by Compact Tensorial Representations

yating wang, Xuan Wang, Ran Yi et al.

CVPR 2025arXiv:2504.14967
7
citations
#6295

Free on the Fly: Enhancing Flexibility in Test-Time Adaptation with Online EM

Qiyuan Dai, Sibei Yang

CVPR 2025arXiv:2507.06973
7
citations
#6296

Benign Overfitting in Single-Head Attention

Roey Magen, Shuning Shang, Zhiwei Xu et al.

NEURIPS 2025arXiv:2410.07746
7
citations
#6297

Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks

Ann Huang, Satpreet Harcharan Singh, Flavio Martinelli et al.

NEURIPS 2025spotlightarXiv:2410.03972
7
citations
#6298

GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs

Xinli Xu, Wenhang Ge, Dicong Qiu et al.

ICCV 2025arXiv:2412.11258
7
citations
#6299

GroupMamba: Efficient Group-Based Visual State Space Model

Abdelrahman Shaker, Syed Talal Wasim, Salman Khan et al.

CVPR 2025arXiv:2407.13772
7
citations
#6300

How Do I Do That? Synthesizing 3D Hand Motion and Contacts for Everyday Interactions

Aditya Prakash, Benjamin E Lundell, Dmitry Andreychuk et al.

CVPR 2025highlightarXiv:2504.12284
7
citations
#6301

On the Value of Cross-Modal Misalignment in Multimodal Representation Learning

Yichao Cai, Yuhang Liu, Erdun Gao et al.

NEURIPS 2025spotlightarXiv:2504.10143
7
citations
#6302

FG^2: Fine-Grained Cross-View Localization by Fine-Grained Feature Matching

Zimin Xia, Alex Alahi

CVPR 2025arXiv:2503.18725
7
citations
#6303

Adjoint Schrödinger Bridge Sampler

Guan-Horng Liu, Jaemoo Choi, Yongxin Chen et al.

NEURIPS 2025oralarXiv:2506.22565
7
citations
#6304

The Computer Vision Foundation

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025arXiv:2502.20256
7
citations
#6305

Time-o1: Time-Series Forecasting Needs Transformed Label Alignment

Hao Wang, Licheng Pan, Zhichao Chen et al.

NEURIPS 2025oralarXiv:2505.17847
7
citations
#6306

Uncertainty Quantification with the Empirical Neural Tangent Kernel

Joseph Wilson, Chris van der Heide, Liam Hodgkinson et al.

NEURIPS 2025arXiv:2502.02870
7
citations
#6307

FinePhys: Fine-grained Human Action Generation by Explicitly Incorporating Physical Laws for Effective Skeletal Guidance

Dian Shao, Mingfei Shi, Shengda Xu et al.

CVPR 2025arXiv:2505.13437
7
citations
#6308

Depth-Bounds for Neural Networks via the Braid Arrangement

Moritz Grillo, Christoph Hertrich, Georg Loho

NEURIPS 2025oralarXiv:2502.09324
7
citations
#6309

Janus-Pro-R1: Advancing Collaborative Visual Comprehension and Generation via Reinforcement Learning

Kaihang Pan, Yang Wu, Wendong Bu et al.

NEURIPS 2025arXiv:2506.01480
7
citations
#6310

Exact Expressive Power of Transformers with Padding

Will Merrill, Ashish Sabharwal

NEURIPS 2025arXiv:2505.18948
7
citations
#6311

Zero-Shot Novel View and Depth Synthesis with Multi-View Geometric Diffusion

Vitor Guizilini, Muhammad Zubair Irshad, Dian Chen et al.

CVPR 2025arXiv:2501.18804
7
citations
#6312

Enhancing 3D Gaze Estimation in the Wild using Weak Supervision with Gaze Following Labels

Pierre Vuillecard, Jean-marc Odobez

CVPR 2025arXiv:2502.20249
7
citations
#6313

Driving by the Rules: A Benchmark for Integrating Traffic Sign Regulations into Vectorized HD Map

Xinyuan Chang, Maixuan Xue, Xinran Liu et al.

CVPR 2025highlightarXiv:2410.23780
7
citations
#6314

Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression

Xiaoyi Qu, David Aponte, Colby Banbury et al.

CVPR 2025arXiv:2502.16638
7
citations
#6315

FLAME: Frozen Large Language Models Enable Data-Efficient Language-Image Pre-training

Anjia Cao, Xing Wei, Zhiheng Ma

CVPR 2025arXiv:2411.11927
7
citations
#6316

Detail-Preserving Latent Diffusion for Stable Shadow Removal

Jiamin Xu, Yuxin Zheng, Zelong Li et al.

CVPR 2025arXiv:2412.17630
7
citations
#6317

HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models

Yu Zhou, Xingyu Wu, Jibin Wu et al.

NEURIPS 2025spotlightarXiv:2409.18893
7
citations
#6318

Progress-Aware Video Frame Captioning

Zihui Xue, Joungbin An, Xitong Yang et al.

CVPR 2025arXiv:2412.02071
7
citations
#6319

Scene Splatter: Momentum 3D Scene Generation from Single Image with Video Diffusion Model

Shengjun Zhang, Jinzhao Li, Xin Fei et al.

CVPR 2025arXiv:2504.02764
7
citations
#6320

DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting

Seungjun Lee, Gim Hee Lee

CVPR 2025arXiv:2503.24210
7
citations
#6321

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

Jingli Lin, Chenming Zhu, Runsen Xu et al.

NEURIPS 2025oralarXiv:2507.07984
7
citations
#6322

JAFAR: Jack up Any Feature at Any Resolution

Paul Couairon, Loïck Chambon, Louis Serrano et al.

NEURIPS 2025arXiv:2506.11136
7
citations
#6323

MOS: Modeling Object-Scene Associations in Generalized Category Discovery

Zhengyuan Peng, Jinpeng Ma, Zhimin Sun et al.

CVPR 2025arXiv:2503.12035
7
citations
#6324

Visual Lexicon: Rich Image Features in Language Space

XuDong Wang, Xingyi Zhou, Alireza Fathi et al.

CVPR 2025arXiv:2412.06774
7
citations
#6325

Multi-modal Vision Pre-training for Medical Image Analysis

Shaohao Rui, Lingzhi Chen, Zhenyu Tang et al.

CVPR 2025highlightarXiv:2410.10604
7
citations
#6326

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Aniket Rajiv Didolkar, Andrii Zadaianchuk, Rabiul Awal et al.

CVPR 2025arXiv:2503.21747
7
citations
#6327

Rethinking Verification for LLM Code Generation: From Generation to Testing

Zihan Ma, Taolin Zhang, Maosongcao et al.

NEURIPS 2025arXiv:2507.06920
7
citations
#6328

HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation

Hongwei Zheng, Han Li, Wenrui Dai et al.

CVPR 2025arXiv:2503.23331
7
citations
#6329

LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation

Vladan Stojnić, Yannis Kalantidis, Jiri Matas et al.

CVPR 2025arXiv:2503.19777
7
citations
#6330

Rethinking Decoder Design: Improving Biomarker Segmentation Using Depth-to-Space Restoration and Residual Linear Attention

Saad Wazir, Daeyoung Kim

CVPR 2025arXiv:2506.18335
7
citations
#6331

STPro: Spatial and Temporal Progressive Learning for Weakly Supervised Spatio-Temporal Grounding

Aaryan Garg, Akash Kumar, Yogesh S. Rawat

CVPR 2025arXiv:2502.20678
7
citations
#6332

MergeVQ: A Unified Framework for Visual Generation and Representation with Disentangled Token Merging and Quantization

Siyuan Li, Luyuan Zhang, Zedong Wang et al.

CVPR 2025arXiv:2504.00999
7
citations
#6333

DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows

Mashrur M. Morshed, Vishnu Naresh Boddeti

CVPR 2025arXiv:2504.07894
7
citations
#6334

Inversion Circle Interpolation: Diffusion-based Image Augmentation for Data-scarce Classification

Yanghao Wang, Long Chen

CVPR 2025arXiv:2408.16266
7
citations
#6335

Controlling Thinking Speed in Reasoning Models

Zhengkai Lin, Zhihang Fu, Ze Chen et al.

NEURIPS 2025spotlightarXiv:2507.03704
7
citations
#6336

PBR-NeRF: Inverse Rendering with Physics-Based Neural Fields

Sean Wu, Shamik Basu, Tim Broedermann et al.

CVPR 2025arXiv:2412.09680
7
citations
#6337

GSRF: Complex-Valued 3D Gaussian Splatting for Efficient Radio-Frequency Data Synthesis

Kang Yang, Gaofeng Dong, Sijie Ji et al.

NEURIPS 2025spotlightarXiv:2502.01826
7
citations
#6338

A Simple yet Effective Layout Token in Large Language Models for Document Understanding

Zhaoqing Zhu, Chuwei Luo, Zirui Shao et al.

CVPR 2025arXiv:2503.18434
7
citations
#6339

CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis

Youngkyoon Jang, Eduardo Pérez-Pellitero

CVPR 2025arXiv:2503.20998
7
citations
#6340

Pose Priors from Language Models

Sanjay Subramanian, Evonne Ng, Lea Müller et al.

CVPR 2025arXiv:2405.03689
7
citations
#6341

Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction

Seungtae Nam, Xiangyu Sun, Gyeongjin Kang et al.

CVPR 2025highlightarXiv:2412.06234
7
citations
#6342

SUMO: Subspace-Aware Moment-Orthogonalization for Accelerating Memory-Efficient LLM Training

Yehonathan Refael, Guy Smorodinsky, Tom Tirer et al.

NEURIPS 2025arXiv:2505.24749
7
citations
#6343

Classifier-Free Guidance Inside the Attraction Basin May Cause Memorization

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.

CVPR 2025arXiv:2411.16738
7
citations
#6344

Image Quality Assessment: From Human to Machine Preference

Chunyi Li, Yuan Tian, Xiaoyue Ling et al.

CVPR 2025highlightarXiv:2503.10078
7
citations
#6345

Steepest Descent Density Control for Compact 3D Gaussian Splatting

Peihao Wang, Yuehao Wang, Dilin Wang et al.

CVPR 2025arXiv:2505.05587
7
citations
#6346

Large Language Models for Lossless Image Compression: Next-Pixel Prediction in Language Space is All You Need

Kecheng Chen, Pingping Zhang, Hui Liu et al.

NEURIPS 2025arXiv:2411.12448
7
citations
#6347

RI3D: Few-Shot Gaussian Splatting With Repair and Inpainting Diffusion Priors

Avinash Paliwal, xilong zhou, Wei Ye et al.

ICCV 2025arXiv:2503.10860
7
citations
#6348

Momentum Multi-Marginal Schrödinger Bridge Matching

Panagiotis Theodoropoulos, Augustinos Saravanos, Evangelos Theodorou et al.

NEURIPS 2025oralarXiv:2506.10168
7
citations
#6349

StdGEN: Semantic-Decomposed 3D Character Generation from Single Images

Yuze He, Yanning Zhou, Wang Zhao et al.

CVPR 2025arXiv:2411.05738
7
citations
#6350

EchoShot: Multi-Shot Portrait Video Generation

Jiahao Wang, Hualian Sheng, Sijia Cai et al.

NEURIPS 2025arXiv:2506.15838
7
citations
#6351

InteractVLM: 3D Interaction Reasoning from 2D Foundational Models

Sai Kumar Dwivedi, Dimitrije Antić, Shashank Tripathi et al.

CVPR 2025arXiv:2504.05303
7
citations
#6352

Adaptive Part Learning for Fine-Grained Generalized Category Discovery: A Plug-and-Play Enhancement

Qiyuan Dai, Hanzhuo Huang, Yu Wu et al.

CVPR 2025arXiv:2507.06928
7
citations
#6353

Sample Complexity of Distributionally Robust Average-Reward Reinforcement Learning

Zijun Chen, Shengbo Wang, Nian Si

NEURIPS 2025arXiv:2505.10007
7
citations
#6354

CoreGuard: Safeguarding Foundational Capabilities of LLMs Against Model Stealing in Edge Deployment

Qinfeng Li, Tianyue Luo, Xuhong Zhang et al.

NEURIPS 2025arXiv:2410.13903
7
citations
#6355

CoVoMix2: Advancing Zero-Shot Dialogue Generation with Fully Non-Autoregressive Flow Matching

Leying Zhang, Yao Qian, Xiaofei Wang et al.

NEURIPS 2025arXiv:2506.00885
7
citations
#6356

Enhancing Privacy-Utility Trade-offs to Mitigate Memorization in Diffusion Models

Chen Chen, Daochang Liu, Mubarak Shah et al.

CVPR 2025arXiv:2504.18032
7
citations
#6357

RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives

Chirag Parikh, Deepti Rawat, Rakshitha R. T. et al.

CVPR 2025arXiv:2503.21459
7
citations
#6358

Improving Energy Natural Gradient Descent through Woodbury, Momentum, and Randomization

Andrés Guzmán-Cordero, Felix Dangel, Gil Goldshlager et al.

NEURIPS 2025arXiv:2505.12149
7
citations
#6359

GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation

Weihang Li, Hongli XU, Junwen Huang et al.

CVPR 2025arXiv:2502.04293
7
citations
#6360

Question-Aware Gaussian Experts for Audio-Visual Question Answering

Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.

CVPR 2025highlightarXiv:2503.04459
7
citations
#6361

Enabling Differentially Private Federated Learning for Speech Recognition: Benchmarks, Adaptive Optimizers, and Gradient Clipping

Martin Pelikan, Shams Azam, Vitaly Feldman et al.

NEURIPS 2025arXiv:2310.00098
7
citations
#6362

RelationField: Relate Anything in Radiance Fields

Sebastian Koch, Johanna Wald, Mirco Colosi et al.

CVPR 2025arXiv:2412.13652
7
citations
#6363

LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields

Zhengqin Li, Dilin Wang, Ka chen et al.

CVPR 2025arXiv:2504.20026
7
citations
#6364

The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

Bingjie Gao, Xinyu Gao, Xiaoxue Wu et al.

CVPR 2025arXiv:2504.11739
7
citations
#6365

DASH: Detection and Assessment of Systematic Hallucinations of VLMs

Maximilian Augustin, Yannic Neuhaus, Matthias Hein

ICCV 2025arXiv:2503.23573
7
citations
#6366

FlashMD: long-stride, universal prediction of molecular dynamics

Filippo Bigi, Sanggyu Chong, Agustinus Kristiadi et al.

NEURIPS 2025spotlightarXiv:2505.19350
7
citations
#6367

Do Computer Vision Foundation Models Learn the Low-level Characteristics of the Human Visual System?

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025highlightarXiv:2502.20256
7
citations
#6368

Cross-Domain Graph Data Scaling: A Showcase with Diffusion Models

Wenzhuo Tang, Haitao Mao, Danial Dervovic et al.

NEURIPS 2025arXiv:2406.01899
7
citations
#6369

Sculpting Features from Noise: Reward-Guided Hierarchical Diffusion for Task-Optimal Feature Transformation

Nanxu Gong, Zijun Li, Sixun Dong et al.

NEURIPS 2025arXiv:2505.15152
7
citations
#6370

Seeking and Updating with Live Visual Knowledge

Mingyang Fu, Yuyang Peng, Dongping Chen et al.

NEURIPS 2025arXiv:2504.05288
7
citations
#6371

MMTU: A Massive Multi-Task Table Understanding and Reasoning Benchmark

Junjie Xing, Yeye He, Mengyu Zhou et al.

NEURIPS 2025arXiv:2506.05587
7
citations
#6372

ChatHuman: Chatting about 3D Humans with Tools

Jing Lin, Yao Feng, Weiyang Liu et al.

CVPR 2025arXiv:2405.04533
7
citations
#6373

TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

Liangbin Xie, Daniil Pakhomov, Zhonghao Wang et al.

CVPR 2025arXiv:2504.00996
7
citations
#6374

MotiF: Making Text Count in Image Animation with Motion Focal Loss

Shijie Wang, Samaneh Azadi, Rohit Girdhar et al.

CVPR 2025arXiv:2412.16153
7
citations
#6375

EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events

Shuoyan Wei, Feng Li, Shengeng Tang et al.

CVPR 2025highlightarXiv:2505.04657
7
citations
#6376

Stochastic Process Learning via Operator Flow Matching

Yaozhong Shi, Zachary Ross, Domniki Asimaki et al.

NEURIPS 2025spotlightarXiv:2501.04126
7
citations
#6377

HawkBench: Investigating Resilience of RAG Methods on Stratified Information-Seeking Tasks

Hongjin Qian, Zheng Liu, Chao Gao et al.

NEURIPS 2025spotlightarXiv:2502.13465
7
citations
#6378

Enhancing Text-to-Image Diffusion Transformer via Split-Text Conditioning

Yu Zhang, Jialei Zhou, Xinchen Li et al.

NEURIPS 2025arXiv:2505.19261
7
citations
#6379

Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition

Andy Zou, Maxwell Lin, Eliot Jones et al.

NEURIPS 2025arXiv:2507.20526
7
citations
#6380

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

Jieming Cui, Tengyu Liu, Ziyu Meng et al.

CVPR 2025arXiv:2504.04191
7
citations
#6381

HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics

Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen et al.

ICCV 2025arXiv:2408.17443
7
citations
#6382

ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

Hongbo Liu, Jingwen He, Yi Jin et al.

NEURIPS 2025arXiv:2506.21356
7
citations
#6383

GOAL: Global-local Object Alignment Learning

Hyungyu Choi, Young Kyun Jang, Chanho Eom

CVPR 2025arXiv:2503.17782
7
citations
#6384

SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model

Shuhan Tan, John Wheatley Lambert, Hong Jeon et al.

CVPR 2025arXiv:2506.21976
7
citations
#6385

Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models

Yan Xie, Zequn Zeng, Hao Zhang et al.

CVPR 2025arXiv:2505.07209
7
citations
#6386

MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval

Reno Kriz, Kate Sanders, David Etter et al.

CVPR 2025arXiv:2410.11619
7
citations
#6387

Reconstructing Humans with a Biomechanically Accurate Skeleton

Yan Xia, Xiaowei Zhou, Etienne Vouga et al.

CVPR 2025arXiv:2503.21751
7
citations
#6388

FedAWA: Adaptive Optimization of Aggregation Weights in Federated Learning Using Client Vectors

Changlong Shi, He Zhao, Bingjie Zhang et al.

CVPR 2025arXiv:2503.15842
7
citations
#6389

UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image

Xingyu Liu, Gu Wang, Ruida Zhang et al.

CVPR 2025arXiv:2411.16106
7
citations
#6390

Reference-Based 3D-Aware Image Editing with Triplanes

Bahri Batuhan Bilecen, Yiğit Yalın, Ning Yu et al.

CVPR 2025highlightarXiv:2404.03632
7
citations
#6391

Agentic Plan Caching: Test-Time Memory for Fast and Cost-Efficient LLM Agents

Qizheng Zhang, Michael Wornow, Kunle Olukotun

NEURIPS 2025arXiv:2506.14852
7
citations
#6392

LookCloser: Frequency-aware Radiance Field for Tiny-Detail Scene

Xiaoyu Zhang, Weihong Pan, Chong Bao et al.

CVPR 2025arXiv:2503.18513
7
citations
#6393

Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models

Zichen Miao, WEI CHEN, Qiang Qiu

CVPR 2025highlightarXiv:2503.18337
7
citations
#6394

Neighborhood Self-Dissimilarity Attention for Medical Image Segmentation

Junren Chen, Rui Chen, Wei Wang et al.

NEURIPS 2025
7
citations
#6395

AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning

Xuecheng Wu, Heli Sun, Yifan Wang et al.

CVPR 2025
7
citations
#6396

Conformal Linguistic Calibration: Trading-off between Factuality and Specificity

Zhengping Jiang, Anqi Liu, Ben Van Durme

NEURIPS 2025arXiv:2502.19110
7
citations
#6397

HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization

Zitang Zhou, Ke Mei, Yu Lu et al.

CVPR 2025arXiv:2503.01725
7
citations
#6398

Training Language Models to Generate Quality Code with Program Analysis Feedback

Feng Yao, Zilong Wang, Liyuan Liu et al.

NEURIPS 2025arXiv:2505.22704
7
citations
#6399

Validating LLM-as-a-Judge Systems under Rating Indeterminacy

Luke Guerdan, Solon Barocas, Kenneth Holstein et al.

NEURIPS 2025arXiv:2503.05965
7
citations
#6400

Gradient-Guided Annealing for Domain Generalization

Aristotelis Ballas, Christos Diou

CVPR 2025highlightarXiv:2502.20162
7
citations