Most Cited 2025 "test-time compute" Papers

22,274 papers found • Page 33 of 112

#6401

GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation

Weihang Li, Hongli XU, Junwen Huang et al.

CVPR 2025 • arXiv:2502.04293
7
citations
#6402

Question-Aware Gaussian Experts for Audio-Visual Question Answering

Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.

CVPR 2025 • highlight • arXiv:2503.04459
7
citations
#6403

RelationField: Relate Anything in Radiance Fields

Sebastian Koch, Johanna Wald, Mirco Colosi et al.

CVPR 2025 • arXiv:2412.13652
7
citations
#6404

TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models

Ruidong Chen, honglin guo, Lanjun Wang et al.

ICCV 2025 • arXiv:2503.07389
7
citations
#6405

Boosting the visual interpretability of CLIP via adversarial fine-tuning

Shizhan Gong, Haoyu LEI, Qi Dou et al.

ICLR 2025
7
citations
#6406

LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields

Zhengqin Li, Dilin Wang, Ka chen et al.

CVPR 2025 • arXiv:2504.20026
7
citations
#6407

LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Feihong Yan, qingyan wei, Jiayi Tang et al.

ICCV 2025 • arXiv:2503.12450
7
citations
#6408

Do Computer Vision Foundation Models Learn the Low-level Characteristics of the Human Visual System?

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025 • highlight • arXiv:2502.20256
7
citations
#6409

Active Task Disambiguation with LLMs

Katarzyna Kobalczyk, Nicolás Astorga, Tennison Liu et al.

ICLR 2025 • arXiv:2502.04485
7
citations
#6410

MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent

Xinyao Liao, Xianfang Zeng, Liao Wang et al.

ICCV 2025 • arXiv:2502.03207
7
citations
#6411

ChatHuman: Chatting about 3D Humans with Tools

Jing Lin, Yao Feng, Weiyang Liu et al.

CVPR 2025 • arXiv:2405.04533
7
citations
#6412

OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images

Ziyue Huang, Yongchao Feng, Ziqi Liu et al.

ICCV 2025 • arXiv:2503.06146
7
citations
#6413

TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

Liangbin Xie, Daniil Pakhomov, Zhonghao Wang et al.

CVPR 2025 • arXiv:2504.00996
7
citations
#6414

S4M: S4 for multivariate time series forecasting with Missing values

Jing Peng, Meiqi Yang, Qiong Zhang et al.

ICLR 2025 • oral • arXiv:2503.00900
7
citations
#6415

MotiF: Making Text Count in Image Animation with Motion Focal Loss

Shijie Wang, Samaneh Azadi, Rohit Girdhar et al.

CVPR 2025 • arXiv:2412.16153
7
citations
#6416

Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing

Yudong Liu, Jingwei Sun, Yueqian Lin et al.

ICCV 2025 • arXiv:2503.10742
7
citations
#6417

Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective

Yiming Liu, Kezhao Liu, Yao Xiao et al.

ICLR 2025 • arXiv:2404.14309
7
citations
#6418

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Junli Liu, Qizhi Chen, Zhigang Wang et al.

ICCV 2025 • arXiv:2504.07836
7
citations
#6419

EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events

Shuoyan Wei, Feng Li, Shengeng Tang et al.

CVPR 2025 • highlight • arXiv:2505.04657
7
citations
#6420

CWNet: Causal Wavelet Network for Low-Light Image Enhancement

Tongshun Zhang, Pingping Liu, Yubing Lu et al.

ICCV 2025 • arXiv:2507.10689
7
citations
#6421

Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities

Liuyi Wang, Xinyuan Xia, Hui Zhao et al.

ICCV 2025 • arXiv:2507.13019
7
citations
#6422

Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing

Peihao Wang, Ruisi Cai, Yuehao Wang et al.

ICLR 2025 • arXiv:2501.00658
7
citations
#6423

GOAL: Global-local Object Alignment Learning

Hyungyu Choi, Young Kyun Jang, Chanho Eom

CVPR 2025 • arXiv:2503.17782
7
citations
#6424

GeoLoRA: Geometric integration for parameter efficient fine-tuning

Steffen Schotthöfer, Emanuele Zangrando, Gianluca Ceruti et al.

ICLR 2025 • arXiv:2410.18720
7
citations
#6425

PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer

Pierre-David Letourneau, Manish Singh, Hsin-Pai Cheng et al.

ICLR 2025 • arXiv:2407.11306
7
citations
#6426

Reconstructing Humans with a Biomechanically Accurate Skeleton

Yan Xia, Xiaowei Zhou, Etienne Vouga et al.

CVPR 2025 • arXiv:2503.21751
7
citations
#6427

UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image

Xingyu Liu, Gu Wang, Ruida Zhang et al.

CVPR 2025 • arXiv:2411.16106
7
citations
#6428

Reference-Based 3D-Aware Image Editing with Triplanes

Bahri Batuhan Bilecen, Yiğit Yalın, Ning Yu et al.

CVPR 2025 • highlight • arXiv:2404.03632
7
citations
#6429

LookCloser: Frequency-aware Radiance Field for Tiny-Detail Scene

Xiaoyu Zhang, Weihong Pan, Chong Bao et al.

CVPR 2025 • arXiv:2503.18513
7
citations
#6430

FlexGen: Flexible Multi-View Generation from Text and Image Inputs

Xinli Xu, Wenhang Ge, Jiantao Lin et al.

ICCV 2025 • arXiv:2410.10745
7
citations
#6431

Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference

Nadav Timor, Jonathan Mamou, Daniel Korat et al.

ICLR 2025 • arXiv:2405.14105
7
citations
#6432

HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization

Zitang Zhou, Ke Mei, Yu Lu et al.

CVPR 2025 • arXiv:2503.01725
7
citations
#6433

Gradient-Guided Annealing for Domain Generalization

Aristotelis Ballas, Christos Diou

CVPR 2025 • highlight • arXiv:2502.20162
7
citations
#6434

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.

CVPR 2025 • arXiv:2411.11909
7
citations
#6435

APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers

Zhuguanyu Wu, Jiayi Zhang, Jiaxin Chen et al.

CVPR 2025 • arXiv:2504.02508
7
citations
#6436

Co-op: Correspondence-based Novel Object Pose Estimation

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2025 • arXiv:2503.17731
7
citations
#6437

TruthPrInt: Mitigating Large Vision-Language Models Object Hallucination Via Latent Truthful-Guided Pre-Intervention

Jinhao Duan, Fei Kong, Hao Cheng et al.

ICCV 2025
7
citations
#6438

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation

Kaidong Zhang, Rongtao Xu, Ren Pengzhen et al.

ICCV 2025 • arXiv:2505.01709
7
citations
#6439

RigGS: Rigging of 3D Gaussians for Modeling Articulated Objects in Videos

Yuxin Yao, Zhi Deng, Junhui Hou

CVPR 2025 • arXiv:2503.16822
7
citations
#6440

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Byung-Kwan Lee, Ryo Hachiuma, Yu-Chiang Frank Wang et al.

CVPR 2025 • arXiv:2412.01822
7
citations
#6441

Can Generative Video Models Help Pose Estimation?

Ruojin Cai, Jason Y. Zhang, Philipp Henzler et al.

CVPR 2025 • highlight • arXiv:2412.16155
7
citations
#6442

RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations

Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.

CVPR 2025 • arXiv:2502.03629
7
citations
#6443

Causal Representation Learning from Multimodal Biomedical Observations

Yuewen Sun, Lingjing Kong, Guangyi Chen et al.

ICLR 2025 • arXiv:2411.06518
7
citations
#6444

Robustness Auditing for Linear Regression: To Singularity and Beyond

Ittai Rubinstein, Samuel Hopkins

ICLR 2025 • arXiv:2410.07916
7
citations
#6445

Integrative Decoding: Improving Factuality via Implicit Self-consistency

Yi Cheng, Xiao Liang, Yeyun Gong et al.

ICLR 2025 • arXiv:2410.01556
7
citations
#6446

FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification

Zhengrui Guo, Conghao Xiong, Jiabo MA et al.

CVPR 2025 • arXiv:2411.14743
7
citations
#6447

Vision-Language Models Can't See the Obvious

YASSER ABDELAZIZ DAHOU DJILALI, Ngoc Huynh, Phúc Lê Khắc et al.

ICCV 2025 • arXiv:2507.04741
7
citations
#6448

RI3D: Few-Shot Gaussian Splatting With Repair and Inpainting Diffusion Priors

Avinash Paliwal, xilong zhou, Wei Ye et al.

ICCV 2025 • arXiv:2503.10860
7
citations
#6449

Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations

Xiang Xu, Lingdong Kong, Song Wang et al.

ICCV 2025 • arXiv:2507.05260
7
citations
#6450

VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling

Hyojun Go, Byeongjun Park, Hyelin Nam et al.

ICCV 2025 • arXiv:2503.15855
7
citations
#6451

FlowR: Flowing from Sparse to Dense 3D Reconstructions

Tobias Fischer, Samuel Rota Bulò, Yung-Hsu Yang et al.

ICCV 2025 • highlight • arXiv:2504.01647
7
citations
#6452

ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches

Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif et al.

ICCV 2025 • arXiv:2311.12084
7
citations
#6453

Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences

Hyojin Bahng, Caroline Chan, Fredo Durand et al.

ICCV 2025 • arXiv:2506.02095
7
citations
#6454

HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics

Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen et al.

ICCV 2025 • arXiv:2408.17443
7
citations
#6455

DASH: Detection and Assessment of Systematic Hallucinations of VLMs

Maximilian Augustin, Yannic Neuhaus, Matthias Hein

ICCV 2025 • arXiv:2503.23573
7
citations
#6456

Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning

Jian Liu, Jing Xu, Song Guo et al.

NEURIPS 2025 • spotlight • arXiv:2505.16761
7
citations
#6457

DataRater: Meta-Learned Dataset Curation

Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.

NEURIPS 2025 • arXiv:2505.17895
7
citations
#6458

Hyperbolic Dataset Distillation

Wenyuan Li, Guang Li, Keisuke Maeda et al.

NEURIPS 2025 • arXiv:2505.24623
7
citations
#6459

Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models

Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.

NEURIPS 2025 • arXiv:2412.01784
7
citations
#6460

On the Relation between Rectified Flows and Optimal Transport

Johannes Hertrich, Antonin Chambolle, Julie Delon

NEURIPS 2025 • arXiv:2505.19712
7
citations
#6461

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.

NEURIPS 2025 • arXiv:2504.09702
7
citations
#6462

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

Yanjun Fu, Faisal Hamman, Sanghamitra Dutta

NEURIPS 2025 • arXiv:2506.01317
7
citations
#6463

Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation

Zihan Wang, Seungjun Lee, Gim Hee Lee

NEURIPS 2025 • oral • arXiv:2505.11383
7
citations
#6464

LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

Yuyao Zhang, Jinghao Li, Yu-Wing Tai

NEURIPS 2025 • arXiv:2504.00010
7
citations
#6465

Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning

Haolin Pan, Hongyu Lin, Haoran Luo et al.

NEURIPS 2025 • arXiv:2506.15701
7
citations
#6466

On Extending Direct Preference Optimization to Accommodate Ties

Jinghong Chen, Guangyu Yang, Weizhe Lin et al.

NEURIPS 2025 • arXiv:2409.17431
7
citations
#6467

Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness

Thomas Pethick, Wanyun Xie, Mete Erdogan et al.

NEURIPS 2025 • oral • arXiv:2506.01913
7
citations
#6468

Straight-Line Diffusion Model for Efficient 3D Molecular Generation

Yuyan Ni, Shikun Feng, Haohan Chi et al.

NEURIPS 2025 • arXiv:2503.02918
7
citations
#6469

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025 • arXiv:2507.21391
7
citations
#6470

Space Group Equivariant Crystal Diffusion

Rees Chang, Angela Pak, Alex Guerra et al.

NEURIPS 2025 • arXiv:2505.10994
7
citations
#6471

Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution

Zhanyi Sun, Shuran Song

NEURIPS 2025 • spotlight • arXiv:2508.05941
7
citations
#6472

Accelerating Diffusion Transformer via Gradient-Optimized Cache

Junxiang Qiu, Lin Liu, Shuo Wang et al.

ICCV 2025 • arXiv:2503.05156
7
citations
#6473

Importance-Based Token Merging for Efficient Image and Video Generation

Haoyu Wu, Jingyi Xu, Hieu Le et al.

ICCV 2025 • arXiv:2411.16720
7
citations
#6474

Universal generalization guarantees for Wasserstein distributionally robust models

Tam Le, Jerome Malick

ICLR 2025 • arXiv:2402.11981
7
citations
#6475

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

Yunheng Li, Jing Cheng, Shaoyong Jia et al.

NEURIPS 2025 • oral • arXiv:2509.18056
7
citations
#6476

LittleBit: Ultra Low-Bit Quantization via Latent Factorization

Banseok Lee, Dongkyu Kim, Youngcheon You et al.

NEURIPS 2025 • arXiv:2506.13771
7
citations
#6477

ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting

Ruijie Zhu, Mulin Yu, Linning Xu et al.

ICCV 2025 • arXiv:2507.15454
7
citations
#6478

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.

NEURIPS 2025 • arXiv:2505.07782
7
citations
#6479

Dense SAE Latents Are Features, Not Bugs

Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.

NEURIPS 2025 • arXiv:2506.15679
7
citations
#6480

HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene

Jianing Chen, Zehao Li, Yujun Cai et al.

NEURIPS 2025 • oral • arXiv:2506.09518
7
citations
#6481

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection

Yu Li, Xingyu Qiu, Yuqian Fu et al.

NEURIPS 2025 • arXiv:2506.05872
7
citations
#6482

NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI

Cosmin Bercea, Jun Li, Philipp Raffler et al.

NEURIPS 2025 • oral
7
citations
#6483

Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs

Zhangyin Feng, Qianglong Chen, Ning Lu et al.

NEURIPS 2025 • arXiv:2505.11227
7
citations
#6484

Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Zichen Liu, Yihao Meng, Hao Ouyang et al.

ICCV 2025 • arXiv:2404.11614
7
citations
#6485

BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models

Evan Antoniuk, Shehtab Zaman, Tal Ben-Nun et al.

NEURIPS 2025 • arXiv:2505.01912
7
citations
#6486

Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks

Hung Quang Nguyen, Hieu Nguyen, Anh Ta et al.

ICLR 2025 • arXiv:2407.10825
6
citations
#6487

CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation

Reza Abbasi, Ali Nazari, Aminreza Sefid et al.

CVPR 2025 • arXiv:2502.19842
6
citations
#6488

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Xuran Ma, Yexin Liu, Yaofu LIU et al.

ICCV 2025 • arXiv:2504.03140
6
citations
#6489

RGB-Event ISP: The Dataset and Benchmark

Yunfan LU, Yanlin Qian, Ziyang Rao et al.

ICLR 2025 • arXiv:2501.19129
6
citations
#6490

Cross-modal Ship Re-Identification via Optical and SAR Imagery: A Novel Dataset and Method

Han Wang, Shengyang Li, Jian Yang et al.

ICCV 2025 • arXiv:2506.22027
6
citations
#6491

mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework

Bingyi Liu, Jian Teng, Hongfei Xue et al.

ICCV 2025 • arXiv:2501.12263
6
citations
#6492

CompCap: Improving Multimodal Large Language Models with Composite Captions

Xiaohui Chen, Satya Narayan Shukla, Mahmoud Azab et al.

ICCV 2025 • arXiv:2412.05243
6
citations
#6493

TurboReg: TurboClique for Robust and Efficient Point Cloud Registration

Shaocheng Yan, Pengcheng Shi, Zhenjun Zhao et al.

ICCV 2025 • arXiv:2507.01439
6
citations
#6494

SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images

Yu Sheng, Jiajun Deng, Xinran Zhang et al.

ICCV 2025 • arXiv:2505.23044
6
citations
#6495

Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers

Chaehyun Kim, Heeseong Shin, Eunbeen Hong et al.

NEURIPS 2025
6
citations
#6496

PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring

Wang, Xiao Yang, Qingyong Hu et al.

NEURIPS 2025 • arXiv:2507.19172
6
citations
#6497

A Statistical Framework for Ranking LLM-based Chatbots

Siavash Ameli, Siyuan Zhuang, Ion Stoica et al.

ICLR 2025 • arXiv:2412.18407
6
citations
#6498

IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

Jiawen Qin, Haonan Yuan, Qingyun Sun et al.

ICLR 2025 • arXiv:2406.09870
6
citations
#6499

RadarSplat: Radar Gaussian Splatting for High-Fidelity Data Synthesis and 3D Reconstruction of Autonomous Driving Scenes

Pou-Chun Kung, Skanda Harisha, Ram Vasudevan et al.

ICCV 2025 • arXiv:2506.01379
6
citations
#6500

DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs

Zhenhailong Wang, Senthil Purushwalkam, Caiming Xiong et al.

NEURIPS 2025
6
citations
#6501

Uncertain Multimodal Intention and Emotion Understanding in the Wild

Qu Yang, QingHongYa Shi, Tongxin Wang et al.

CVPR 2025
6
citations
#6502

ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks

Santiago Cadena, Andrea Merlo, Emanuel Laude et al.

NEURIPS 2025 • arXiv:2506.19583
6
citations
#6503

Not-So-Optimal Transport Flows for 3D Point Cloud Generation

Ka-Hei Hui, Chao Liu, xiaohui zeng et al.

ICLR 2025 • arXiv:2502.12456
6
citations
#6504

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Yang Yue, Yulin Wang, Haojun Jiang et al.

CVPR 2025 • arXiv:2504.13065
6
citations
#6505

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Xiaoding Yuan, Shitao Tang, Kejie Li et al.

CVPR 2025 • arXiv:2407.07174
6
citations
#6506

Multitwine: Multi-Object Compositing with Text and Layout Control

Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.

CVPR 2025 • highlight • arXiv:2502.05165
6
citations
#6507

VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment

Darshana Saravanan, Varun Gupta, Darshan Singh S et al.

CVPR 2025 • arXiv:2406.10889
6
citations
#6508

Reinforcement learning with combinatorial actions for coupled restless bandits

Lily Xu, Bryan Wilder, Elias Khalil et al.

ICLR 2025 • arXiv:2503.01919
6
citations
#6509

Diffusion Bridge AutoEncoders for Unsupervised Representation Learning

Yeongmin Kim, Kwanghyeon Lee, Minsang Park et al.

ICLR 2025 • arXiv:2405.17111
6
citations
#6510

Reconciling Model Multiplicity for Downstream Decision Making

Ally Du, Dung Daniel Ngo, Steven Wu

ICLR 2025 • arXiv:2405.19667
6
citations
#6511

Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning

Julian Minder, Clément Dumas, Caden Juang et al.

NEURIPS 2025 • arXiv:2504.02922
6
citations
#6512

Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation

Ao Ma, Jiasong Feng, Ke Cao et al.

ICCV 2025 • arXiv:2508.08949
6
citations
#6513

Student-Informed Teacher Training

Nico Messikommer, Jiaxu Xing, Elie Aljalbout et al.

ICLR 2025 • arXiv:2412.09149
6
citations
#6514

FaceShield: Defending Facial Image against Deepfake Threats

Jaehwan Jeong, Sumin In, Sieun Kim et al.

ICCV 2025 • arXiv:2412.09921
6
citations
#6515

Improving Gaussian Splatting with Localized Points Management

Haosen Yang, Chenhao Zhang, Wenqing Wang et al.

CVPR 2025 • highlight • arXiv:2406.04251
6
citations
#6516

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, yixuan liu, Takashi Isobe et al.

CVPR 2025 • highlight • arXiv:2412.19637
6
citations
#6517

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

Fucai Ke, Vijay Kumar b g, Xingjian Leng et al.

ICCV 2025 • arXiv:2503.19263
6
citations
#6518

Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking

Changlun Li, Yao SHI, Chen Wang et al.

NEURIPS 2025 • arXiv:2505.11065
6
citations
#6519

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

Shutong Ding, Ke Hu, Shan Zhong et al.

NEURIPS 2025 • arXiv:2505.18763
6
citations
#6520

Rotation-Equivariant Self-Supervised Method in Image Denoising

Hanze Liu, Jiahong Fu, Qi Xie et al.

CVPR 2025 • arXiv:2505.19618
6
citations
#6521

On Speeding Up Language Model Evaluation

Jin Zhou, Christian Belardi, Ruihan Wu et al.

ICLR 2025 • arXiv:2407.06172
6
citations
#6522

Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

Xiao Li, Zekai Zhang, Xiang Li et al.

NEURIPS 2025 • arXiv:2502.05743
6
citations
#6523

Motion Modes: What Could Happen Next?

Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.

CVPR 2025 • arXiv:2412.00148
6
citations
#6524

Learning from Streaming Video with Orthogonal Gradients

Tengda Han, Dilara Gokay, Joseph Heyward et al.

CVPR 2025 • arXiv:2504.01961
6
citations
#6525

"Principal Components" Enable A New Language of Images

Xin Wen, Bingchen Zhao, Ismail Elezi et al.

ICCV 2025
6
citations
#6526

CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Xiangyang Luo, Ye Zhu, Yunfei Liu et al.

ICCV 2025 • arXiv:2507.02691
6
citations
#6527

DreamFuse: Adaptive Image Fusion with Diffusion Transformer

Junjia Huang, Pengxiang Yan, Jiyang Liu et al.

ICCV 2025 • arXiv:2504.08291
6
citations
#6528

AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs

Sanjoy Chowdhury, Hanan Gani, Nishit Anand et al.

ICCV 2025 • arXiv:2503.23219
6
citations
#6529

Divergence-enhanced Knowledge-guided Context Optimization for Visual-Language Prompt Tuning

Yilun Li, Miaomiao Cheng, Xu Han et al.

ICLR 2025
6
citations
#6530

Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild

Junhyeong Cho, Kim Youwang, Hunmin Yang et al.

CVPR 2025 • arXiv:2403.14539
6
citations
#6531

Driving View Synthesis on Free-form Trajectories with Generative Prior

Zeyu Yang, Zijie Pan, Yuankun Yang et al.

ICCV 2025 • arXiv:2412.01717
6
citations
#6532

Bayesian Experimental Design Via Contrastive Diffusions

Jacopo Iollo, Christophe Heinkelé, Pierre Alliez et al.

ICLR 2025 • arXiv:2410.11826
6
citations
#6533

Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection

Adyasha Maharana, Jaehong Yoon, Tianlong Chen et al.

ICLR 2025 • arXiv:2410.10636
6
citations
#6534

Secure and Confidential Certificates of Online Fairness

Olive Franzese, Ali Shahin Shamsabadi, Carter Luck et al.

NEURIPS 2025 • arXiv:2410.02777
6
citations
#6535

CDI: Copyrighted Data Identification in Diffusion Models

Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.

CVPR 2025 • arXiv:2411.12858
6
citations
#6536

Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods

Lise Le Boudec, Emmanuel de Bézenac, Louis Serrano et al.

ICLR 2025 • arXiv:2410.06820
6
citations
#6537

GIViC: Generative Implicit Video Compression

Ge Gao, Siyue Teng, Tianhao Peng et al.

ICCV 2025 • arXiv:2503.19604
6
citations
#6538

ZeroStereo: Zero-shot Stereo Matching from Single Images

Xianqi Wang, Hao Yang, Gangwei Xu et al.

ICCV 2025 • arXiv:2501.08654
6
citations
#6539

CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image

Wonseok Roh, Hwanhee Jung, JongWook Kim et al.

ICCV 2025 • arXiv:2412.12906
6
citations
#6540

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2025 • arXiv:2503.01980
6
citations
#6541

Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection

Marc-Antoine Lavoie, Anas Mahmoud, Steven L. Waslander

CVPR 2025 • arXiv:2503.23220
6
citations
#6542

RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Yuhan Li, Xianfeng Tan, Wenxiang Shang et al.

ICCV 2025 • highlight • arXiv:2411.19528
6
citations
#6543

Decoupling Layout from Glyph in Online Chinese Handwriting Generation

Minsi Ren, Yan-Ming Zhang, yi chen

ICLR 2025 • arXiv:2410.02309
6
citations
#6544

DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction

Ben Kaye, Tomas Jakab, Shangzhe Wu et al.

CVPR 2025 • highlight • arXiv:2412.04464
6
citations
#6545

Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis

Arpita Chowdhury, Dipanjyoti Paul, Zheda Mai et al.

CVPR 2025 • arXiv:2501.09333
6
citations
#6546

Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference

Weizhi Fei, Xueyan Niu, XIE GUOQING et al.

NEURIPS 2025 • spotlight • arXiv:2501.12959
6
citations
#6547

3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling

Qizhi Pei, Rui Yan, Kaiyuan Gao et al.

ICLR 2025 • arXiv:2406.05797
6
citations
#6548

Sparfels: Fast Reconstruction from Sparse Unposed Imagery

Shubhendu Jena, Amine Ouasfi, Mae Younes et al.

ICCV 2025 • highlight • arXiv:2505.02178
6
citations
#6549

Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment

ying ba, Tianyu Zhang, Yalong Bai et al.

ICCV 2025 • arXiv:2507.19002
6
citations
#6550

Curly Flow Matching for Learning Non-gradient Field Dynamics

Katarina Petrović, Lazar Atanackovic, Viggo Moro et al.

NEURIPS 2025 • arXiv:2510.26645
6
citations
#6551

LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits

Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.

NEURIPS 2025 • arXiv:2410.01735
6
citations
#6552

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Yudong Jin, Sida Peng, Xuan Wang et al.

ICCV 2025 • arXiv:2507.13344
6
citations
#6553

Dense-SfM: Structure from Motion with Dense Consistent Matching

JongMin Lee, Sungjoo Yoo

CVPR 2025 • arXiv:2501.14277
6
citations
#6554

Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement

Gaurav Patel, Christopher M. Sandino, Behrooz Mahasseni et al.

ICLR 2025 • arXiv:2410.02147
6
citations
#6555

Dual-Process Image Generation

Grace Luo, Jonathan Granskog, Aleksander Holynski et al.

ICCV 2025 • arXiv:2506.01955
6
citations
#6556

Multilevel Generative Samplers for Investigating Critical Phenomena

Ankur Singha, Elia Cellini, Kim A. Nicoli et al.

ICLR 2025 • arXiv:2503.08918
6
citations
#6557

E(3)-equivariant models cannot learn chirality: Field-based molecular generation

Alexandru Dumitrescu, Dani Korpela, Markus Heinonen et al.

ICLR 2025 • arXiv:2402.15864
6
citations
#6558

Execution Guided Line-by-Line Code Generation

Boaz Lavon, Shahar Katz, Lior Wolf

NEURIPS 2025 • arXiv:2506.10948
6
citations
#6559

On the Loss of Context Awareness in General Instruction Fine-tuning

Yihan Wang, Andrew Bai, Nanyun Peng et al.

NEURIPS 2025 • arXiv:2411.02688
6
citations
#6560

KAC: Kolmogorov-Arnold Classifier for Continual Learning

Yusong Hu, Zichen Liang, Fei Yang et al.

CVPR 2025 • highlight • arXiv:2503.21076
6
citations
#6561

Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks

Steffen Schotthöfer, Lexie Yang, Stefan Schnake

NEURIPS 2025 • oral • arXiv:2505.08022
6
citations
#6562

Deep Continuous-Time State-Space Models for Marked Event Sequences

Yuxin Chang, Alex Boyd, Cao (Danica) Xiao et al.

NEURIPS 2025 • oral • arXiv:2412.19634
6
citations
#6563

The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

Ho Kei Cheng, Alex Schwing

ICCV 2025 • arXiv:2503.10636
6
citations
#6564

SparseAlign: a Fully Sparse Framework for Cooperative Object Detection

Yunshuang Yuan, Yan Xia, Daniel Cremers et al.

CVPR 2025 • arXiv:2503.12982
6
citations
#6565

EVOS: Efficient Implicit Neural Training via EVOlutionary Selector

Weixiang Zhang, Shuzhao Xie, Chengwei Ren et al.

CVPR 2025 • arXiv:2412.10153
6
citations
#6566

Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding

Xiaoqian Shen, Wenxuan Zhang, Jun Chen et al.

NEURIPS 2025 • oral • arXiv:2510.14032
6
citations
#6567

Towards In-the-wild 3D Plane Reconstruction from a Single Image

Jiachen Liu, Rui Yu, Sili Chen et al.

CVPR 2025 • highlight • arXiv:2506.02493
6
citations
#6568

Augmented Mass-Spring Model for Real-Time Dense Hair Simulation

Jorge Herrera, Yi Zhou, Xin Sun et al.

ICCV 2025 • arXiv:2412.17144
6
citations
#6569

On the Expressive Power of Sparse Geometric MPNNs

Yonatan Sverdlov, Nadav Dym

ICLR 2025 • arXiv:2407.02025
6
citations
#6570

Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images

Jie Mei, Chenyu Lin, Yu Qiu et al.

CVPR 2025 • arXiv:2503.17261
6
citations
#6571

IGD: Token Decisiveness Modeling via Information Gain in LLMs for Personalized Recommendation

Zijie Lin, Yang Zhang, Xiaoyan Zhao et al.

NEURIPS 2025 • arXiv:2506.13229
6
citations
#6572

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Xianhang Li, Yanqing Liu, Haoqin Tu et al.

ICCV 2025 • arXiv:2505.04601
6
citations
#6573

Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics

Alexander Tyurin

ICLR 2025 • arXiv:2408.04929
6
citations
#6574

Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution

Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro

ICLR 2025
6
citations
#6575

Control-oriented Clustering of Visual Latent Representation

Han Qi, Haocheng Yin, Heng Yang

ICLR 2025 • arXiv:2410.05063
6
citations
#6576

Open-Vocabulary Octree-Graph for 3D Scene Understanding

Zhigang Wang, Yifei Su, Chenhui Li et al.

ICCV 2025 • arXiv:2411.16253
6
citations
#6577

Towards Realistic Example-based Modeling via 3D Gaussian Stitching

Xinyu Gao, Ziyi Yang, Bingchen Gong et al.

CVPR 2025 • arXiv:2408.15708
6
citations
#6578

Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation

Kaining Ying, Henghui Ding, Guangquan Jie et al.

ICCV 2025 • arXiv:2507.22886
6
citations
#6579

VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis

Zhifeng Wang, Renjiao Yi, Xin Wen et al.

CVPR 2025 • arXiv:2503.12758
6
citations
#6580

Ensembles of Low-Rank Expert Adapters

Yinghao Li, Vianne Gao, Chao Zhang et al.

ICLR 2025 • arXiv:2502.00089
6
citations
#6581

DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation

Runze Zhang, Guoguang Du, Xiaochuan Li et al.

ICCV 2025 • highlight • arXiv:2503.06053
6
citations
#6582

Mitigating Hallucination Through Theory-Consistent Symmetric Multimodal Preference Optimization

Wenqi Liu, Xuemeng Song, Jiaxi Li et al.

NEURIPS 2025 • arXiv:2506.11712
6
citations
#6583

Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models

Quan Zhang, Jinwei Fang, Rui Yuan et al.

CVPR 2025 • arXiv:2411.08466
6
citations
#6584

On Denoising Walking Videos for Gait Recognition

Dongyang Jin, Chao Fan, Jingzhe Ma et al.

CVPR 2025 • arXiv:2505.18582
6
citations
#6585

Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models

Ilgee Hong, Changlong Yu, Liang Qiu et al.

NEURIPS 2025 • arXiv:2505.16265
6
citations
#6586

Thought Communication in Multiagent Collaboration

Yujia Zheng, Zhuokai Zhao, Zijian Li et al.

NEURIPS 2025 • spotlight • arXiv:2510.20733
6
citations
#6587

TSAM: Temporal SAM Augmented with Multimodal Prompts for Referring Audio-Visual Segmentation

Abduljalil Radman, Jorma Laaksonen

CVPR 2025
6
citations
#6588

3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation

Gyeongrok Oh, Sung June Kim, Heeju Ko et al.

CVPR 2025 • arXiv:2503.15185
6
citations
#6589

Where, What, Why: Towards Explainable Driver Attention Prediction

Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao et al.

ICCV 2025 • highlight • arXiv:2506.23088
6
citations
#6590

Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning

yuzhuo dai, Jiaqi Jin, Zhibin Dong et al.

CVPR 2025 • arXiv:2505.11182
6
citations
#6591

Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving of Inequalities

Haoyu Zhao, Yihan Geng, Shange Tang et al.

NEURIPS 2025 • arXiv:2505.12680
6
citations
#6592

Hyperbolic Safety-Aware Vision-Language Models

Tobia Poppi, Tejaswi Kasarla, Pascal Mettes et al.

CVPR 2025 • highlight • arXiv:2503.12127
6
citations
#6593

SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL

Yue Gong, Chuan Lei, Xiao Qin et al.

NEURIPS 2025 • arXiv:2506.04494
6
citations
#6594

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Seo Hyun Kim, Sunwoo Hong, Hojung Jung et al.

NEURIPS 2025 • spotlight • arXiv:2511.05664
6
citations
#6595

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding

Yan Wang, Baoxiong Jia, Ziyu Zhu et al.

CVPR 2025 • arXiv:2504.19500
6
citations
#6596

FlexOLMo: Open Language Models for Flexible Data Use

Weijia Shi, Akshita Bhagia, Kevin Farhat et al.

NEURIPS 2025 • spotlight • arXiv:2507.07024
6
citations
#6597

SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystal Symmetry Classification Benchmark

Bin Cao, Yang Liu, Zinan Zheng et al.

ICLR 2025
6
citations
#6598

Generalizable, real-time neural decoding with hybrid state-space models

Avery Hee-Woon Ryoo, Nanda H Krishna, Ximeng Mao et al.

NEURIPS 2025 • arXiv:2506.05320
6
citations
#6599

Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach

Zechen Bai, Tianjun Xiao, Tong He et al.

ICLR 2025 • arXiv:2408.07249
6
citations
#6600

VALLR: Visual ASR Language Model for Lip Reading

Marshall Thomas, Edward Fish, Richard Bowden

ICCV 2025 • arXiv:2503.21408
6
citations