Most Cited 2025 "private estimators" Papers

22,274 papers found • Page 30 of 112

#5801

DistinctAD: Distinctive Audio Description Generation in Contexts

Bo Fang, Wenhao Wu, Qiangqiang Wu et al.

CVPR 2025 • highlight • arXiv:2411.18180
4
citations
#5802

The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation

Patrick Kahardipraja, Reduan Achtibat, Thomas Wiegand et al.

NEURIPS 2025 • arXiv:2505.15807
4
citations
#5803

BiLoRA: Almost-Orthogonal Parameter Spaces for Continual Learning

Hao Zhu, Yifei Zhang, Junhao Dong et al.

CVPR 2025
4
citations
#5804

Towards Generalizable Trajectory Prediction using Dual-Level Representation Learning and Adaptive Prompting

Kaouther Messaoud, Matthieu Cord, Alex Alahi

CVPR 2025 • arXiv:2501.04815
4
citations
#5805

SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization

Jianyu Lai, Sixiang Chen, Yunlong Lin et al.

CVPR 2025
4
citations
#5806

Understanding Contrastive Learning via Gaussian Mixture Models

Parikshit Bansal, Ali Kavis, Sujay Sanghavi

NEURIPS 2025
4
citations
#5807

AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement

J Rosser, Jakob Foerster

NEURIPS 2025 • spotlight • arXiv:2502.00757
4
citations
#5808

TrustMark: Robust Watermarking and Watermark Removal for Arbitrary Resolution Images

Tu Bui, Shruti Agarwal, John Collomosse

ICCV 2025
4
citations
#5809

Optimal Spectral Transitions in High-Dimensional Multi-Index Models

Leonardo Defilippis, Yatin Dandi, Pierre Mergny et al.

NEURIPS 2025 • arXiv:2502.02545
4
citations
#5810

DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

Leqi Shen, Guoqiang Gong, Tianxiang Hao et al.

CVPR 2025 • arXiv:2506.08887
4
citations
#5811

QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers

Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.

CVPR 2025 • highlight • arXiv:2503.19718
4
citations
#5812

Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation

Tiange Xiang, Kai Li, Chengjiang Long et al.

ICCV 2025 • arXiv:2503.15877
4
citations
#5813

Doubly Robust Alignment for Large Language Models

Erhan Xu, Kai Ye, Hongyi Zhou et al.

NEURIPS 2025 • arXiv:2506.01183
4
citations
#5814

LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion

Muchen Li, Sammy Christen, Chengde Wan et al.

CVPR 2025
4
citations
#5815

Towards foundational LiDAR world models with efficient latent flow matching

Tianran Liu, Shengwen Zhao, Nicholas Rhinehart

NEURIPS 2025 • arXiv:2506.23434
4
citations
#5816

ZeroVO: Visual Odometry with Minimal Assumptions

Lei Lai, Zekai Yin, Eshed Ohn-Bar

CVPR 2025 • arXiv:2506.08005
4
citations
#5817

Do different prompting methods yield a common task representation in language models?

Guy Davidson, Todd Gureckis, Brenden Lake et al.

NEURIPS 2025 • arXiv:2505.12075
4
citations
#5818

CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects

Huaijin Pi, Zhi Cen, Zhiyang Dou et al.

NEURIPS 2025 • arXiv:2505.21437
4
citations
#5819

Dual-Agent Optimization framework for Cross-Domain Few-Shot Segmentation

Zhaoyang Li, Yuan Wang, Wangkai Li et al.

CVPR 2025
4
citations
#5820

Self-Calibrated Variance-Stabilizing Transformations for Real-World Image Denoising

Sébastien Herbreteau, Michael Unser

ICCV 2025 • arXiv:2407.17399
4
citations
#5821

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning

Kelin Yu, Sheng Zhang, Harshit Soora et al.

ICCV 2025 • arXiv:2508.11049
4
citations
#5822

Enhanced then Progressive Fusion with View Graph for Multi-View Clustering

Zhibin Dong, Meng Liu, Siwei Wang et al.

CVPR 2025
4
citations
#5823

Unity in Diversity: Video Editing via Gradient-Latent Purification

Junyu Gao, Kunlin Yang, Xuan Yao et al.

CVPR 2025
4
citations
#5824

Lie Detector: Unified Backdoor Detection via Cross-Examination Framework

Xuan Wang, Siyuan Liang, Dongping Liao et al.

NEURIPS 2025 • arXiv:2503.16872
4
citations
#5825

Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs

Jie Ma, Ning Qu, Zhitao Gao et al.

NEURIPS 2025 • arXiv:2505.15210
4
citations
#5826

Capturing Individual Human Preferences with Reward Features

Andre Barreto, Vincent Dumoulin, Yiran Mao et al.

NEURIPS 2025 • arXiv:2503.17338
4
citations
#5827

BG-Triangle: Bézier Gaussian Triangle for 3D Vectorization and Rendering

Minye Wu, Haizhao Dai, Kaixin Yao et al.

CVPR 2025 • arXiv:2503.13961
4
citations
#5828

Feedback Guidance of Diffusion Models

Felix Koulischer, Florian Handke, Johannes Deleu et al.

NEURIPS 2025 • arXiv:2506.06085
4
citations
#5829

Robust-MVTON: Learning Cross-Pose Feature Alignment and Fusion for Robust Multi-View Virtual Try-On

Nannan Zhang, Yijiang Li, Dong Du et al.

CVPR 2025
4
citations
#5830

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Wei Li, Pin-Yu Chen, Sijia Liu et al.

CVPR 2025 • arXiv:2406.05826
4
citations
#5831

TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving

Yanping Fu, Xinyuan Liu, Tianyu Li et al.

NEURIPS 2025 • arXiv:2505.17771
4
citations
#5832

Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components

Abel Jansma

NEURIPS 2025 • spotlight • arXiv:2501.11447
4
citations
#5833

UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation

Chaitanya Patel, Hiroki Nakamura, Yuta Kyuragi et al.

ICCV 2025 • arXiv:2508.01126
4
citations
#5834

CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Xiangyang Luo, Ye Zhu, Yunfei Liu et al.

ICCV 2025 • arXiv:2507.02691
4
citations
#5835

Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

Hao Kang, Qingru Zhang, Han Cai et al.

NEURIPS 2025 • spotlight • arXiv:2505.19481
4
citations
#5836

Knowledge Distillation with Refined Logits

Wujie Sun, Defang Chen, Siwei Lyu et al.

ICCV 2025 • arXiv:2408.07703
4
citations
#5837

GECKO: Gigapixel Vision-Concept Contrastive Pretraining in Histopathology

Saarthak Kapse, Pushpak Pati, Srikar Yellapragada et al.

ICCV 2025 • highlight • arXiv:2504.01009
4
citations
#5838

$\texttt{STRCMP}$: Integrating Graph Structural Priors with Language Models for Combinatorial Optimization

Xijun Li, Jiexiang Yang, Jinghao Wang et al.

NEURIPS 2025
4
citations
#5839

PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs

Teng Zhou, Xiaoyu Zhang, Yongchuan Tang

ICCV 2025 • highlight • arXiv:2411.15867
4
citations
#5840

MIRA: Medical Time Series Foundation Model for Real-World Health Data

Hao Li, Bowen Deng, Chang Xu et al.

NEURIPS 2025 • oral • arXiv:2506.07584
4
citations
#5841

Joint Relational Database Generation via Graph-Conditional Diffusion Models

Mohamed Amine Ketata, David Lüdke, Leo Schwinn et al.

NEURIPS 2025 • arXiv:2505.16527
4
citations
#5842

3D Dental Model Segmentation with Geometrical Boundary Preserving

Shufan Xi, Zexian Liu, Junlin Chang et al.

CVPR 2025 • arXiv:2503.23702
4
citations
#5843

Statistical inference for Linear Stochastic Approximation with Markovian Noise

Sergey Samsonov, Marina Sheshukova, Eric Moulines et al.

NEURIPS 2025 • arXiv:2505.19102
4
citations
#5844

Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs

Yi Hu, Shijia Kang, Haotong Yang et al.

NEURIPS 2025 • arXiv:2502.11525
4
citations
#5845

CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation

Leon Sick, Dominik Engel, Sebastian Hartwig et al.

ICCV 2025 • arXiv:2411.16319
4
citations
#5846

Small Singular Values Matter: A Random Matrix Analysis of Transformer Models

Max Staats, Matthias Thamm, Bernd Rosenow

NEURIPS 2025 • arXiv:2410.17770
4
citations
#5847

Neural Hierarchical Decomposition for Single Image Plant Modeling

Zhihao Liu, Zhanglin Cheng, Naoto Yokoya

CVPR 2025
4
citations
#5848

BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions

Wonyong Seo, Jihyong Oh, Munchurl Kim

CVPR 2025 • arXiv:2412.11365
4
citations
#5849

GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning

Haonan Yuan, Qingyun Sun, Junhua Shi et al.

NEURIPS 2025 • arXiv:2511.05592
4
citations
#5850

ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction

Yuejiao Su, Yi Wang, Qiongyang Hu et al.

CVPR 2025 • arXiv:2504.01472
4
citations
#5851

Tight Lower Bounds and Improved Convergence in Performative Prediction

Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami et al.

NEURIPS 2025 • arXiv:2412.03671
4
citations
#5852

MARBLE: Material Recomposition and Blending in CLIP-Space

Ta-Ying Cheng, Prafull Sharma, Mark Boss et al.

CVPR 2025 • arXiv:2506.05313
4
citations
#5853

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Joya Chen, Yiqi Lin, Ziyun Zeng et al.

CVPR 2025 • arXiv:2504.16030
4
citations
#5854

DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation

Amin Karimi, Charalambos Poullis

CVPR 2025 • arXiv:2503.04006
4
citations
#5855

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025 • highlight • arXiv:2412.00782
4
citations
#5856

Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features

Yuanbo Xiangli, Ruojin Cai, Hanyu Chen et al.

CVPR 2025 • highlight • arXiv:2412.05826
4
citations
#5857

MixerMDM: Learnable Composition of Human Motion Diffusion Models

Pablo Ruiz-Ponce, German Barquero, Cristina Palmero et al.

CVPR 2025 • arXiv:2504.01019
4
citations
#5858

PolarFree: Polarization-based Reflection-Free Imaging

Mingde Yao, Menglu Wang, King Man Tam et al.

CVPR 2025 • arXiv:2503.18055
4
citations
#5859

OmniStereo: Real-time Omnidirectional Depth Estimation with Multiview Fisheye Cameras

Jiaxi Deng, Yushen Wang, Haitao Meng et al.

CVPR 2025
4
citations
#5860

H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Zhanbo Huang, Xiaoming Liu, Yu Kong

CVPR 2025 • highlight • arXiv:2504.10676
4
citations
#5861

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Shengqiong Wu, Hao Fei, Jingkang Yang et al.

CVPR 2025 • highlight • arXiv:2503.15019
4
citations
#5862

Scalable and Cost-Efficient de Novo Template-Based Molecular Generation

Piotr Gaiński, Oussama Boussif, Andrei Rekesh et al.

NEURIPS 2025 • arXiv:2506.19865
4
citations
#5863

SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

Shaoan Xie, Lingjing Kong, Yujia Zheng et al.

CVPR 2025 • highlight • arXiv:2507.22264
4
citations
#5864

When Thinking Drifts: Evidential Grounding for Robust Video Reasoning

Romy Luo, Zihui (Sherry) Xue, Alex Dimakis et al.

NEURIPS 2025 • arXiv:2510.06077
4
citations
#5865

SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens

Chi Su, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2025 • arXiv:2411.19824
4
citations
#5866

Vision Transformers with Self-Distilled Registers

Zipeng Yan, Yinjie Chen, Chong Zhou et al.

NEURIPS 2025 • spotlight • arXiv:2505.21501
4
citations
#5867

Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?

Renshuai Tao, Haoyu Wang, Yuzhe Guo et al.

CVPR 2025 • arXiv:2411.18082
4
citations
#5868

Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation

Sungmin Cha, Kyunghyun Cho

NEURIPS 2025 • arXiv:2505.13111
4
citations
#5869

Protein Design with Dynamic Protein Vocabulary

Nuowei Liu, Jiahao Kuang, Yanting Liu et al.

NEURIPS 2025 • spotlight • arXiv:2505.18966
4
citations
#5870

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing

Joonghyuk Shin, Alchan Hwang, Yujin Kim et al.

ICCV 2025 • arXiv:2508.07519
4
citations
#5871

DINGO: Constrained Inference for Diffusion LLMs

Tarun Suresh, Debangshu Banerjee, Shubham Ugare et al.

NEURIPS 2025 • arXiv:2505.23061
4
citations
#5872

UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning

Ye Liu, Zongyang Ma, Junfu Pu et al.

NEURIPS 2025 • arXiv:2509.18094
4
citations
#5873

MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion

Zihan Wang, Jeff Tan, Tarasha Khurana et al.

ICCV 2025 • arXiv:2507.23782
4
citations
#5874

Ges3ViG: Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding

Atharv Mahesh Mane, Dulanga Weerakoon, Vigneshwaran Subbaraju et al.

CVPR 2025 • arXiv:2504.09623
4
citations
#5875

Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting

Jingyi Xu, Xieyuanli Chen, Junyi Ma et al.

CVPR 2025 • arXiv:2411.14169
4
citations
#5876

TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval

Jialin Chen, Ziyu Zhao, Gaukhar Nurbek et al.

NEURIPS 2025 • oral • arXiv:2506.09114
4
citations
#5877

Preference Learning with Lie Detectors can Induce Honesty or Evasion

Chris Cundy, Adam Gleave

NEURIPS 2025 • arXiv:2505.13787
4
citations
#5878

Scaling Laws For Scalable Oversight

Joshua Engels, David Baek, Subhash Kantamneni et al.

NEURIPS 2025 • spotlight • arXiv:2504.18530
4
citations
#5879

Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia

Chandler Smith, Marwa Abdulhai, Manfred Díaz et al.

NEURIPS 2025 • oral • arXiv:2512.03318
4
citations
#5880

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Yecheng Wu, Han Cai, Junyu Chen et al.

ICCV 2025 • arXiv:2507.04947
4
citations
#5881

OccluGaussian: Occlusion-Aware Gaussian Splatting for Large Scene Reconstruction and Rendering

Shiyong Liu, Xiao Tang, Zhihao Li et al.

ICCV 2025 • arXiv:2503.16177
4
citations
#5882

STiL: Semi-supervised Tabular-Image Learning for Comprehensive Task-Relevant Information Exploration in Multimodal Classification

Siyi Du, Xinzhe Luo, Declan O'Regan et al.

CVPR 2025 • arXiv:2503.06277
4
citations
#5883

LidarGait++: Learning Local Features and Size Awareness from LiDAR Point Clouds for 3D Gait Recognition

Chuanfu Shen, Rui Wang, Lixin Duan et al.

CVPR 2025
4
citations
#5884

Robust Hallucination Detection in LLMs via Adaptive Token Selection

Mengjia Niu, Hamed Haddadi, Guansong Pang

NEURIPS 2025 • arXiv:2504.07863
4
citations
#5885

DuCos: Duality Constrained Depth Super-Resolution via Foundation Model

Zhiqiang Yan, Zhengxue Wang, Haoye Dong et al.

ICCV 2025 • arXiv:2503.04171
4
citations
#5886

ZigzagPointMamba: Spatial-Semantic Mamba for Point Cloud Understanding

Linshuang Diao, Sensen Song, Yurong Qian et al.

NEURIPS 2025
4
citations
#5887

CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems

Aniket Rege, Zinnia Nie, Unmesh Raskar et al.

ICCV 2025 • arXiv:2506.08071
4
citations
#5888

Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations

Jeonghyeon Kim, Sangheum Hwang

CVPR 2025 • arXiv:2503.18817
4
citations
#5889

TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser et al.

CVPR 2025 • highlight • arXiv:2411.15580
4
citations
#5890

GraphMimic: Graph-to-Graphs Generative Modeling from Videos for Policy Learning

Guangyan Chen, Te Cui, Meiling Wang et al.

CVPR 2025
4
citations
#5891

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

Chengyue Huang, Brisa Maneechotesuwan, Shivang Chopra et al.

CVPR 2025 • arXiv:2505.21755
4
citations
#5892

HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis

Timo Teufel, Xilong Zhou, Umar Iqbal et al.

ICCV 2025 • arXiv:2508.09137
4
citations
#5893

GASP: Gaussian Avatars with Synthetic Priors

Jack Saunders, Charlie Hewitt, Yanan Jian et al.

CVPR 2025 • arXiv:2412.07739
4
citations
#5894

Anomize: Better Open Vocabulary Video Anomaly Detection

Fei Li, Wenxuan Liu, Jingjing Chen et al.

CVPR 2025 • arXiv:2503.18094
4
citations
#5895

VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance

Mohammad Reza Taesiri, Abhijay Ghildyal, Saman Zadtootaghaj et al.

NEURIPS 2025 • arXiv:2505.15952
4
citations
#5896

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2025 • arXiv:2503.01980
4
citations
#5897

RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations

Savya Khosla, Sethuraman T V, Alexander G. Schwing et al.

CVPR 2025 • arXiv:2412.01826
4
citations
#5898

PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Sinisa Stekovic, Arslan Artykov, Stefan Ainetter et al.

CVPR 2025 • arXiv:2404.10620
4
citations
#5899

Synergistic Prompting for Robust Visual Recognition with Missing Modalities

Zhihui Zhang, Luanyuan Dai, Qika Lin et al.

ICCV 2025 • arXiv:2507.07802
4
citations
#5900

Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models

Zekai Zhao, Qi Liu, Kun Zhou et al.

NEURIPS 2025 • spotlight • arXiv:2505.17697
4
citations
#5901

GENIUS: A Generative Framework for Universal Multimodal Search

Sungyeon Kim, Xinliang Zhu, Xiaofan Lin et al.

CVPR 2025 • arXiv:2503.19868
4
citations
#5902

Common3D: Self-Supervised Learning of 3D Morphable Models for Common Objects in Neural Feature Space

Leonhard Sommer, Olaf Dünkel, Christian Theobalt et al.

CVPR 2025 • arXiv:2504.21749
4
citations
#5903

CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models

Gaoyang Zhang, Bingtao Fu, Qingnan Fan et al.

ICCV 2025 • arXiv:2412.13195
4
citations
#5904

Brain-like Variational Inference

Hadi Vafaii, Dekel Galor, Jacob Yates

NEURIPS 2025 • arXiv:2410.19315
4
citations
#5905

Towards A Generalist Code Embedding Model Based On Massive Data Synthesis

Chaofan Li, Jianlyu Chen, Yingxia Shao et al.

NEURIPS 2025 • arXiv:2505.12697
4
citations
#5906

Multi-Token Prediction Needs Registers

Anastasios Gerontopoulos, Spyridon Gidaris, Nikos Komodakis

NEURIPS 2025 • arXiv:2505.10518
4
citations
#5907

Head Pursuit: Probing Attention Specialization in Multimodal Transformers

Lorenzo Basile, Valentino Maiorca, Diego Doimo et al.

NEURIPS 2025 • spotlight • arXiv:2510.21518
4
citations
#5908

Refusal Direction is Universal Across Safety-Aligned Languages

Xinpeng Wang, Mingyang Wang, Yihong Liu et al.

NEURIPS 2025 • arXiv:2505.17306
4
citations
#5909

🎧MOSPA: Human Motion Generation Driven by Spatial Audio

Shuyang Xu, Zhiyang Dou, Mingyi Shi et al.

NEURIPS 2025 • spotlight • arXiv:2507.11949
4
citations
#5910

Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones

Parsa Mirtaheri, Ezra Edelman, Samy Jelassi et al.

NEURIPS 2025 • arXiv:2505.21825
4
citations
#5911

Test-Time Visual In-Context Tuning

Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr et al.

CVPR 2025 • arXiv:2503.21777
4
citations
#5912

LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model

Xi Wang, Hongzhen Li, Heng Fang et al.

CVPR 2025 • arXiv:2412.11519
4
citations
#5913

C4D: 4D Made from 3D through Dual Correspondences

Shizun Wang, Zhenxiang Jiang, Xingyi Yang et al.

ICCV 2025 • arXiv:2510.14960
4
citations
#5914

FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting

Fangyu Wu, Yuhao Chen

CVPR 2025 • arXiv:2411.12089
4
citations
#5915

GAP: Gaussianize Any Point Clouds with Text Guidance

Weiqi Zhang, Junsheng Zhou, Haotian Geng et al.

ICCV 2025 • arXiv:2508.05631
4
citations
#5916

Visual-Oriented Fine-Grained Knowledge Editing for MultiModal Large Language Models

Zhen Zeng, Leijiang Gu, Xun Yang et al.

ICCV 2025 • arXiv:2411.12790
4
citations
#5917

Not All Frame Features Are Equal: Video-to-4D Generation via Decoupling Dynamic-Static Features

Liying Yang, Chen Liu, Zhenwei Zhu et al.

ICCV 2025 • highlight • arXiv:2502.08377
4
citations
#5918

Synthetic-powered predictive inference

Meshi Bashari, Roy Maor Lotan, Yonghoon Lee et al.

NEURIPS 2025 • arXiv:2505.13432
4
citations
#5919

Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations

Marcin Przewięźlikowski, Randall Balestriero, Wojciech Jasiński et al.

ICCV 2025 • arXiv:2412.03215
4
citations
#5920

AGC-Drive: A Large-Scale Dataset for Real-World Aerial-Ground Collaboration in Driving Scenarios

Yunhao Hou, Bochao Zou, Min Zhang et al.

NEURIPS 2025 • oral • arXiv:2506.16371
4
citations
#5921

Unveiling Concept Attribution in Diffusion Models

Nguyen Hung-Quang, Hoang Phan, Khoa D Doan

NEURIPS 2025 • arXiv:2412.02542
4
citations
#5922

VOVTrack: Exploring the Potentiality in Raw Videos for Open-Vocabulary Multi-Object Tracking

Zekun Qian, Ruize Han, Junhui Hou et al.

ICCV 2025
4
citations
#5923

Position: Bridge the Gaps between Machine Unlearning and AI Regulation

Bill Marino, Meghdad Kurmanji, Nicholas Lane

NEURIPS 2025 • oral • arXiv:2502.12430
4
citations
#5924

Generating 3D-Consistent Videos from Unposed Internet Photos

Gene Chou, Kai Zhang, Sai Bi et al.

CVPR 2025 • arXiv:2411.13549
4
citations
#5925

Event Fields: Capturing Light Fields at High Speed, Resolution, and Dynamic Range

Ziyuan Qu, Zihao Zou, Vivek Boominathan et al.

CVPR 2025 • highlight • arXiv:2412.06191
4
citations
#5926

Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees

Fu Luo, Yaoxin Wu, Zhi Zheng et al.

NEURIPS 2025 • arXiv:2505.24627
4
citations
#5927

OpenSDI: Spotting Diffusion-Generated Images in the Open World

Yabin Wang, Zhiwu Huang, Xiaopeng Hong

CVPR 2025 • arXiv:2503.19653
4
citations
#5928

Can LLMs Outshine Conventional Recommenders? A Comparative Evaluation

Qijiong Liu, Jieming Zhu, Lu Fan et al.

NEURIPS 2025 • arXiv:2503.05493
4
citations
#5929

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Kai Liu, Jungang Li, Yuchong Sun et al.

NEURIPS 2025 • oral • arXiv:2512.22905
4
citations
#5930

STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving

Christian Fruhwirth-Reisinger, Dušan Malić, Wei Lin et al.

NEURIPS 2025 • oral • arXiv:2506.06218
4
citations
#5931

Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models

Yuchen Liang, Renxiang Huang, Lifeng Lai et al.

NEURIPS 2025 • arXiv:2506.02318
4
citations
#5932

Learning to Integrate Diffusion ODEs by Averaging the Derivatives

Wenze Liu, Xiangyu Yue

NEURIPS 2025 • arXiv:2505.14502
4
citations
#5933

On the Out-Of-Distribution Generalization of Large Multimodal Models

Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.

CVPR 2025
4
citations
#5934

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

Yunlong Tang, Pinxin Liu, Mingqian Feng et al.

NEURIPS 2025 • arXiv:2505.20426
4
citations
#5935

Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images

Yasamin Medghalchi, Moein Heidari, Clayton Allard et al.

CVPR 2025 • arXiv:2412.09910
4
citations
#5936

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Sihan Yang, Runsen Xu, Chenhang Cui et al.

ICCV 2025 • arXiv:2508.05211
4
citations
#5937

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Vishnu Sarukkai, Zhiqiang Xie, Kayvon Fatahalian

NEURIPS 2025 • arXiv:2505.00234
4
citations
#5938

VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness

SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim et al.

CVPR 2025 • arXiv:2503.16406
4
citations
#5939

Humans as a Calibration Pattern: Dynamic 3D Scene Reconstruction from Unsynchronized and Uncalibrated Videos

Changwoon Choi, Jeongjun Kim, Geonho Cha et al.

ICCV 2025 • arXiv:2412.19089
4
citations
#5940

SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity

Ke Ma, Jiaqi Tang, Bin Guo et al.

CVPR 2025 • highlight • arXiv:2503.20354
4
citations
#5941

FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models

Yan Gao, Massimo R. Scamarcia, Javier Fernandez-Marques et al.

NEURIPS 2025 • arXiv:2506.02961
4
citations
#5942

IDEA-Bench: How Far are Generative Models from Professional Designing?

Chen Liang, Lianghua Huang, Jingwu Fang et al.

CVPR 2025 • arXiv:2412.11767
4
citations
#5943

Flatten Graphs as Sequences: Transformers are Scalable Graph Generators

Dexiong Chen, Markus Krimmel, Karsten Borgwardt

NEURIPS 2025 • arXiv:2502.02216
4
citations
#5944

Enhancing Dataset Distillation via Non-Critical Region Refinement

Minh-Tuan Tran, Trung Le, Xuan-May Le et al.

CVPR 2025 • arXiv:2503.18267
4
citations
#5945

PLEIADES: Building Temporal Kernels with Orthogonal Polynomials

Yan Ru Pei, Olivier Coenen

NEURIPS 2025 • oral • arXiv:2405.12179
4
citations
#5946

Rethinking Multi-modal Object Detection from the Perspective of Mono-Modality Feature Learning

Tianyi Zhao, Boyang Liu, Yanglei Gao et al.

ICCV 2025 • arXiv:2503.11780
4
citations
#5947

EconGym: A Scalable AI Testbed with Diverse Economic Tasks

Qirui Mi, Qipeng Yang, Zijun Fan et al.

NEURIPS 2025 • arXiv:2506.12110
4
citations
#5948

Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration

Haipeng Fang, Sheng Tang, Juan Cao et al.

CVPR 2025 • arXiv:2505.11707
4
citations
#5949

DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction

Rui Wang, Quentin Lohmeyer, Mirko Meboldt et al.

ICCV 2025 • arXiv:2503.13176
4
citations
#5950

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

Yifan Shen, Yuanzhe Liu, Jingyuan Zhu et al.

NEURIPS 2025 • arXiv:2506.21656
4
citations
#5951

Breaking the Discretization Barrier of Continuous Physics Simulation Learning

Fan Xu, Hao Wu, Nan Wang et al.

NEURIPS 2025 • oral • arXiv:2509.17955
4
citations
#5952

Rethinking Tokenized Graph Transformers for Node Classification

Jinsong Chen, Chenyang Li, Gaichao Li et al.

NEURIPS 2025 • arXiv:2502.08101
4
citations
#5953

Conformal Prediction for Ensembles: Improving Efficiency via Score-Based Aggregation

Yash Patel, Eduardo Ochoa Rivera, Ambuj Tewari

NEURIPS 2025 • arXiv:2405.16246
4
citations
#5954

Relation3D: Enhancing Relation Modeling for Point Cloud Instance Segmentation

Edward Loo, Jiacheng Deng

CVPR 2025 • arXiv:2506.17891
4
citations
#5955

Auto-Vocabulary Semantic Segmentation

Osman Ülger, Maksymilian Kulicki, Yuki Asano et al.

ICCV 2025 • arXiv:2312.04539
4
citations
#5956

Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining

Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao et al.

CVPR 2025 • arXiv:2503.18703
4
citations
#5957

Language Models can Self-Improve at State-Value Estimation for Better Search

Ethan Mendes, Alan Ritter

NEURIPS 2025 • spotlight • arXiv:2503.02878
4
citations
#5958

Video Motion Graphs

Haiyang Liu, Zhan Xu, Fating Hong et al.

ICCV 2025 • highlight • arXiv:2503.20218
4
citations
#5959

Dynamic Risk Assessments for Offensive Cybersecurity Agents

Boyi Wei, Benedikt Stroebl, Jiacen Xu et al.

NEURIPS 2025 • arXiv:2505.18384
4
citations
#5960

Hints of Prompt: Enhancing Visual Representation for Multimodal LLMs in Autonomous Driving

Hao Zhou, Zhanning Gao, Zhili Chen et al.

ICCV 2025 • arXiv:2411.13076
4
citations
#5961

Reconstructing In-the-Wild Open-Vocabulary Human-Object Interactions

Boran Wen, Dingbang Huang, Zichen Zhang et al.

CVPR 2025 • arXiv:2503.15898
4
citations
#5962

TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning In Text-to-Image Models

Teng-Fang Hsiao, Bo-Kai Ruan, Yi-Lun Wu et al.

ICCV 2025 • arXiv:2503.15283
4
citations
#5963

MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures

Lucas Morin, Valery Weber, Ahmed Nassar et al.

CVPR 2025 • arXiv:2503.16096
4
citations
#5964

Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime

Amit Attia, Matan Schliserman, Uri Sherman et al.

NEURIPS 2025 • arXiv:2507.11274
4
citations
#5965

MagicHOI: Leveraging 3D Priors for Accurate Hand-object Reconstruction from Short Monocular Video Clips

Shibo Wang, Haonan He, Maria Parelli et al.

ICCV 2025 • arXiv:2508.05506
4
citations
#5966

ReRAW: RGB-to-RAW Image Reconstruction via Stratified Sampling for Efficient Object Detection on the Edge

Radu Berdan, Beril Besbinar, Christoph Reinders et al.

CVPR 2025 • arXiv:2503.03782
4
citations
#5967

Multi-modal Medical Diagnosis via Large-small Model Collaboration

Wanyi Chen, Zihua Zhao, Jiangchao Yao et al.

CVPR 2025
4
citations
#5968

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

Shuangkang Fang, I-Chao Shen, Yufeng Wang et al.

ICCV 2025 • highlight • arXiv:2508.01242
4
citations
#5969

Always Tell Me The Odds: Fine-grained Conditional Probability Estimation

Liaoyaqi Wang, Zhengping Jiang, Anqi Liu et al.

COLM 2025 • paper • arXiv:2505.01595
4
citations
#5970

Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective

Weijie Xu, Yiwen Wang, Chi Xue et al.

COLM 2025 • paper • arXiv:2506.19028
4
citations
#5971

Focal-SAM: Focal Sharpness-Aware Minimization for Long-Tailed Classification

Sicong Li, Qianqian Xu, Zhiyong Yang et al.

ICML 2025 • arXiv:2505.01660
4
citations
#5972

Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective

Jiawei Huang, Bingcong Li, Christoph Dann et al.

ICML 2025 • arXiv:2502.19255
4
citations
#5973

Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs

Dongyang Fan, Vinko Sabolčec, Matin Ansaripour et al.

COLM 2025 • paper
4
citations
#5974

Transformative or Conservative? Conservation laws for ResNets and Transformers

Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré

ICML 2025 • oral • arXiv:2506.06194
4
citations
#5975

Towards Universal Offline Black-Box Optimization via Learning Language Model Embeddings

Rong-Xi Tan, Ming Chen, Ke Xue et al.

ICML 2025 • arXiv:2506.07109
4
citations
#5976

Overcoming Vocabulary Constraints with Pixel-level Fallback

Jonas F. Lotz, Hendra Setiawan, Stephan Peitz et al.

COLM 2025 • paper • arXiv:2504.02122
4
citations
#5977

Efficient Parallel Training Methods for Spiking Neural Networks with Constant Time Complexity

Wanjin Feng, Xingyu Gao, Wenqian Du et al.

ICML 2025 • arXiv:2506.12087
4
citations
#5978

X-Hacking: The Threat of Misguided AutoML

Rahul Sharma, Sumantrak Mukherjee, Andrea Šipka et al.

ICML 2025
4
citations
#5979

Differential Privacy Under Class Imbalance: Methods and Empirical Insights

Lucas Rosenblatt, Yuliia Lut, Ethan Turok et al.

ICML 2025 • arXiv:2411.05733
4
citations
#5980

ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization

Wenhao Shen, Wanqi Yin, Xiaofeng Yang et al.

ICML 2025 • arXiv:2505.10250
4
citations
#5981

LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification

Yiding Lu, Mouxing Yang, Dezhong Peng et al.

ICML 2025 • arXiv:2504.10174
4
citations
#5982

M3-JEPA: Multimodal Alignment via Multi-gate MoE based on the Joint-Embedding Predictive Architecture

Hongyang Lei, Xiaolong Cheng, Qi Qin et al.

ICML 2025 • arXiv:2409.05929
4
citations
#5983

Understanding Model Ensemble in Transferable Adversarial Attack

Wei Yao, Zeliang Zhang, Huayi Tang et al.

ICML 2025 • arXiv:2410.06851
4
citations
#5984

Prune 'n Predict: Optimizing LLM Decision-making with Conformal Prediction

Harit Vishwakarma, Alan Mishler, Thomas Cook et al.

ICML 2025 • arXiv:2501.00555
4
citations
#5985

Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark

Bingchen Miao, Yang Wu, Minghe Gao et al.

ICML 2025 • arXiv:2503.18665
4
citations
#5986

Fast and Low-Cost Genomic Foundation Models via Outlier Removal

Haozheng Luo, Chenghao Qiu, Maojiang Su et al.

ICML 2025 • arXiv:2505.00598
4
citations
#5987

True Multimodal In-Context Learning Needs Attention to the Visual Context

Shuo Chen, Jianzhe Liu, Zhen Han et al.

COLM 2025 • paper • arXiv:2507.15807
4
citations
#5988

Emotional Face-to-Speech

Jiaxin Ye, Boyuan Cao, Hongming Shan

ICML 2025 • arXiv:2502.01046
4
citations
#5989

Policy Design for Two-sided Platforms with Participation Dynamics

Haruka Kiyohara, Fan Yao, Sarah Dean

ICML 2025 • arXiv:2502.01792
4
citations
#5990

EvalAgents: Discovering Implicit Evaluation Criteria from the Web

Manya Wadhwa, Zayne Rea Sprague, Chaitanya Malaviya et al.

COLM 2025 • paper • arXiv:2504.15219
4
citations
#5991

CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation

Minghao Fu, Guo-Hua Wang, Liangfu Cao et al.

ICML 2025 • arXiv:2502.12579
4
citations
#5992

UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model

Timo Kaiser, Thomas Norrenbrock, Bodo Rosenhahn

ICML 2025 • arXiv:2505.05049
4
citations
#5993

PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling

Avery Ma, Yangchen Pan, Amir-massoud Farahmand

ICML 2025 • spotlight • arXiv:2502.01925
4
citations
#5994

Data-Centric Human Preference with Rationales for Direct Preference Alignment

Hoang Anh Just, Ming Jin, Anit Kumar Sahu et al.

COLM 2025 • paper • arXiv:2407.14477
4
citations
#5995

From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models

Shubhra Mishra, Gabriel Poesia, Noah Goodman

COLM 2025 • paper • arXiv:2407.00900
4
citations
#5996

QUDsim: Quantifying Discourse Similarities in LLM-Generated Text

Ramya Namuduri, Yating Wu, Anshun Asher Zheng et al.

COLM 2025 • paper • arXiv:2504.09373
4
citations
#5997

SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching

Yuxuan Zhu, Ali Falahati, David H. Yang et al.

COLM 2025 • paper • arXiv:2504.00970
4
citations
#5998

Control the Temperature: Selective Sampling for Diverse and High-Quality LLM Outputs

Sergey Troshin, Wafaa Mohammed, Yan Meng et al.

COLM 2025 • paper • arXiv:2510.01218
4
citations
#5999

Generalized Venn and Venn-Abers Calibration with Applications in Conformal Prediction

Lars van der Laan, Ahmed Alaa

ICML 2025 • arXiv:2502.05676
4
citations
#6000

Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs

Itay Itzhak, Yonatan Belinkov, Gabriel Stanovsky

COLM 2025 • paper • arXiv:2507.07186
3
citations