Most Cited 2025 "temporal question-answering" Papers

22,274 papers found • Page 30 of 112

Filters:Most Cited 2025 temporal question-answering Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#5801

DistinctAD: Distinctive Audio Description Generation in Contexts

Bo Fang, Wenhao Wu, Qiangqiang Wu et al.

CVPR 2025highlightarXiv:2411.18180

citations

#5802

The Atlas of In-Context Learning: How Attention Heads Shape In-Context Retrieval Augmentation

Patrick Kahardipraja, Reduan Achtibat, Thomas Wiegand et al.

NEURIPS 2025arXiv:2505.15807

citations

#5803

BiLoRA: Almost-Orthogonal Parameter Spaces for Continual Learning

Hao Zhu, Yifei Zhang, Junhao Dong et al.

CVPR 2025

citations

#5804

Towards Generalizable Trajectory Prediction using Dual-Level Representation Learning and Adaptive Prompting

Kaouther Messaoud, Matthieu Cord, Alex Alahi

CVPR 2025arXiv:2501.04815

citations

#5805

SnowMaster: Comprehensive Real-world Image Desnowing via MLLM with Multi-Model Feedback Optimization

Jianyu LAI, Sixiang Chen, yunlong lin et al.

CVPR 2025

citations

#5806

Understanding Contrastive Learning via Gaussian Mixture Models

Parikshit Bansal, Ali Kavis, Sujay Sanghavi

NEURIPS 2025

citations

#5807

AgentBreeder: Mitigating the AI Safety Risks of Multi-Agent Scaffolds via Self-Improvement

J Rosser, Jakob Foerster

NEURIPS 2025spotlightarXiv:2502.00757

citations

#5808

TrustMark: Robust Watermarking and Watermark Removal for Arbitrary Resolution Images

Tu Bui, Shruti Agarwal, John Collomosse

ICCV 2025

citations

#5809

Optimal Spectral Transitions in High-Dimensional Multi-Index Models

Leonardo Defilippis, Yatin Dandi, Pierre Mergny et al.

NEURIPS 2025arXiv:2502.02545

citations

#5810

DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

Leqi Shen, Guoqiang Gong, Tianxiang Hao et al.

CVPR 2025arXiv:2506.08887

citations

#5811

QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers

Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.

CVPR 2025highlightarXiv:2503.19718

citations

#5812

Repurposing 2D Diffusion Models with Gaussian Atlas for 3D Generation

Tiange Xiang, Kai Li, Chengjiang Long et al.

ICCV 2025arXiv:2503.15877

citations

#5813

Doubly Robust Alignment for Large Language Models

Erhan Xu, Kai Ye, Hongyi Zhou et al.

NEURIPS 2025arXiv:2506.01183

citations

#5814

LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion.

Muchen Li, Sammy Christen, Chengde Wan et al.

CVPR 2025

citations

#5815

Towards foundational LiDAR world models with efficient latent flow matching

Tianran Liu, Shengwen Zhao, Nicholas Rhinehart

NEURIPS 2025arXiv:2506.23434

citations

#5816

ZeroVO: Visual Odometry with Minimal Assumptions

Lei Lai, Zekai Yin, Eshed Ohn-Bar

CVPR 2025arXiv:2506.08005

citations

#5817

Do different prompting methods yield a common task representation in language models?

Guy Davidson, Todd Gureckis, Brenden Lake et al.

NEURIPS 2025arXiv:2505.12075

citations

#5818

CoDA: Coordinated Diffusion Noise Optimization for Whole-Body Manipulation of Articulated Objects

Huaijin Pi, Zhi Cen, Zhiyang Dou et al.

NEURIPS 2025arXiv:2505.21437

citations

#5819

Dual-Agent Optimization framework for Cross-Domain Few-Shot Segmentation

Zhaoyang Li, Yuan Wang, Wangkai Li et al.

CVPR 2025

citations

#5820

Self-Calibrated Variance-Stabilizing Transformations for Real-World Image Denoising

Sébastien Herbreteau, Michael Unser

ICCV 2025arXiv:2407.17399

citations

#5821

GenFlowRL: Shaping Rewards with Generative Object-Centric Flow in Visual Reinforcement Learning

Kelin Yu, Sheng Zhang, Harshit Soora et al.

ICCV 2025arXiv:2508.11049

citations

#5822

Enhanced then Progressive Fusion with View Graph for Multi-View Clustering

Zhibin Dong, Meng Liu, Siwei Wang et al.

CVPR 2025

citations

#5823

Unity in Diversity: Video Editing via Gradient-Latent Purification

Junyu Gao, Kunlin Yang, Xuan Yao et al.

CVPR 2025

citations

#5824

Lie Detector: Unified Backdoor Detection via Cross-Examination Framework

Xuan Wang, Siyuan Liang, Dongping Liao et al.

NEURIPS 2025arXiv:2503.16872

citations

#5825

Deliberation on Priors: Trustworthy Reasoning of Large Language Models on Knowledge Graphs

Jie Ma, NING QU, Zhitao Gao et al.

NEURIPS 2025arXiv:2505.15210

citations

#5826

Capturing Individual Human Preferences with Reward Features

Andre Barreto, Vincent Dumoulin, Yiran Mao et al.

NEURIPS 2025arXiv:2503.17338

citations

#5827

BG-Triangle: Bézier Gaussian Triangle for 3D Vectorization and Rendering

Minye Wu, Haizhao Dai, Kaixin Yao et al.

CVPR 2025arXiv:2503.13961

citations

#5828

Feedback Guidance of Diffusion Models

Felix Koulischer, Florian Handke, Johannes Deleu et al.

NEURIPS 2025arXiv:2506.06085

citations

#5829

Robust-MVTON: Learning Cross-Pose Feature Alignment and Fusion for Robust Multi-View Virtual Try-On

Nannan Zhang, Yijiang Li, Dong Du et al.

CVPR 2025

citations

#5830

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Wei Li, Pin-Yu Chen, Sijia Liu et al.

CVPR 2025arXiv:2406.05826

citations

#5831

TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving

Yanping Fu, Xinyuan Liu, Tianyu Li et al.

NEURIPS 2025arXiv:2505.17771

citations

#5832

Decomposing Interventional Causality into Synergistic, Redundant, and Unique Components

Abel Jansma

NEURIPS 2025spotlightarXiv:2501.11447

citations

#5833

UniEgoMotion: A Unified Model for Egocentric Motion Reconstruction, Forecasting, and Generation

Chaitanya Patel, Hiroki Nakamura, Yuta Kyuragi et al.

ICCV 2025arXiv:2508.01126

citations

#5834

CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Xiangyang Luo, Ye Zhu, Yunfei Liu et al.

ICCV 2025arXiv:2507.02691

citations

#5835

Win Fast or Lose Slow: Balancing Speed and Accuracy in Latency-Sensitive Decisions of LLMs

Hao Kang, Qingru Zhang, Han Cai et al.

NEURIPS 2025spotlightarXiv:2505.19481

citations

#5836

Knowledge Distillation with Refined Logits

Wujie Sun, Defang Chen, Siwei Lyu et al.

ICCV 2025arXiv:2408.07703

citations

#5837

GECKO: Gigapixel Vision-Concept Contrastive Pretraining in Histopathology

Saarthak Kapse, Pushpak Pati, Srikar Yellapragada et al.

ICCV 2025highlightarXiv:2504.01009

citations

#5838

$\texttt{STRCMP}$: Integrating Graph Structural Priors with Language Models for Combinatorial Optimization

Xijun Li, Jiexiang Yang, Jinghao Wang et al.

NEURIPS 2025

citations

#5839

PanoLlama: Generating Endless and Coherent Panoramas with Next-Token-Prediction LLMs

Teng Zhou, Xiaoyu Zhang, Yongchuan Tang

ICCV 2025highlightarXiv:2411.15867

citations

#5840

MIRA: Medical Time Series Foundation Model for Real-World Health Data

Hao Li, Bowen Deng, Chang Xu et al.

NEURIPS 2025oralarXiv:2506.07584

citations

#5841

Joint Relational Database Generation via Graph-Conditional Diffusion Models

Mohamed Amine Ketata, David Lüdke, Leo Schwinn et al.

NEURIPS 2025arXiv:2505.16527

citations

#5842

3D Dental Model Segmentation with Geometrical Boundary Preserving

Shufan Xi, Zexian Liu, Junlin Chang et al.

CVPR 2025arXiv:2503.23702

citations

#5843

Statistical inference for Linear Stochastic Approximation with Markovian Noise

Sergey Samsonov, Marina Sheshukova, Eric Moulines et al.

NEURIPS 2025arXiv:2505.19102

citations

#5844

Beyond Single-Task: Robust Multi-Task Length Generalization for LLMs

Yi Hu, Shijia Kang, Haotong Yang et al.

NEURIPS 2025arXiv:2502.11525

citations

#5845

CutS3D: Cutting Semantics in 3D for 2D Unsupervised Instance Segmentation

Leon Sick, Dominik Engel, Sebastian Hartwig et al.

ICCV 2025arXiv:2411.16319

citations

#5846

Small Singular Values Matter: A Random Matrix Analysis of Transformer Models

Max Staats, Matthias Thamm, Bernd Rosenow

NEURIPS 2025arXiv:2410.17770

citations

#5847

Neural Hierarchical Decomposition for Single Image Plant Modeling

Zhihao Liu, Zhanglin Cheng, Naoto Yokoya

CVPR 2025

citations

#5848

BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions

Wonyong Seo, Jihyong Oh, Munchurl Kim

CVPR 2025arXiv:2412.11365

citations

#5849

GRAVER: Generative Graph Vocabularies for Robust Graph Foundation Models Fine-tuning

Haonan Yuan, Qingyun Sun, Junhua Shi et al.

NEURIPS 2025arXiv:2511.05592

citations

#5850

ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction

YUEJIAO SU, Yi Wang, Qiongyang Hu et al.

CVPR 2025arXiv:2504.01472

citations

#5851

Tight Lower Bounds and Improved Convergence in Performative Prediction

Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami et al.

NEURIPS 2025arXiv:2412.03671

citations

#5852

MARBLE: Material Recomposition and Blending in CLIP-Space

Ta-Ying Cheng, Prafull Sharma, Mark Boss et al.

CVPR 2025arXiv:2506.05313

citations

#5853

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Joya Chen, Yiqi Lin, Ziyun Zeng et al.

CVPR 2025arXiv:2504.16030

citations

#5854

DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation

Amin Karimi, Charalambos Poullis

CVPR 2025arXiv:2503.04006

citations

#5855

Memories of Forgotten Concepts

Matan Rusanovsky, Shimon Malnick, Amir Jevnisek et al.

CVPR 2025highlightarXiv:2412.00782

citations

#5856

Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features

Yuanbo Xiangli, Ruojin Cai, Hanyu Chen et al.

CVPR 2025highlightarXiv:2412.05826

citations

#5857

MixerMDM: Learnable Composition of Human Motion Diffusion Models

Pablo Ruiz-Ponce, German Barquero, Cristina Palmero et al.

CVPR 2025arXiv:2504.01019

citations

#5858

PolarFree: Polarization-based Reflection-Free Imaging

Mingde Yao, Menglu Wang, King Man Tam et al.

CVPR 2025arXiv:2503.18055

citations

#5859

OmniStereo: Real-time Omnidireactional Depth Estimation with Multiview Fisheye Cameras

Jiaxi Deng, Yushen Wang, Haitao Meng et al.

CVPR 2025

citations

#5860

H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Zhanbo Huang, Xiaoming Liu, Yu Kong

CVPR 2025highlightarXiv:2504.10676

citations

#5861

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Shengqiong Wu, Hao Fei, Jingkang Yang et al.

CVPR 2025highlightarXiv:2503.15019

citations

#5862

Scalable and Cost-Efficient de Novo Template-Based Molecular Generation

Piotr Gaiński, Oussama Boussif, Andrei Rekesh et al.

NEURIPS 2025arXiv:2506.19865

citations

#5863

SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

Shaoan Xie, Lingjing Kong, Yujia Zheng et al.

CVPR 2025highlightarXiv:2507.22264

citations

#5864

When Thinking Drifts: Evidential Grounding for Robust Video Reasoning

Romy Luo, Zihui (Sherry) Xue, Alex Dimakis et al.

NEURIPS 2025arXiv:2510.06077

citations

#5865

SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens

Chi Su, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2025arXiv:2411.19824

citations

#5866

Vision Transformers with Self-Distilled Registers

Zipeng Yan, Yinjie Chen, Chong Zhou et al.

NEURIPS 2025spotlightarXiv:2505.21501

citations

#5867

Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?

Renshuai Tao, Haoyu Wang, Yuzhe Guo et al.

CVPR 2025arXiv:2411.18082

citations

#5868

Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation

Sungmin Cha, Kyunghyun Cho

NEURIPS 2025arXiv:2505.13111

citations

#5869

Protein Design with Dynamic Protein Vocabulary

Nuowei Liu, Jiahao Kuang, Yanting Liu et al.

NEURIPS 2025spotlightarXiv:2505.18966

citations

#5870

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing

Joonghyuk Shin, Alchan Hwang, Yujin Kim et al.

ICCV 2025arXiv:2508.07519

citations

#5871

DINGO: Constrained Inference for Diffusion LLMs

Tarun Suresh, Debangshu Banerjee, Shubham Ugare et al.

NEURIPS 2025arXiv:2505.23061

citations

#5872

UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning

Ye Liu, Zongyang Ma, Junfu Pu et al.

NEURIPS 2025arXiv:2509.18094

citations

#5873

MonoFusion: Sparse-View 4D Reconstruction via Monocular Fusion

Zihan Wang, Jeff Tan, Tarasha Khurana et al.

ICCV 2025arXiv:2507.23782

citations

#5874

Ges3ViG : Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding

Atharv Mahesh Mane, Dulanga Weerakoon, Vigneshwaran Subbaraju et al.

CVPR 2025arXiv:2504.09623

citations

#5875

Spatiotemporal Decoupling for Efficient Vision-Based Occupancy Forecasting

Jingyi Xu, Xieyuanli Chen, Junyi Ma et al.

CVPR 2025arXiv:2411.14169

citations

#5876

TRACE: Grounding Time Series in Context for Multimodal Embedding and Retrieval

Jialin Chen, Ziyu Zhao, Gaukhar Nurbek et al.

NEURIPS 2025oralarXiv:2506.09114

citations

#5877

Preference Learning with Lie Detectors can Induce Honesty or Evasion

Chris Cundy, Adam Gleave

NEURIPS 2025arXiv:2505.13787

citations

#5878

Scaling Laws For Scalable Oversight

Joshua Engels, David Baek, Subhash Kantamneni et al.

NEURIPS 2025spotlightarXiv:2504.18530

citations

#5879

Evaluating Generalization Capabilities of LLM-Based Agents in Mixed-Motive Scenarios Using Concordia

Chandler Smith, Marwa Abdulhai, Manfred Díaz et al.

NEURIPS 2025oralarXiv:2512.03318

citations

#5880

DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer

Yecheng Wu, Han Cai, Junyu Chen et al.

ICCV 2025arXiv:2507.04947

citations

#5881

OccluGaussian: Occlusion-Aware Gaussian Splatting for Large Scene Reconstruction and Rendering

Shiyong Liu, Xiao Tang, Zhihao Li et al.

ICCV 2025arXiv:2503.16177

citations

#5882

STiL: Semi-supervised Tabular-Image Learning for Comprehensive Task-Relevant Information Exploration in Multimodal Classification

Siyi Du, Xinzhe Luo, Declan ORegan et al.

CVPR 2025arXiv:2503.06277

citations

#5883

LidarGait++: Learning Local Features and Size Awareness from LiDAR Point Clouds for 3D Gait Recognition

Chuanfu Shen, Rui Wang, Lixin Duan et al.

CVPR 2025

citations

#5884

Robust Hallucination Detection in LLMs via Adaptive Token Selection

Mengjia Niu, Hamed Haddadi, Guansong Pang

NEURIPS 2025arXiv:2504.07863

citations

#5885

DuCos: Duality Constrained Depth Super-Resolution via Foundation Model

Zhiqiang Yan, Zhengxue Wang, Haoye Dong et al.

ICCV 2025arXiv:2503.04171

citations

#5886

ZigzagPointMamba: Spatial-Semantic Mamba for Point Cloud Understanding

LinshuangDiao, Sensen Song, Yurong Qian et al.

NEURIPS 2025

citations

#5887

CuRe: Cultural Gaps in the Long Tail of Text-to-Image Systems

Aniket Rege, Zinnia Nie, Unmesh Raskar et al.

ICCV 2025arXiv:2506.08071

citations

#5888

Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations

Jeonghyeon Kim, Sangheum Hwang

CVPR 2025arXiv:2503.18817

citations

#5889

TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser et al.

CVPR 2025highlightarXiv:2411.15580

citations

#5890

GraphMimic: Graph-to-Graphs Generative Modeling from Videos for Policy Learning

Guangyan Chen, Te Cui, Meiling Wang et al.

CVPR 2025

citations

#5891

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

Chengyue Huang, Brisa Maneechotesuwan, Shivang Chopra et al.

CVPR 2025arXiv:2505.21755

citations

#5892

HumanOLAT: A Large-Scale Dataset for Full-Body Human Relighting and Novel-View Synthesis

Timo Teufel, xilong zhou, Umar Iqbal et al.

ICCV 2025arXiv:2508.09137

citations

#5893

GASP: Gaussian Avatars with Synthetic Priors

Jack Saunders, Charlie Hewitt, Yanan Jian et al.

CVPR 2025arXiv:2412.07739

citations

#5894

Anomize: Better Open Vocabulary Video Anomaly Detection

Fei Li, Wenxuan Liu, Jingjing Chen et al.

CVPR 2025arXiv:2503.18094

citations

#5895

VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance

Mohammad Reza Taesiri, Abhijay Ghildyal, Saman Zadtootaghaj et al.

NEURIPS 2025arXiv:2505.15952

citations

#5896

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2025arXiv:2503.01980

citations

#5897

RELOCATE: A Simple Training-Free Baseline for Visual Query Localization Using Region-Based Representations

Savya Khosla, Sethuraman T V, Alexander G. Schwing et al.

CVPR 2025arXiv:2412.01826

citations

#5898

PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Sinisa Stekovic, Arslan Artykov, Stefan Ainetter et al.

CVPR 2025arXiv:2404.10620

citations

#5899

Synergistic Prompting for Robust Visual Recognition with Missing Modalities

Zhihui Zhang, Luanyuan Dai, Qika Lin et al.

ICCV 2025arXiv:2507.07802

citations

#5900

Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models

Zekai Zhao, Qi Liu, Kun Zhou et al.

NEURIPS 2025spotlightarXiv:2505.17697

citations

#5901

GENIUS: A Generative Framework for Universal Multimodal Search

Sungyeon Kim, Xinliang Zhu, Xiaofan Lin et al.

CVPR 2025arXiv:2503.19868

citations

#5902

Common3D: Self-Supervised Learning of 3D Morphable Models for Common Objects in Neural Feature Space

Leonhard Sommer, Olaf Dünkel, Christian Theobalt et al.

CVPR 2025arXiv:2504.21749

citations

#5903

CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models

Gaoyang Zhang, Bingtao Fu, Qingnan Fan et al.

ICCV 2025arXiv:2412.13195

citations

#5904

Brain-like Variational Inference

Hadi Vafaii, Dekel Galor, Jacob Yates

NEURIPS 2025arXiv:2410.19315

citations

#5905

Towards A Generalist Code Embedding Model Based On Massive Data Synthesis

Chaofan Li, Jianlyu Chen, Yingxia Shao et al.

NEURIPS 2025arXiv:2505.12697

citations

#5906

Multi-Token Prediction Needs Registers

Anastasios Gerontopoulos, Spyridon Gidaris, Nikos Komodakis

NEURIPS 2025arXiv:2505.10518

citations

#5907

Head Pursuit: Probing Attention Specialization in Multimodal Transformers

Lorenzo Basile, Valentino Maiorca, Diego Doimo et al.

NEURIPS 2025spotlightarXiv:2510.21518

citations

#5908

Refusal Direction is Universal Across Safety-Aligned Languages

Xinpeng Wang, Mingyang Wang, Yihong Liu et al.

NEURIPS 2025arXiv:2505.17306

citations

#5909

🎧MOSPA: Human Motion Generation Driven by Spatial Audio

Shuyang Xu, Zhiyang Dou, Mingyi Shi et al.

NEURIPS 2025spotlightarXiv:2507.11949

citations

#5910

Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones

Parsa Mirtaheri, Ezra Edelman, Samy Jelassi et al.

NEURIPS 2025arXiv:2505.21825

citations

#5911

Test-Time Visual In-Context Tuning

Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr et al.

CVPR 2025arXiv:2503.21777

citations

#5912

LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model

Xi Wang, Hongzhen Li, Heng Fang et al.

CVPR 2025arXiv:2412.11519

citations

#5913

C4D: 4D Made from 3D through Dual Correspondences

Shizun Wang, Zhenxiang Jiang, Xingyi Yang et al.

ICCV 2025arXiv:2510.14960

citations

#5914

FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting

Fangyu Wu, Yuhao Chen

CVPR 2025arXiv:2411.12089

citations

#5915

GAP: Gaussianize Any Point Clouds with Text Guidance

Weiqi Zhang, Junsheng Zhou, Haotian Geng et al.

ICCV 2025arXiv:2508.05631

citations

#5916

Visual-Oriented Fine-Grained Knowledge Editing for MultiModal Large Language Models

Zhen Zeng, Leijiang Gu, Xun Yang et al.

ICCV 2025arXiv:2411.12790

citations

#5917

Not All Frame Features Are Equal: Video-to-4D Generation via Decoupling Dynamic-Static Features

Liying Yang, Chen Liu, Zhenwei Zhu et al.

ICCV 2025highlightarXiv:2502.08377

citations

#5918

Synthetic-powered predictive inference

Meshi Bashari, Roy Maor Lotan, Yonghoon Lee et al.

NEURIPS 2025arXiv:2505.13432

citations

#5919

Beyond [cls]: Exploring the True Potential of Masked Image Modeling Representations

Marcin Przewięźlikowski, Randall Balestriero, Wojciech Jasiński et al.

ICCV 2025arXiv:2412.03215

citations

#5920

AGC-Drive: A Large-Scale Dataset for Real-World Aerial-Ground Collaboration in Driving Scenarios

Yunhao Hou, Bochao Zou, Min Zhang et al.

NEURIPS 2025oralarXiv:2506.16371

citations

#5921

Unveiling Concept Attribution in Diffusion Models

Nguyen Hung-Quang, Hoang Phan, Khoa D Doan

NEURIPS 2025arXiv:2412.02542

citations

#5922

VOVTrack: Exploring the Potentiality in Raw Videos for Open-Vocabulary Multi-Object Tracking

Zekun Qian, Ruize Han, Junhui Hou et al.

ICCV 2025

citations

#5923

Position: Bridge the Gaps between Machine Unlearning and AI Regulation

Bill Marino, Meghdad Kurmanji, Nicholas Lane

NEURIPS 2025oralarXiv:2502.12430

citations

#5924

Generating 3D-Consistent Videos from Unposed Internet Photos

Gene Chou, Kai Zhang, Sai Bi et al.

CVPR 2025arXiv:2411.13549

citations

#5925

Event Fields: Capturing Light Fields at High Speed, Resolution, and Dynamic Range

Ziyuan Qu, Zihao Zou, Vivek Boominathan et al.

CVPR 2025highlightarXiv:2412.06191

citations

#5926

Rethinking Neural Combinatorial Optimization for Vehicle Routing Problems with Different Constraint Tightness Degrees

Fu Luo, Yaoxin Wu, Zhi Zheng et al.

NEURIPS 2025arXiv:2505.24627

citations

#5927

OpenSDI: Spotting Diffusion-Generated Images in the Open World

Yabin Wang, Zhiwu Huang, Xiaopeng Hong

CVPR 2025arXiv:2503.19653

citations

#5928

Can LLMs Outshine Conventional Recommenders? A Comparative Evaluation

Qijiong Liu, Jieming Zhu, Lu Fan et al.

NEURIPS 2025arXiv:2503.05493

citations

#5929

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Kai Liu, Jungang Li, Yuchong Sun et al.

NEURIPS 2025oralarXiv:2512.22905

citations

#5930

STSBench: A Spatio-temporal Scenario Benchmark for Multi-modal Large Language Models in Autonomous Driving

Christian Fruhwirth-Reisinger, Dušan Malić, Wei Lin et al.

NEURIPS 2025oralarXiv:2506.06218

citations

#5931

Absorb and Converge: Provable Convergence Guarantee for Absorbing Discrete Diffusion Models

Yuchen Liang, Renxiang Huang, Lifeng LAI et al.

NEURIPS 2025arXiv:2506.02318

citations

#5932

Learning to Integrate Diffusion ODEs by Averaging the Derivatives

Wenze Liu, Xiangyu Yue

NEURIPS 2025arXiv:2505.14502

citations

#5933

On the Out-Of-Distribution Generalization of Large Multimodal Models

Xingxuan Zhang, Jiansheng Li, Wenjing Chu et al.

CVPR 2025

citations

#5934

MMPerspective: Do MLLMs Understand Perspective? A Comprehensive Benchmark for Perspective Perception, Reasoning, and Robustness

Yunlong Tang, Pinxin Liu, Mingqian Feng et al.

NEURIPS 2025arXiv:2505.20426

citations

#5935

Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images

Yasamin Medghalchi, Moein Heidari, Clayton Allard et al.

CVPR 2025arXiv:2412.09910

citations

#5936

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Sihan Yang, Runsen Xu, Chenhang Cui et al.

ICCV 2025arXiv:2508.05211

citations

#5937

Self-Generated In-Context Examples Improve LLM Agents for Sequential Decision-Making Tasks

Vishnu Sarukkai, Zhiqiang Xie, Kayvon Fatahalian

NEURIPS 2025arXiv:2505.00234

citations

#5938

VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness

SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim et al.

CVPR 2025arXiv:2503.16406

citations

#5939

Humans as a Calibration Pattern: Dynamic 3D Scene Reconstruction from Unsynchronized and Uncalibrated Videos

Changwoon Choi, Jeongjun Kim, Geonho Cha et al.

ICCV 2025arXiv:2412.19089

citations

#5940

SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity

Ke Ma, Jiaqi Tang, Bin Guo et al.

CVPR 2025highlightarXiv:2503.20354

citations

#5941

FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models

Yan Gao, Massimo R. Scamarcia, Javier Fernandez-Marques et al.

NEURIPS 2025arXiv:2506.02961

citations

#5942

IDEA-Bench: How Far are Generative Models from Professional Designing?

Chen Liang, Lianghua Huang, Jingwu Fang et al.

CVPR 2025arXiv:2412.11767

citations

#5943

Flatten Graphs as Sequences: Transformers are Scalable Graph Generators

Dexiong Chen, Markus Krimmel, Karsten Borgwardt

NEURIPS 2025arXiv:2502.02216

citations

#5944

Enhancing Dataset Distillation via Non-Critical Region Refinement

Minh-Tuan Tran, Trung Le, Xuan-May Le et al.

CVPR 2025arXiv:2503.18267

citations

#5945

PLEIADES: Building Temporal Kernels with Orthogonal Polynomials

Yan Ru Pei, Olivier Coenen

NEURIPS 2025oralarXiv:2405.12179

citations

#5946

Rethinking Multi-modal Object Detection from the Perspective of Mono-Modality Feature Learning

Tianyi Zhao, Boyang Liu, Yanglei Gao et al.

ICCV 2025arXiv:2503.11780

citations

#5947

EconGym: A Scalable AI Testbed with Diverse Economic Tasks

Qirui Mi, Qipeng Yang, Zijun Fan et al.

NEURIPS 2025arXiv:2506.12110

citations

#5948

Attend to Not Attended: Structure-then-Detail Token Merging for Post-training DiT Acceleration

Haipeng Fang, Sheng Tang, Juan Cao et al.

CVPR 2025arXiv:2505.11707

citations

#5949

DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction

Rui Wang, Quentin Lohmeyer, Mirko Meboldt et al.

ICCV 2025arXiv:2503.13176

citations

#5950

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

Yifan Shen, Yuanzhe Liu, Jingyuan Zhu et al.

NEURIPS 2025arXiv:2506.21656

citations

#5951

Breaking the Discretization Barrier of Continuous Physics Simulation Learning

Fan Xu, Hao Wu, Nan Wang et al.

NEURIPS 2025oralarXiv:2509.17955

citations

#5952

Rethinking Tokenized Graph Transformers for Node Classification

Jinsong Chen, Chenyang Li, Gaichao Li et al.

NEURIPS 2025arXiv:2502.08101

citations

#5953

Conformal Prediction for Ensembles: Improving Efficiency via Score-Based Aggregation

Yash Patel, Eduardo Ochoa Rivera, Ambuj Tewari

NEURIPS 2025arXiv:2405.16246

citations

#5954

Relation3D : Enhancing Relation Modeling for Point Cloud Instance Segmentation

Edward LOO, Jiacheng Deng

CVPR 2025arXiv:2506.17891

citations

#5955

Auto-Vocabulary Semantic Segmentation

Osman Ülger, Maksymilian Kulicki, Yuki Asano et al.

ICCV 2025arXiv:2312.04539

citations

#5956

Channel Consistency Prior and Self-Reconstruction Strategy Based Unsupervised Image Deraining

Guanglu Dong, Tianheng Zheng, Yuanzhouhan Cao et al.

CVPR 2025arXiv:2503.18703

citations

#5957

Language Models can Self-Improve at State-Value Estimation for Better Search

Ethan Mendes, Alan Ritter

NEURIPS 2025spotlightarXiv:2503.02878

citations

#5958

Video Motion Graphs

Haiyang Liu, Zhan Xu, Fating Hong et al.

ICCV 2025highlightarXiv:2503.20218

citations

#5959

Dynamic Risk Assessments for Offensive Cybersecurity Agents

Boyi Wei, Benedikt Stroebl, Jiacen Xu et al.

NEURIPS 2025arXiv:2505.18384

citations

#5960

Hints of Prompt: Enhancing Visual Representation for Multimodal LLMs in Autonomous Driving

Hao Zhou, Zhanning Gao, Zhili Chen et al.

ICCV 2025arXiv:2411.13076

citations

#5961

Reconstructing In-the-Wild Open-Vocabulary Human-Object Interactions

Boran Wen, Dingbang Huang, Zichen Zhang et al.

CVPR 2025arXiv:2503.15898

citations

#5962

TF-TI2I: Training-Free Text-and-Image-to-Image Generation via Multi-Modal Implicit-Context Learning In Text-to-Image Models

Teng-Fang Hsiao, Bo-Kai Ruan, Yi-Lun Wu et al.

ICCV 2025arXiv:2503.15283

citations

#5963

MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures

Lucas Morin, Valery Weber, Ahmed Nassar et al.

CVPR 2025arXiv:2503.16096

citations

#5964

Fast Last-Iterate Convergence of SGD in the Smooth Interpolation Regime

Amit Attia, Matan Schliserman, Uri Sherman et al.

NEURIPS 2025arXiv:2507.11274

citations

#5965

MagicHOI: Leveraging 3D Priors for Accurate Hand-object Reconstruction from Short Monocular Video Clips

SHIBO WANG, Haonan He, Maria Parelli et al.

ICCV 2025arXiv:2508.05506

citations

#5966

ReRAW: RGB-to-RAW Image Reconstruction via Stratified Sampling for Efficient Object Detection on the Edge

Radu Berdan, Beril Besbinar, Christoph Reinders et al.

CVPR 2025arXiv:2503.03782

citations

#5967

Multi-modal Medical Diagnosis via Large-small Model Collaboration

Wanyi Chen, Zihua Zhao, Jiangchao Yao et al.

CVPR 2025

citations

#5968

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

Shuangkang Fang, I-Chao Shen, Yufeng Wang et al.

ICCV 2025highlightarXiv:2508.01242

citations

#5969

Always Tell Me The Odds: Fine-grained Conditional Probability Estimation

Liaoyaqi Wang, Zhengping Jiang, Anqi Liu et al.

COLM 2025paperarXiv:2505.01595

citations

#5970

Quantifying Fairness in LLMs Beyond Tokens: A Semantic and Statistical Perspective

Weijie Xu, Yiwen Wang, Chi Xue et al.

COLM 2025paperarXiv:2506.19028

citations

#5971

Focal-SAM: Focal Sharpness-Aware Minimization for Long-Tailed Classification

Sicong Li, Qianqian Xu, Zhiyong Yang et al.

ICML 2025arXiv:2505.01660

citations

#5972

Can RLHF be More Efficient with Imperfect Reward Models? A Policy Coverage Perspective

Jiawei Huang, Bingcong Li, Christoph Dann et al.

ICML 2025arXiv:2502.19255

citations

#5973

Can Performant LLMs Be Ethical? Quantifying the Impact of Web Crawling Opt-Outs

Dongyang Fan, Vinko Sabolčec, Matin Ansaripour et al.

COLM 2025paper

citations

#5974

Transformative or Conservative? Conservation laws for ResNets and Transformers

Sibylle Marcotte, Rémi Gribonval, Gabriel Peyré

ICML 2025oralarXiv:2506.06194

citations

#5975

Towards Universal Offline Black-Box Optimization via Learning Language Model Embeddings

Rong-Xi Tan, Ming Chen, Ke Xue et al.

ICML 2025arXiv:2506.07109

citations

#5976

Overcoming Vocabulary Constraints with Pixel-level Fallback

Jonas F. Lotz, Hendra Setiawan, Stephan Peitz et al.

COLM 2025paperarXiv:2504.02122

citations

#5977

Efficient Parallel Training Methods for Spiking Neural Networks with Constant Time Complexity

Wanjin Feng, Xingyu Gao, Wenqian Du et al.

ICML 2025arXiv:2506.12087

citations

#5978

X-Hacking: The Threat of Misguided AutoML

Rahul Sharma, Sumantrak Mukherjee, Andrea Šipka et al.

ICML 2025

citations

#5979

Differential Privacy Under Class Imbalance: Methods and Empirical Insights

Lucas Rosenblatt, Yuliia Lut, Ethan Turok et al.

ICML 2025arXiv:2411.05733

citations

#5980

ADHMR: Aligning Diffusion-based Human Mesh Recovery via Direct Preference Optimization

Wenhao Shen, Wanqi Yin, Xiaofeng Yang et al.

ICML 2025arXiv:2505.10250

citations

#5981

LLaVA-ReID: Selective Multi-image Questioner for Interactive Person Re-Identification

Yiding Lu, Mouxing Yang, Dezhong Peng et al.

ICML 2025arXiv:2504.10174

citations

#5982

M3-JEPA: Multimodal Alignment via Multi-gate MoE based on the Joint-Embedding Predictive Architecture

Hongyang Lei, Xiaolong Cheng, Qi Qin et al.

ICML 2025arXiv:2409.05929

citations

#5983

Understanding Model Ensemble in Transferable Adversarial Attack

Wei Yao, Zeliang Zhang, Huayi Tang et al.

ICML 2025arXiv:2410.06851

citations

#5984

Prune 'n Predict: Optimizing LLM Decision-making with Conformal Prediction

Harit Vishwakarma, Alan Mishler, Thomas Cook et al.

ICML 2025arXiv:2501.00555

citations

#5985

Boosting Virtual Agent Learning and Reasoning: A Step-Wise, Multi-Dimensional, and Generalist Reward Model with Benchmark

Bingchen Miao, Yang Wu, Minghe Gao et al.

ICML 2025arXiv:2503.18665

citations

#5986

Fast and Low-Cost Genomic Foundation Models via Outlier Removal

Haozheng Luo, Chenghao Qiu, Maojiang Su et al.

ICML 2025arXiv:2505.00598

citations

#5987

True Multimodal In-Context Learning Needs Attention to the Visual Context

Shuo Chen, Jianzhe Liu, Zhen Han et al.

COLM 2025paperarXiv:2507.15807

citations

#5988

Emotional Face-to-Speech

Jiaxin Ye, Boyuan Cao, Hongming Shan

ICML 2025arXiv:2502.01046

citations

#5989

Policy Design for Two-sided Platforms with Participation Dynamics

Haruka Kiyohara, Fan Yao, Sarah Dean

ICML 2025arXiv:2502.01792

citations

#5990

EvalAgents: Discovering Implicit Evaluation Criteria from the Web

Manya Wadhwa, Zayne Rea Sprague, Chaitanya Malaviya et al.

COLM 2025paperarXiv:2504.15219

citations

#5991

CHATS: Combining Human-Aligned Optimization and Test-Time Sampling for Text-to-Image Generation

Minghao Fu, Guo-Hua Wang, Liangfu Cao et al.

ICML 2025arXiv:2502.12579

citations

#5992

UncertainSAM: Fast and Efficient Uncertainty Quantification of the Segment Anything Model

Timo Kaiser, Thomas Norrenbrock, Bodo Rosenhahn

ICML 2025arXiv:2505.05049

citations

#5993

PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling

Avery Ma, Yangchen Pan, Amir-massoud Farahmand

ICML 2025spotlightarXiv:2502.01925

citations

#5994

Data-Centric Human Preference with Rationales for Direct Preference Alignment

Hoang Anh Just, Ming Jin, Anit Kumar Sahu et al.

COLM 2025paperarXiv:2407.14477

citations

#5995

From Next-Token to Mathematics: The Learning Dynamics of Mathematical Reasoning in Language Models

Shubhra Mishra, Gabriel Poesia, Noah Goodman

COLM 2025paperarXiv:2407.00900

citations

#5996

QUDsim: Quantifying Discourse Similarities in LLM-Generated Text

Ramya Namuduri, Yating Wu, Anshun Asher Zheng et al.

COLM 2025paperarXiv:2504.09373

citations

#5997

SentenceKV: Efficient LLM Inference via Sentence-Level Semantic KV Caching

Yuxuan Zhu, Ali Falahati, David H. Yang et al.

COLM 2025paperarXiv:2504.00970

citations

#5998

Control the Temperature: Selective Sampling for Diverse and High-Quality LLM Outputs

Sergey Troshin, Wafaa Mohammed, Yan Meng et al.

COLM 2025paperarXiv:2510.01218

citations

#5999

Generalized Venn and Venn-Abers Calibration with Applications in Conformal Prediction

Lars van der Laan, Ahmed Alaa

ICML 2025arXiv:2502.05676

citations

#6000

Planted in Pretraining, Swayed by Finetuning: A Case Study on the Origins of Cognitive Biases in LLMs

Itay Itzhak, Yonatan Belinkov, Gabriel Stanovsky

COLM 2025paperarXiv:2507.07186

citations

← Previous

1...28 29 30 31 32...112