Most Cited 2025 Poster Papers

22,274 papers found • Page 24 of 112

#4601

Incorporating Visual Correspondence into Diffusion Model for Virtual Try-On

Siqi Wan, Jingwen Chen, Yingwei Pan et al.

ICLR 2025posterarXiv:2505.16977
5
citations
#4602

Hierarchically-Structured Open-Vocabulary Indoor Scene Synthesis with Pre-trained Large Language Model

Weilin Sun, Xinran Li, Manyi Li et al.

AAAI 2025paperarXiv:2502.10675
5
citations
#4603

BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes

Minkyun Seo, Hyungtae Lim, Kanghee Lee et al.

ICCV 2025highlightarXiv:2503.07940
5
citations
#4604

SAVVY: Spatial Awareness via Audio-Visual LLMs through Seeing and Hearing

Mingfei Chen, Zijun Cui, Xiulong Liu et al.

NEURIPS 2025oralarXiv:2506.05414
5
citations
#4605

From Bytes to Ideas: Language Modeling with Autoregressive U-Nets

Mathurin VIDEAU, Badr Youbi Idrissi, Alessandro Leite et al.

NEURIPS 2025posterarXiv:2506.14761
5
citations
#4606

TGBFormer: Transformer-GraphFormer Blender Network for Video Object Detection

Qiang Qi, Xiao Wang

AAAI 2025paperarXiv:2503.13903
5
citations
#4607

Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation

Edward Fish, Richard Bowden

NEURIPS 2025oralarXiv:2506.00129
5
citations
#4608

FlexEvent: Towards Flexible Event-Frame Object Detection at Varying Operational Frequencies

Dongyue Lu, Lingdong Kong, Gim Hee Lee et al.

NEURIPS 2025oralarXiv:2412.06708
5
citations
#4609

Thinker: Learning to Think Fast and Slow

Stephen Chung, Wenyu Du, Jie Fu

NEURIPS 2025posterarXiv:2505.21097
5
citations
#4610

VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification

Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.

NEURIPS 2025posterarXiv:2505.20302
5
citations
#4611

Rethinking Cross-Modal Interaction in Multimodal Diffusion Transformers

Zhengyao Lyu, Tianlin Pan, Chenyang Si et al.

ICCV 2025posterarXiv:2506.07986
5
citations
#4612

EvdCLIP: Improving Vision-Language Retrieval with Entity Visual Descriptions from Large Language Models

GuangHao Meng, Sunan He, Jinpeng Wang et al.

AAAI 2025paperarXiv:2505.18594
5
citations
#4613

Towards Homogeneous Lexical Tone Decoding from Heterogeneous Intracranial Recordings

Di Wu, Siyuan Li, Chen Feng et al.

ICLR 2025posterarXiv:2410.12866
5
citations
#4614

WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception

Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.

NEURIPS 2025oralarXiv:2508.15720
5
citations
#4615

Learning-Guided Rolling Horizon Optimization for Long-Horizon Flexible Job-Shop Scheduling

Sirui Li, Wenbin Ouyang, Yining Ma et al.

ICLR 2025posterarXiv:2502.15791
5
citations
#4616

CLIP-PCQA: Exploring Subjective-Aligned Vision-Language Modeling for Point Cloud Quality Assessment

Yating Liu, Yujie Zhang, Ziyu Shan et al.

AAAI 2025paperarXiv:2501.10071
5
citations
#4617

FaceMe: Robust Blind Face Restoration with Personal Identification

Siyu Liu, Zheng-Peng Duan, Jia OuYang et al.

AAAI 2025paperarXiv:2501.05177
5
citations
#4618

Purifying Shampoo: Investigating Shampoo's Heuristics by Decomposing its Preconditioner

Runa Eschenhagen, Aaron Defazio, Tsung-Hsien Lee et al.

NEURIPS 2025spotlightarXiv:2506.03595
5
citations
#4619

Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs

Kejia Zhang, Keda TAO, Jiasheng Tang et al.

NEURIPS 2025posterarXiv:2501.19164
5
citations
#4620

Multi-head Transformers Provably Learn Symbolic Multi-step Reasoning via Gradient Descent

Tong Yang, Yu Huang, Yingbin Liang et al.

NEURIPS 2025posterarXiv:2508.08222
5
citations
#4621

Chiron-o1: Igniting Multimodal Large Language Models towards Generalizable Medical Reasoning via Mentor-Intern Collaborative Search

Haoran Sun, Yankai Jiang, Wenjie Lou et al.

NEURIPS 2025posterarXiv:2506.16962
5
citations
#4622

OnlineAnySeg: Online Zero-Shot 3D Segmentation by Visual Foundation Model Guided 2D Mask Merging

Yijie Tang, Jiazhao Zhang, Yuqing Lan et al.

CVPR 2025posterarXiv:2503.01309
5
citations
#4623

MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds

Bingquan Dai, Luo Li, Qihong Tang et al.

NEURIPS 2025posterarXiv:2508.14879
5
citations
#4624

Learning Orthogonal Multi-Index Models: A Fine-Grained Information Exponent Analysis

Yunwei Ren, Jason Lee

NEURIPS 2025posterarXiv:2410.09678
5
citations
#4625

Revisiting End-to-End Learning with Slide-level Supervision in Computational Pathology

Wenhao Tang, Rong Qin, Heng Fang et al.

NEURIPS 2025posterarXiv:2506.02408
5
citations
#4626

Multi-Modal Grounded Planning and Efficient Replanning for Learning Embodied Agents with a Few Examples

Taewoong Kim, Byeonghwi Kim, Jonghyun Choi

AAAI 2025paperarXiv:2412.17288
5
citations
#4627

Interpretable Generative Models through Post-hoc Concept Bottlenecks

Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.

CVPR 2025posterarXiv:2503.19377
5
citations
#4628

Ego4o: Egocentric Human Motion Capture and Understanding from Multi-Modal Input

Jian Wang, Rishabh Dabral, Diogo Luvizon et al.

CVPR 2025posterarXiv:2504.08449
5
citations
#4629

PixelMan: Consistent Object Editing with Diffusion Models via Pixel Manipulation and Generation

Liyao Jiang, Negar Hassanpour, Mohammad Salameh et al.

AAAI 2025paperarXiv:2412.14283
5
citations
#4630

LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits

Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.

NEURIPS 2025posterarXiv:2410.01735
5
citations
#4631

Seeing What Matters: Empowering CLIP with Patch Generation-to-Selection

Gensheng Pei, Tao Chen, Yujia Wang et al.

CVPR 2025posterarXiv:2503.17080
5
citations
#4632

Predicting Empirical AI Research Outcomes with Language Models

Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.

NEURIPS 2025posterarXiv:2506.00794
5
citations
#4633

EgoBridge: Domain Adaptation for Generalizable Imitation from Egocentric Human Data

Ryan Punamiya, Dhruv Patel, Patcharapong Aphiwetsa et al.

NEURIPS 2025posterarXiv:2509.19626
5
citations
#4634

Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation

Zihan Wang, Seungjun Lee, Gim Hee Lee

NEURIPS 2025oralarXiv:2505.11383
5
citations
#4635

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding

Chenxin Tao, Shiqian Su, Xizhou Zhu et al.

CVPR 2025posterarXiv:2412.16158
5
citations
#4636

VProChart: Answering Chart Question Through Visual Perception Alignment Agent and Programmatic Solution Reasoning

Muye Huang, Lingling Zhang, Han Lai et al.

AAAI 2025paperarXiv:2409.01667
5
citations
#4637

Learning from Neighbors: Category Extrapolation for Long-Tail Learning

Shizhen Zhao, Xin Wen, Jiahui Liu et al.

CVPR 2025posterarXiv:2410.15980
5
citations
#4638

Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis

Hanbin Ko, Chang Min Park

CVPR 2025posterarXiv:2505.22079
5
citations
#4639

On scalable and efficient training of diffusion samplers

Minkyu Kim, Kiyoung Seong, Dongyeop Woo et al.

NEURIPS 2025posterarXiv:2505.19552
5
citations
#4640

LuxDiT: Lighting Estimation with Video Diffusion Transformer

Ruofan Liang, Kai He, Zan Gojcic et al.

NEURIPS 2025posterarXiv:2509.03680
5
citations
#4641

Critical Batch Size Revisited: A Simple Empirical Approach to Large-Batch Language Model Training

Will Merrill, Shane Arora, Dirk Groeneveld et al.

NEURIPS 2025spotlightarXiv:2505.23971
5
citations
#4642

Compliant Residual DAgger: Improving Real-World Contact-Rich Manipulation with Human Corrections

Xiaomeng Xu, Yifan Hou, Zeyi Liu et al.

NEURIPS 2025posterarXiv:2506.16685
5
citations
#4643

A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking

Zixiang Zhao, Haowen Bai, Bingxin Ke et al.

NEURIPS 2025oralarXiv:2505.19858
5
citations
#4644

Multi-Modal View Enhanced Large Vision Models for Long-Term Time Series Forecasting

ChengAo Shen, Wenchao Yu, Ziming Zhao et al.

NEURIPS 2025posterarXiv:2505.24003
5
citations
#4645

GarmentPile: Point-Level Visual Affordance Guided Retrieval and Adaptation for Cluttered Garments Manipulation

Ruihai Wu, Ziyu Zhu, Yuran Wang et al.

CVPR 2025posterarXiv:2503.09243
5
citations
#4646

EchoONE: Segmenting Multiple Echocardiography Planes in One Model

Jiongtong Hu, Wei Zhuo, Jun Cheng et al.

CVPR 2025posterarXiv:2412.02993
5
citations
#4647

When Are Concepts Erased From Diffusion Models?

Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.

NEURIPS 2025posterarXiv:2505.17013
5
citations
#4648

Improved Balanced Classification with Theoretically Grounded Loss Functions

Corinna Cortes, Mehryar Mohri, Yutao Zhong

NEURIPS 2025posterarXiv:2512.23947
5
citations
#4649

Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation

Shahad Albastaki, Anabia Sohail, IYYAKUTTI IYAPPAN GANAPATHI et al.

CVPR 2025posterarXiv:2504.18856
5
citations
#4650

Learning Graph Invariance by Harnessing Spuriosity

Tianjun Yao, Yongqiang Chen, Kai Hu et al.

ICLR 2025poster
5
citations
#4651

FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation

Tianyun Zhong, Chao Liang, Jianwen Jiang et al.

CVPR 2025posterarXiv:2412.16915
5
citations
#4652

PFDiff: Training-Free Acceleration of Diffusion Models Combining Past and Future Scores

Guangyi Wang, Yuren Cai, lijiang Li et al.

ICLR 2025posterarXiv:2408.08822
5
citations
#4653

Language Models Can Predict Their Own Behavior

Dhananjay Ashok, Jonathan May

NEURIPS 2025posterarXiv:2502.13329
5
citations
#4654

One2Any: One-Reference 6D Pose Estimation for Any Object

Mengya Liu, Siyuan Li, Ajad Chhatkuli et al.

CVPR 2025posterarXiv:2505.04109
5
citations
#4655

D^3-Human: Dynamic Disentangled Digital Human from Monocular Video

Honghu Chen, Bo Peng, Yunfan Tao et al.

CVPR 2025posterarXiv:2501.01589
5
citations
#4656

Estimating Model Performance Under Covariate Shift Without Labels

Jakub Białek, Juhani Kivimäki, Wojciech Kuberski et al.

NEURIPS 2025posterarXiv:2401.08348
5
citations
#4657

Towards Understanding Text Hallucination of Diffusion Models via Local Generation Bias

Rui Lu, Runzhe Wang, Kaifeng Lyu et al.

ICLR 2025posterarXiv:2503.03595
5
citations
#4658

ViPOcc: Leveraging Visual Priors from Vision Foundation Models for Single-View 3D Occupancy Prediction

Yi Feng, Yu Han, Xijing Zhang et al.

AAAI 2025paperarXiv:2412.11210
5
citations
#4659

Beyond FVD: An Enhanced Evaluation Metrics for Video Generation Distribution Quality

Ge Ya Luo, Gian M Favero, Zhi Hao Luo et al.

ICLR 2025oral
5
citations
#4660

Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine

Zhaohu Xing, Lihao Liu, Yijun Yang et al.

CVPR 2025poster
5
citations
#4661

QMP: Q-switch Mixture of Policies for Multi-Task Behavior Sharing

Grace Zhang, Ayush Jain, Injune Hwang et al.

ICLR 2025oralarXiv:2302.00671
5
citations
#4662

PoseLLaVA: Pose Centric Multimodal LLM for Fine-Grained 3D Pose Manipulation

Dong Feng, Ping Guo, Encheng Peng et al.

AAAI 2025paper
5
citations
#4663

Learning Interpretable Queries for Explainable Image Classification with Information Pursuit

Stefan Kolek, Aditya Chattopadhyay, Kwan Ho Ryan Chan et al.

ICCV 2025posterarXiv:2312.11548
5
citations
#4664

Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression

Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.

CVPR 2025poster
5
citations
#4665

BiggerGait: Unlocking Gait Recognition with Layer-wise Representations from Large Vision Models

Dingqiang Ye, Chao Fan, Zhanbo Huang et al.

NEURIPS 2025posterarXiv:2505.18132
5
citations
#4666

Towards RAW Object Detection in Diverse Conditions

Zhong-Yu Li, Xin Jin, Bo-Yuan Sun et al.

CVPR 2025highlightarXiv:2411.15678
5
citations
#4667

Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context

Ge Zheng, Jiaye Qian, Jiajin Tang et al.

ICCV 2025posterarXiv:2510.20229
5
citations
#4668

Training-free Dense-Aligned Diffusion Guidance for Modular Conditional Image Synthesis

Zixuan Wang, DUO PENG, Feng Chen et al.

CVPR 2025posterarXiv:2504.01515
5
citations
#4669

Learning Spatial-Semantic Features for Robust Video Object Segmentation

Xin Li, Deshui Miao, Zhenyu He et al.

ICLR 2025posterarXiv:2407.07760
5
citations
#4670

Unlocking Point Processes through Point Set Diffusion

David Lüdke, Enric Rabasseda Raventós, Marcel Kollovieh et al.

ICLR 2025oralarXiv:2410.22493
5
citations
#4671

Denoising with a Joint-Embedding Predictive Architecture

Chen Dengsheng, Jie Hu, Xiaoming Wei et al.

ICLR 2025posterarXiv:2410.03755
5
citations
#4672

FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity

Jinxi Li, Ziyang Song, Siyuan Zhou et al.

CVPR 2025posterarXiv:2506.07865
5
citations
#4673

A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition

Duosheng Chen, Shihao Zhou, Jinshan Pan et al.

CVPR 2025highlight
5
citations
#4674

EvHDR-NeRF: Building High Dynamic Range Radiance Fields with Single Exposure Images and Events

Zehao Chen, Zhanfeng Liao, De Ma et al.

AAAI 2025paper
5
citations
#4675

Learning Diffusion Models with Flexible Representation Guidance

Chenyu Wang, Cai Zhou, Sharut Gupta et al.

NEURIPS 2025posterarXiv:2507.08980
5
citations
#4676

AoP-SAM: Automation of Prompts for Efficient Segmentation

Yi Chen, Muyoung Son, Chuanbo Hua et al.

AAAI 2025paperarXiv:2505.11980
5
citations
#4677

The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition

Shuai Yuan, Xingshuo Han, Hongwei Li et al.

NEURIPS 2025posterarXiv:2409.12394
5
citations
#4678

Uncertainty Quantification with the Empirical Neural Tangent Kernel

Joseph Wilson, Chris van der Heide, Liam Hodgkinson et al.

NEURIPS 2025posterarXiv:2502.02870
5
citations
#4679

Attribution Analysis Meets Model Editing: Advancing Knowledge Correction in Vision Language Models with VisEdit

Qizhou Chen, Taolin Zhang, Chengyu Wang et al.

AAAI 2025paperarXiv:2408.09916
5
citations
#4680

Efficient Quadratic Corrections for Frank-Wolfe Algorithms

Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.

NEURIPS 2025posterarXiv:2506.02635
5
citations
#4681

DrVD-Bench: Do Vision-Language Models Reason Like Human Doctors in Medical Image Diagnosis?

Tianhong Zhou, xu yin, Yingtao Zhu et al.

NEURIPS 2025posterarXiv:2505.24173
5
citations
#4682

LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal

Shr-Ruei Tsai, Wei-Cheng Chang, Jie-Ying Lee et al.

ICCV 2025posterarXiv:2510.15868
5
citations
#4683

Progressive Compression with Universally Quantized Diffusion Models

Yibo Yang, Justus Will, Stephan Mandt

ICLR 2025posterarXiv:2412.10935
5
citations
#4684

BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks

Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi et al.

ICLR 2025posterarXiv:2412.04626
5
citations
#4685

OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad

Luyao Tang, Chaoqi Chen, Yuxuan Yuan et al.

CVPR 2025posterarXiv:2503.18695
5
citations
#4686

MINGLE: Mixture of Null-Space Gated Low-Rank Experts for Test-Time Continual Model Merging

Zihuan Qiu, Yi Xu, Chiyuan He et al.

NEURIPS 2025posterarXiv:2505.11883
5
citations
#4687

InfinityStar: Unified Spacetime AutoRegressive Modeling for Visual Generation

Jinlai Liu, Jian Han, Bin Yan et al.

NEURIPS 2025oral
5
citations
#4688

Context Clues: Evaluating Long Context Models for Clinical Prediction Tasks on EHR Data

Michael Wornow, Suhana Bedi, Miguel Angel Fuentes Hernandez et al.

ICLR 2025poster
5
citations
#4689

Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets

Yuxin Wang, Maresa Schröder, Dennis Frauen et al.

ICLR 2025posterarXiv:2412.11511
5
citations
#4690

GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding

Rui Hu, Yuxuan Zhang, Lianghui Zhu et al.

ICCV 2025posterarXiv:2503.10596
5
citations
#4691

ZAPBench: A Benchmark for Whole-Brain Activity Prediction in Zebrafish

Jan-Matthis Lueckmann, Alexander Immer, Alex Chen et al.

ICLR 2025posterarXiv:2503.02618
5
citations
#4692

Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing

Zhedong Zhang, Liang Li, Chenggang Yan et al.

CVPR 2025posterarXiv:2503.12042
5
citations
#4693

Ranked Entropy Minimization for Continual Test-Time Adaptation

Jisu Han, Jaemin Na, Wonjun Hwang

ICML 2025posterarXiv:2505.16441
5
citations
#4694

Lawma: The Power of Specialization for Legal Annotation

Ricardo Dominguez-Olmedo, Vedant Nanda, Rediet Abebe et al.

ICLR 2025posterarXiv:2407.16615
5
citations
#4695

In-Context Learning and Occam's Razor

Eric Elmoznino, Tom Marty, Tejas Kasetty et al.

ICML 2025posterarXiv:2410.14086
5
citations
#4696

Curly Flow Matching for Learning Non-gradient Field Dynamics

Katarina Petrović, Lazar Atanackovic, Viggo Moro et al.

NEURIPS 2025posterarXiv:2510.26645
5
citations
#4697

Contextual Online Decision Making with Infinite-Dimensional Functional Regression

Haichen Hu, Rui Ai, Stephen Bates et al.

ICML 2025posterarXiv:2501.18359
5
citations
#4698

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, yixuan liu, Takashi Isobe et al.

CVPR 2025highlightarXiv:2412.19637
5
citations
#4699

DipLLM: Fine-Tuning LLM for Strategic Decision-making in Diplomacy

Kaixuan Xu, Jiajun Chai, Sicheng Li et al.

ICML 2025posterarXiv:2506.09655
5
citations
#4700

Automatically Identify and Rectify: Robust Deep Contrastive Multi-view Clustering in Noisy Scenarios

xihong yang, Siwei Wang, Fangdi Wang et al.

ICML 2025spotlightarXiv:2505.21387
5
citations
#4701

Free360: Layered Gaussian Splatting for Unbounded 360-Degree View Synthesis from Extremely Sparse and Unposed Views

Chong Bao, Xiyu Zhang, Zehao Yu et al.

CVPR 2025posterarXiv:2503.24382
5
citations
#4702

GTG: Generalizable Trajectory Generation Model for Urban Mobility

Jingyuan Wang, Yujing Lin, Yudong Li

AAAI 2025paperarXiv:2502.01107
5
citations
#4703

Understanding LLM Behaviors via Compression: Data Generation, Knowledge Acquisition and Scaling Laws

Zhixuan Pan, Shaowen Wang, Liao Pengfei et al.

NEURIPS 2025spotlightarXiv:2504.09597
5
citations
#4704

Constrained Belief Updates Explain Geometric Structures in Transformer Representations

Mateusz Piotrowski, Paul Riechers, Daniel Filan et al.

ICML 2025posterarXiv:2502.01954
5
citations
#4705

Time-IMM: A Dataset and Benchmark for Irregular Multimodal Multivariate Time Series

Ching Chang, Jeehyun Hwang, Yidan Shi et al.

NEURIPS 2025posterarXiv:2506.10412
5
citations
#4706

Autonomous Goal Detection and Cessation in Reinforcement Learning: A Case Study on Source Term Estimation

Yiwei Shi, Muning Wen, Qi Zhang et al.

AAAI 2025paperarXiv:2409.09541
5
citations
#4707

Efficient Multi-agent Offline Coordination via Diffusion-based Trajectory Stitching

Lei Yuan, Yuqi Bian, Lihe Li et al.

ICLR 2025oral
5
citations
#4708

Semantic and Expressive Variations in Image Captions Across Languages

Andre Ye, Sebastin Santy, Jena D. Hwang et al.

CVPR 2025posterarXiv:2310.14356
5
citations
#4709

TFCustom: Customized Image Generation with Time-Aware Frequency Feature Guidance

Mushui Liu, Dong She, Qihan Huang et al.

CVPR 2025highlight
5
citations
#4710

EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding

Ege Özsoy, Arda Mamur, Felix Tristram et al.

NEURIPS 2025posterarXiv:2505.24287
5
citations
#4711

Graph-Based Cross-Domain Knowledge Distillation for Cross-Dataset Text-to-Image Person Retrieval

Bingjun Luo, Jinpeng Wang, Zewen Wang et al.

AAAI 2025paperarXiv:2501.15052
5
citations
#4712

Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation

Xin Yan, Yuxuan Cai, Qiuyue Wang et al.

CVPR 2025posterarXiv:2412.01316
5
citations
#4713

GARLIC: GPT-Augmented Reinforcement Learning with Intelligent Control for Vehicle Dispatching

Xiao Han, Zijian Zhang, Xiangyu Zhao et al.

AAAI 2025paperarXiv:2408.10286
5
citations
#4714

CoPRA: Bridging Cross-domain Pretrained Sequence Models with Complex Structures for Protein-RNA Binding Affinity Prediction

Rong Han, Xiaohong Liu, Tong Pan et al.

AAAI 2025paperarXiv:2409.03773
5
citations
#4715

Risk-Controlling Model Selection via Guided Bayesian Optimization

Adam Fisch, Regina Barzilay, Bracha Laufer-Goldshtein et al.

ICLR 2025posterarXiv:2312.01692
5
citations
#4716

Robust Federated Finetuning of LLMs via Alternating Optimization of LoRA

Shuangyi Chen, Yuanxin Guo, Yue Ju et al.

NEURIPS 2025posterarXiv:2502.01755
5
citations
#4717

IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments

Can Zhang, Gim Hee Lee

CVPR 2025posterarXiv:2504.06827
5
citations
#4718

mmFAS: Multimodal Face Anti-Spoofing Using Multi-Level Alignment and Switch-Attention Fusion

Geng Chen, Wuyuan Xie, Di Lin et al.

AAAI 2025paper
5
citations
#4719

KaSA: Knowledge-Aware Singular-Value Adaptation of Large Language Models

Fan Wang, Juyong Jiang, Chansung Park et al.

ICLR 2025posterarXiv:2412.06071
5
citations
#4720

SAME: Learning Generic Language-Guided Visual Navigation with State-Adaptive Mixture of Experts

Gengze Zhou, Yicong Hong, Zun Wang et al.

ICCV 2025posterarXiv:2412.05552
5
citations
#4721

LUCAS: Layered Universal Codec Avatars

Di Liu, Teng Deng, Giljoo Nam et al.

CVPR 2025posterarXiv:2502.19739
5
citations
#4722

Faster Inference of Flow-Based Generative Models via Improved Data-Noise Coupling

Aram Davtyan, Leello Dadi, Volkan Cevher et al.

ICLR 2025poster
5
citations
#4723

KAC: Kolmogorov-Arnold Classifier for Continual Learning

Yusong Hu, Zichen Liang, Fei Yang et al.

CVPR 2025highlightarXiv:2503.21076
5
citations
#4724

DiET-GS: Diffusion Prior and Event Stream-Assisted Motion Deblurring 3D Gaussian Splatting

Seungjun Lee, Gim Hee Lee

CVPR 2025posterarXiv:2503.24210
5
citations
#4725

High-Precision Dichotomous Image Segmentation via Probing Diffusion Capacity

Qian Yu, Peng-Tao Jiang, Hao Zhang et al.

ICLR 2025posterarXiv:2410.10105
5
citations
#4726

Anti-Exposure Bias in Diffusion Models

Junyu Zhang, Daochang Liu, Eunbyung Park et al.

ICLR 2025poster
5
citations
#4727

Φ-GAN:Physics-Inspired GAN for Generating SAR Images Under Limited Data

Xidan Zhang, Yihan Zhuang, Qian Guo et al.

ICCV 2025poster
5
citations
#4728

Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification

Jiayu Jiang, Changxing Ding, Wentao Tan et al.

CVPR 2025highlightarXiv:2503.09962
5
citations
#4729

Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving

Alexey Nekrasov, Malcolm Burdorf, Stewart Worrall et al.

CVPR 2025posterarXiv:2505.02148
5
citations
#4730

Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding

Dekai Zhu, Yixuan Hu, Youquan Liu et al.

NEURIPS 2025posterarXiv:2505.22643
5
citations
#4731

Interpretable Image Classification via Non-parametric Part Prototype Learning

Zhijie Zhu, Lei Fan, Maurice Pagnucco et al.

CVPR 2025posterarXiv:2503.10247
5
citations
#4732

Learning Normal Flow Directly From Events

Dehao Yuan, Levi Burner, Jiayi Wu et al.

ICCV 2025posterarXiv:2412.11284
5
citations
#4733

Language Agents Meet Causality -- Bridging LLMs and Causal World Models

John Gkountouras, Matthias Lindemann, Phillip Lippe et al.

ICLR 2025oralarXiv:2410.19923
5
citations
#4734

V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents

Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.

CVPR 2025posterarXiv:2503.12077
5
citations
#4735

AdaFisher: Adaptive Second Order Optimization via Fisher Information

Damien GOMES, Yanlei Zhang, Eugene Belilovsky et al.

ICLR 2025posterarXiv:2405.16397
5
citations
#4736

Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation

Kaining Ying, Henghui Ding, Guangquan Jie et al.

ICCV 2025posterarXiv:2507.22886
5
citations
#4737

Variational Search Distributions

Dan Steinberg, Rafael Oliveira, Cheng Soon Ong et al.

ICLR 2025posterarXiv:2409.06142
5
citations
#4738

When Selection Meets Intervention: Additional Complexities in Causal Discovery

Haoyue Dai, Ignavier Ng, Jianle Sun et al.

ICLR 2025posterarXiv:2503.07302
5
citations
#4739

Towards Improving Exploration through Sibling Augmented GFlowNets

Kanika Madan, Alex Lamb, Emmanuel Bengio et al.

ICLR 2025poster
5
citations
#4740

Lifelong Knowledge Editing for Vision Language Models with Low-Rank Mixture-of-Experts

Qizhou Chen, Chengyu Wang, Dakan Wang et al.

CVPR 2025posterarXiv:2411.15432
5
citations
#4741

Topological Schrödinger Bridge Matching

Maosheng Yang

ICLR 2025posterarXiv:2504.04799
5
citations
#4742

Unleashing the Potential of Vision-Language Pre-Training for 3D Zero-Shot Lesion Segmentation via Mask-Attribute Alignment

Yankai Jiang, Wenhui Lei, Xiaofan Zhang et al.

ICLR 2025posterarXiv:2410.15744
5
citations
#4743

DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction

Ben Kaye, Tomas Jakab, Shangzhe Wu et al.

CVPR 2025highlightarXiv:2412.04464
5
citations
#4744

AtomSurf: Surface Representation for Learning on Protein Structures

Vincent Mallet, Yangyang Miao, Souhaib Attaiki et al.

ICLR 2025posterarXiv:2309.16519
5
citations
#4745

Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation

Tuna Meral, Enis Simsar, Federico Tombari et al.

ICCV 2025highlightarXiv:2403.19776
5
citations
#4746

FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model

Yukang Cao, Chenyang Si, Jinghao Wang et al.

ICCV 2025posterarXiv:2507.01953
5
citations
#4747

A Solvable Attention for Neural Scaling Laws

Bochen Lyu, Di Wang, Zhanxing Zhu

ICLR 2025poster
5
citations
#4748

Multi-turn Consistent Image Editing

Zijun Zhou, Yingying Deng, Xiangyu He et al.

ICCV 2025posterarXiv:2505.04320
5
citations
#4749

Shape it Up! Restoring LLM Safety during Finetuning

ShengYun Peng, Pin-Yu Chen, Jianfeng Chi et al.

NEURIPS 2025posterarXiv:2505.17196
5
citations
#4750

PRIMAL: Physically Reactive and Interactive Motor Model for Avatar Learning

Yan Zhang, Yao Feng, Alpár Cseke et al.

ICCV 2025posterarXiv:2503.17544
5
citations
#4751

Importance Corrected Neural JKO Sampling

Johannes Hertrich, Robert Gruhlke

ICML 2025posterarXiv:2407.20444
5
citations
#4752

A Common Pitfall of Margin-based Language Model Alignment: Gradient Entanglement

Hui Yuan, Yifan Zeng, Yue Wu et al.

ICLR 2025posterarXiv:2410.13828
5
citations
#4753

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2025posterarXiv:2506.10966
5
citations
#4754

ELITE: Enhanced Language-Image Toxicity Evaluation for Safety

Wonjun Lee, Doehyeon Lee, Eugene Choi et al.

ICML 2025posterarXiv:2502.04757
5
citations
#4755

What should a neuron aim for? Designing local objective functions based on information theory

Andreas C. Schneider, Valentin Neuhaus, David Ehrlich et al.

ICLR 2025posterarXiv:2412.02482
5
citations
#4756

Scaling Laws for Floating–Point Quantization Training

Xingwu Sun, Shuaipeng Li, Ruobing Xie et al.

ICML 2025poster
5
citations
#4757

DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations

Krishna Sri Ipsit Mantri, Carola-Bibiane Schönlieb, Bruno Ribeiro et al.

CVPR 2025posterarXiv:2502.06029
5
citations
#4758

Can Generative Geospatial Diffusion Models Excel as Discriminative Geospatial Foundation Models?

Yuru Jia, Valerio Marsocci, Ziyang Gong et al.

ICCV 2025posterarXiv:2503.07890
5
citations
#4759

CTRL-D: Controllable Dynamic 3D Scene Editing with Personalized 2D Diffusion

Kai He, Chin-Hsuan Wu, Igor Gilitschenski

CVPR 2025posterarXiv:2412.01792
5
citations
#4760

IMFine: 3D Inpainting via Geometry-guided Multi-view Refinement

Zhihao Shi, Dong Huo, Yuhongze Zhou et al.

CVPR 2025posterarXiv:2503.04501
5
citations
#4761

High-Fidelity Relightable Monocular Portrait Animation with Lighting-Controllable Video Diffusion Model

Mingtao Guo, Guanyu Xing, Yanli Liu

CVPR 2025posterarXiv:2502.19894
5
citations
#4762

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Wenqi Zhang, Hang Zhang, Xin Li et al.

ICCV 2025highlightarXiv:2501.00958
5
citations
#4763

egoPPG: Heart Rate Estimation from Eye-Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks

Björn Braun, Rayan Armani, Manuel Meier et al.

ICCV 2025posterarXiv:2502.20879
5
citations
#4764

Pos3R: 6D Pose Estimation for Unseen Objects Made Easy

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

CVPR 2025poster
5
citations
#4765

Fine-grained Spatiotemporal Grounding on Egocentric Videos

Shuo LIANG, Yiwu Zhong, Zi-Yuan Hu et al.

ICCV 2025posterarXiv:2508.00518
5
citations
#4766

KV Shifting Attention Enhances Language Modeling

Mingyu Xu, Bingning Wang, Weipeng Chen

ICML 2025oralarXiv:2411.19574
5
citations
#4767

PARQ: Piecewise-Affine Regularized Quantization

Lisa Jin, Jianhao Ma, Zechun Liu et al.

ICML 2025posterarXiv:2503.15748
5
citations
#4768

ProbeSDF: Light Field Probes For Neural Surface Reconstruction

Briac Toussaint, Diego Thomas, Jean-Sébastien Franco

CVPR 2025posterarXiv:2412.10084
5
citations
#4769

HandOS: 3D Hand Reconstruction in One Stage

Xingyu Chen, Zhuheng Song, Xiaoke Jiang et al.

CVPR 2025posterarXiv:2412.01537
5
citations
#4770

DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering

Yihao Wang, Marcus Klasson, Matias Turkulainen et al.

CVPR 2025posterarXiv:2411.19756
5
citations
#4771

From Alexnet to Transformers: Measuring the Non-linearity of Deep Neural Networks with Affine Optimal Transport

Quentin Bouniot, Ievgen Redko, Anton Mallasto et al.

CVPR 2025posterarXiv:2310.11439
5
citations
#4772

Context-Enhanced Memory-Refined Transformer for Online Action Detection

Zhanzhong Pang, Fadime Sener, Angela Yao

CVPR 2025posterarXiv:2503.18359
5
citations
#4773

How to Verify Any (Reasonable) Distribution Property: Computationally Sound Argument Systems for Distributions

Tal Herman, Guy Rothblum

ICLR 2025posterarXiv:2409.06594
5
citations
#4774

Zeroth-Order Fine-Tuning of LLMs with Transferable Static Sparsity

Wentao Guo, Jikai Long, Yimeng Zeng et al.

ICLR 2025poster
5
citations
#4775

Cropper: Vision-Language Model for Image Cropping through In-Context Learning

Seung Hyun Lee, Jijun jiang, Yiran Xu et al.

CVPR 2025posterarXiv:2408.07790
5
citations
#4776

Severing Spurious Correlations with Data Pruning

Varun Mulchandani, Jung-Eun Kim

ICLR 2025posterarXiv:2503.18258
5
citations
#4777

HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics

Jongsung Lee, HARIN PARK, Byeong-Uk Lee et al.

CVPR 2025poster
5
citations
#4778

On Generalization Across Environments In Multi-Objective Reinforcement Learning

Jayden Teoh, Pradeep Varakantham, Peter Vamplew

ICLR 2025posterarXiv:2503.00799
5
citations
#4779

EffiCoder: Enhancing Code Generation in Large Language Models through Efficiency-Aware Fine-tuning

Dong HUANG, Guangtao Zeng, Jianbo Dai et al.

ICML 2025posterarXiv:2410.10209
5
citations
#4780

ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks

Santiago Cadena, Andrea Merlo, Emanuel Laude et al.

NEURIPS 2025posterarXiv:2506.19583
5
citations
#4781

IterIS: Iterative Inference-Solving Alignment for LoRA Merging

Hongxu chen, Zhen Wang, Runshi Li et al.

CVPR 2025posterarXiv:2411.15231
5
citations
#4782

(How) Can Transformers Predict Pseudo-Random Numbers?

Tao Tao, Darshil Doshi, Dayal Singh Kalra et al.

ICML 2025posterarXiv:2502.10390
5
citations
#4783

Multi-modal Vision Pre-training for Medical Image Analysis

Shaohao Rui, Lingzhi Chen, Zhenyu Tang et al.

CVPR 2025highlightarXiv:2410.10604
5
citations
#4784

Bayesian WeakS-to-Strong from Text Classification to Generation

Ziyun Cui, Ziyang Zhang, Guangzhi Sun et al.

ICLR 2025posterarXiv:2406.03199
5
citations
#4785

CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss

Dileepa Pitawela, Gustavo Carneiro, Hsiang-Ting Chen

CVPR 2025posterarXiv:2504.17813
5
citations
#4786

MAGNET: A Multi-agent Framework for Finding Audio-Visual Needles by Reasoning over Multi-Video Haystacks

Sanjoy Chowdhury, Mohamed Elmoghany, Yohan Abeysinghe et al.

NEURIPS 2025oralarXiv:2506.07016
5
citations
#4787

Learning to Communicate Through Implicit Communication Channels

Han Wang, Binbin Chen, zhang et al.

ICLR 2025posterarXiv:2411.01553
5
citations
#4788

TTFSFormer: A TTFS-based Lossless Conversion of Spiking Transformer

Lusen Zhao, Zihan Huang, Ding Jianhao et al.

ICML 2025poster
5
citations
#4789

Understanding Generalization in Quantum Machine Learning with Margins

TAK HUR, Daniel Kyungdeock Park

ICML 2025posterarXiv:2411.06919
5
citations
#4790

Rethinking Fair Representation Learning for Performance-Sensitive Tasks

Charles Jones, Fabio De Sousa Ribeiro, Mélanie Roschewitz et al.

ICLR 2025posterarXiv:2410.04120
5
citations
#4791

ABC-Former: Auxiliary Bimodal Cross-domain Transformer with Interactive Channel Attention for White Balance

Yu-Cheng Chiu, GUAN-RONG CHEN, Zihao Chen et al.

CVPR 2025poster
5
citations
#4792

SDGOCC: Semantic and Depth-Guided Bird's-Eye View Transformation for 3D Multimodal Occupancy Prediction

ZaiPeng Duan, Xuzhong Hu, Pei An et al.

CVPR 2025posterarXiv:2507.17083
5
citations
#4793

Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement

Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.

CVPR 2025posterarXiv:2503.15404
5
citations
#4794

Reward-Augmented Data Enhances Direct Preference Alignment of LLMs

Shenao Zhang, Zhihan Liu, Boyi Liu et al.

ICML 2025posterarXiv:2410.08067
5
citations
#4795

Question-Aware Gaussian Experts for Audio-Visual Question Answering

Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.

CVPR 2025highlightarXiv:2503.04459
5
citations
#4796

DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation

Mu Chen, Liulei Li, Wenguan Wang et al.

CVPR 2025posterarXiv:2503.13957
5
citations
#4797

BrainOmni: A Brain Foundation Model for Unified EEG and MEG Signals

Qinfan Xiao, Ziyun Cui, Chi Zhang et al.

NEURIPS 2025oralarXiv:2505.18185
4
citations
#4798

Bridging the Semantic Gap Between Text and Table: A Case Study on NL2SQL

Lin Long, Xijun Gu, Xinjie Sun et al.

ICLR 2025poster
4
citations
#4799

TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser et al.

CVPR 2025highlightarXiv:2411.15580
4
citations
#4800

OpenSDI: Spotting Diffusion-Generated Images in the Open World

Yabin Wang, Zhiwu Huang, Xiaopeng Hong

CVPR 2025posterarXiv:2503.19653
4
citations