Most Cited 2025 "feature dimension selection" Papers

22,274 papers found • Page 13 of 112

#2401

EasyHOI: Unleashing the Power of Large Models for Reconstructing Hand-Object Interactions in the Wild

Yumeng Liu, Xiaoxiao Long, Zemin Yang et al.

CVPR 2025posterarXiv:2411.14280
12
citations
#2402

SVG: 3D Stereoscopic Video Generation via Denoising Frame Matrix

Peng Dai, Feitong Tan, Qiangeng Xu et al.

ICLR 2025posterarXiv:2407.00367
12
citations
#2403

LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

Guanzheng Chen, Xin Li, Michael Qizhe Shieh et al.

ICLR 2025posterarXiv:2502.13922
12
citations
#2404

Detail Matters: Mamba-Inspired Joint Unfolding Network for Snapshot Spectral Compressive Imaging

Mengjie Qin, Yuchao Feng, Zongliang Wu et al.

AAAI 2025paperarXiv:2501.01262
12
citations
#2405

Dissecting Submission Limit in Desk-Rejections: A Mathematical Analysis of Fairness in AI Conference Policies

Yuefan Cao, Xiaoyu Li, Yingyu Liang et al.

ICML 2025posterarXiv:2502.00690
12
citations
#2406

Accelerating RL for LLM Reasoning with Optimal Advantage Regression

Kianté Brantley, Mingyu Chen, Zhaolin Gao et al.

NEURIPS 2025posterarXiv:2505.20686
12
citations
#2407

Decision Information Meets Large Language Models: The Future of Explainable Operations Research

Yansen Zhang, Qingcan Kang, Wing Yin YU et al.

ICLR 2025posterarXiv:2502.09994
12
citations
#2408

LeanAgent: Lifelong Learning for Formal Theorem Proving

Adarsh Kumarappan, Mohit Tiwari, Peiyang Song et al.

ICLR 2025posterarXiv:2410.06209
12
citations
#2409

Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model

Leheng Zhang, Weiyi You, Kexuan Shi et al.

CVPR 2025posterarXiv:2503.18512
12
citations
#2410

MG-MotionLLM: A Unified Framework for Motion Comprehension and Generation across Multiple Granularities

Bizhu Wu, Jinheng Xie, Keming Shen et al.

CVPR 2025posterarXiv:2504.02478
12
citations
#2411

Broadening Target Distributions for Accelerated Diffusion Models via a Novel Analysis Approach

Yuchen Liang, Peizhong Ju, Yingbin Liang et al.

ICLR 2025posterarXiv:2402.13901
12
citations
#2412

Zero-Shot Monocular Scene Flow Estimation in the Wild

Yiqing Liang, Abhishek Badki, Hang Su et al.

CVPR 2025posterarXiv:2501.10357
12
citations
#2413

Imagine360: Immersive 360 Video Generation from Perspective Anchor

Jing Tan, Shuai Yang, Tong Wu et al.

NEURIPS 2025posterarXiv:2412.03552
12
citations
#2414

RelCon: Relative Contrastive Learning for a Motion Foundation Model for Wearable Data

Maxwell Xu, Jaya Narain, Gregory Darnell et al.

ICLR 2025posterarXiv:2411.18822
12
citations
#2415

Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis

Yanzuo Lu, Yuxi Ren, Xin Xia et al.

ICCV 2025highlightarXiv:2507.18569
12
citations
#2416

MotionPro: A Precise Motion Controller for Image-to-Video Generation

Zhongwei Zhang, Fuchen Long, Zhaofan Qiu et al.

CVPR 2025posterarXiv:2505.20287
12
citations
#2417

CROW: Eliminating Backdoors from Large Language Models via Internal Consistency Regularization

Nay Myat Min, Long H. Pham, Yige Li et al.

ICML 2025posterarXiv:2411.12768
12
citations
#2418

ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing

Yulin Pan, Xiangteng He, Chaojie Mao et al.

ICCV 2025posterarXiv:2503.14482
12
citations
#2419

FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors

Yabo Zhang, xinpeng zhou, Yihan Zeng et al.

ICCV 2025posterarXiv:2501.08225
12
citations
#2420

Galileo: Learning Global & Local Features of Many Remote Sensing Modalities

Gabriel Tseng, Anthony Fuller, Marlena Reil et al.

ICML 2025posterarXiv:2502.09356
12
citations
#2421

HandDiffuse: Generative Controllers for Two-Hand Interactions via Diffusion Models

Pei Lin

AAAI 2025paperarXiv:2312.04867
12
citations
#2422

On Reasoning Strength Planning in Large Reasoning Models

Leheng Sheng, An Zhang, Zijian Wu et al.

NEURIPS 2025posterarXiv:2506.08390
12
citations
#2423

RealRAG: Retrieval-augmented Realistic Image Generation via Self-reflective Contrastive Learning

Yuanhuiyi Lyu, Xu Zheng, Lutao Jiang et al.

ICML 2025posterarXiv:2502.00848
12
citations
#2424

MAP: Multi-Human-Value Alignment Palette

Xinran Wang, Qi Le, Ammar Ahmed et al.

ICLR 2025posterarXiv:2410.19198
12
citations
#2425

Generative Planning with 3D-Vision Language Pre-training for End-to-End Autonomous Driving

Tengpeng Li, Hanli Wang, Xianfei Li et al.

AAAI 2025paperarXiv:2501.08861
12
citations
#2426

TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training

Felix Krause, Timy Phan, Ming Gui et al.

ICCV 2025posterarXiv:2501.04765
12
citations
#2427

Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning

Minheng Ni, Zhengyuan Yang, Linjie Li et al.

NEURIPS 2025posterarXiv:2505.19702
12
citations
#2428

Rethinking Spiking Neural Networks from an Ensemble Learning Perspective

Yongqi Ding, Lin Zuo, Mengmeng Jing et al.

ICLR 2025oralarXiv:2502.14218
12
citations
#2429

Cost-efficient Collaboration between On-device and Cloud Language Models

Avanika Narayan, Dan Biderman, Sabri Eyuboglu et al.

ICML 2025posterarXiv:2502.15964
12
citations
#2430

Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaptation

Anqi Li, Feng Li, Yuxi Liu et al.

ICLR 2025posterarXiv:2406.00758
12
citations
#2431

Instant Adversarial Purification with Adversarial Consistency Distillation

Chun Tong Lei, Hon Ming Yam, Zhongliang Guo et al.

CVPR 2025posterarXiv:2408.17064
12
citations
#2432

Trusted Multi-View Classification via Evolutionary Multi-View Fusion

Xinyan Liang, Pinhan Fu, Yuhua Qian et al.

ICLR 2025poster
12
citations
#2433

EgoLM: Multi-Modal Language Model of Egocentric Motions

Fangzhou Hong, Vladimir Guzov, Hyo Jin Kim et al.

CVPR 2025posterarXiv:2409.18127
12
citations
#2434

VLA-OS: Structuring and Dissecting Planning Representations and Paradigms in Vision-Language-Action Models

Chongkai Gao, Zixuan Liu, Zhenghao Chi et al.

NEURIPS 2025posterarXiv:2506.17561
12
citations
#2435

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

Jiale Cheng, Xiao Liu, Cunxiang Wang et al.

ICLR 2025posterarXiv:2412.11605
12
citations
#2436

One-for-More: Continual Diffusion Model for Anomaly Detection

Xiaofan Li, Xin Tan, Zhuo Chen et al.

CVPR 2025posterarXiv:2502.19848
12
citations
#2437

PaTH Attention: Position Encoding via Accumulating Householder Transformations

Songlin Yang, Yikang Shen, Kaiyue Wen et al.

NEURIPS 2025posterarXiv:2505.16381
12
citations
#2438

Inference Optimal VLMs Need Fewer Visual Tokens and More Parameters

Kevin Li, Sachin Goyal, João D Semedo et al.

ICLR 2025posterarXiv:2411.03312
12
citations
#2439

METASCENES: Towards Automated Replica Creation for Real-world 3D Scans

Huangyue Yu, Baoxiong Jia, Yixin Chen et al.

CVPR 2025posterarXiv:2505.02388
12
citations
#2440

ConfigX: Modular Configuration for Evolutionary Algorithms via Multitask Reinforcement Learning

Hongshu Guo, Zeyuan Ma, Jiacheng Chen et al.

AAAI 2025paperarXiv:2412.07507
12
citations
#2441

MotionLab: Unified Human Motion Generation and Editing via the Motion-Condition-Motion Paradigm

Ziyan Guo, Zeyu HU, Na Zhao et al.

ICCV 2025posterarXiv:2502.02358
12
citations
#2442

Generative Classifiers Avoid Shortcut Solutions

Alexander Li, Ananya Kumar, Deepak Pathak

ICLR 2025posterarXiv:2512.25034
12
citations
#2443

Towards a Universal Synthetic Video Detector: From Face or Background Manipulations to Fully AI-Generated Content

Rohit Kundu, Hao Xiong, Vishal Mohanty et al.

CVPR 2025posterarXiv:2412.12278
12
citations
#2444

Speech Robust Bench: A Robustness Benchmark For Speech Recognition

Muhammad Shah, David Solans Noguero, Mikko Heikkilä et al.

ICLR 2025posterarXiv:2403.07937
12
citations
#2445

CL-Attack: Textual Backdoor Attacks via Cross-Lingual Triggers

Jingyi Zheng, Tianyi Hu, Tianshuo Cong et al.

AAAI 2025paperarXiv:2412.19037
12
citations
#2446

LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid

Tianyi Zhang, Anshumali Shrivastava

ICLR 2025posterarXiv:2407.10032
12
citations
#2447

Bag of Tricks for Inference-time Computation of LLM Reasoning

Fan LIU, Wen-Shuo Chao, Naiqiang Tan et al.

NEURIPS 2025posterarXiv:2502.07191
12
citations
#2448

SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing

Ming Li, Xin Gu, Fan Chen et al.

ICCV 2025posterarXiv:2505.02370
12
citations
#2449

Exploring the limits of strong membership inference attacks on large language models

Jamie Hayes, I Shumailov, Christopher A. Choquette-Choo et al.

NEURIPS 2025posterarXiv:2505.18773
12
citations
#2450

Brain Mapping with Dense Features: Grounding Cortical Semantic Selectivity in Natural Images With Vision Transformers

Andrew Luo, Jacob Yeung, Rushikesh Zawar et al.

ICLR 2025posterarXiv:2410.05266
12
citations
#2451

Defeasible Visual Entailment: Benchmark, Evaluator, and Reward-Driven Optimization

Yue Zhang, Liqiang Jing, Vibhav Gogate

AAAI 2025paperarXiv:2412.16232
12
citations
#2452

$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs

Vlad Sobal, Mark Ibrahim, Randall Balestriero et al.

ICLR 2025posterarXiv:2407.18134
12
citations
#2453

Diffusion Models for Attribution

Xiongren Chen, Jiuyong Li, Jixue Liu et al.

AAAI 2025paperarXiv:2403.14790
12
citations
#2454

The Illusion of Unlearning: The Unstable Nature of Machine Unlearning in Text-to-Image Diffusion Models

Naveen George, Karthik Nandan Dasaraju, Rutheesh Reddy Chittepu et al.

CVPR 2025poster
12
citations
#2455

MVPortrait: Text-Guided Motion and Emotion Control for Multi-view Vivid Portrait Animation

Yukang Lin, Hokit Fung, Jianjin Xu et al.

CVPR 2025posterarXiv:2503.19383
12
citations
#2456

SecureGS: Boosting the Security and Fidelity of 3D Gaussian Splatting Steganography

Xuanyu Zhang, Jiarui Meng, Zhipei Xu et al.

ICLR 2025posterarXiv:2503.06118
12
citations
#2457

MVGenMaster: Scaling Multi-View Generation from Any Image via 3D Priors Enhanced Diffusion Model

Chenjie Cao, Chaohui Yu, Shang Liu et al.

CVPR 2025posterarXiv:2411.16157
12
citations
#2458

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Yatian Pang, Bin Zhu, Bin Lin et al.

ICCV 2025posterarXiv:2412.00397
12
citations
#2459

Stable Segment Anything Model

Qi Fan, Xin Tao, Lei Ke et al.

ICLR 2025posterarXiv:2311.15776
12
citations
#2460

UniMuMo: Unified Text, Music, and Motion Generation

Han Yang, Kun Su, Yutong Zhang et al.

AAAI 2025paperarXiv:2410.04534
12
citations
#2461

Imputation for prediction: beware of diminishing returns.

Marine Le Morvan, Gael Varoquaux

ICLR 2025posterarXiv:2407.19804
12
citations
#2462

The Power of Context: How Multimodality Improves Image Super-Resolution

Kangfu Mei, Vishal M. Patel, Mojtaba Sahraee-Ardakan et al.

CVPR 2025posterarXiv:2503.14503
12
citations
#2463

DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation

Zhixuan Liang, Yao Mu, Yixiao Wang et al.

CVPR 2025posterarXiv:2411.18562
12
citations
#2464

CRUST-Bench: A Comprehensive Benchmark for C-to-safe-Rust Transpilation

Anirudh Khatry, Robert Zhang, Jia Pan et al.

COLM 2025paperarXiv:2504.15254
12
citations
#2465

Advancing Spiking Neural Networks Towards Multiscale Spatiotemporal Interaction Learning

Yimeng Shan, Malu Zhang, Rui-jie Zhu et al.

AAAI 2025paperarXiv:2405.13672
12
citations
#2466

NAVIX: Scaling MiniGrid Environments with JAX

Eduardo Pignatelli, Jarek Liesen, Robert Lange et al.

NEURIPS 2025posterarXiv:2407.19396
12
citations
#2467

Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models

Ben Finkelshtein, Ismail Ilkan Ceylan, Michael Bronstein et al.

NEURIPS 2025posterarXiv:2506.14291
12
citations
#2468

LUDVIG: Learning-Free Uplifting of 2D Visual Features to Gaussian Splatting Scenes

Juliette Marrie, Romain Menegaux, Michael Arbel et al.

ICCV 2025posterarXiv:2410.14462
12
citations
#2469

NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models

Sung-Yeon Park, Can Cui, Yunsheng Ma et al.

ICCV 2025posterarXiv:2503.12772
12
citations
#2470

Speech Recognition Meets Large Language Model: Benchmarking, Models, and Exploration

Ziyang Ma, Guanrou Yang, Yifan Yang et al.

AAAI 2025paper
12
citations
#2471

Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems

Shangbin Feng, Zifeng Wang, Palash Goyal et al.

NEURIPS 2025posterarXiv:2502.04510
12
citations
#2472

DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations

Ziqiao Peng, Yanbo Fan, Haoyu Wu et al.

CVPR 2025posterarXiv:2505.18096
12
citations
#2473

In Search of Adam’s Secret Sauce

Antonio Orvieto, Robert Gower

NEURIPS 2025oralarXiv:2505.21829
12
citations
#2474

HierarQ: Task-Aware Hierarchical Q-Former for Enhanced Video Understanding

Shehreen Azad, Vibhav Vineet, Yogesh S. Rawat

CVPR 2025posterarXiv:2503.08585
12
citations
#2475

CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring

Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger et al.

NEURIPS 2025posterarXiv:2505.23575
12
citations
#2476

Citations and Trust in LLM Generated Responses

Yifan Ding, Matthew Facciani, Ellen Joyce et al.

AAAI 2025paperarXiv:2501.01303
12
citations
#2477

MoDec-GS: Global-to-Local Motion Decomposition and Temporal Interval Adjustment for Compact Dynamic 3D Gaussian Splatting

Sangwoon Kwak, Joonsoo Kim, Jun Young Jeong et al.

CVPR 2025posterarXiv:2501.03714
12
citations
#2478

Learning Molecular Representation in a Cell

Gang Liu, Srijit Seal, John Arevalo et al.

ICLR 2025posterarXiv:2406.12056
12
citations
#2479

Prototype-Based Image Prompting for Weakly Supervised Histopathological Image Segmentation

Qingchen Tang, Lei Fan, Maurice Pagnucco et al.

CVPR 2025posterarXiv:2503.12068
12
citations
#2480

Mr. DETR: Instructive Multi-Route Training for Detection Transformers

Chang-Bin Zhang, Yujie Zhong, Kai Han

CVPR 2025poster
12
citations
#2481

Ambient Diffusion Omni: Training Good Models with Bad Data

Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.

NEURIPS 2025spotlightarXiv:2506.10038
12
citations
#2482

Identifying Query-Relevant Neurons in Large Language Models for Long-Form Texts

Lihu Chen, Adam Dejl, Francesca Toni

AAAI 2025paperarXiv:2406.10868
12
citations
#2483

BVINet: Unlocking Blind Video Inpainting with Zero Annotations

zhiliang wu, Kerui Chen, Kun Li et al.

ICCV 2025posterarXiv:2502.01181
12
citations
#2484

MDNS: Masked Diffusion Neural Sampler via Stochastic Optimal Control

Yuchen Zhu, Wei Guo, Jaemoo Choi et al.

NEURIPS 2025posterarXiv:2508.10684
12
citations
#2485

Mitigating the Backdoor Effect for Multi-Task Model Merging via Safety-Aware Subspace

Jinluan Yang, Anke Tang, Didi Zhu et al.

ICLR 2025posterarXiv:2410.13910
12
citations
#2486

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation

Yibo Wang, Tiansheng Huang, Li Shen et al.

NEURIPS 2025posterarXiv:2501.18100
12
citations
#2487

RealGeneral: Unifying Visual Generation via Temporal In-Context Learning with Video Models

Yijing Lin, Mengqi Huang, Shuhan Zhuang et al.

ICCV 2025posterarXiv:2503.10406
12
citations
#2488

Patch-wise Structural Loss for Time Series Forecasting

Dilfira Kudrat, Zongxia Xie, Yanru Sun et al.

ICML 2025oralarXiv:2503.00877
12
citations
#2489

SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction

Ling Yang, Zhaochen Yu, Tianjun Zhang et al.

ICLR 2025posterarXiv:2410.09008
12
citations
#2490

Searching Latent Program Spaces

Matthew Macfarlane, Clem Bonnet

NEURIPS 2025spotlightarXiv:2411.08706
12
citations
#2491

DropoutGS: Dropping Out Gaussians for Better Sparse-view Rendering

Yexing Xu, Longguang Wang, Minglin Chen et al.

CVPR 2025posterarXiv:2504.09491
12
citations
#2492

Accelerating Large Language Model Reasoning via Speculative Search

Zhihai Wang, Jie Wang, Jilai Pan et al.

ICML 2025posterarXiv:2505.02865
12
citations
#2493

BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model

Adibvafa Fallahpour, Andrew Magnuson, Purav Gupta et al.

NEURIPS 2025posterarXiv:2505.23579
12
citations
#2494

Learning Hazing to Dehazing: Towards Realistic Haze Generation for Real-World Image Dehazing

Ruiyi Wang, Yushuo Zheng, Zicheng Zhang et al.

CVPR 2025posterarXiv:2503.19262
12
citations
#2495

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs

Xinyu Fang, Zhijian Chen, Kai Lan et al.

ICCV 2025posterarXiv:2503.14478
12
citations
#2496

InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding

Minsoo Kim, Kyuhong Shim, Jungwook Choi et al.

NEURIPS 2025oralarXiv:2506.15745
12
citations
#2497

GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs

Advik Basani, Xiao Zhang

NEURIPS 2025posterarXiv:2411.14133
12
citations
#2498

KeyDiff: Key Similarity-Based KV Cache Eviction for Long-Context LLM Inference in Resource-Constrained Environments

Junyoung Park, Dalton Jones, Matthew Morse et al.

NEURIPS 2025posterarXiv:2504.15364
12
citations
#2499

VideoICL: Confidence-based Iterative In-context Learning for Out-of-Distribution Video Understanding

Kangsan Kim, Geon Park, Youngwan Lee et al.

CVPR 2025posterarXiv:2412.02186
12
citations
#2500

Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model

Chaochen Gao, Xing W, Qi Fu et al.

ICLR 2025posterarXiv:2405.19846
12
citations
#2501

Enhancing Personalized Multi-Turn Dialogue with Curiosity Reward

Yanming Wan, Jiaxing Wu, Marwa Abdulhai et al.

NEURIPS 2025posterarXiv:2504.03206
12
citations
#2502

Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving

Yue Li, Meng Tian, Zhenyu Lin et al.

ICCV 2025posterarXiv:2503.21505
12
citations
#2503

Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors

Haiyu Wu, Jaskirat Singh, Sicong Tian et al.

ICLR 2025posterarXiv:2409.02979
12
citations
#2504

Reward-Guided Iterative Refinement in Diffusion Models at Test-Time with Applications to Protein and DNA Design

Masatoshi Uehara, su, Yulai Zhao et al.

ICML 2025posterarXiv:2502.14944
12
citations
#2505

Solving New Tasks by Adapting Internet Video Knowledge

Calvin Luo, Zilai Zeng, Yilun Du et al.

ICLR 2025posterarXiv:2504.15369
12
citations
#2506

Post-hoc Reward Calibration: A Case Study on Length Bias

Zeyu Huang, Zihan Qiu, zili wang et al.

ICLR 2025posterarXiv:2409.17407
12
citations
#2507

DRiVE: Diffusion-based Rigging Empowers Generation of Versatile and Expressive Characters

Mingze Sun, Junting Dong, Junhao Chen et al.

CVPR 2025posterarXiv:2411.17423
12
citations
#2508

VinTAGe: Joint Video and Text Conditioning for Holistic Audio Generation

Saksham Singh Kushwaha, Yapeng Tian

CVPR 2025posterarXiv:2412.10768
12
citations
#2509

Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations

Shaocong Ma, Heng Huang

ICLR 2025posterarXiv:2510.19975
12
citations
#2510

P-SPIKESSM: HARNESSING PROBABILISTIC SPIKING STATE SPACE MODELS FOR LONG-RANGE DEPENDENCY TASKS

Malyaban Bal, Abhronil Sengupta

ICLR 2025posterarXiv:2406.02923
12
citations
#2511

LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

Enis Simsar, Thomas Hofmann, Federico Tombari et al.

CVPR 2025posterarXiv:2412.09622
12
citations
#2512

Completion as Enhancement: A Degradation-Aware Selective Image Guided Network for Depth Completion

Zhiqiang Yan, Zhengxue Wang, Kun Wang et al.

CVPR 2025posterarXiv:2412.19225
12
citations
#2513

Flow matching achieves almost minimax optimal convergence

Kenji Fukumizu, Taiji Suzuki, Noboru Isobe et al.

ICLR 2025posterarXiv:2405.20879
12
citations
#2514

MCCD: Multi-Agent Collaboration-based Compositional Diffusion for Complex Text-to-Image Generation

Mingcheng Li, Xiaolu Hou, Ziyang Liu et al.

CVPR 2025posterarXiv:2505.02648
12
citations
#2515

OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup

Xize Cheng, Siqi Zheng, zehan wang et al.

ICLR 2025posterarXiv:2410.21269
12
citations
#2516

Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation

Qi Lv, Hao Li, Xiang Deng et al.

CVPR 2025posterarXiv:2503.10743
12
citations
#2517

StarGen: A Spatiotemporal Autoregression Framework with Video Diffusion Model for Scalable and Controllable Scene Generation

Shangjin Zhai, Zhichao Ye, Jialin Liu et al.

CVPR 2025posterarXiv:2501.05763
12
citations
#2518

Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering

Xingrui Wang, Wufei Ma, Angtian Wang et al.

ICLR 2025oralarXiv:2406.00622
12
citations
#2519

Repulsive Latent Score Distillation for Solving Inverse Problems

Nicolas Zilberstein, Morteza Mardani, Santiago Segarra

ICLR 2025posterarXiv:2406.16683
12
citations
#2520

Backdoor Cleaning without External Guidance in MLLM Fine-tuning

Xuankun Rong, Wenke Huang, Jian Liang et al.

NEURIPS 2025posterarXiv:2505.16916
12
citations
#2521

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Kaiwen Zheng, Yongxin Chen, Huayu Chen et al.

ICML 2025spotlightarXiv:2503.01103
12
citations
#2522

Joint MoE Scaling Laws: Mixture of Experts Can Be Memory Efficient

Jan Ludziejewski, Maciej Pióro, Jakub Krajewski et al.

ICML 2025posterarXiv:2502.05172
12
citations
#2523

RAT: Adversarial Attacks on Deep Reinforcement Agents for Targeted Behaviors

Fengshuo Bai, Runze Liu, Yali Du et al.

AAAI 2025paperarXiv:2412.10713
12
citations
#2524

Coreset Selection via Reducible Loss in Continual Learning

Ruilin Tong, Yuhang Liu, Javen Qinfeng Shi et al.

ICLR 2025poster
12
citations
#2525

Enhancing Multi-Robot Semantic Navigation Through Multimodal Chain-of-Thought Score Collaboration

Zhixuan Shen, Haonan Luo, Kexun Chen et al.

AAAI 2025paperarXiv:2412.18292
12
citations
#2526

NexusGS: Sparse View Synthesis with Epipolar Depth Priors in 3D Gaussian Splatting

Yulong Zheng, Zicheng Jiang, Shengfeng He et al.

CVPR 2025highlightarXiv:2503.18794
12
citations
#2527

CoRA: Collaborative Information Perception by Large Language Model’s Weights for Recommendation

Yuting Liu, Jinghao Zhang, Yizhou Dang et al.

AAAI 2025paperarXiv:2408.10645
12
citations
#2528

AKiRa: Augmentation Kit on Rays for Optical Video Generation

Xi Wang, Robin Courant, Marc Christie et al.

CVPR 2025posterarXiv:2412.14158
12
citations
#2529

MaskControl: Spatio-Temporal Control for Masked Motion Synthesis

Ekkasit Pinyoanuntapong, Muhammad Usama Saleem, Korrawe Karunratanakul et al.

ICCV 2025posterarXiv:2410.10780
12
citations
#2530

Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction

Weirong Chen, Ganlin Zhang, Felix Wimbauer et al.

ICCV 2025posterarXiv:2504.14516
12
citations
#2531

Active Evaluation Acquisition for Efficient LLM Benchmarking

Yang Li, Jie Ma, Miguel Ballesteros et al.

ICML 2025posterarXiv:2410.05952
12
citations
#2532

Formation of Representations in Neural Networks

Liu Ziyin, Isaac Chuang, Tomer Galanti et al.

ICLR 2025posterarXiv:2410.03006
12
citations
#2533

RhythmMamba: Fast, Lightweight, and Accurate Remote Physiological Measurement

Bochao Zou, Zizheng Guo, Xiaocheng Hu et al.

AAAI 2025paperarXiv:2404.06483
12
citations
#2534

CyberPal.AI: Empowering LLMs with Expert-Driven Cybersecurity Instructions

Matan Levi, Yair Allouche, Daniel Ohayon et al.

AAAI 2025paperarXiv:2408.09304
12
citations
#2535

PSHuman: Photorealistic Single-image 3D Human Reconstruction using Cross-Scale Multiview Diffusion and Explicit Remeshing

Peng Li, Wangguandong Zheng, Yuan Liu et al.

CVPR 2025posterarXiv:2409.10141
12
citations
#2536

Rethinking Invariance in In-context Learning

Lizhe Fang, Yifei Wang, Khashayar Gatmiry et al.

ICLR 2025posterarXiv:2505.04994
11
citations
#2537

Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts

Guorui Zheng, Xidong Wang, Juhao Liang et al.

ICLR 2025posterarXiv:2410.10626
11
citations
#2538

GeoLLaVA-8K: Scaling Remote-Sensing Multimodal Large Language Models to 8K Resolution

Fengxiang Wang, Mingshuo Chen, Yueying Li et al.

NEURIPS 2025spotlightarXiv:2505.21375
11
citations
#2539

EMOE: Modality-Specific Enhanced Dynamic Emotion Experts

Yiyang Fang, Wenke Huang, Guancheng Wan et al.

CVPR 2025poster
11
citations
#2540

Proxy Denoising for Source-Free Domain Adaptation

Song Tang, Wenxin Su, Yan Gan et al.

ICLR 2025posterarXiv:2406.01658
11
citations
#2541

Gazing Into Missteps: Leveraging Eye-Gaze for Unsupervised Mistake Detection in Egocentric Videos of Skilled Human Activities

Michele Mazzamuto, Antonino Furnari, Yoichi Sato et al.

CVPR 2025posterarXiv:2406.08379
11
citations
#2542

Touch in the Wild: Learning Fine-Grained Manipulation with a Portable Visuo-Tactile Gripper

Xinyue Zhu, Binghao Huang, Yunzhu Li

NEURIPS 2025posterarXiv:2507.15062
11
citations
#2543

Precise Localization of Memories: A Fine-grained Neuron-level Knowledge Editing Technique for LLMs

Haowen Pan, Xiaozhi Wang, Yixin Cao et al.

ICLR 2025posterarXiv:2503.01090
11
citations
#2544

Few for Many: Tchebycheff Set Scalarization for Many-Objective Optimization

Xi Lin, Yilu Liu, Xiaoyuan Zhang et al.

ICLR 2025posterarXiv:2405.19650
11
citations
#2545

Latent Chain-of-Thought for Visual Reasoning

Guohao Sun, Hang Hua, Jian Wang et al.

NEURIPS 2025posterarXiv:2510.23925
11
citations
#2546

Jailbreak Antidote: Runtime Safety-Utility Balance via Sparse Representation Adjustment in Large Language Models

Guobin Shen, Dongcheng Zhao, Yiting Dong et al.

ICLR 2025posterarXiv:2410.02298
11
citations
#2547

LiveXiv - A Multi-Modal live benchmark based on Arxiv papers content

Nimrod Shabtay, Felipe Maia Polo, Sivan Doveh et al.

ICLR 2025posterarXiv:2410.10783
11
citations
#2548

Almost Optimal Batch-Regret Tradeoff for Batch Linear Contextual Bandits

Zihan Zhang, Xiangyang Ji, Yuan Zhou

ICLR 2025posterarXiv:2110.08057
11
citations
#2549

Adapter Merging with Centroid Prototype Mapping for Scalable Class-Incremental Learning

Takuma Fukuda, Hiroshi Kera, Kazuhiko Kawamoto

CVPR 2025posterarXiv:2412.18219
11
citations
#2550

GenPC: Zero-shot Point Cloud Completion via 3D Generative Priors

An Li, Zhe Zhu, Mingqiang Wei

CVPR 2025posterarXiv:2502.19896
11
citations
#2551

Modeling Fine-Grained Hand-Object Dynamics for Egocentric Video Representation Learning

Baoqi Pei, Yifei Huang, Jilan Xu et al.

ICLR 2025posterarXiv:2503.00986
11
citations
#2552

From Commands to Prompts: LLM-based Semantic File System for AIOS

Zeru Shi, Kai Mei, Mingyu Jin et al.

ICLR 2025posterarXiv:2410.11843
11
citations
#2553

NeRFPrior: Learning Neural Radiance Field as a Prior for Indoor Scene Reconstruction

Wenyuan Zhang, Emily Yue-ting Jia, Junsheng Zhou et al.

CVPR 2025highlightarXiv:2503.18361
11
citations
#2554

SANER: Annotation-free Societal Attribute Neutralizer for Debiasing CLIP

Yusuke Hirota, Min-Hung Chen, Chien-Yi Wang et al.

ICLR 2025posterarXiv:2408.10202
11
citations
#2555

DASK: Distribution Rehearsing via Adaptive Style Kernel Learning for Exemplar-Free Lifelong Person Re-Identification

Kunlun Xu, Chenghao Jiang, Peixi Xiong et al.

AAAI 2025paperarXiv:2412.09224
11
citations
#2556

GaussianUDF: Inferring Unsigned Distance Functions through 3D Gaussian Splatting

Shujuan Li, Yu-Shen Liu, Zhizhong Han

CVPR 2025highlightarXiv:2503.19458
11
citations
#2557

Differentiable Optimization of Similarity Scores Between Models and Brains

Nathan Cloos, Moufan Li, Markus Siegel et al.

ICLR 2025posterarXiv:2407.07059
11
citations
#2558

ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models

Junzhe Chen, Tianshu Zhang, Shiyu Huang et al.

CVPR 2025posterarXiv:2411.15268
11
citations
#2559

ViSpeak: Visual Instruction Feedback in Streaming Videos

Shenghao Fu, Qize Yang, Yuan-Ming Li et al.

ICCV 2025posterarXiv:2503.12769
11
citations
#2560

CoRe: Context-Regularized Text Embedding Learning for Text-to-Image Personalization

Feize Wu, Yun Pang, Junyi Zhang et al.

AAAI 2025paperarXiv:2408.15914
11
citations
#2561

Hyperbolic Fine-Tuning for Large Language Models

Menglin Yang, Ram Samarth B B, Aosong Feng et al.

NEURIPS 2025spotlightarXiv:2410.04010
11
citations
#2562

When the Future Becomes the Past: Taming Temporal Correspondence for Self-supervised Video Representation Learning

Yang Liu, Qianqian Xu, Peisong Wen et al.

CVPR 2025posterarXiv:2503.15096
11
citations
#2563

Knowledge Localization: Mission Not Accomplished? Enter Query Localization!

Yuheng Chen, Pengfei Cao, Yubo Chen et al.

ICLR 2025posterarXiv:2405.14117
11
citations
#2564

DriveGEN: Generalized and Robust 3D Detection in Driving via Controllable Text-to-Image Diffusion Generation

Hongbin Lin, Zilu Guo, Yifan Zhang et al.

CVPR 2025posterarXiv:2503.11122
11
citations
#2565

Learning Few-Step Diffusion Models by Trajectory Distribution Matching

Yihong Luo, Tianyang Hu, Jiacheng Sun et al.

ICCV 2025posterarXiv:2503.06674
11
citations
#2566

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu, Yi Ge, Yichen You et al.

NEURIPS 2025posterarXiv:2505.21600
11
citations
#2567

EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis

Sheng Miao, Jiaxin Huang, Dongfeng Bai et al.

CVPR 2025posterarXiv:2503.20168
11
citations
#2568

Beyond Training: Dynamic Token Merging for Zero-Shot Video Understanding

Yiming Zhang, Zhuokai Zhao, Zhaorun Chen et al.

ICCV 2025posterarXiv:2411.14401
11
citations
#2569

Preference Optimization on Pareto Sets: On a Theory of Multi-Objective Optimization

Abhishek Roy, Geelon So, Yian Ma

NEURIPS 2025poster
11
citations
#2570

BlockDance: Reuse Structurally Similar Spatio-Temporal Features to Accelerate Diffusion Transformers

Hui Zhang, Tingwei Gao, Jie Shao et al.

CVPR 2025posterarXiv:2503.15927
11
citations
#2571

Glad: A Streaming Scene Generator for Autonomous Driving

Bin Xie, Yingfei Liu, Tiancai Wang et al.

ICLR 2025oralarXiv:2503.00045
11
citations
#2572

NoT: Federated Unlearning via Weight Negation

Yasser Khalil, Leo Maxime Brunswic, Soufiane Lamghari et al.

CVPR 2025posterarXiv:2503.05657
11
citations
#2573

Node Identifiers: Compact, Discrete Representations for Efficient Graph Learning

Yuankai Luo, Hongkang Li, Qijiong Liu et al.

ICLR 2025posterarXiv:2405.16435
11
citations
#2574

Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes

Lihan Jiang, Kerui Ren, Mulin Yu et al.

CVPR 2025posterarXiv:2412.01745
11
citations
#2575

Latent Thought Models with Variational Bayes Inference-Time Computation

Deqian Kong, Minglu Zhao, Dehong Xu et al.

ICML 2025posterarXiv:2502.01567
11
citations
#2576

MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios

Jiacheng Ruan, Wenzhen Yuan, Zehao Lin et al.

AAAI 2025paperarXiv:2409.16084
11
citations
#2577

Context Steering: Controllable Personalization at Inference Time

Zhiyang He, Sashrika Pandey, Mariah Schrum et al.

ICLR 2025posterarXiv:2405.01768
11
citations
#2578

SeaS: Few-shot Industrial Anomaly Image Generation with Separation and Sharing Fine-tuning

Zhewei Dai, Shilei Zeng, Haotian Liu et al.

ICCV 2025posterarXiv:2410.14987
11
citations
#2579

Neural Sampling from Boltzmann Densities: Fisher-Rao Curves in the Wasserstein Geometry

Jannis Chemseddine, Christian Wald, Richard Duong et al.

ICLR 2025posterarXiv:2410.03282
11
citations
#2580

DenseDPO: Fine-Grained Temporal Preference Optimization for Video Diffusion Models

Ziyi Wu, Anil Kag, Ivan Skorokhodov et al.

NEURIPS 2025oralarXiv:2506.03517
11
citations
#2581

Hidden in the Noise: Two-Stage Robust Watermarking for Images

Kasra Arabi, Benjamin Feuer, R. Teal Witter et al.

ICLR 2025posterarXiv:2412.04653
11
citations
#2582

Data Synthesis with Diverse Styles for Face Recognition via 3DMM-Guided Diffusion

Yuxi Mi, Zhizhou Zhong, Yuge Huang et al.

CVPR 2025posterarXiv:2504.00430
11
citations
#2583

ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models

Yeji Park, Deokyeong Lee, Junsuk Choe et al.

AAAI 2025paperarXiv:2408.13906
11
citations
#2584

FSFM: A Generalizable Face Security Foundation Model via Self-Supervised Facial Representation Learning

Gaojian Wang, Feng Lin, Tong Wu et al.

CVPR 2025posterarXiv:2412.12032
11
citations
#2585

XLand-100B: A Large-Scale Multi-Task Dataset for In-Context Reinforcement Learning

Alexander Nikulin, Ilya Zisman, Alexey Zemtsov et al.

ICLR 2025posterarXiv:2406.08973
11
citations
#2586

RecFlow: An Industrial Full Flow Recommendation Dataset

Qi Liu, Kai Zheng, Rui Huang et al.

ICLR 2025posterarXiv:2410.20868
11
citations
#2587

Residual Stream Analysis with Multi-Layer SAEs

Tim Lawson, Lucy Farnik, Conor Houghton et al.

ICLR 2025posterarXiv:2409.04185
11
citations
#2588

Multi-view Reconstruction via SfM-guided Monocular Depth Estimation

Haoyu Guo, He Zhu, Sida Peng et al.

CVPR 2025posterarXiv:2503.14483
11
citations
#2589

FALCON: Resolving Visual Redundancy and Fragmentation in High-resolution Multimodal Large Language Models via Visual Registers

Renshan Zhang, Rui Shao, Gongwei Chen et al.

ICCV 2025posterarXiv:2501.16297
11
citations
#2590

Diff-Shadow: Global-guided Diffusion Model for Shadow Removal

Jinting Luo, Ru Li, Chengzhi Jiang et al.

AAAI 2025paperarXiv:2407.16214
11
citations
#2591

Revisiting In-context Learning Inference Circuit in Large Language Models

Hakaze Cho, Mariko Kato, Yoshihiro Sakai et al.

ICLR 2025posterarXiv:2410.04468
11
citations
#2592

Deep Learning Alternatives Of The Kolmogorov Superposition Theorem

Leonardo Ferreira Guilhoto, Paris Perdikaris

ICLR 2025posterarXiv:2410.01990
11
citations
#2593

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Connor Schenck, Isaac Reid, Mithun Jacob et al.

ICML 2025spotlightarXiv:2502.02562
11
citations
#2594

LoLCATs: On Low-Rank Linearizing of Large Language Models

Michael Zhang, Simran Arora, Rahul Chalamala et al.

ICLR 2025posterarXiv:2410.10254
11
citations
#2595

LoRASculpt: Sculpting LoRA for Harmonizing General and Specialized Knowledge in Multimodal Large Language Models

Jian Liang, Wenke Huang, Guancheng Wan et al.

CVPR 2025posterarXiv:2503.16843
11
citations
#2596

On Conformal Isometry of Grid Cells: Learning Distance-Preserving Position Embedding

Dehong Xu, Ruiqi Gao, Wenhao Zhang et al.

ICLR 2025posterarXiv:2405.16865
11
citations
#2597

Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting

Nan Wang, Lixing Xiao, Yuantao Chen et al.

NEURIPS 2025posterarXiv:2506.05280
11
citations
#2598

Federated Learning with Sample-level Client Drift Mitigation

Haoran Xu, Jiaze Li, Wanyi Wu et al.

AAAI 2025paperarXiv:2501.11360
11
citations
#2599

VideoScene: Distilling Video Diffusion Model to Generate 3D Scenes in One Step

Hanyang Wang, Fangfu Liu, Jiawei Chi et al.

CVPR 2025highlightarXiv:2504.01956
11
citations
#2600

TIGER: Time-frequency Interleaved Gain Extraction and Reconstruction for Efficient Speech Separation

Mohan Xu, Kai Li, Guo Chen et al.

ICLR 2025oralarXiv:2410.01469
11
citations