Most Cited 2025 "underdamped langevin diffusion" Papers

22,274 papers found • Page 24 of 112

#4601

FormalAlign: Automated Alignment Evaluation for Autoformalization

Jianqiao Lu, Yingjia Wan, Yinya Huang et al.

ICLR 2025 • arXiv:2410.10135 • 10 citations
#4602

Self-Improving Embodied Foundation Models

Seyed Kamyar Seyed Ghasemipour, Ayzaan Wahid, Jonathan Tompson et al.

NEURIPS 2025 • oral • arXiv:2509.15155 • 10 citations
#4603

ReCap: Better Gaussian Relighting with Cross-Environment Captures

Jingzhi Li, Zongwei Wu, Eduard Zamfir et al.

CVPR 2025 • arXiv:2412.07534 • 10 citations
#4604

SliderSpace: Decomposing the Visual Capabilities of Diffusion Models

Rohit Gandikota, Zongze Wu, Richard Zhang et al.

ICCV 2025 • arXiv:2502.01639 • 10 citations
#4605

AniMer: Animal Pose and Shape Estimation Using Family Aware Transformer

Jin Lyu, Tianyi Zhu, Yi Gu et al.

CVPR 2025 • arXiv:2412.00837 • 10 citations
#4606

What Does It Mean to Be a Transformer? Insights from a Theoretical Hessian Analysis

Weronika Ormaniec, Felix Dangel, Sidak Pal Singh

ICLR 2025 • arXiv:2410.10986 • 10 citations
#4607

VERA: Explainable Video Anomaly Detection via Verbalized Learning of Vision-Language Models

Muchao Ye, Weiyang Liu, Pan He

CVPR 2025 • arXiv:2412.01095 • 10 citations
#4608

SfM-Free 3D Gaussian Splatting via Hierarchical Training

Bo Ji, Angela Yao

CVPR 2025 • arXiv:2412.01553 • 10 citations
#4609

Continuous Diffusion for Mixed-Type Tabular Data

Markus Mueller, Kathrin Gruber, Dennis Fok

ICLR 2025 • arXiv:2312.10431 • 10 citations
#4610

Volumetrically Consistent 3D Gaussian Rasterization

Chinmay Talegaonkar, Yash Belhe, Ravi Ramamoorthi et al.

CVPR 2025 • highlight • arXiv:2412.03378 • 10 citations
#4611

Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo

Shengyu Feng, Xiang Kong, Shuang Ma et al.

ICLR 2025 • arXiv:2410.01920 • 10 citations
#4612

Radiology Report Generation via Multi-objective Preference Optimization

Ting Xiao, Lei Shi, Peng Liu et al.

AAAI 2025 • paper • arXiv:2412.08901 • 10 citations
#4613

IRIS: Inverse Rendering of Indoor Scenes from Low Dynamic Range Images

Chih-Hao Lin, Jia-Bin Huang, Zhengqin Li et al.

CVPR 2025 • arXiv:2401.12977 • 10 citations
#4614

Embodied Scene Understanding for Vision Language Models via MetaVQA

Weizhen Wang, Chenda Duan, Zhenghao Peng et al.

CVPR 2025 • arXiv:2501.09167 • 10 citations
#4615

Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Dongyoung Kim, Huiwon Jang, Sumin Park et al.

NEURIPS 2025 • arXiv:2506.00070 • 10 citations
#4616

Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training

Haicheng Wang, Chen Ju, Weixiong Lin et al.

CVPR 2025 • arXiv:2412.00440 • 10 citations
#4617

Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding

Wenxuan Guo, Xiuwei Xu, Ziwei Wang et al.

CVPR 2025 • highlight • arXiv:2502.10392 • 10 citations
#4618

ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation

Mengyang Wu, Yuzhi Zhao, Jialun Cao et al.

AAAI 2025 • paper • arXiv:2412.18216 • 10 citations
#4619

SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input

Zhen Lv, Yangqi Long, Congzhentao Huang et al.

CVPR 2025 • arXiv:2411.11934 • 10 citations
#4620

Gaussian Mixture Flow Matching Models

Hansheng Chen, Kai Zhang, Hao Tan et al.

ICML 2025 • arXiv:2504.05304 • 10 citations
#4621

GraphAvatar: Compact Head Avatars with GNN-Generated 3D Gaussians

Xiaobao Wei, Peng Chen, Ming Lu et al.

AAAI 2025 • paper • arXiv:2412.13983 • 10 citations
#4622

CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering

Tianyu Huai, Jie Zhou, Xingjiao Wu et al.

CVPR 2025 • highlight • arXiv:2503.00413 • 10 citations
#4623

Rashomon Sets for Prototypical-Part Networks: Editing Interpretable Models in Real-Time

Jon Donnelly, Zhicheng Guo, Alina Jade Barnett et al.

CVPR 2025 • arXiv:2503.01087 • 10 citations
#4624

SmartPretrain: Model-Agnostic and Dataset-Agnostic Representation Learning for Motion Prediction

Yang Zhou, Hao Shao, Letian Wang et al.

ICLR 2025 • oral • arXiv:2410.08669 • 10 citations
#4625

Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Zhiwei Jia, Yuesong Nan, Huixi Zhao et al.

CVPR 2025 • arXiv:2411.15247 • 10 citations
#4626

MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation

Jiaxin Huang, Runnan Chen, Ziwen Li et al.

NEURIPS 2025 • arXiv:2503.18135 • 10 citations
#4627

Deep Linear Probe Generators for Weight Space Learning

Jonathan Kahana, Eliahu Horwitz, Imri Shuval et al.

ICLR 2025 • arXiv:2410.10811 • 10 citations
#4628

Safety Reasoning with Guidelines

Haoyu Wang, Zeyu Qin, Li Shen et al.

ICML 2025 • arXiv:2502.04040 • 10 citations
#4629

Towards Satellite Image Road Graph Extraction: A Global-Scale Dataset and A Novel Method

Pan Yin, Kaiyu Li, Xiangyong Cao et al.

CVPR 2025 • arXiv:2411.16733 • 10 citations
#4630

Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM

Yizhou Huang, Yihua Cheng, Kezhi Wang

CVPR 2025 • arXiv:2503.10898 • 10 citations
#4631

Scaling Stick-Breaking Attention: An Efficient Implementation and In-depth Study

Shawn Tan, Songlin Yang, Aaron Courville et al.

ICLR 2025 • arXiv:2410.17980 • 10 citations
#4632

Self-Improving Language Models for Evolutionary Program Synthesis: A Case Study on ARC-AGI

Julien Pourcel, Cédric Colas, Pierre-Yves Oudeyer

ICML 2025 • arXiv:2507.14172 • 10 citations
#4633

PENCIL: Long Thoughts with Short Memory

Chenxiao Yang, Nati Srebro, David McAllester et al.

ICML 2025 • arXiv:2503.14337 • 10 citations
#4634

AnimateAnyMesh: A Feed-Forward 4D Foundation Model for Text-Driven Universal Mesh Animation

Zijie Wu, Chaohui Yu, Fan Wang et al.

ICCV 2025 • arXiv:2506.09982 • 10 citations
#4635

Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation

Ting Liu, Siyuan Li

CVPR 2025 • arXiv:2504.00356 • 10 citations
#4636

GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement

Peiye Zhuang, Songfang Han, Chaoyang Wang et al.

ICLR 2025 • arXiv:2406.05649 • 10 citations
#4637

The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?

Denis Sutter, Julian Minder, Thomas Hofmann et al.

NEURIPS 2025 • spotlight • arXiv:2507.08802 • 10 citations
#4638

Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints

Mihaela Stoian, Eleonora Giunchiglia

ICLR 2025 • arXiv:2502.18237 • 10 citations
#4639

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Zixuan Chen, Yujin Wang, Xin Cai et al.

CVPR 2025 • highlight • arXiv:2501.11515 • 10 citations
#4640

Data-Driven Performance Guarantees for Classical and Learned Optimizers

Rajiv Sambharya, Bartolomeo Stellato

NEURIPS 2025 • arXiv:2404.13831 • 10 citations
#4641

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

Yujie Liang, Xiaobin Hu, Boyuan Jiang et al.

CVPR 2025 • arXiv:2408.12340 • 10 citations
#4642

CCIN: Compositional Conflict Identification and Neutralization for Composed Image Retrieval

Likai Tian, Jian Zhao, Zechao Hu et al.

CVPR 2025 • highlight • 10 citations
#4643

ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Liyan Tang, Grace Kim, Xinyu Zhao et al.

NEURIPS 2025 • arXiv:2505.13444 • 10 citations
#4644

Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think

Zhenyi Lu, Xiaoye Qu, Zhenyi Lu et al.

CVPR 2025 • highlight • arXiv:2503.00948 • 10 citations
#4645

Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

Minghan Chen, Guikun Chen, Wenguan Wang et al.

ICLR 2025 • arXiv:2409.10262 • 10 citations
#4646

SEFE: Superficial and Essential Forgetting Eliminator for Multimodal Continual Instruction Tuning

Jinpeng Chen, Runmin Cong, Yuzhi Zhao et al.

ICML 2025 • arXiv:2505.02486 • 10 citations
#4647

M2OST: Many-to-one Regression for Predicting Spatial Transcriptomics from Digital Pathology Images

Hongyi Wang, Xiuju Du, Jing Liu et al.

AAAI 2025 • paper • arXiv:2409.15092 • 10 citations
#4648

ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks

Feiyang Wang, Xingquan Zuo, Hai Huang et al.

AAAI 2025 • paper • 10 citations
#4649

TACO: Taming Diffusion for in-the-wild Video Amodal Completion

Ruijie Lu, Yixin Chen, Yu Liu et al.

ICCV 2025 • arXiv:2503.12049 • 10 citations
#4650

Large Convolutional Model Tuning via Filter Subspace

Wei Chen, Zichen Miao, Qiang Qiu

ICLR 2025 • arXiv:2403.00269 • 10 citations
#4651

RHanDS: Refining Malformed Hands for Generated Images with Decoupled Structure and Style Guidance

Chengrui Wang, Pengfei Liu, Min Zhou et al.

AAAI 2025 • paper • arXiv:2404.13984 • 10 citations
#4652

Unisolver: PDE-Conditional Transformers Towards Universal Neural PDE Solvers

Hang Zhou, Yuezhou Ma, Haixu Wu et al.

ICML 2025 • arXiv:2405.17527 • 10 citations
#4653

A Little Goes a Long Way: Efficient Long Context Training and Inference with Partial Contexts

Suyu Ge, Xihui Lin, Yunan Zhang et al.

ICLR 2025 • arXiv:2410.01485 • 10 citations
#4654

Constrain Alignment with Sparse Autoencoders

Qingyu Yin, Chak Tou Leong, Hongbo Zhang et al.

ICML 2025 • arXiv:2411.07618 • 10 citations
#4655

FPSAttention: Training-Aware FP8 and Sparsity Co-Design for Fast Video Diffusion

Akide Liu, Zeyu Zhang, Zhexin Li et al.

NEURIPS 2025 • spotlight • arXiv:2506.04648 • 10 citations
#4656

Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation

Wenhui Tan, Boyuan Li, Chuhao Jin et al.

ICLR 2025 • 10 citations
#4657

Tuning the Frequencies: Robust Training for Sinusoidal Neural Networks

Tiago Novello, Diana Aldana Moreno, André Araujo et al.

CVPR 2025 • highlight • arXiv:2407.21121 • 10 citations
#4658

BadToken: Token-level Backdoor Attacks to Multi-modal Large Language Models

Zenghui Yuan, Jiawen Shi, Pan Zhou et al.

CVPR 2025 • arXiv:2503.16023 • 10 citations
#4659

SoMA: Singular Value Decomposed Minor Components Adaptation for Domain Generalizable Representation Learning

Seokju Yun, Seunghye Chae, Dongheon Lee et al.

CVPR 2025 • highlight • arXiv:2412.04077 • 10 citations
#4660

STAA-SNN: Spatial-Temporal Attention Aggregator for Spiking Neural Networks

Tianqing Zhang, Kairong Yu, Xian Zhong et al.

CVPR 2025 • arXiv:2503.02689 • 10 citations
#4661

Distilling Spectral Graph for Object-Context Aware Open-Vocabulary Semantic Segmentation

Chanyoung Kim, Dayun Ju, Woojung Han et al.

CVPR 2025 • arXiv:2411.17150 • 10 citations
#4662

PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution

Zhu Li Bo, Jianze Li, Haotong Qin et al.

CVPR 2025 • arXiv:2411.17106 • 10 citations
#4663

Deformable Radial Kernel Splatting

Yihua Huang, Mingxian Lin, Yangtian Sun et al.

CVPR 2025 • arXiv:2412.11752 • 10 citations
#4664

Efficient Model Editing with Task-Localized Sparse Fine-tuning

Leonardo Iurada, Marco Ciccone, Tatiana Tommasi

ICLR 2025 • arXiv:2504.02620 • 10 citations
#4665

Lessons and Insights from a Unifying Study of Parameter-Efficient Fine-Tuning (PEFT) in Visual Recognition

Zheda Mai, Ping Zhang, Cheng-Hao Tu et al.

CVPR 2025 • highlight • arXiv:2409.16434 • 10 citations
#4666

IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking

Shubham Dipak Ugare, Rohan Gumaste, Tarun Suresh et al.

ICLR 2025 • arXiv:2410.07295 • 10 citations
#4667

Quartet: Native FP4 Training Can Be Optimal for Large Language Models

Roberto Castro, Andrei Panferov, Rush Tabesh et al.

NEURIPS 2025 • arXiv:2505.14669 • 10 citations
#4668

Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data

Binghui Li, Yuanzhi Li

ICLR 2025 • arXiv:2410.08503 • 10 citations
#4669

GuidedQuant: Large Language Model Quantization via Exploiting End Loss Guidance

Jinuk Kim, Marwa El Halabi, Wonpyo Park et al.

ICML 2025 • arXiv:2505.07004 • 10 citations
#4670

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Wei Shen, Ruida Zhou, Jing Yang et al.

ICML 2025 • arXiv:2410.11778 • 10 citations
#4671

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Yuxuan Wang, Yueqian Wang, Bo Chen et al.

CVPR 2025 • arXiv:2503.22952 • 10 citations
#4672

Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference

Hao Yin, Guangzong Si, Zilei Wang

CVPR 2025 • arXiv:2503.13108 • 10 citations
#4673

R.I.P.: Better Models by Survival of the Fittest Prompts

Ping Yu, Weizhe Yuan, Olga Golovneva et al.

ICML 2025 • arXiv:2501.18578 • 10 citations
#4674

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

Qining Zhang, Lei Ying

ICLR 2025 • arXiv:2409.17401 • 10 citations
#4675

Atlas Gaussians Diffusion for 3D Generation

Haitao Yang, Yuan Dong, Hanwen Jiang et al.

ICLR 2025 • arXiv:2408.13055 • 10 citations
#4676

Intrinsic User-Centric Interpretability through Global Mixture of Experts

Vinitra Swamy, Syrielle Montariol, Julian Blackwell et al.

ICLR 2025 • arXiv:2402.02933 • 10 citations
#4677

MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge

Yuntao Du, Kailin Jiang, Zhi Gao et al.

ICLR 2025 • arXiv:2502.19870 • 10 citations
#4678

Node-Time Conditional Prompt Learning in Dynamic Graphs

Xingtong Yu, Zhenghao Liu, Xinming Zhang et al.

ICLR 2025 • oral • arXiv:2405.13937 • 10 citations
#4679

DreamRelation: Bridging Customization and Relation Generation

Qingyu Shi, Lu Qi, Jianzong Wu et al.

CVPR 2025 • arXiv:2410.23280 • 10 citations
#4680

Language Models Trained to do Arithmetic Predict Human Risky and Intertemporal Choice

Jian-Qiao Zhu, Haijiang Yan, Thomas L. Griffiths

ICLR 2025 • oral • arXiv:2405.19313 • 10 citations
#4681

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Hanwen Jiang, Zexiang Xu, Desai Xie et al.

CVPR 2025 • arXiv:2412.14166 • 10 citations
#4682

GLASS: Guided Latent Slot Diffusion for Object-Centric Learning

Krishnakant Singh, Simone Schaub-Meyer, Stefan Roth

CVPR 2025 • arXiv:2407.17929 • 10 citations
#4683

Language Guided Concept Bottleneck Models for Interpretable Continual Learning

Lu Yu, HaoYu Han, Zhe Tao et al.

CVPR 2025 • arXiv:2503.23283 • 10 citations
#4684

$\infty$-Video: A Training-Free Approach to Long Video Understanding via Continuous-Time Memory Consolidation

Saúl Santos, António Farinhas, Daniel McNamee et al.

ICML 2025 • arXiv:2501.19098 • 10 citations
#4685

O-TPT: Orthogonality Constraints for Calibrating Test-time Prompt Tuning in Vision-Language Models

Ashshak Sharifdeen, Muhammad Akhtar Munir, Sanoojan Baliah et al.

CVPR 2025 • highlight • arXiv:2503.12096 • 10 citations
#4686

Momentum-SAM: Sharpness Aware Minimization without Computational Overhead

Marlon Becker, Frederick Altrock, Benjamin Risse

NEURIPS 2025 • arXiv:2401.12033 • 10 citations
#4687

PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation

Qihan Huang, Weilong Dai, Jinlong Liu et al.

CVPR 2025 • arXiv:2412.03177 • 10 citations
#4688

Neural Video Compression with Context Modulation

Chuanbo Tang, Zhuoyuan Li, Yifan Bian et al.

CVPR 2025 • arXiv:2505.14541 • 10 citations
#4689

SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars

Jaeseong Lee, Taewoong Kang, Marcel Buehler et al.

ICLR 2025 • arXiv:2410.11682 • 10 citations
#4690

Antidistillation Sampling

Yash Savani, Asher Trockman, Zhili Feng et al.

NEURIPS 2025 • arXiv:2504.13146 • 10 citations
#4691

A Closer Look at Multimodal Representation Collapse

Abhra Chaudhuri, Anjan Dutta, Tu Bui et al.

ICML 2025 • spotlight • arXiv:2505.22483 • 10 citations
#4692

ConfTuner: Training Large Language Models to Express Their Confidence Verbally

Yibo Li, Miao Xiong, Jiaying Wu et al.

NEURIPS 2025 • arXiv:2508.18847 • 10 citations
#4693

Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation

Dingcheng Zhen, Shunshun Yin, Shiyang Qin et al.

CVPR 2025 • arXiv:2503.18429 • 10 citations
#4694

Consistency Checks for Language Model Forecasters

Daniel Paleka, Abhimanyu Pallavi Sudhir, Alejandro Alvarez et al.

ICLR 2025 • arXiv:2412.18544 • 10 citations
#4695

MTU-Bench: A Multi-granularity Tool-Use Benchmark for Large Language Models

Pei Wang, Yanan Wu, Zekun Wang et al.

ICLR 2025 • arXiv:2410.11710 • 10 citations
#4696

Interpreting the Repeated Token Phenomenon in Large Language Models

Itay Yona, Ilia Shumailov, Jamie Hayes et al.

ICML 2025 • arXiv:2503.08908 • 10 citations
#4697

MONA: Myopic Optimization with Non-myopic Approval Can Mitigate Multi-step Reward Hacking

Sebastian Farquhar, Vikrant Varma, David Lindner et al.

ICML 2025 • arXiv:2501.13011 • 10 citations
#4698

MagicColor: Multi-instance Sketch Colorization

Yinhan Zhang, Yue Ma, Bingyuan Wang et al.

ICCV 2025 • 10 citations
#4699

Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search

Shuyu Yang, Yaxiong Wang, Li Zhu et al.

ICCV 2025 • highlight • arXiv:2411.17776 • 10 citations
#4700

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

Minghe Gao, Xuqi Liu, Zhongqi Yue et al.

ICCV 2025 • arXiv:2504.06606 • 10 citations
#4701

MIP against Agent: Malicious Image Patches Hijacking Multimodal OS Agents

Lukas Aichberger, Alasdair Paren, Guohao Li et al.

NEURIPS 2025 • arXiv:2503.10809 • 10 citations
#4702

Neural Exploratory Landscape Analysis for Meta-Black-Box-Optimization

Zeyuan Ma, Jiacheng Chen, Hongshu Guo et al.

ICLR 2025 • arXiv:2408.10672 • 10 citations
#4703

SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

Tony Alex, Sara Atito, Armin Mustafa et al.

ICLR 2025 • arXiv:2506.12222 • 10 citations
#4704

Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Sachin Goyal, Christina Baek, Zico Kolter et al.

ICLR 2025 • 10 citations
#4705

Vector-ICL: In-context Learning with Continuous Vector Representations

Yufan Zhuang, Chandan Singh, Liyuan Liu et al.

ICLR 2025 • arXiv:2410.05629 • 10 citations
#4706

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

Fu Feng, Yucheng Xie, Jing Wang et al.

CVPR 2025 • arXiv:2406.17503 • 10 citations
#4707

BodyGen: Advancing Towards Efficient Embodiment Co-Design

Haofei Lu, Zhe Wu, Junliang Xing et al.

ICLR 2025 • oral • arXiv:2503.00533 • 10 citations
#4708

Synthesizing Software Engineering Data in a Test-Driven Manner

Lei Zhang, Jiaxi Yang, Min Yang et al.

ICML 2025 • arXiv:2506.09003 • 10 citations
#4709

RocketKV: Accelerating Long-Context LLM Inference via Two-Stage KV Cache Compression

Payman Behnam, Yaosheng Fu, Ritchie Zhao et al.

ICML 2025 • arXiv:2502.14051 • 10 citations
#4710

How Much Can Transfer? BRIDGE: Bounded Multi-Domain Graph Foundation Model with Generalization Guarantees

Haonan Yuan, Qingyun Sun, Junhua Shi et al.

ICML 2025 • 10 citations
#4711

The 3D-PC: a benchmark for visual perspective taking in humans and machines

Drew Linsley, Peisen Zhou, Alekh Ashok et al.

ICLR 2025 • arXiv:2406.04138 • 10 citations
#4712

FilterTS: Comprehensive Frequency Filtering for Multivariate Time Series Forecasting

Yulong Wang, Yushuo Liu, Xiaoyi Duan et al.

AAAI 2025 • paper • arXiv:2505.04158 • 10 citations
#4713

When Do LLMs Help With Node Classification? A Comprehensive Analysis

Xixi Wu, Yifei Shen, Fangzhou Ge et al.

ICML 2025 • arXiv:2502.00829 • 10 citations
#4714

RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories

Huiyang Shao, Xin Xia, Yuhong Yang et al.

CVPR 2025 • arXiv:2503.07699 • 10 citations
#4715

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Leigang Qu, Haochuan Li, Wenjie Wang et al.

CVPR 2025 • arXiv:2412.05818 • 10 citations
#4716

Advancing Textual Prompt Learning with Anchored Attributes

Zheng Li, Yibing Song, Ming-Ming Cheng et al.

ICCV 2025 • arXiv:2412.09442 • 10 citations
#4717

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval

Yating Liu, Zimo Liu, Xiangyuan Lan et al.

AAAI 2025 • paper • arXiv:2503.04144 • 10 citations
#4718

PhyloLM: Inferring the Phylogeny of Large Language Models and Predicting their Performances in Benchmarks

Nicolas Yax, Pierre-Yves Oudeyer, Stefano Palminteri

ICLR 2025 • arXiv:2404.04671 • 10 citations
#4719

Unleashing Hour-Scale Video Training for Long Video-Language Understanding

Jingyang Lin, Jialian Wu, Ximeng Sun et al.

NEURIPS 2025 • oral • arXiv:2506.05332 • 10 citations
#4720

Taylor Series-Inspired Local Structure Fitting Network for Few-shot Point Cloud Semantic Segmentation

Changshuo Wang, Shuting He, Xiang Fang et al.

AAAI 2025 • paper • arXiv:2504.02454 • 10 citations
#4721

Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution

Karam Park, Jae Woong Soh, Nam Ik Cho

AAAI 2025 • paper • arXiv:2501.15774 • 10 citations
#4722

Fundamental limits of learning in sequence multi-index models and deep attention networks: high-dimensional asymptotics and sharp thresholds

Emanuele Troiani, Hugo Cui, Yatin Dandi et al.

ICML 2025 • arXiv:2502.00901 • 10 citations
#4723

Selective Prompt Anchoring for Code Generation

Yuan Tian, Tianyi Zhang

ICML 2025 • arXiv:2408.09121 • 10 citations
#4724

Sensor-Invariant Tactile Representation

Harsh Gupta, Yuchen Mo, Shengmiao Jin et al.

ICLR 2025 • arXiv:2502.19638 • 10 citations
#4725

Boosting Latent Diffusion with Perceptual Objectives

Tariq Berrada, Pietro Astolfi, Melissa Hall et al.

ICLR 2025 • arXiv:2411.04873 • 10 citations
#4726

InsTaG: Learning Personalized 3D Talking Head from Few-Second Video

Jiahe Li, Jiawei Zhang, Xiao Bai et al.

CVPR 2025 • arXiv:2502.20387 • 10 citations
#4727

Make Me Happier: Evoking Emotions Through Image Diffusion Models

Qing Lin, Jingfeng Zhang, YEW-SOON ONG et al.

ICCV 2025 • arXiv:2403.08255 • 10 citations
#4728

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Yikang Zhou, Tao Zhang, Shilin Xu et al.

ICCV 2025 • arXiv:2501.04670 • 10 citations
#4729

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

Fengxiang Wang, Yulin Wang, Mingshuo Chen et al.

NEURIPS 2025 • arXiv:2503.10392 • 10 citations
#4730

InfoChartQA: A Benchmark for Multimodal Question Answering on Infographic Charts

Tianchi Xie, Minzhi Lin, Mengchen Liu et al.

NEURIPS 2025 • arXiv:2505.19028 • 10 citations
#4731

Identifying Macro Conditional Independencies and Macro Total Effects in Summary Causal Graphs with Latent Confounding

Simon Ferreira, Charles K. Assaad

AAAI 2025 • paper • arXiv:2407.07934 • 10 citations
#4732

Not All Prompts Are Made Equal: Prompt-based Pruning of Text-to-Image Diffusion Models

Alireza Ganjdanesh, Reza Shirkavand, Shangqian Gao et al.

ICLR 2025 • arXiv:2406.12042 • 10 citations
#4733

VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis

Yumeng Li, William H Beluch, Margret Keuper et al.

ICLR 2025 • oral • arXiv:2403.13501 • 10 citations
#4734

Stochastic Control for Fine-tuning Diffusion Models: Optimality, Regularity, and Convergence

Yinbin Han, Meisam Razaviyayn, Renyuan Xu

ICML 2025 • arXiv:2412.18164 • 10 citations
#4735

ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention

Bencheng Liao, Xinggang Wang, Lianghui Zhu et al.

AAAI 2025 • paper • arXiv:2405.18425 • 10 citations
#4736

Towards Hierarchical Rectified Flow

Yichi Zhang, Yici Yan, Alex Schwing et al.

ICLR 2025 • arXiv:2502.17436 • 10 citations
#4737

Beyond Verifiable Rewards: Scaling Reinforcement Learning in Language Models to Unverifiable Data

Yunhao Tang, Sid Wang, Lovish Madaan et al.

NEURIPS 2025 • arXiv:2503.19618 • 10 citations
#4738

SoundCTM: Unifying Score-based and Consistency Models for Full-band Text-to-Sound Generation

Koichi Saito, Dongjun Kim, Takashi Shibuya et al.

ICLR 2025 • arXiv:2405.18503 • 10 citations
#4739

WildSAT: Learning Satellite Image Representations from Wildlife Observations

Rangel Daroya, Elijah Cole, Oisin Mac Aodha et al.

ICCV 2025 • arXiv:2412.14428 • 10 citations
#4740

More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness

Aaron J. Li, Satyapriya Krishna, Hima Lakkaraju

ICLR 2025 • arXiv:2404.18870 • 10 citations
#4741

Solving Video Inverse Problems Using Image Diffusion Models

Taesung Kwon, Jong Chul YE

ICLR 2025 • oral • arXiv:2409.02574 • 10 citations
#4742

Multi-Session Budget Optimization for Forward Auction-based Federated Learning

Xiaoli Tang, Han Yu, Zengxiang Li et al.

ICML 2025 • arXiv:2311.12548 • 10 citations
#4743

WSI-LLaVA: A Multimodal Large Language Model for Whole Slide Image

Yuci Liang, Xinheng Lyu, Meidan Ding et al.

ICCV 2025 • arXiv:2412.02141 • 10 citations
#4744

DIFFER: Disentangling Identity Features via Semantic Cues for Clothes-Changing Person Re-ID

Xin Liang, Yogesh S. Rawat

CVPR 2025 • arXiv:2503.22912 • 10 citations
#4745

Tool Unlearning for Tool-Augmented LLMs

Jiali Cheng, Hadi Amiri

ICML 2025 • arXiv:2502.01083 • 10 citations
#4746

Di[M]O: Distilling Masked Diffusion Models into One-step Generator

Yuanzhi Zhu, Xi WANG, Stéphane Lathuilière et al.

ICCV 2025 • 10 citations
#4747

Robotic Visual Instruction

Yanbang Li, ZiYang Gong, Haoyang Li et al.

CVPR 2025 • arXiv:2505.00693 • 10 citations
#4748

CoA: Towards Real Image Dehazing via Compression-and-Adaptation

Long Ma, Yuxin Feng, Yan Zhang et al.

CVPR 2025 • arXiv:2504.05590 • 10 citations
#4749

Direct Alignment with Heterogeneous Preferences

Ali Shirali, Arash Nasr-Esfahany, Abdullah Alomar et al.

NEURIPS 2025 • arXiv:2502.16320 • 10 citations
#4750

HumanRig: Learning Automatic Rigging for Humanoid Character in a Large Scale Dataset

Zedong Chu, Feng Xiong, Meiduo Liu et al.

CVPR 2025 • highlight • arXiv:2412.02317 • 10 citations
#4751

On Measuring Long-Range Interactions in Graph Neural Networks

Jacob Bamberger, Benjamin Gutteridge, Scott le Roux et al.

ICML 2025 • arXiv:2506.05971 • 10 citations
#4752

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

Zhiyong Wang, Dongruo Zhou, John C.S. Lui et al.

ICLR 2025 • arXiv:2408.08994 • 10 citations
#4753

Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling

Hanyang Kong, Xingyi Yang, Xinchao Wang

AAAI 2025 • paper • arXiv:2502.20378 • 10 citations
#4754

Quadratic Gaussian Splatting: High Quality Surface Reconstruction with Second-order Geometric Primitives

Ziyu Zhang, Binbin Huang, Hanqing Jiang et al.

ICCV 2025 • arXiv:2411.16392 • 10 citations
#4755

Objective drives the consistency of representational similarity across datasets

Laure Ciernik, Lorenz Linhardt, Marco Morik et al.

ICML 2025 • arXiv:2411.05561 • 10 citations
#4756

Improving Parallel Program Performance with LLM Optimizers via Agent-System Interfaces

Anjiang Wei, Allen Nie, Thiago Teixeira et al.

ICML 2025 • arXiv:2410.15625 • 10 citations
#4757

Improved Off-policy Reinforcement Learning in Biological Sequence Design

Hyeonah Kim, Minsu Kim, Taeyoung Yun et al.

ICML 2025 • arXiv:2410.04461 • 10 citations
#4758

Joint Vision-Language Social Bias Removal for CLIP

Haoyu Zhang, Yangyang Guo, Mohan Kankanhalli

CVPR 2025 • arXiv:2411.12785 • 10 citations
#4759

Decoupled Diffusion Sparks Adaptive Scene Generation

Yunsong Zhou, Naisheng Ye, William Ljungbergh et al.

ICCV 2025 • arXiv:2504.10485 • 10 citations
#4760

ProtoOcc: Accurate, Efficient 3D Occupancy Prediction Using Dual Branch Encoder-Prototype Query Decoder

Jungho Kim, Changwon Kang, Dongyoung Lee et al.

AAAI 2025 • paper • arXiv:2412.08774 • 10 citations
#4761

Disentangled Motion Modeling for Video Frame Interpolation

Jaihyun Lew, Jooyoung Choi, Chaehun Shin et al.

AAAI 2025 • paper • arXiv:2406.17256 • 10 citations
#4762

Towards Understanding Fine-Tuning Mechanisms of LLMs via Circuit Analysis

Xu Wang, Yan Hu, Wenyu Du et al.

ICML 2025 • arXiv:2502.11812 • 10 citations
#4763

SnapMoGen: Human Motion Generation from Expressive Texts

Chuan Guo, Inwoo Hwang, Jian Wang et al.

NEURIPS 2025 • oral • arXiv:2507.09122 • 10 citations
#4764

Advancing Expert Specialization for Better MoE

Hongcan Guo, Haolang Lu, Guoshun Nan et al.

NEURIPS 2025 • oral • arXiv:2505.22323 • 10 citations
#4765

Sail into the Headwind: Alignment via Robust Rewards and Dynamic Labels against Reward Hacking

Paria Rashidinejad, Yuandong Tian

ICLR 2025 • arXiv:2412.09544 • 10 citations
#4766

FlexiTex: Enhancing Texture Generation via Visual Guidance

Dadong Jiang, Xianghui Yang, Zibo Zhao et al.

AAAI 2025 • paper • arXiv:2409.12431 • 10 citations
#4767

Optimal Protocols for Continual Learning via Statistical Physics and Control Theory

Francesco Mori, Stefano Sarao Mannelli, Francesca Mignacco

ICLR 2025 • arXiv:2409.18061 • 10 citations
#4768

Towards Transformer-Based Aligned Generation with Self-Coherence Guidance

Shulei Wang, Wang Lin, Hai Huang et al.

CVPR 2025 • arXiv:2503.17675 • 10 citations
#4769

Rethinking Query-based Transformer for Continual Image Segmentation

Yuchen Zhu, Cheng Shi, Dingyou Wang et al.

CVPR 2025 • arXiv:2507.07831 • 10 citations
#4770

Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution

Du Chen, Liyi Chen, Zhengqiang ZHANG et al.

ICCV 2025 • arXiv:2501.06838 • 10 citations
#4771

Improving Transformer World Models for Data-Efficient RL

Antoine Dedieu, Joseph Ortiz, Xinghua Lou et al.

ICML 2025 • arXiv:2502.01591 • 10 citations
#4772

LLMs Encode Harmfulness and Refusal Separately

Jiachen Zhao, Jing Huang, Zhengxuan Wu et al.

NEURIPS 2025 • arXiv:2507.11878 • 10 citations
#4773

Diffusion On Syntax Trees For Program Synthesis

Shreyas Kapur, Erik Jenner, Stuart Russell

ICLR 2025 • arXiv:2405.20519 • 10 citations
#4774

FedMIA: An Effective Membership Inference Attack Exploiting "All for One" Principle in Federated Learning

Gongxi Zhu, Donghao Li, Hanlin Gu et al.

CVPR 2025 • 10 citations
#4775

Seeing Your Speech Style: A Novel Zero-Shot Identity-Disentanglement Face-based Voice Conversion

Yan Rong, Li Liu

AAAI 2025 • paper • arXiv:2409.00700 • 10 citations
#4776

MA-LoT: Model-Collaboration Lean-based Long Chain-of-Thought Reasoning enhances Formal Theorem Proving

Ruida Wang, Rui Pan, Yuxin Li et al.

ICML 2025 • arXiv:2503.03205 • 10 citations
#4777

MagCache: Fast Video Generation with Magnitude-Aware Cache

Zehong Ma, Longhui Wei, Feng Wang et al.

NEURIPS 2025 • arXiv:2506.09045 • 10 citations
#4778

ZoRI: Towards Discriminative Zero-Shot Remote Sensing Instance Segmentation

Shiqi Huang, Shuting He, Bihan Wen

AAAI 2025 • paper • arXiv:2412.12798 • 10 citations
#4779

Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling

Jingyun Xue, WANG HongFa, Qi Tian et al.

ICLR 2025 • arXiv:2406.03035 • 10 citations
#4780

AvatarArtist: Open-Domain 4D Avatarization

Hongyu Liu, Xuan Wang, Ziyu Wan et al.

CVPR 2025 • arXiv:2503.19906 • 10 citations
#4781

Confidence Estimation for Error Detection in Text-to-SQL Systems

Oleg Somov, Elena Tutubalina

AAAI 2025 • paper • arXiv:2501.09527 • 10 citations
#4782

CADDreamer: CAD Object Generation from Single-view Images

Yuan Li, Cheng Lin, Yuan Liu et al.

CVPR 2025 • highlight • arXiv:2502.20732 • 10 citations
#4783

Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models

Dilxat Muhtar, Enzhuo Zhang, Zhenshi Li et al.

NEURIPS 2025 • arXiv:2503.00743 • 10 citations
#4784

Rethinking Artistic Copyright Infringements In the Era Of Text-to-Image Generative Models

Mazda Moayeri, Sriram Balasubramanian, Samyadeep Basu et al.

ICLR 2025 • arXiv:2404.08030 • 10 citations
#4785

Scaffold-BPE: Enhancing Byte Pair Encoding for Large Language Models with Simple and Effective Scaffold Token Removal

Haoran Lian, Yizhe Xiong, Jianwei Niu et al.

AAAI 2025 • paper • arXiv:2404.17808 • 10 citations
#4786

BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance

Xin Ye, Burhan Yaman, Sheng Cheng et al.

CVPR 2025 • highlight • arXiv:2502.19694 • 10 citations
#4787

STOP: Integrated Spatial-Temporal Dynamic Prompting for Video Understanding

Zichen Liu, Kunlun Xu, Bing Su et al.

CVPR 2025 • arXiv:2503.15973 • 10 citations
#4788

GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers

Shijie Ma, Yuying Ge, Teng Wang et al.

ICCV 2025 • arXiv:2503.19480 • 10 citations
#4789

From Experts to a Generalist: Toward General Whole-Body Control for Humanoid Robots

Yuxuan Wang, Ming Yang, Gang Ding et al.

NEURIPS 2025 • oral • arXiv:2506.12779 • 10 citations
#4790

Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems

Jie Chen

ICLR 2025 • arXiv:2406.00809 • 10 citations
#4791

V2C-CBM: Building Concept Bottlenecks with Vision-to-Concept Tokenizer

Hangzhou He, Lei Zhu, Xinliang Zhang et al.

AAAI 2025 • paper • arXiv:2501.04975 • 10 citations
#4792

QuantSpec: Self-Speculative Decoding with Hierarchical Quantized KV Cache

Rishabh Tiwari, Haocheng Xi, Aditya Tomar et al.

ICML 2025 • arXiv:2502.10424 • 10 citations
#4793

MINERVA: Evaluating Complex Video Reasoning

Arsha Nagrani, Sachit Menon, Ahmet Iscen et al.

ICCV 2025 • arXiv:2505.00681 • 10 citations
#4794

Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training

Mengru Wang, Xingyu Chen, Yue Wang et al.

NEURIPS 2025 • arXiv:2505.14681 • 10 citations
#4795

First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training

Lai Wei, Yuting Li, Chen Wang et al.

NEURIPS 2025 • arXiv:2505.22453 • 10 citations
#4796

LLaVA Needs More Knowledge: Retrieval Augmented Natural Language Generation with Knowledge Graph for Explaining Thoracic Pathologies

Ameer Hamza, Abdullah, Yong Hyun Ahn et al.

AAAI 2025 • paper • arXiv:2410.04749 • 10 citations
#4797

Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier

Lu Yi, Zhewei Wei

ICLR 2025 • arXiv:2408.09212 • 10 citations
#4798

Towards a Comprehensive, Efficient and Promptable Anatomic Structure Segmentation Model Using 3D Whole-Body CT Scans

Heng Guo, Jianfeng Zhang, Jiaxing Huang et al.

AAAI 2025 • paper • arXiv:2403.15063 • 10 citations
#4799

CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation

Han He, Qianchu Liu, Lei Xu et al.

AAAI 2025 • paper • arXiv:2410.02748 • 10 citations
#4800

Surgical Workflow Recognition and Blocking Effectiveness Detection in Laparoscopic Liver Resection with Pringle Maneuver

Diandian Guo, Weixin Si, Zhixi Li et al.

AAAI 2025 • paper • arXiv:2408.10538 • 10 citations