Most Cited 2025 "multimodal reasoning" Papers

22,274 papers found • Page 28 of 112

#5401

AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios

Ziming Huang, Xurui Li, Haotian Liu et al.

CVPR 2025 arXiv:2410.14379
8 citations
#5402

CONTRA: Conformal Prediction Region via Normalizing Flow Transformation

Zhenhan FANG, Aixin Tan, Jian Huang

ICLR 2025
8 citations
#5403

Micro-macro Wavelet-based Gaussian Splatting for 3D Reconstruction from Unconstrained Images

Yihui Li, Chengxin Lv, Hongyu Yang et al.

AAAI 2025 arXiv:2501.14231
8 citations
#5404

DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models

Hyeonwoo Kim, Sangwon Baik, Hanbyul Joo

ICCV 2025 arXiv:2501.08333
8 citations
#5405

Multi-Marginal Stochastic Flow Matching for High-Dimensional Snapshot Data at Irregular Time Points

Justin Lee, Behnaz Moradi-Jamei, Heman Shakeri

ICML 2025 arXiv:2508.04351
8 citations
#5406

Principled Algorithms for Optimizing Generalized Metrics in Binary Classification

Anqi Mao, Mehryar Mohri, Yutao Zhong

ICML 2025 arXiv:2512.23133
8 citations
#5407

MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities

Kunxi Li, Tianyu Zhan, Kairui Fu et al.

AAAI 2025 arXiv:2404.13322
8 citations
#5408

TopoDiffusionNet: A Topology-aware Diffusion Model

Saumya Gupta, Dimitris Samaras, Chao Chen

ICLR 2025 arXiv:2410.16646
8 citations
#5409

Glauber Generative Model: Discrete Diffusion Models via Binary Classification

Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam

ICLR 2025 arXiv:2405.17035
8 citations
#5410

H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting

Bing He, Yunuo Chen, Guo Lu et al.

NEURIPS 2025 arXiv:2408.13036
8 citations
#5411

EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing

Kaizhi Zheng, Xiaotong Chen, Xuehai He et al.

ICLR 2025 arXiv:2410.12836
8 citations
#5412

PRE-Mamba: A 4D State Space Model for Ultra-High-Frequent Event Camera Deraining

Ciyu Ruan, Ruishan Guo, Zihang GONG et al.

ICCV 2025 arXiv:2505.05307
8 citations
#5413

Position: The Future of Bayesian Prediction Is Prior-Fitted

Samuel Gabriel Müller, Arik Reuter, Noah Hollmann et al.

ICML 2025 arXiv:2505.23947
8 citations
#5414

Stable Mean Teacher for Semi-supervised Video Action Detection

Akash Kumar, Sirshapan Mitra, Yogesh Singh Rawat

AAAI 2025 arXiv:2412.07072
8 citations
#5415

Revisiting Random Walks for Learning on Graphs

Jinwoo Kim, Olga Zaghen, Ayhan Suleymanzade et al.

ICLR 2025 arXiv:2407.01214
8 citations
#5416

Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models

Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.

ICLR 2025 arXiv:2410.01532
8 citations
#5417

Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations

Richard Bergna, Sergio Calvo Ordoñez, Felix Opolka et al.

ICLR 2025 arXiv:2408.16115
8 citations
#5418

Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation

adil kaan akan, Yucel Yemez

ICLR 2025 arXiv:2501.15878
8 citations
#5419

Rectifying Conformity Scores for Better Conditional Coverage

Vincent Plassier, Alexander Fishkov, Victor Dheur et al.

ICML 2025 arXiv:2502.16336
8 citations
#5420

ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids

Hannes Stärk, Bowen Jing, Tomas Geffner et al.

ICLR 2025 arXiv:2503.05025
8 citations
#5421

Towards Generalizable Scene Change Detection

Jae-Woo KIM, Ue-Hwan Kim

CVPR 2025 arXiv:2409.06214
8 citations
#5422

RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection

Jingtong Yue, Zhiwei Lin, Xin Lin et al.

ICLR 2025 arXiv:2502.13071
8 citations
#5423

GraphCL: Graph-based Clustering for Semi-Supervised Medical Image Segmentation

Mengzhu Wang, houcheng su, Jiao Li et al.

ICML 2025 arXiv:2411.13147
8 citations
#5424

Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics

Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.

CVPR 2025
8 citations
#5425

Federated Residual Low-Rank Adaption of Large Language Models

Yunlu Yan, Chun-Mei Feng, Wangmeng Zuo et al.

ICLR 2025
8 citations
#5426

Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks

Lukas Braun, Erin Grant, Andrew Saxe

ICML 2025 spotlight
8 citations
#5427

Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning

Joey Hong, Anca Dragan, Sergey Levine

ICLR 2025 arXiv:2411.05193
8 citations
#5428

Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence

Wenbo Huang, Jinghui Zhang, Guang Li et al.

AAAI 2025 arXiv:2412.07481
8 citations
#5429

SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications

Jinyang Li, Xiaolong Li, Ge Qu et al.

NEURIPS 2025 arXiv:2506.18951
8 citations
#5430

Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels

Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.

NEURIPS 2025 arXiv:2503.14376
8 citations
#5431

Non-equilibrium Annealed Adjoint Sampler

Jaemoo Choi, Yongxin Chen, Molei Tao et al.

NEURIPS 2025 arXiv:2506.18165
8 citations
#5432

One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution

Yujing Sun, Lingchen Sun, Shuaizheng Liu et al.

NEURIPS 2025 oral arXiv:2506.15591
8 citations
#5433

MANTRA: The Manifold Triangulations Assemblage

Rubén Ballester, Ernst Roell, Daniel Bin Schmid et al.

ICLR 2025 arXiv:2410.02392
8 citations
#5434

Quantum-PEFT: Ultra parameter-efficient fine-tuning

Toshiaki Koike-Akino, Francesco Tonin, Yongtao Wu et al.

ICLR 2025 arXiv:2503.05431
8 citations
#5435

Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models

Bingdong Li, Zixiang Di, Yongfan Lu et al.

AAAI 2025 arXiv:2405.08674
8 citations
#5436

Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Zelai Xu, Wanjun Gu, Chao Yu et al.

ICML 2025 arXiv:2502.04686
8 citations
#5437

Unified Breakdown Analysis for Byzantine Robust Gossip

Renaud Gaucher, Aymeric Dieuleveut, Hadrien Hendrikx

ICML 2025 arXiv:2410.10418
8 citations
#5438

RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training

Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.

CVPR 2025 highlight arXiv:2411.17662
8 citations
#5439

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Qingcheng Zhao, Xiang Zhang, Haiyang Xu et al.

ICCV 2025 arXiv:2507.22825
8 citations
#5440

Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Hyelin Nam, Jaemin Kim, Dohun Lee et al.

CVPR 2025 arXiv:2411.15540
8 citations
#5441

Implicit Bias of Spectral Descent and Muon on Multiclass Separable Data

Chen Fan, Mark Schmidt, Christos Thrampoulidis

NEURIPS 2025 spotlight arXiv:2502.04664
8 citations
#5442

Analyzing Finetuning Representation Shift for Multimodal LLMs Steering

Pegah KHAYATAN, Mustafa Shukor, Jayneel Parekh et al.

ICCV 2025 arXiv:2501.03012
8 citations
#5443

SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Peishan Cong, Ziyi Wang, Yuexin Ma et al.

CVPR 2025 arXiv:2503.01291
8 citations
#5444

HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts

Mengqi Liao, Wei Chen, Junfeng Shen et al.

ICLR 2025
8 citations
#5445

Benchmarking Quantum Reinforcement Learning

Nico Meyer, Christian Ufrecht, George Yammine et al.

ICML 2025 arXiv:2501.15893
8 citations
#5446

DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Jing Li, Yihang Fu, Falai Chen

CVPR 2025 arXiv:2503.13110
8 citations
#5447

SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets

Yuhang Yang, Fengqi Liu, Yixing Lu et al.

ICCV 2025
8 citations
#5448

Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation

Laurin Lux, Alexander H Berger, Alexander Weers et al.

ICLR 2025 arXiv:2411.03228
8 citations
#5449

SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning

Minjun Kim, Jongjin Kim, U Kang

ICLR 2025
8 citations
#5450

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025
8 citations
#5451

DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery

Jiadong Tang, Yu Gao, Dianyi Yang et al.

CVPR 2025 highlight arXiv:2503.16964
8 citations
#5452

ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models

Heng Yin, Yuqiang Ren, Ke Yan et al.

CVPR 2025
8 citations
#5453

SplatFormer: Point Transformer for Robust 3D Gaussian Splatting

Yutong Chen, Marko Mihajlovic, Xiyi Chen et al.

ICLR 2025 arXiv:2411.06390
8 citations
#5454

DuMo: Dual Encoder Modulation Network for Precise Concept Erasure

Feng Han, Kai Chen, Chao Gong et al.

AAAI 2025 arXiv:2501.01125
8 citations
#5455

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Xin Jin, Haisheng Su, Kai Liu et al.

CVPR 2025 arXiv:2503.12009
8 citations
#5456

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

Yunuo Chen, Junli Cao, Vidit Goel et al.

NEURIPS 2025 arXiv:2502.03639
8 citations
#5457

PokerBench: Training Large Language Models to Become Professional Poker Players

Richard Zhuang, Akshat Gupta, Richard Yang et al.

AAAI 2025 arXiv:2501.08328
8 citations
#5458

Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length

Zihan Yu, Jingtao Ding, Yong Li et al.

ICLR 2025 arXiv:2411.03753
8 citations
#5459

DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints

Andrew Zhao, Quentin Xu, Matthieu Lin et al.

AAAI 2025 arXiv:2405.19026
8 citations
#5460

Discrete Neural Flow Samplers with Locally Equivariant Transformer

Zijing Ou, Ruixiang Zhang, Yingzhen Li

NEURIPS 2025 arXiv:2505.17741
8 citations
#5461

MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models

Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.

NEURIPS 2025 arXiv:2502.16671
8 citations
#5462

Out of Length Text Recognition with Sub-String Matching

Yongkun Du, Zhineng Chen, Caiyan Jia et al.

AAAI 2025 arXiv:2407.12317
8 citations
#5463

Segment Any 3D Object with Language

Seungjun Lee, Yuyang Zhao, Gim H Lee

ICLR 2025 arXiv:2404.02157
8 citations
#5464

LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory

Jingru Jia, Zehua Yuan, Junhao Pan et al.

NEURIPS 2025 oral arXiv:2502.20432
8 citations
#5465

MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI

Huanjin Yao, Jiaxing Huang, Yawen Qiu et al.

ICCV 2025 arXiv:2506.23563
8 citations
#5466

Can DPO Learn Diverse Human Values? A Theoretical Scaling Law

Shawn Im, Sharon Li

NEURIPS 2025 arXiv:2408.03459
8 citations
#5467

Among Us: A Sandbox for Measuring and Detecting Agentic Deception

Satvik Golechha, Adrià Garriga-Alonso

NEURIPS 2025 spotlight arXiv:2504.04072
8 citations
#5468

DCBM: Data-Efficient Visual Concept Bottleneck Models

Katharina Prasse, Patrick Knab, Sascha Marton et al.

ICML 2025 arXiv:2412.11576
8 citations
#5469

Generating Multimodal Driving Scenes via Next-Scene Prediction

Yanhao Wu, Haoyang Zhang, Tianwei Lin et al.

CVPR 2025 arXiv:2503.14945
8 citations
#5470

MUNBa: Machine Unlearning via Nash Bargaining

Jing Wu, Mehrtash Harandi

ICCV 2025 arXiv:2411.15537
8 citations
#5471

EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs

Zhen Fan, Peng Dai, Zhuo Su et al.

AAAI 2025 arXiv:2408.17168
8 citations
#5472

PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations

Namgyu Kang, Jaemin Oh, Youngjoon Hong et al.

ICLR 2025 arXiv:2412.05994
8 citations
#5473

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models

Wenlong Deng, Yize Zhao, Vala Vakilian et al.

ICLR 2025 arXiv:2410.09344
8 citations
#5474

Scalable Fingerprinting of Large Language Models

Anshul Nasery, Jonathan Hayase, Creston Brooks et al.

NEURIPS 2025 spotlight arXiv:2502.07760
8 citations
#5475

Filter or Compensate: Towards Invariant Representation from Distribution Shift for Anomaly Detection

Zining Chen, Xingshuang Luo, Weiqiu Wang et al.

AAAI 2025 arXiv:2412.10115
8 citations
#5476

LMM-Det: Make Large Multimodal Models Excel in Object Detection

Jincheng Li, Chunyu Xie, Ji Ao et al.

ICCV 2025 arXiv:2507.18300
8 citations
#5477

Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context

Ge Zheng, Jiaye Qian, Jiajin Tang et al.

ICCV 2025 arXiv:2510.20229
8 citations
#5478

Zero-Shot Scene Change Detection

Kyusik Cho, Dong Yeop Kim, Euntai Kim

AAAI 2025 arXiv:2406.11210
8 citations
#5479

Alleviate and Mining: Rethinking Unsupervised Domain Adaptation for Mitochondria Segmentation from Pseudo-Label Perspective

Yujia Chen, Rui Sun, Wangkai Li et al.

AAAI 2025
8 citations
#5480

p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay

Jun Zhang, Desen Meng, Zhengming Zhang et al.

ICCV 2025 arXiv:2412.04449
8 citations
#5481

Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis

Letian Zhang, Quan Cui, Bingchen Zhao et al.

ICCV 2025 arXiv:2503.08741
8 citations
#5482

SVGBuilder: Component-Based Colored SVG Generation with Text-Guided Autoregressive Transformers

Zehao Chen, Rong Pan

AAAI 2025 arXiv:2412.10488
8 citations
#5483

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer

Jiayi Gao, Zijin Yin, Changcheng Hua et al.

CVPR 2025 arXiv:2504.02451
8 citations
#5484

Interactive Medical Image Analysis with Concept-based Similarity Reasoning

Ta Duc Huy, Sen Kim Tran, Phan Nguyen et al.

CVPR 2025 arXiv:2503.06873
8 citations
#5485

PurpCode: Reasoning for Safer Code Generation

Jiawei Liu, Nirav Diwan, Zhe Wang et al.

NEURIPS 2025 arXiv:2507.19060
8 citations
#5486

VideoMAR: Autoregressive Video Generation with Continuous Tokens

Hu Yu, Biao Gong, Hangjie Yuan et al.

NEURIPS 2025 oral
8 citations
#5487

A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search

Arnav Kumar Jain, Vibhakar Mohta, Subin Kim et al.

NEURIPS 2025 oral arXiv:2506.05294
8 citations
#5488

EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation

Yuqiao Wen, Behzad Shayegh, Chenyang Huang et al.

AAAI 2025 arXiv:2403.00144
8 citations
#5489

MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models

Yujing Wang, Hainan Zhang, Liang Pang et al.

AAAI 2025 arXiv:2408.17072
8 citations
#5490

Dual-Level Precision Edges Guided Multi-View Stereo with Accurate Planarization

Kehua Chen, Zhenlong Yuan, Tianlu Mao et al.

AAAI 2025 arXiv:2412.20328
8 citations
#5491

Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models

Lei Tang, Jinghui Qin, Wenxuan Ye et al.

AAAI 2025 arXiv:2501.01679
8 citations
#5492

SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations

Grigory Bartosh, Dmitry Vetrov, Christian Andersson Naesseth

ICML 2025 arXiv:2502.02472
8 citations
#5493

GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving

Shuai Liu, Quanmin Liang, Zefeng Li et al.

NEURIPS 2025 spotlight arXiv:2506.00034
8 citations
#5494

A Comprehensive Evaluation on Event Reasoning of Large Language Models

Zhengwei Tao, Zhi Jin, Yifan Zhang et al.

AAAI 2025 arXiv:2404.17513
8 citations
#5495

Beware of Calibration Data for Pruning Large Language Models

Yixin Ji, Yang Xiang, Juntao Li et al.

ICLR 2025 arXiv:2410.17711
8 citations
#5496

Enhancing Adversarial Transferability with Adversarial Weight Tuning

Jiahao Chen, Zhou Feng, Rui Zeng et al.

AAAI 2025 arXiv:2408.09469
8 citations
#5497

Just What You Desire: Constrained Timeline Summarization with Self-Reflection for Enhanced Relevance

Muhammad Reza Qorib, Qisheng Hu, Hwee Tou Ng

AAAI 2025 arXiv:2412.17408
8 citations
#5498

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Bolin Lai, Felix Juefei-Xu, Miao Liu et al.

CVPR 2025 highlight arXiv:2412.01027
8 citations
#5499

nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning

Tianqi Luo, Chuhan Huang, Leixian Shen et al.

NEURIPS 2025 arXiv:2503.12880
8 citations
#5500

Learning Fine-grained Domain Generalization via Hyperbolic State Space Hallucination

Qi Bi, Jingjun Yi, Haolan Zhan et al.

AAAI 2025 arXiv:2504.08020
8 citations
#5501

GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs

Maizhen Ning, Zihao Zhou, Qiufeng Wang et al.

AAAI 2025
8 citations
#5502

Split Gibbs Discrete Diffusion Posterior Sampling

Wenda Chu, Zihui Wu, Yifan Chen et al.

NEURIPS 2025 arXiv:2503.01161
8 citations
#5503

Sparse Learning for State Space Models on Mobile

Xuan Shen, Hangyu Zheng, Yifan Gong et al.

ICLR 2025
8 citations
#5504

Evaluating LLM Reasoning in the Operations Research Domain with ORQA

Mahdi Mostajabdaveh, Timothy Tin Long Yu, Samarendra Chandan Bindu Dash et al.

AAAI 2025 arXiv:2412.17874
8 citations
#5505

PhysSplat: Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting

Haoyu Zhao, Hao Wang, Xingyue Zhao et al.

ICCV 2025
8 citations
#5506

Data Pruning by Information Maximization

Haoru Tan, Sitong Wu, Wei Huang et al.

ICLR 2025 arXiv:2506.01701
8 citations
#5507

Erasing More Than Intended? How Concept Erasure Degrades the Generation of Non-Target Concepts

Ibtihel Amara, Ahmed Imtiaz Humayun, Ivana Kajic et al.

ICCV 2025 arXiv:2501.09833
8 citations
#5508

Adaptive Draft-Verification for Efficient Large Language Model Decoding

Xukun Liu, Bowen Lei, Ruqi Zhang et al.

AAAI 2025 arXiv:2407.12021
8 citations
#5509

Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?

HyoJung Han, Akiko Eriguchi, Haoran Xu et al.

ICLR 2025 arXiv:2410.09644
8 citations
#5510

DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization

Zihan Ding, Chi Jin, Difan Liu et al.

ICCV 2025 arXiv:2412.15689
8 citations
#5511

Error-quantified Conformal Inference for Time Series

Junxi Wu, Dongjian Hu, Yajie Bao et al.

ICLR 2025 oral arXiv:2502.00818
8 citations
#5512

SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering

Xiaopeng Li, Shasha Li, Shezheng Song et al.

AAAI 2025 arXiv:2401.17809
8 citations
#5513

Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning

Yeoreum Lee, Jinwook Jung, Sungyong Baik

ICLR 2025 arXiv:2504.14662
8 citations
#5514

Exploring Model Editing for LLM-based Aspect-Based Sentiment Classification

Shichen Li, Zhongqing Wang, Zheyu Zhao et al.

AAAI 2025 arXiv:2503.15117
8 citations
#5515

Embedding Safety into RL: A New Take on Trust Region Methods

Nikola Milosevic, Johannes Müller, Nico Scherf

ICML 2025 arXiv:2411.02957
8 citations
#5516

ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind

Kazutoshi Shinoda, Nobukatsu Hojo, Kyosuke Nishida et al.

AAAI 2025 arXiv:2501.08838
8 citations
#5517

Generalized Consistency Trajectory Models for Image Manipulation

Beomsu Kim, Jaemin Kim, Jeongsol Kim et al.

ICLR 2025 arXiv:2403.12510
8 citations
#5518

Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology

Pei Liu, Luping Ji, Jiaxiang Gou et al.

ICLR 2025 arXiv:2409.09369
8 citations
#5519

Token Activation Map to Visually Explain Multimodal LLMs

Yi Li, Hualiang Wang, Xinpeng Ding et al.

ICCV 2025 arXiv:2506.23270
8 citations
#5520

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2025 arXiv:2506.10966
8 citations
#5521

Is LLMs Hallucination Usable? LLM-based Negative Reasoning for Fake News Detection

Chaowei Zhang, Zongling Feng, Zewei Zhang et al.

AAAI 2025 arXiv:2503.09153
8 citations
#5522

De-mark: Watermark Removal in Large Language Models

Ruibo Chen, Yihan Wu, Junfeng Guo et al.

ICML 2025 arXiv:2410.13808
8 citations
#5523

DiC: Rethinking Conv3x3 Designs in Diffusion Models

Yuchuan Tian, Jing Han, Chengcheng Wang et al.

CVPR 2025 arXiv:2501.00603
8 citations
#5524

X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios

Yichen Xie, Chenfeng Xu, Chensheng Peng et al.

ICLR 2025 arXiv:2411.01123
8 citations
#5525

3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding

Xindian Ma, Wenyuan Liu, Peng Zhang et al.

AAAI 2025 arXiv:2406.09897
8 citations
#5526

Enhancing Large Language Model Performance with Gradient-Based Parameter Selection

Haoling Li, Xin Zhang, Xiao Liu et al.

AAAI 2025 arXiv:2406.15330
8 citations
#5527

Mitigating Social Bias in Large Language Models: A Multi-Objective Approach Within a Multi-Agent Framework

Zhenjie Xu, Wenqing Chen, Yi Tang et al.

AAAI 2025 arXiv:2412.15504
8 citations
#5528

ComPO: Preference Alignment via Comparison Oracles

Peter Chen, Xi Chen, Wotao Yin et al.

NEURIPS 2025 arXiv:2505.05465
8 citations
#5529

Tree-Wasserstein Distance for High Dimensional Data with a Latent Feature Hierarchy

Ya-Wei Eileen Lin, Ronald Coifman, Gal Mishne et al.

ICLR 2025 arXiv:2410.21107
8 citations
#5530

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

Liang Chen, Sinan Tan, Zefan Cai et al.

ICLR 2025 arXiv:2410.01912
8 citations
#5531

Progressive Mixed-Precision Decoding for Efficient LLM Inference

Hao (Mark) Chen, Fuwen Tan, Alexandros Kouris et al.

ICLR 2025 arXiv:2410.13461
8 citations
#5532

Controllable Protein Sequence Generation with LLM Preference Optimization

Xiangyu Liu, Yi Liu, Silei Chen et al.

AAAI 2025 arXiv:2501.15007
8 citations
#5533

Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models

Reza Shirkavand, Peiran Yu, Shangqian Gao et al.

CVPR 2025 arXiv:2412.15341
8 citations
#5534

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.

ICLR 2025 arXiv:2410.03450
8 citations
#5535

DISC: Dynamic Decomposition Improves LLM Inference Scaling

Jonathan Li, Wei Cheng, Benjamin Riviere et al.

NEURIPS 2025 arXiv:2502.16706
8 citations
#5536

EWMoE: An Effective Model for Global Weather Forecasting with Mixture-of-Experts

Lihao Gan, Xin Man, Chenghong Zhang et al.

AAAI 2025 arXiv:2405.06004
8 citations
#5537

Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models

Chenhui Hu, Pengfei Cao, Yubo Chen et al.

AAAI 2025 arXiv:2408.07413
8 citations
#5538

FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation

Ariel Shaulov, Itay Hazan, Lior Wolf et al.

NEURIPS 2025 oral arXiv:2506.01144
8 citations
#5539

Rethinking the role of frames for SE(3)-invariant crystal structure modeling

Yusei Ito, Tatsunori Taniai, Ryo Igarashi et al.

ICLR 2025 arXiv:2503.02209
8 citations
#5540

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

CVPR 2025 highlight arXiv:2412.12087
8 citations
#5541

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

Zhenwen Liang, Linfeng Song, Yang Li et al.

NEURIPS 2025 arXiv:2505.10962
8 citations
#5542

An All-Atom Generative Model for Designing Protein Complexes

Ruizhe Chen, Dongyu Xue, Xiangxin Zhou et al.

ICML 2025 arXiv:2504.13075
8 citations
#5543

InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation

Wenjie Zhuo, Fan Ma, Hehe Fan

ICCV 2025 arXiv:2411.18303
8 citations
#5544

Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization

Jingrong Wei, Long Chen

ICLR 2025 arXiv:2406.09772
8 citations
#5545

CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval

Zelong Sun, Dong Jing, Zhiwu Lu

ICCV 2025 arXiv:2502.20826
8 citations
#5546

Overcoming Challenges of Long-Horizon Prediction in Driving World Models

Arian Mousakhan, Sudhanshu Mittal, Silvio Galesso et al.

NEURIPS 2025 arXiv:2507.13162
8 citations
#5547

NutriBench: A Dataset for Evaluating Large Language Models in Nutrition Estimation from Meal Descriptions

Mehak Dhaliwal, Andong Hua, Laya Pullela et al.

ICLR 2025 arXiv:2407.12843
8 citations
#5548

CacheQuant: Comprehensively Accelerated Diffusion Models

Xuewen Liu, Zhikai Li, Qingyi Gu

CVPR 2025 arXiv:2503.01323
8 citations
#5549

MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation

Trung X. Pham, Tri Ton, Chang Yoo

ICLR 2025 oral arXiv:2410.02130
8 citations
#5550

TabWak: A Watermark for Tabular Diffusion Models

Chaoyi Zhu, Jiayi Tang, Jeroen Galjaard et al.

ICLR 2025
8 citations
#5551

Does Training with Synthetic Data Truly Protect Privacy?

Yunpeng Zhao, Jie Zhang

ICLR 2025 arXiv:2502.12976
8 citations
#5552

Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal

Jinpei Guo, Zheng Chen, Wenbo Li et al.

ICCV 2025 arXiv:2502.09873
8 citations
#5553

Mixture of Experts as Representation Learner for Deep Multi-View Clustering

Yunhe Zhang, Jinyu Cai, Zhihao Wu et al.

AAAI 2025
8 citations
#5554

Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Weimin Qiu, Jieke Wang, Meng Tang

CVPR 2025 arXiv:2411.18936
8 citations
#5555

AV-Flow: Transforming Text to Audio-Visual Human-like Interactions

Aggelina Chatziagapi, Louis-Philippe Morency, Hongyu Gong et al.

ICCV 2025 arXiv:2502.13133
8 citations
#5556

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He, Kai Tong, Di Fang et al.

CVPR 2025 arXiv:2405.16240
8 citations
#5557

Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2025 arXiv:2405.18840
8 citations
#5558

Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images?

Yujin Han, Andi Han, Wei Huang et al.

ICML 2025 arXiv:2502.04725
8 citations
#5559

Generating Physically Stable and Buildable Brick Structures from Text

Ava Pun, Kangle Deng, Ruixuan Liu et al.

ICCV 2025 arXiv:2505.05469
8 citations
#5560

Bayesian Optimization via Continual Variational Last Layer Training

Paul Brunzema, Mikkel Jordahn, John Willes et al.

ICLR 2025 arXiv:2412.09477
8 citations
#5561

Accessing Vision Foundation Models via ImageNet-1K

Yitian Zhang, Xu Ma, Yue Bai et al.

ICLR 2025 arXiv:2407.10366
8 citations
#5562

Circuit Transformer: A Transformer That Preserves Logical Equivalence

Xihan Li, Xing Li, Lei Chen et al.

ICLR 2025 arXiv:2403.13838
8 citations
#5563

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding

Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.

CVPR 2025 arXiv:2503.16707
8 citations
#5564

KinMo: Kinematic-aware Human Motion Understanding and Generation

Pengfei Zhang, Pinxin Liu, Pablo Garrido et al.

ICCV 2025 arXiv:2411.15472
8 citations
#5565

Homomorphism Counts as Structural Encodings for Graph Learning

Linus Bao, Emily Jin, Michael Bronstein et al.

ICLR 2025 arXiv:2410.18676
8 citations
#5566

Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form

Toshinori Kitamura, Tadashi Kozuno, Wataru Kumagai et al.

ICLR 2025 arXiv:2408.16286
8 citations
#5567

Sharpness-Aware Minimization: General Analysis and Improved Rates

Dimitris Oikonomou, Nicolas Loizou

ICLR 2025 arXiv:2503.02225
8 citations
#5568

What Makes Math Problems Hard for Reinforcement Learning: A Case Study

Ali Shehper, Anibal Medina-Mardones, Lucas Fagan et al.

NEURIPS 2025 arXiv:2408.15332
8 citations
#5569

Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle

Hui Dai, Ryan Teehan, Mengye Ren

ICML 2025 oral arXiv:2411.08324
8 citations
#5570

Understanding Adam Requires Better Rotation Dependent Assumptions

Tianyue Zhang, Lucas Maes, Alan Milligan et al.

NEURIPS 2025 arXiv:2410.19964
8 citations
#5571

SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization

Xiaofeng Tan, Hongsong Wang, Xin Geng et al.

NEURIPS 2025 arXiv:2412.05095
8 citations
#5572

From PEFT to DEFT: Parameter Efficient Finetuning for Reducing Activation Density in Transformers

Bharat Runwal, Tejaswini Pedapati, Pin-Yu Chen

AAAI 2025 arXiv:2402.01911
8 citations
#5573

Risk and cross validation in ridge regression with correlated samples

Alexander Atanasov, Jacob A Zavatone-Veth, Cengiz Pehlevan

ICML 2025 arXiv:2408.04607
8 citations
#5574

Data Unlearning in Diffusion Models

Silas Alberti, Kenan Hasanaliyev, Manav Shah et al.

ICLR 2025 arXiv:2503.01034
8 citations
#5575

SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning

Zhongjian Qiao, Jiafei Lyu, Kechen Jiao et al.

AAAI 2025 arXiv:2408.12970
8 citations
#5576

Offline-to-Online Hyperparameter Transfer for Stochastic Bandits

Dravyansh Sharma, Arun Suggala

AAAI 2025 arXiv:2501.02926
8 citations
#5577

4KAgent: Agentic Any Image to 4K Super-Resolution

Yushen Zuo, Qi Zheng, Mingyang Wu et al.

NEURIPS 2025 arXiv:2507.07105
8 citations
#5578

Federated Domain Generalization with Data-free On-server Matching Gradient

Binh Nguyen, Minh-Duong Nguyen, Jinsun Park et al.

ICLR 2025 arXiv:2501.14653
8 citations
#5579

Causally Reliable Concept Bottleneck Models

Giovanni De Felice, Arianna Casanova Flores, Francesco De Santis et al.

NEURIPS 2025 arXiv:2503.04363
8 citations
#5580

RAD: Region-Aware Diffusion Models for Image Inpainting

Sora Kim, Sungho Suh, Minsik Lee

CVPR 2025 arXiv:2412.09191
8 citations
#5581

NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains

Wonje Choi, Jinwoo Park, Sanghyun Ahn et al.

ICLR 2025 arXiv:2503.00870
8 citations
#5582

REvolve: Reward Evolution with Large Language Models using Human Feedback

RISHI HAZRA, Alkis Sygkounas, Andreas Persson et al.

ICLR 2025 arXiv:2406.01309
8 citations
#5583

Directional Gradient Projection for Robust Fine-Tuning of Foundation Models

Chengyue Huang, Junjiao Tian, Brisa Maneechotesuwan et al.

ICLR 2025 arXiv:2502.15895
8 citations
#5584

Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding

Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

ICLR 2025 arXiv:2403.07320
8 citations
#5585

CTSyn: A Foundation Model for Cross Tabular Data Generation

Xiaofeng Lin, Chenheng Xu, Matthew Yang et al.

ICLR 2025 arXiv:2406.04619
8 citations
#5586

GC4NC: A Benchmark Framework for Graph Condensation on Node Classification with New Insights

Shengbo Gong, Juntong Ni, Noveen Sachdeva et al.

NEURIPS 2025 arXiv:2406.16715
8 citations
#5587

CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic

YUXUAN SUN, Yixuan Si, Chenglu Zhu et al.

NEURIPS 2025 arXiv:2505.20510
8 citations
#5588

Layerwise Recurrent Router for Mixture-of-Experts

Zihan Qiu, Zeyu Huang, Shuang Cheng et al.

ICLR 2025 arXiv:2408.06793
8 citations
#5589

Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants

Lixiong Qin, Shilong Ou, Miaoxuan Zhang et al.

NEURIPS 2025 arXiv:2501.01243
8 citations
#5590

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Hyungjoo Chae, Seonghwan Kim, Junhee Cho et al.

NEURIPS 2025 spotlight arXiv:2505.15277
8 citations
#5591

NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Tianyi Wang, Shuaicheng Niu, Harry Cheng et al.

ICCV 2025 arXiv:2503.18678
8 citations
#5592

3D Student Splatting and Scooping

Jialin Zhu, Jiangbei Yue, Feixiang He et al.

CVPR 2025 arXiv:2503.10148
8 citations
#5593

A General Framework for Producing Interpretable Semantic Text Embeddings

Yiqun Sun, Qiang Huang, Yixuan Tang et al.

ICLR 2025 arXiv:2410.03435
8 citations
#5594

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

Yaxin Luo, Zhaoyi Li, Jiacheng Liu et al.

NEURIPS 2025 arXiv:2505.24878
8 citations
#5595

SolidGeo: Measuring Multimodal Spatial Math Reasoning in Solid Geometry

Peijie Wang, Chao Yang, Zhong-Zhi Li et al.

NEURIPS 2025 arXiv:2505.21177
8 citations
#5596

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

Yatai Ji, Shilong Zhang, Jie Wu et al.

ICLR 2025 arXiv:2407.07577
8 citations
#5597

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

CVPR 2025 highlight arXiv:2503.07026
8 citations
#5598

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

Chengxiang Huang, Yake Wei, Zequn Yang et al.

CVPR 2025 arXiv:2503.18595
8 citations
#5599

A multiscale analysis of mean-field transformers in the moderate interaction regime

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

NEURIPS 2025 oral arXiv:2509.25040
8 citations
#5600

Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning

Yang Xu, Washim Mondal, Vaneet Aggarwal

NEURIPS 2025 arXiv:2502.16816
8 citations