Most Cited 2025 "multimodal reasoning" Papers

22,274 papers found • Page 28 of 112

Filters:Most Cited 2025 multimodal reasoning Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#5401

AnomalyNCD: Towards Novel Anomaly Class Discovery in Industrial Scenarios

Ziming Huang, Xurui Li, Haotian Liu et al.

CVPR 2025arXiv:2410.14379

citations

#5402

CONTRA: Conformal Prediction Region via Normalizing Flow Transformation

Zhenhan FANG, Aixin Tan, Jian Huang

ICLR 2025

citations

#5403

Micro-macro Wavelet-based Gaussian Splatting for 3D Reconstruction from Unconstrained Images

Yihui Li, Chengxin Lv, Hongyu Yang et al.

AAAI 2025paperarXiv:2501.14231

citations

#5404

DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models

Hyeonwoo Kim, Sangwon Baik, Hanbyul Joo

ICCV 2025arXiv:2501.08333

citations

#5405

Multi-Marginal Stochastic Flow Matching for High-Dimensional Snapshot Data at Irregular Time Points

Justin Lee, Behnaz Moradi-Jamei, Heman Shakeri

ICML 2025arXiv:2508.04351

citations

#5406

Principled Algorithms for Optimizing Generalized Metrics in Binary Classification

Anqi Mao, Mehryar Mohri, Yutao Zhong

ICML 2025arXiv:2512.23133

citations

#5407

MergeNet: Knowledge Migration Across Heterogeneous Models, Tasks, and Modalities

Kunxi Li, Tianyu Zhan, Kairui Fu et al.

AAAI 2025paperarXiv:2404.13322

citations

#5408

TopoDiffusionNet: A Topology-aware Diffusion Model

Saumya Gupta, Dimitris Samaras, Chao Chen

ICLR 2025arXiv:2410.16646

citations

#5409

Glauber Generative Model: Discrete Diffusion Models via Binary Classification

Harshit Varma, Dheeraj Nagaraj, Karthikeyan Shanmugam

ICLR 2025arXiv:2405.17035

citations

#5410

H3D-DGS: Exploring Heterogeneous 3D Motion Representation for Deformable 3D Gaussian Splatting

Bing He, Yunuo Chen, Guo Lu et al.

NEURIPS 2025arXiv:2408.13036

citations

#5411

EditRoom: LLM-parameterized Graph Diffusion for Composable 3D Room Layout Editing

Kaizhi Zheng, Xiaotong Chen, Xuehai He et al.

ICLR 2025arXiv:2410.12836

citations

#5412

PRE-Mamba: A 4D State Space Model for Ultra-High-Frequent Event Camera Deraining

Ciyu Ruan, Ruishan Guo, Zihang GONG et al.

ICCV 2025arXiv:2505.05307

citations

#5413

Position: The Future of Bayesian Prediction Is Prior-Fitted

Samuel Gabriel Müller, Arik Reuter, Noah Hollmann et al.

ICML 2025arXiv:2505.23947

citations

#5414

Stable Mean Teacher for Semi-supervised Video Action Detection

Akash Kumar, Sirshapan Mitra, Yogesh Singh Rawat

AAAI 2025paperarXiv:2412.07072

citations

#5415

Revisiting Random Walks for Learning on Graphs

Jinwoo Kim, Olga Zaghen, Ayhan Suleymanzade et al.

ICLR 2025arXiv:2407.01214

citations

#5416

Seeing Eye to AI: Human Alignment via Gaze-Based Response Rewards for Large Language Models

Ángela López-Cardona, Carlos Segura, Alexandros Karatzoglou et al.

ICLR 2025arXiv:2410.01532

citations

#5417

Uncertainty Modeling in Graph Neural Networks via Stochastic Differential Equations

Richard Bergna, Sergio Calvo Ordoñez, Felix Opolka et al.

ICLR 2025arXiv:2408.16115

citations

#5418

Slot-Guided Adaptation of Pre-trained Diffusion Models for Object-Centric Learning and Compositional Generation

adil kaan akan, Yucel Yemez

ICLR 2025arXiv:2501.15878

citations

#5419

Rectifying Conformity Scores for Better Conditional Coverage

Vincent Plassier, Alexander Fishkov, Victor Dheur et al.

ICML 2025arXiv:2502.16336

citations

#5420

ProtComposer: Compositional Protein Structure Generation with 3D Ellipsoids

Hannes Stärk, Bowen Jing, Tomas Geffner et al.

ICLR 2025arXiv:2503.05025

citations

#5421

Towards Generalizable Scene Change Detection

Jae-Woo KIM, Ue-Hwan Kim

CVPR 2025arXiv:2409.06214

citations

#5422

RobuRCDet: Enhancing Robustness of Radar-Camera Fusion in Bird's Eye View for 3D Object Detection

Jingtong Yue, Zhiwei Lin, Xin Lin et al.

ICLR 2025arXiv:2502.13071

citations

#5423

GraphCL: Graph-based Clustering for Semi-Supervised Medical Image Segmentation

Mengzhu Wang, houcheng su, Jiao Li et al.

ICML 2025arXiv:2411.13147

citations

#5424

Tartan IMU: A Light Foundation Model for Inertial Positioning in Robotics

Shibo Zhao, Sifan Zhou, Raphael Blanchard et al.

CVPR 2025

citations

#5425

Federated Residual Low-Rank Adaption of Large Language Models

Yunlu Yan, Chun-Mei Feng, Wangmeng Zuo et al.

ICLR 2025

citations

#5426

Not all solutions are created equal: An analytical dissociation of functional and representational similarity in deep linear neural networks

Lukas Braun, Erin Grant, Andrew Saxe

ICML 2025spotlight

citations

#5427

Q-SFT: Q-Learning for Language Models via Supervised Fine-Tuning

Joey Hong, Anca Dragan, Sergey Levine

ICLR 2025arXiv:2411.05193

citations

#5428

Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence

Wenbo Huang, Jinghui Zhang, Guang Li et al.

AAAI 2025paperarXiv:2412.07481

citations

#5429

SWE-SQL: Illuminating LLM Pathways to Solve User SQL Issues in Real-World Applications

Jinyang Li, Xiaolong Li, Ge Qu et al.

NEURIPS 2025arXiv:2506.18951

citations

#5430

Tiled Flash Linear Attention: More Efficient Linear RNN and xLSTM Kernels

Maximilian Beck, Korbinian Pöppel, Phillip Lippe et al.

NEURIPS 2025arXiv:2503.14376

citations

#5431

Non-equilibrium Annealed Adjoint Sampler

Jaemoo Choi, Yongxin Chen, Molei Tao et al.

NEURIPS 2025arXiv:2506.18165

citations

#5432

One-Step Diffusion for Detail-Rich and Temporally Consistent Video Super-Resolution

Yujing Sun, Lingchen Sun, Shuaizheng Liu et al.

NEURIPS 2025oralarXiv:2506.15591

citations

#5433

MANTRA: The Manifold Triangulations Assemblage

Rubén Ballester, Ernst Roell, Daniel Bin Schmid et al.

ICLR 2025arXiv:2410.02392

citations

#5434

Quantum-PEFT: Ultra parameter-efficient fine-tuning

Toshiaki Koike-Akino, Francesco Tonin, Yongtao Wu et al.

ICLR 2025arXiv:2503.05431

citations

#5435

Expensive Multi-Objective Bayesian Optimization Based on Diffusion Models

Bingdong Li, Zixiang Di, Yongfan Lu et al.

AAAI 2025paperarXiv:2405.08674

citations

#5436

Learning Strategic Language Agents in the Werewolf Game with Iterative Latent Space Policy Optimization

Zelai Xu, Wanjun Gu, Chao Yu et al.

ICML 2025arXiv:2502.04686

citations

#5437

Unified Breakdown Analysis for Byzantine Robust Gossip

Renaud Gaucher, Aymeric Dieuleveut, Hadrien Hendrikx

ICML 2025arXiv:2410.10418

citations

#5438

RoboPEPP: Vision-Based Robot Pose and Joint Angle Estimation through Embedding Predictive Pre-Training

Raktim Gautam Goswami, Prashanth Krishnamurthy, Yann LeCun et al.

CVPR 2025highlightarXiv:2411.17662

citations

#5439

DepR: Depth Guided Single-view Scene Reconstruction with Instance-level Diffusion

Qingcheng Zhao, Xiang Zhang, Haiyang Xu et al.

ICCV 2025arXiv:2507.22825

citations

#5440

Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Hyelin Nam, Jaemin Kim, Dohun Lee et al.

CVPR 2025arXiv:2411.15540

citations

#5441

Implicit Bias of Spectral Descent and Muon on Multiclass Separable Data

Chen Fan, Mark Schmidt, Christos Thrampoulidis

NEURIPS 2025spotlightarXiv:2502.04664

citations

#5442

Analyzing Finetuning Representation Shift for Multimodal LLMs Steering

Pegah KHAYATAN, Mustafa Shukor, Jayneel Parekh et al.

ICCV 2025arXiv:2501.03012

citations

#5443

SemGeoMo: Dynamic Contextual Human Motion Generation with Semantic and Geometric Guidance

Peishan Cong, Ziyi Wang, Yuexin Ma et al.

CVPR 2025arXiv:2503.01291

citations

#5444

HMoRA: Making LLMs More Effective with Hierarchical Mixture of LoRA Experts

Mengqi Liao, Wei Chen, Junfeng Shen et al.

ICLR 2025

citations

#5445

Benchmarking Quantum Reinforcement Learning

Nico Meyer, Christian Ufrecht, George Yammine et al.

ICML 2025arXiv:2501.15893

citations

#5446

DTGBrepGen: A Novel B-rep Generative Model through Decoupling Topology and Geometry

Jing Li, Yihang Fu, Falai Chen

CVPR 2025arXiv:2503.13110

citations

#5447

SIGMAN: Scaling 3D Human Gaussian Generation with Millions of Assets

Yuhang Yang, Fengqi Liu, Yixing Lu et al.

ICCV 2025

citations

#5448

Topograph: An Efficient Graph-Based Framework for Strictly Topology Preserving Image Segmentation

Laurin Lux, Alexander H Berger, Alexander Weers et al.

ICLR 2025arXiv:2411.03228

citations

#5449

SynQ: Accurate Zero-shot Quantization by Synthesis-aware Fine-tuning

Minjun Kim, Jongjin Kim, U Kang

ICLR 2025

citations

#5450

Mask^2DiT: Dual Mask-based Diffusion Transformer for Multi-Scene Long Video Generation

Tianhao Qi, Jianlong Yuan, Wanquan Feng et al.

CVPR 2025

citations

#5451

DroneSplat: 3D Gaussian Splatting for Robust 3D Reconstruction from In-the-Wild Drone Imagery

Jiadong Tang, Yu Gao, Dianyi Yang et al.

CVPR 2025highlightarXiv:2503.16964

citations

#5452

ROD-MLLM: Towards More Reliable Object Detection in Multimodal Large Language Models

Heng Yin, Yuqiang Ren, Ke Yan et al.

CVPR 2025

citations

#5453

SplatFormer: Point Transformer for Robust 3D Gaussian Splatting

Yutong Chen, Marko Mihajlovic, Xiyi Chen et al.

ICLR 2025arXiv:2411.06390

citations

#5454

DuMo: Dual Encoder Modulation Network for Precise Concept Erasure

Feng Han, Kai Chen, Chao Gong et al.

AAAI 2025paperarXiv:2501.01125

citations

#5455

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Xin Jin, Haisheng Su, Kai Liu et al.

CVPR 2025arXiv:2503.12009

citations

#5456

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

Yunuo Chen, Junli Cao, Vidit Goel et al.

NEURIPS 2025arXiv:2502.03639

citations

#5457

PokerBench: Training Large Language Models to Become Professional Poker Players

Richard Zhuang, Akshat Gupta, Richard Yang et al.

AAAI 2025paperarXiv:2501.08328

citations

#5458

Symbolic regression via MDLformer-guided search: from minimizing prediction error to minimizing description length

Zihan Yu, Jingtao Ding, Yong Li et al.

ICLR 2025arXiv:2411.03753

citations

#5459

DiveR-CT: Diversity-enhanced Red Teaming Large Language Model Assistants with Relaxing Constraints

Andrew Zhao, Quentin Xu, Matthieu Lin et al.

AAAI 2025paperarXiv:2405.19026

citations

#5460

Discrete Neural Flow Samplers with Locally Equivariant Transformer

Zijing Ou, Ruixiang Zhang, Yingzhen Li

NEURIPS 2025arXiv:2505.17741

citations

#5461

MimeQA: Towards Socially-Intelligent Nonverbal Foundation Models

Hengzhi Li, Megan Tjandrasuwita, Yi R. (May) Fung et al.

NEURIPS 2025arXiv:2502.16671

citations

#5462

Out of Length Text Recognition with Sub-String Matching

Yongkun Du, Zhineng Chen, Caiyan Jia et al.

AAAI 2025paperarXiv:2407.12317

citations

#5463

Segment Any 3D Object with Language

Seungjun Lee, Yuyang Zhao, Gim H Lee

ICLR 2025arXiv:2404.02157

citations

#5464

LLM Strategic Reasoning: Agentic Study through Behavioral Game Theory

Jingru Jia, Zehua Yuan, Junhao Pan et al.

NEURIPS 2025oralarXiv:2502.20432

citations

#5465

MMReason: An Open-Ended Multi-Modal Multi-Step Reasoning Benchmark for MLLMs Toward AGI

Huanjin Yao, Jiaxing Huang, Yawen Qiu et al.

ICCV 2025arXiv:2506.23563

citations

#5466

Can DPO Learn Diverse Human Values? A Theoretical Scaling Law

Shawn Im, Sharon Li

NEURIPS 2025arXiv:2408.03459

citations

#5467

Among Us: A Sandbox for Measuring and Detecting Agentic Deception

Satvik Golechha, Adrià Garriga-Alonso

NEURIPS 2025spotlightarXiv:2504.04072

citations

#5468

DCBM: Data-Efficient Visual Concept Bottleneck Models

Katharina Prasse, Patrick Knab, Sascha Marton et al.

ICML 2025arXiv:2412.11576

citations

#5469

Generating Multimodal Driving Scenes via Next-Scene Prediction

Yanhao Wu, Haoyang Zhang, Tianwei Lin et al.

CVPR 2025arXiv:2503.14945

citations

#5470

MUNBa: Machine Unlearning via Nash Bargaining

Jing Wu, Mehrtash Harandi

ICCV 2025arXiv:2411.15537

citations

#5471

EMHI: A Multimodal Egocentric Human Motion Dataset with HMD and Body-Worn IMUs

Zhen Fan, Peng Dai, Zhuo Su et al.

AAAI 2025paperarXiv:2408.17168

citations

#5472

PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations

Namgyu Kang, Jaemin Oh, Youngjoon Hong et al.

ICLR 2025arXiv:2412.05994

citations

#5473

DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models

Wenlong Deng, Yize Zhao, Vala Vakilian et al.

ICLR 2025arXiv:2410.09344

citations

#5474

Scalable Fingerprinting of Large Language Models

Anshul Nasery, Jonathan Hayase, Creston Brooks et al.

NEURIPS 2025spotlightarXiv:2502.07760

citations

#5475

Filter or Compensate: Towards Invariant Representation from Distribution Shift for Anomaly Detection

Zining Chen, Xingshuang Luo, Weiqiu Wang et al.

AAAI 2025paperarXiv:2412.10115

citations

#5476

LMM-Det: Make Large Multimodal Models Excel in Object Detection

Jincheng Li, Chunyu Xie, Ji Ao et al.

ICCV 2025arXiv:2507.18300

citations

#5477

Why LVLMs Are More Prone to Hallucinations in Longer Responses: The Role of Context

Ge Zheng, Jiaye Qian, Jiajin Tang et al.

ICCV 2025arXiv:2510.20229

citations

#5478

Zero-Shot Scene Change Detection

Kyusik Cho, Dong Yeop Kim, Euntai Kim

AAAI 2025paperarXiv:2406.11210

citations

#5479

Alleviate and Mining: Rethinking Unsupervised Domain Adaptation for Mitochondria Segmentation from Pseudo-Label Perspective

Yujia Chen, Rui Sun, Wangkai Li et al.

AAAI 2025paper

citations

#5480

p-MoD: Building Mixture-of-Depths MLLMs via Progressive Ratio Decay

Jun Zhang, Desen Meng, Zhengming Zhang et al.

ICCV 2025arXiv:2412.04449

citations

#5481

Oasis: One Image is All You Need for Multimodal Instruction Data Synthesis

Letian Zhang, Quan Cui, Bingchen Zhao et al.

ICCV 2025arXiv:2503.08741

citations

#5482

SVGBuilder: Component-Based Colored SVG Generation with Text-Guided Autoregressive Transformers

Zehao Chen, Rong Pan

AAAI 2025paperarXiv:2412.10488

citations

#5483

ConMo: Controllable Motion Disentanglement and Recomposition for Zero-Shot Motion Transfer

Jiayi Gao, Zijin Yin, Changcheng Hua et al.

CVPR 2025arXiv:2504.02451

citations

#5484

Interactive Medical Image Analysis with Concept-based Similarity Reasoning

Ta Duc Huy, Sen Kim Tran, Phan Nguyen et al.

CVPR 2025arXiv:2503.06873

citations

#5485

PurpCode: Reasoning for Safer Code Generation

Jiawei Liu, Nirav Diwan, Zhe Wang et al.

NEURIPS 2025arXiv:2507.19060

citations

#5486

VideoMAR: Autoregressive Video Generation with Continuous Tokens

Hu Yu, Biao Gong, Hangjie Yuan et al.

NEURIPS 2025oral

citations

#5487

A Smooth Sea Never Made a Skilled SAILOR: Robust Imitation via Learning to Search

Arnav Kumar Jain, Vibhakar Mohta, Subin Kim et al.

NEURIPS 2025oralarXiv:2506.05294

citations

#5488

EBBS: An Ensemble with Bi-Level Beam Search for Zero-Shot Machine Translation

Yuqiao Wen, Behzad Shayegh, Chenyang Huang et al.

AAAI 2025paperarXiv:2403.00144

citations

#5489

MaFeRw: Query Rewriting with Multi-Aspect Feedbacks for Retrieval-Augmented Large Language Models

Yujing Wang, Hainan Zhang, Liang Pang et al.

AAAI 2025paperarXiv:2408.17072

citations

#5490

Dual-Level Precision Edges Guided Multi-View Stereo with Accurate Planarization

Kehua Chen, Zhenlong Yuan, Tianlu Mao et al.

AAAI 2025paperarXiv:2412.20328

citations

#5491

Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models

Lei Tang, Jinghui Qin, Wenxuan Ye et al.

AAAI 2025paperarXiv:2501.01679

citations

#5492

SDE Matching: Scalable and Simulation-Free Training of Latent Stochastic Differential Equations

Grigory Bartosh, Dmitry Vetrov, Christian Andersson Naesseth

ICML 2025arXiv:2502.02472

citations

#5493

GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving

Shuai Liu, Quanmin Liang, Zefeng Li et al.

NEURIPS 2025spotlightarXiv:2506.00034

citations

#5494

A Comprehensive Evaluation on Event Reasoning of Large Language Models

Zhengwei Tao, Zhi Jin, Yifan Zhang et al.

AAAI 2025paperarXiv:2404.17513

citations

#5495

Beware of Calibration Data for Pruning Large Language Models

Yixin Ji, Yang Xiang, Juntao Li et al.

ICLR 2025arXiv:2410.17711

citations

#5496

Enhancing Adversarial Transferability with Adversarial Weight Tuning

Jiahao Chen, Zhou Feng, Rui Zeng et al.

AAAI 2025paperarXiv:2408.09469

citations

#5497

Just What You Desire: Constrained Timeline Summarization with Self-Reflection for Enhanced Relevance

Muhammad Reza Qorib, Qisheng Hu, Hwee Tou Ng

AAAI 2025paperarXiv:2412.17408

citations

#5498

Unleashing In-context Learning of Autoregressive Models for Few-shot Image Manipulation

Bolin Lai, Felix Juefei-Xu, Miao Liu et al.

CVPR 2025highlightarXiv:2412.01027

citations

#5499

nvBench 2.0: Resolving Ambiguity in Text-to-Visualization through Stepwise Reasoning

Tianqi Luo, Chuhan Huang, Leixian Shen et al.

NEURIPS 2025arXiv:2503.12880

citations

#5500

Learning Fine-grained Domain Generalization via Hyperbolic State Space Hallucination

Qi Bi, Jingjun Yi, Haolan Zhan et al.

AAAI 2025paperarXiv:2504.08020

citations

#5501

GNS: Solving Plane Geometry Problems by Neural-Symbolic Reasoning with Multi-Modal LLMs

Maizhen Ning, Zihao Zhou, Qiufeng Wang et al.

AAAI 2025paper

citations

#5502

Split Gibbs Discrete Diffusion Posterior Sampling

Wenda Chu, Zihui Wu, Yifan Chen et al.

NEURIPS 2025arXiv:2503.01161

citations

#5503

Sparse Learning for State Space Models on Mobile

Xuan Shen, Hangyu Zheng, Yifan Gong et al.

ICLR 2025

citations

#5504

Evaluating LLM Reasoning in the Operations Research Domain with ORQA

Mahdi Mostajabdaveh, Timothy Tin Long Yu, Samarendra Chandan Bindu Dash et al.

AAAI 2025paperarXiv:2412.17874

citations

#5505

PhysSplat: Efficient Physics Simulation for 3D Scenes via MLLM-Guided Gaussian Splatting

Haoyu Zhao, Hao Wang, Xingyue Zhao et al.

ICCV 2025

citations

#5506

Data Pruning by Information Maximization

Haoru Tan, Sitong Wu, Wei Huang et al.

ICLR 2025arXiv:2506.01701

citations

#5507

Erasing More Than Intended? How Concept Erasure Degrades the Generation of Non-Target Concepts

Ibtihel Amara, Ahmed Imtiaz Humayun, Ivana Kajic et al.

ICCV 2025arXiv:2501.09833

citations

#5508

Adaptive Draft-Verification for Efficient Large Language Model Decoding

Xukun Liu, Bowen Lei, Ruqi Zhang et al.

AAAI 2025paperarXiv:2407.12021

citations

#5509

Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?

HyoJung Han, Akiko Eriguchi, Haoran Xu et al.

ICLR 2025arXiv:2410.09644

citations

#5510

DOLLAR: Few-Step Video Generation via Distillation and Latent Reward Optimization

Zihan Ding, Chi Jin, Difan Liu et al.

ICCV 2025arXiv:2412.15689

citations

#5511

Error-quantified Conformal Inference for Time Series

Junxi Wu, Dongjian Hu, Yajie Bao et al.

ICLR 2025oralarXiv:2502.00818

citations

#5512

SWEA: Updating Factual Knowledge in Large Language Models via Subject Word Embedding Altering

Xiaopeng Li, Shasha Li, Shezheng Song et al.

AAAI 2025paperarXiv:2401.17809

citations

#5513

Mitigating Parameter Interference in Model Merging via Sharpness-Aware Fine-Tuning

Yeoreum Lee, Jinwook Jung, Sungyong Baik

ICLR 2025arXiv:2504.14662

citations

#5514

Exploring Model Editing for LLM-based Aspect-Based Sentiment Classification

Shichen Li, Zhongqing Wang, Zheyu Zhao et al.

AAAI 2025paperarXiv:2503.15117

citations

#5515

Embedding Safety into RL: A New Take on Trust Region Methods

Nikola Milosevic, Johannes Müller, Nico Scherf

ICML 2025arXiv:2411.02957

citations

#5516

ToMATO: Verbalizing the Mental States of Role-Playing LLMs for Benchmarking Theory of Mind

Kazutoshi Shinoda, Nobukatsu Hojo, Kyosuke Nishida et al.

AAAI 2025paperarXiv:2501.08838

citations

#5517

Generalized Consistency Trajectory Models for Image Manipulation

Beomsu Kim, Jaemin Kim, Jeongsol Kim et al.

ICLR 2025arXiv:2403.12510

citations

#5518

Interpretable Vision-Language Survival Analysis with Ordinal Inductive Bias for Computational Pathology

Pei Liu, Luping Ji, Jiaxiang Gou et al.

ICLR 2025arXiv:2409.09369

citations

#5519

Token Activation Map to Visually Explain Multimodal LLMs

Yi Li, Hualiang Wang, Xinpeng Ding et al.

ICCV 2025arXiv:2506.23270

citations

#5520

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

Ning Gao, Yilun Chen, Shuai Yang et al.

CVPR 2025arXiv:2506.10966

citations

#5521

Is LLMs Hallucination Usable? LLM-based Negative Reasoning for Fake News Detection

Chaowei Zhang, Zongling Feng, Zewei Zhang et al.

AAAI 2025paperarXiv:2503.09153

citations

#5522

De-mark: Watermark Removal in Large Language Models

Ruibo Chen, Yihan Wu, Junfeng Guo et al.

ICML 2025arXiv:2410.13808

citations

#5523

DiC: Rethinking Conv3x3 Designs in Diffusion Models

Yuchuan Tian, Jing Han, Chengcheng Wang et al.

CVPR 2025arXiv:2501.00603

citations

#5524

X-Drive: Cross-modality Consistent Multi-Sensor Data Synthesis for Driving Scenarios

Yichen Xie, Chenfeng Xu, Chensheng Peng et al.

ICLR 2025arXiv:2411.01123

citations

#5525

3D-RPE: Enhancing Long-Context Modeling Through 3D Rotary Position Encoding

Xindian Ma, Wenyuan Liu, Peng Zhang et al.

AAAI 2025paperarXiv:2406.09897

citations

#5526

Enhancing Large Language Model Performance with Gradient-Based Parameter Selection

Haoling Li, Xin Zhang, Xiao Liu et al.

AAAI 2025paperarXiv:2406.15330

citations

#5527

Mitigating Social Bias in Large Language Models: A Multi-Objective Approach Within a Multi-Agent Framework

Zhenjie Xu, Wenqing Chen, Yi Tang et al.

AAAI 2025paperarXiv:2412.15504

citations

#5528

ComPO: Preference Alignment via Comparison Oracles

Peter Chen, Xi Chen, Wotao Yin et al.

NEURIPS 2025arXiv:2505.05465

citations

#5529

Tree-Wasserstein Distance for High Dimensional Data with a Latent Feature Hierarchy

Ya-Wei Eileen Lin, Ronald Coifman, Gal Mishne et al.

ICLR 2025arXiv:2410.21107

citations

#5530

A Spark of Vision-Language Intelligence: 2-Dimensional Autoregressive Transformer for Efficient Finegrained Image Generation

Liang Chen, Sinan Tan, Zefan Cai et al.

ICLR 2025arXiv:2410.01912

citations

#5531

Progressive Mixed-Precision Decoding for Efficient LLM Inference

Hao (Mark) Chen, Fuwen Tan, Alexandros Kouris et al.

ICLR 2025arXiv:2410.13461

citations

#5532

Controllable Protein Sequence Generation with LLM Preference Optimization

Xiangyu Liu, Yi Liu, Silei Chen et al.

AAAI 2025paperarXiv:2501.15007

citations

#5533

Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models

Reza Shirkavand, Peiran Yu, Shangqian Gao et al.

CVPR 2025arXiv:2412.15341

citations

#5534

MLLM as Retriever: Interactively Learning Multimodal Retrieval for Embodied Agents

Junpeng Yue, Xinrun Xu, Börje F. Karlsson et al.

ICLR 2025arXiv:2410.03450

citations

#5535

DISC: Dynamic Decomposition Improves LLM Inference Scaling

Jonathan Li, Wei Cheng, Benjamin Riviere et al.

NEURIPS 2025arXiv:2502.16706

citations

#5536

EWMoE: An Effective Model for Global Weather Forecasting with Mixture-of-Experts

Lihao Gan, Xin Man, Chenghong Zhang et al.

AAAI 2025paperarXiv:2405.06004

citations

#5537

Knowledge in Superposition: Unveiling the Failures of Lifelong Knowledge Editing for Large Language Models

Chenhui Hu, Pengfei Cao, Yubo Chen et al.

AAAI 2025paperarXiv:2408.07413

citations

#5538

FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation

Ariel Shaulov, Itay Hazan, Lior Wolf et al.

NEURIPS 2025oralarXiv:2506.01144

citations

#5539

Rethinking the role of frames for SE(3)-invariant crystal structure modeling

Yusei Ito, Tatsunori Taniai, Ryo Igarashi et al.

ICLR 2025arXiv:2503.02209

citations

#5540

Instruction-based Image Manipulation by Watching How Things Move

Mingdeng Cao, Xuaner Zhang, Yinqiang Zheng et al.

CVPR 2025highlightarXiv:2412.12087

citations

#5541

MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation

Zhenwen Liang, Linfeng Song, Yang Li et al.

NEURIPS 2025arXiv:2505.10962

citations

#5542

An All-Atom Generative Model for Designing Protein Complexes

Ruizhe Chen, Dongyu Xue, Xiangxin Zhou et al.

ICML 2025arXiv:2504.13075

citations

#5543

InfiniDreamer: Arbitrarily Long Human Motion Generation via Segment Score Distillation

Wenjie Zhuo, Fan Ma, Hehe Fan

ICCV 2025arXiv:2411.18303

citations

#5544

Accelerated Over-Relaxation Heavy-Ball Method: Achieving Global Accelerated Convergence with Broad Generalization

Jingrong Wei, Long Chen

ICLR 2025arXiv:2406.09772

citations

#5545

CoTMR: Chain-of-Thought Multi-Scale Reasoning for Training-Free Zero-Shot Composed Image Retrieval

Zelong Sun, Dong Jing, Zhiwu Lu

ICCV 2025arXiv:2502.20826

citations

#5546

Overcoming Challenges of Long-Horizon Prediction in Driving World Models

Arian Mousakhan, Sudhanshu Mittal, Silvio Galesso et al.

NEURIPS 2025arXiv:2507.13162

citations

#5547

NutriBench: A Dataset for Evaluating Large Language Models in Nutrition Estimation from Meal Descriptions

Mehak Dhaliwal, Andong Hua, Laya Pullela et al.

ICLR 2025arXiv:2407.12843

citations

#5548

CacheQuant: Comprehensively Accelerated Diffusion Models

Xuewen Liu, Zhikai Li, Qingyi Gu

CVPR 2025arXiv:2503.01323

citations

#5549

MDSGen: Fast and Efficient Masked Diffusion Temporal-Aware Transformers for Open-Domain Sound Generation

Trung X. Pham, Tri Ton, Chang Yoo

ICLR 2025oralarXiv:2410.02130

citations

#5550

TabWak: A Watermark for Tabular Diffusion Models

Chaoyi Zhu, Jiayi Tang, Jeroen Galjaard et al.

ICLR 2025

citations

#5551

Does Training with Synthetic Data Truly Protect Privacy?

Yunpeng Zhao, Jie Zhang

ICLR 2025arXiv:2502.12976

citations

#5552

Compression-Aware One-Step Diffusion Model for JPEG Artifact Removal

Jinpei Guo, Zheng Chen, Wenbo Li et al.

ICCV 2025arXiv:2502.09873

citations

#5553

Mixture of Experts as Representation Learner for Deep Multi-View Clustering

Yunhe Zhang, Jinyu Cai, Zhihao Wu et al.

AAAI 2025paper

citations

#5554

Self-Cross Diffusion Guidance for Text-to-Image Synthesis of Similar Subjects

Weimin Qiu, Jieke Wang, Meng Tang

CVPR 2025arXiv:2411.18936

citations

#5555

AV-Flow: Transforming Text to Audio-Visual Human-like Interactions

Aggelina Chatziagapi, Louis-Philippe Morency, Hongyu Gong et al.

ICCV 2025arXiv:2502.13133

citations

#5556

AFL: A Single-Round Analytic Approach for Federated Learning with Pre-trained Models

Run He, Kai Tong, Di Fang et al.

CVPR 2025arXiv:2405.16240

citations

#5557

Parameter-efficient Fine-tuning in Hyperspherical Space for Open-vocabulary Semantic Segmentation

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2025arXiv:2405.18840

citations

#5558

Can Diffusion Models Learn Hidden Inter-Feature Rules Behind Images?

Yujin Han, Andi Han, Wei Huang et al.

ICML 2025arXiv:2502.04725

citations

#5559

Generating Physically Stable and Buildable Brick Structures from Text

Ava Pun, Kangle Deng, Ruixuan Liu et al.

ICCV 2025arXiv:2505.05469

citations

#5560

Bayesian Optimization via Continual Variational Last Layer Training

Paul Brunzema, Mikkel Jordahn, John Willes et al.

ICLR 2025arXiv:2412.09477

citations

#5561

Accessing Vision Foundation Models via ImageNet-1K

Yitian Zhang, Xu Ma, Yue Bai et al.

ICLR 2025arXiv:2407.10366

citations

#5562

Circuit Transformer: A Transformer That Preserves Logical Equivalence

Xihan Li, Xing Li, Lei Chen et al.

ICLR 2025arXiv:2403.13838

citations

#5563

Cross-Modal and Uncertainty-Aware Agglomeration for Open-Vocabulary 3D Scene Understanding

Jinlong Li, Cristiano Saltori, Fabio Poiesi et al.

CVPR 2025arXiv:2503.16707

citations

#5564

KinMo: Kinematic-aware Human Motion Understanding and Generation

Pengfei Zhang, Pinxin Liu, Pablo Garrido et al.

ICCV 2025arXiv:2411.15472

citations

#5565

Homomorphism Counts as Structural Encodings for Graph Learning

Linus Bao, Emily Jin, Michael Bronstein et al.

ICLR 2025arXiv:2410.18676

citations

#5566

Near-Optimal Policy Identification in Robust Constrained Markov Decision Processes via Epigraph Form

Toshinori Kitamura, Tadashi Kozuno, Wataru Kumagai et al.

ICLR 2025arXiv:2408.16286

citations

#5567

Sharpness-Aware Minimization: General Analysis and Improved Rates

Dimitris Oikonomou, Nicolas Loizou

ICLR 2025arXiv:2503.02225

citations

#5568

WHAT MAKES MATH PROBLEMS HARD FOR REINFORCEMENT LEARNING: A CASE STUDY

Ali Shehper, Anibal Medina-Mardones, Lucas Fagan et al.

NEURIPS 2025arXiv:2408.15332

citations

#5569

Are LLMs Prescient? A Continuous Evaluation using Daily News as the Oracle

Hui Dai, Ryan Teehan, Mengye Ren

ICML 2025oralarXiv:2411.08324

citations

#5570

Understanding Adam Requires Better Rotation Dependent Assumptions

Tianyue Zhang, Lucas Maes, Alan Milligan et al.

NEURIPS 2025arXiv:2410.19964

citations

#5571

SoPo: Text-to-Motion Generation Using Semi-Online Preference Optimization

Xiaofeng Tan, Hongsong Wang, Xin Geng et al.

NEURIPS 2025arXiv:2412.05095

citations

#5572

From PEFT to DEFT: Parameter Efficient Finetuning for Reducing Activation Density in Transformers

Bharat Runwal, Tejaswini Pedapati, Pin-Yu Chen

AAAI 2025paperarXiv:2402.01911

citations

#5573

Risk and cross validation in ridge regression with correlated samples

Alexander Atanasov, Jacob A Zavatone-Veth, Cengiz Pehlevan

ICML 2025arXiv:2408.04607

citations

#5574

Data Unlearning in Diffusion Models

Silas Alberti, Kenan Hasanaliyev, Manav Shah et al.

ICLR 2025arXiv:2503.01034

citations

#5575

SUMO: Search-Based Uncertainty Estimation for Model-Based Offline Reinforcement Learning

Zhongjian Qiao, Jiafei Lyu, Kechen Jiao et al.

AAAI 2025paperarXiv:2408.12970

citations

#5576

Offline-to-Online Hyperparameter Transfer for Stochastic Bandits

Dravyansh Sharma, Arun Suggala

AAAI 2025paperarXiv:2501.02926

citations

#5577

4KAgent: Agentic Any Image to 4K Super-Resolution

Yushen Zuo, Qi Zheng, Mingyang Wu et al.

NEURIPS 2025arXiv:2507.07105

citations

#5578

Federated Domain Generalization with Data-free On-server Matching Gradient

Binh Nguyen, Minh-Duong Nguyen, Jinsun Park et al.

ICLR 2025arXiv:2501.14653

citations

#5579

Causally Reliable Concept Bottleneck Models

Giovanni De Felice, Arianna Casanova Flores, Francesco De Santis et al.

NEURIPS 2025arXiv:2503.04363

citations

#5580

RAD: Region-Aware Diffusion Models for Image Inpainting

Sora Kim, Sungho Suh, Minsik Lee

CVPR 2025arXiv:2412.09191

citations

#5581

NeSyC: A Neuro-symbolic Continual Learner For Complex Embodied Tasks in Open Domains

Wonje Choi, Jinwoo Park, Sanghyun Ahn et al.

ICLR 2025arXiv:2503.00870

citations

#5582

REvolve: Reward Evolution with Large Language Models using Human Feedback

RISHI HAZRA, Alkis Sygkounas, Andreas Persson et al.

ICLR 2025arXiv:2406.01309

citations

#5583

Directional Gradient Projection for Robust Fine-Tuning of Foundation Models

Chengyue Huang, Junjiao Tian, Brisa Maneechotesuwan et al.

ICLR 2025arXiv:2502.15895

citations

#5584

Approaching Rate-Distortion Limits in Neural Compression with Lattice Transform Coding

Eric Lei, Hamed Hassani, Shirin Saeedi Bidokhti

ICLR 2025arXiv:2403.07320

citations

#5585

CTSyn: A Foundation Model for Cross Tabular Data Generation

Xiaofeng Lin, Chenheng Xu, Matthew Yang et al.

ICLR 2025arXiv:2406.04619

citations

#5586

GC4NC: A Benchmark Framework for Graph Condensation on Node Classification with New Insights

Shengbo Gong, Juntong Ni, Noveen Sachdeva et al.

NEURIPS 2025arXiv:2406.16715

citations

#5587

CPathAgent: An Agent-based Foundation Model for Interpretable High-Resolution Pathology Image Analysis Mimicking Pathologists' Diagnostic Logic

YUXUAN SUN, Yixuan Si, Chenglu Zhu et al.

NEURIPS 2025arXiv:2505.20510

citations

#5588

Layerwise Recurrent Router for Mixture-of-Experts

Zihan Qiu, Zeyu Huang, Shuang Cheng et al.

ICLR 2025arXiv:2408.06793

citations

#5589

Face-Human-Bench: A Comprehensive Benchmark of Face and Human Understanding for Multi-modal Assistants

Lixiong Qin, Shilong Ou, Miaoxuan Zhang et al.

NEURIPS 2025arXiv:2501.01243

citations

#5590

Web-Shepherd: Advancing PRMs for Reinforcing Web Agents

Hyungjoo Chae, Seonghwan Kim, Junhee Cho et al.

NEURIPS 2025spotlightarXiv:2505.15277

citations

#5591

NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping

Tianyi Wang, Shuaicheng Niu, Harry Cheng et al.

ICCV 2025arXiv:2503.18678

citations

#5592

3D Student Splatting and Scooping

Jialin Zhu, Jiangbei Yue, Feixiang He et al.

CVPR 2025arXiv:2503.10148

citations

#5593

A General Framework for Producing Interpretable Semantic Text Embeddings

Yiqun Sun, Qiang Huang, Yixuan Tang et al.

ICLR 2025arXiv:2410.03435

citations

#5594

Open CaptchaWorld: A Comprehensive Web-based Platform for Testing and Benchmarking Multimodal LLM Agents

Yaxin Luo, Zhaoyi Li, Jiacheng Liu et al.

NEURIPS 2025arXiv:2505.24878

citations

#5595

SolidGeo: Measuring Multimodal Spatial Math Reasoning in Solid Geometry

Peijie Wang, Chao Yang, Zhong-Zhi Li et al.

NEURIPS 2025arXiv:2505.21177

citations

#5596

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

Yatai Ji, Shilong Zhang, Jie Wu et al.

ICLR 2025arXiv:2407.07577

citations

#5597

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

CVPR 2025highlightarXiv:2503.07026

citations

#5598

Adaptive Unimodal Regulation for Balanced Multimodal Information Acquisition

Chengxiang Huang, Yake Wei, Zequn Yang et al.

CVPR 2025arXiv:2503.18595

citations

#5599

A multiscale analysis of mean-field transformers in the moderate interaction regime

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

NEURIPS 2025oralarXiv:2509.25040

citations

#5600

Finite-Sample Analysis of Policy Evaluation for Robust Average Reward Reinforcement Learning

Yang Xu, Washim Mondal, Vaneet Aggarwal

NEURIPS 2025arXiv:2502.16816

citations

← Previous

1...26 27 28 29 30...112