Most Cited 2025 "parameterized environment configurations" Papers

AAAI 2025 paper arXiv:2406.11497

#1402

CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG

Boyi Deng, Wenjie Wang, Fengbin Zhu et al.

ICLR 2025 poster arXiv:2412.07775

#1403

Efficient Diversity-Preserving Diffusion Alignment via Gradient-Informed GFlowNets

Zhen Liu, Tim Xiao, Weiyang Liu et al.

ICLR 2025 poster arXiv:2405.20935

#1404

Effective Interplay between Sparsity and Quantization: From Theory to Practice

Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al.

CVPR 2025 poster arXiv:2405.12661

#1405

EmoEdit: Evoking Emotions through Image Manipulation

Jingyuan Yang, Jiawei Feng, Weibin Luo et al.

CVPR 2025 poster arXiv:2408.09859

#1406

OccMamba: Semantic Occupancy Prediction with State Space Models

Heng Li, Yuenan Hou, Xiaohan Xing et al.

NEURIPS 2025 poster arXiv:2506.10943

#1407

Self-Adapting Language Models

Adam Zweiger, Jyo Pari, Han Guo et al.

CVPR 2025 poster arXiv:2503.11423

#1408

TASTE-Rob: Advancing Video Generation of Task-Oriented Hand-Object Interaction for Generalizable Robotic Manipulation

Hongxiang Zhao, Xingchen Liu, Mutian Xu et al.

CVPR 2025 poster arXiv:2505.23766

#1409

Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought

Yunze Man, De-An Huang, Guilin Liu et al.

NEURIPS 2025 oral arXiv:2502.01506

#1410

TwinMarket: A Scalable Behavioral and Social Simulation for Financial Markets

Yuzhe YANG, Yifei Zhang, Minghao Wu et al.

ICCV 2025 poster arXiv:2504.01016

#1411

GeometryCrafter: Consistent Geometry Estimation for Open-world Videos with Diffusion Priors

Tian-Xing Xu, Xiangjun Gao, Wenbo Hu et al.

ICLR 2025 poster arXiv:2406.12034

#1412

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Junmo Kang, Leonid Karlinsky, Hongyin Luo et al.

ICLR 2025 poster arXiv:2502.10707

#1413

Reading Your Heart: Learning ECG Words and Sentences via Pre-training ECG Language Model

Jiarui Jin, Haoyu Wang, Hongyan Li et al.

ICLR 2025 poster arXiv:2406.03386

#1414

Learning Long Range Dependencies on Graphs via Random Walks

Dexiong Chen, Till Schulz, Karsten Borgwardt

AAAI 2025 paper arXiv:2407.20584

#1415

Pruning Large Language Models with Semi-Structural Adaptive Sparse Training

Weiyu Huang, Yuezhou Hu, Guohao Jian et al.

ICLR 2025 poster arXiv:2410.03878

#1416

SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Model

Yue Zhang, Zhiyang Xu, Ying Shen et al.

#1417

Influence-Guided Diffusion for Dataset Distillation

Mingyang Chen, Jiawei Du, Bo Huang et al.

ICLR 2025 poster arXiv:2410.23228

#1418

Emergence of meta-stable clustering in mean-field transformer models

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

ICLR 2025 poster arXiv:2410.07869

#1419

Benchmarking Agentic Workflow Generation

Shuofei Qiao, Runnan Fang, Zhisong Qiu et al.

AAAI 2025 paper arXiv:2408.11491

#1420

SCANS: Mitigating the Exaggerated Safety for LLMs via Safety-Conscious Activation Steering

Zouying Cao, Yifei Yang, Hai Zhao

ICLR 2025 poster arXiv:2411.09009

#1421

Cut Your Losses in Large-Vocabulary Language Models

Erik Wijmans, Brody Huval, Alexander Hertzberg et al.

CVPR 2025 poster arXiv:2411.17864

#1422

Generative Image Layer Decomposition with Visual Effects

Jinrui Yang, Qing Liu, Yijun Li et al.

AAAI 2025 paper arXiv:2409.02664

#1423

Standing on the Shoulders of Giants: Reprogramming Visual-Language Model for General Deepfake Detection

Kaiqing Lin, Yuzhen Lin, Weixiang Li et al.

ICLR 2025 poster arXiv:2409.15477

#1424

MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models

Mohammad Shahab Sepehri, Zalan Fabian, Maryam Soltanolkotabi et al.

CVPR 2025 poster arXiv:2406.10638

#1425

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly

Yexin Liu, Zhengyang Liang, Yueze Wang et al.

NEURIPS 2025 poster arXiv:2506.04210

#1426

Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models

Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy et al.

ICLR 2025 poster arXiv:2412.06394

#1427

GameArena: Evaluating LLM Reasoning through Live Computer Games

Lanxiang Hu, Qiyu Li, Anze Xie et al.

CVPR 2025 poster arXiv:2408.08665

#1428

QMambaBSR: Burst Image Super-Resolution with Query State Space Model

Xin Di, Long Peng, Peizhe Xia et al.

ICCV 2025 poster arXiv:2504.15485

#1429

CAPTURE: Evaluating Spatial Reasoning in Vision Language Models via Occluded Object Counting

Atin Pothiraj, Jaemin Cho, Elias Stengel-Eskin et al.

NEURIPS 2025 spotlight arXiv:2504.13161

#1430

Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Shizhe Diao, Yu Yang, Yonggan Fu et al.

CVPR 2025 highlight arXiv:2502.10794

#1431

Distraction is All You Need for Multimodal Large Language Model Jailbreaking

Zuopeng Yang, Jiluan Fan, Anli Yan et al.

AAAI 2025 paper arXiv:2412.17176

#1432

WPMixer: Efficient Multi-Resolution Mixing for Long-Term Time Series Forecasting

Md Mahmuddun Nabi Murad, Mehmet Aktukmak, Yasin Yilmaz

ICLR 2025 poster arXiv:2410.17637

#1433

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Ziyu Liu, Yuhang Zang, Xiaoyi Dong et al.

NEURIPS 2025 poster arXiv:2504.11543

#1434

REAL: Benchmarking Autonomous Agents on Deterministic Simulations of Real Websites

Div Garg, Diego Caples, Andis Draguns et al.

CVPR 2025 poster arXiv:2411.11505

#1435

LaVin-DiT: Large Vision Diffusion Transformer

Zhaoqing Wang, Xiaobo Xia, Runnan Chen et al.

NEURIPS 2025 poster arXiv:2505.16552

#1436

Think Silently, Think Fast: Dynamic Latent Compression of LLM Reasoning Chains

Wenhui Tan, Jiaze Li, Jianzhong Ju et al.

ICML 2025 poster arXiv:2412.04141

#1437

Reducing Tool Hallucination via Reliability Alignment

Hongshen Xu, Zichen Zhu, Lei Pan et al.

ICLR 2025 poster arXiv:2410.04223

#1438

Multimodal Large Language Models for Inverse Molecular Design with Retrosynthetic Planning

Gang Liu, Michael Sun, Wojciech Matusik et al.

CVPR 2025 poster arXiv:2503.11240

#1439

Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards

Zijing Hu, Fengda Zhang, Long Chen et al.

ICLR 2025 poster arXiv:2305.18512

#1440

A Rainbow in Deep Network Black Boxes

Florentin Guth, Brice Ménard, Gaspar Rochette et al.

CVPR 2025 poster arXiv:2504.09228

#1441

Learning Occlusion-Robust Vision Transformers for Real-Time UAV Tracking

You Wu, Xucheng Wang, Xiangyang Yang et al.

NEURIPS 2025 poster arXiv:2505.20347

#1442

SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data

Wenkai Fang, Shunyu Liu, Yang Zhou et al.

CVPR 2025 highlight arXiv:2503.08257

#1443

DexGrasp Anything: Towards Universal Robotic Dexterous Grasping with Physics Awareness

Yiming Zhong, Qi Jiang, Jingyi Yu et al.

CVPR 2025 poster arXiv:2412.19326

#1444

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

ziang yan, Zhilin Li, Yinan He et al.

ICLR 2025 poster arXiv:2405.15429

#1445

E(n) Equivariant Topological Neural Networks

Claudio Battiloro, Ege Karaismailoglu, Mauricio Tec et al.

AAAI 2025 paper arXiv:2311.14768

#1446

AdaDiff: Adaptive Step Selection for Fast Diffusion Models

Hui Zhang, Zuxuan Wu, Zhen Xing et al.

NEURIPS 2025 oral arXiv:2506.03719

#1447

On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity

Quentin Bertrand, Anne Gagneux, Mathurin Massias et al.

#1448

Occlusion-Embedded Hybrid Transformer for Light Field Super-Resolution

Zeyu Xiao, Zhuoyuan Li, Wei Jia

AAAI 2025 paper

CVPR 2025 highlight arXiv:2412.03240

#1449

Task-driven Image Fusion with Learnable Fusion Loss

Haowen Bai, Jiangshe Zhang, Zixiang Zhao et al.

#1450

Efficiently Scaling LLM Reasoning Programs with Certaindex

Yichao Fu, Junda Chen, Siqi Zhu et al.

ICLR 2025 poster arXiv:2501.12735

#1451

Online Preference Alignment for Language Models via Count-based Exploration

Chenjia Bai, Yang Zhang, Shuang Qiu et al.

ICLR 2025 poster arXiv:2410.03765

#1452

Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression

Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin et al.

AAAI 2025 paper arXiv:2403.15249

#1453

Spectral Motion Alignment for Video Motion Transfer Using Diffusion Models

Geon Yeong Park, Hyeonho Jeong, Sang Wan Lee et al.

ICML 2025 poster arXiv:2410.10347

#1454

A Unified Approach to Routing and Cascading for LLMs

Jasper Dekoninck, Maximilian Baader, Martin Vechev

CVPR 2025 poster arXiv:2411.18625

#1455

Textured Gaussians for Enhanced 3D Scene Appearance Modeling

Brian Chao, Hung-Yu Tseng, Lorenzo Porzi et al.

ICLR 2025 poster arXiv:2410.24206

#1456

Understanding Optimization in Deep Learning with Central Flows

Jeremy Cohen, Alex Damian, Ameet Talwalkar et al.

ICCV 2025 poster arXiv:2503.16421

#1457

MagicMotion: Controllable Video Generation with Dense-to-Sparse Trajectory Guidance

Quanhao Li, Zhen Xing, Rui Wang et al.

NEURIPS 2025 poster arXiv:2503.21322

#1458

HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation

Haoran Luo, Haihong E, Guanting Chen et al.

ICML 2025 poster arXiv:2407.03310

#1459

Universal Length Generalization with Turing Programs

Kaiying Hou, David Brandfonbrener, Sham Kakade et al.

ICLR 2025 poster arXiv:2410.19803

#1460

First-Person Fairness in Chatbots

Tyna Eloundou, Alex Beutel, David Robinson et al.

NEURIPS 2025 poster arXiv:2502.20313

#1461

FlexVAR: Flexible Visual Autoregressive Modeling without Residual Prediction

Siyu Jiao, Gengwei Zhang, Yinlong Qian et al.

ICLR 2025 poster arXiv:2409.19151

#1462

Can LLMs Really Learn to Translate a Low-Resource Language from One Grammar Book?

Seth Aycock, David Stap, Di Wu et al.

ICLR 2025 poster arXiv:2504.12532

#1463

Generalization through variance: how noise shapes inductive biases in diffusion models

John Vastola

CVPR 2025 highlight arXiv:2412.04462

#1464

4Real-Video: Learning Generalizable Photo-Realistic 4D Video Diffusion

Chaoyang Wang, Peiye Zhuang, Tuan Duc Ngo et al.

NEURIPS 2025 poster arXiv:2505.16394

#1465

Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)

Zhenjie Yang, Xiaosong Jia, Qifeng Li et al.

CVPR 2025 poster arXiv:2412.06647

#1466

Object Detection using Event Camera: A MoE Heat Conduction based Detector and A New Benchmark Dataset

Xiao Wang, Yu Jin, Wentao Wu et al.

CVPR 2025 highlight arXiv:2412.04458

#1467

Cubify Anything: Scaling Indoor 3D Object Detection

Justin Lazarow, David Griffiths, Gefen Kohavi et al.

NEURIPS 2025 oral arXiv:2503.13139

#1468

Logic-in-Frames: Dynamic Keyframe Search via Visual Semantic-Logical Verification for Long Video Understanding

Weiyu Guo, Ziyang Chen, Shaoguang WANG et al.

ICLR 2025 oral arXiv:2404.02148

#1469

Diffusion$^2$: Dynamic 3D Content Generation via Score Composition of Video and Multi-view Diffusion Models

Zeyu Yang, Zijie Pan, Chun Gu et al.

CVPR 2025 poster arXiv:2412.01615

#1470

OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Xuanyu Zhang, Zecheng Tang, Zhipei Xu et al.

ICLR 2025 oral arXiv:2410.01153

#1471

Text2PDE: Latent Diffusion Models for Accessible Physics Simulation

Anthony Zhou, Zijie Li, Michael Schneier et al.

ICML 2025 oral arXiv:2503.07067

#1472

DistiLLM-2: A Contrastive Approach Boosts the Distillation of LLMs

Jongwoo Ko, Tianyi Chen, Sungnyun Kim et al.

ICLR 2025 poster arXiv:2403.10380

#1473

BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics

Lukas Rauch, Raphael Schwinger, Moritz Wirth et al.

ICLR 2025 poster arXiv:2501.13381

#1474

Do as We Do, Not as You Think: the Conformity of Large Language Models

Zhiyuan Weng, Guikun Chen, Wenguan Wang

ICLR 2025 poster arXiv:2410.17195

#1475

Non-myopic Generation of Language Models for Reasoning and Planning

Chang Ma, Haiteng Zhao, Junlei Zhang et al.

ICML 2025 poster arXiv:2502.17298

#1476

Delta Decompression for MoE-based LLMs Compression

Hao Gu, Wei Li, Lujun Li et al.

NEURIPS 2025 poster arXiv:2505.13031

#1477

MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO

Yicheng Xiao, Lin Song, Yukang Chen et al.

NEURIPS 2025 oral arXiv:2503.14345

#1478

MoonCast: High-Quality Zero-Shot Podcast Generation

Zeqian Ju, Dongchao Yang, Shen Kai et al.

CVPR 2025 poster arXiv:2503.14830

#1479

Decompositional Neural Scene Reconstruction with Generative Diffusion Prior

Junfeng Ni, Yu Liu, Ruijie Lu et al.

CVPR 2025 poster arXiv:2503.16591

#1480

UniK3D: Universal Camera Monocular 3D Estimation

Luigi Piccinelli, Christos Sakaridis, Mattia Segu et al.

CVPR 2025 poster arXiv:2412.18108

#1481

Unveiling Visual Perception in Language Models: An Attention Head Analysis Approach

Jing Bi, Lianggong Bruce Wen, Zhang Liu et al.

AAAI 2025 paper arXiv:2407.19323

#1482

MSP-MVS: Multi-Granularity Segmentation Prior Guided Multi-View Stereo

Zhenlong Yuan, Cong Liu, Fei Shen et al.

#1483

Discretization-invariance? On the Discretization Mismatch Errors in Neural Operators

Wenhan Gao, Ruichen Xu, Yuefan Deng et al.

ICLR 2025 poster arXiv:2407.01163

#1484

Benchmarking Predictive Coding Networks -- Made Simple

Luca Pinchetti, Chang Qi, Oleh Lokshyn et al.

ICCV 2025 highlight arXiv:2502.06788

#1485

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

Haiwen Diao, Xiaotong Li, Yufeng Cui et al.

CVPR 2025 poster arXiv:2412.07776

#1486

Video Motion Transfer with Diffusion Transformers

Alexander Pondaven, Aliaksandr Siarohin, Sergey Tulyakov et al.

NEURIPS 2025 poster arXiv:2505.12448

#1487

SSR: Enhancing Depth Perception in Vision-Language Models via Rationale-Guided Spatial Reasoning

Yang Liu, Ming Ma, Xiaomin Yu et al.

NEURIPS 2025 poster arXiv:2210.14051

#1488

Bridging Distributional and Risk-sensitive Reinforcement Learning with Provable Regret Bounds

Hao Liang, Zhiquan Luo

CVPR 2025 poster arXiv:2504.14666

#1489

Generative Multimodal Pretraining with Discrete Diffusion Timestep Tokens

Kaihang Pan, Wang Lin, Zhongqi Yue et al.

ICCV 2025 poster arXiv:2412.09573

#1490

FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction

Jiale Xu, Shenghua Gao, Ying Shan

ICLR 2025 poster arXiv:2412.01769

#1491

Commit0: Library Generation from Scratch

Wenting Zhao, Nan Jiang, Celine Lee et al.

NEURIPS 2025 spotlight arXiv:2505.23433

#1492

Diversity-Aware Policy Optimization for Large Language Model Reasoning

Jian Yao, Ran Cheng, Xingyu Wu et al.

ICLR 2025 poster arXiv:2410.02486

#1493

Encryption-Friendly LLM Architecture

Donghwan Rho, Taeseong Kim, Minje Park et al.

#1494

Boosting Neural Combinatorial Optimization for Large-Scale Vehicle Routing Problems

Fu Luo, Xi Lin, Yaoxin Wu et al.

ICLR 2025 poster arXiv:2412.12098

#1495

MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Bhavya, Stelian Coros, Andreas Krause et al.

ICLR 2025 poster arXiv:2406.10630

#1496

Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models

Rui Ye, Jingyi Chai, Xiangrui Liu et al.

AAAI 2025 paper arXiv:2408.14909

#1497

SpikingSSMs: Learning Long Sequences with Sparse and Parallel Spiking State Space Models

Shuaijie Shen, Chao Wang, Renzhuo Huang et al.

#1498

VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs

Qiucheng Wu, Handong Zhao, Michael Saxon et al.

ICCV 2025 poster

ICLR 2025 poster arXiv:2502.21212

#1499

Transformers Learn to Implement Multi-step Gradient Descent with Chain of Thought

Jianhao Huang, Zixuan Wang, Jason Lee

NEURIPS 2025 poster arXiv:2502.14819

#1500

Learning from Reward-Free Offline Data: A Case for Planning with Latent Dynamics Models

Uladzislau Sobal, Wancong Zhang, Kyunghyun Cho et al.

ICLR 2025 poster arXiv:2412.07961

#1501

Forking Paths in Neural Text Generation

Eric Bigelow, Ari Holtzman, Hidenori Tanaka et al.

AAAI 2025 paper arXiv:2304.11787

#1502

B2Opt: Learning to Optimize Black-box Optimization with Little Budget

Xiaobin Li, Kai Wu, Xiaoyu Zhang et al.

ICLR 2025 poster arXiv:2410.02479

#1503

Cross-Embodiment Dexterous Grasping with Reinforcement Learning

Haoqi Yuan, Bohan Zhou, Yuhui Fu et al.

CVPR 2025 poster arXiv:2503.20823

#1504

Playing the Fool: Jailbreaking LLMs and Multimodal LLMs with Out-of-Distribution Strategy

Joonhyun Jeong, Seyun Bae, Yeonsung Jung et al.

CVPR 2025 poster arXiv:2503.01610

#1505

Vid2Avatar-Pro: Authentic Avatar from Videos in the Wild via Universal Prior

Chen Guo, Junxuan Li, Yash Kant et al.

ICLR 2025 poster arXiv:2407.17773

#1506

KiVA: Kid-inspired Visual Analogies for Testing Large Multimodal Models

Eunice Yiu, Maan Qraitem, Anisa Majhi et al.

AAAI 2025 paper arXiv:2411.12877

#1507

The Illusion of Empathy: How AI Chatbots Shape Conversation Perception

Tingting Liu, Salvatore Giorgi, Ankit Aich et al.

AAAI 2025 paper arXiv:2502.13308

#1508

A Label-free Heterophily-guided Approach for Unsupervised Graph Fraud Detection

Junjun Pan, Yixin Liu, Xin Zheng et al.

ICLR 2025 poster arXiv:2409.11406

#1509

Phidias: A Generative Model for Creating 3D Content from Text, Image, and 3D Conditions with Reference-Augmented Diffusion

zhenwei Wang, Tengfei Wang, Zexin He et al.

ICLR 2025 poster arXiv:2411.16525

#1510

Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency

Jerry Yao-Chieh Hu, Wei-Po Wang, Ammar Gilani et al.

ICLR 2025 poster arXiv:2410.07164

#1511

AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation

Yukang Cao, Liang Pan, Kai Han et al.

AAAI 2025 paper arXiv:2501.04628

#1512

FatesGS: Fast and Accurate Sparse-View Surface Reconstruction Using Gaussian Splatting with Depth-Feature Consistency

Han Huang, Yulun Wu, Chao Deng et al.

ICLR 2025 oral arXiv:2410.03024

#1513

Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting

Marcel Kollovieh, Marten Lienen, David Lüdke et al.

ICLR 2025 poster arXiv:2411.00418

#1514

Self-Evolved Reward Learning for LLMs

Chenghua Huang, Zhizhen Fan, Lu Wang et al.

#1515

Merging on the Fly Without Retraining: A Sequential Approach to Scalable Continual Model Merging

Anke Tang, Enneng Yang, Li Shen et al.

CVPR 2025 highlight arXiv:2412.09586

#1516

Gaze-LLE: Gaze Target Estimation via Large-Scale Learned Encoders

Fiona Ryan, Ajay Bati, Sangmin Lee et al.

ICLR 2025 poster arXiv:2405.01033

#1517

CrossMPT: Cross-attention Message-passing Transformer for Error Correcting Codes

Seong-Joon Park, Hee-Youl Kwak, Sang-Hyo Kim et al.

ICLR 2025 poster arXiv:2501.15140

#1518

Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models

Hulingxiao He, Geng Li, Zijun Geng et al.

#1519

VLM-R³: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought

Chaoya Jiang, Yongrui Heng, Wei Ye et al.

CVPR 2025 poster arXiv:2412.00905

#1520

Ref-GS: Directional Factorization for 2D Gaussian Splatting

Youjia Zhang, Anpei Chen, Yumin Wan et al.

NEURIPS 2025 spotlight arXiv:2503.04412

#1521

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Yuichi Inoue, Kou Misaki, Yuki Imajuku et al.

ICLR 2025 poster arXiv:2409.19913

#1522

Scaling Optimal LR Across Token Horizons

Johan Bjorck, Alon Benhaim, Vishrav Chaudhary et al.

AAAI 2025 paper arXiv:2403.10045

#1523

Towards Adversarially Robust Dataset Distillation by Curvature Regularization

Eric Xue, Yijiang Li, Haoyang Liu et al.

ICLR 2025 poster arXiv:2408.14774

#1524

Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning

Simran Kaur, Simon Park, Anirudh Goyal et al.

NEURIPS 2025 poster arXiv:2502.04780

#1525

SiriuS: Self-improving Multi-agent Systems via Bootstrapped Reasoning

Wanjia Zhao, Mert Yuksekgonul, Shirley Wu et al.

CVPR 2025 poster arXiv:2412.08614

#1526

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning

Fan Lu, Wei Wu, Kecheng Zheng et al.

ICLR 2025 poster arXiv:2403.10444

#1527

Block Verification Accelerates Speculative Decoding

Ziteng Sun, Uri Mendlovic, Yaniv Leviathan et al.

CVPR 2025 poster arXiv:2503.11629

#1528

TreeMeshGPT: Artistic Mesh Generation with Autoregressive Tree Sequencing

Stefan Lionar, Jiabin Liang, Gim Hee Lee

#1529

TIME-FS: Joint Learning of Tensorial Incomplete Multi-View Unsupervised Feature Selection and Missing-View Imputation

Yanyong Huang, Minghui Lu, Wei Huang et al.

AAAI 2025 paper

ICLR 2025 poster arXiv:2407.19451

#1530

Perm: A Parametric Representation for Multi-Style 3D Hair Modeling

Chengan He, Xin Sun, Zhixin Shu et al.

NEURIPS 2025 spotlight arXiv:2506.14866

#1531

OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents

Thomas Kuntz, Agatha Duzan, Hao Zhao et al.

ICLR 2025 poster arXiv:2407.05013

#1532

Progress or Regress? Self-Improvement Reversal in Post-training

Ting Wu, Xuefeng Li, Pengfei Liu

ICLR 2025 poster arXiv:2405.15506

#1533

Learning to Discretize Denoising Diffusion ODEs

Vinh Tong, Trung-Dung Hoang, Anji Liu et al.

CVPR 2025 poster arXiv:2411.19654

#1534

TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting

Bojun Xiong, Jialun Liu, JiaKui Hu et al.

ICLR 2025 poster arXiv:2402.02746

#1535

Standard Gaussian Process is All You Need for High-Dimensional Bayesian Optimization

Zhitong Xu, Haitao Wang, Jeff Phillips et al.

ICLR 2025 poster arXiv:2403.12931

#1536

You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs

Yihong Luo, Xiaolong Chen, Xinghua Qu et al.

ICML 2025 poster arXiv:2411.16375

#1537

Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing

Kaifeng Gao, Jiaxin Shi, Hanwang Zhang et al.

CVPR 2025 poster arXiv:2503.24026

#1538

HumanDreamer: Generating Controllable Human-Motion Videos via Decoupled Generation

Boyuan Wang, Xiaofeng Wang, Chaojun Ni et al.

CVPR 2025 poster arXiv:2411.18499

#1539

OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation

Pengfei Zhou, Xiaopeng Peng, Jiajun Song et al.

CVPR 2025 poster arXiv:2404.06091

#1540

Hash3D: Training-free Acceleration for 3D Generation

Xingyi Yang, Songhua Liu, Xinchao Wang

AAAI 2025 paper arXiv:2501.02268

#1541

What Kind of Visual Tokens Do We Need? Training-Free Visual Token Pruning for Multi-Modal Large Language Models from the Perspective of Graph

Yutao Jiang, Qiong Wu, Wenhao Lin et al.

NEURIPS 2025 oral arXiv:2505.21334

#1542

HoliTom: Holistic Token Merging for Fast Video Large Language Models

Kele Shao, Keda TAO, Can Qin et al.

ICLR 2025 oral arXiv:2410.19892

#1543

Air Quality Prediction with Physics-Guided Dual Neural ODEs in Open Systems

jindong tian, Yuxuan Liang, Ronghui Xu et al.

NEURIPS 2025 poster arXiv:2506.13691

#1544

UltraVideo: High-Quality UHD Video Dataset with Comprehensive Captions

Xue zhucun, Jiangning Zhang, Teng Hu et al.

AAAI 2025 paper arXiv:2412.12628

#1545

Dense Audio-Visual Event Localization Under Cross-Modal Consistency and Multi-Temporal Granularity Collaboration

Ziheng Zhou, Jinxing Zhou, Wei Qian et al.

ICLR 2025 poster arXiv:2405.01155

#1546

SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints

Miruna Cretu, Charles Harris, Ilia Igashov et al.

NEURIPS 2025 poster arXiv:2506.10967

#1547

Beyond Attention or Similarity: Maximizing Conditional Diversity for Token Pruning in MLLMs

Qizhe Zhang, Mengzhen Liu, Lichen Li et al.

ICLR 2025 poster arXiv:2405.17537

#1548

CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale

ZeMing Gong, Austin Wang, Xiaoliang Huo et al.

ICLR 2025 poster arXiv:2405.18065

#1549

EffoVPR: Effective Foundation Model Utilization for Visual Place Recognition

Issar Tzachor, Boaz Lerner, Matan Levy et al.

NEURIPS 2025 poster arXiv:2505.11049

#1550

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

Yue Liu, Shengfang Zhai, Mingzhe Du et al.

ICLR 2025 poster arXiv:2503.13881

#1551

MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation

Donggon Jang, Yucheol Cho, Suin Lee et al.

CVPR 2025 poster arXiv:2412.06143

#1552

Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters

Yuan Wang, Ouxiang Li, Tingting Mu et al.

ICCV 2025 poster arXiv:2503.06132

#1553

USP: Unified Self-Supervised Pretraining for Image Generation and Understanding

Xiangxiang Chu, Renda Li, Yong Wang

CVPR 2025 poster arXiv:2412.00833

#1554

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

Yan Li, Yifei Xing, Xiangyuan Lan et al.

ICLR 2025 poster arXiv:2405.16821

#1555

Perturbation-Restrained Sequential Model Editing

Jun-Yu Ma, Hong Wang, Hao-Xiang Xu et al.

ICLR 2025 poster arXiv:2409.15771

#1556

Zero-shot forecasting of chaotic systems

Yuanzhao Zhang, William Gilpin

#1557

VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model

Zuwei Long, Yunhang Shen, Chaoyou Fu et al.

ICLR 2025 poster arXiv:2503.22166

#1558

Reasoning of Large Language Models over Knowledge Graphs with Super-Relations

Song Wang, Junhong Lin, Xiaojie Guo et al.

ICLR 2025 poster arXiv:2410.06672

#1559

Towards Universality: Studying Mechanistic Similarity Across Language Model Architectures

Junxuan Wang, Xuyang Ge, Wentao Shu et al.

ICLR 2025 poster arXiv:2405.18654

#1560

Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment

Pritam Sarkar, Sayna Ebrahimi, Ali Etemad et al.

ICLR 2025 poster arXiv:2406.15523

#1561

Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

Yili Wang, Yixin Liu, Xu Shen et al.

CVPR 2025 poster arXiv:2406.14235

#1562

Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation

Jiaming Zhou, Teli Ma, Kun-Yu Lin et al.

ICLR 2025 poster arXiv:2502.06501

#1563

Learning Clustering-based Prototypes for Compositional Zero-Shot Learning

Hongyu Qu, Jianan Wei, Xiangbo Shu et al.

ICCV 2025 poster arXiv:2406.04875

#1564

3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

Xiaobiao Du, Yida Wang, Haiyang Sun et al.

ICLR 2025 oral arXiv:2405.17890

#1565

SLMRec: Distilling Large Language Models into Small for Sequential Recommendation

Wujiang Xu, Qitian Wu, Zujie Liang et al.

ICLR 2025 poster arXiv:2411.07404

#1566

Controllable Context Sensitivity and the Knob Behind It

Julian Minder, Kevin Du, Niklas Stoehr et al.

CVPR 2025 poster arXiv:2412.09606

#1567

Feat2GS: Probing Visual Foundation Models with Gaussian Splatting

Yue Chen, Xingyu Chen, Anpei Chen et al.

CVPR 2025 poster arXiv:2311.01479

#1568

Detecting Out-of-Distribution Through the Lens of Neural Collapse

Litian Liu, Yao Qin

ICLR 2025 poster arXiv:2503.10728

#1569

DarkBench: Benchmarking Dark Patterns in Large Language Models

Esben Kran, Hieu Minh Nguyen, Akash Kundu et al.

NEURIPS 2025 poster arXiv:2505.05495

#1570

Learning 3D Persistent Embodied World Models

Siyuan Zhou, Yilun Du, Yuncong Yang et al.

NEURIPS 2025 poster arXiv:2506.01946

#1571

MLLMs Need 3D-Aware Representation Supervision for Scene Understanding

Xiaohu Huang, Jingjing Wu, Qunyi Xie et al.

ICLR 2025 poster arXiv:2405.14260

#1572

Graph Sparsification via Mixture of Graphs

Guibin Zhang, Xiangguo SUN, Yanwei Yue et al.

ICLR 2025 poster arXiv:2410.09846

#1573

A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning

Chen-Yu Liu, Chao-Han Huck Yang, Hsi-Sheng Goan et al.

ICLR 2025 poster arXiv:2507.23143

#1574

X-NeMo: Expressive Neural Motion Reenactment via Disentangled Latent Attention

XiaoChen Zhao, Hongyi Xu, Guoxian Song et al.

ICLR 2025 poster arXiv:2412.01786

#1575

Gradient-Free Generation for Hard-Constrained Systems

Chaoran Cheng, Boran Han, Danielle Maddix et al.

ICCV 2025 poster arXiv:2503.23368

#1576

VLIPP: Towards Physically Plausible Video Generation with Vision and Language Informed Physical Prior

Xindi Yang, Baolu Li, Yiming Zhang et al.

ICLR 2025 oral arXiv:2410.23277

#1577

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

Yining Hong, Beide Liu, Maxine Wu et al.

CVPR 2025 highlight arXiv:2501.02973

#1578

HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos

Jinglei Zhang, Jiankang Deng, Chao Ma et al.

ICLR 2025 poster arXiv:2410.02381

#1579

MetaMetrics: Calibrating Metrics for Generation Tasks Using Human Preferences

Genta Winata, David Anugraha, Lucky Susanto et al.

AAAI 2025 paper arXiv:2409.02914

#1580

Can LVLMs Obtain a Driver’s License? A Benchmark Towards Reliable AGI for Autonomous Driving

Yuhang Lu, Yichen Yao, Jiadong Tu et al.

CVPR 2025 highlight arXiv:2504.03193

#1581

Mamba as a Bridge: Where Vision Foundation Models Meet Vision Language Models for Domain-Generalized Semantic Segmentation

Xin Zhang, Robby T. Tan

ICCV 2025 poster arXiv:2412.14042

#1582

CAD-Recode: Reverse Engineering CAD Code from Point Clouds

Danila Rukhovich, Elona Dupont, Dimitrios Mallis et al.

AAAI 2025 paper arXiv:2408.11330

#1583

Design Principle Transfer in Neural Architecture Search via Large Language Models

Xun Zhou, Xingyu Wu, Liang Feng et al.

#1584

Optimal Transport for Time Series Imputation

Hao Wang, zhengnan li, Haoxuan Li et al.

ICLR 2025 oral

CVPR 2025 poster arXiv:2503.22420

#1585

Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis

Jiangyong Huang, Baoxiong Jia, Yan Wang et al.

ICCV 2025 poster arXiv:2411.14961

#1586

LoRA-FAIR: Federated LoRA Fine-Tuning with Aggregation and Initialization Refinement

Jieming Bian, Lei Wang, Letian Zhang et al.

ICCV 2025 poster arXiv:2503.16430

#1587

Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation

Yuqing Wang, Zhijie Lin, Yao Teng et al.

ICLR 2025 poster arXiv:2402.10727

#1588

From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation

Nikita Kotelevskii, Vladimir Kondratyev, Martin Takáč et al.

CVPR 2025 poster arXiv:2412.04037

#1589

INFP: Audio-Driven Interactive Head Generation in Dyadic Conversations

Yongming Zhu, Longhao Zhang, Zhengkun Rong et al.

AAAI 2025 paper arXiv:2408.08931

#1590

Personalized Federated Collaborative Filtering: A Variational AutoEncoder Approach

Zhiwei Li, Guodong Long, Tianyi Zhou et al.

ICCV 2025 poster arXiv:2408.10123

#1591

Learning Precise Affordances from Egocentric Videos for Robotic Manipulation

Li, Nikolaos Tsagkas, Jifei Song et al.

ICLR 2025 poster arXiv:2407.16936

#1592

Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling

Wei Guo, Molei Tao, Yongxin Chen

ICLR 2025 poster arXiv:2402.02392

#1593

DeLLMa: Decision Making Under Uncertainty with Large Language Models

Ollie Liu, Deqing Fu, Dani Yogatama et al.

ICLR 2025 poster arXiv:2403.13164

#1594

VL-ICL Bench: The Devil in the Details of Multimodal In-Context Learning

Yongshuo Zong, Ondrej Bohdal, Timothy Hospedales

ICLR 2025 poster arXiv:2410.02098

#1595

EC-DIT: Scaling Diffusion Transformers with Adaptive Expert-Choice Routing

Haotian Sun, Tao Lei, Bowen Zhang et al.

ICLR 2025 poster arXiv:2410.04203

#1596

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

Hanyang Zhao, Genta Winata, Anirban Das et al.

ICLR 2025 poster arXiv:2405.16674

#1597

Limits of Deep Learning: Sequence Modeling through the Lens of Complexity Theory

Nikola Zubic, Federico Soldà, Aurelio Sulser et al.

CVPR 2025 poster arXiv:2412.01243

#1598

Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation

Zilyu Ye, Zhiyang Chen, Tiancheng Li et al.

ICLR 2025 poster arXiv:2410.18325

#1599

AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

Kim Sung-Bin, Oh Hyun-Bin, Lee Jung-Mok et al.

#1600

u-$\mu$P: The Unit-Scaled Maximal Update Parametrization

Charles Blake, Constantin Eichenberg, Josef Dean et al.