Most Cited 2025 "consistency model distillation" Papers

22,274 papers found • Page 8 of 112

#1401

The Curse of Depth in Large Language Models

Wenfang Sun, Xinyuan Song, Pengxiang Li et al.

NEURIPS 2025arXiv:2502.05795

citations

#1402

Sparse autoencoders reveal selective remapping of visual concepts during adaptation

Hyesu Lim, Jinho Choi, Jaegul Choo et al.

ICLR 2025arXiv:2412.05276

citations

#1403

Spike No More: Stabilizing the Pre-training of Large Language Models

Sho Takase, Shun Kiyono, Sosuke Kobayashi et al.

COLM 2025paperarXiv:2312.16903

citations

#1404

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Peng Xia, Siwei Han, Shi Qiu et al.

ICLR 2025arXiv:2410.10139

citations

#1405

OptiBench Meets ReSocratic: Measure and Improve LLMs for Optimization Modeling

Zhicheng YANG, Yiwei Wang, Yinya Huang et al.

ICLR 2025arXiv:2407.09887

citations

#1406

Overtrained Language Models Are Harder to Fine-Tune

Jacob Mitchell Springer, Sachin Goyal, Kaiyue Wen et al.

ICML 2025arXiv:2503.19206

citations

#1407

Feed-Forward Bullet-Time Reconstruction of Dynamic Scenes from Monocular Videos

hanxue liang, Jiawei Ren, Ashkan Mirzaei et al.

NEURIPS 2025arXiv:2412.03526

citations

#1408

IRGS: Inter-Reflective Gaussian Splatting with 2D Gaussian Ray Tracing

Chun Gu, Xiaofei Wei, Zixuan Zeng et al.

CVPR 2025arXiv:2412.15867

citations

#1409

Steer LLM Latents for Hallucination Detection

Seongheon Park, Xuefeng Du, Min-Hsuan Yeh et al.

ICML 2025arXiv:2503.01917

citations

#1410

World4Drive: End-to-End Autonomous Driving via Intention-aware Physical Latent World Model

Yupeng Zheng, Pengxuan Yang, Zebin Xing et al.

ICCV 2025arXiv:2507.00603

citations

#1411

Values in the Wild: Discovering and Mapping Values in Real-World Language Model Interactions

Saffron Huang, Esin DURMUS, Kunal Handa et al.

COLM 2025paper

citations

#1412

Forgetting Transformer: Softmax Attention with a Forget Gate

Zhixuan Lin, Evgenii Nikishin, Xu He et al.

ICLR 2025arXiv:2503.02130

citations

#1413

VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation

Hanzhi Chen, Boyang Sun, Anran Zhang et al.

CVPR 2025arXiv:2503.07135

citations

#1414

Multi-Agent Verification: Scaling Test-Time Compute with Multiple Verifiers

Shalev Lifshitz, Sheila A. McIlraith, Yilun Du

COLM 2025paperarXiv:2502.20379

citations

#1415

VideoGLaMM: A Large Multimodal Model for Pixel-Level Visual Grounding in Videos

Shehan Munasinghe, Hanan Gani, Wenqi Zhu et al.

CVPR 2025arXiv:2411.04923

citations

#1416

G-Memory: Tracing Hierarchical Memory for Multi-Agent Systems

Guibin Zhang, Muxin Fu, Kun Wang et al.

NEURIPS 2025spotlightarXiv:2506.07398

citations

#1417

CBQ: Cross-Block Quantization for Large Language Models

Xin Ding, Xiaoyu Liu, Zhijun Tu et al.

ICLR 2025arXiv:2312.07950

citations

#1418

VRAG-RL: Empower Vision-Perception-Based RAG for Visually Rich Information Understanding via Iterative Reasoning with Reinforcement Learning

Qiuchen Wang, Ruixue Ding, Yu Zeng et al.

NEURIPS 2025arXiv:2505.22019

citations

#1419

PivotMesh: Generic 3D Mesh Generation via Pivot Vertices Guidance

Haohan Weng, Yikai Wang, Tong Zhang et al.

ICLR 2025arXiv:2405.16890

citations

#1420

Exploring Intrinsic Normal Prototypes within a Single Image for Universal Anomaly Detection

Wei Luo, Yunkang Cao, Haiming Yao et al.

CVPR 2025arXiv:2503.02424

citations

#1421

VCA: Video Curious Agent for Long Video Understanding

Zeyuan Yang, Delin Chen, Xueyang Yu et al.

ICCV 2025arXiv:2412.10471

citations

#1422

From Tokens to Words: On the Inner Lexicon of LLMs

Guy Kaplan, Matanel Oren, Yuval Reif et al.

ICLR 2025arXiv:2410.05864

citations

#1423

Language Model Alignment in Multilingual Trolley Problems

Zhijing Jin, Max Kleiman-Weiner, Giorgio Piatti et al.

ICLR 2025oralarXiv:2407.02273

citations

#1424

McEval: Massively Multilingual Code Evaluation

Linzheng Chai, Shukai Liu, Jian Yang et al.

ICLR 2025arXiv:2406.07436

citations

#1425

PartCrafter: Structured 3D Mesh Generation via Compositional Latent Diffusion Transformers

Yuchen Lin, Chenguo Lin, Panwang Pan et al.

NEURIPS 2025arXiv:2506.05573

citations

#1426

TempMe: Video Temporal Token Merging for Efficient Text-Video Retrieval

Leqi Shen, Tianxiang Hao, Tao He et al.

ICLR 2025oralarXiv:2409.01156

citations

#1427

STAMP: Scalable Task- And Model-agnostic Collaborative Perception

Xiangbo Gao, Runsheng Xu, Jiachen Li et al.

ICLR 2025arXiv:2501.18616

citations

#1428

CodeHalu: Investigating Code Hallucinations in LLMs via Execution-based Verification

Yuchen Tian, Weixiang Yan, Qian Yang et al.

AAAI 2025paperarXiv:2405.00253

citations

#1429

Unlocking Multimodal Mathematical Reasoning via Process Reward Model

Ruilin Luo, Zhuofan Zheng, Lei Wang et al.

NEURIPS 2025arXiv:2501.04686

citations

#1430

CViT: Continuous Vision Transformer for Operator Learning

Sifan Wang, Jacob Seidman, Shyam Sankaran et al.

ICLR 2025oralarXiv:2405.13998

citations

#1431

Best-of-N Jailbreaking

John Hughes, Sara Price, Aengus Lynch et al.

NEURIPS 2025arXiv:2412.03556

citations

#1432

PhysGen3D: Crafting a Miniature Interactive World from a Single Image

Boyuan Chen, Hanxiao Jiang, Shaowei Liu et al.

CVPR 2025arXiv:2503.20746

citations

#1433

How to build a consistency model: Learning flow maps via self-distillation

Nicholas Boffi, Michael Albergo, Eric Vanden-Eijnden

NEURIPS 2025arXiv:2505.18825

citations

#1434

GPUDrive: Data-driven, multi-agent driving simulation at 1 million FPS

Saman Kazemkhani, Aarav Pandya, Daphne Cornelisse et al.

ICLR 2025arXiv:2408.01584

citations

#1435

Visual Agentic AI for Spatial Reasoning with a Dynamic API

Damiano Marsili, Rohun Agrawal, Yisong Yue et al.

CVPR 2025arXiv:2502.06787

citations

#1436

From Isolated Conversations to Hierarchical Schemas: Dynamic Tree Memory Representation for LLMs

Alireza Rezazadeh, Zichao Li, Wei Wei et al.

ICLR 2025arXiv:2410.14052

citations

#1437

GeoPixel: Pixel Grounding Large Multimodal Model in Remote Sensing

Akashah Shabbir, Ilmuz Zaman Mohammed Zumri, Mohammed Bennamoun et al.

ICML 2025arXiv:2501.13925

citations

#1438

ASGO: Adaptive Structured Gradient Optimization

Kang An, Yuxing Liu, Rui Pan et al.

NEURIPS 2025arXiv:2503.20762

citations

#1439

LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior

Hanyu Wang, Saksham Suri, Yixuan Ren et al.

ICLR 2025arXiv:2410.21264

citations

#1440

FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining

Dong Li, Yidi Liu, Xueyang Fu et al.

ICML 2025oralarXiv:2405.19450

citations

#1441

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

Jintao Zhang, Jia wei, Haoxu Wang et al.

NEURIPS 2025spotlightarXiv:2505.11594

citations

#1442

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

Guoxuan Chen, Han Shi, jiawei li et al.

ICML 2025arXiv:2412.12094

citations

#1443

Understanding Chain-of-Thought in LLMs through Information Theory

Jean-Francois Ton, Muhammad Faaiz Taufiq, Yang Liu

ICML 2025arXiv:2411.11984

citations

#1444

Sample, Scrutinize and Scale: Effective Inference-Time Search by Scaling Verification

Eric Zhao, Pranjal Awasthi, Sreenivas Gollapudi

ICML 2025arXiv:2502.01839

citations

#1445

REEF: Representation Encoding Fingerprints for Large Language Models

Jie Zhang, Dongrui Liu, Chen Qian et al.

ICLR 2025arXiv:2410.14273

citations

#1446

Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View

Xuan Liu, Jie ZHANG, HaoYang Shang et al.

ICLR 2025arXiv:2405.14744

citations

#1447

ReMoE: Fully Differentiable Mixture-of-Experts with ReLU Routing

Ziteng Wang, Jun Zhu, Jianfei Chen

ICLR 2025arXiv:2412.14711

citations

#1448

AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions

Polina Kirichenko, Mark Ibrahim, Kamalika Chaudhuri et al.

NEURIPS 2025arXiv:2506.09038

citations

#1449

Longhorn: State Space Models are Amortized Online Learners

Bo Liu, Rui Wang, Lemeng Wu et al.

ICLR 2025arXiv:2407.14207

citations

#1450

Efficient Online Reinforcement Learning for Diffusion Policy

Haitong Ma, Tianyi Chen, Kai Wang et al.

ICML 2025arXiv:2502.00361

citations

#1451

Herald: A Natural Language Annotated Lean 4 Dataset

Guoxiong Gao, Yutong Wang, Jiedong Jiang et al.

ICLR 2025arXiv:2410.10878

citations

#1452

GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding

Haoyi Jiang, Liu Liu, Tianheng Cheng et al.

CVPR 2025arXiv:2412.13193

citations

#1453

AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark

Li Lin, Santosh Santosh, Mingyang Wu et al.

CVPR 2025arXiv:2406.00783

citations

#1454

Methods for Convex $(L_0,L_1)$-Smooth Optimization: Clipping, Acceleration, and Adaptivity

Eduard Gorbunov, Nazarii Tupitsa, Sayantan Choudhury et al.

ICLR 2025arXiv:2409.14989

citations

#1455

UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models

Xin Xu, Qiyun Xu, Tong Xiao et al.

ICML 2025arXiv:2502.00334

citations

#1456

ConDSeg: A General Medical Image Segmentation Framework via Contrast-Driven Feature Enhancement

Mengqi Lei, Haochen Wu, Xinhua Lv et al.

AAAI 2025paperarXiv:2412.08345

citations

#1457

Streaming Video Understanding and Multi-round Interaction with Memory-enhanced Knowledge

Haomiao Xiong, Zongxin Yang, Jiazuo Yu et al.

ICLR 2025arXiv:2501.13468

citations

#1458

DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes

Chensheng Peng, Chengwei Zhang, Yixiao Wang et al.

CVPR 2025arXiv:2411.11921

citations

#1459

Democratizing Text-to-Image Masked Generative Models with Compact Text-Aware One-Dimensional Tokens

Dongwon Kim, Ju He, Qihang Yu et al.

ICCV 2025arXiv:2501.07730

citations

#1460

LEARN: Knowledge Adaptation from Large Language Model to Recommendation for Practical Industrial Application

Jian Jia, Yipei Wang, Yan Li et al.

AAAI 2025paperarXiv:2405.03988

citations

#1461

On Large Language Model Continual Unlearning

Chongyang Gao, Lixu Wang, Kaize Ding et al.

ICLR 2025arXiv:2407.10223

citations

#1462

XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

Bowen Chen, Brynn zhao, Haomiao Sun et al.

NEURIPS 2025arXiv:2506.21416

citations

#1463

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

Hongzhi Huang, Defa Zhu, Banggu Wu et al.

ICML 2025arXiv:2501.16975

citations

#1464

3D Convex Splatting: Radiance Field Rendering with 3D Smooth Convexes

Jan Held, Renaud Vandeghen, Abdullah J Hamdi et al.

CVPR 2025highlightarXiv:2411.14974

citations

#1465

EgoTextVQA: Towards Egocentric Scene-Text Aware Video Question Answering

Sheng Zhou, Junbin Xiao, Qingyun Li et al.

CVPR 2025arXiv:2502.07411

citations

#1466

PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models

Shi Qiu, Shaoyang Guo, Zhuo-Yang Song et al.

NEURIPS 2025arXiv:2504.16074

citations

#1467

Distillation Scaling Laws

Dan Busbridge, Amitis Shidani, Floris Weers et al.

ICML 2025arXiv:2502.08606

citations

#1468

Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark

Tsung-Han Wu, Giscard Biamby, Jerome Quenum et al.

ICLR 2025arXiv:2407.13766

citations

#1469

Epona: Autoregressive Diffusion World Model for Autonomous Driving

Kaiwen Zhang, Zhenyu Tang, Xiaotao Hu et al.

ICCV 2025arXiv:2506.24113

citations

#1470

Image Watermarks are Removable using Controllable Regeneration from Clean Noise

Yepeng Liu, Yiren Song, Hai Ci et al.

ICLR 2025arXiv:2410.05470

citations

#1471

MEGA-Bench: Scaling Multimodal Evaluation to over 500 Real-World Tasks

Jiacheng Chen, Tianhao Liang, Sherman Siu et al.

ICLR 2025arXiv:2410.10563

citations

#1472

Unleashing the Potential of Large Language Models as Prompt Optimizers: Analogical Analysis with Gradient-based Model Optimizers

Xinyu Tang, Xiaolei Wang, Wayne Xin Zhao et al.

AAAI 2025paperarXiv:2402.17564

citations

#1473

GenSE: Generative Speech Enhancement via Language Models using Hierarchical Modeling

Jixun Yao, Hexin Liu, CHEN CHEN et al.

ICLR 2025arXiv:2502.02942

citations

#1474

Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning

Hanxun Yu, Wentong Li, Song Wang et al.

CVPR 2025highlightarXiv:2503.00513

citations

#1475

BridgeVLA: Input-Output Alignment for Efficient 3D Manipulation Learning with Vision-Language Models

Peiyan Li, Yixiang Chen, Hongtao Wu et al.

NEURIPS 2025arXiv:2506.07961

citations

#1476

MRAG-Bench: Vision-Centric Evaluation for Retrieval-Augmented Multimodal Models

Wenbo Hu, Jia-Chen Gu, Zi-Yi Dou et al.

ICLR 2025arXiv:2410.08182

citations

#1477

Faster Algorithms for Structured John Ellipsoid Computation

Yang Cao, Xiaoyu Li, Zhao Song et al.

NEURIPS 2025arXiv:2211.14407

citations

#1478

LOKI: A Comprehensive Synthetic Data Detection Benchmark using Large Multimodal Models

Junyan Ye, Baichuan Zhou, Zilong Huang et al.

ICLR 2025arXiv:2410.09732

citations

#1479

One Perturbation is Enough: On Generating Universal Adversarial Perturbations against Vision-Language Pre-training Models

Hao Fang, Jiawei Kong, Wenbo Yu et al.

ICCV 2025arXiv:2406.05491

citations

#1480

Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs

Qi Wu, Yubo Zhao, Yifan Wang et al.

ICLR 2025arXiv:2405.17013

citations

#1481

OmniBind: Large-scale Omni Multimodal Representation via Binding Spaces

zehan wang, Ziang Zhang, Minjie Hong et al.

ICLR 2025arXiv:2407.11895

citations

#1482

KABB: Knowledge-Aware Bayesian Bandits for Dynamic Expert Coordination in Multi-Agent Systems

Jusheng Zhang, Zimeng Huang, Yijia Fan et al.

ICML 2025arXiv:2502.07350

citations

#1483

CoMT: A Novel Benchmark for Chain of Multi-modal Thought on Large Vision-Language Models

Zihui Cheng, Qiguang Chen, Jin Zhang et al.

AAAI 2025paperarXiv:2412.12932

citations

#1484

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Sicong Leng, Yun Xing, Zesen Cheng et al.

NEURIPS 2025arXiv:2410.12787

citations

#1485

InstantSwap: Fast Customized Concept Swapping across Sharp Shape Differences

Chenyang Zhu, Kai Li, Yue Ma et al.

ICLR 2025arXiv:2412.01197

citations

#1486

SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding

Sihang Li, Jin Huang, Jiaxi Zhuang et al.

ICLR 2025arXiv:2408.15545

citations

#1487

Locate-then-edit for Multi-hop Factual Recall under Knowledge Editing

Zhuoran Zhang, Yongxiang Li, Zijian Kan et al.

ICML 2025arXiv:2410.06331

citations

#1488

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

Ziheng Ouyang, Zhen Li, Qibin Hou

CVPR 2025arXiv:2502.18461

citations

#1489

RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control

Teng Li, Guangcong Zheng, Rui Jiang et al.

ICCV 2025arXiv:2502.10059

citations

#1490

Exploring Enhanced Contextual Information for Video-Level Object Tracking

Ben Kang, Xin Chen, Simiao Lai et al.

AAAI 2025paperarXiv:2412.11023

citations

#1491

Jailbreaking Multimodal Large Language Models via Shuffle Inconsistency

Shiji Zhao, Ranjie Duan, Fengxiang Wang et al.

ICCV 2025arXiv:2501.04931

citations

#1492

Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic Directions

Stefan Andreas Baumann, Felix Krause, Michael Neumayr et al.

CVPR 2025arXiv:2403.17064

citations

#1493

Spurious Forgetting in Continual Learning of Language Models

Junhao Zheng, Xidi Cai, Shengjie Qiu et al.

ICLR 2025arXiv:2501.13453

citations

#1494

VSSD: Vision Mamba with Non-Causal State Space Duality

Yuheng Shi, Mingjia Li, Minjing Dong et al.

ICCV 2025arXiv:2407.18559

citations

#1495

Visual Prompting Upgrades Neural Network Sparsification: A Data-Model Perspective

Can Jin, Tianjin Huang, Yihua Zhang et al.

AAAI 2025paperarXiv:2312.01397

citations

#1496

Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs

Zicheng Zhang, Ziheng Jia, Haoning Wu et al.

CVPR 2025arXiv:2409.20063

citations

#1497

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination

Jianing "Jed" Yang, Xuweiyi Chen, Nikhil Madaan et al.

CVPR 2025arXiv:2406.05132

citations

#1498

LANTERN: Accelerating Visual Autoregressive Models with Relaxed Speculative Decoding

Doohyuk Jang, Sihwan Park, June Yong Yang et al.

ICLR 2025arXiv:2410.03355

citations

#1499

Cascade Reward Sampling for Efficient Decoding-Time Alignment

Bolian Li, Yifan Wang, Anamika Lochab et al.

COLM 2025paperarXiv:2406.16306

citations

#1500

KRIS-Bench: Benchmarking Next-Level Intelligent Image Editing Models

Yongliang Wu, Zonghui Li, Xinting Hu et al.

NEURIPS 2025arXiv:2505.16707

citations

#1501

How to set AdamW's weight decay as you scale model and dataset size

Xi Wang, Laurence Aitchison

ICML 2025arXiv:2405.13698

citations

#1502

Efficient Diffusion Transformer Policies with Mixture of Expert Denoisers for Multitask Learning

Moritz Reuss, Jyothish Pari, Pulkit Agrawal et al.

ICLR 2025arXiv:2412.12953

citations

#1503

Efficient Dictionary Learning with Switch Sparse Autoencoders

Anish Mudide, Josh Engels, Eric Michaud et al.

ICLR 2025arXiv:2410.08201

citations

#1504

Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer

Yanjun Zhao, Sizhe Dang, Haishan Ye et al.

ICLR 2025

citations

#1505

Generative Inbetweening: Adapting Image-to-Video Models for Keyframe Interpolation

Xiaojuan Wang, Boyang Zhou, Brian Curless et al.

ICLR 2025arXiv:2408.15239

citations

#1506

GOFA: A Generative One-For-All Model for Joint Graph Language Modeling

Lecheng Kong, Jiarui Feng, Hao Liu et al.

ICLR 2025arXiv:2407.09709

citations

#1507

Adversarial Diffusion Compression for Real-World Image Super-Resolution

Bin Chen, Gehui Li, Rongyuan Wu et al.

CVPR 2025arXiv:2411.13383

citations

#1508

Efficient Online Reinforcement Learning Fine-Tuning Need Not Retain Offline Data

Zhiyuan Zhou, Andy Peng, Qiyang Li et al.

ICLR 2025arXiv:2412.07762

citations

#1509

MINIMA: Modality Invariant Image Matching

Jiangwei Ren, Xingyu Jiang, Zizhuo Li et al.

CVPR 2025arXiv:2412.19412

citations

#1510

Beyond Autoregression: Fast LLMs via Self-Distillation Through Time

Justin Deschenaux, Caglar Gulcehre

ICLR 2025arXiv:2410.21035

citations

#1511

LMAct: A Benchmark for In-Context Imitation Learning with Long Multimodal Demonstrations

Anian Ruoss, Fabio Pardo, Harris Chan et al.

ICML 2025arXiv:2412.01441

citations

#1512

Measuring and Enhancing Trustworthiness of LLMs in RAG through Grounded Attributions and Learning to Refuse

Maojia Song, Shang Hong Sim, Rishabh Bhardwaj et al.

ICLR 2025arXiv:2409.11242

citations

#1513

Enriching Multimodal Sentiment Analysis Through Textual Emotional Descriptions of Visual-Audio Content

Sheng Wu, Dongxiao He, Xiaobao Wang et al.

AAAI 2025paperarXiv:2412.10460

citations

#1514

Improving Retrieval-Augmented Generation through Multi-Agent Reinforcement Learning

Yiqun Chen, Lingyong Yan, Weiwei Sun et al.

NEURIPS 2025arXiv:2501.15228

citations

#1515

Reducing Hallucinations in Large Vision-Language Models via Latent Space Steering

Sheng Liu, Haotian Ye, James Y Zou

ICLR 2025

citations

#1516

Zigzag Diffusion Sampling: Diffusion Models Can Self-Improve via Self-Reflection

Lichen Bai, Shitong Shao, zikai zhou et al.

ICLR 2025arXiv:2412.10891

citations

#1517

Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale

Fan Zhou, Zengzhi Wang, Qian Liu et al.

ICML 2025arXiv:2409.17115

citations

#1518

LSceneLLM: Enhancing Large 3D Scene Understanding Using Adaptive Visual Preferences

Hongyan Zhi, Peihao Chen, Junyan Li et al.

CVPR 2025arXiv:2412.01292

citations

#1519

Chain-of-Action: Faithful and Multimodal Question Answering through Large Language Models

Zhenyu Pan, Haozheng Luo, Manling Li et al.

ICLR 2025arXiv:2403.17359

citations

#1520

GraphArena: Evaluating and Exploring Large Language Models on Graph Computation

Jianheng Tang, Qifan Zhang, Yuhan Li et al.

ICLR 2025arXiv:2407.00379

citations

#1521

How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework

Yinuo Ren, Haoxuan Chen, Grant Rotskoff et al.

ICLR 2025arXiv:2410.03601

citations

#1522

Learning from Videos for 3D World: Enhancing MLLMs with 3D Vision Geometry Priors

Duo Zheng, shijia Huang, Yanyang Li et al.

NEURIPS 2025arXiv:2505.24625

citations

#1523

Paint by Inpaint: Learning to Add Image Objects by Removing Them First

Navve Wasserman, Noam Rotstein, Roy Ganz et al.

CVPR 2025arXiv:2404.18212

citations

#1524

Your Mixture-of-Experts LLM Is Secretly an Embedding Model for Free

Ziyue Li, Tianyi Zhou

ICLR 2025arXiv:2410.10814

citations

#1525

HERMES: A Unified Self-Driving World Model for Simultaneous 3D Scene Understanding and Generation

Xin Zhou, DINGKANG LIANG, Sifan Tu et al.

ICCV 2025arXiv:2501.14729

citations

#1526

Reinforcement Learning with Action Chunking

Qiyang Li, Zhiyuan (Paul) Zhou, Sergey Levine

NEURIPS 2025oralarXiv:2507.07969

citations

#1527

AgentDAM: Privacy Leakage Evaluation for Autonomous Web Agents

Arman Zharmagambetov, Chuan Guo, Ivan Evtimov et al.

NEURIPS 2025arXiv:2503.09780

citations

#1528

Training Dynamics of In-Context Learning in Linear Attention

Yedi Zhang, Aaditya Singh, Peter Latham et al.

ICML 2025spotlightarXiv:2501.16265

citations

#1529

SlideChat: A Large Vision-Language Assistant for Whole-Slide Pathology Image Understanding

Ying Chen, Guoan Wang, Yuanfeng Ji et al.

CVPR 2025arXiv:2410.11761

citations

#1530

Diving into Self-Evolving Training for Multimodal Reasoning

Wei Liu, Junlong Li, Xiwen Zhang et al.

ICML 2025arXiv:2412.17451

citations

#1531

MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Lixing Xiao, Shunlin Lu, Huaijin Pi et al.

ICCV 2025arXiv:2503.15451

citations

#1532

Can Large Language Models Understand Symbolic Graphics Programs?

Zeju Qiu, Weiyang Liu, Haiwen Feng et al.

ICLR 2025arXiv:2408.08313

citations

#1533

Energy-Weighted Flow Matching for Offline Reinforcement Learning

Shiyuan Zhang, Weitong Zhang, Quanquan Gu

ICLR 2025arXiv:2503.04975

citations

#1534

Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles

Jiangjie Chen, Qianyu He, Siyu Yuan et al.

NEURIPS 2025spotlightarXiv:2505.19914

citations

#1535

Theoretical Benefit and Limitation of Diffusion Language Model

Guhao Feng, Yihan Geng, Jian Guan et al.

NEURIPS 2025arXiv:2502.09622

citations

#1536

The OMG dataset: An Open MetaGenomic corpus for mixed-modality genomic language modeling

Andre Cornman, Jacob West-Roberts, Antonio Camargo et al.

ICLR 2025

citations

#1537

DailyDilemmas: Revealing Value Preferences of LLMs with Quandaries of Daily Life

Yu Ying Chiu, Liwei Jiang, Yejin Choi

ICLR 2025oralarXiv:2410.02683

citations

#1538

Large-scale and Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding

Zhongyi Shui, Jianpeng Zhang, Weiwei Cao et al.

ICLR 2025arXiv:2501.14548

citations

#1539

CURIE: Evaluating LLMs on Multitask Scientific Long-Context Understanding and Reasoning

Hao Cui, Zahra Shamsi, Gowoon Cheon et al.

ICLR 2025arXiv:2503.13517

citations

#1540

Bolt3D: Generating 3D Scenes in Seconds

Stanislaw Szymanowicz, Jason Y. Zhang, Pratul Srinivasan et al.

ICCV 2025arXiv:2503.14445

citations

#1541

Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models

Zehan Wang, Ziang Zhang, Tianyu Pang et al.

ICML 2025arXiv:2412.18605

citations

#1542

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu et al.

CVPR 2025arXiv:2412.09856

citations

#1543

Distilling Multi-modal Large Language Models for Autonomous Driving

Deepti Hegde, Rajeev Yasarla, Hong Cai et al.

CVPR 2025arXiv:2501.09757

citations

#1544

InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

Liming Jiang, Qing Yan, Yumin Jia et al.

ICCV 2025highlightarXiv:2503.16418

citations

#1545

Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models

Samira Abnar, Harshay Shah, Dan Busbridge et al.

ICML 2025arXiv:2501.12370

citations

#1546

Agent4Edu: Generating Learner Response Data by Generative Agents for Intelligent Education Systems

Weibo Gao, Qi Liu, Linan Yue et al.

AAAI 2025paperarXiv:2501.10332

citations

#1547

KVLink: Accelerating Large Language Models via Efficient KV Cache Reuse

Jingbo Yang, Bairu Hou, Wei Wei et al.

NEURIPS 2025arXiv:2502.16002

citations

#1548

Semantics-Adaptive Activation Intervention for LLMs via Dynamic Steering Vectors

Weixuan Wang, JINGYUAN YANG, Wei Peng

ICLR 2025arXiv:2410.12299

citations

#1549

RUN: Reversible Unfolding Network for Concealed Object Segmentation

Chunming He, Rihan Zhang, Fengyang Xiao et al.

ICML 2025arXiv:2501.18783

citations

#1550

DyFo: A Training-Free Dynamic Focus Visual Search for Enhancing LMMs in Fine-Grained Visual Understanding

Geng Li, Jinglin Xu, Yunzhen Zhao et al.

CVPR 2025highlightarXiv:2504.14920

citations

#1551

A Formal Framework for Understanding Length Generalization in Transformers

Xinting Huang, Andy Yang, Satwik Bhattamishra et al.

ICLR 2025arXiv:2410.02140

citations

#1552

ReAgent-V: A Reward-Driven Multi-Agent Framework for Video Understanding

Yiyang Zhou, Yangfan He, Yaofeng Su et al.

NEURIPS 2025arXiv:2506.01300

citations

#1553

CAT-3DGS: A Context-Adaptive Triplane Approach to Rate-Distortion-Optimized 3DGS Compression

Yu-Ting Zhan, Cheng-Yuan Ho, He-Bi Yang et al.

ICLR 2025arXiv:2503.00357

citations

#1554

AdvAgent: Controllable Blackbox Red-teaming on Web Agents

Chejian Xu, Mintong Kang, Jiawei Zhang et al.

ICML 2025arXiv:2410.17401

citations

#1555

Machine Unlearning Fails to Remove Data Poisoning Attacks

Martin Pawelczyk, Jimmy Di, Yiwei Lu et al.

ICLR 2025arXiv:2406.17216

citations

#1556

A Bias-Free Training Paradigm for More General AI-generated Image Detection

Fabrizio Guillaro, Giada Zingarini, Ben Usman et al.

CVPR 2025arXiv:2412.17671

citations

#1557

PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training

Cong Chen, Mingyu Liu, Chenchen Jing et al.

ICLR 2025arXiv:2503.06486

citations

#1558

ChartMoE: Mixture of Diversely Aligned Expert Connector for Chart Understanding

Zhengzhuo Xu, Bowen Qu, Yiyan Qi et al.

ICLR 2025arXiv:2409.03277

citations

#1559

LiveCodeBench Pro: How Do Olympiad Medalists Judge LLMs in Competitive Programming?

Zihan Zheng, Zerui Cheng, Zeyu Shen et al.

NEURIPS 2025arXiv:2506.11928

citations

#1560

Can Knowledge Editing Really Correct Hallucinations?

Baixiang Huang, Canyu Chen, Xiongxiao Xu et al.

ICLR 2025arXiv:2410.16251

citations

#1561

Holistically Evaluating the Environmental Impact of Creating Language Models

Jacob Morrison, Clara Na, Jared Fernandez et al.

ICLR 2025arXiv:2503.05804

citations

#1562

Understanding and Enhancing Safety Mechanisms of LLMs via Safety-Specific Neuron

Yiran Zhao, Wenxuan Zhang, Yuxi Xie et al.

ICLR 2025

citations

#1563

MARS: Unleashing the Power of Variance Reduction for Training Large Models

Huizhuo Yuan, Yifeng Liu, Shuang Wu et al.

ICML 2025arXiv:2411.10438

citations

#1564

Multimodal Situational Safety

Kaiwen Zhou, Chengzhi Liu, Xuandong Zhao et al.

ICLR 2025arXiv:2410.06172

citations

#1565

Graphic Design with Large Multimodal Model

Yutao Cheng, Zhao Zhang, Maoke Yang et al.

AAAI 2025paperarXiv:2404.14368

citations

#1566

Interleaved-Modal Chain-of-Thought

Jun Gao, Yongqi Li, Ziqiang Cao et al.

CVPR 2025arXiv:2411.19488

citations

#1567

MegaMath: Pushing the Limits of Open Math Corpora

Fan Zhou, Zengzhi Wang, Nikhil Ranjan et al.

COLM 2025paperarXiv:2504.02807

citations

#1568

ARLON: Boosting Diffusion Transformers with Autoregressive Models for Long Video Generation

Zongyi Li, Shujie HU, Shujie LIU et al.

ICLR 2025oralarXiv:2410.20502

citations

#1569

HyperGraphRAG: Retrieval-Augmented Generation via Hypergraph-Structured Knowledge Representation

Haoran Luo, Haihong E, Guanting Chen et al.

NEURIPS 2025arXiv:2503.21322

citations

#1570

Enhancing Zeroth-order Fine-tuning for Language Models with Low-rank Structures

Yiming Chen, Yuan Zhang, Liyuan Cao et al.

ICLR 2025arXiv:2410.07698

citations

#1571

Move to Understand a 3D Scene: Bridging Visual Grounding and Exploration for Efficient and Versatile Embodied Navigation

ZIYU ZHU, Xilin Wang, Yixuan Li et al.

ICCV 2025highlightarXiv:2507.04047

citations

#1572

AI for Global Climate Cooperation: Modeling Global Climate Negotiations, Agreements, and Long-Term Cooperation in RICE-N

Tianyu Zhang, Andrew Williams, Phillip Wozny et al.

ICML 2025arXiv:2208.07004

citations

#1573

Grounded Reinforcement Learning for Visual Reasoning

Gabriel Sarch, Snigdha Saha, Naitik Khandelwal et al.

NEURIPS 2025arXiv:2505.23678

citations

#1574

Simple ReFlow: Improved Techniques for Fast Flow Models

Beomsu Kim, Yu-Guan Hsieh, Michal Klein et al.

ICLR 2025arXiv:2410.07815

citations

#1575

TrackGo: A Flexible and Efficient Method for Controllable Video Generation

Haitao Zhou, Chuang Wang, Rui Nie et al.

AAAI 2025paperarXiv:2408.11475

citations

#1576

LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer

Yiren Song, Danze Chen, Mike Zheng Shou

ICCV 2025arXiv:2502.01105

citations

#1577

From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions

Changle Qu, Sunhao Dai, Xiaochi Wei et al.

ICLR 2025arXiv:2410.08197

citations

#1578

SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding

Zhenyu Yang, Yuhang Hu, Zemin Du et al.

ICLR 2025oralarXiv:2502.10810

citations

#1579

Understanding Factual Recall in Transformers via Associative Memories

Eshaan Nichani, Jason Lee, Alberto Bietti

ICLR 2025arXiv:2412.06538

citations

#1580

MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code

Zimu Lu, Aojun Zhou, Ke Wang et al.

ICLR 2025arXiv:2410.08196

citations

#1581

SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation

Chun-Han Yao, Yiming Xie, Vikram Voleti et al.

ICCV 2025arXiv:2503.16396

citations

#1582

ParetoQ: Improving Scaling Laws in Extremely Low-bit LLM Quantization

Zechun Liu, Changsheng Zhao, Hanxian Huang et al.

NEURIPS 2025arXiv:2502.02631

citations

#1583

LLM-SRBench: A New Benchmark for Scientific Equation Discovery with Large Language Models

Parshin Shojaee, Ngoc Hieu Nguyen, Kazem Meidani et al.

ICML 2025oralarXiv:2504.10415

citations

#1584

Fast-Slow Thinking GRPO for Large Vision-Language Model Reasoning

Wenyi Xiao, Leilei Gan

NEURIPS 2025spotlightarXiv:2504.18458

citations

#1585

Star Attention: Efficient LLM Inference over Long Sequences

Shantanu Acharya, Fei Jia, Boris Ginsburg

ICML 2025arXiv:2411.17116

citations

#1586

Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems

Christian Walder, Deep Tejas Karkhanis

NEURIPS 2025spotlightarXiv:2505.15201

citations

#1587

The Superposition of Diffusion Models Using the Itô Density Estimator

Marta Skreta, Lazar Atanackovic, Joey Bose et al.

ICLR 2025arXiv:2412.17762

citations

#1588

Orthus: Autoregressive Interleaved Image-Text Generation with Modality-Specific Heads

Siqi Kou, Jiachun Jin, Zhihong Liu et al.

ICML 2025arXiv:2412.00127

citations

#1589

Subspace Optimization for Large Language Models with Convergence Guarantees

Yutong He, Pengrui Li, Yipeng Hu et al.

ICML 2025arXiv:2410.11289

citations

#1590

ShortcutsBench: A Large-Scale Real-world Benchmark for API-based Agents

Haiyang SHEN, Yue Li, Desong Meng et al.

ICLR 2025arXiv:2407.00132

citations

#1591

OneIG-Bench: Omni-dimensional Nuanced Evaluation for Image Generation

Jingjing Chang, Yixiao Fang, Peng Xing et al.

NEURIPS 2025arXiv:2506.07977

citations

#1592

Style Outweighs Substance: Failure Modes of LLM Judges in Alignment Benchmarking

Benjamin Feuer, Micah Goldblum, Teresa Datta et al.

ICLR 2025arXiv:2409.15268

citations

#1593

DropGaussian: Structural Regularization for Sparse-view Gaussian Splatting

Hyunwoo Park, Gun Ryu, Wonjun Kim

CVPR 2025arXiv:2504.00773

citations

#1594

Learning stochastic dynamics from snapshots through regularized unbalanced optimal transport

Zhenyi Zhang, Tiejun Li, Peijie Zhou

ICLR 2025arXiv:2410.00844

citations

#1595

Chain-of-Retrieval Augmented Generation

Liang Wang, Haonan Chen, Nan Yang et al.

NEURIPS 2025arXiv:2501.14342

citations

#1596

NLSR: Neuron-Level Safety Realignment of Large Language Models Against Harmful Fine-Tuning

Xin Yi, Shunfan Zheng, Linlin Wang et al.

AAAI 2025paperarXiv:2412.12497

citations

#1597

When Thinking Fails: The Pitfalls of Reasoning for Instruction-Following in LLMs

Xiaomin Li, Zhou Yu, Zhiwei Zhang et al.

NEURIPS 2025spotlightarXiv:2505.11423

citations

#1598

VistaDream: Sampling multiview consistent images for single-view scene reconstruction

Haiping Wang, Yuan Liu, Ziwei Liu et al.

ICCV 2025arXiv:2410.16892

citations

#1599

Nonconvex Stochastic Optimization under Heavy-Tailed Noises: Optimal Convergence without Gradient Clipping

Zijian Liu, Zhengyuan Zhou

ICLR 2025arXiv:2412.19529

citations

#1600

Diffusion-based Neural Network Weights Generation

Bedionita Soro, Bruno Andreis, Hayeon Lee et al.

ICLR 2025arXiv:2402.18153

citations
