Most Cited 2025 "markov chain estimation" Papers

22,274 papers found • Page 12 of 112

#2201

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.

NEURIPS 2025 • poster • arXiv:2506.08989 • 14 citations
#2202

Block-Attention for Efficient Prefilling

Dongyang Ma, Yan Wang, Tian Lan

ICLR 2025 • poster • arXiv:2409.15355 • 14 citations
#2203

Multi-Turn Jailbreaking Large Language Models via Attention Shifting

Xiaohu Du, Fan Mo, Ming Wen et al.

AAAI 2025 • paper • 14 citations
#2204

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

Lukas Helff, Felix Friedrich, Manuel Brack et al.

ICML 2025 • poster • arXiv:2406.05113 • 14 citations
#2205

A Second-Order Perspective on Model Compositionality and Incremental Learning

Angelo Porrello, Lorenzo Bonicelli, Pietro Buzzega et al.

ICLR 2025 • poster • arXiv:2405.16350 • 14 citations
#2206

InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption

Tiehan Fan, Kepan Nan, Rui Xie et al.

CVPR 2025 • poster • arXiv:2412.09283 • 14 citations
#2207

Unleashing Vecset Diffusion Model for Fast Shape Generation

Zeqiang Lai, Zhao Yunfei, Zibo Zhao et al.

ICCV 2025 • highlight • arXiv:2503.16302 • 14 citations
#2208

NEST: A Neuromodulated Small-world Hypergraph Trajectory Prediction Model for Autonomous Driving

Chengyue Wang, Haicheng Liao, Bonan Wang et al.

AAAI 2025 • paper • arXiv:2412.11682 • 14 citations
#2209

SkillMimic: Learning Basketball Interaction Skills from Demonstrations

Yinhuai Wang, Qihan Zhao, Runyi Yu et al.

CVPR 2025 • highlight • arXiv:2408.15270 • 14 citations
#2210

Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP

Junsung Park, Jungbeom Lee, Jongyoon Song et al.

ICCV 2025 • poster • arXiv:2501.10913 • 14 citations
#2211

Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models

Zhihang Liu, Chen-Wei Xie, Pandeng Li et al.

CVPR 2025 • poster • arXiv:2503.16036 • 14 citations
#2212

DPCore: Dynamic Prompt Coreset for Continual Test-Time Adaptation

Yunbei Zhang, Akshay Mehra, Shuaicheng Niu et al.

ICML 2025 • poster • arXiv:2406.10737 • 14 citations
#2213

MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents

Yanqi Dai, Huanran Hu, Lei Wang et al.

ICLR 2025 • poster • arXiv:2408.04203 • 14 citations
#2214

AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks

Zekang Yang, Wang Zeng, Sheng Jin et al.

AAAI 2025 • paper • arXiv:2402.15351 • 14 citations
#2215

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving

Yuping Wang, Xiangyu Huang, Xiaokang Sun et al.

ICCV 2025 • poster • arXiv:2503.24381 • 14 citations
#2216

Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks

Hongyuan Tao, Ying Zhang, Zhenhao Tang et al.

NEURIPS 2025 • poster • arXiv:2505.16901 • 14 citations
#2217

RelGNN: Composite Message Passing for Relational Deep Learning

Tianlang Chen, Charilaos Kanatsoulis, Jure Leskovec

ICML 2025 • poster • arXiv:2502.06784 • 14 citations
#2218

Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling

Dongyi Wang, Yuanwei Jiang, Zhenyi Zhang et al.

NEURIPS 2025 • poster • arXiv:2505.13413 • 14 citations
#2219

Vision-Language Gradient Descent-driven All-in-One Deep Unfolding Networks

Haijin Zeng, Xiangming Wang, Yongyong Chen et al.

CVPR 2025 • poster • arXiv:2503.16930 • 14 citations
#2220

Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos

Chiara Plizzari, Alessio Tonioni, Yongqin Xian et al.

CVPR 2025 • poster • arXiv:2503.13646 • 14 citations
#2221

Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video

David Yifan Yao, Albert J. Zhai, Shenlong Wang

CVPR 2025 • highlight • arXiv:2503.21761 • 14 citations
#2222

Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

Weilong Yan, Ming Li, Li Haipeng et al.

CVPR 2025 • poster • arXiv:2503.20211 • 14 citations
#2223

Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues

Youngjoon Jang, Haran Raajesh, Liliane Momeni et al.

CVPR 2025 • poster • arXiv:2501.09754 • 14 citations
#2224

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

Julie Kallini, Shikhar Murty, Christopher Manning et al.

ICLR 2025 • poster • arXiv:2410.20771 • 14 citations
#2225

AdaGrad under Anisotropic Smoothness

Yuxing Liu, Rui Pan, Tong Zhang

ICLR 2025 • poster • arXiv:2406.15244 • 14 citations
#2226

Robust Self-Paced Hashing for Cross-Modal Retrieval with Noisy Labels

Ruitao Pu, Yuan Sun, Yang Qin et al.

AAAI 2025 • paper • arXiv:2501.01699 • 14 citations
#2227

STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Jiatao Gu, Tianrong Chen, David Berthelot et al.

NEURIPS 2025 • spotlight • arXiv:2506.06276 • 14 citations
#2228

Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation

Zhi Cen, Huaijin Pi, Sida Peng et al.

ICLR 2025 • poster • arXiv:2502.20370 • 14 citations
#2229

CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI

Siyuan Cheng, Lingjuan Lyu, Zhenting Wang et al.

CVPR 2025 • poster • arXiv:2503.18286 • 14 citations
#2230

Sum of Squares Circuits

Lorenzo Loconte, Stefan Mengel, Antonio Vergari

AAAI 2025 • paper • arXiv:2408.11778 • 14 citations
#2231

Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization

XiangCheng Zhang, Fang Kong, Baoxiang Wang et al.

ICLR 2025 • poster • arXiv:2302.06834 • 14 citations
#2232

SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration

Jipeng Cen, Jiaxin Liu, Zhixu Li et al.

AAAI 2025 • paper • arXiv:2406.13408 • 14 citations
#2233

Provably Accurate Shapley Value Estimation via Leverage Score Sampling

Christopher Musco, R. Teal Witter

ICLR 2025 • poster • arXiv:2410.01917 • 14 citations
#2234

Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

Jian Lang, Zhangtao Cheng, Ting Zhong et al.

AAAI 2025 • paper • arXiv:2501.01120 • 14 citations
#2235

Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis

M. Hamza Mughal, Rishabh Dabral, Merel CJ Scholman et al.

CVPR 2025 • poster • arXiv:2412.06786 • 14 citations
#2236

Large Language Model Meets Graph Neural Network in Knowledge Distillation

Shengxiang Hu, Guobing Zou, Song Yang et al.

AAAI 2025 • paper • arXiv:2402.05894 • 14 citations
#2237

CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark

Jian Wu, Linyi Yang, Zhen Wang et al.

ICLR 2025 • poster • arXiv:2402.11924 • 14 citations
#2238

Geolocation Representation from Large Language Models Are Generic Enhancers for Spatio-Temporal Learning

Junlin He, Tong Nie, Wei Ma

AAAI 2025 • paper • arXiv:2408.12116 • 14 citations
#2239

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.

CVPR 2025 • poster • arXiv:2411.18674 • 14 citations
#2240

ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement

XIANGYU PENG, Congying Xia, Xinyi Yang et al.

ICLR 2025 • poster • arXiv:2410.02108 • 14 citations
#2241

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

Zilan Wang, Junfeng Guo, Jiacheng Zhu et al.

CVPR 2025 • poster • arXiv:2412.04852 • 14 citations
#2242

Scaling Inference Time Compute for Diffusion Models

Nanye Ma, Shangyuan Tong, Haolin Jia et al.

CVPR 2025 • highlight • 14 citations
#2243

Toward Understanding In-context vs. In-weight Learning

Bryan Chan, Xinyi Chen, Andras Gyorgy et al.

ICLR 2025 • poster • arXiv:2410.23042 • 14 citations
#2244

Probabilistic Language-Image Pre-Training

Sanghyuk Chun, Wonjae Kim, Song Park et al.

ICLR 2025 • poster • arXiv:2410.18857 • 14 citations
#2245

BotSim: LLM-Powered Malicious Social Botnet Simulation

Boyu Qiao, Kun Li, Wei Zhou et al.

AAAI 2025 • paper • arXiv:2412.13420 • 14 citations
#2246

Towards Universal Soccer Video Understanding

Jiayuan Rao, Haoning Wu, Hao Jiang et al.

CVPR 2025 • poster • arXiv:2412.01820 • 14 citations
#2247

Mechanistic Permutability: Match Features Across Layers

Nikita Balagansky, Ian Maksimov, Daniil Gavrilov

ICLR 2025 • poster • arXiv:2410.07656 • 14 citations
#2248

DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection

Li Li, Huixian Gong, Hao Dong et al.

CVPR 2025 • highlight • arXiv:2411.08227 • 14 citations
#2249

Semantic Convergence: Harmonizing Recommender Systems via Two-Stage Alignment and Behavioral Semantic Tokenization

Guanghan Li, Xun Zhang, Yufei Zhang et al.

AAAI 2025 • paper • arXiv:2412.13771 • 14 citations
#2250

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

Hao Tang, Chen-Wei Xie, Haiyang Wang et al.

NEURIPS 2025 • spotlight • arXiv:2503.01342 • 14 citations
#2251

Presto! Distilling Steps and Layers for Accelerating Music Generation

Zachary Novack, Ge Zhu, Jonah Casebeer et al.

ICLR 2025 • poster • arXiv:2410.05167 • 14 citations
#2252

Assessing Judging Bias in Large Reasoning Models: An Empirical Study

Qian Wang, Zhanzhi Lou, Zhenheng Tang et al.

COLM 2025 • paper • arXiv:2504.09946 • 14 citations
#2253

JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba

Xiaoyong Lu, Songlin Du

CVPR 2025 • poster • arXiv:2503.03437 • 14 citations
#2254

REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints

Di Wu, Liu Liu, Zhou Linli et al.

NEURIPS 2025 • poster • arXiv:2503.06677 • 14 citations
#2255

Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment

Jun Liu, Zhenglun Kong, Pu Zhao et al.

AAAI 2025 • paper • arXiv:2403.10799 • 14 citations
#2256

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang

NEURIPS 2025 • oral • arXiv:2401.07844 • 13 citations
#2257

KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction

Jang-Hyun Kim, Jinuk Kim, Sangwoo Kwon et al.

NEURIPS 2025 • oral • arXiv:2505.23416 • 13 citations
#2258

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

Guy Yariv, Yuval Kirstain, Amit Zohar et al.

CVPR 2025 • poster • arXiv:2501.03059 • 13 citations
#2259

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Jiarui Yao, Yifan Hao, Hanning Zhang et al.

NEURIPS 2025 • poster • arXiv:2505.02391 • 13 citations
#2260

HRAvatar: High-Quality and Relightable Gaussian Head Avatar

Dongbin Zhang, Yunfei Liu, Lijian Lin et al.

CVPR 2025 • poster • arXiv:2503.08224 • 13 citations
#2261

Local Conditional Controlling for Text-to-Image Diffusion Models

Yibo Zhao, Liang Peng, Yang Yang et al.

AAAI 2025 • paper • arXiv:2312.08768 • 13 citations
#2262

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee, Haebin Seong, Dong Bok Lee et al.

ICLR 2025 • poster • arXiv:2410.01524 • 13 citations
#2263

Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

Luke Rowe, Roger Girgis, Anthony Gosselin et al.

CVPR 2025 • poster • arXiv:2503.22496 • 13 citations
#2264

Change3D: Revisiting Change Detection and Captioning from A Video Modeling Perspective

Duowang Zhu, Xiaohu Huang, Haiyan Huang et al.

CVPR 2025 • highlight • arXiv:2503.18803 • 13 citations
#2265

AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

Ziyin Zhou, Yunpeng Luo, Yuanchen Wu et al.

ICCV 2025 • poster • arXiv:2507.02664 • 13 citations
#2266

ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data

Xiaoyang Liu, Kangjie Bao, Jiashuo Zhang et al.

NEURIPS 2025 • poster • arXiv:2502.05567 • 13 citations
#2267

DVP-MVS: Synergize Depth-Edge and Visibility Prior for Multi-View Stereo

Zhenlong Yuan, Jinguo Luo, Fei Shen et al.

AAAI 2025 • paper • arXiv:2412.11578 • 13 citations
#2268

CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

Yang Liu, Zinan Zheng, Jiashun Cheng et al.

ICLR 2025 • oral • arXiv:2502.19750 • 13 citations
#2269

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

Runhui Huang, Xinpeng Ding, Chunwei Wang et al.

CVPR 2025 • poster • arXiv:2407.08706 • 13 citations
#2270

VisionArena: 230k Real World User-VLM Conversations with Preference Labels

Christopher Chou, Lisa Dunlap, Wei-Lin Chiang et al.

CVPR 2025 • poster • arXiv:2412.08687 • 13 citations
#2271

Temporal Query Network for Efficient Multivariate Time Series Forecasting

Shengsheng Lin, Haojun Chen, Haijie Wu et al.

ICML 2025 • oral • arXiv:2505.12917 • 13 citations
#2272

Extragradient Preference Optimization (EGPO): Beyond Last-Iterate Convergence for Nash Learning from Human Feedback

Runlong Zhou, Maryam Fazel, Simon Shaolei Du

COLM 2025 • paper • arXiv:2503.08942 • 13 citations
#2273

Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation

Abdelrahman Eldesokey, Peter Wonka

ICLR 2025 • poster • arXiv:2408.14819 • 13 citations
#2274

SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs

Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang et al.

ICLR 2025 • poster • arXiv:2405.16325 • 13 citations
#2275

Contextual Bandits for Unbounded Context Distributions

Puning Zhao, Rongfei Fan, Shaowei Wang et al.

ICML 2025 • poster • arXiv:2408.09655 • 13 citations
#2276

Rethinking Training for De-biasing Text-to-Image Generation: Unlocking the Potential of Stable Diffusion

Eunji Kim, Siwon Kim, Minjun Park et al.

CVPR 2025 • poster • arXiv:2408.12692 • 13 citations
#2277

Adding Conditional Control to Diffusion Models with Reinforcement Learning

Yulai Zhao, Masatoshi Uehara, Gabriele Scalia et al.

ICLR 2025 • poster • arXiv:2406.12120 • 13 citations
#2278

Measuring Non-Adversarial Reproduction of Training Data in Large Language Models

Michael Aerni, Javier Rando, Edoardo Debenedetti et al.

ICLR 2025 • poster • arXiv:2411.10242 • 13 citations
#2279

Putting the Value Back in RL: Better Test-Time Scaling by Unifying LLM Reasoners With Verifiers

Kusha Sareen, Morgane M Moss, Alessandro Sordoni et al.

COLM 2025 • paper • arXiv:2505.04842 • 13 citations
#2280

Efficient Inference for Large Language Model-based Generative Recommendation

Xinyu Lin, Chaoqun Yang, Wenjie Wang et al.

ICLR 2025 • poster • arXiv:2410.05165 • 13 citations
#2281

TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models

Makoto Shing, Kou Misaki, Han Bao et al.

ICLR 2025 • oral • arXiv:2501.16937 • 13 citations
#2282

What Makes a Maze Look Like a Maze?

Joy Hsu, Jiayuan Mao, Joshua B Tenenbaum et al.

ICLR 2025 • poster • arXiv:2409.08202 • 13 citations
#2283

Consistent and Controllable Image Animation with Motion Diffusion Models

Xin Ma, Yaohui Wang, Gengyun Jia et al.

CVPR 2025 • poster • arXiv:2407.15642 • 13 citations
#2284

PILAF: Optimal Human Preference Sampling for Reward Modeling

Yunzhen Feng, Ariel Kwiatkowski, Kunhao Zheng et al.

ICML 2025 • poster • arXiv:2502.04270 • 13 citations
#2285

Truncated Consistency Models

Sangyun Lee, Yilun Xu, Tomas Geffner et al.

ICLR 2025 • poster • arXiv:2410.14895 • 13 citations
#2286

Detecting High-Stakes Interactions with Activation Probes

Alex McKenzie, Urja Pawar, Phil Blandfort et al.

NEURIPS 2025 • poster • arXiv:2506.10805 • 13 citations
#2287

Event-based Video Super-Resolution via State Space Models

Zeyu Xiao, Xinchao Wang

CVPR 2025 • poster • 13 citations
#2288

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

Jiyuan Shi, Xinzhe Liu, Dewei Wang et al.

NEURIPS 2025 • poster • arXiv:2504.14305 • 13 citations
#2289

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Hao Zhong, Muzhi Zhu, Zongze Du et al.

NEURIPS 2025 • oral • arXiv:2505.20256 • 13 citations
#2290

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production

Shengeng Tang, Jiayi He, Dan Guo et al.

AAAI 2025 • paper • arXiv:2412.13609 • 13 citations
#2291

Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis

Yu Yuan, Xijun Wang, Yichen Sheng et al.

CVPR 2025 • highlight • arXiv:2412.02168 • 13 citations
#2292

CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion

Yunlong Tang, Gen Zhan, Li Yang et al.

AAAI 2025 • paper • arXiv:2408.12009 • 13 citations
#2293

Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models

Fu-Yun Wang, Yunhao Shui, Jingtan Piao et al.

ICLR 2025 • poster • arXiv:2505.11245 • 13 citations
#2294

RI-MAE: Rotation-Invariant Masked AutoEncoders for Self-Supervised Point Cloud Representation Learning

Kunming Su, Qiuxia Wu, Panpan Cai et al.

AAAI 2025 • paper • arXiv:2409.00353 • 13 citations
#2295

Bayesian scaling laws for in-context learning

Aryaman Arora, Dan Jurafsky, Christopher Potts et al.

COLM 2025 • paper • arXiv:2410.16531 • 13 citations
#2296

AWRaCLe: All-Weather Image Restoration Using Visual In-Context Learning

Sudarshan Rajagopalan, Vishal M. Patel

AAAI 2025 • paper • arXiv:2409.00263 • 13 citations
#2297

Eve: Efficient Multimodal Vision Language Models with Elastic Visual Experts

Miao Rang, Zhenni Bi, Chuanjian Liu et al.

AAAI 2025 • paper • arXiv:2501.04322 • 13 citations
#2298

Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation

Yingjie Chen, Yifang Men, Yuan Yao et al.

ICCV 2025 • poster • arXiv:2501.05020 • 13 citations
#2299

GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting

Junzhe Jiang, Chun Gu, Yurui Chen et al.

ICLR 2025 • poster • arXiv:2501.13971 • 13 citations
#2300

MindJourney: Test-Time Scaling with World Models for Spatial Reasoning

Yuncong Yang, Jiageng Liu, Zheyuan Zhang et al.

NEURIPS 2025 • poster • arXiv:2507.12508 • 13 citations
#2301

Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining

Daouda Sow, Herbert Woisetschläger, Saikiran Bulusu et al.

ICLR 2025 • poster • arXiv:2502.06733 • 13 citations
#2302

An Engorgio Prompt Makes Large Language Model Babble on

Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang et al.

ICLR 2025 • poster • arXiv:2412.19394 • 13 citations
#2303

Referring to Any Person

Qing Jiang, Lin Wu, Zhaoyang Zeng et al.

ICCV 2025 • poster • arXiv:2503.08507 • 13 citations
#2304

Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling

Guiyu Zhang, Huan-ang Gao, Zijian Jiang et al.

ICLR 2025 • poster • arXiv:2410.11236 • 13 citations
#2305

Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing

Jihyun Janice Ahn, Wenpeng Yin

COLM 2025 • paper • arXiv:2504.01282 • 13 citations
#2306

MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants

Zeyu Zhang, Quanyu Dai, Luyu Chen et al.

NEURIPS 2025 • poster • arXiv:2409.20163 • 13 citations
#2307

SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks

Hwiwon Lee, Ziqi Zhang, Hanxiao Lu et al.

NEURIPS 2025 • poster • arXiv:2506.11791 • 13 citations
#2308

MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models

Jingwei Xu, Junyu Lai, Yunpeng Huang

ICLR 2025 • poster • arXiv:2405.13053 • 13 citations
#2309

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset

Yingzi Ma, Jiongxiao Wang, Fei Wang et al.

ICLR 2025 • poster • arXiv:2411.03554 • 13 citations
#2310

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Xueqing Wu, Yuheng Ding, Bingxuan Li et al.

CVPR 2025 • poster • arXiv:2412.02172 • 13 citations
#2311

CoA-VLA: Improving Vision-Language-Action Models via Visual-Text Chain-of-Affordance

Jinming Li, Yichen Zhu, Zhibin Tang et al.

ICCV 2025 • poster • 13 citations
#2312

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

Ozgur Kara, Krishna Kumar Singh, Feng Liu et al.

CVPR 2025 • poster • arXiv:2505.07652 • 13 citations
#2313

Ward: Provable RAG Dataset Inference via LLM Watermarks

Nikola Jovanović, Robin Staab, Maximilian Baader et al.

ICLR 2025 • poster • arXiv:2410.03537 • 13 citations
#2314

Causal Composition Diffusion Model for Closed-loop Traffic Generation

Haohong Lin, Xin Huang, Tung Phan-Minh et al.

CVPR 2025 • poster • arXiv:2412.17920 • 13 citations
#2315

UFM: A Simple Path towards Unified Dense Correspondence with Flow

Yuchen Zhang, Nikhil Keetha, Chenwei Lyu et al.

NEURIPS 2025 • poster • arXiv:2506.09278 • 13 citations
#2316

Compression of 3D Gaussian Splatting with Optimized Feature Planes and Standard Video Codecs

Soonbin Lee, Fangwen Shu, Yago Sanchez de la Fuente et al.

ICCV 2025 • poster • arXiv:2501.03399 • 13 citations
#2317

SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing

Seokhyeon Hong, Chaelin Kim, Serin Yoon et al.

CVPR 2025 • poster • arXiv:2503.13836 • 13 citations
#2318

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Kongcheng Zhang, QI YAO, Shunyu Liu et al.

NEURIPS 2025 • poster • arXiv:2506.08745 • 13 citations
#2319

UniNet: A Contrastive Learning-guided Unified Framework with Feature Selection for Anomaly Detection

Shun Wei, Jielin Jiang, Xiaolong Xu

CVPR 2025 • poster • 13 citations
#2320

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Yaming Yang, Dilxat Muhtar, Yelong Shen et al.

AAAI 2025 • paper • arXiv:2410.09437 • 13 citations
#2321

On the Relationship Between Monotone and Squared Probabilistic Circuits

Benjie Wang, Guy Van den Broeck

AAAI 2025 • paper • arXiv:2408.00876 • 13 citations
#2322

HEROS-GAN: Honed-Energy Regularized and Optimal Supervised GAN for Enhancing Accuracy and Range of Low-Cost Accelerometers

Yifeng Wang, Yi Zhao

AAAI 2025 • paper • arXiv:2502.18064 • 13 citations
#2323

Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code

Augusto B. Corrêa, André G. Pereira, Jendrik Seipp

NEURIPS 2025 • poster • arXiv:2503.18809 • 13 citations
#2324

How Contaminated Is Your Benchmark? Measuring Dataset Leakage in Large Language Models with Kernel Divergence

Hyeong Kyu Choi, Maxim Khanov, Hongxin Wei et al.

ICML 2025 • poster • 13 citations
#2325

BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization

Xueyang Zhou, Guiyao Tie, Guowen Zhang et al.

NEURIPS 2025 • poster • arXiv:2505.16640 • 13 citations
#2326

Exploring More from Multiple Gait Modalities for Human Identification

Dongyang Jin, Chao Fan, Weihua Chen et al.

AAAI 2025 • paper • arXiv:2412.11495 • 13 citations
#2327

CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

Song Wang, Peng Wang, Tong Zhou et al.

ICLR 2025 • poster • arXiv:2407.02408 • 13 citations
#2328

Establishing Best Practices in Building Rigorous Agentic Benchmarks

Yuxuan Zhu, Tengjun Jin, Yada Pruksachatkun et al.

NEURIPS 2025 • poster • arXiv:2507.02825 • 13 citations
#2329

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

Gaurav Sahu, Abhay Puri, Juan A. Rodriguez et al.

ICLR 2025 • poster • arXiv:2407.06423 • 13 citations
#2330

Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models

Thao Nguyen, Yang Li, Olga Golovneva et al.

COLM 2025 • paper • arXiv:2506.04689 • 13 citations
#2331

Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups

Yuchen Zhu, Tianrong Chen, Lingkai Kong et al.

ICLR 2025 • poster • arXiv:2405.16381 • 13 citations
#2332

Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo

Zachary Charles, Gabriel Teston, Lucio Dery et al.

NEURIPS 2025 • spotlight • arXiv:2503.09799 • 13 citations
#2333

Standardizing Structural Causal Models

Weronika Ormaniec, Scott Sussex, Lars Lorch et al.

ICLR 2025 • poster • arXiv:2406.11601 • 13 citations
#2334

MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models

Jiachun Li, Pengfei Cao, Zhuoran Jin et al.

ICLR 2025 • poster • arXiv:2410.09542 • 13 citations
#2335

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching

Bin Wang, Fan Wu, Linke Ouyang et al.

CVPR 2025 • poster • arXiv:2409.03643 • 13 citations
#2336

Mixture of Attentions For Speculative Decoding

Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras et al.

ICLR 2025 • poster • arXiv:2410.03804 • 13 citations
#2337

EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code

Yuhao Qing, Boyu Zhu, Mingzhe Du et al.

NEURIPS 2025 • poster • arXiv:2505.13004 • 13 citations
#2338

On a Connection Between Imitation Learning and RLHF

Teng Xiao, Yige Yuan, Mingxiao Li et al.

ICLR 2025 • poster • arXiv:2503.05079 • 13 citations
#2339

Detect Anything 3D in the Wild

Hanxue Zhang, Haoran Jiang, Qingsong Yao et al.

ICCV 2025 • poster • arXiv:2504.07958 • 13 citations
#2340

Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints

Ming Dai, Jian Li, Jiedong Zhuang et al.

AAAI 2025 • paper • arXiv:2501.06710 • 13 citations
#2341

Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties

Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.

NEURIPS 2025 • poster • arXiv:2506.05744 • 13 citations
#2342

Benchmarking LLMs' Judgments with No Gold Standard

Shengwei Xu, Yuxuan Lu, Grant Schoenebeck et al.

ICLR 2025 • poster • arXiv:2411.07127 • 13 citations
#2343

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

Ziang Yan, Yinan He, Xinhao Li et al.

NEURIPS 2025 • oral • arXiv:2509.21100 • 13 citations
#2344

Scaling Autonomous Agents via Automatic Reward Modeling And Planning

Zhenfang Chen, Delin Chen, Rui Sun et al.

ICLR 2025 • poster • arXiv:2502.12130 • 13 citations
#2345

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos

Chin-Yang Lin, Cheng Sun, Fu-En Yang et al.

ICCV 2025 • poster • arXiv:2508.14041 • 13 citations
#2346

MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow

Hanzhuo Huang, Yuan Liu, Ge Zheng et al.

ICLR 2025 • oral • arXiv:2502.11697 • 13 citations
#2347

Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations

Ji-An Li, Huadong Xiong, Robert Wilson et al.

NEURIPS 2025 • poster • arXiv:2505.13763 • 13 citations
#2348

Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful

Martin Marek, Sanae Lotfi, Aditya Somasundaram et al.

NEURIPS 2025 • poster • arXiv:2507.07101 • 13 citations
#2349

Let LRMs Break Free from Overthinking via Self-Braking Tuning

Haoran Zhao, Yuchen Yan, Yongliang Shen et al.

NEURIPS 2025 • poster • arXiv:2505.14604 • 13 citations
#2350

Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals

Nate Gillman, Charles Herrmann, Michael Freeman et al.

NEURIPS 2025 • poster • arXiv:2505.19386 • 13 citations
#2351

Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization

Zhanfeng Mo, Long-Kai Huang, Sinno Jialin Pan

ICLR 2025 • poster • 13 citations
#2352

FSTA-SNN: Frequency-Based Spatial-Temporal Attention Module for Spiking Neural Networks

Kairong Yu, Tianqing Zhang, Hongwei Wang et al.

AAAI 2025 • paper • 13 citations
#2353

Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging

Jinluan Yang, Dingnan Jin, Anke Tang et al.

NEURIPS 2025 • poster • arXiv:2502.06876 • 13 citations
#2354

Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping

Ziye Huang, Haoqi Yuan, Yuhui Fu et al.

ICLR 2025 • poster • arXiv:2410.02475 • 13 citations
#2355

Grounding Continuous Representations in Geometry: Equivariant Neural Fields

David Wessels, David Knigge, Riccardo Valperga et al.

ICLR 2025 • poster • arXiv:2406.05753 • 13 citations
#2356

Hyperbolic Fine-Tuning for Large Language Models

Menglin Yang, Ram Samarth B B, Aosong Feng et al.

NEURIPS 2025 • spotlight • arXiv:2410.04010 • 13 citations
#2357

AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution

Fengyuan Liu, Nikhil Kandpal, Colin Raffel

ICLR 2025 • poster • arXiv:2411.15102 • 13 citations
#2358

Unlearning through Knowledge Overwriting: Reversible Federated Unlearning via Selective Sparse Adapter

Zhengyi Zhong, Weidong Bao, Ji Wang et al.

CVPR 2025 • poster • arXiv:2502.20709 • 13 citations
#2359

Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation

Yuheng Shi, Minjing Dong, Chang Xu

ICCV 2025 • poster • arXiv:2411.09219 • 13 citations
#2360

ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments

Hojae Han, seung-won hwang, Rajhans Samdani et al.

ICLR 2025 • poster • arXiv:2502.19852 • 13 citations
#2361

ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression

Wei Jiang, Junru Li, Kai Zhang et al.

CVPR 2025 • poster • arXiv:2410.09706 • 13 citations
#2362

TANGO: Training-free Embodied AI Agents for Open-world Tasks

Filippo Ziliotto, Tommaso Campari, Luciano Serafini et al.

CVPR 2025 • poster • arXiv:2412.10402 • 13 citations
#2363

High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws

Muhammed Ildiz, Halil Gozeten, Ege Taga et al.

ICLR 2025 • poster • arXiv:2410.18837 • 13 citations
#2364

ASIGN: An Anatomy-aware Spatial Imputation Graphic Network for 3D Spatial Transcriptomics

Junchao Zhu, Ruining Deng, Tianyuan Yao et al.

CVPR 2025 • poster • arXiv:2412.03026 • 13 citations
#2365

Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization

Tao Zhang, Cheng Da, Kun Ding et al.

NEURIPS 2025 • poster • arXiv:2502.01051 • 13 citations
#2366

Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation

Lokesh Veeramacheneni, Moritz Wolter, Hilde Kuehne et al.

ICLR 2025 • poster • arXiv:2312.15289 • 13 citations
#2367

InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting

Chenxin Li, Hengyu Liu, Zhiwen Fan et al.

ICLR 2025 • poster • 13 citations
#2368

Neuroplastic Expansion in Deep Reinforcement Learning

Jiashun Liu, Johan S Obando Ceron, Aaron Courville et al.

ICLR 2025 • poster • arXiv:2410.07994 • 13 citations
#2369

Are Large Vision Language Models Good Game Players?

Xinyu Wang, Bohan Zhuang, Qi Wu

ICLR 2025 • poster • arXiv:2503.02358 • 13 citations
#2370

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

Shuwei Shi, Biao Gong, Xi Chen et al.

CVPR 2025 • poster • arXiv:2412.05848 • 13 citations
#2371

Human Motion Instruction Tuning

Lei Li, Sen Jia, Jianhao Wang et al.

CVPR 2025 • poster • arXiv:2411.16805 • 13 citations
#2372

Puppeteer: Rig and Animate Your 3D Models

Chaoyue Song, Xiu Li, Fan Yang et al.

NEURIPS 2025 • oral • arXiv:2508.10898 • 13 citations
#2373

A Periodic Bayesian Flow for Material Generation

Hanlin Wu, Yuxuan Song, Jingjing Gong et al.

ICLR 2025 • poster • arXiv:2502.02016 • 13 citations
#2374

HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation

Kun Liu, Qi Liu, Xinchen Liu et al.

CVPR 2025 • poster • arXiv:2503.23715 • 13 citations
#2375

MBQ: Modality-Balanced Quantization for Large Vision-Language Models

Shiyao Li, Yingchun Hu, Xuefei Ning et al.

CVPR 2025 • poster • arXiv:2412.19509 • 13 citations
#2376

Do LLMs estimate uncertainty well in instruction-following?

Juyeon Heo, Miao Xiong, Christina Heinze-Deml et al.

ICLR 2025 • poster • arXiv:2410.14582 • 13 citations
#2377

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.

ICCV 2025 • poster • arXiv:2503.20491 • 13 citations
#2378

Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis

Zikun Zhang, Zixiang Chen, Quanquan Gu

ICLR 2025 • poster • arXiv:2410.02321 • 13 citations
#2379

Learning Transformer-based World Models with Contrastive Predictive Coding

Maxime Burchi, Radu Timofte

ICLR 2025 • oral • arXiv:2503.04416 • 13 citations
#2380

MapExpert: Online HD Map Construction with Simple and Efficient Sparse Map Element Expert

Dapeng Zhang, Dayu Chen, Peng Zhi et al.

AAAI 2025 • paper • arXiv:2412.12704 • 13 citations
#2381

Conformal Prediction for Causal Effects of Continuous Treatments

Maresa Schröder, Dennis Frauen, Jonas Schweisthal et al.

NEURIPS 2025 • poster • arXiv:2407.03094 • 13 citations
#2382

Surprising Effectiveness of pretraining Ternary Language Model at Scale

Ayush Kaushal, Tejas Vaidhya, Arnab Mondal et al.

ICLR 2025 • poster • arXiv:2407.12327 • 13 citations
#2383

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Vaishnavh Nagarajan, Chen Wu, Charles Ding et al.

ICML 2025 • oral • arXiv:2504.15266 • 13 citations
#2384

OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models

William Chen, Jinchuan Tian, Yifan Peng et al.

ICML 2025 • poster • arXiv:2502.10373 • 13 citations
#2385

Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning

Junming Liu, Siyuan Meng, Yanting Gao et al.

ICCV 2025 • poster • arXiv:2503.12972 • 13 citations
#2386

A Unifying Framework for Representation Learning

Shaden Alshammari, John Hershey, Axel Feldmann et al.

ICLR 2025 • poster • arXiv:2504.16929 • 13 citations
#2387

C-CLIP: Multimodal Continual Learning for Vision-Language Model

Wenzhuo Liu, Fei Zhu, Longhui Wei et al.

ICLR 2025 • poster • 13 citations
#2388

AdaWM: Adaptive World Model based Planning for Autonomous Driving

Hang Wang, Xin Ye, Feng Tao et al.

ICLR 2025 • poster • arXiv:2501.13072 • 13 citations
#2389

TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

Kanghui Ning, Zijie Pan, Yu Liu et al.

NEURIPS 2025 • poster • arXiv:2503.07649 • 13 citations
#2390

Synthetic Data is an Elegant GIFT for Continual Vision-Language Models

Bin Wu, Wuxuan Shi, Jinqiao Wang et al.

CVPR 2025 • poster • arXiv:2503.04229 • 13 citations
#2391

KITS: Inductive Spatio-Temporal Kriging with Increment Training Strategy

Qianxiong Xu, Cheng Long, Ziyue Li et al.

AAAI 2025 • paper • arXiv:2311.02565 • 13 citations
#2392

ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models

Hao Yin, Guangzong Si, Zilei Wang

CVPR 2025 • poster • arXiv:2503.13107 • 13 citations
#2393

RoboTron-Mani: All-in-One Multimodal Large Model for Robotic Manipulation

Feng Yan, Fanfan Liu, Yiyang Huang et al.

ICCV 2025 • poster • arXiv:2412.07215 • 13 citations
#2394

PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference

Jiarui Fang, Jinzhe Pan, Aoyu Li et al.

NEURIPS 2025 • poster • arXiv:2405.14430 • 13 citations
#2395

AG-VPReID: A Challenging Large-Scale Benchmark for Aerial-Ground Video-based Person Re-Identification

Huy Nguyen, Kien Nguyen Thanh, Akila Pemasiri et al.

CVPR 2025 • poster • arXiv:2503.08121 • 13 citations
#2396

ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration

Andrew Estornell, Jean-Francois Ton, Yuanshun Yao et al.

ICLR 2025 • poster • arXiv:2411.00053 • 13 citations
#2397

Faster Algorithms for Structured Linear and Kernel Support Vector Machines

Yuzhou Gu, Zhao Song, Lichen Zhang

ICLR 2025 • poster • arXiv:2307.07735 • 13 citations
#2398

From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit

Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.

NEURIPS 2025 • poster • arXiv:2506.03093 • 13 citations
#2399

Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding

Yixiong Fang, Ziran Yang, Zhaorun Chen et al.

NEURIPS 2025 • poster • arXiv:2412.06474 • 13 citations
#2400

Amplifier: Bringing Attention to Neglected Low-Energy Components in Time Series Forecasting

Jingru Fei, Kun Yi, Wei Fan et al.

AAAI 2025 • paper • arXiv:2501.17216 • 13 citations