Most Cited 2025 "3d all-atom models" Papers

22,274 papers found • Page 12 of 112

#2201

Unleashing Vecset Diffusion Model for Fast Shape Generation

Zeqiang Lai, Zhao Yunfei, Zibo Zhao et al.

ICCV 2025 • highlight • arXiv:2503.16302
14 citations
#2202

Presto! Distilling Steps and Layers for Accelerating Music Generation

Zachary Novack, Ge Zhu, Jonah Casebeer et al.

ICLR 2025 • poster • arXiv:2410.05167
14 citations
#2203

The ODE Method for Stochastic Approximation and Reinforcement Learning with Markovian Noise

Shuze Daniel Liu, Shuhang Chen, Shangtong Zhang

NEURIPS 2025 • oral • arXiv:2401.07844
13 citations
#2204

KVzip: Query-Agnostic KV Cache Compression with Context Reconstruction

Jang-Hyun Kim, Jinuk Kim, Sangwoo Kwon et al.

NEURIPS 2025 • oral • arXiv:2505.23416
13 citations
#2205

Amplifier: Bringing Attention to Neglected Low-Energy Components in Time Series Forecasting

Jingru Fei, Kun Yi, Wei Fan et al.

AAAI 2025 • paper • arXiv:2501.17216
13 citations
#2206

Through-The-Mask: Mask-based Motion Trajectories for Image-to-Video Generation

Guy Yariv, Yuval Kirstain, Amit Zohar et al.

CVPR 2025 • poster • arXiv:2501.03059
13 citations
#2207

HRAvatar: High-Quality and Relightable Gaussian Head Avatar

Dongbin Zhang, Yunfei Liu, Lijian Lin et al.

CVPR 2025 • poster • arXiv:2503.08224
13 citations
#2208

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Jiarui Yao, Yifan Hao, Hanning Zhang et al.

NEURIPS 2025 • poster • arXiv:2505.02391
13 citations
#2209

Scenario Dreamer: Vectorized Latent Diffusion for Generating Driving Simulation Environments

Luke Rowe, Roger Girgis, Anthony Gosselin et al.

CVPR 2025 • poster • arXiv:2503.22496
13 citations
#2210

Change3D: Revisiting Change Detection and Captioning from A Video Modeling Perspective

Duowang Zhu, Xiaohu Huang, Haiyan Huang et al.

CVPR 2025 • highlight • arXiv:2503.18803
13 citations
#2211

HarmAug: Effective Data Augmentation for Knowledge Distillation of Safety Guard Models

Seanie Lee, Haebin Seong, Dong Bok Lee et al.

ICLR 2025 • poster • arXiv:2410.01524
13 citations
#2212

CirT: Global Subseasonal-to-Seasonal Forecasting with Geometry-inspired Transformer

Yang Liu, Zinan Zheng, Jiashun Cheng et al.

ICLR 2025 • oral • arXiv:2502.19750
13 citations
#2213

Local Conditional Controlling for Text-to-Image Diffusion Models

Yibo Zhao, Liang Peng, Yang Yang et al.

AAAI 2025 • paper • arXiv:2312.08768
13 citations
#2214

AIGI-Holmes: Towards Explainable and Generalizable AI-Generated Image Detection via Multimodal Large Language Models

Ziyin Zhou, Yunpeng Luo, Yuanchen Wu et al.

ICCV 2025 • poster • arXiv:2507.02664
13 citations
#2215

Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing

Pengcheng Zhao, Jinxing Zhou, Yang Zhao et al.

AAAI 2025 • paper • arXiv:2412.11248
13 citations
#2216

ATLAS: Autoformalizing Theorems through Lifting, Augmentation, and Synthesis of Data

Xiaoyang Liu, Kangjie Bao, Jiashuo Zhang et al.

NEURIPS 2025 • poster • arXiv:2502.05567
13 citations
#2217

PILAF: Optimal Human Preference Sampling for Reward Modeling

Yunzhen Feng, Ariel Kwiatkowski, Kunhao Zheng et al.

ICML 2025 • poster • arXiv:2502.04270
13 citations
#2218

HiRes-LLaVA: Restoring Fragmentation Input in High-Resolution Large Vision-Language Models

Runhui Huang, Xinpeng Ding, Chunwei Wang et al.

CVPR 2025 • poster • arXiv:2407.08706
13 citations
#2219

VisionArena: 230k Real World User-VLM Conversations with Preference Labels

Christopher Chou, Lisa Dunlap, Wei-Lin Chiang et al.

CVPR 2025 • poster • arXiv:2412.08687
13 citations
#2220

DVP-MVS: Synergize Depth-Edge and Visibility Prior for Multi-View Stereo

Zhenlong Yuan, Jinguo Luo, Fei Shen et al.

AAAI 2025 • paper • arXiv:2412.11578
13 citations
#2221

SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs

Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang et al.

ICLR 2025 • poster • arXiv:2405.16325
13 citations
#2222

Build-A-Scene: Interactive 3D Layout Control for Diffusion-Based Image Generation

Abdelrahman Eldesokey, Peter Wonka

ICLR 2025 • poster • arXiv:2408.14819
13 citations
#2223

Rethinking Training for De-biasing Text-to-Image Generation: Unlocking the Potential of Stable Diffusion

Eunji Kim, Siwon Kim, Minjun Park et al.

CVPR 2025 • poster • arXiv:2408.12692
13 citations
#2224

Measuring Non-Adversarial Reproduction of Training Data in Large Language Models

Michael Aerni, Javier Rando, Edoardo Debenedetti et al.

ICLR 2025 • poster • arXiv:2411.10242
13 citations
#2225

Adding Conditional Control to Diffusion Models with Reinforcement Learning

Yulai Zhao, Masatoshi Uehara, Gabriele Scalia et al.

ICLR 2025 • poster • arXiv:2406.12120
13 citations
#2226

Efficient Inference for Large Language Model-based Generative Recommendation

Xinyu Lin, Chaoqun Yang, Wenjie Wang et al.

ICLR 2025 • poster • arXiv:2410.05165
13 citations
#2227

Temporal Query Network for Efficient Multivariate Time Series Forecasting

Shengsheng Lin, Haojun Chen, Haijie Wu et al.

ICML 2025 • oral • arXiv:2505.12917
13 citations
#2228

What Makes a Maze Look Like a Maze?

Joy Hsu, Jiayuan Mao, Joshua B Tenenbaum et al.

ICLR 2025 • poster • arXiv:2409.08202
13 citations
#2229

Contextual Bandits for Unbounded Context Distributions

Puning Zhao, Rongfei Fan, Shaowei Wang et al.

ICML 2025 • poster • arXiv:2408.09655
13 citations
#2230

TAID: Temporally Adaptive Interpolated Distillation for Efficient Knowledge Transfer in Language Models

Makoto Shing, Kou Misaki, Han Bao et al.

ICLR 2025 • oral • arXiv:2501.16937
13 citations
#2231

Consistent and Controllable Image Animation with Motion Diffusion Models

Xin Ma, Yaohui Wang, Gengyun Jia et al.

CVPR 2025 • poster • arXiv:2407.15642
13 citations
#2232

Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

Jiyuan Shi, Xinzhe Liu, Dewei Wang et al.

NEURIPS 2025 • poster • arXiv:2504.14305
13 citations
#2233

Event-based Video Super-Resolution via State Space Models

Zeyu Xiao, Xinchao Wang

CVPR 2025 • poster
13 citations
#2234

Detecting High-Stakes Interactions with Activation Probes

Alex McKenzie, Urja Pawar, Phil Blandfort et al.

NEURIPS 2025 • poster • arXiv:2506.10805
13 citations
#2235

Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

Hao Zhong, Muzhi Zhu, Zongze Du et al.

NEURIPS 2025 • oral • arXiv:2505.20256
13 citations
#2236

Ctrl-U: Robust Conditional Image Generation via Uncertainty-aware Reward Modeling

Guiyu Zhang, Huan-ang Gao, Zijian Jiang et al.

ICLR 2025 • poster • arXiv:2410.11236
13 citations
#2237

Bayesian scaling laws for in-context learning

Aryaman Arora, Dan Jurafsky, Christopher Potts et al.

COLM 2025 • paper • arXiv:2410.16531
13 citations
#2238

Generative Photography: Scene-Consistent Camera Control for Realistic Text-to-Image Synthesis

Yu Yuan, Xijun Wang, Yichen Sheng et al.

CVPR 2025 • highlight • arXiv:2412.02168
13 citations
#2239

RI-MAE: Rotation-Invariant Masked AutoEncoders for Self-Supervised Point Cloud Representation Learning

Kunming Su, Qiuxia Wu, Panpan Cai et al.

AAAI 2025 • paper • arXiv:2409.00353
13 citations
#2240

GS-LiDAR: Generating Realistic LiDAR Point Clouds with Panoramic Gaussian Splatting

Junzhe Jiang, Chun Gu, Yurui Chen et al.

ICLR 2025 • poster • arXiv:2501.13971
13 citations
#2241

Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation

Yingjie Chen, Yifang Men, Yuan Yao et al.

ICCV 2025 • poster • arXiv:2501.05020
13 citations
#2242

Sign-IDD: Iconicity Disentangled Diffusion for Sign Language Production

Shengeng Tang, Jiayi He, Dan Guo et al.

AAAI 2025 • paper • arXiv:2412.13609
13 citations
#2243

MemSim: A Bayesian Simulator for Evaluating Memory of LLM-based Personal Assistants

Zeyu Zhang, Quanyu Dai, Luyu Chen et al.

NEURIPS 2025 • poster • arXiv:2409.20163
13 citations
#2244

Referring to Any Person

Qing Jiang, Lin Wu, Zhaoyang Zeng et al.

ICCV 2025 • poster • arXiv:2503.08507
13 citations
#2245

Truncated Consistency Models

Sangyun Lee, Yilun Xu, Tomas Geffner et al.

ICLR 2025 • poster • arXiv:2410.14895
13 citations
#2246

MeteoRA: Multiple-tasks Embedded LoRA for Large Language Models

Jingwei Xu, Junyu Lai, Yunpeng Huang

ICLR 2025 • poster • arXiv:2405.13053
13 citations
#2247

Eve: Efficient Multimodal Vision Language Models with Elastic Visual Experts

Miao Rang, Zhenni Bi, Chuanjian Liu et al.

AAAI 2025 • paper • arXiv:2501.04322
13 citations
#2248

SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks

Hwiwon Lee, Ziqi Zhang, Hanxiao Lu et al.

NEURIPS 2025 • poster • arXiv:2506.11791
13 citations
#2249

Classical Planning with LLM-Generated Heuristics: Challenging the State of the Art with Python Code

Augusto B. Corrêa, André G. Pereira, Jendrik Seipp

NEURIPS 2025 • poster • arXiv:2503.18809
13 citations
#2250

Dynamic Loss-Based Sample Reweighting for Improved Large Language Model Pretraining

Daouda Sow, Herbert Woisetschläger, Saikiran Bulusu et al.

ICLR 2025 • poster • arXiv:2502.06733
13 citations
#2251

MindJourney: Test-Time Scaling with World Models for Spatial Reasoning

Yuncong Yang, Jiageng Liu, Zheyuan Zhang et al.

NEURIPS 2025 • poster • arXiv:2507.12508
13 citations
#2252

Diffusion-NPO: Negative Preference Optimization for Better Preference Aligned Generation of Diffusion Models

Fu-Yun Wang, Yunhao Shui, Jingtan Piao et al.

ICLR 2025 • poster • arXiv:2505.11245
13 citations
#2253

CaRDiff: Video Salient Object Ranking Chain of Thought Reasoning for Saliency Prediction with Diffusion

Yunlong Tang, Gen Zhan, Li Yang et al.

AAAI 2025 • paper • arXiv:2408.12009
13 citations
#2254

VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning

Xueqing Wu, Yuheng Ding, Bingxuan Li et al.

CVPR 2025 • poster • arXiv:2412.02172
13 citations
#2255

CoA-VLA: Improving Vision-Language-Action Models via Visual-Text Chain-of-Affordance

Jinming Li, Yichen Zhu, Zhibin Tang et al.

ICCV 2025 • poster
13 citations
#2256

ShotAdapter: Text-to-Multi-Shot Video Generation with Diffusion Models

Ozgur Kara, Krishna Kumar Singh, Feng Liu et al.

CVPR 2025 • poster • arXiv:2505.07652
13 citations
#2257

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset

Yingzi Ma, Jiongxiao Wang, Fei Wang et al.

ICLR 2025 • poster • arXiv:2411.03554
13 citations
#2258

Ward: Provable RAG Dataset Inference via LLM Watermarks

Nikola Jovanović, Robin Staab, Maximilian Baader et al.

ICLR 2025 • poster • arXiv:2410.03537
13 citations
#2259

Causal Composition Diffusion Model for Closed-loop Traffic Generation

Haohong Lin, Xin Huang, Tung Phan-Minh et al.

CVPR 2025 • poster • arXiv:2412.17920
13 citations
#2260

UFM: A Simple Path towards Unified Dense Correspondence with Flow

Yuchen Zhang, Nikhil Keetha, Chenwei Lyu et al.

NEURIPS 2025 • poster • arXiv:2506.09278
13 citations
#2261

SALAD: Skeleton-aware Latent Diffusion for Text-driven Motion Generation and Editing

Seokhyeon Hong, Chaelin Kim, Serin Yoon et al.

CVPR 2025 • poster • arXiv:2503.13836
13 citations
#2262

An Engorgio Prompt Makes Large Language Model Babble on

Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang et al.

ICLR 2025 • poster • arXiv:2412.19394
13 citations
#2263

Compression of 3D Gaussian Splatting with Optimized Feature Planes and Standard Video Codecs

Soonbin Lee, Fangwen Shu, Yago Sanchez de la Fuente et al.

ICCV 2025 • poster • arXiv:2501.03399
13 citations
#2264

UniNet: A Contrastive Learning-guided Unified Framework with Feature Selection for Anomaly Detection

Shun Wei, Jielin Jiang, Xiaolong Xu

CVPR 2025 • poster
13 citations
#2265

Consistent Paths Lead to Truth: Self-Rewarding Reinforcement Learning for LLM Reasoning

Kongcheng Zhang, Qi Yao, Shunyu Liu et al.

NEURIPS 2025 • poster • arXiv:2506.08745
13 citations
#2266

How Contaminated Is Your Benchmark? Measuring Dataset Leakage in Large Language Models with Kernel Divergence

Hyeong Kyu Choi, Maxim Khanov, Hongxin Wei et al.

ICML 2025 • poster
13 citations
#2267

AWRaCLe: All-Weather Image Restoration Using Visual In-Context Learning

Sudarshan Rajagopalan, Vishal M. Patel

AAAI 2025 • paper • arXiv:2409.00263
13 citations
#2268

Prompt-Reverse Inconsistency: LLM Self-Inconsistency Beyond Generative Randomness and Prompt Paraphrasing

Jihyun Janice Ahn, Wenpeng Yin

COLM 2025 • paper • arXiv:2504.01282
13 citations
#2269

BadVLA: Towards Backdoor Attacks on Vision-Language-Action Models via Objective-Decoupled Optimization

Xueyang Zhou, Guiyao Tie, Guowen Zhang et al.

NEURIPS 2025 • poster • arXiv:2505.16640
13 citations
#2270

MTL-LoRA: Low-Rank Adaptation for Multi-Task Learning

Yaming Yang, Dilxat Muhtar, Yelong Shen et al.

AAAI 2025 • paper • arXiv:2410.09437
13 citations
#2271

On the Relationship Between Monotone and Squared Probabilistic Circuits

Benjie Wang, Guy Van den Broeck

AAAI 2025 • paper • arXiv:2408.00876
13 citations
#2272

CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models

Song Wang, Peng Wang, Tong Zhou et al.

ICLR 2025 • poster • arXiv:2407.02408
13 citations
#2273

Establishing Best Practices in Building Rigorous Agentic Benchmarks

Yuxuan Zhu, Tengjun Jin, Yada Pruksachatkun et al.

NEURIPS 2025 • poster • arXiv:2507.02825
13 citations
#2274

Exploring More from Multiple Gait Modalities for Human Identification

Dongyang Jin, Chao Fan, Weihua Chen et al.

AAAI 2025 • paper • arXiv:2412.11495
13 citations
#2275

HEROS-GAN: Honed-Energy Regularized and Optimal Supervised GAN for Enhancing Accuracy and Range of Low-Cost Accelerometers

Yifeng Wang, Yi Zhao

AAAI 2025 • paper • arXiv:2502.18064
13 citations
#2276

Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups

Yuchen Zhu, Tianrong Chen, Lingkai Kong et al.

ICLR 2025 • poster • arXiv:2405.16381
13 citations
#2277

Standardizing Structural Causal Models

Weronika Ormaniec, Scott Sussex, Lars Lorch et al.

ICLR 2025 • poster • arXiv:2406.11601
13 citations
#2278

Communication-Efficient Language Model Training Scales Reliably and Robustly: Scaling Laws for DiLoCo

Zachary Charles, Gabriel Teston, Lucio Dery et al.

NEURIPS 2025 • spotlight • arXiv:2503.09799
13 citations
#2279

MIRAGE: Evaluating and Explaining Inductive Reasoning Process in Language Models

Jiachun Li, Pengfei Cao, Zhuoran Jin et al.

ICLR 2025 • poster • arXiv:2410.09542
13 citations
#2280

Image Over Text: Transforming Formula Recognition Evaluation with Character Detection Matching

Bin Wang, Fan Wu, Linke Ouyang et al.

CVPR 2025 • poster • arXiv:2409.03643
13 citations
#2281

On a Connection Between Imitation Learning and RLHF

Teng Xiao, Yige Yuan, Mingxiao Li et al.

ICLR 2025 • poster • arXiv:2503.05079
13 citations
#2282

Mixture of Attentions For Speculative Decoding

Matthieu Zimmer, Milan Gritta, Gerasimos Lampouras et al.

ICLR 2025 • poster • arXiv:2410.03804
13 citations
#2283

EffiBench-X: A Multi-Language Benchmark for Measuring Efficiency of LLM-Generated Code

Yuhao Qing, Boyu Zhu, Mingzhe Du et al.

NEURIPS 2025 • poster • arXiv:2505.13004
13 citations
#2284

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

Gaurav Sahu, Abhay Puri, Juan A. Rodriguez et al.

ICLR 2025 • poster • arXiv:2407.06423
13 citations
#2285

Detect Anything 3D in the Wild

Hanxue Zhang, Haoran Jiang, Qingsong Yao et al.

ICCV 2025 • poster • arXiv:2504.07958
13 citations
#2286

Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties

Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.

NEURIPS 2025 • poster • arXiv:2506.05744
13 citations
#2287

Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints

Ming Dai, Jian Li, Jiedong Zhuang et al.

AAAI 2025 • paper • arXiv:2501.06710
13 citations
#2288

Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models

Thao Nguyen, Yang Li, Olga Golovneva et al.

COLM 2025 • paper • arXiv:2506.04689
13 citations
#2289

Benchmarking LLMs' Judgments with No Gold Standard

Shengwei Xu, Yuxuan Lu, Grant Schoenebeck et al.

ICLR 2025 • poster • arXiv:2411.07127
13 citations
#2290

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

Ziang Yan, Yinan He, Xinhao Li et al.

NEURIPS 2025 • oral • arXiv:2509.21100
13 citations
#2291

Scaling Autonomous Agents via Automatic Reward Modeling And Planning

Zhenfang Chen, Delin Chen, Rui Sun et al.

ICLR 2025 • poster • arXiv:2502.12130
13 citations
#2292

LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos

Chin-Yang Lin, Cheng Sun, Fu-En Yang et al.

ICCV 2025 • poster • arXiv:2508.14041
13 citations
#2293

Let LRMs Break Free from Overthinking via Self-Braking Tuning

Haoran Zhao, Yuchen Yan, Yongliang Shen et al.

NEURIPS 2025 • poster • arXiv:2505.14604
13 citations
#2294

Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations

Ji-An Li, Huadong Xiong, Robert Wilson et al.

NEURIPS 2025 • poster • arXiv:2505.13763
13 citations
#2295

Small Batch Size Training for Language Models: When Vanilla SGD Works, and Why Gradient Accumulation is Wasteful

Martin Marek, Sanae Lotfi, Aditya Somasundaram et al.

NEURIPS 2025 • poster • arXiv:2507.07101
13 citations
#2296

Parameter and Memory Efficient Pretraining via Low-rank Riemannian Optimization

Zhanfeng Mo, Long-Kai Huang, Sinno Jialin Pan

ICLR 2025 • poster
13 citations
#2297

Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals

Nate Gillman, Charles Herrmann, Michael Freeman et al.

NEURIPS 2025 • poster • arXiv:2505.19386
13 citations
#2298

InstantSplamp: Fast and Generalizable Stenography Framework for Generative Gaussian Splatting

Chenxin Li, Hengyu Liu, Zhiwen Fan et al.

ICLR 2025 • poster
13 citations
#2299

Mix Data or Merge Models? Balancing the Helpfulness, Honesty, and Harmlessness of Large Language Model via Model Merging

Jinluan Yang, Dingnan Jin, Anke Tang et al.

NEURIPS 2025 • poster • arXiv:2502.06876
13 citations
#2300

Efficient Residual Learning with Mixture-of-Experts for Universal Dexterous Grasping

Ziye Huang, Haoqi Yuan, Yuhui Fu et al.

ICLR 2025 • poster • arXiv:2410.02475
13 citations
#2301

Unlearning through Knowledge Overwriting: Reversible Federated Unlearning via Selective Sparse Adapter

Zhengyi Zhong, Weidong Bao, Ji Wang et al.

CVPR 2025 • poster • arXiv:2502.20709
13 citations
#2302

Harnessing Vision Foundation Models for High-Performance, Training-Free Open Vocabulary Segmentation

Yuheng Shi, Minjing Dong, Chang Xu

ICCV 2025 • poster • arXiv:2411.09219
13 citations
#2303

AttriBoT: A Bag of Tricks for Efficiently Approximating Leave-One-Out Context Attribution

Fengyuan Liu, Nikhil Kandpal, Colin Raffel

ICLR 2025 • poster • arXiv:2411.15102
13 citations
#2304

ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments

Hojae Han, Seung-won Hwang, Rajhans Samdani et al.

ICLR 2025 • poster • arXiv:2502.19852
13 citations
#2305

ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression

Wei Jiang, Junru Li, Kai Zhang et al.

CVPR 2025 • poster • arXiv:2410.09706
13 citations
#2306

TANGO: Training-free Embodied AI Agents for Open-world Tasks

Filippo Ziliotto, Tommaso Campari, Luciano Serafini et al.

CVPR 2025 • poster • arXiv:2412.10402
13 citations
#2307

High-dimensional Analysis of Knowledge Distillation: Weak-to-Strong Generalization and Scaling Laws

Muhammed Ildiz, Halil Gozeten, Ege Taga et al.

ICLR 2025 • poster • arXiv:2410.18837
13 citations
#2308

ASIGN: An Anatomy-aware Spatial Imputation Graphic Network for 3D Spatial Transcriptomics

Junchao Zhu, Ruining Deng, Tianyuan Yao et al.

CVPR 2025 • poster • arXiv:2412.03026
13 citations
#2309

Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization

Tao Zhang, Cheng Da, Kun Ding et al.

NEURIPS 2025 • poster • arXiv:2502.01051
13 citations
#2310

Fréchet Wavelet Distance: A Domain-Agnostic Metric for Image Generation

Lokesh Veeramacheneni, Moritz Wolter, Hilde Kuehne et al.

ICLR 2025 • poster • arXiv:2312.15289
13 citations
#2311

Neuroplastic Expansion in Deep Reinforcement Learning

Jiashun Liu, Johan S Obando Ceron, Aaron Courville et al.

ICLR 2025 • poster • arXiv:2410.07994
13 citations
#2312

Grounding Continuous Representations in Geometry: Equivariant Neural Fields

David Wessels, David Knigge, Riccardo Valperga et al.

ICLR 2025 • poster • arXiv:2406.05753
13 citations
#2313

MotionStone: Decoupled Motion Intensity Modulation with Diffusion Transformer for Image-to-Video Generation

Shuwei Shi, Biao Gong, Xi Chen et al.

CVPR 2025 • poster • arXiv:2412.05848
13 citations
#2314

Human Motion Instruction Tuning

Lei Li, Sen Jia, Jianhao Wang et al.

CVPR 2025 • poster • arXiv:2411.16805
13 citations
#2315

Robust Self-Paced Hashing for Cross-Modal Retrieval with Noisy Labels

Ruitao Pu, Yuan Sun, Yang Qin et al.

AAAI 2025 • paper • arXiv:2501.01699
13 citations
#2316

Puppeteer: Rig and Animate Your 3D Models

Chaoyue Song, Xiu Li, Fan Yang et al.

NEURIPS 2025 • oral • arXiv:2508.10898
13 citations
#2317

MBQ: Modality-Balanced Quantization for Large Vision-Language Models

Shiyao Li, Yingchun Hu, Xuefei Ning et al.

CVPR 2025 • poster • arXiv:2412.19509
13 citations
#2318

HOIGen-1M: A Large-scale Dataset for Human-Object Interaction Video Generation

Kun Liu, Qi Liu, Xinchen Liu et al.

CVPR 2025 • poster • arXiv:2503.23715
13 citations
#2319

C-CLIP: Multimodal Continual Learning for Vision-Language Model

Wenzhuo Liu, Fei Zhu, Longhui Wei et al.

ICLR 2025 • poster
13 citations
#2320

Learning Transformer-based World Models with Contrastive Predictive Coding

Maxime Burchi, Radu Timofte

ICLR 2025 • oral • arXiv:2503.04416
13 citations
#2321

A Periodic Bayesian Flow for Material Generation

Hanlin Wu, Yuxuan Song, Jingjing Gong et al.

ICLR 2025 • poster • arXiv:2502.02016
13 citations
#2322

Do LLMs estimate uncertainty well in instruction-following?

Juyeon Heo, Miao Xiong, Christina Heinze-Deml et al.

ICLR 2025 • poster • arXiv:2410.14582
13 citations
#2323

Sum of Squares Circuits

Lorenzo Loconte, Stefan Mengel, Antonio Vergari

AAAI 2025 • paper • arXiv:2408.11778
13 citations
#2324

VPO: Aligning Text-to-Video Generation Models with Prompt Optimization

Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.

ICCV 2025 • poster • arXiv:2503.20491
13 citations
#2325

Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis

Zikun Zhang, Zixiang Chen, Quanquan Gu

ICLR 2025 • poster • arXiv:2410.02321
13 citations
#2326

Conformal Prediction for Causal Effects of Continuous Treatments

Maresa Schröder, Dennis Frauen, Jonas Schweisthal et al.

NEURIPS 2025 • poster • arXiv:2407.03094
13 citations
#2327

Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning

Junming Liu, Siyuan Meng, Yanting Gao et al.

ICCV 2025 • poster • arXiv:2503.12972
13 citations
#2328

Surprising Effectiveness of pretraining Ternary Language Model at Scale

Ayush Kaushal, Tejas Vaidhya, Arnab Mondal et al.

ICLR 2025 • poster • arXiv:2407.12327
13 citations
#2329

AdaWM: Adaptive World Model based Planning for Autonomous Driving

Hang Wang, Xin Ye, Feng Tao et al.

ICLR 2025 • poster • arXiv:2501.13072
13 citations
#2330

A Unifying Framework for Representation Learning

Shaden Alshammari, John Hershey, Axel Feldmann et al.

ICLR 2025 • poster • arXiv:2504.16929
13 citations
#2331

Are Large Vision Language Models Good Game Players?

Xinyu Wang, Bohan Zhuang, Qi Wu

ICLR 2025 • poster • arXiv:2503.02358
13 citations
#2332

MapExpert: Online HD Map Construction with Simple and Efficient Sparse Map Element Expert

Dapeng Zhang, Dayu Chen, Peng Zhi et al.

AAAI 2025 • paper • arXiv:2412.12704
13 citations
#2333

TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

Kanghui Ning, Zijie Pan, Yu Liu et al.

NEURIPS 2025 • poster • arXiv:2503.07649
13 citations
#2334

Synthetic Data is an Elegant GIFT for Continual Vision-Language Models

Bin Wu, Wuxuan Shi, Jinqiao Wang et al.

CVPR 2025 • poster • arXiv:2503.04229
13 citations
#2335

KITS: Inductive Spatio-Temporal Kriging with Increment Training Strategy

Qianxiong Xu, Cheng Long, Ziyue Li et al.

AAAI 2025 • paper • arXiv:2311.02565
13 citations
#2336

ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models

Hao Yin, Guangzong Si, Zilei Wang

CVPR 2025 • poster • arXiv:2503.13107
13 citations
#2337

Faster Algorithms for Structured Linear and Kernel Support Vector Machines

Yuzhou Gu, Zhao Song, Lichen Zhang

ICLR 2025 • poster • arXiv:2307.07735
13 citations
#2338

RoboTron-Mani: All-in-One Multimodal Large Model for Robotic Manipulation

Feng Yan, Fanfan Liu, Yiyang Huang et al.

ICCV 2025 • poster • arXiv:2412.07215
13 citations
#2339

PipeFusion: Patch-level Pipeline Parallelism for Diffusion Transformers Inference

Jiarui Fang, Jinzhe Pan, Aoyu Li et al.

NEURIPS 2025 • poster • arXiv:2405.14430
13 citations
#2340

AG-VPReID: A Challenging Large-Scale Benchmark for Aerial-Ground Video-based Person Re-Identification

Huy Nguyen, Kien Nguyen Thanh, Akila Pemasiri et al.

CVPR 2025 • poster • arXiv:2503.08121
13 citations
#2341

ACC-Collab: An Actor-Critic Approach to Multi-Agent LLM Collaboration

Andrew Estornell, Jean-Francois Ton, Yuanshun Yao et al.

ICLR 2025 • poster • arXiv:2411.00053
13 citations
#2342

From Flat to Hierarchical: Extracting Sparse Representations with Matching Pursuit

Valérie Costa, Thomas Fel, Ekdeep S Lubana et al.

NEURIPS 2025 • poster • arXiv:2506.03093
13 citations
#2343

Roll the dice & look before you leap: Going beyond the creative limits of next-token prediction

Vaishnavh Nagarajan, Chen Wu, Charles Ding et al.

ICML 2025 • oral • arXiv:2504.15266
13 citations
#2344

Enhancing Vision-Language Model Reliability with Uncertainty-Guided Dropout Decoding

Yixiong Fang, Ziran Yang, Zhaorun Chen et al.

NEURIPS 2025 • poster • arXiv:2412.06474
13 citations
#2345

On the Feature Learning in Diffusion Models

Andi Han, Wei Huang, Yuan Cao et al.

ICLR 2025 • poster • arXiv:2412.01021
13 citations
#2346

FedBiP: Heterogeneous One-Shot Federated Learning with Personalized Latent Diffusion Models

Haokun Chen, Hang Li, Yao Zhang et al.

CVPR 2025 • poster • arXiv:2410.04810
13 citations
#2347

MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling

Jian Yang, Dacheng Yin, Yizhou Zhou et al.

CVPR 2025 • poster • arXiv:2410.10798
13 citations
#2348

MVTokenFlow: High-quality 4D Content Generation using Multiview Token Flow

Hanzhuo Huang, Yuan Liu, Ge Zheng et al.

ICLR 2025 • oral • arXiv:2502.11697
13 citations
#2349

OWLS: Scaling Laws for Multilingual Speech Recognition and Translation Models

William Chen, Jinchuan Tian, Yifan Peng et al.

ICML 2025 • poster • arXiv:2502.10373
13 citations
#2350

SuperEdit: Rectifying and Facilitating Supervision for Instruction-Based Image Editing

Ming Li, Xin Gu, Fan Chen et al.

ICCV 2025 • poster • arXiv:2505.02370
12 citations
#2351

RhythmMamba: Fast, Lightweight, and Accurate Remote Physiological Measurement

Bochao Zou, Zizheng Guo, Xiaocheng Hu et al.

AAAI 2025 • paper • arXiv:2404.06483
12 citations
#2352

ReasonGrounder: LVLM-Guided Hierarchical Feature Splatting for Open-Vocabulary 3D Visual Grounding and Reasoning

Zhenyang Liu, Yikai Wang, Sixiao Zheng et al.

CVPR 2025 • poster • arXiv:2503.23297
12 citations
#2353

P-SpikeSSM: Harnessing Probabilistic Spiking State Space Models for Long-Range Dependency Tasks

Malyaban Bal, Abhronil Sengupta

ICLR 2025 • poster • arXiv:2406.02923
12 citations
#2354

Patient-Level Anatomy Meets Scanning-Level Physics: Personalized Federated Low-Dose CT Denoising Empowered by Large Language Model

Ziyuan Yang, Yingyu Chen, Zhiwen Wang et al.

CVPR 2025 • poster • arXiv:2503.00908
12 citations
#2355

ACL: Activating Capability of Linear Attention for Image Restoration

Yubin Gu, Yuan Meng, Jiayi Ji et al.

CVPR 2025 • poster
12 citations
#2356

Mobile Video Diffusion

Haitam Ben Yahia, Denis Korzhenkov, Ioannis Lelekas et al.

ICCV 2025 • poster • arXiv:2412.07583
12 citations
#2357

DreamDance: Animating Human Images by Enriching 3D Geometry Cues from 2D Poses

Yatian Pang, Bin Zhu, Bin Lin et al.

ICCV 2025 • poster • arXiv:2412.00397
12 citations
#2358

Repulsive Latent Score Distillation for Solving Inverse Problems

Nicolas Zilberstein, Morteza Mardani, Santiago Segarra

ICLR 2025 • poster • arXiv:2406.16683
12 citations
#2359

FramePainter: Endowing Interactive Image Editing with Video Diffusion Priors

Yabo Zhang, Xinpeng Zhou, Yihan Zeng et al.

ICCV 2025 • poster • arXiv:2501.08225
12 citations
#2360

CLIMB: Data Foundations for Large Scale Multimodal Clinical Foundation Models

David Dai, Peilin Chen, Malinda Lu et al.

ICML 2025 • oral • arXiv:2503.07667
12 citations
#2361

GeoBEV: Learning Geometric BEV Representation for Multi-view 3D Object Detection

Jinqing Zhang, Yanan Zhang, Yunlong Qi et al.

AAAI 2025 • paper • arXiv:2409.01816
12 citations
#2362

The Crystal Ball Hypothesis in diffusion models: Anticipating object positions from initial noise

Yuanhao Ban, Ruochen Wang, Tianyi Zhou et al.

ICLR 2025 • poster • arXiv:2406.01970
12 citations
#2363

Adversarial Distribution Matching for Diffusion Distillation Towards Efficient Image and Video Synthesis

Yanzuo Lu, Yuxi Ren, Xin Xia et al.

ICCV 2025 • highlight • arXiv:2507.18569
12 citations
#2364

ICE-Bench: A Unified and Comprehensive Benchmark for Image Creating and Editing

Yulin Pan, Xiangteng He, Chaojie Mao et al.

ICCV 2025 • poster • arXiv:2503.14482
12 citations
#2365

SmartEraser: Remove Anything from Images using Masked-Region Guidance

Longtao Jiang, Zhendong Wang, Jianmin Bao et al.

CVPR 2025 • poster • arXiv:2501.08279
12 citations
#2366

Multi-Teacher Knowledge Distillation with Reinforcement Learning for Visual Recognition

Chuanguang Yang, XinQiang Yu, Han Yang et al.

AAAI 2025 • paper • arXiv:2502.18510
12 citations
#2367

CLIMB-ReID: A Hybrid CLIP-Mamba Framework for Person Re-Identification

Chenyang Yu, Xuehu Liu, Jiawen Zhu et al.

AAAI 2025 • paper
12 citations
#2368

SymmCompletion: High-Fidelity and High-Consistency Point Cloud Completion with Symmetry Guidance

Hongyu Yan, Zijun Li, Kunming Luo et al.

AAAI 2025 • paper • arXiv:2503.18007
12 citations
#2369

Flow matching achieves almost minimax optimal convergence

Kenji Fukumizu, Taiji Suzuki, Noboru Isobe et al.

ICLR 2025 • poster • arXiv:2405.20879
12 citations
#2370

OLiDM: Object-aware LiDAR Diffusion Models for Autonomous Driving

Tianyi Yan, Junbo Yin, Xianpeng Lang et al.

AAAI 2025 • paper • arXiv:2412.17226
12 citations
#2371

Quest: Query-centric Data Synthesis Approach for Long-context Scaling of Large Language Model

Chaochen Gao, Xing W, Qi Fu et al.

ICLR 2025 • poster • arXiv:2405.19846
12 citations
#2372

Searching Latent Program Spaces

Matthew Macfarlane, Clem Bonnet

NEURIPS 2025 • spotlight • arXiv:2411.08706
12 citations
#2373

CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring

Benjamin Arnav, Pablo Bernabeu-Perez, Nathan Helm-Burger et al.

NEURIPS 2025 • poster • arXiv:2505.23575
12 citations
#2374

Vec2Face: Scaling Face Dataset Generation with Loosely Constrained Vectors

Haiyu Wu, Jaskirat Singh, Sicong Tian et al.

ICLR 2025 • poster • arXiv:2409.02979
12 citations
#2375

In Search of Adam’s Secret Sauce

Antonio Orvieto, Robert Gower

NEURIPS 2025 • oral • arXiv:2505.21829
12 citations
#2376

Equivariance Everywhere All At Once: A Recipe for Graph Foundation Models

Ben Finkelshtein, Ismail Ilkan Ceylan, Michael Bronstein et al.

NEURIPS 2025 • poster • arXiv:2506.14291
12 citations
#2377

TREAD: Token Routing for Efficient Architecture-agnostic Diffusion Training

Felix Krause, Timy Phan, Ming Gui et al.

ICCV 2025 • poster • arXiv:2501.04765
12 citations
#2378

Backdoor Attacks Against No-Reference Image Quality Assessment Models via a Scalable Trigger

Yi Yu, Song Xia, Xun Lin et al.

AAAI 2025 • paper • arXiv:2412.07277
12 citations
#2379

Heterogeneous Swarms: Jointly Optimizing Model Roles and Weights for Multi-LLM Systems

Shangbin Feng, Zifeng Wang, Palash Goyal et al.

NEURIPS 2025 • poster • arXiv:2502.04510
12 citations
#2380

Consistent Flow Distillation for Text-to-3D Generation

Runjie Yan, Yinbo Chen, Xiaolong Wang

ICLR 2025 • poster • arXiv:2501.05445
12 citations
#2381

Can Watermarked LLMs be Identified by Users via Crafted Prompts?

Aiwei Liu, Sheng Guan, Yiming Liu et al.

ICLR 2025 • poster • arXiv:2410.03168
12 citations
#2382

STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing from Text-to-Image Diffusion Models

Koushik Srivatsan, Fahad Shamshad, Muzammal Naseer et al.

CVPR 2025 • highlight • arXiv:2408.16807
12 citations
#2383

Ambient Diffusion Omni: Training Good Models with Bad Data

Giannis Daras, Adrian Rodriguez-Munoz, Adam Klivans et al.

NEURIPS 2025 • spotlight • arXiv:2506.10038
12 citations
#2384

Revisiting Zeroth-Order Optimization: Minimum-Variance Two-Point Estimators and Directionally Aligned Perturbations

Shaocong Ma, Heng Huang

ICLR 2025 • poster • arXiv:2510.19975
12 citations
#2385

Debiased All-in-one Image Restoration with Task Uncertainty Regularization

Gang Wu, Junjun Jiang, Yijun Wang et al.

AAAI 2025 • paper
12 citations
#2386

Unsupervised Foundation Model-Agnostic Slide-Level Representation Learning

Tim Lenz, Peter Neidlinger, Marta Ligero et al.

CVPR 2025 • poster • arXiv:2411.13623
12 citations
#2387

Yuan: Yielding Unblemished Aesthetics Through a Unified Network for Visual Imperfections Removal in Generated Images

Zhenyu Yu, Chee Seng Chan

AAAI 2025 • paper • arXiv:2501.08505
12 citations
#2388

OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup

Xize Cheng, Siqi Zheng, Zehan Wang et al.

ICLR 2025 • poster • arXiv:2410.21269
12 citations
#2389

Exploring the limits of strong membership inference attacks on large language models

Jamie Hayes, I Shumailov, Christopher A. Choquette-Choo et al.

NEURIPS 2025 • poster • arXiv:2505.18773
12 citations
#2390

NAVIX: Scaling MiniGrid Environments with JAX

Eduardo Pignatelli, Jarek Liesen, Robert Lange et al.

NEURIPS 2025 • poster • arXiv:2407.19396
12 citations
#2391

Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation

Yibo Wang, Tiansheng Huang, Li Shen et al.

NEURIPS 2025 • poster • arXiv:2501.18100
12 citations
#2392

How to Synthesize Text Data without Model Collapse?

Xuekai Zhu, Daixuan Cheng, Hengli Li et al.

ICML 2025 • poster • arXiv:2412.14689
12 citations
#2393

CoMM: A Coherent Interleaved Image-Text Dataset for Multimodal Understanding and Generation

Wei Chen, Lin Li, Yongqi Yang et al.

CVPR 2025 • highlight • arXiv:2406.10462
12 citations
#2394

Generalized Principal-Agent Problem with a Learning Agent

Tao Lin, Yiling Chen

ICLR 2025 • poster • arXiv:2402.09721
12 citations
#2395

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation

Mingfei Han, Liang Ma, Kamila Zhumakhanova et al.

CVPR 2025 • poster • arXiv:2412.08591
12 citations
#2396

Compositional 4D Dynamic Scenes Understanding with Physics Priors for Video Question Answering

Xingrui Wang, Wufei Ma, Angtian Wang et al.

ICLR 2025 • oral • arXiv:2406.00622
12 citations
#2397

AdaManip: Adaptive Articulated Object Manipulation Environments and Policy Learning

Yuanfei Wang, Xiaojie Zhang, Ruihai Wu et al.

ICLR 2025 • poster • arXiv:2502.11124
12 citations
#2398

GASP: Efficient Black-Box Generation of Adversarial Suffixes for Jailbreaking LLMs

Advik Basani, Xiao Zhang

NEURIPS 2025 • poster • arXiv:2411.14133
12 citations
#2399

Image Generation Diversity Issues and How to Tame Them

Mischa Dombrowski, Weitong Zhang, Hadrien Reynaud et al.

CVPR 2025 • poster • arXiv:2411.16171
12 citations
#2400

Coreset Selection via Reducible Loss in Continual Learning

Ruilin Tong, Yuhang Liu, Javen Qinfeng Shi et al.

ICLR 2025 • poster
12 citations