Most Cited 2025 "noise prediction module" Papers

22,274 papers found • Page 10 of 112

#1801

AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors

Ruoxuan Feng, Jiangyu Hu, Wenke Xia et al.

ICLR 2025 • arXiv:2502.12191

#1802

Self-Adapting Language Models

Adam Zweiger, Jyo Pari, Han Guo et al.

NEURIPS 2025 • arXiv:2506.10943

#1803

SelfSplat: Pose-Free and 3D Prior-Free Generalizable 3D Gaussian Splatting

Gyeongjin Kang, Jisang Yoo, Jihyeon Park et al.

CVPR 2025 • arXiv:2411.17190

#1804

OmniSVG: A Unified Scalable Vector Graphics Generation Model

Yiying Yang, Wei Cheng, Sijin Chen et al.

NEURIPS 2025 • arXiv:2504.06263

#1805

OrcaLoca: An LLM Agent Framework for Software Issue Localization

Zhongming Yu, Hejia Zhang, Yujie Zhao et al.

ICML 2025 • arXiv:2502.00350

#1806

Towards a Mechanistic Explanation of Diffusion Model Generalization

Matthew Niedoba, Berend Zwartsenberg, Kevin Murphy et al.

ICML 2025 • spotlight • arXiv:2411.19339

#1807

When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers

Hongkang Li, Yihua Zhang, Shuai Zhang et al.

ICLR 2025 • arXiv:2504.10957

#1808

T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching

Zizheng Pan, Bohan Zhuang, De-An Huang et al.

ICLR 2025 • arXiv:2402.14167

#1809

Attention Distillation: A Unified Approach to Visual Characteristics Transfer

Yang Zhou, Xu Gao, Zichong Chen et al.

CVPR 2025 • arXiv:2502.20235

#1810

SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation

Yunxiang Fu, Meng Lou, Yizhou Yu

CVPR 2025 • arXiv:2412.11890

#1811

LICO: Large Language Models for In-Context Molecular Optimization

Tung Nguyen, Aditya Grover

ICLR 2025 • arXiv:2406.18851

#1812

STORM: Spatio-TempOral Reconstruction Model For Large-Scale Outdoor Scenes

Jiawei Yang, Jiahui Huang, Boris Ivanovic et al.

ICLR 2025 • oral • arXiv:2501.00602

#1813

Relaxed Recursive Transformers: Effective Parameter Sharing with Layer-wise LoRA

Sangmin Bae, Adam Fisch, Hrayr Harutyunyan et al.

ICLR 2025 • arXiv:2410.20672

#1814

Modifying Large Language Model Post-Training for Diverse Creative Writing

John Joon Young Chung, Vishakh Padmakumar, Melissa Roemmele et al.

COLM 2025 • paper • arXiv:2503.17126

#1815

A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training

Kai Wang, Mingjia Shi, YuKun Zhou et al.

CVPR 2025 • arXiv:2405.17403

#1816

Self-Consistency Preference Optimization

Archiki Prasad, Weizhe Yuan, Richard Yuanzhe Pang et al.

ICML 2025 • arXiv:2411.04109

#1817

OphCLIP: Hierarchical Retrieval-Augmented Learning for Ophthalmic Surgical Video-Language Pretraining

Ming Hu, Kun Yuan, Yaling Shen et al.

ICCV 2025 • arXiv:2411.15421

#1818

CAP4D: Creating Animatable 4D Portrait Avatars with Morphable Multi-View Diffusion Models

Felix Taubner, Ruihang Zhang, Mathieu Tuli et al.

CVPR 2025 • arXiv:2412.12093

#1819

3D Vision-Language Gaussian Splatting

Qucheng Peng, Benjamin Planche, Zhongpai Gao et al.

ICLR 2025 • arXiv:2410.07577

#1820

Generalizable Human Gaussians from Single-View Image

Jinnan Chen, Chen Li, Jianfeng Zhang et al.

ICLR 2025 • arXiv:2406.06050

#1821

Empowering LLMs with Pseudo-Untrimmed Videos for Audio-Visual Temporal Understanding

Yunlong Tang, Daiki Shimada, Jing Bi et al.

AAAI 2025 • paper • arXiv:2403.16276

#1822

Towards Robust Knowledge Unlearning: An Adversarial Framework for Assessing and Improving Unlearning Robustness in Large Language Models

Hongbang Yuan, Zhuoran Jin, Pengfei Cao et al.

AAAI 2025 • paper • arXiv:2408.10682

#1823

VisRL: Intention-Driven Visual Perception via Reinforced Reasoning

Zhangquan Chen, Xufang Luo, Dongsheng Li

ICCV 2025 • arXiv:2503.07523

#1824

Dream to Manipulate: Compositional World Models Empowering Robot Imitation Learning with Imagination

Leonardo Barcellona, Andrii Zadaianchuk, Davide Allegro et al.

ICLR 2025 • arXiv:2412.14957

#1825

Fast Exact Unlearning for In-Context Learning Data for LLMs

Andrei Muresanu, Anvith Thudi, Michael Zhang et al.

ICML 2025 • arXiv:2402.00751

#1826

FreeSim: Toward Free-viewpoint Camera Simulation in Driving Scenes

Lue Fan, Hao Zhang, Qitai Wang et al.

CVPR 2025 • arXiv:2412.03566

#1827

Correlated Proxies: A New Definition and Improved Mitigation for Reward Hacking

Cassidy Laidlaw, Shivam Singhal, Anca Dragan

ICLR 2025 • arXiv:2403.03185

#1828

Evaluating Judges as Evaluators: The JETTS Benchmark of LLM-as-Judges as Test-Time Scaling Evaluators

Yilun Zhou, Austin Xu, PeiFeng Wang et al.

ICML 2025 • arXiv:2504.15253

#1829

Core Knowledge Deficits in Multi-Modal Language Models

Yijiang Li, Qingying Gao, Tianwei Zhao et al.

ICML 2025 • arXiv:2410.10855

#1830

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Wei Pang, Kevin Qinghong Lin, Xiangru Jian et al.

NEURIPS 2025 • arXiv:2505.21497

#1831

BSAFusion: A Bidirectional Stepwise Feature Alignment Network for Unaligned Medical Image Fusion

Huafeng Li, Dayong Su, Qing Cai et al.

AAAI 2025 • paper • arXiv:2412.08050

#1832

Regularization by Texts for Latent Diffusion Inverse Solvers

Jeongsol Kim, Geon Yeong Park, Hyungjin Chung et al.

ICLR 2025 • arXiv:2311.15658

#1833

Instant Policy: In-Context Imitation Learning via Graph Diffusion

Vitalis Vosylius, Edward Johns

ICLR 2025 • arXiv:2411.12633

#1834

The Hidden Life of Tokens: Reducing Hallucination of Large Vision-Language Models Via Visual Information Steering

Zhuowei Li, Haizhou Shi, Yunhe Gao et al.

ICML 2025 • arXiv:2502.03628

#1835

SWE-bench Goes Live!

Linghao Zhang, Shilin He, Chaoyun Zhang et al.

NEURIPS 2025 • arXiv:2505.23419

#1836

Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel

Shivam Gupta, Linda Cai, Sitan Chen

ICLR 2025 • arXiv:2406.00924

#1837

rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset

Yifei Liu, Li Lyna Zhang, Yi Zhu et al.

NEURIPS 2025 • arXiv:2505.21297

#1838

Dobi-SVD: Differentiable SVD for LLM Compression and Some New Perspectives

Qinsi Wang, Jinghan Ke, Masayoshi Tomizuka et al.

ICLR 2025 • arXiv:2502.02723

#1839

SCSegamba: Lightweight Structure-Aware Vision Mamba for Crack Segmentation in Structures

Hui Liu, Chen Jia, Fan Shi et al.

CVPR 2025 • arXiv:2503.01113

#1840

CAD-Llama: Leveraging Large Language Models for Computer-Aided Design Parametric 3D Model Generation

Jiahao Li, Weijian Ma, Xueyang Li et al.

CVPR 2025 • arXiv:2505.04481

#1841

AIGV-Assessor: Benchmarking and Evaluating the Perceptual Quality of Text-to-Video Generation with LMM

Wang Jiarui, Huiyu Duan, Guangtao Zhai et al.

CVPR 2025 • arXiv:2411.17221

#1842

Weight ensembling improves reasoning in language models

Xingyu Dang, Christina Baek, Kaiyue Wen et al.

COLM 2025 • paper • arXiv:2504.10478

#1843

MMRL: Multi-Modal Representation Learning for Vision-Language Models

Yuncheng Guo, Xiaodong Gu

CVPR 2025 • arXiv:2503.08497

#1844

Towards a Multimodal Large Language Model with Pixel-Level Insight for Biomedicine

Xiaoshuang Huang, Lingdong Shen, Jia Liu et al.

AAAI 2025 • paper • arXiv:2412.09278

#1845

SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data

Wenkai Fang, Shunyu Liu, Yang Zhou et al.

NEURIPS 2025 • arXiv:2505.20347

#1846

PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset

Jiazhen Liu, Yuhan Fu, Ruobing Xie et al.

CVPR 2025 • highlight • arXiv:2403.11116

#1847

PISA Experiments: Exploring Physics Post-Training for Video Diffusion Models by Watching Stuff Drop

Chenyu Li, Oscar Michel, Xichen Pan et al.

ICML 2025 • arXiv:2503.09595

#1848

ConceptAttention: Diffusion Transformers Learn Highly Interpretable Features

Alec Helbling, Tuna Han Salih Meral, Benjamin Hoover et al.

ICML 2025 • oral • arXiv:2502.04320

#1849

JetFormer: An autoregressive generative model of raw images and text

Michael Tschannen, André Susano Pinto, Alexander Kolesnikov

ICLR 2025 • arXiv:2411.19722

#1850

Algorithm Discovery With LLMs: Evolutionary Search Meets Reinforcement Learning

Anja Šurina, Amin Mansouri, Lars C.P.M. Quaedvlieg et al.

COLM 2025 • paper • arXiv:2504.05108

#1851

FLIP: Flow-Centric Generative Planning as General-Purpose Manipulation World Model

Chongkai Gao, Haozhuo Zhang, Zhixuan Xu et al.

ICLR 2025 • arXiv:2412.08261

#1852

Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Ying Jin, Jinlong Peng, Qingdong He et al.

CVPR 2025 • arXiv:2408.13509

#1853

LeGrad: An Explainability Method for Vision Transformers via Feature Formation Sensitivity

Walid Bousselham, Angie Boggust, Sofian Chaybouti et al.

ICCV 2025 • arXiv:2404.03214

#1854

Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples

Chengqian Gao, Haonan Li, Liu Liu et al.

ICML 2025 • arXiv:2502.09650

#1855

Cheating Automatic LLM Benchmarks: Null Models Achieve High Win Rates

Xiaosen Zheng, Tianyu Pang, Chao Du et al.

ICLR 2025 • arXiv:2410.07137

#1856

Seeing Far and Clearly: Mitigating Hallucinations in MLLMs with Attention Causal Decoding

Feilong Tang, Chengzhi Liu, Zhongxing Xu et al.

CVPR 2025 • arXiv:2505.16652

#1857

ParGo: Bridging Vision-Language with Partial and Global Views

An-Lan Wang, Bin Shan, Wei Shi et al.

AAAI 2025 • paper • arXiv:2408.12928

#1858

UFT: Unifying Supervised and Reinforcement Fine-Tuning

Mingyang Liu, Gabriele Farina, Asuman Ozdaglar

NEURIPS 2025 • arXiv:2505.16984

#1859

Min-K%++: Improved Baseline for Pre-Training Data Detection from Large Language Models

Jingyang Zhang, Jingwei Sun, Eric Yeats et al.

ICLR 2025

#1860

Diffusion Generative Modeling for Spatially Resolved Gene Expression Inference from Histology Images

Sichen Zhu, Yuchen Zhu, Molei Tao et al.

ICLR 2025 • arXiv:2501.15598

#1861

Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements

Jingyu Zhang, Ahmed Elgohary Ghoneim, Ahmed Magooda et al.

ICLR 2025 • arXiv:2410.08968

#1862

M-Prometheus: A Suite of Open Multilingual LLM Judges

José Pombal, Dongkeun Yoon, Patrick Fernandes et al.

COLM 2025 • paper • arXiv:2504.04953

#1863

Both Ears Wide Open: Towards Language-Driven Spatial Audio Generation

Peiwen Sun, Sitong Cheng, Xiangtai Li et al.

ICLR 2025 • arXiv:2410.10676

#1864

Leveraging Large Language Models for Node Generation in Few-Shot Learning on Text-Attributed Graphs

Jianxiang Yu, Yuxiang Ren, Chenghua Gong et al.

AAAI 2025 • paper • arXiv:2310.09872

#1865

Straight to Zero: Why Linearly Decaying the Learning Rate to Zero Works Best for LLMs

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

ICLR 2025 • arXiv:2502.15938

#1866

Backtracking Improves Generation Safety

Yiming Zhang, Jianfeng Chi, Hailey Nguyen et al.

ICLR 2025 • arXiv:2409.14586

#1867

miniCTX: Neural Theorem Proving with (Long-)Contexts

Jiewen Hu, Thomas Zhu, Sean Welleck

ICLR 2025 • arXiv:2408.03350

#1868

EditAR: Unified Conditional Generation with Autoregressive Models

Jiteng Mu, Nuno Vasconcelos, Xiaolong Wang

CVPR 2025 • arXiv:2501.04699

#1869

Universal Image Restoration Pre-training via Degradation Classification

Jiakui Hu, Lujia Jin, Zhengjian Yao et al.

ICLR 2025 • arXiv:2501.15510

#1870

MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning

Xinyan Chen, Renrui Zhang, Dongzhi Jiang et al.

NEURIPS 2025 • arXiv:2506.05331

#1871

Mixture Compressor for Mixture-of-Experts LLMs Gains More

Wei Huang, Yue Liao, Jianhui Liu et al.

ICLR 2025 • arXiv:2410.06270

#1872

TopV: Compatible Token Pruning with Inference Time Optimization for Fast and Low-Memory Multimodal Vision Language Model

Cheng Yang, Yang Sui, Jinqi Xiao et al.

CVPR 2025 • arXiv:2503.18278

#1873

Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration

Junseong Kim, GeonU Kim, Kim Yu-Ji et al.

CVPR 2025 • highlight • arXiv:2502.16652

#1874

Failures to Find Transferable Image Jailbreaks Between Vision-Language Models

Rylan Schaeffer, Dan Valentine, Luke Bailey et al.

ICLR 2025 • arXiv:2407.15211

#1875

Agent-Oriented Planning in Multi-Agent Systems

Ao Li, Yuexiang Xie, Songze Li et al.

ICLR 2025 • arXiv:2410.02189

#1876

MEGA: Memory-Efficient 4D Gaussian Splatting for Dynamic Scenes

Xinjie Zhang, Zhening Liu, Yifan Zhang et al.

ICCV 2025 • highlight • arXiv:2410.13613

#1877

GOAL: A Generalist Combinatorial Optimization Agent Learner

Darko Drakulić, Sofia Michel, Jean-Marc Andreoli

ICLR 2025 • arXiv:2406.15079

#1878

SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models

Wufei Ma, Luoxin Ye, Nessa McWeeney et al.

CVPR 2025 • highlight • arXiv:2505.00788

#1879

Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking

Yaozong Zheng, Bineng Zhong, Qihua Liang et al.

AAAI 2025 • paper • arXiv:2507.21606

#1880

VL-Cache: Sparsity and Modality-Aware KV Cache Compression for Vision-Language Model Inference Acceleration

Dezhan Tu, Danylo Vashchilenko, Yuzhe Lu et al.

ICLR 2025 • arXiv:2410.23317

#1881

GAF: Gaussian Avatar Reconstruction from Monocular Videos via Multi-view Diffusion

Jiapeng Tang, Davide Davoli, Tobias Kirschstein et al.

CVPR 2025 • arXiv:2412.10209

#1882

Demystifying Reasoning Dynamics with Mutual Information: Thinking Tokens are Information Peaks in LLM Reasoning

Chen Qian, Dongrui Liu, Hao Wen et al.

NEURIPS 2025 • arXiv:2506.02867

#1883

Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging

Shiqi Chen, Jinghan Zhang, Tongyao Zhu et al.

ICML 2025 • arXiv:2505.05464

#1884

Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing

Jaihoon Kim, Taehoon Yoon, Jisung Hwang et al.

NEURIPS 2025 • arXiv:2503.19385

#1885

A Percolation Model of Emergence: Analyzing Transformers Trained on a Formal Language

Ekdeep Singh Lubana, Kyogo Kawaguchi, Robert Dick et al.

ICLR 2025 • arXiv:2408.12578

#1886

FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Vision Language Models

Tianyu Fu, Tengxuan Liu, Qinghao Han et al.

ICCV 2025 • arXiv:2501.01986

#1887

Halton Scheduler for Masked Generative Image Transformer

Victor Besnier, Mickael Chen, David Hurych et al.

ICLR 2025 • arXiv:2503.17076

#1888

Dynamic Point Maps: A Versatile Representation for Dynamic 3D Reconstruction

Edgar Sucar, Zihang Lai, Eldar Insafutdinov et al.

ICCV 2025 • highlight • arXiv:2503.16318

#1889

Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning

Jie Cheng, Gang Xiong, Ruixi Qiao et al.

NEURIPS 2025 • arXiv:2504.15275

#1890

SPA: 3D Spatial-Awareness Enables Effective Embodied Representation

Haoyi Zhu, Honghui Yang, Yating Wang et al.

ICLR 2025 • arXiv:2410.08208

#1891

Discrepancy Minimization in Input-Sparsity Time

Yichuan Deng, Xiaoyu Li, Zhao Song et al.

ICML 2025 • spotlight • arXiv:2210.12468

#1892

ClinicalLab: Aligning Agents for Multi-Departmental Clinical Diagnostics in the Real World

Weixiang Yan, Haitian Liu, Tengxiao Wu et al.

NEURIPS 2025 • arXiv:2406.13890

#1893

Ca2-VDM: Efficient Autoregressive Video Diffusion Model with Causal Generation and Cache Sharing

Kaifeng Gao, Jiaxin Shi, Hanwang Zhang et al.

ICML 2025 • arXiv:2411.16375

#1894

ARB-LLM: Alternating Refined Binarizations for Large Language Models

Zhiteng Li, Xianglong Yan, Tianao Zhang et al.

ICLR 2025 • arXiv:2410.03129

#1895

Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models

Logan Cross, Violet Xiang, Agam Bhatia et al.

ICLR 2025 • arXiv:2407.07086

#1896

AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction

Zhen Xing, Qi Dai, Zejia Weng et al.

ICCV 2025 • arXiv:2406.06465

#1897

Curriculum Direct Preference Optimization for Diffusion and Consistency Models

Florinel Croitoru, Vlad Hondru, Radu Tudor Ionescu et al.

CVPR 2025 • arXiv:2405.13637

#1898

Towards a General Time Series Anomaly Detector with Adaptive Bottlenecks and Dual Adversarial Decoders

Qichao Shentu, Beibu Li, Kai Zhao et al.

ICLR 2025 • arXiv:2405.15273

#1899

MambaPro: Multi-Modal Object Re-identification with Mamba Aggregation and Synergistic Prompt

Yuhao Wang, Xuehu Liu, Tianyu Yan et al.

AAAI 2025 • paper • arXiv:2412.10707

#1900

CleanDIFT: Diffusion Features without Noise

Nick Stracke, Stefan Andreas Baumann, Kolja Bauer et al.

CVPR 2025 • arXiv:2412.03439

#1901

Gaussian Splashing: Unified Particles for Versatile Motion Synthesis and Rendering

Yutao Feng, Xiang Feng, Yintong Shang et al.

CVPR 2025 • arXiv:2401.15318

#1902

On the Closed-Form of Flow Matching: Generalization Does Not Arise from Target Stochasticity

Quentin Bertrand, Anne Gagneux, Mathurin Massias et al.

NEURIPS 2025 • oral • arXiv:2506.03719

#1903

PosterMaker: Towards High-Quality Product Poster Generation with Accurate Text Rendering

Yifan Gao, Zihang Lin, Chuanbin Liu et al.

CVPR 2025 • arXiv:2504.06632

#1904

Towards More General Video-based Deepfake Detection through Facial Component Guided Adaptation for Foundation Model

Yue-Hua Han, Tai-Ming Huang, Kailung Hua et al.

CVPR 2025 • arXiv:2404.05583

#1905

MaxInfoRL: Boosting exploration in reinforcement learning through information gain maximization

Bhavya, Stelian Coros, Andreas Krause et al.

ICLR 2025 • arXiv:2412.12098

#1906

Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons

Jianhui Chen, Xiaozhi Wang, Zijun Yao et al.

NEURIPS 2025 • arXiv:2406.14144

#1907

Black-Box Forgery Attacks on Semantic Watermarks for Diffusion Models

Andreas Müller, Denis Lukovnikov, Jonas Thietke et al.

CVPR 2025 • arXiv:2412.03283

#1908

NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens

Cunxiang Wang, Ruoxi Ning, Boqi Pan et al.

ICLR 2025 • arXiv:2403.12766

#1909

GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images

Xiang Lan, Feng Wu, Kai He et al.

NEURIPS 2025 • arXiv:2503.06073

#1910

Ouroboros3D: Image-to-3D Generation via 3D-aware Recursive Diffusion

Hao Wen, Zehuan Huang, Yaohui Wang et al.

CVPR 2025 • arXiv:2406.03184

#1911

Safety Pretraining: Toward the Next Generation of Safe AI

Pratyush Maini, Sachin Goyal, Dylan Sam et al.

NEURIPS 2025 • oral • arXiv:2504.16980

#1912

Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Yuichi Inoue, Kou Misaki, Yuki Imajuku et al.

NEURIPS 2025 • spotlight • arXiv:2503.04412

#1913

Optimizing $(L_0, L_1)$-Smooth Functions by Gradient Methods

Daniil Vankov, Anton Rodomanov, Angelia Nedich et al.

ICLR 2025 • arXiv:2410.10800

#1914

Fantastic Copyrighted Beasts and How (Not) to Generate Them

Luxi He, Yangsibo Huang, Weijia Shi et al.

ICLR 2025 • arXiv:2406.14526

#1915

From Multimodal LLMs to Generalist Embodied Agents: Methods and Lessons

Andrew Szot, Bogdan Mazoure, Omar Attia et al.

CVPR 2025 • arXiv:2412.08442

#1916

Navigation-Guided Sparse Scene Representation for End-to-End Autonomous Driving

Peidong Li, Dixiao Cui

ICLR 2025 • oral • arXiv:2409.18341

#1917

CLIP-CID: Efficient CLIP Distillation via Cluster-Instance Discrimination

Kaicheng Yang, Tiancheng Gu, Xiang An et al.

AAAI 2025 • paper • arXiv:2408.09441

#1918

MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs

Yunqiu Xu, Linchao Zhu, Yi Yang

ICCV 2025 • arXiv:2410.12332

#1919

NightHaze: Nighttime Image Dehazing via Self-Prior Learning

Beibei Lin, Yeying Jin, Yan Wending et al.

AAAI 2025 • paper • arXiv:2403.07408

#1920

Training on the Test Task Confounds Evaluation and Emergence

Ricardo Dominguez-Olmedo, Florian Eddie Dorner, Moritz Hardt

ICLR 2025 • arXiv:2407.07890

#1921

Model Poisoning Attacks to Federated Learning via Multi-Round Consistency

Yueqi Xie, Minghong Fang, Neil Zhenqiang Gong

CVPR 2025 • arXiv:2404.15611

#1922

POSTA: A Go-to Framework for Customized Artistic Poster Generation

Haoyu Chen, Xiaojie Xu, Wenbo Li et al.

CVPR 2025 • arXiv:2503.14908

#1923

OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning

Xiaoqiang Wang, Bang Liu

ICLR 2025 • arXiv:2410.18963

#1924

Towards Foundation Models for Mixed Integer Linear Programming

Sirui Li, Janardhan Kulkarni, Ishai Menache et al.

ICLR 2025 • arXiv:2410.08288

#1925

Specialized Foundation Models Struggle to Beat Supervised Baselines

Zongzhe Xu, Ritvik Gupta, Wenduo Cheng et al.

ICLR 2025 • arXiv:2411.02796

#1926

SimpleVQA: Multimodal Factuality Evaluation for Multimodal Large Language Models

Xianfu Cheng, Wei Zhang, Shiwei Zhang et al.

ICCV 2025 • arXiv:2502.13059

#1927

QuestBench: Can LLMs ask the right question to acquire information in reasoning tasks?

Belinda Li, Been Kim, Zi Wang

NEURIPS 2025 • arXiv:2503.22674

#1928

Limits to scalable evaluation at the frontier: LLM as judge won’t beat twice the data

Florian Eddie Dorner, Vivian Nastl, Moritz Hardt

ICLR 2025

#1929

What is the Visual Cognition Gap between Humans and Multimodal LLMs?

Xu Cao, Yifan Shen, Bolin Lai et al.

COLM 2025 • paper • arXiv:2406.10424

#1930

Nemotron-CLIMB: Clustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training

Shizhe Diao, Yu Yang, Yonggan Fu et al.

NEURIPS 2025 • spotlight • arXiv:2504.13161

#1931

Differentiation and Specialization of Attention Heads via the Refined Local Learning Coefficient

George Wang, Jesse Hoogland, Stan van Wingerden et al.

ICLR 2025 • arXiv:2410.02984

#1932

Exploiting Diffusion Prior for Real-World Image Dehazing with Unpaired Training

Yunwei Lan, Zhigao Cui, Chang Liu et al.

AAAI 2025 • paper • arXiv:2503.15017

#1933

AIM: Adaptive Inference of Multi-Modal LLMs via Token Merging and Pruning

Yiwu Zhong, Zhuoming Liu, Yin Li et al.

ICCV 2025 • arXiv:2412.03248

#1934

RouteLLM: Learning to Route LLMs from Preference Data

Isaac Ong, Amjad Almahairi, Vincent Wu et al.

ICLR 2025

#1935

Towards General-Purpose Model-Free Reinforcement Learning

Scott Fujimoto, Pierluca D'Oro, Amy Zhang et al.

ICLR 2025 • arXiv:2501.16142

#1936

Teaching Language Models to Critique via Reinforcement Learning

Zhihui Xie, Jie Chen, Liyu Chen et al.

ICML 2025 • arXiv:2502.03492

#1937

PUMA: Empowering Unified MLLM with Multi-granular Visual Generation

Rongyao Fang, Chengqi Duan, Kun Wang et al.

ICCV 2025 • arXiv:2410.13861

#1938

RL Tango: Reinforcing Generator and Verifier Together for Language Reasoning

Kaiwen Zha, Zhengqi Gao, Maohao Shen et al.

NEURIPS 2025 • arXiv:2505.15034

#1939

Multi-Level Optimal Transport for Universal Cross-Tokenizer Knowledge Distillation on Language Models

Xiao Cui, Mo Zhu, Yulei Qin et al.

AAAI 2025 • paper • arXiv:2412.14528

#1940

FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Taekyung Ki, Dongchan Min, Gyeongsu Chae

ICCV 2025 • arXiv:2412.01064

#1941

FastLGS: Speeding Up Language Embedded Gaussians with Feature Grid Mapping

Yuzhou Ji, He Zhu, Junshu Tang et al.

AAAI 2025 • paper • arXiv:2406.01916

#1942

CARP: Visuomotor Policy Learning via Coarse-to-Fine Autoregressive Prediction

Zhefei Gong, Pengxiang Ding, Shangke Lyu et al.

ICCV 2025 • arXiv:2412.06782

#1943

CITER: Collaborative Inference for Efficient Large Language Model Decoding with Token-Level Routing

Wenhao Zheng, Yixiao Chen, Weitong Zhang et al.

COLM 2025 • paper • arXiv:2502.01976

#1944

EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding

Muye Huang, Han Lai, Xinyu Zhang et al.

AAAI 2025 • paper • arXiv:2409.01577

#1945

MC^2: Multi-concept Guidance for Customized Multi-concept Generation

Jiaxiu Jiang, Yabo Zhang, Kailai Feng et al.

CVPR 2025 • arXiv:2404.05268

#1946

GPTAQ: Efficient Finetuning-Free Quantization for Asymmetric Calibration

Yuhang Li, Ruokai Yin, Donghyun Lee et al.

ICML 2025 • arXiv:2504.02692

#1947

I2VControl-Camera: Precise Video Camera Control with Adjustable Motion Strength

Wanquan Feng, Jiawei Liu, Pengqi Tu et al.

ICLR 2025 • arXiv:2411.06525

#1948

Emergence of meta-stable clustering in mean-field transformer models

Giuseppe Bruno, Federico Pasqualotto, Andrea Agazzi

ICLR 2025 • arXiv:2410.23228

#1949

WPMixer: Efficient Multi-Resolution Mixing for Long-Term Time Series Forecasting

Md Mahmuddun Nabi Murad, Mehmet Aktukmak, Yasin Yilmaz

AAAI 2025 • paper • arXiv:2412.17176

#1950

AdvWave: Stealthy Adversarial Jailbreak Attack against Large Audio-Language Models

Mintong Kang, Chejian Xu, Bo Li

ICLR 2025 • oral • arXiv:2412.08608

#1951

OpenVIS: Open-vocabulary Video Instance Segmentation

Pinxue Guo, Hao Huang, Peiyang He et al.

AAAI 2025 • paper • arXiv:2305.16835

#1952

Latent Reward: LLM-Empowered Credit Assignment in Episodic Reinforcement Learning

Yun Qu, Yuhang Jiang, Boyuan Wang et al.

AAAI 2025 • paper • arXiv:2412.11120

#1953

Bi-Factorial Preference Optimization: Balancing Safety-Helpfulness in Language Models

Wenxuan Zhang, Philip Torr, Mohamed Elhoseiny et al.

ICLR 2025 • arXiv:2408.15313

#1954

ReinboT: Amplifying Robot Visual-Language Manipulation with Reinforcement Learning

Hongyin Zhang, Zifeng Zhuang, Han Zhao et al.

ICML 2025 • arXiv:2505.07395

#1955

Does Thinking More Always Help? Mirage of Test-Time Scaling in Reasoning Models

Soumya Suvra Ghosal, Souradip Chakraborty, Avinash Reddy et al.

NEURIPS 2025 • arXiv:2506.04210

#1956

Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods

Oussama Zekri, Nicolas Boulle

NEURIPS 2025 • arXiv:2502.01384

#1957

AnimateAnything: Consistent and Controllable Animation for Video Generation

Guojun Lei, Chi Wang, Rong Zhang et al.

CVPR 2025 • arXiv:2411.10836

#1958

Diverse Preference Learning for Capabilities and Alignment

Stewart Slocum, Asher Parker-Sartori, Dylan Hadfield-Menell

ICLR 2025 • arXiv:2511.08594

#1959

Non-myopic Generation of Language Models for Reasoning and Planning

Chang Ma, Haiteng Zhao, Junlei Zhang et al.

ICLR 2025 • arXiv:2410.17195

#1960

Alpha-SQL: Zero-Shot Text-to-SQL using Monte Carlo Tree Search

Boyan Li, Jiayi Zhang, Ju Fan et al.

ICML 2025 • arXiv:2502.17248

#1961

When To Solve, When To Verify: Compute-Optimal Problem Solving and Generative Verification for LLM Reasoning

Nishad Singhi, Hritik Bansal, Arian Hosseini et al.

COLM 2025 • paper

#1962

From Language Models over Tokens to Language Models over Characters

Tim Vieira, Benjamin LeBrun, Mario Giulianelli et al.

ICML 2025 • spotlight • arXiv:2412.03719

#1963

Manifolds, Random Matrices and Spectral Gaps: The geometric phases of generative diffusion

Enrico Ventura, Beatrice Achilli, Gianluigi Silvestri et al.

ICLR 2025 • arXiv:2410.05898

#1964

Population Transformer: Learning Population-level Representations of Neural Activity

Geeling Chau, Christopher Wang, Sabera Talukder et al.

ICLR 2025 • oral • arXiv:2406.03044

#1965

HyperFree: A Channel-adaptive and Tuning-free Foundation Model for Hyperspectral Remote Sensing Imagery

Jingtao Li, Yingyi Liu, Xinyu Wang et al.

CVPR 2025 • arXiv:2503.21841

#1966

MeshArt: Generating Articulated Meshes with Structure-Guided Transformers

Daoyi Gao, Mohd Yawar Nihal Siddiqui, Lei Li et al.

CVPR 2025 • arXiv:2412.11596

#1967

Faster Cascades via Speculative Decoding

Harikrishna Narasimhan, Wittawat Jitkrittum, Ankit Singh Rawat et al.

ICLR 2025 • arXiv:2405.19261

#1968

Explore Theory of Mind: program-guided adversarial data generation for theory of mind reasoning

Melanie Sclar, Jane Dwivedi-Yu, Maryam Fazel-Zarandi et al.

ICLR 2025 • arXiv:2412.12175

#1969

SyllableLM: Learning Coarse Semantic Units for Speech Language Models

Alan Baade, Puyuan Peng, David Harwath

ICLR 2025 • arXiv:2410.04029

#1970

Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models

Angela Castillo, Jonas Kohler, Juan C. Pérez et al.

AAAI 2025 • paper • arXiv:2312.12487

#1971

Language Imbalance Driven Rewarding for Multilingual Self-improving

Wen Yang, Junhong Wu, Chen Wang et al.

ICLR 2025 • arXiv:2410.08964

#1972

A Simple Model of Inference Scaling Laws

Noam Levi

ICML 2025 • arXiv:2410.16377

#1973

PhysAnimator: Physics-Guided Generative Cartoon Animation

Tianyi Xie, Yiwei Zhao, Ying Jiang et al.

CVPR 2025 • arXiv:2501.16550

#1974

FLAIR: VLM with Fine-grained Language-informed Image Representations

Rui Xiao, Sanghwan Kim, Iuliana Georgescu et al.

CVPR 2025 • arXiv:2412.03561

#1975

MMQA: Evaluating LLMs with Multi-Table Multi-Hop Complex Questions

Jian Wu, Linyi Yang, Dongyuan Li et al.

ICLR 2025

#1976

Erasing Conceptual Knowledge from Language Models

Rohit Gandikota, Sheridan Feucht, Samuel Marks et al.

NEURIPS 2025 • arXiv:2410.02760

#1977

Language Models are Advanced Anonymizers

Robin Staab, Mark Vero, Mislav Balunovic et al.

ICLR 2025 • arXiv:2402.13846

#1978

Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs

Shaojie Zhang, Jiahui Yang, Jianqin Yin et al.

ICCV 2025 • arXiv:2506.22139

#1979

Position: Evaluating Generative AI Systems Is a Social Science Measurement Challenge

Hanna Wallach, Meera Desai, A. Feder Cooper et al.

ICML 2025 • arXiv:2502.00561

#1980

Oscillatory State-Space Models

T. Konstantin Rusch, Daniela Rus

ICLR 2025 • arXiv:2410.03943

#1981

Heavy-Tailed Diffusion Models

Kushagra Pandey, Jaideep Pathak, Yilun Xu et al.

ICLR 2025 • arXiv:2410.14171

#1982

ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models

Jiaxiang Cheng, Pan Xie, Xin Xia et al.

AAAI 2025 • paper • arXiv:2403.02084

#1983

Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content

Zicheng Zhang, Tengchuan Kou, Chunyi Li et al.

CVPR 2025 • arXiv:2503.02357

#1984

ConFIG: Towards Conflict-free Training of Physics Informed Neural Networks

Qiang Liu, Mengyu Chu, Nils Thuerey

ICLR 2025 • arXiv:2408.11104

#1985

Amodal3R: Amodal 3D Reconstruction from Occluded 2D Images

Tianhao Wu, Chuanxia Zheng, Frank Guan et al.

ICCV 2025 • arXiv:2503.13439

#1986

On Statistical Rates of Conditional Diffusion Transformers: Approximation, Estimation and Minimax Optimality

Jerry Yao-Chieh Hu, Weimin Wu, Yi-Chen Lee et al.

ICLR 2025 • arXiv:2411.17522

#1987

SPC: Evolving Self-Play Critic via Adversarial Games for LLM Reasoning

Jiaqi Chen, Bang Zhang, Ruotian Ma et al.

NEURIPS 2025 • arXiv:2504.19162

#1988

Transformers are Universal In-context Learners

Takashi Furuya, Maarten V de Hoop, Gabriel Peyré

ICLR 2025 • arXiv:2408.01367

#1989

Hogwild! Inference: Parallel LLM Generation via Concurrent Attention

Gleb Rodionov, Roman Garipov, Alina Shutova et al.

NEURIPS 2025 • spotlight • arXiv:2504.06261

#1990

MoManipVLA: Transferring Vision-language-action Models for General Mobile Manipulation

Zhenyu Wu, Yuheng Zhou, Xiuwei Xu et al.

CVPR 2025 • arXiv:2503.13446

#1991

Grounding Video Models to Actions through Goal Conditioned Exploration

Yunhao Luo, Yilun Du

ICLR 2025 • arXiv:2411.07223

#1992

HELMET: How to Evaluate Long-context Models Effectively and Thoroughly

Howard Yen, Tianyu Gao, Minmin Hou et al.

ICLR 2025

#1993

Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation

Luca Barsellotti, Lorenzo Bianchi, Nicola Messina et al.

ICCV 2025 • arXiv:2411.19331

#1994

Towards Fast, Specialized Machine Learning Force Fields: Distilling Foundation Models via Energy Hessians

Ishan Amin, Sanjeev Raja, Aditi Krishnapriyan

ICLR 2025 • arXiv:2501.09009

#1995

Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning

Puning Yang, Qizhou Wang, Zhuo Huang et al.

ICML 2025 • arXiv:2505.11953

#1996

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Zihan Liu, Shuangrui Ding, Zhixiong Zhang et al.

ICML 2025 • arXiv:2502.13128

#1997

Mastering Board Games by External and Internal Planning with Language Models

John Schultz, Jakub Adamek, Matej Jusup et al.

ICML 2025 • spotlight • arXiv:2412.12119

#1998

CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.

CVPR 2025 • arXiv:2412.12077

#1999

Towards Scalable Exact Machine Unlearning Using Parameter-Efficient Fine-Tuning

Somnath Basu Roy Chowdhury, Krzysztof Choromanski, Arijit Sehanobish et al.

ICLR 2025 • arXiv:2406.16257

#2000

SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition

Yongkun Du, Zhineng Chen, Hongtao Xie et al.

ICCV 2025 • arXiv:2411.15858
