Most Cited 2025 "ultra-low bit-widths" Papers

22,274 papers found • Page 10 of 112

#1801

Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration

Ran Xu, Wenqi Shi, Yuchen Zhuang et al.

COLM 2025paperarXiv:2504.04915
17
citations
#1802

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

Yan Li, Yifei Xing, Xiangyuan Lan et al.

CVPR 2025posterarXiv:2412.00833
17
citations
#1803

u-$\mu$P: The Unit-Scaled Maximal Update Parametrization

Charles Blake, Constantin Eichenberg, Josef Dean et al.

ICLR 2025poster
17
citations
#1804

Neighboring Autoregressive Modeling for Efficient Visual Generation

Yefei He, Yuanyu He, Shaoxuan He et al.

ICCV 2025posterarXiv:2503.10696
17
citations
#1805

Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters

Yuan Wang, Ouxiang Li, Tingting Mu et al.

CVPR 2025posterarXiv:2412.06143
17
citations
#1806

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

Yining Hong, Beide Liu, Maxine Wu et al.

ICLR 2025oralarXiv:2410.23277
17
citations
#1807

M²IV: Towards Efficient and Fine-grained Multimodal In-Context Learning via Representation Engineering

Yanshu Li, Yi Cao, Hongyang He et al.

COLM 2025paper
17
citations
#1808

Unifying Unsupervised Graph-Level Anomaly Detection and Out-of-Distribution Detection: A Benchmark

Yili Wang, Yixin Liu, Xu Shen et al.

ICLR 2025posterarXiv:2406.15523
17
citations
#1809

PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

Botao Ren, Xue Yang, Yi Yu et al.

ICLR 2025posterarXiv:2410.08210
17
citations
#1810

Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks

Lehan Wang, Haonan Wang, Honglong Yang et al.

ICLR 2025posterarXiv:2410.18387
17
citations
#1811

Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation

Zhuoman Liu, Weicai Ye, Yan Luximon et al.

CVPR 2025posterarXiv:2411.14423
17
citations
#1812

HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos

Jinglei Zhang, Jiankang Deng, Chao Ma et al.

CVPR 2025highlightarXiv:2501.02973
17
citations
#1813

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

Yue Liu, Shengfang Zhai, Mingzhe Du et al.

NEURIPS 2025posterarXiv:2505.11049
17
citations
#1814

A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning

Chen-Yu Liu, Chao-Han Huck Yang, Hsi-Sheng Goan et al.

ICLR 2025posterarXiv:2410.09846
17
citations
#1815

Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation

Yuqing Wang, Zhijie Lin, Yao Teng et al.

ICCV 2025posterarXiv:2503.16430
17
citations
#1816

Can LVLMs Obtain a Driver’s License? A Benchmark Towards Reliable AGI for Autonomous Driving

Yuhang Lu, Yichen Yao, Jiadong Tu et al.

AAAI 2025paperarXiv:2409.02914
17
citations
#1817

Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise

Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov et al.

ICLR 2025posterarXiv:2411.15958
17
citations
#1818

A Probabilistic Perspective on Unlearning and Alignment for Large Language Models

Yan Scholten, Stephan Günnemann, Leo Schwinn

ICLR 2025posterarXiv:2410.03523
17
citations
#1819

Router-R1: Teaching LLMs Multi-Round Routing and Aggregation via Reinforcement Learning

Haozhen Zhang, Tao Feng, Jiaxuan You

NEURIPS 2025posterarXiv:2506.09033
16
citations
#1820

EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding

Yuqi Wu, Wenzhao Zheng, Sicheng Zuo et al.

ICCV 2025posterarXiv:2412.04380
16
citations
#1821

What Matters in Learning from Large-Scale Datasets for Robot Manipulation

Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige et al.

ICLR 2025posterarXiv:2506.13536
16
citations
#1822

Model Equality Testing: Which Model is this API Serving?

Irena Gao, Percy Liang, Carlos Guestrin

ICLR 2025posterarXiv:2410.20247
16
citations
#1823

The Scene Language: Representing Scenes with Programs, Words, and Embeddings

Yunzhi Zhang, Zizhang Li, Matt Zhou et al.

CVPR 2025highlightarXiv:2410.16770
16
citations
#1824

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching

Wenxiang Guo, Yu Zhang, Changhao Pan et al.

AAAI 2025paperarXiv:2502.12572
16
citations
#1825

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot

ICLR 2025posterarXiv:2405.14457
16
citations
#1826

MLLM-as-a-Judge for Image Safety without Human Labeling

Zhenting Wang, Shuming Hu, Shiyu Zhao et al.

CVPR 2025highlightarXiv:2501.00192
16
citations
#1827

The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling

Ruochen Zhang, Qinan Yu, Matianyu Zang et al.

ICLR 2025posterarXiv:2410.09223
16
citations
#1828

A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression

Victor Dheur, Matteo Fontana, Yorick Estievenart et al.

ICML 2025posterarXiv:2501.10533
16
citations
#1829

MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction

Wenyuan Zhang, Yixiao Yang, Han Huang et al.

CVPR 2025posterarXiv:2503.18363
16
citations
#1830

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu et al.

CVPR 2025posterarXiv:2412.00556
16
citations
#1831

Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning

Kai Jiang, Zhengyan Shi, Dell Zhang et al.

NEURIPS 2025posterarXiv:2509.16738
16
citations
#1832

Scalable Image Tokenization with Index Backpropagation Quantization

Fengyuan Shi, Zhuoyan Luo, Yixiao Ge et al.

ICCV 2025posterarXiv:2412.02692
16
citations
#1833

A CLIP-Powered Framework for Robust and Generalizable Data Selection

Suorong Yang, Peng Ye, Wanli Ouyang et al.

ICLR 2025posterarXiv:2410.11215
16
citations
#1834

How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion.

Giannis Daras, Yeshwanth Cherapanamjeri, Constantinos C Daskalakis

ICLR 2025posterarXiv:2411.02780
16
citations
#1835

Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection

Shengjia Chen, Luping Ji, Weiwei Duan et al.

AAAI 2025paper
16
citations
#1836

ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing

Huadai Liu, Kaicheng Luo, Jialei Wang et al.

NEURIPS 2025oral
16
citations
#1837

DexVLG: Dexterous Vision-Language-Grasp Model at Scale

Jiawei He, Danshi Li, Xinqiang Yu et al.

ICCV 2025highlightarXiv:2507.02747
16
citations
#1838

Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution

Haiyan Zhao, Heng Zhao, Bo Shen et al.

ICLR 2025posterarXiv:2410.00153
16
citations
#1839

MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning

Hai-Long Sun, Da-Wei Zhou, Hanbin Zhao et al.

AAAI 2025paperarXiv:2412.09441
16
citations
#1840

RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code

Dhruv Gautam, Spandan Garg, Jinu Jang et al.

ICLR 2025posterarXiv:2503.07832
16
citations
#1841

Structured Packing in LLM Training Improves Long Context Utilization

Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur et al.

AAAI 2025paperarXiv:2312.17296
16
citations
#1842

MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization

Yougang Lyu, Lingyong Yan, Zihan Wang et al.

ICLR 2025oralarXiv:2410.07672
16
citations
#1843

ContextGNN: Beyond Two-Tower Recommendation Systems

Yiwen Yuan, Zecheng Zhang, Xinwei He et al.

ICLR 2025posterarXiv:2411.19513
16
citations
#1844

Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes

Georg Manten, Cecilia Casolo, Emilio Ferrucci et al.

ICLR 2025posterarXiv:2402.18477
16
citations
#1845

CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion

Shoubin Yu, Jaehong Yoon, Mohit Bansal

ICLR 2025posterarXiv:2402.05889
16
citations
#1846

UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

Wenbo Wang, Fangyun Wei, Lei Zhou et al.

CVPR 2025posterarXiv:2412.02699
16
citations
#1847

DINO-Foresight: Looking into the Future with DINO

Efstathios Karypidis, Ioannis Kakogeorgiou, Spyridon Gidaris et al.

NEURIPS 2025posterarXiv:2412.11673
16
citations
#1848

MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

Akio Hayakawa, Masato Ishii, Takashi Shibuya et al.

ICLR 2025posterarXiv:2405.17842
16
citations
#1849

PRAGA: Prototype-aware Graph Adaptive Aggregation for Spatial Multi-modal Omics Analysis

Xinlei Huang, Zhiqi Ma, Dian Meng et al.

AAAI 2025paperarXiv:2409.12728
16
citations
#1850

Federated Unlearning with Gradient Descent and Conflict Mitigation

Zibin Pan, Zhichao Wang, Chi Li et al.

AAAI 2025paperarXiv:2412.20200
16
citations
#1851

Efficient Part-level 3D Object Generation via Dual Volume Packing

Jiaxiang Tang, Ruijie Lu, Max Li et al.

NEURIPS 2025posterarXiv:2506.09980
16
citations
#1852

BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute

Dujian Ding, Ankur Mallick, Shaokun Zhang et al.

ICML 2025posterarXiv:2506.22716
16
citations
#1853

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

Shangbin Feng, Zifeng Wang, Yike Wang et al.

ICML 2025posterarXiv:2410.11163
16
citations
#1854

Quantization without Tears

Minghao Fu, Hao Yu, Jie Shao et al.

CVPR 2025posterarXiv:2411.13918
16
citations
#1855

MagicArticulate: Make Your 3D Models Articulation-Ready

Chaoyue Song, Jianfeng Zhang, Xiu Li et al.

CVPR 2025posterarXiv:2502.12135
16
citations
#1856

Reinforce LLM Reasoning through Multi-Agent Reflection

Yurun Yuan, Tengyang Xie

ICML 2025posterarXiv:2506.08379
16
citations
#1857

Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity

Jiachen Jiang, Jinxin Zhou, Zhihui Zhu

ICLR 2025posterarXiv:2406.14479
16
citations
#1858

ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object

Zhe Shan, Yang Liu, Lei Zhou et al.

CVPR 2025posterarXiv:2503.12006
16
citations
#1859

Adaptive Length Image Tokenization via Recurrent Allocation

Shivam Duggal, Phillip Isola, Antonio Torralba et al.

ICLR 2025posterarXiv:2411.02393
16
citations
#1860

Aioli: A Unified Optimization Framework for Language Model Data Mixing

Mayee Chen, Michael Hu, Nicholas Lourie et al.

ICLR 2025posterarXiv:2411.05735
16
citations
#1861

Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects

Amir Barda, Matheus Gadelha, Vladimir G. Kim et al.

CVPR 2025posterarXiv:2412.00518
16
citations
#1862

VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers

Yating Wang, Haoyi Zhu, Mingyu Liu et al.

ICCV 2025posterarXiv:2507.01016
16
citations
#1863

Concept Bottleneck Language Models For Protein Design

Aya Ismail, Tuomas Oikarinen, Amy Wang et al.

ICLR 2025posterarXiv:2411.06090
16
citations
#1864

MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection

Xi Jiang, Jian Li, Hanqiu Deng et al.

ICLR 2025posterarXiv:2410.09453
16
citations
#1865

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Shuai Tan, Biao Gong, Yutong Feng et al.

CVPR 2025posterarXiv:2412.03085
16
citations
#1866

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

Zimu Lu, Yunqiao Yang, Houxing Ren et al.

NEURIPS 2025oralarXiv:2505.03733
16
citations
#1867

Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding

Wei Suo, Lijun Zhang, Mengyang Sun et al.

CVPR 2025highlightarXiv:2503.00361
16
citations
#1868

Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models

Ronghuan Wu, Wanchao Su, Jing Liao

CVPR 2025posterarXiv:2411.16602
16
citations
#1869

Shared Global and Local Geometry of Language Model Embeddings

Andrew Lee, Melanie Weber, Fernanda Viégas et al.

COLM 2025paperarXiv:2503.21073
16
citations
#1870

MambaIC: State Space Models for High-Performance Learned Image Compression

Fanhu Zeng, Hao Tang, Yihua Shao et al.

CVPR 2025posterarXiv:2503.12461
16
citations
#1871

Degradation-Aware Feature Perturbation for All-in-One Image Restoration

Xiangpeng Tian, Xiangyu Liao, Xiao Liu et al.

CVPR 2025posterarXiv:2505.12630
16
citations
#1872

GraphMoRE: Mitigating Topological Heterogeneity via Mixture of Riemannian Experts

Zihao Guo, Qingyun Sun, Haonan Yuan et al.

AAAI 2025paperarXiv:2412.11085
16
citations
#1873

Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption

Du CHEN, Tianhe Wu, Kede Ma et al.

CVPR 2025posterarXiv:2503.11221
16
citations
#1874

Base Models Beat Aligned Models at Randomness and Creativity

Peter West, Christopher Potts

COLM 2025paperarXiv:2505.00047
16
citations
#1875

Quamba: A Post-Training Quantization Recipe for Selective State Space Models

Hung-Yueh Chiang, Chi-Chih Chang, Natalia Frumkin et al.

ICLR 2025posterarXiv:2410.13229
16
citations
#1876

Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition

Hongda Liu, Yunfan Liu, Min Ren et al.

CVPR 2025highlightarXiv:2411.18941
16
citations
#1877

Endless Jailbreaks with Bijection Learning

Brian R.Y. Huang, Max Li, Leonard Tang

ICLR 2025posterarXiv:2410.01294
16
citations
#1878

Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning

Riccardo Salami, Pietro Buzzega, Matteo Mosconi et al.

ICLR 2025posterarXiv:2410.17961
16
citations
#1879

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

Zhibing Li, Tong Wu, Jing Tan et al.

ICLR 2025posterarXiv:2412.12083
16
citations
#1880

Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable

Ruoxin Chen, Junwei Xi, Zhiyuan Yan et al.

NEURIPS 2025spotlightarXiv:2505.14359
16
citations
#1881

Understanding and Enhancing the Transferability of Jailbreaking Attacks

Runqi Lin, Bo Han, Fengwang Li et al.

ICLR 2025posterarXiv:2502.03052
16
citations
#1882

EnvGS: Modeling View-Dependent Appearance with Environment Gaussian

Tao Xie, Xi Chen, Zhen Xu et al.

CVPR 2025posterarXiv:2412.15215
16
citations
#1883

Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces

Iskander Azangulov, Andrei Smolensky, Alexander Terenin et al.

NEURIPS 2025oralarXiv:2301.13088
16
citations
#1884

Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints

Utkarsh Utkarsh, Pengfei Cai, Alan Edelman et al.

NEURIPS 2025posterarXiv:2506.04171
16
citations
#1885

MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization

Kangyu Zhu, Peng Xia, Yun Li et al.

ICML 2025posterarXiv:2412.06141
16
citations
#1886

MiniPLM: Knowledge Distillation for Pre-training Language Models

Yuxian Gu, Hao Zhou, Fandong Meng et al.

ICLR 2025posterarXiv:2410.17215
16
citations
#1887

Revisiting MAE Pre-training for 3D Medical Image Segmentation

Tassilo Wald, Constantin Ulrich, Stanislav Lukyanenko et al.

CVPR 2025highlightarXiv:2410.23132
16
citations
#1888

APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding

Xinyu Yang, Tianqi Chen, Beidi Chen

ICLR 2025posterarXiv:2502.05431
16
citations
#1889

Global-Local Tree Search in VLMs for 3D Indoor Scene Generation

Wei Deng, Mengshi Qi, Huadong Ma

CVPR 2025posterarXiv:2503.18476
16
citations
#1890

UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer

Haoxuan Wang, Jinlong Peng, Qingdong He et al.

ICCV 2025posterarXiv:2503.09277
16
citations
#1891

OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?

Zijian Chen, tingzhu chen, Wenjun Zhang et al.

ICLR 2025posterarXiv:2412.01175
16
citations
#1892

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

Yunlong Tang, JunJia Guo, Hang Hua et al.

CVPR 2025posterarXiv:2411.10979
16
citations
#1893

PrEditor3D: Fast and Precise 3D Shape Editing

Ziya Erkoc, Can Gümeli, Chaoyang Wang et al.

CVPR 2025posterarXiv:2412.06592
16
citations
#1894

RevisEval: Improving LLM-as-a-Judge via Response-Adapted References

Qiyuan Zhang, Yufei Wang, Tiezheng YU et al.

ICLR 2025posterarXiv:2410.05193
16
citations
#1895

FastVID: Dynamic Density Pruning for Fast Video Large Language Models

Leqi Shen, Guoqiang Gong, Tao He et al.

NEURIPS 2025oralarXiv:2503.11187
16
citations
#1896

DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints

Xia Jiang, Yaoxin Wu, Chenhao Zhang et al.

ICLR 2025poster
16
citations
#1897

VeriThinker: Learning to Verify Makes Reasoning Model Efficient

Zigeng Chen, Xinyin Ma, Gongfan Fang et al.

NEURIPS 2025posterarXiv:2505.17941
16
citations
#1898

OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition

Stephen Zhang, Vardan Papyan

ICLR 2025posterarXiv:2409.13652
16
citations
#1899

MetaOOD: Automatic Selection of OOD Detection Models

Yuehan Qin, Yichi Zhang, Yi Nian et al.

ICLR 2025posterarXiv:2410.03074
16
citations
#1900

ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark

Ronghao Dang, Yuqian Yuan, Wenqi Zhang et al.

CVPR 2025posterarXiv:2501.05031
16
citations
#1901

Improved Bounds for Online Facility Location with Predictions

Dimitris Fotakis, Evangelia Gergatsouli, Themistoklis Gouleakis et al.

AAAI 2025paperarXiv:2107.08277
16
citations
#1902

A Many-Objective Problem Where Crossover Is Provably Indispensable

Andre Opris

AAAI 2025paper
16
citations
#1903

Sharp Analysis for KL-Regularized Contextual Bandits and RLHF

Heyang Zhao, Chenlu Ye, Quanquan Gu et al.

NEURIPS 2025posterarXiv:2411.04625
16
citations
#1904

Training Neural Networks as Recognizers of Formal Languages

Alexandra Butoi, Ghazal Khalighinejad, Anej Svete et al.

ICLR 2025posterarXiv:2411.07107
16
citations
#1905

SensorLM: Learning the Language of Wearable Sensors

Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.

NEURIPS 2025posterarXiv:2506.09108
16
citations
#1906

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction

Yuanhao Cai, He Zhang, Kai Zhang et al.

ICCV 2025posterarXiv:2411.14384
16
citations
#1907

Law of Vision Representation in MLLMs

Shijia Yang, Bohan Zhai, Quanzeng You et al.

COLM 2025paperarXiv:2408.16357
16
citations
#1908

SciReplicate-Bench: Benchmarking LLMs in Agent-driven Algorithmic Reproduction from Research Papers

Yanzheng Xiang, Hanqi Yan, Shuyin Ouyang et al.

COLM 2025paperarXiv:2504.00255
16
citations
#1909

Scalable Influence and Fact Tracing for Large Language Model Pretraining

Tyler Chang, Dheeraj Rajagopal, Tolga Bolukbasi et al.

ICLR 2025posterarXiv:2410.17413
16
citations
#1910

xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories

Maurice Kraus, Felix Divo, Devendra Singh Dhami et al.

NEURIPS 2025oralarXiv:2410.16928
16
citations
#1911

EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing

Gaoxiang Cong, Jiadong Pan, Liang Li et al.

CVPR 2025highlightarXiv:2412.08988
16
citations
#1912

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

Xiaoyuan Liu, Tian Liang, Zhiwei He et al.

NEURIPS 2025posterarXiv:2505.13445
16
citations
#1913

Brains vs. Bytes: Evaluating LLM Proficiency in Olympiad Mathematics

Hamed Mahdavi, Alireza Hashemi, Majid Daliri et al.

COLM 2025paperarXiv:2504.01995
16
citations
#1914

Simulating Human-like Daily Activities with Desire-driven Autonomy

Yiding Wang, Yuxuan Chen, Fangwei Zhong et al.

ICLR 2025oralarXiv:2412.06435
16
citations
#1915

Task Vectors in In-Context Learning: Emergence, Formation, and Benefits

Liu Yang, Ziqian Lin, Kangwook Lee et al.

COLM 2025paperarXiv:2501.09240
16
citations
#1916

Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion

Dexuan Ding, Lei Wang, Liyun Zhu et al.

ICLR 2025posterarXiv:2410.01506
16
citations
#1917

Prior-guided Hierarchical Harmonization Network for Efficient Image Dehazing

Xiongfei Su, Siyuan Li, Yuning Cui et al.

AAAI 2025paperarXiv:2503.01136
16
citations
#1918

Does SGD really happen in tiny subspaces?

Minhak Song, Kwangjun Ahn, Chulhee Yun

ICLR 2025posterarXiv:2405.16002
16
citations
#1919

Where am I? Cross-View Geo-localization with Natural Language Descriptions

Junyan Ye, Honglin Lin, Leyan Ou et al.

ICCV 2025posterarXiv:2412.17007
16
citations
#1920

DreamOmni: Unified Image Generation and Editing

Bin Xia, Yuechen Zhang, Jingyao Li et al.

CVPR 2025posterarXiv:2412.17098
16
citations
#1921

Memory Injection Attacks on LLM Agents via Query-Only Interaction

Shen Dong, Shaochen Xu, Pengfei He et al.

NEURIPS 2025posterarXiv:2503.03704
16
citations
#1922

Track-On: Transformer-based Online Point Tracking with Memory

Görkay Aydemir, Xiongyi Cai, Weidi Xie et al.

ICLR 2025oralarXiv:2501.18487
16
citations
#1923

Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?

Hyeong Kyu Choi, Jerry Zhu, Sharon Li

NEURIPS 2025spotlightarXiv:2508.17536
16
citations
#1924

LLMs Can Plan Only If We Tell Them

Bilgehan Sel, Ruoxi Jia, Ming Jin

ICLR 2025posterarXiv:2501.13545
16
citations
#1925

Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation

Sadegh Mahdavi, Muchen Li, Kaiwen Liu et al.

ICML 2025posterarXiv:2501.14275
16
citations
#1926

Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation

Nicolas Dufour, Vicky Kalogeiton, David Picard et al.

CVPR 2025posterarXiv:2412.06781
16
citations
#1927

EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos

Jilan Xu, Yifei Huang, Baoqi Pei et al.

ICLR 2025oralarXiv:2504.11732
16
citations
#1928

RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Yan Gong, Yiren Song, Yicheng Li et al.

NEURIPS 2025posterarXiv:2506.02528
15
citations
#1929

PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify

Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa

ICLR 2025posterarXiv:2406.00259
15
citations
#1930

AllTracker: Efficient Dense Point Tracking at High Resolution

Adam Harley, Yang You, Yang Zheng et al.

ICCV 2025posterarXiv:2506.07310
15
citations
#1931

GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models

Mianchu Wang, Rui Yang, Xi Chen et al.

ICLR 2025posterarXiv:2310.20025
15
citations
#1932

Equivariant Neural Functional Networks for Transformers

Viet-Hoang Tran, Thieu Vo, An Nguyen et al.

ICLR 2025posterarXiv:2410.04209
15
citations
#1933

Retrieval Augmented Time Series Forecasting

Sungwon Han, Seungeon Lee, MEEYOUNG CHA et al.

ICML 2025oralarXiv:2505.04163
15
citations
#1934

FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection

Ke Li, Di Wang, Zhangyuan Hu et al.

AAAI 2025paperarXiv:2412.09258
15
citations
#1935

DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models

Dewei Zhou, Mingwei Li, Zongxin Yang et al.

ICCV 2025posterarXiv:2503.12885
15
citations
#1936

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations

Shengeng Tang, Jiayi He, Lechao Cheng et al.

CVPR 2025posterarXiv:2411.16810
15
citations
#1937

FlowTok: Flowing Seamlessly Across Text and Image Tokens

Ju He, Qihang Yu, Qihao Liu et al.

ICCV 2025posterarXiv:2503.10772
15
citations
#1938

TabDPT: Scaling Tabular Foundation Models on Real Data

Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh et al.

NEURIPS 2025posterarXiv:2410.18164
15
citations
#1939

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025posterarXiv:2502.06415
15
citations
#1940

Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning

Tian Liu, Huixin Zhang, Shubham Parashar et al.

CVPR 2025posterarXiv:2406.11148
15
citations
#1941

Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs

Yuzhe Gu, Wenwei Zhang, Chengqi Lyu et al.

ICLR 2025posterarXiv:2503.02846
15
citations
#1942

DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

Montgomery Bohde, Mrunali Manjrekar, Runzhong Wang et al.

ICML 2025posterarXiv:2502.09571
15
citations
#1943

Needle In A Video Haystack: A Scalable Synthetic Evaluator for Video MLLMs

Zijia Zhao, Haoyu Lu, Yuqi Huo et al.

ICLR 2025oralarXiv:2406.09367
15
citations
#1944

Citations and Trust in LLM Generated Responses

Yifan Ding, Matthew Facciani, Ellen Joyce et al.

AAAI 2025paperarXiv:2501.01303
15
citations
#1945

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Andy Zhou, Kevin Wu, Francesco Pinto et al.

NEURIPS 2025posterarXiv:2503.15754
15
citations
#1946

Generating Multi-Image Synthetic Data for Text-to-Image Customization

Nupur Kumari, Xi Yin, Jun-Yan Zhu et al.

ICCV 2025posterarXiv:2502.01720
15
citations
#1947

Falcon: Faster and Parallel Inference of Large Language Models Through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree

Xiangxiang Gao, Weisheng Xie, Yiwei Xiang et al.

AAAI 2025paperarXiv:2412.12639
15
citations
#1948

Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching

Benjamin Minixhofer, Ivan Vulić, Edoardo Maria Ponti

NEURIPS 2025posterarXiv:2503.20083
15
citations
#1949

LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization

Jui-Nan Yen, Si Si, Zhao Meng et al.

ICLR 2025posterarXiv:2410.20625
15
citations
#1950

RoboScape: Physics-informed Embodied World Model

Yu Shang, Xin Zhang, Yinzhou Tang et al.

NEURIPS 2025oralarXiv:2506.23135
15
citations
#1951

FrugalNeRF: Fast Convergence for Extreme Few-shot Novel View Synthesis without Learned Priors

Chin-Yang Lin, Chung-Ho Wu, Changhan Yeh et al.

CVPR 2025posterarXiv:2410.16271
15
citations
#1952

Scaling Properties of Diffusion Models For Perceptual Tasks

Rahul Ravishankar, Zeeshan Patel, Jathushan Rajasegaran et al.

CVPR 2025posterarXiv:2411.08034
15
citations
#1953

Security Attacks on LLM-based Code Completion Tools

Wen Cheng, Ke Sun, Xinyu Zhang et al.

AAAI 2025paperarXiv:2408.11006
15
citations
#1954

Efficient Automated Circuit Discovery in Transformers using Contextual Decomposition

Aliyah Hsu, Georgia Zhou, Yeshwanth Cherapanamjeri et al.

ICLR 2025posterarXiv:2407.00886
15
citations
#1955

Is Artificial Intelligence Generated Image Detection a Solved Problem?

Ziqiang Li, Jiazhen Yan, Ziwen He et al.

NEURIPS 2025posterarXiv:2505.12335
15
citations
#1956

GaussianFlowOcc: Sparse and Weakly Supervised Occupancy Estimation using Gaussian Splatting and Temporal Flow

Simon Boeder, Fabian Gigengack, Benjamin Risse

ICCV 2025posterarXiv:2502.17288
15
citations
#1957

GGS: Generalizable Gaussian Splatting for Lane Switching in Autonomous Driving

Huasong Han, Kaixuan Zhou, Xiaoxiao Long et al.

AAAI 2025paperarXiv:2409.02382
15
citations
#1958

ActSafe: Active Exploration with Safety Constraints for Reinforcement Learning

Yarden As, Bhavya, Lenart Treven et al.

ICLR 2025posterarXiv:2410.09486
15
citations
#1959

LightPROF: A Lightweight Reasoning Framework for Large Language Model on Knowledge Graph

Tu Ao, Yanhua Yu, Yuling Wang et al.

AAAI 2025paperarXiv:2504.03137
15
citations
#1960

Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion

Marco Mistretta, Alberto Baldrati, Lorenzo Agnolucci et al.

ICLR 2025posterarXiv:2502.04263
15
citations
#1961

UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate

Julián Tachella, Mike Davies, Laurent Jacques

ICLR 2025posterarXiv:2409.01985
15
citations
#1962

Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

Yiran Guo, Lijie Xu, Jie Liu et al.

NEURIPS 2025posterarXiv:2505.23564
15
citations
#1963

Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment

Shuo Wang, Bokui Wang, Zhixiang Shen et al.

ICML 2025posterarXiv:2502.02017
15
citations
#1964

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Yongsheng Yu, Ziyun Zeng, Haitian Zheng et al.

ICCV 2025posterarXiv:2503.08677
15
citations
#1965

Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information

Yi Chen, Jian Xu, Xu-Yao Zhang et al.

AAAI 2025paperarXiv:2409.01179
15
citations
#1966

Missing Target-Relevant Information Prediction with World Model for Accurate Zero-Shot Composed Image Retrieval

Yuanmin Tang, Jing Yu, Keke Gai et al.

CVPR 2025posterarXiv:2503.17109
15
citations
#1967

NumbOD: A Spatial-Frequency Fusion Attack Against Object Detectors

Ziqi Zhou, Bowen Li, Yufei Song et al.

AAAI 2025paperarXiv:2412.16955
15
citations
#1968

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Jiuhai Chen, Jianwei Yang, Haiping Wu et al.

CVPR 2025posterarXiv:2412.04424
15
citations
#1969

FreeTimeGS: Free Gaussian Primitives at Anytime Anywhere for Dynamic Scene Reconstruction

Yifan Wang, Peishan Yang, Zhen Xu et al.

CVPR 2025poster
15
citations
#1970

VMBench: A Benchmark for Perception-Aligned Video Motion Generation

Xinran Ling, Chen Zhu, Meiqi Wu et al.

ICCV 2025posterarXiv:2503.10076
15
citations
#1971

Stochastic Deep Restoration Priors for Imaging Inverse Problems

Yuyang Hu, Albert Peng, Weijie Gan et al.

ICML 2025posterarXiv:2410.02057
15
citations
#1972

Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models

Jean Park, Kuk Jin Jang, Basam Alasaly et al.

AAAI 2025paperarXiv:2408.12763
15
citations
#1973

Adaptive teachers for amortized samplers

Minsu Kim, Sanghyeok Choi, Taeyoung Yun et al.

ICLR 2025posterarXiv:2410.01432
15
citations
#1974

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

Ajay Jaiswal, Yifan Wang, Lu Yin et al.

ICML 2025posterarXiv:2407.11239
15
citations
#1975

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach

Jiancong Xiao, Bojian Hou, Zhanliang Wang et al.

ICML 2025posterarXiv:2505.01997
15
citations
#1976

Transformers Can Learn Temporal Difference Methods for In-Context Reinforcement Learning

Jiuqi Wang, Ethan Blaser, Hadi Daneshmand et al.

ICLR 2025oralarXiv:2405.13861
15
citations
#1977

ComfyBench: Benchmarking LLM-based Agents in ComfyUI for Autonomously Designing Collaborative AI Systems

Xiangyuan Xue, Zeyu Lu, Di Huang et al.

CVPR 2025posterarXiv:2409.01392
15
citations
#1978

Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos

Chris Pedersen, Laure Zanna, Joan Bruna

ICML 2025oralarXiv:2503.18731
15
citations
#1979

Reinforcement Learning Finetunes Small Subnetworks in Large Language Models

Sagnik Mukherjee, Lifan Yuan, Dilek Hakkani-Tur et al.

NEURIPS 2025posterarXiv:2505.11711
15
citations
#1980

AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling

Zhining Zhang, Chuanyang Jin, Mung Yao Jia et al.

NEURIPS 2025spotlightarXiv:2502.15676
15
citations
#1981

F-Fidelity: A Robust Framework for Faithfulness Evaluation of Explainable AI

Xu Zheng, Farhad Shirani, Zhuomin Chen et al.

ICLR 2025posterarXiv:2410.02970
15
citations
#1982

BillBoard Splatting (BBSplat): Learnable Textured Primitives for Novel View Synthesis

David Svitov, Pietro Morerio, Lourdes Agapito et al.

ICCV 2025posterarXiv:2411.08508
15
citations
#1983

Re-Thinking Inverse Graphics With Large Language Models

Haiwen Feng, Michael J Black, Weiyang Liu et al.

ICLR 2025posterarXiv:2404.15228
15
citations
#1984

TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts

Yu-Hao Huang, Chang Xu, Yueying Wu et al.

AAAI 2025paperarXiv:2501.05403
15
citations
#1985

RocketEval: Efficient automated LLM evaluation via grading checklist

Tianjun Wei, Wei Wen, Ruizhi Qiao et al.

ICLR 2025posterarXiv:2503.05142
15
citations
#1986

Can We Talk Models Into Seeing the World Differently?

Paul Gavrikov, Jovita Lukasik, Steffen Jung et al.

ICLR 2025posterarXiv:2403.09193
15
citations
#1987

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Ziyang Luo, Haoning Wu, Dongxu Li et al.

CVPR 2025posterarXiv:2411.13281
15
citations
#1988

RoboGround: Robotic Manipulation with Grounded Vision-Language Priors

Haifeng Huang, Xinyi Chen, Yilun Chen et al.

CVPR 2025posterarXiv:2504.21530
15
citations
#1989

Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices

Junyan Lin, Haoran Chen, Yue Fan et al.

CVPR 2025posterarXiv:2503.06063
15
citations
#1990

Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

Maksim Zhdanov, Max Welling, Jan-Willem van de Meent

ICML 2025posterarXiv:2502.17019
15
citations
#1991

FineLIP: Extending CLIP’s Reach via Fine-Grained Alignment with Longer Text Inputs

Mothilal Asokan, Kebin wu, Fatima Albreiki

CVPR 2025posterarXiv:2504.01916
15
citations
#1992

Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval

Sheryl Hsu, Omar Khattab, Chelsea Finn et al.

ICLR 2025posterarXiv:2410.23214
15
citations
#1993

Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics

Sebastian Sanokowski, Wilhelm Berghammer, Haoyu Wang et al.

ICLR 2025posterarXiv:2502.08696
15
citations
#1994

Wasserstein Flow Matching: Generative Modeling Over Families of Distributions

Doron Haviv, Aram-Alexandre Pooladian, Dana Pe'er et al.

ICML 2025posterarXiv:2411.00698
15
citations
#1995

Sylber: Syllabic Embedding Representation of Speech from Raw Audio

Cheol Jun Cho, Nicholas Lee, Akshat Gupta et al.

ICLR 2025posterarXiv:2410.07168
15
citations
#1996

Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning

Qinghao Ye, Xianhan Zeng, Fu Li et al.

ICLR 2025posterarXiv:2503.07906
15
citations
#1997

Image-level Memorization Detection via Inversion-based Inference Perturbation

Yue Jiang, Haokun Lin, Yang Bai et al.

ICLR 2025poster
15
citations
#1998

MallowsPO: Fine-Tune Your LLM with Preference Dispersions

Haoxian Chen, Hanyang Zhao, Henry Lam et al.

ICLR 2025posterarXiv:2405.14953
15
citations
#1999

PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

Oskar van der Wal, Pietro Lesci, Max Müller-Eberstein et al.

ICLR 2025posterarXiv:2503.09543
15
citations
#2000

Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning

Patrik Reizinger, Siyuan Guo, Ferenc Huszar et al.

ICLR 2025posterarXiv:2406.14302
15
citations