Most Cited 2025 "msa profile modeling" Papers

22,274 papers found • Page 15 of 112

#2801

MambaIC: State Space Models for High-Performance Learned Image Compression

Fanhu Zeng, Hao Tang, Yihua Shao et al.

CVPR 2025 • arXiv:2503.12461
17
citations
#2802

Text2midi: Generating Symbolic Music from Captions

Keshav Bhandari, Abhinaba Roy, Kyra Wang et al.

AAAI 2025 • paper • arXiv:2412.16526
17
citations
#2803

RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code

Dhruv Gautam, Spandan Garg, Jinu Jang et al.

ICLR 2025 • arXiv:2503.07832
17
citations
#2804

Reasoning Limitations of Multimodal Large Language Models. A case study of Bongard Problems

Mikołaj Małkiński, Szymon Pawlonka, Jacek Mańdziuk

ICML 2025 • arXiv:2411.01173
17
citations
#2805

EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding

Yuqi Wu, Wenzhao Zheng, Sicheng Zuo et al.

ICCV 2025 • arXiv:2412.04380
17
citations
#2806

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot

ICLR 2025 • arXiv:2405.14457
17
citations
#2807

Inference-Time Hyper-Scaling with KV Cache Compression

Adrian Łańcucki, Konrad Staniszewski, Piotr Nawrot et al.

NEURIPS 2025 • arXiv:2506.05345
17
citations
#2808

TimePFN: Effective Multivariate Time Series Forecasting with Synthetic Data

Ege Onur Taga, Muhammed Emrullah Ildiz, Samet Oymak

AAAI 2025 • paper • arXiv:2502.16294
17
citations
#2809

CAD-Assistant: Tool-Augmented VLLMs as Generic CAD Task Solvers

Dimitrios Mallis, Ahmet Karadeniz, Sebastian Cavada et al.

ICCV 2025 • arXiv:2412.13810
17
citations
#2810

A CLIP-Powered Framework for Robust and Generalizable Data Selection

Suorong Yang, Peng Ye, Wanli Ouyang et al.

ICLR 2025 • arXiv:2410.11215
17
citations
#2811

Learning Precise Affordances from Egocentric Videos for Robotic Manipulation

Li, Nikolaos Tsagkas, Jifei Song et al.

ICCV 2025 • arXiv:2408.10123
17
citations
#2812

EnvGS: Modeling View-Dependent Appearance with Environment Gaussian

Tao Xie, Xi Chen, Zhen Xu et al.

CVPR 2025 • arXiv:2412.15215
17
citations
#2813

Debate or Vote: Which Yields Better Decisions in Multi-Agent Large Language Models?

Hyeong Kyu Choi, Jerry Zhu, Sharon Li

NEURIPS 2025 • spotlight • arXiv:2508.17536
17
citations
#2814

AutoBencher: Towards Declarative Benchmark Construction

Xiang Li, Farzaan Kaiyom, Evan Liu et al.

ICLR 2025 • arXiv:2407.08351
17
citations
#2815

Optimal Transport for Time Series Imputation

Hao Wang, Zhengnan Li, Haoxuan Li et al.

ICLR 2025 • oral
17
citations
#2816

(How) Do Language Models Track State?

Belinda Li, Carl Guo, Jacob Andreas

ICML 2025 • arXiv:2503.02854
17
citations
#2817

DeLLMa: Decision Making Under Uncertainty with Large Language Models

Ollie Liu, Deqing Fu, Dani Yogatama et al.

ICLR 2025 • arXiv:2402.02392
17
citations
#2818

Personalized Federated Collaborative Filtering: A Variational AutoEncoder Approach

Zhiwei Li, Guodong Long, Tianyi Zhou et al.

AAAI 2025 • paper • arXiv:2408.08931
17
citations
#2819

Adam Exploits $\ell_\infty$-geometry of Loss Landscape via Coordinate-wise Adaptivity

Shuo Xie, Mohamad Amin Mohamadi, Zhiyuan Li

ICLR 2025 • arXiv:2410.08198
17
citations
#2820

CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion

Shoubin Yu, Jaehong Yoon, Mohit Bansal

ICLR 2025 • arXiv:2402.05889
17
citations
#2821

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation

Songjun Tu, Jiahao Lin, Xiangyu Tian et al.

COLM 2025 • paper • arXiv:2503.12854
17
citations
#2822

Idiosyncrasies in Large Language Models

Mingjie Sun, Yida Yin, Zhiqiu (Oscar) Xu et al.

ICML 2025 • arXiv:2502.12150
17
citations
#2823

UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

CVPR 2025 • highlight • arXiv:2501.13134
17
citations
#2824

U-Nets as Belief Propagation: Efficient Classification, Denoising, and Diffusion in Generative Hierarchical Models

Song Mei

ICLR 2025 • arXiv:2404.18444
17
citations
#2825

DBLoss: Decomposition-based Loss Function for Time Series Forecasting

Xiangfei Qiu, Xingjian Wu, Hanyin Cheng et al.

NEURIPS 2025 • arXiv:2510.23672
17
citations
#2826

Emergence and scaling laws in SGD learning of shallow neural networks

Yunwei Ren, Eshaan Nichani, Denny Wu et al.

NEURIPS 2025 • arXiv:2504.19983
17
citations
#2827

An Engorgio Prompt Makes Large Language Model Babble on

Jianshuo Dong, Ziyuan Zhang, Qingjie Zhang et al.

ICLR 2025 • arXiv:2412.19394
17
citations
#2828

Do Vision-Language Models Really Understand Visual Language?

Yifan Hou, Buse Giledereli, Yilei Tu et al.

ICML 2025 • arXiv:2410.00193
17
citations
#2829

JEN-1 Composer: A Unified Framework for High-Fidelity Multi-Track Music Generation

Yao Yao, Peike Li, Boyu Chen et al.

AAAI 2025 • paper • arXiv:2310.19180
17
citations
#2830

Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics

Sebastian Sanokowski, Wilhelm Berghammer, Haoyu Wang et al.

ICLR 2025 • arXiv:2502.08696
17
citations
#2831

EpiCoder: Encompassing Diversity and Complexity in Code Generation

Yaoxiang Wang, Haoling Li, Xin Zhang et al.

ICML 2025 • arXiv:2501.04694
17
citations
#2832

MPQ-DM: Mixed Precision Quantization for Extremely Low Bit Diffusion Models

Weilun Feng, Haotong Qin, Chuanguang Yang et al.

AAAI 2025 • paper • arXiv:2412.11549
17
citations
#2833

V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding

Junqi Ge, Ziyi Chen, Jintao Lin et al.

ICCV 2025 • arXiv:2412.09616
17
citations
#2834

PRAGA: Prototype-aware Graph Adaptive Aggregation for Spatial Multi-modal Omics Analysis

Xinlei Huang, Zhiqi Ma, Dian Meng et al.

AAAI 2025 • paper • arXiv:2409.12728
17
citations
#2835

Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

Manuel Brenner, Elias Weber, Georgia Koppe et al.

ICLR 2025 • arXiv:2410.04814
17
citations
#2836

Can We Talk Models Into Seeing the World Differently?

Paul Gavrikov, Jovita Lukasik, Steffen Jung et al.

ICLR 2025 • arXiv:2403.09193
17
citations
#2837

The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense

Yangyang Guo, Fangkai Jiao, Liqiang Nie et al.

NEURIPS 2025 • arXiv:2411.08410
17
citations
#2838

A Probabilistic Perspective on Unlearning and Alignment for Large Language Models

Yan Scholten, Stephan Günnemann, Leo Schwinn

ICLR 2025 • arXiv:2410.03523
17
citations
#2839

Magic Insert: Style-Aware Drag-and-Drop

Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa et al.

ICCV 2025 • highlight • arXiv:2407.02489
17
citations
#2840

Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning

Qinghao Ye, Xianhan Zeng, Fu Li et al.

ICLR 2025 • arXiv:2503.07906
17
citations
#2841

Trajectory Balance with Asynchrony: Decoupling Exploration and Learning for Fast, Scalable LLM Post-Training

Brian Bartoldson, Siddarth Venkatraman, James Diffenderfer et al.

NEURIPS 2025 • arXiv:2503.18929
17
citations
#2842

TULIP: Token-length Upgraded CLIP

Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki Asano et al.

ICLR 2025 • arXiv:2410.10034
17
citations
#2843

Force Prompting: Video Generation Models Can Learn And Generalize Physics-based Control Signals

Nate Gillman, Charles Herrmann, Michael Freeman et al.

NEURIPS 2025 • arXiv:2505.19386
17
citations
#2844

Physics-Constrained Flow Matching: Sampling Generative Models with Hard Constraints

Utkarsh Utkarsh, Pengfei Cai, Alan Edelman et al.

NEURIPS 2025 • arXiv:2506.04171
17
citations
#2845

Dynamic Multimodal Evaluation with Flexible Complexity by Vision-Language Bootstrapping

Yue Yang, Shuibo Zhang, Kaipeng Zhang et al.

ICLR 2025 • arXiv:2410.08695
17
citations
#2846

Quamba: A Post-Training Quantization Recipe for Selective State Space Models

Hung-Yueh Chiang, Chi-Chih Chang, Natalia Frumkin et al.

ICLR 2025 • arXiv:2410.13229
17
citations
#2847

PrEditor3D: Fast and Precise 3D Shape Editing

Ziya Erkoc, Can Gümeli, Chaoyang Wang et al.

CVPR 2025 • arXiv:2412.06592
17
citations
#2848

Omni-ID: Holistic Identity Representation Designed for Generative Tasks

Guocheng Qian, Kuan-Chieh Wang, Or Patashnik et al.

CVPR 2025 • arXiv:2412.09694
17
citations
#2849

Power Lines: Scaling laws for weight decay and batch size in LLM pre-training

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

NEURIPS 2025 • arXiv:2505.13738
17
citations
#2850

SAEs Can Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs

Aashiq Muhamed, Jacopo Bonato, Mona T. Diab et al.

COLM 2025 • paper
17
citations
#2851

Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs

Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar et al.

ICLR 2025 • arXiv:2405.15683
17
citations
#2852

Long-Form Speech Generation with Spoken Language Models

Se Jin Park, Julian Salazar, Aren Jansen et al.

ICML 2025 • oral • arXiv:2412.18603
17
citations
#2853

Large Images Are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting

Lingting Zhu, Guying Lin, Jinnan Chen et al.

AAAI 2025 • paper • arXiv:2502.09039
17
citations
#2854

Optimizing Temperature for Language Models with Multi-Sample Inference

Weihua Du, Yiming Yang, Sean Welleck

ICML 2025 • arXiv:2502.05234
17
citations
#2855

xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories

Maurice Kraus, Felix Divo, Devendra Singh Dhami et al.

NEURIPS 2025 • oral • arXiv:2410.16928
17
citations
#2856

A Simple Data Augmentation for Feature Distribution Skewed Federated Learning

Yunlu Yan, Huazhu Fu, Yuexiang Li et al.

CVPR 2025 • arXiv:2306.09363
17
citations
#2857

Cross-Entropy Is All You Need To Invert the Data Generating Process

Patrik Reizinger, Alice Bizeul, Attila Juhos et al.

ICLR 2025 • arXiv:2410.21869
17
citations
#2858

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou et al.

NEURIPS 2025 • spotlight • arXiv:2504.16064
17
citations
#2859

A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules

Kairong Luo, Haodong Wen, Shengding Hu et al.

ICLR 2025 • arXiv:2503.12811
17
citations
#2860

GraphMoRE: Mitigating Topological Heterogeneity via Mixture of Riemannian Experts

Zihao Guo, Qingyun Sun, Haonan Yuan et al.

AAAI 2025 • paper • arXiv:2412.11085
17
citations
#2861

AllTracker: Efficient Dense Point Tracking at High Resolution

Adam Harley, Yang You, Yang Zheng et al.

ICCV 2025 • arXiv:2506.07310
17
citations
#2862

Emergence in non-neural models: grokking modular arithmetic via average gradient outer product

Neil Mallinar, Daniel Beaglehole, Libin Zhu et al.

ICML 2025 • oral • arXiv:2407.20199
17
citations
#2863

Mitigating Object Hallucination in Large Vision-Language Models via Image-Grounded Guidance

Linxi Zhao, Yihe Deng, Weitong Zhang et al.

ICML 2025 • spotlight • arXiv:2402.08680
17
citations
#2864

Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning

Riccardo Salami, Pietro Buzzega, Matteo Mosconi et al.

ICLR 2025 • arXiv:2410.17961
17
citations
#2865

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

Yan Li, Yifei Xing, Xiangyuan Lan et al.

CVPR 2025 • arXiv:2412.00833
17
citations
#2866

DexVLG: Dexterous Vision-Language-Grasp Model at Scale

Jiawei He, Danshi Li, Xinqiang Yu et al.

ICCV 2025 • highlight • arXiv:2507.02747
17
citations
#2867

UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer

Haoxuan Wang, Jinlong Peng, Qingdong He et al.

ICCV 2025 • arXiv:2503.09277
17
citations
#2868

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach

Jiancong Xiao, Bojian Hou, Zhanliang Wang et al.

ICML 2025 • arXiv:2505.01997
17
citations
#2869

Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene

Jiahao Wu, Rui Peng, Zhiyan Wang et al.

ICLR 2025
17
citations
#2870

VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model

Zuwei Long, Yunhang Shen, Chaoyou Fu et al.

NEURIPS 2025
17
citations
#2871

Spatial457: A Diagnostic Benchmark for 6D Spatial Reasoning of Large Multimodal Models

Xingrui Wang, Wufei Ma, Tiezheng Zhang et al.

CVPR 2025 • highlight
17
citations
#2872

Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation

Zhuoman Liu, Weicai Ye, Yan Luximon et al.

CVPR 2025 • arXiv:2411.14423
17
citations
#2873

FOLDER: Accelerating Multi-Modal Large Language Models with Enhanced Performance

Haicheng Wang, Zhemeng Yu, Gabriele Spadaro et al.

ICCV 2025 • arXiv:2501.02430
17
citations
#2874

ThinkBot: Embodied Instruction Following with Thought Chain Reasoning

Guanxing Lu, Ziwei Wang, Changliu Liu et al.

ICLR 2025 • arXiv:2312.07062
17
citations
#2875

Learning 3D Persistent Embodied World Models

Siyuan Zhou, Yilun Du, Yuncong Yang et al.

NEURIPS 2025 • arXiv:2505.05495
17
citations
#2876

Learning Clustering-based Prototypes for Compositional Zero-Shot Learning

Hongyu Qu, Jianan Wei, Xiangbo Shu et al.

ICLR 2025 • arXiv:2502.06501
17
citations
#2877

Copilot Arena: A Platform for Code LLM Evaluation in the Wild

Wayne Chi, Valerie Chen, Anastasios Angelopoulos et al.

ICML 2025 • arXiv:2502.09328
17
citations
#2878

Accelerated Diffusion Models via Speculative Sampling

Valentin De Bortoli, Alexandre Galashov, Arthur Gretton et al.

ICML 2025 • arXiv:2501.05370
17
citations
#2879

Block-Attention for Efficient Prefilling

Dongyang Ma, Yan Wang, Tian Lan

ICLR 2025 • arXiv:2409.15355
17
citations
#2880

CaDA: Cross-Problem Routing Solver with Constraint-Aware Dual-Attention

Han Li, Fei Liu, Zhi Zheng et al.

ICML 2025 • arXiv:2412.00346
17
citations
#2881

Cross-modulated Attention Transformer for RGBT Tracking

Yun Xiao, Jiacong Zhao, Andong Lu et al.

AAAI 2025 • paper • arXiv:2408.02222
17
citations
#2882

VideoDirector: Precise Video Editing via Text-to-Video Models

Yukun Wang, Longguang Wang, Zhiyuan Ma et al.

CVPR 2025 • arXiv:2411.17592
17
citations
#2883

SlowFast-LLaVA-1.5: A Family of Token-Efficient Video Large Language Models for Long-Form Video Understanding

Mingze Xu, Mingfei Gao, Shiyu Li et al.

COLM 2025 • paper • arXiv:2503.18943
17
citations
#2884

Controllable Context Sensitivity and the Knob Behind It

Julian Minder, Kevin Du, Niklas Stoehr et al.

ICLR 2025 • arXiv:2411.07404
17
citations
#2885

OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation

Junyuan Zhang, Qintong Zhang, Bin Wang et al.

ICCV 2025 • arXiv:2412.02592
17
citations
#2886

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Jaehun Jung, Seungju Han, Ximing Lu et al.

NEURIPS 2025 • spotlight • arXiv:2505.20161
17
citations
#2887

Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration

Ran Xu, Wenqi Shi, Yuchen Zhuang et al.

COLM 2025 • paper • arXiv:2504.04915
17
citations
#2888

Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought

Zihui Cheng, Qiguang Chen, Xiao Xu et al.

NEURIPS 2025 • arXiv:2505.15510
17
citations
#2889

GEVRM: Goal-Expressive Video Generation Model For Robust Visual Manipulation

Hongyin Zhang, Pengxiang Ding, Shangke Lyu et al.

ICLR 2025 • arXiv:2502.09268
17
citations
#2890

Learning Efficient Positional Encodings with Graph Neural Networks

Charilaos Kanatsoulis, Evelyn Choi, Stefanie Jegelka et al.

ICLR 2025 • arXiv:2502.01122
17
citations
#2891

From Sparse Dependence to Sparse Attention: Unveiling How Chain-of-Thought Enhances Transformer Sample Efficiency

Kaiyue Wen, Huaqing Zhang, Hongzhou Lin et al.

ICLR 2025 • arXiv:2410.05459
17
citations
#2892

Training Neural Networks as Recognizers of Formal Languages

Alexandra Butoi, Ghazal Khalighinejad, Anej Svete et al.

ICLR 2025 • arXiv:2411.07107
17
citations
#2893

CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph

Haitao Lin, Guojiang Zhao, Odin Zhang et al.

ICLR 2025 • arXiv:2406.10840
17
citations
#2894

Spiking Vision Transformer with Saccadic Attention

Shuai Wang, Malu Zhang, Dehao Zhang et al.

ICLR 2025 • oral • arXiv:2502.12677
17
citations
#2895

GraphMaster: Automated Graph Synthesis via LLM Agents in Data-Limited Environments

Enjun Du, Xunkai Li, Tian Jin et al.

NEURIPS 2025 • spotlight • arXiv:2504.00711
17
citations
#2896

Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards

Charles Arnal, Gaëtan Narozniak, Vivien Cabannes et al.

NEURIPS 2025 • arXiv:2506.20520
17
citations
#2897

Cavia: Camera-controllable Multi-view Video Diffusion with View-Integrated Attention

Dejia Xu, Yifan Jiang, Chen Huang et al.

ICML 2025 • oral • arXiv:2410.10774
17
citations
#2898

DoomArena: A framework for Testing AI Agents Against Evolving Security Threats

Léo Boisvert, Abhay Puri, Gabriel Huang et al.

COLM 2025 • paper • arXiv:2504.14064
17
citations
#2899

DreamOmni: Unified Image Generation and Editing

Bin Xia, Yuechen Zhang, Jingyao Li et al.

CVPR 2025 • arXiv:2412.17098
16
citations
#2900

CNC: Cross-modal Normality Constraint for Unsupervised Multi-class Anomaly Detection

Xiaolei Wang, Xiaoyang Wang, Huihui Bai et al.

AAAI 2025 • paper • arXiv:2501.00346
16
citations
#2901

SiReRAG: Indexing Similar and Related Information for Multihop Reasoning

Nan Zhang, Prafulla Kumar Choubey, Alexander Fabbri et al.

ICLR 2025 • arXiv:2412.06206
16
citations
#2902

DynaSaur: Large Language Agents Beyond Predefined Actions

Dang Nguyen, Viet Dac Lai, Seunghyun Yoon et al.

COLM 2025 • paper • arXiv:2411.01747
16
citations
#2903

LeVo: High-Quality Song Generation with Multi-Preference Alignment

Shun Lei, Yaoxun Xu, Zhiwei Lin et al.

NEURIPS 2025 • arXiv:2506.07520
16
citations
#2904

Task Vectors in In-Context Learning: Emergence, Formation, and Benefits

Liu Yang, Ziqian Lin, Kangwook Lee et al.

COLM 2025 • paper • arXiv:2501.09240
16
citations
#2905

AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance

Yilin Wei, Mu Lin, Yuhao Lin et al.

ICCV 2025 • arXiv:2503.07360
16
citations
#2906

ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy

Chenrui Tie, Yue Chen, Ruihai Wu et al.

ICLR 2025 • arXiv:2411.03990
16
citations
#2907

HEROS-GAN: Honed-Energy Regularized and Optimal Supervised GAN for Enhancing Accuracy and Range of Low-Cost Accelerometers

Yifeng Wang, Yi Zhao

AAAI 2025 • paper • arXiv:2502.18064
16
citations
#2908

Logically Consistent Language Models via Neuro-Symbolic Integration

Diego Calanzone, Stefano Teso, Antonio Vergari

ICLR 2025 • arXiv:2409.13724
16
citations
#2909

MLLM-as-a-Judge for Image Safety without Human Labeling

Zhenting Wang, Shuming Hu, Shiyu Zhao et al.

CVPR 2025 • highlight • arXiv:2501.00192
16
citations
#2910

A Many-Objective Problem Where Crossover Is Provably Indispensable

Andre Opris

AAAI 2025 • paper
16
citations
#2911

COME: Test-time Adaption by Conservatively Minimizing Entropy

Qingyang Zhang, Yatao Bian, Xinke Kong et al.

ICLR 2025 • arXiv:2410.10894
16
citations
#2912

FrugalNeRF: Fast Convergence for Extreme Few-shot Novel View Synthesis without Learned Priors

Chin-Yang Lin, Chung-Ho Wu, Changhan Yeh et al.

CVPR 2025 • arXiv:2410.16271
16
citations
#2913

MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction

Wenyuan Zhang, Yixiao Yang, Han Huang et al.

CVPR 2025 • arXiv:2503.18363
16
citations
#2914

SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces

Sumit Chaturvedi, Mengwei Ren, Yannick Hold-Geoffroy et al.

CVPR 2025 • arXiv:2501.09756
16
citations
#2915

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Andy Zhou, Kevin Wu, Francesco Pinto et al.

NEURIPS 2025 • arXiv:2503.15754
16
citations
#2916

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu et al.

CVPR 2025 • arXiv:2412.00556
16
citations
#2917

Prior-guided Hierarchical Harmonization Network for Efficient Image Dehazing

Xiongfei Su, Siyuan Li, Yuning Cui et al.

AAAI 2025 • paper • arXiv:2503.01136
16
citations
#2918

An Empirical Analysis of Uncertainty in Large Language Model Evaluations

Qiujie Xie, Qingqiu Li, Zhuohao Yu et al.

ICLR 2025 • arXiv:2502.10709
16
citations
#2919

Speeding Up the NSGA-II with a Simple Tie-Breaking Rule

Benjamin Doerr, Tudor Ivan, Martin S. Krejca

AAAI 2025 • paper • arXiv:2412.11931
16
citations
#2920

SEC-bench: Automated Benchmarking of LLM Agents on Real-World Software Security Tasks

Hwiwon Lee, Ziqi Zhang, Hanxiao Lu et al.

NEURIPS 2025 • arXiv:2506.11791
16
citations
#2921

LLMs Can Plan Only If We Tell Them

Bilgehan Sel, Ruoxi Jia, Ming Jin

ICLR 2025 • arXiv:2501.13545
16
citations
#2922

Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

Maksim Zhdanov, Max Welling, Jan-Willem van de Meent

ICML 2025 • arXiv:2502.17019
16
citations
#2923

IteRPrimE: Zero-shot Referring Image Segmentation with Iterative Grad-CAM Refinement and Primary Word Emphasis

Yuji Wang, Jingchen Ni, Yong Liu et al.

AAAI 2025 • paper • arXiv:2503.00936
16
citations
#2924

Improved Bounds for Online Facility Location with Predictions

Dimitris Fotakis, Evangelia Gergatsouli, Themistoklis Gouleakis et al.

AAAI 2025 • paper • arXiv:2107.08277
16
citations
#2925

Black-Box Detection of Language Model Watermarks

Thibaud Gloaguen, Nikola Jovanović, Robin Staab et al.

ICLR 2025 • arXiv:2405.20777
16
citations
#2926

Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes

Isabella Liu, Hao Su, Xiaolong Wang

ICLR 2025 • oral • arXiv:2404.12379
16
citations
#2927

Visual Autoregressive Modeling for Image Super-Resolution

Yunpeng Qu, Kun Yuan, Jinhua Hao et al.

ICML 2025 • arXiv:2501.18993
16
citations
#2928

Can In-context Learning Really Generalize to Out-of-distribution Tasks?

Qixun Wang, Yifei Wang, Xianghua Ying et al.

ICLR 2025 • arXiv:2410.09695
16
citations
#2929

SCott: Accelerating Diffusion Models with Stochastic Consistency Distillation

Hongjian Liu, Qingsong Xie, Tianxiang Ye et al.

AAAI 2025 • paper • arXiv:2403.01505
16
citations
#2930

RSAR: Restricted State Angle Resolver and Rotated SAR Benchmark

Xin Zhang, Xue Yang, Yuxuan Li et al.

CVPR 2025 • arXiv:2501.04440
16
citations
#2931

Is Factuality Enhancement a Free Lunch For LLMs? Better Factuality Can Lead to Worse Context-Faithfulness

Baolong Bi, Shenghua Liu, Yiwei Wang et al.

ICLR 2025 • arXiv:2404.00216
16
citations
#2932

LeanQuant: Accurate and Scalable Large Language Model Quantization with Loss-error-aware Grid

Tianyi Zhang, Anshumali Shrivastava

ICLR 2025 • arXiv:2407.10032
16
citations
#2933

Improving LLM General Preference Alignment via Optimistic Online Mirror Descent

Yuheng Zhang, Dian Yu, Tao Ge et al.

NEURIPS 2025 • spotlight • arXiv:2502.16852
16
citations
#2934

InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding

Minsoo Kim, Kyuhong Shim, Jungwook Choi et al.

NEURIPS 2025 • oral • arXiv:2506.15745
16
citations
#2935

ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing

Huadai Liu, Kaicheng Luo, Jialei Wang et al.

NEURIPS 2025 • oral
16
citations
#2936

Action-Minimization Meets Generative Modeling: Efficient Transition Path Sampling with the Onsager-Machlup Functional

Sanjeev Raja, Martin Šípka, Michael Psenka et al.

ICML 2025 • oral • arXiv:2504.18506
16
citations
#2937

Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

Johannes Schusterbauer, Ming Gui, Frank Fundel et al.

CVPR 2025 • arXiv:2506.02221
16
citations
#2938

EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos

Jilan Xu, Yifei Huang, Baoqi Pei et al.

ICLR 2025 • oral • arXiv:2504.11732
16
citations
#2939

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Zhangbin Li, Jinxing Zhou, Jing Zhang et al.

AAAI 2025 • paper • arXiv:2412.10749
16
citations
#2940

Quantization without Tears

Minghao Fu, Hao Yu, Jie Shao et al.

CVPR 2025 • arXiv:2411.13918
16
citations
#2941

Track-On: Transformer-based Online Point Tracking with Memory

Görkay Aydemir, Xiongyi Cai, Weidi Xie et al.

ICLR 2025 • oral • arXiv:2501.18487
16
citations
#2942

Structured Packing in LLM Training Improves Long Context Utilization

Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur et al.

AAAI 2025 • paper • arXiv:2312.17296
16
citations
#2943

DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling

Xin Xie, Dong Gong

CVPR 2025 • arXiv:2412.00759
16
citations
#2944

Where am I? Cross-View Geo-localization with Natural Language Descriptions

Junyan Ye, Honglin Lin, Leyan Ou et al.

ICCV 2025 • arXiv:2412.17007
16
citations
#2945

Expressive Power of Graph Neural Networks for (Mixed-Integer) Quadratic Programs

Ziang Chen, Xiaohan Chen, Jialin Liu et al.

ICML 2025 • arXiv:2406.05938
16
citations
#2946

Uncertainty-guided Perturbation for Image Super-Resolution Diffusion Model

Leheng Zhang, Weiyi You, Kexuan Shi et al.

CVPR 2025 • arXiv:2503.18512
16
citations
#2947

3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds

Hengshuo Chu, Xiang Deng, Qi Lv et al.

ICLR 2025 • arXiv:2502.20041
16
citations
#2948

Reversible Decoupling Network for Single Image Reflection Removal

Hao Zhao, Mingjia Li, Qiming Hu et al.

CVPR 2025 • arXiv:2410.08063
16
citations
#2949

Refine Knowledge of Large Language Models via Adaptive Contrastive Learning

Yinghui Li, Haojing Huang, Jiayi Kuang et al.

ICLR 2025 • arXiv:2502.07184
16
citations
#2950

On Calibration of LLM-based Guard Models for Reliable Content Moderation

Hongfu Liu, Hengguan Huang, Xiangming Gu et al.

ICLR 2025 • arXiv:2410.10414
16
citations
#2951

BIRD: A Trustworthy Bayesian Inference Framework for Large Language Models

Yu Feng, Ben Zhou, Weidong Lin et al.

ICLR 2025 • arXiv:2404.12494
16
citations
#2952

LLM Unlearning via Neural Activation Redirection

William Shen, Xinchi Qiu, Meghdad Kurmanji et al.

NEURIPS 2025 • arXiv:2502.07218
16
citations
#2953

GOPlan: Goal-conditioned Offline Reinforcement Learning by Planning with Learned Models

Mianchu Wang, Rui Yang, Xi Chen et al.

ICLR 2025 • arXiv:2310.20025
16
citations
#2954

Rare-to-Frequent: Unlocking Compositional Generation Power of Diffusion Models on Rare Concepts with LLM Guidance

Dongmin Park, Sebin Kim, Taehong Moon et al.

ICLR 2025 • arXiv:2410.22376
16
citations
#2955

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

Zizhang Li, Hong-Xing Yu, Wei Liu et al.

ICCV 2025 • highlight • arXiv:2505.18151
16
citations
#2956

Movie Weaver: Tuning-Free Multi-Concept Video Personalization with Anchored Prompts

Feng Liang, Haoyu Ma, Zecheng He et al.

CVPR 2025 • arXiv:2502.07802
16
citations
#2957

EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion

Haotian Wang, Yuzhe Weng, Yueyan Li et al.

CVPR 2025 • arXiv:2411.16726
16
citations
#2958

MACPO: Weak-to-Strong Alignment via Multi-Agent Contrastive Preference Optimization

Yougang Lyu, Lingyong Yan, Zihan Wang et al.

ICLR 2025 • oral • arXiv:2410.07672
16
citations
#2959

GestureLSM: Latent Shortcut based Co-Speech Gesture Generation with Spatial-Temporal Modeling

Pinxin Liu, Luchuan Song, Junhua Huang et al.

ICCV 2025 • arXiv:2501.18898
16
citations
#2960

The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling

Ruochen Zhang, Qinan Yu, Matianyu Zang et al.

ICLR 2025 • arXiv:2410.09223
16
citations
#2961

Adapting Multi-modal Large Language Model to Concept Drift From Pre-training Onwards

Xiaoyu Yang, Jie Lu, En Yu

ICLR 2025 • arXiv:2405.13459
16
citations
#2962

Presto! Distilling Steps and Layers for Accelerating Music Generation

Zachary Novack, Ge Zhu, Jonah Casebeer et al.

ICLR 2025 • arXiv:2410.05167
16
citations
#2963

Memory Injection Attacks on LLM Agents via Query-Only Interaction

Shen Dong, Shaochen Xu, Pengfei He et al.

NEURIPS 2025 • arXiv:2503.03704
16
citations
#2964

EMOS: Embodiment-aware Heterogeneous Multi-robot Operating System with LLM Agents

Junting Chen, Checheng Yu, Xunzhe Zhou et al.

ICLR 2025 • arXiv:2410.22662
16
citations
#2965

DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints

Xia Jiang, Yaoxin Wu, Chenhao Zhang et al.

ICLR 2025
16
citations
#2966

DreamPhysics: Learning Physics-Based 3D Dynamics with Video Diffusion Priors

Tianyu Huang, Haoze Zhang, Yihan Zeng et al.

AAAI 2025 • paper • arXiv:2406.01476
16
citations
#2967

FlowDec: A flow-based full-band general audio codec with high perceptual quality

Simon Welker, Matthew Le, Ricky T. Q. Chen et al.

ICLR 2025 • arXiv:2503.01485
16
citations
#2968

Security Attacks on LLM-based Code Completion Tools

Wen Cheng, Ke Sun, Xinyu Zhang et al.

AAAI 2025 • paper • arXiv:2408.11006
16
citations
#2969

MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors

Fanqi Pu, Yifan Wang, Jiru Deng et al.

CVPR 2025 • arXiv:2410.19590
16
citations
#2970

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

Julie Kallini, Shikhar Murty, Christopher Manning et al.

ICLR 2025 • arXiv:2410.20771
16
citations
#2971

VideoChat-R1.5: Visual Test-Time Scaling to Reinforce Multimodal Reasoning by Iterative Perception

Ziang Yan, Yinan He, Xinhao Li et al.

NEURIPS 2025 • oral • arXiv:2509.21100
16
citations
#2972

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Yongsheng Yu, Ziyun Zeng, Haitian Zheng et al.

ICCV 2025 • arXiv:2503.08677
16
citations
#2973

LoRA Done RITE: Robust Invariant Transformation Equilibration for LoRA Optimization

Jui-Nan Yen, Si Si, Zhao Meng et al.

ICLR 2025 • arXiv:2410.20625
16
citations
#2974

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.

NEURIPS 2025 • arXiv:2506.08989
16
citations
#2975

Unified Parameter-Efficient Unlearning for LLMs

Chenlu Ding, Jiancan Wu, Yancheng Yuan et al.

ICLR 2025 • arXiv:2412.00383
16
citations
#2976

How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion.

Giannis Daras, Yeshwanth Cherapanamjeri, Constantinos C Daskalakis

ICLR 2025 • arXiv:2411.02780
16
citations
#2977

A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression

Victor Dheur, Matteo Fontana, Yorick Estievenart et al.

ICML 2025 • arXiv:2501.10533
16
citations
#2978

Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes

Georg Manten, Cecilia Casolo, Emilio Ferrucci et al.

ICLR 2025 • arXiv:2402.18477
16
citations
#2979

Federated Unlearning with Gradient Descent and Conflict Mitigation

Zibin Pan, Zhichao Wang, Chi Li et al.

AAAI 2025 • paper • arXiv:2412.20200
16
citations
#2980

Knowledge Entropy Decay during Language Model Pretraining Hinders New Knowledge Acquisition

Jiyeon Kim, Hyunji Lee, Hyowon Cho et al.

ICLR 2025 • arXiv:2410.01380
16
citations
#2981

MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning

Hai-Long Sun, Da-Wei Zhou, Hanbin Zhao et al.

AAAI 2025 • paper • arXiv:2412.09441
16
citations
#2982

Language Models Are Capable of Metacognitive Monitoring and Control of Their Internal Activations

Ji-An Li, Huadong Xiong, Robert Wilson et al.

NEURIPS 2025 • arXiv:2505.13763
16
citations
#2983

Let LRMs Break Free from Overthinking via Self-Braking Tuning

Haoran Zhao, Yuchen Yan, Yongliang Shen et al.

NEURIPS 2025 • arXiv:2505.14604
16
citations
#2984

Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Jiuhai Chen, Jianwei Yang, Haiping Wu et al.

CVPR 2025 • arXiv:2412.04424
16
citations
#2985

HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection

Zijian Gu, Jianwei Ma, Yan Huang et al.

AAAI 2025 • paper • arXiv:2412.11489
16
citations
#2986

UniGraspTransformer: Simplified Policy Distillation for Scalable Dexterous Robotic Grasping

Wenbo Wang, Fangyun Wei, Lei Zhou et al.

CVPR 2025 • arXiv:2412.02699
16
citations
#2987

GGS: Generalizable Gaussian Splatting for Lane Switching in Autonomous Driving

Huasong Han, Kaixuan Zhou, Xiaoxiao Long et al.

AAAI 2025 • paper • arXiv:2409.02382
16
citations
#2988

Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining

Raghuveer Thirukovalluru, Rui Meng, Ye Liu et al.

NEURIPS 2025 • spotlight • arXiv:2505.11293
16
citations
#2989

CARES: Comprehensive Evaluation of Safety and Adversarial Robustness in Medical LLMs

Sijia Chen, Xiaomin Li, Mengxue Zhang et al.

NEURIPS 2025 • arXiv:2505.11413
16
citations
#2990

Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

Weilong Yan, Ming Li, Li Haipeng et al.

CVPR 2025 • arXiv:2503.20211
16
citations
#2991

Provably Accurate Shapley Value Estimation via Leverage Score Sampling

Christopher Musco, R. Teal Witter

ICLR 2025 • arXiv:2410.01917
16
citations
#2992

ContextGNN: Beyond Two-Tower Recommendation Systems

Yiwen Yuan, Zecheng Zhang, Xinwei He et al.

ICLR 2025 • arXiv:2411.19513
16
citations
#2993

ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object

Zhe Shan, Yang Liu, Lei Zhou et al.

CVPR 2025 • arXiv:2503.12006
16
citations
#2994

Langevin Soft Actor-Critic: Efficient Exploration through Uncertainty-Driven Critic Learning

Haque Ishfaq, Guangyuan Wang, Sami Islam et al.

ICLR 2025 • arXiv:2501.17827
16
citations
#2995

Probabilistic Language-Image Pre-Training

Sanghyuk Chun, Wonjae Kim, Song Park et al.

ICLR 2025 • arXiv:2410.18857
16
citations
#2996

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

Yunlong Tang, JunJia Guo, Hang Hua et al.

CVPR 2025 • arXiv:2411.10979
16
citations
#2997

Re-Thinking Inverse Graphics With Large Language Models

Haiwen Feng, Michael J Black, Weiyang Liu et al.

ICLR 2025 • arXiv:2404.15228
16
citations
#2998

Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection

Shengjia Chen, Luping Ji, Weiwei Duan et al.

AAAI 2025 • paper
16
citations
#2999

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Shuai Tan, Biao Gong, Yutong Feng et al.

CVPR 2025 • arXiv:2412.03085
16
citations
#3000

Cross the Gap: Exposing the Intra-modal Misalignment in CLIP via Modality Inversion

Marco Mistretta, Alberto Baldrati, Lorenzo Agnolucci et al.

ICLR 2025 • arXiv:2502.04263
16
citations