Most Cited 2025 "lyapunov method" Papers

22,274 papers found • Page 9 of 112

#1601

Feat2GS: Probing Visual Foundation Models with Gaussian Splatting

Yue Chen, Xingyu Chen, Anpei Chen et al.

CVPR 2025posterarXiv:2412.09606
17
citations
#1602

CAD-Recode: Reverse Engineering CAD Code from Point Clouds

Danila Rukhovich, Elona Dupont, Dimitrios Mallis et al.

ICCV 2025posterarXiv:2412.14042
17
citations
#1603

Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks

Lehan Wang, Haonan Wang, Honglong Yang et al.

ICLR 2025posterarXiv:2410.18387
17
citations
#1604

CyberHost: A One-stage Diffusion Framework for Audio-driven Talking Body Generation

Gaojie Lin, Jianwen Jiang, Chao Liang et al.

ICLR 2025poster
17
citations
#1605

AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

Kim Sung-Bin, Oh Hyun-Bin, Lee Jung-Mok et al.

ICLR 2025posterarXiv:2410.18325
17
citations
#1606

Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters

Yuan Wang, Ouxiang Li, Tingting Mu et al.

CVPR 2025posterarXiv:2412.06143
17
citations
#1607

A Quantum Circuit-Based Compression Perspective for Parameter-Efficient Learning

Chen-Yu Liu, Chao-Han Huck Yang, Hsi-Sheng Goan et al.

ICLR 2025posterarXiv:2410.09846
17
citations
#1608

AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment

Yan Li, Yifei Xing, Xiangyuan Lan et al.

CVPR 2025posterarXiv:2412.00833
17
citations
#1609

Grokking at the Edge of Numerical Stability

Lucas Prieto, Melih Barsbey, Pedro Mediano et al.

ICLR 2025posterarXiv:2501.04697
17
citations
#1610

Accelerating neural network training: An analysis of the AlgoPerf competition

Priya Kasimbeg, Frank Schneider, Runa Eschenhagen et al.

ICLR 2025posterarXiv:2502.15015
17
citations
#1611

USP: Unified Self-Supervised Pretraining for Image Generation and Understanding

Xiangxiang Chu, Renda Li, Yong Wang

ICCV 2025posterarXiv:2503.06132
17
citations
#1612

From Risk to Uncertainty: Generating Predictive Uncertainty Measures via Bayesian Estimation

Nikita Kotelevskii, Vladimir Kondratyev, Martin Takáč et al.

ICLR 2025posterarXiv:2402.10727
17
citations
#1613

No Preference Left Behind: Group Distributional Preference Optimization

Binwei Yao, Zefan Cai, Yun-Shiuan Chuang et al.

ICLR 2025posterarXiv:2412.20299
17
citations
#1614

Design Principle Transfer in Neural Architecture Search via Large Language Models

Xun Zhou, Xingyu Wu, Liang Feng et al.

AAAI 2025paperarXiv:2408.11330
17
citations
#1615

GEM: Empowering MLLM for Grounded ECG Understanding with Time Series and Images

Xiang Lan, Feng Wu, Kai He et al.

NEURIPS 2025posterarXiv:2503.06073
17
citations
#1616

Causal Concept Graph Models: Beyond Causal Opacity in Deep Learning

Gabriele Dominici, Pietro Barbiero, Mateo Espinosa Zarlenga et al.

ICLR 2025posterarXiv:2405.16507
17
citations
#1617

DarkBench: Benchmarking Dark Patterns in Large Language Models

Esben Kran, Hieu Minh Nguyen, Akash Kundu et al.

ICLR 2025posterarXiv:2503.10728
17
citations
#1618

Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation

Jiaming Zhou, Teli Ma, Kun-Yu Lin et al.

CVPR 2025posterarXiv:2406.14235
17
citations
#1619

3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views

Xiaobiao Du, Yida Wang, Haiyang Sun et al.

ICCV 2025posterarXiv:2406.04875
17
citations
#1620

MLLMs Need 3D-Aware Representation Supervision for Scene Understanding

Xiaohu Huang, Jingjing Wu, Qunyi Xie et al.

NEURIPS 2025posterarXiv:2506.01946
17
citations
#1621

VITA-Audio: Fast Interleaved Audio-Text Token Generation for Efficient Large Speech-Language Model

Zuwei Long, Yunhang Shen, Chaoyou Fu et al.

NEURIPS 2025poster
17
citations
#1622

DeLLMa: Decision Making Under Uncertainty with Large Language Models

Ollie Liu, Deqing Fu, Dani Yogatama et al.

ICLR 2025posterarXiv:2402.02392
17
citations
#1623

Personalized Federated Collaborative Filtering: A Variational AutoEncoder Approach

Zhiwei Li, Guodong Long, Tianyi Zhou et al.

AAAI 2025paperarXiv:2408.08931
17
citations
#1624

Detecting Out-of-Distribution Through the Lens of Neural Collapse

Litian Liu, Yao Qin

CVPR 2025posterarXiv:2311.01479
17
citations
#1625

Optimal Transport for Time Series Imputation

Hao Wang, zhengnan li, Haoxuan Li et al.

ICLR 2025oral
17
citations
#1626

Unleashing the Potential of Multi-modal Foundation Models and Video Diffusion for 4D Dynamic Physical Scene Simulation

Zhuoman Liu, Weicai Ye, Yan Luximon et al.

CVPR 2025posterarXiv:2411.14423
17
citations
#1627

Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations

Li Hao, He CAO, Bin Feng et al.

NEURIPS 2025posterarXiv:2505.21318
17
citations
#1628

Reasoning of Large Language Models over Knowledge Graphs with Super-Relations

Song Wang, Junhong Lin, Xiaojie Guo et al.

ICLR 2025posterarXiv:2503.22166
17
citations
#1629

Sloth: scaling laws for LLM skills to predict multi-benchmark performance across families

Felipe Maia Polo, Seamus Somerstep, Leshem Choshen et al.

NEURIPS 2025posterarXiv:2412.06540
17
citations
#1630

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

Hanyang Zhao, Genta Winata, Anirban Das et al.

ICLR 2025posterarXiv:2410.04203
17
citations
#1631

ThinkSound: Chain-of-Thought Reasoning in Multimodal LLMs for Audio Generation and Editing

Huadai Liu, Kaicheng Luo, Jialei Wang et al.

NEURIPS 2025oral
16
citations
#1632

Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects

Amir Barda, Matheus Gadelha, Vladimir G. Kim et al.

CVPR 2025posterarXiv:2412.00518
16
citations
#1633

Closed-Form Merging of Parameter-Efficient Modules for Federated Continual Learning

Riccardo Salami, Pietro Buzzega, Matteo Mosconi et al.

ICLR 2025posterarXiv:2410.17961
16
citations
#1634

Roboflow100-VL: A Multi-Domain Object Detection Benchmark for Vision-Language Models

Matvei Popov, Peter Robicheaux, Anish Madan et al.

NEURIPS 2025posterarXiv:2505.20612
16
citations
#1635

GenFusion: Closing the Loop between Reconstruction and Generation via Videos

Sibo Wu, Congrong Xu, Binbin Huang et al.

CVPR 2025posterarXiv:2503.21219
16
citations
#1636

SensorLM: Learning the Language of Wearable Sensors

Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.

NEURIPS 2025posterarXiv:2506.09108
16
citations
#1637

Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation

Nicolas Dufour, Vicky Kalogeiton, David Picard et al.

CVPR 2025posterarXiv:2412.06781
16
citations
#1638

Chat2SVG: Vector Graphics Generation with Large Language Models and Image Diffusion Models

Ronghuan Wu, Wanchao Su, Jing Liao

CVPR 2025posterarXiv:2411.16602
16
citations
#1639

Stationary Kernels and Gaussian Processes on Lie Groups and their Homogeneous Spaces II: non-compact symmetric spaces

Iskander Azangulov, Andrei Smolensky, Alexander Terenin et al.

NEURIPS 2025oralarXiv:2301.13088
16
citations
#1640

Track-On: Transformer-based Online Point Tracking with Memory

Görkay Aydemir, Xiongyi Cai, Weidi Xie et al.

ICLR 2025oralarXiv:2501.18487
16
citations
#1641

Quamba: A Post-Training Quantization Recipe for Selective State Space Models

Hung-Yueh Chiang, Chi-Chih Chang, Natalia Frumkin et al.

ICLR 2025posterarXiv:2410.13229
16
citations
#1642

EnvGS: Modeling View-Dependent Appearance with Environment Gaussian

Tao Xie, Xi Chen, Zhen Xu et al.

CVPR 2025posterarXiv:2412.15215
16
citations
#1643

Scalable Influence and Fact Tracing for Large Language Model Pretraining

Tyler Chang, Dheeraj Rajagopal, Tolga Bolukbasi et al.

ICLR 2025posterarXiv:2410.17413
16
citations
#1644

Aioli: A Unified Optimization Framework for Language Model Data Mixing

Mayee Chen, Michael Hu, Nicholas Lourie et al.

ICLR 2025posterarXiv:2411.05735
16
citations
#1645

TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching

Wenxiang Guo, Yu Zhang, Changhao Pan et al.

AAAI 2025paperarXiv:2502.12572
16
citations
#1646

Leveraging Online Olympiad-Level Math Problems for LLMs Training and Contamination-Resistant Evaluation

Sadegh Mahdavi, Muchen Li, Kaiwen Liu et al.

ICML 2025posterarXiv:2501.14275
16
citations
#1647

TULIP: Token-length Upgraded CLIP

Ivona Najdenkoska, Mohammad Mahdi Derakhshani, Yuki Asano et al.

ICLR 2025posterarXiv:2410.10034
16
citations
#1648

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

Shiyu Zhao, Zhenting Wang, Felix Juefei-Xu et al.

CVPR 2025posterarXiv:2412.00556
16
citations
#1649

Scalable Image Tokenization with Index Backpropagation Quantization

Fengyuan Shi, Zhuoyan Luo, Yixiao Ge et al.

ICCV 2025posterarXiv:2412.02692
16
citations
#1650

Revealing Key Details to See Differences: A Novel Prototypical Perspective for Skeleton-based Action Recognition

Hongda Liu, Yunfan Liu, Min Ren et al.

CVPR 2025highlightarXiv:2411.18941
16
citations
#1651

A Unified Comparative Study with Generalized Conformity Scores for Multi-Output Conformal Regression

Victor Dheur, Matteo Fontana, Yorick Estievenart et al.

ICML 2025posterarXiv:2501.10533
16
citations
#1652

Motion Prior Knowledge Learning with Homogeneous Language Descriptions for Moving Infrared Small Target Detection

Shengjia Chen, Luping Ji, Weiwei Duan et al.

AAAI 2025paper
16
citations
#1653

Swift4D: Adaptive divide-and-conquer Gaussian Splatting for compact and efficient reconstruction of dynamic scene

Jiahao Wu, Rui Peng, Zhiyan Wang et al.

ICLR 2025poster
16
citations
#1654

DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds

Youyu Chen, Junjun Jiang, Kui Jiang et al.

CVPR 2025highlightarXiv:2503.18402
16
citations
#1655

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

Shangbin Feng, Zifeng Wang, Yike Wang et al.

ICML 2025posterarXiv:2410.11163
16
citations
#1656

MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection

Xi Jiang, Jian Li, Hanqiu Deng et al.

ICLR 2025posterarXiv:2410.09453
16
citations
#1657

Does SGD really happen in tiny subspaces?

Minhak Song, Kwangjun Ahn, Chulhee Yun

ICLR 2025posterarXiv:2405.16002
16
citations
#1658

OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition

Stephen Zhang, Vardan Papyan

ICLR 2025posterarXiv:2409.13652
16
citations
#1659

MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model

Junjie Li, Yang Liu, Weiqing Liu et al.

ICLR 2025posterarXiv:2409.07486
16
citations
#1660

MOS: Model Surgery for Pre-Trained Model-Based Class-Incremental Learning

Hai-Long Sun, Da-Wei Zhou, Hanbin Zhao et al.

AAAI 2025paperarXiv:2412.09441
16
citations
#1661

BEST-Route: Adaptive LLM Routing with Test-Time Optimal Compute

Dujian Ding, Ankur Mallick, Shaokun Zhang et al.

ICML 2025posterarXiv:2506.22716
16
citations
#1662

Reinforce LLM Reasoning through Multi-Agent Reflection

Yurun Yuan, Tengyang Xie

ICML 2025posterarXiv:2506.08379
16
citations
#1663

MiniPLM: Knowledge Distillation for Pre-training Language Models

Yuxian Gu, Hao Zhou, Fandong Meng et al.

ICLR 2025posterarXiv:2410.17215
16
citations
#1664

BioCLIP 2: Emergent Properties from Scaling Hierarchical Contrastive Learning

Jianyang Gu, Sam Stevens, Elizabeth Campolongo et al.

NEURIPS 2025spotlightarXiv:2505.23883
16
citations
#1665

MLLM-as-a-Judge for Image Safety without Human Labeling

Zhenting Wang, Shuming Hu, Shiyu Zhao et al.

CVPR 2025highlightarXiv:2501.00192
16
citations
#1666

EgoExo-Gen: Ego-centric Video Prediction by Watching Exo-centric Videos

Jilan Xu, Yifei Huang, Baoqi Pei et al.

ICLR 2025oralarXiv:2504.11732
16
citations
#1667

PrEditor3D: Fast and Precise 3D Shape Editing

Ziya Erkoc, Can Gümeli, Chaoyang Wang et al.

CVPR 2025posterarXiv:2412.06592
16
citations
#1668

Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization

Jiaming Zhou, Ke Ye, Jiayi Liu et al.

NEURIPS 2025posterarXiv:2505.15660
16
citations
#1669

xLSTM-Mixer: Multivariate Time Series Forecasting by Mixing via Scalar Memories

Maurice Kraus, Felix Divo, Devendra Singh Dhami et al.

NEURIPS 2025oralarXiv:2410.16928
16
citations
#1670

CREMA: Generalizable and Efficient Video-Language Reasoning via Multimodal Modular Fusion

Shoubin Yu, Jaehong Yoon, Mohit Bansal

ICLR 2025posterarXiv:2402.05889
16
citations
#1671

Adaptive Rectangular Convolution for Remote Sensing Pansharpening

Xueyang Wang, Zhixin Zheng, Jiandong Shao et al.

CVPR 2025posterarXiv:2503.00467
16
citations
#1672

Tighter Privacy Auditing of DP-SGD in the Hidden State Threat Model

Tudor Cebere, Aurélien Bellet, Nicolas Papernot

ICLR 2025posterarXiv:2405.14457
16
citations
#1673

Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution

Haiyan Zhao, Heng Zhao, Bo Shen et al.

ICLR 2025posterarXiv:2410.00153
16
citations
#1674

Where am I? Cross-View Geo-localization with Natural Language Descriptions

Junyan Ye, Honglin Lin, Leyan Ou et al.

ICCV 2025posterarXiv:2412.17007
16
citations
#1675

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

Zimu Lu, Yunqiao Yang, Houxing Ren et al.

NEURIPS 2025oralarXiv:2505.03733
16
citations
#1676

DreamOmni: Unified Image Generation and Editing

Bin Xia, Yuechen Zhang, Jingyao Li et al.

CVPR 2025posterarXiv:2412.17098
16
citations
#1677

CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset

Xiao Wang, Fuling Wang, Yuehang Li et al.

CVPR 2025posterarXiv:2410.00379
16
citations
#1678

MonoInstance: Enhancing Monocular Priors via Multi-view Instance Alignment for Neural Rendering and Reconstruction

Wenyuan Zhang, Yixiao Yang, Han Huang et al.

CVPR 2025posterarXiv:2503.18363
16
citations
#1679

UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer

Haoxuan Wang, Jinlong Peng, Qingdong He et al.

ICCV 2025posterarXiv:2503.09277
16
citations
#1680

VeriThinker: Learning to Verify Makes Reasoning Model Efficient

Zigeng Chen, Xinyin Ma, Gongfan Fang et al.

NEURIPS 2025posterarXiv:2505.17941
16
citations
#1681

MagicArticulate: Make Your 3D Models Articulation-Ready

Chaoyue Song, Jianfeng Zhang, Xiu Li et al.

CVPR 2025posterarXiv:2502.12135
16
citations
#1682

Depth Any Camera: Zero-Shot Metric Depth Estimation from Any Camera

Yuliang Guo, Sparsh Garg, S. Mahdi H. Miangoleh et al.

CVPR 2025posterarXiv:2501.02464
16
citations
#1683

MetaOOD: Automatic Selection of OOD Detection Models

Yuehan Qin, Yichi Zhang, Yi Nian et al.

ICLR 2025posterarXiv:2410.03074
16
citations
#1684

Neighboring Autoregressive Modeling for Efficient Visual Generation

Yefei He, Yuanyu He, Shaoxuan He et al.

ICCV 2025posterarXiv:2503.10696
16
citations
#1685

What Matters in Learning from Large-Scale Datasets for Robot Manipulation

Vaibhav Saxena, Matthew Bronars, Nadun Ranawaka Arachchige et al.

ICLR 2025posterarXiv:2506.13536
16
citations
#1686

Training Neural Networks as Recognizers of Formal Languages

Alexandra Butoi, Ghazal Khalighinejad, Anej Svete et al.

ICLR 2025posterarXiv:2411.07107
16
citations
#1687

DexVLG: Dexterous Vision-Language-Grasp Model at Scale

Jiawei He, Danshi Li, Xinqiang Yu et al.

ICCV 2025highlightarXiv:2507.02747
16
citations
#1688

Adaptive Methods through the Lens of SDEs: Theoretical Insights on the Role of Noise

Enea Monzio Compagnoni, Tianlin Liu, Rustem Islamov et al.

ICLR 2025posterarXiv:2411.15958
16
citations
#1689

Memory Injection Attacks on LLM Agents via Query-Only Interaction

Shen Dong, Shaochen Xu, Pengfei He et al.

NEURIPS 2025posterarXiv:2503.03704
16
citations
#1690

Generative Flows on Synthetic Pathway for Drug Design

Seonghwan Seo, Minsu Kim, Tony Shen et al.

ICLR 2025posterarXiv:2410.04542
16
citations
#1691

UniRestore: Unified Perceptual and Task-Oriented Image Restoration Model Using Diffusion Prior

I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu et al.

CVPR 2025highlightarXiv:2501.13134
16
citations
#1692

VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers

Yating Wang, Haoyi Zhu, Mingyu Liu et al.

ICCV 2025posterarXiv:2507.01016
16
citations
#1693

Spiking Transformer with Spatial-Temporal Attention

Donghyun Lee, Yuhang Li, Youngeun Kim et al.

CVPR 2025posterarXiv:2409.19764
16
citations
#1694

BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks

Yunhan Zhao, Xiang Zheng, Lin Luo et al.

ICLR 2025posterarXiv:2410.20971
16
citations
#1695

A Many-Objective Problem Where Crossover Is Provably Indispensable

Andre Opris

AAAI 2025paper
16
citations
#1696

MMedPO: Aligning Medical Vision-Language Models with Clinical-Aware Multimodal Preference Optimization

Kangyu Zhu, Peng Xia, Yun Li et al.

ICML 2025posterarXiv:2412.06141
16
citations
#1697

FastVID: Dynamic Density Pruning for Fast Video Large Language Models

Leqi Shen, Guoqiang Gong, Tao He et al.

NEURIPS 2025oralarXiv:2503.11187
16
citations
#1698

PointOBB-v2: Towards Simpler, Faster, and Stronger Single Point Supervised Oriented Object Detection

Botao Ren, Xue Yang, Yi Yu et al.

ICLR 2025posterarXiv:2410.08210
16
citations
#1699

ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark

Ronghao Dang, Yuqian Yuan, Wenqi Zhang et al.

CVPR 2025posterarXiv:2501.05031
16
citations
#1700

How Much is a Noisy Image Worth? Data Scaling Laws for Ambient Diffusion.

Giannis Daras, Yeshwanth Cherapanamjeri, Constantinos C Daskalakis

ICLR 2025posterarXiv:2411.02780
16
citations
#1701

Adaptive Length Image Tokenization via Recurrent Allocation

Shivam Duggal, Phillip Isola, Antonio Torralba et al.

ICLR 2025posterarXiv:2411.02393
16
citations
#1702

Structured Packing in LLM Training Improves Long Context Utilization

Konrad Staniszewski, Szymon Tworkowski, Sebastian Jaszczur et al.

AAAI 2025paperarXiv:2312.17296
16
citations
#1703

Mixture of Noise for Pre-Trained Model-Based Class-Incremental Learning

Kai Jiang, Zhengyan Shi, Dell Zhang et al.

NEURIPS 2025posterarXiv:2509.16738
16
citations
#1704

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction

Yuanhao Cai, He Zhang, Kai Zhang et al.

ICCV 2025posterarXiv:2411.14384
16
citations
#1705

Learning Interpretable Hierarchical Dynamical Systems Models from Time Series Data

Manuel Brenner, Elias Weber, Georgia Koppe et al.

ICLR 2025posterarXiv:2410.04814
16
citations
#1706

Large Images Are Gaussians: High-Quality Large Image Representation with Levels of 2D Gaussian Splatting

Lingting Zhu, Guying Lin, Jinnan Chen et al.

AAAI 2025paperarXiv:2502.09039
16
citations
#1707

Tracing Representation Progression: Analyzing and Enhancing Layer-Wise Similarity

Jiachen Jiang, Jinxin Zhou, Zhihui Zhu

ICLR 2025posterarXiv:2406.14479
16
citations
#1708

CBGBench: Fill in the Blank of Protein-Molecule Complex Binding Graph

Haitao Lin, Guojiang Zhao, Odin Zhang et al.

ICLR 2025posterarXiv:2406.10840
16
citations
#1709

Efficient Part-level 3D Object Generation via Dual Volume Packing

Jiaxiang Tang, Ruijie Lu, Max Li et al.

NEURIPS 2025posterarXiv:2506.09980
16
citations
#1710

Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption

Du CHEN, Tianhe Wu, Kede Ma et al.

CVPR 2025posterarXiv:2503.11221
16
citations
#1711

Mimir: Improving Video Diffusion Models for Precise Text Understanding

Shuai Tan, Biao Gong, Yutong Feng et al.

CVPR 2025posterarXiv:2412.03085
16
citations
#1712

APE: Faster and Longer Context-Augmented Generation via Adaptive Parallel Encoding

Xinyu Yang, Tianqi Chen, Beidi Chen

ICLR 2025posterarXiv:2502.05431
16
citations
#1713

VidComposition: Can MLLMs Analyze Compositions in Compiled Videos?

Yunlong Tang, JunJia Guo, Hang Hua et al.

CVPR 2025posterarXiv:2411.10979
16
citations
#1714

Controlling Large Language Models Through Concept Activation Vectors

Hanyu Zhang, Xiting Wang, Chengao Li et al.

AAAI 2025paperarXiv:2501.05764
16
citations
#1715

MMDisCo: Multi-Modal Discriminator-Guided Cooperative Diffusion for Joint Audio and Video Generation

Akio Hayakawa, Masato Ishii, Takashi Shibuya et al.

ICLR 2025posterarXiv:2405.17842
16
citations
#1716

Palu: KV-Cache Compression with Low-Rank Projection

Chi-Chih Chang, Wei-Cheng Lin, Chien-Yu Lin et al.

ICLR 2025poster
16
citations
#1717

EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding

Yuqi Wu, Wenzhao Zheng, Sicheng Zuo et al.

ICCV 2025posterarXiv:2412.04380
16
citations
#1718

NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments

Xuan Yao, Junyu Gao, Changsheng Xu

ICCV 2025posterarXiv:2506.23468
16
citations
#1719

Understanding and Enhancing the Transferability of Jailbreaking Attacks

Runqi Lin, Bo Han, Fengwang Li et al.

ICLR 2025posterarXiv:2502.03052
16
citations
#1720

Degradation-Aware Feature Perturbation for All-in-One Image Restoration

Xiangpeng Tian, Xiangyu Liao, Xiao Liu et al.

CVPR 2025posterarXiv:2505.12630
16
citations
#1721

Revisiting MAE Pre-training for 3D Medical Image Segmentation

Tassilo Wald, Constantin Ulrich, Stanislav Lukyanenko et al.

CVPR 2025highlightarXiv:2410.23132
16
citations
#1722

OBI-Bench: Can LMMs Aid in Study of Ancient Script on Oracle Bones?

Zijian Chen, tingzhu chen, Wenjun Zhang et al.

ICLR 2025posterarXiv:2412.01175
16
citations
#1723

Global-Local Tree Search in VLMs for 3D Indoor Scene Generation

Wei Deng, Mengshi Qi, Huadong Ma

CVPR 2025posterarXiv:2503.18476
16
citations
#1724

Learnable Expansion of Graph Operators for Multi-Modal Feature Fusion

Dexuan Ding, Lei Wang, Liyun Zhu et al.

ICLR 2025posterarXiv:2410.01506
16
citations
#1725

ROS-SAM: High-Quality Interactive Segmentation for Remote Sensing Moving Object

Zhe Shan, Yang Liu, Lei Zhou et al.

CVPR 2025posterarXiv:2503.12006
16
citations
#1726

EmoDubber: Towards High Quality and Emotion Controllable Movie Dubbing

Gaoxiang Cong, Jiadong Pan, Liang Li et al.

CVPR 2025highlightarXiv:2412.08988
16
citations
#1727

LLMs Can Plan Only If We Tell Them

Bilgehan Sel, Ruoxi Jia, Ming Jin

ICLR 2025posterarXiv:2501.13545
16
citations
#1728

Model Equality Testing: Which Model is this API Serving?

Irena Gao, Percy Liang, Carlos Guestrin

ICLR 2025posterarXiv:2410.20247
16
citations
#1729

ViLLa: Video Reasoning Segmentation with Large Language Model

rongkun Zheng, Lu Qi, Xi Chen et al.

ICCV 2025posterarXiv:2407.14500
16
citations
#1730

Prior-guided Hierarchical Harmonization Network for Efficient Image Dehazing

Xiongfei Su, Siyuan Li, Yuning Cui et al.

AAAI 2025paperarXiv:2503.01136
16
citations
#1731

Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation

Xinyu Yang, Yuwei An, Hongyi Liu et al.

NEURIPS 2025spotlightarXiv:2506.09991
16
citations
#1732

The Same but Different: Structural Similarities and Differences in Multilingual Language Modeling

Ruochen Zhang, Qinan Yu, Matianyu Zang et al.

ICLR 2025posterarXiv:2410.09223
16
citations
#1733

Spatiotemporal Skip Guidance for Enhanced Video Diffusion Sampling

Junha Hyung, Kinam Kim, Susung Hong et al.

CVPR 2025posterarXiv:2411.18664
16
citations
#1734

Logical Consistency of Large Language Models in Fact-Checking

Bishwamittra Ghosh, Sarah Hasan, Naheed Anjum Arafat et al.

ICLR 2025posterarXiv:2412.16100
15
citations
#1735

Power Lines: Scaling laws for weight decay and batch size in LLM pre-training

Shane Bergsma, Nolan Dey, Gurpreet Gosal et al.

NEURIPS 2025posterarXiv:2505.13738
15
citations
#1736

Can In-context Learning Really Generalize to Out-of-distribution Tasks?

Qixun Wang, Yifei Wang, Xianghua Ying et al.

ICLR 2025posterarXiv:2410.09695
15
citations
#1737

RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Yan Gong, Yiren Song, Yicheng Li et al.

NEURIPS 2025posterarXiv:2506.02528
15
citations
#1738

Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices

Junyan Lin, Haoran Chen, Yue Fan et al.

CVPR 2025posterarXiv:2503.06063
15
citations
#1739

Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs

Lei Zhang, Yunshui Li, Jiaming Li et al.

AAAI 2025paperarXiv:2406.18294
15
citations
#1740

BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning

Artem Zholus, Maksim Kuznetsov, Roman Schutski et al.

AAAI 2025paperarXiv:2406.03686
15
citations
#1741

3D-AffordanceLLM: Harnessing Large Language Models for Open-Vocabulary Affordance Detection in 3D Worlds

Hengshuo Chu, Xiang Deng, Qi Lv et al.

ICLR 2025posterarXiv:2502.20041
15
citations
#1742

FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution

Junyang Chen, Jinshan Pan, Jiangxin Dong

CVPR 2025posterarXiv:2411.18824
15
citations
#1743

Horizon Reduction Makes RL Scalable

Seohong Park, Kevin Frans, Deepinder Mann et al.

NEURIPS 2025spotlightarXiv:2506.04168
15
citations
#1744

Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking

Jiawen Zhu, Huayi Tang, Xin Chen et al.

AAAI 2025paperarXiv:2503.00516
15
citations
#1745

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025posterarXiv:2502.06415
15
citations
#1746

MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation

Ning Li, Xiangmou Qu, Jiamu Zhou et al.

NEURIPS 2025oral
15
citations
#1747

FOLDER: Accelerating Multi-Modal Large Language Models with Enhanced Performance

Haicheng Wang, Zhemeng Yu, Gabriele Spadaro et al.

ICCV 2025posterarXiv:2501.02430
15
citations
#1748

FD2-Net: Frequency-Driven Feature Decomposition Network for Infrared-Visible Object Detection

Ke Li, Di Wang, Zhangyuan Hu et al.

AAAI 2025paperarXiv:2412.09258
15
citations
#1749

GraphMoRE: Mitigating Topological Heterogeneity via Mixture of Riemannian Experts

Zihao Guo, Qingyun Sun, Haonan Yuan et al.

AAAI 2025paperarXiv:2412.11085
15
citations
#1750

To Trust or Not to Trust? Enhancing Large Language Models' Situated Faithfulness to External Contexts

Yukun Huang, Sanxing Chen, Hongyi Cai et al.

ICLR 2025posterarXiv:2410.14675
15
citations
#1751

A CLIP-Powered Framework for Robust and Generalizable Data Selection

Suorong Yang, Peng Ye, Wanli Ouyang et al.

ICLR 2025posterarXiv:2410.11215
15
citations
#1752

Sylber: Syllabic Embedding Representation of Speech from Raw Audio

Cheol Jun Cho, Nicholas Lee, Akshat Gupta et al.

ICLR 2025posterarXiv:2410.07168
15
citations
#1753

ET-SEED: EFFICIENT TRAJECTORY-LEVEL SE(3) EQUIVARIANT DIFFUSION POLICY

Chenrui Tie, Yue Chen, Ruihai Wu et al.

ICLR 2025posterarXiv:2411.03990
15
citations
#1754

Dual Data Alignment Makes AI-Generated Image Detector Easier Generalizable

Ruoxin Chen, Junwei Xi, Zhiyuan Yan et al.

NEURIPS 2025spotlightarXiv:2505.14359
15
citations
#1755

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

Zhibing Li, Tong Wu, Jing Tan et al.

ICLR 2025posterarXiv:2412.12083
15
citations
#1756

A Probabilistic Perspective on Unlearning and Alignment for Large Language Models

Yan Scholten, Stephan Günnemann, Leo Schwinn

ICLR 2025posterarXiv:2410.03523
15
citations
#1757

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

Tianjin Huang, Ziquan Zhu, Gaojie Jin et al.

ICLR 2025posterarXiv:2501.06842
15
citations
#1758

DRoC: Elevating Large Language Models for Complex Vehicle Routing via Decomposed Retrieval of Constraints

Xia Jiang, Yaoxin Wu, Chenhao Zhang et al.

ICLR 2025poster
15
citations
#1759

Spiking Vision Transformer with Saccadic Attention

Shuai Wang, Malu Zhang, Dehao Zhang et al.

ICLR 2025oralarXiv:2502.12677
15
citations
#1760

Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis

Jiapeng Zhu, Ceyuan Yang, Kecheng Zheng et al.

CVPR 2025posterarXiv:2309.03904
15
citations
#1761

The Pitfalls of Memorization: When Memorization Hurts Generalization

Reza Bayat, Mohammad Pezeshki, Elvis Dohmatob et al.

ICLR 2025posterarXiv:2412.07684
15
citations
#1762

Retrieval Augmented Time Series Forecasting

Sungwon Han, Seungeon Lee, MEEYOUNG CHA et al.

ICML 2025oralarXiv:2505.04163
15
citations
#1763

DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

Montgomery Bohde, Mrunali Manjrekar, Runzhong Wang et al.

ICML 2025posterarXiv:2502.09571
15
citations
#1764

SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving

Xuesong Chen, Linjiang Huang, Tao Ma et al.

CVPR 2025posterarXiv:2505.16805
15
citations
#1765

Few-Shot Recognition via Stage-Wise Retrieval-Augmented Finetuning

Tian Liu, Huixin Zhang, Shubham Parashar et al.

CVPR 2025posterarXiv:2406.11148
15
citations
#1766

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model

Jun Zhou, Jiahao Li, Zunnan Xu et al.

CVPR 2025posterarXiv:2503.19839
15
citations
#1767

SAFE: Multitask Failure Detection for Vision-Language-Action Models

Qiao Gu, Yuanliang Ju, Shengxiang Sun et al.

NEURIPS 2025posterarXiv:2506.09937
15
citations
#1768

RoboGround: Robotic Manipulation with Grounded Vision-Language Priors

Haifeng Huang, Xinyi Chen, Yilun Chen et al.

CVPR 2025posterarXiv:2504.21530
15
citations
#1769

Security Attacks on LLM-based Code Completion Tools

Wen Cheng, Ke Sun, Xinyu Zhang et al.

AAAI 2025paperarXiv:2408.11006
15
citations
#1770

Scaling Properties of Diffusion Models For Perceptual Tasks

Rahul Ravishankar, Zeeshan Patel, Jathushan Rajasegaran et al.

CVPR 2025posterarXiv:2411.08034
15
citations
#1771

GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks

Sarp Aykent, Tian Xia

ICLR 2025poster
15
citations
#1772

SiReRAG: Indexing Similar and Related Information for Multihop Reasoning

Nan Zhang, Prafulla Kumar Choubey, Alexander Fabbri et al.

ICLR 2025posterarXiv:2412.06206
15
citations
#1773

PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify

Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa

ICLR 2025posterarXiv:2406.00259
15
citations
#1774

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

Ajay Jaiswal, Yifan Wang, Lu Yin et al.

ICML 2025posterarXiv:2407.11239
15
citations
#1775

Any6D: Model-free 6D Pose Estimation of Novel Object

Taeyeop Lee, Bowen Wen, Minjun Kang et al.

CVPR 2025posterarXiv:2503.18673
15
citations
#1776

Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding

Wei Suo, Lijun Zhang, Mengyang Sun et al.

CVPR 2025highlightarXiv:2503.00361
15
citations
#1777

AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios

Yunjia Qi, Hao Peng, Xiaozhi Wang et al.

NEURIPS 2025spotlight
15
citations
#1778

Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos

Chris Pedersen, Laure Zanna, Joan Bruna

ICML 2025oralarXiv:2503.18731
15
citations
#1779

Reinforcement Learning Finetunes Small Subnetworks in Large Language Models

Sagnik Mukherjee, Lifan Yuan, Dilek Hakkani-Tur et al.

NEURIPS 2025posterarXiv:2505.11711
15
citations
#1780

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach

Jiancong Xiao, Bojian Hou, Zhanliang Wang et al.

ICML 2025posterarXiv:2505.01997
15
citations
#1781

Black-Box Detection of Language Model Watermarks

Thibaud Gloaguen, Nikola Jovanović, Robin Staab et al.

ICLR 2025posterarXiv:2405.20777
15
citations
#1782

DON’T STOP ME NOW: EMBEDDING BASED SCHEDULING FOR LLMS

Rana Shahout, Eran Malach, Chunwei Liu et al.

ICLR 2025poster
15
citations
#1783

Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment

Shuo Wang, Bokui Wang, Zhixiang Shen et al.

ICML 2025posterarXiv:2502.02017
15
citations
#1784

The Scene Language: Representing Scenes with Programs, Words, and Embeddings

Yunzhi Zhang, Zizhang Li, Matt Zhou et al.

CVPR 2025highlightarXiv:2410.16770
15
citations
#1785

RoboScape: Physics-informed Embodied World Model

Yu Shang, Xin Zhang, Yinzhou Tang et al.

NEURIPS 2025oralarXiv:2506.23135
15
citations
#1786

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Andy Zhou, Kevin Wu, Francesco Pinto et al.

NEURIPS 2025posterarXiv:2503.15754
15
citations
#1787

Signature Kernel Conditional Independence Tests in Causal Discovery for Stochastic Processes

Georg Manten, Cecilia Casolo, Emilio Ferrucci et al.

ICLR 2025posterarXiv:2402.18477
15
citations
#1788

Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs

Sungmin Cha, Sungjun Cho, Dasol Hwang et al.

ICLR 2025posterarXiv:2408.06621
15
citations
#1789

Adaptive teachers for amortized samplers

Minsu Kim, Sanghyeok Choi, Taeyoung Yun et al.

ICLR 2025posterarXiv:2410.01432
15
citations
#1790

Boosting Generative Image Modeling via Joint Image-Feature Synthesis

Theodoros Kouzelis, Efstathios Karypidis, Ioannis Kakogeorgiou et al.

NEURIPS 2025spotlightarXiv:2504.16064
15
citations
#1791

Training-Free Efficient Video Generation via Dynamic Token Carving

Yuechen Zhang, Jinbo Xing, bin xia et al.

NEURIPS 2025posterarXiv:2505.16864
15
citations
#1792

UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate

Julián Tachella, Mike Davies, Laurent Jacques

ICLR 2025posterarXiv:2409.01985
15
citations
#1793

Trust, But Verify: A Self-Verification Approach to Reinforcement Learning with Verifiable Rewards

Xiaoyuan Liu, Tian Liang, Zhiwei He et al.

NEURIPS 2025posterarXiv:2505.13445
15
citations
#1794

RefactorBench: Evaluating Stateful Reasoning in Language Agents Through Code

Dhruv Gautam, Spandan Garg, Jinu Jang et al.

ICLR 2025posterarXiv:2503.07832
15
citations
#1795

DINO-Foresight: Looking into the Future with DINO

Efstathios Karypidis, Ioannis Kakogeorgiou, Spyridon Gidaris et al.

NEURIPS 2025posterarXiv:2412.11673
15
citations
#1796

JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration

yunlong lin, Zixu Lin, Haoyu Chen et al.

CVPR 2025posterarXiv:2504.04158
15
citations
#1797

Re-Thinking Inverse Graphics With Large Language Models

Haiwen Feng, Michael J Black, Weiyang Liu et al.

ICLR 2025posterarXiv:2404.15228
15
citations
#1798

Stochastic Deep Restoration Priors for Imaging Inverse Problems

Yuyang Hu, Albert Peng, Weijie Gan et al.

ICML 2025posterarXiv:2410.02057
15
citations
#1799

IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera

Jian Huang, Chengrui Dong, Xuanhua Chen et al.

CVPR 2025highlightarXiv:2410.08107
15
citations
#1800

Discrete to Continuous: Generating Smooth Transition Poses from Sign Language Observations

Shengeng Tang, Jiayi He, Lechao Cheng et al.

CVPR 2025posterarXiv:2411.16810
15
citations