Most Cited 2025 "plug-and-play control" Papers

22,274 papers found

#2801

DreamDistribution: Learning Prompt Distribution for Diverse In-distribution Generation

Brian Nlong Zhao, Yuhang Xiao, Jiashu Xu et al.

ICLR 2025posterarXiv:2312.14216
9
citations
#2802

Realistic Evaluation of Deep Partial-Label Learning Algorithms

Wei Wang, Dong-Dong Wu, Jindong Wang et al.

ICLR 2025posterarXiv:2502.10184
9
citations
#2803

AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws

Oren Neumann, Claudius Gros

NEURIPS 2025spotlightarXiv:2412.11979
9
citations
#2804

SILMM: Self-Improving Large Multimodal Models for Compositional Text-to-Image Generation

Leigang Qu, Haochuan Li, Wenjie Wang et al.

CVPR 2025posterarXiv:2412.05818
9
citations
#2805

X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing

Xinyan Chen, Jianfei Yang

ICLR 2025posterarXiv:2410.10167
9
citations
#2806

Adversarially Robust Out-of-Distribution Detection Using Lyapunov-Stabilized Embeddings

Hossein Mirzaei Sadeghlou, Mackenzie Mathis

ICLR 2025poster
9
citations
#2807

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

Bingrui Li, Wei Huang, Andi Han et al.

ICLR 2025posterarXiv:2410.04870
9
citations
#2808

Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics

Omar Chehab, Anna Korba, Austin Stromme et al.

ICLR 2025posterarXiv:2410.09697
9
citations
#2809

EventGPT: Event Stream Understanding with Multimodal Large Language Models

Shaoyu Liu, Jianing Li, Guanghui Zhao et al.

CVPR 2025posterarXiv:2412.00832
9
citations
#2810

SAMRefiner: Taming Segment Anything Model for Universal Mask Refinement

Yuqi Lin, Hengjia Li, Wenqi Shao et al.

ICLR 2025posterarXiv:2502.06756
9
citations
#2811

(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning

Margaret Li, Sneha Kudugunta, Luke Zettlemoyer

ICLR 2025poster
9
citations
#2812

PassionSR: Post-Training Quantization with Adaptive Scale in One-Step Diffusion based Image Super-Resolution

Zhu Li Bo, Jianze Li, Haotong Qin et al.

CVPR 2025posterarXiv:2411.17106
9
citations
#2813

Edge Prompt Tuning for Graph Neural Networks

Xingbo Fu, Yinhan He, Jundong Li

ICLR 2025posterarXiv:2503.00750
9
citations
#2814

SMARTIES: Spectrum-Aware Multi-Sensor Auto-Encoder for Remote Sensing Images

Gencer Sumbul, Chang Xu, Emanuele Dalsasso et al.

ICCV 2025posterarXiv:2506.19585
9
citations
#2815

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters

Jianping Jiang, Weiye Xiao, Zhengyu Lin et al.

CVPR 2025posterarXiv:2412.00174
9
citations
#2816

GLASS: Guided Latent Slot Diffusion for Object-Centric Learning

Krishnakant Singh, Simone Schaub-Meyer, Stefan Roth

CVPR 2025posterarXiv:2407.17929
9
citations
#2817

On Reasoning Strength Planning in Large Reasoning Models

Leheng Sheng, An Zhang, Zijian Wu et al.

NEURIPS 2025posterarXiv:2506.08390
9
citations
#2818

Fast Summation of Radial Kernels via QMC Slicing

Johannes Hertrich, Tim Jahn, Michael Quellmalz

ICLR 2025posterarXiv:2410.01316
9
citations
#2819

Distilling Monocular Foundation Model for Fine-grained Depth Completion

Yingping Liang, Yutao Hu, Wenqi Shao et al.

CVPR 2025posterarXiv:2503.16970
9
citations
#2820

MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining

Yunze Liu, Li Yi

CVPR 2025posterarXiv:2410.00871
9
citations
#2821

Monet: Mixture of Monosemantic Experts for Transformers

Jungwoo Park, Young Jin Ahn, Kee-Eung Kim et al.

ICLR 2025posterarXiv:2412.04139
9
citations
#2822

I Can Hear You: Selective Robust Training for Deepfake Audio Detection

Zirui Zhang, Wei Hao, Aroon Sankoh et al.

ICLR 2025posterarXiv:2411.00121
9
citations
#2823

CoA: Towards Real Image Dehazing via Compression-and-Adaptation

Long Ma, Yuxin Feng, Yan Zhang et al.

CVPR 2025posterarXiv:2504.05590
9
citations
#2824

Random-Set Neural Networks

Shireen Kudukkil Manchingal, Muhammad Mubashar, Kaizheng Wang et al.

ICLR 2025posterarXiv:2307.05772
9
citations
#2825

Visual-Instructed Degradation Diffusion for All-in-One Image Restoration

Haina Qin, Wenyang Luo, Zewen Chen et al.

CVPR 2025posterarXiv:2506.16960
9
citations
#2826

Accurate and Regret-Aware Numerical Problem Solver for Tabular Question Answering

Yuxiang Wang, Jianzhong Qi, Junhao Gan

AAAI 2025paperarXiv:2410.12846
9
citations
#2827

OLinear: A Linear Model for Time Series Forecasting in Orthogonally Transformed Domain

Wenzhen Yue, Yong Liu, Hao Wang et al.

NEURIPS 2025oralarXiv:2505.08550
9
citations
#2828

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model

Benlin Liu, Yuhao Dong, Yiqin Wang et al.

CVPR 2025posterarXiv:2408.00754
9
citations
#2829

Do Large Language Models Truly Understand Geometric Structures?

Xiaofeng Wang, Yiming Wang, Wenhong Zhu et al.

ICLR 2025posterarXiv:2501.13773
9
citations
#2830

An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

Wentao Qu, Jing Wang, Yongshun Gong et al.

CVPR 2025posterarXiv:2411.16308
9
citations
#2831

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

Zun Wang, Jialu Li, Yicong Hong et al.

ICLR 2025posterarXiv:2412.08467
9
citations
#2832

DreamText: High Fidelity Scene Text Synthesis

Yibin Wang, Weizhong Zhang, Honghui Xu et al.

CVPR 2025posterarXiv:2405.14701
9
citations
#2833

Denoising as Adaptation: Noise-Space Domain Adaptation for Image Restoration

Kang Liao, Zongsheng Yue, Zhouxia Wang et al.

ICLR 2025posterarXiv:2406.18516
9
citations
#2834

Omni-RGPT: Unifying Image and Video Region-level Understanding via Token Marks

Miran Heo, Min-Hung Chen, De-An Huang et al.

CVPR 2025posterarXiv:2501.08326
9
citations
#2835

Rethinking Query-based Transformer for Continual Image Segmentation

Yuchen Zhu, Cheng Shi, Dingyou Wang et al.

CVPR 2025posterarXiv:2507.07831
9
citations
#2836

GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement

Peiye Zhuang, Songfang Han, Chaoyang Wang et al.

ICLR 2025posterarXiv:2406.05649
9
citations
#2837

Beyond the convexity assumption: Realistic tabular data generation under quantifier-free real linear constraints

Mihaela Stoian, Eleonora Giunchiglia

ICLR 2025posterarXiv:2502.18237
9
citations
#2838

BountyBench: Dollar Impact of AI Agent Attackers and Defenders on Real-World Cybersecurity Systems

Andy Zhang, Joey Ji, Celeste Menders et al.

NEURIPS 2025posterarXiv:2505.15216
9
citations
#2839

Fourier Sliced-Wasserstein Embedding for Multisets and Measures

Tal Amir, Nadav Dym

ICLR 2025posterarXiv:2405.16519
9
citations
#2840

Radiology Report Generation via Multi-objective Preference Optimization

Ting Xiao, Lei Shi, Peng Liu et al.

AAAI 2025paperarXiv:2412.08901
9
citations
#2841

MM-Agent: LLM as Agents for Real-world Mathematical Modeling Problem

Fan Liu, Zherui Yang, Cancheng Liu et al.

NEURIPS 2025posterarXiv:2505.14148
9
citations
#2842

Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Sachin Goyal, Christina Baek, Zico Kolter et al.

ICLR 2025poster
9
citations
#2843

Shortcuts and Identifiability in Concept-based Models from a Neuro-Symbolic Lens

Samuele Bortolotti, Emanuele Marconato, Paolo Morettin et al.

NEURIPS 2025posterarXiv:2502.11245
9
citations
#2844

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025posterarXiv:2406.05816
9
citations
#2845

ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks

Feiyang Wang, Xingquan Zuo, Hai Huang et al.

AAAI 2025paper
9
citations
#2846

A Watermark for Order-Agnostic Language Models

Ruibo Chen, Yihan Wu, Yanshuo Chen et al.

ICLR 2025posterarXiv:2410.13805
9
citations
#2847

SuperMat: Physically Consistent PBR Material Estimation at Interactive Rates

Yijia Hong, Yuan-Chen Guo, Ran Yi et al.

ICCV 2025posterarXiv:2411.17515
9
citations
#2848

FisherTune: Fisher-Guided Robust Tuning of Vision Foundation Models for Domain Generalized Segmentation

Dong Zhao, Jinlong Li, Shuang Wang et al.

CVPR 2025posterarXiv:2503.17940
9
citations
#2849

Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks

Peng Xie, Yequan Bie, Jianda Mao et al.

CVPR 2025posterarXiv:2411.15720
9
citations
#2850

GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers

Shijie Ma, Yuying Ge, Teng Wang et al.

ICCV 2025posterarXiv:2503.19480
9
citations
#2851

Zero-Shot Styled Text Image Generation, but Make It Autoregressive

Vittorio Pippi, Fabio Quattrini, Silvia Cascianelli et al.

CVPR 2025posterarXiv:2503.17074
9
citations
#2852

SceneTAP: Scene-Coherent Typographic Adversarial Planner against Vision-Language Models in Real-World Environments

Yue Cao, Yun Xing, Jie Zhang et al.

CVPR 2025posterarXiv:2412.00114
9
citations
#2853

RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

Boyuan Cao, Jiaxin Ye, Yujie Wei et al.

NEURIPS 2025spotlightarXiv:2410.06055
9
citations
#2854

ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models

Yeji Park, Deokyeong Lee, Junsuk Choe et al.

AAAI 2025paperarXiv:2408.13906
9
citations
#2855

Enhancing Time Series Forecasting through Selective Representation Spaces: A Patch Perspective

Xingjian Wu, Xiangfei Qiu, Hanyin Cheng et al.

NEURIPS 2025posterarXiv:2510.14510
9
citations
#2856

UnCommon Objects in 3D

Xingchen Liu, Piyush Tayal, Jianyuan Wang et al.

CVPR 2025posterarXiv:2501.07574
9
citations
#2857

Hierarchical Mixture of Experts: Generalizable Learning for High-Level Synthesis

Weikai Li, Ding Wang, Zijian Ding et al.

AAAI 2025paperarXiv:2410.19225
9
citations
#2858

JarvisArt: Liberating Human Artistic Creativity via an Intelligent Photo Retouching Agent

Yunlong Lin, Zixu Lin, Kunjie Lin et al.

NEURIPS 2025posterarXiv:2506.17612
9
citations
#2859

Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model

Dongki Kim, Wonbin Lee, Sung Ju Hwang

NEURIPS 2025posterarXiv:2502.13449
9
citations
#2860

From Poses to Identity: Training-Free Person Re-Identification via Feature Centralization

Chao Yuan, Guiwei Zhang, Changxiao Ma et al.

CVPR 2025posterarXiv:2503.00938
9
citations
#2861

NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields

Amandine Brunetto, Sascha Hornauer, Fabien Moutarde

ICLR 2025posterarXiv:2405.18213
9
citations
#2862

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

Gangjian Zhang, Nanjie Yao, Shunsi Zhang et al.

CVPR 2025posterarXiv:2412.03103
9
citations
#2863

Probability Density Geodesics in Image Diffusion Latent Space

Qingtao Yu, Jaskirat Singh, Zhaoyuan Yang et al.

CVPR 2025posterarXiv:2504.06675
9
citations
#2864

DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes

Yiyuan Liang, Zhiying Yan, Liqun Chen et al.

AAAI 2025paperarXiv:2412.19458
9
citations
#2865

Video-ColBERT: Contextualized Late Interaction for Text-to-Video Retrieval

Arun Reddy, Alexander Martin, Eugene Yang et al.

CVPR 2025posterarXiv:2503.19009
9
citations
#2866

EmergentTTS-Eval: Evaluating TTS Models on Complex Prosodic, Expressiveness, and Linguistic Challenges Using Model-as-a-Judge

Ruskin Raj Manku, Yuzhi Tang, Xingjian Shi et al.

NEURIPS 2025posterarXiv:2505.23009
9
citations
#2867

MUST: The First Dataset and Unified Framework for Multispectral UAV Single Object Tracking

Haolin Qin, Tingfa Xu, Tianhao Li et al.

CVPR 2025posterarXiv:2503.17699
9
citations
#2868

DIFFER: Disentangling Identity Features via Semantic Cues for Clothes-Changing Person Re-ID

Xin Liang, Yogesh S. Rawat

CVPR 2025posterarXiv:2503.22912
9
citations
#2869

PreciseCam: Precise Camera Control for Text-to-Image Generation

Edurne Bernal-Berdun, Ana Serrano, Belen Masia et al.

CVPR 2025posterarXiv:2501.12910
9
citations
#2870

VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation

Weiming Ren, Huan Yang, Jie Min et al.

CVPR 2025posterarXiv:2412.00927
9
citations
#2871

Counterfactual Generative Modeling with Variational Causal Inference

Yulun Wu, Louis McConnell, Claudia Iriondo

ICLR 2025posterarXiv:2410.12730
9
citations
#2872

LoRA Subtraction for Drift-Resistant Space in Exemplar-Free Continual Learning

Xuan Liu, Xiaobin Chang

CVPR 2025posterarXiv:2503.18985
9
citations
#2873

ZoRI: Towards Discriminative Zero-Shot Remote Sensing Instance Segmentation

Shiqi Huang, Shuting He, Bihan Wen

AAAI 2025paperarXiv:2412.12798
9
citations
#2874

R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing

Tianyu Fu, Yi Ge, Yichen You et al.

NEURIPS 2025posterarXiv:2505.21600
9
citations
#2875

LiDAR-RT: Gaussian-based Ray Tracing for Dynamic LiDAR Re-simulation

Chenxu Zhou, Lvchang Fu, Sida Peng et al.

CVPR 2025posterarXiv:2412.15199
9
citations
#2876

GUARDIAN: Safeguarding LLM Multi-Agent Collaborations with Temporal Graph Modeling

Jialong Zhou, Lichao Wang, Xiao Yang

NEURIPS 2025oralarXiv:2505.19234
9
citations
#2877

GIM: A Million-scale Benchmark for Generative Image Manipulation Detection and Localization

Yirui Chen, Xudong Huang, Quan Zhang et al.

AAAI 2025paperarXiv:2406.16531
9
citations
#2878

DepthCues: Evaluating Monocular Depth Perception in Large Vision Models

Duolikun Danier, Mehmet Aygun, Changjian Li et al.

CVPR 2025posterarXiv:2411.17385
9
citations
#2879

PMQ-VE: Progressive Multi-Frame Quantization for Video Enhancement

ZhanFeng Feng, Long Peng, Xin Di et al.

NEURIPS 2025oralarXiv:2505.12266
9
citations
#2880

Markov Persuasion Processes: Learning to Persuade From Scratch

Francesco Bacchiocchi, Francesco Emanuele Stradi, Matteo Castiglioni et al.

NEURIPS 2025posterarXiv:2402.03077
9
citations
#2881

Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy

You Li, Fan Ma, Yi Yang

CVPR 2025posterarXiv:2411.16752
9
citations
#2882

Quality-Driven Curation of Remote Sensing Vision-Language Data via Learned Scoring Models

Dilxat Muhtar, Enzhuo Zhang, Zhenshi Li et al.

NEURIPS 2025posterarXiv:2503.00743
9
citations
#2883

MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion

Zador Pataki, Paul-Edouard Sarlin, Johannes Schönberger et al.

CVPR 2025posterarXiv:2504.20040
9
citations
#2884

GIFT: Unlocking Full Potential of Labels in Distilled Dataset at Near-zero Cost

Xinyi Shang, Peng Sun, Tao Lin

ICLR 2025posterarXiv:2405.14736
9
citations
#2885

Bayesian Test-Time Adaptation for Vision-Language Models

Lihua Zhou, Mao Ye, Shuaifeng Li et al.

CVPR 2025posterarXiv:2503.09248
9
citations
#2886

HyperGS: Hyperspectral 3D Gaussian Splatting

Christopher Thirgood, Oscar Mendez, Erin Chao Ling et al.

CVPR 2025posterarXiv:2412.12849
9
citations
#2887

Empowering Users in Digital Privacy Management through Interactive LLM-Based Agents

Bolun Sun, Yifan Zhou, Haiyun Jiang

ICLR 2025posterarXiv:2410.11906
9
citations
#2888

Theoretically Grounded Framework for LLM Watermarking: A Distribution-Adaptive Approach

Haiyun He, Yepeng Liu, Ziqiao Wang et al.

NEURIPS 2025posterarXiv:2410.02890
9
citations
#2889

Universal Video Temporal Grounding with Generative Multi-modal Large Language Models

Zeqian Li, Shangzhe Di, Zhonghua Zhai et al.

NEURIPS 2025oralarXiv:2506.18883
9
citations
#2890

Aux-Think: Exploring Reasoning Strategies for Data-Efficient Vision-Language Navigation

Shuo Wang, Yongcai Wang, Wanting Li et al.

NEURIPS 2025posterarXiv:2505.11886
9
citations
#2891

ScribbleLight: Single Image Indoor Relighting with Scribbles

Jun Myeong Choi, Annie N. Wang, Pieter Peers et al.

CVPR 2025posterarXiv:2411.17696
9
citations
#2892

KeyFace: Expressive Audio-Driven Facial Animation for Long Sequences via KeyFrame Interpolation

Antoni Bigata Casademunt, Michał Stypułkowski, Rodrigo Mira et al.

CVPR 2025posterarXiv:2503.01715
9
citations
#2893

Constrained Fair and Efficient Allocations

Benjamin Cookson, Soroush Ebadian, Nisarg Shah

AAAI 2025paperarXiv:2411.00133
9
citations
#2894

ViBiDSampler: Enhancing Video Interpolation Using Bidirectional Diffusion Sampler

Serin Yang, Taesung Kwon, Jong Chul Ye

ICLR 2025oralarXiv:2410.05651
9
citations
#2895

HiMoR: Monocular Deformable Gaussian Reconstruction with Hierarchical Motion Representation

Yiming Liang, Tianhan Xu, Yuta Kikuchi

CVPR 2025posterarXiv:2504.06210
9
citations
#2896

TraF-Align: Trajectory-aware Feature Alignment for Asynchronous Multi-agent Perception

Zhiying Song, Lei Yang, Fuxi Wen et al.

CVPR 2025posterarXiv:2503.19391
9
citations
#2897

Strategy Coopetition Explains the Emergence and Transience of In-Context Learning

Aaditya Singh, Ted Moskovitz, Sara Dragutinović et al.

ICML 2025oralarXiv:2503.05631
9
citations
#2898

Deep Nonlinear Sufficient Dimension Reduction

Yinfeng Chen, Yuling Jiao, Rui Qiu et al.

NEURIPS 2025poster
9
citations
#2899

AgentTTS: Large Language Model Agent for Test-time Compute-optimal Scaling Strategy in Complex Tasks

Fali Wang, Hui Liu, Zhenwei Dai et al.

NEURIPS 2025posterarXiv:2508.00890
9
citations
#2900

Update Your Transformer to the Latest Release: Re-Basin of Task Vectors

Filippo Rinaldi, Giacomo Capitani, Lorenzo Bonicelli et al.

ICML 2025posterarXiv:2505.22697
9
citations
#2901

$\gamma$-MoD: Exploring Mixture-of-Depth Adaptation for Multimodal Large Language Models

Yaxin Luo, Gen Luo, Jiayi Ji et al.

ICLR 2025poster
9
citations
#2902

Controllable Generation via Locally Constrained Resampling

Kareem Ahmed, Kai-Wei Chang, Guy Van den Broeck

ICLR 2025posterarXiv:2410.13111
9
citations
#2903

On Temperature Scaling and Conformal Prediction of Deep Classifiers

Lahav Dabah, Tom Tirer

ICML 2025posterarXiv:2402.05806
9
citations
#2904

WeatherGFM: Learning a Weather Generalist Foundation Model via In-context Learning

Xiangyu Zhao, Zhiwang Zhou, Wenlong Zhang et al.

ICLR 2025oralarXiv:2411.05420
9
citations
#2905

RSafe: Incentivizing proactive reasoning to build robust and adaptive LLM safeguards

Jingnan Zheng, Xiangtian Ji, Yijun Lu et al.

NEURIPS 2025posterarXiv:2506.07736
9
citations
#2906

Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Dongyoung Kim, Huiwon Jang, Sumin Park et al.

NEURIPS 2025posterarXiv:2506.00070
9
citations
#2907

PlanarSplatting: Accurate Planar Surface Reconstruction in 3 Minutes

Bin Tan, Rui Yu, Yujun Shen et al.

CVPR 2025highlightarXiv:2412.03451
9
citations
#2908

Circumventing Shortcuts in Audio-visual Deepfake Detection Datasets with Unsupervised Learning

Stefan Smeu, Dragos-Alexandru Boldisor, Dan Oneata et al.

CVPR 2025highlightarXiv:2412.00175
9
citations
#2909

MagCache: Fast Video Generation with Magnitude-Aware Cache

Zehong Ma, Longhui Wei, Feng Wang et al.

NEURIPS 2025posterarXiv:2506.09045
9
citations
#2910

ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Liyan Tang, Grace Kim, Xinyu Zhao et al.

NEURIPS 2025posterarXiv:2505.13444
9
citations
#2911

Efficient Alternating Minimization with Applications to Weighted Low Rank Approximation

Zhao Song, Mingquan Ye, Junze Yin et al.

ICLR 2025posterarXiv:2306.04169
9
citations
#2912

BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation

Yulu Pan, Ce Zhang, Gedas Bertasius

CVPR 2025posterarXiv:2503.20781
9
citations
#2913

DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Zheng Chen, Zichen Zou, Kewei Zhang et al.

NEURIPS 2025posterarXiv:2505.16239
9
citations
#2914

Online Video Understanding: OVBench and VideoChat-Online

Zhenpeng Huang, Xinhao Li, Jiaqi Li et al.

CVPR 2025posterarXiv:2501.00584
9
citations
#2915

Solving Video Inverse Problems Using Image Diffusion Models

Taesung Kwon, Jong Chul Ye

ICLR 2025oralarXiv:2409.02574
9
citations
#2916

LITA-GS: Illumination-Agnostic Novel View Synthesis via Reference-Free 3D Gaussian Splatting and Physical Priors

Han Zhou, Wei Dong, Jun Chen

CVPR 2025posterarXiv:2504.00219
9
citations
#2917

Language Guided Concept Bottleneck Models for Interpretable Continual Learning

Lu Yu, HaoYu Han, Zhe Tao et al.

CVPR 2025posterarXiv:2503.23283
9
citations
#2918

All-in-One: Transferring Vision Foundation Models into Stereo Matching

Jingyi Zhou, Haoyu Zhang, Jiakang Yuan et al.

AAAI 2025paperarXiv:2412.09912
9
citations
#2919

Scaling Laws for Differentially Private Language Models

Ryan McKenna, Yangsibo Huang, Amer Sinha et al.

ICML 2025posterarXiv:2501.18914
9
citations
#2920

LesionLocator: Zero-Shot Universal Tumor Segmentation and Tracking in 3D Whole-Body Imaging

Maximilian Rokuss, Yannick Kirchhoff, Seval Akbal et al.

CVPR 2025posterarXiv:2502.20985
9
citations
#2921

QuaDiM: A Conditional Diffusion Model For Quantum State Property Estimation

Yehui Tang, Mabiao Long, Junchi Yan

ICLR 2025poster
9
citations
#2922

Distributive Fairness in Large Language Models: Evaluating Alignment with Human Values

Hadi Hosseini, Samarth Khanna

NEURIPS 2025posterarXiv:2502.00313
9
citations
#2923

ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos

Tanveer Hannan, Md Mohaiminul Islam, Jindong Gu et al.

CVPR 2025posterarXiv:2411.14901
9
citations
#2924

MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments

Matthieu Cord, Antonin Vobecky, Oriane Siméoni et al.

ICLR 2025posterarXiv:2307.09361
9
citations
#2925

SuperPC: A Single Diffusion Model for Point Cloud Completion, Upsampling, Denoising, and Colorization

Yi Du, Zhipeng Zhao, Shaoshu Su et al.

CVPR 2025posterarXiv:2503.14558
9
citations
#2926

X-Dancer: Expressive Music to Human Dance Video Generation

Zeyuan Chen, Hongyi Xu, Guoxian Song et al.

ICCV 2025highlightarXiv:2502.17414
9
citations
#2927

HOPE for a Robust Parameterization of Long-memory State Space Models

Annan Yu, Michael W Mahoney, N. Benjamin Erichson

ICLR 2025posterarXiv:2405.13975
9
citations
#2928

Synthesizing Privacy-Preserving Text Data via Finetuning *without* Finetuning Billion-Scale LLMs

Bowen Tan, Zheng Xu, Eric Xing et al.

ICML 2025posterarXiv:2503.12347
9
citations
#2929

REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments

Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman et al.

ICLR 2025posterarXiv:2412.04759
9
citations
#2930

HO-Cap: A Capture System and Dataset for 3D Reconstruction and Pose Tracking of Hand-Object Interaction

Jikai Wang, Qifan Zhang, Yu-Wei Chao et al.

NEURIPS 2025posterarXiv:2406.06843
9
citations
#2931

Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation

Gao Peng, Le Zhuo, Dongyang Liu et al.

ICLR 2025oral
9
citations
#2932

DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

Qingxuan Wu, Zhiyang Dou, Sirui Xu et al.

ICLR 2025posterarXiv:2406.17988
9
citations
#2933

Can Classic GNNs Be Strong Baselines for Graph-level Tasks? Simple Architectures Meet Excellence

Yuankai Luo, Lei Shi, Xiao-Ming Wu

ICML 2025posterarXiv:2502.09263
9
citations
#2934

Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning

Patrick Yin, Tyler Westenbroek, Ching-An Cheng et al.

ICLR 2025posterarXiv:2502.02705
9
citations
#2935

Knowledge-Aligned Counterfactual-Enhancement Diffusion Perception for Unsupervised Cross-Domain Visual Emotion Recognition

Wen Yin, Yong Wang, Guiduo Duan et al.

CVPR 2025posterarXiv:2505.19694
9
citations
#2936

Dual Prompting Image Restoration with Diffusion Transformers

Dehong Kong, Fan Li, Zhixin Wang et al.

CVPR 2025posterarXiv:2504.17825
9
citations
#2937

MegActor-Sigma: Unlocking Flexible Mixed-Modal Control in Portrait Animation with Diffusion Transformer

Shurong Yang, Huadong Li, Juhao Wu et al.

AAAI 2025paper
9
citations
#2938

When Do LLMs Help With Node Classification? A Comprehensive Analysis

Xixi Wu, Yifei Shen, Fangzhou Ge et al.

ICML 2025posterarXiv:2502.00829
9
citations
#2939

Solving Inequality Proofs with Large Language Models

Jiayi Sheng, Luna Lyu, Jikai Jin et al.

NEURIPS 2025spotlightarXiv:2506.07927
9
citations
#2940

Attention-Driven GUI Grounding: Leveraging Pretrained Multimodal Large Language Models Without Fine-Tuning

Hai-Ming Xu, Qi Chen, Lei Wang et al.

AAAI 2025paperarXiv:2412.10840
9
citations
#2941

Enhancing Foundation Models for Time Series Forecasting via Wavelet-based Tokenization

Luca Masserano, Abdul Fatir Ansari, Boran Han et al.

ICML 2025oralarXiv:2412.05244
9
citations
#2942

Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control

Hejia Chen, Haoxian Zhang, Shoulong Zhang et al.

ICLR 2025oralarXiv:2503.14517
9
citations
#2943

ICM-Assistant: Instruction-tuning Multimodal Large Language Models for Rule-based Explainable Image Content Moderation

Mengyang Wu, Yuzhi Zhao, Jialun Cao et al.

AAAI 2025paperarXiv:2412.18216
9
citations
#2944

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?

Simon Park, Abhishek Panigrahi, Yun Cheng et al.

ICML 2025posterarXiv:2501.02669
9
citations
#2945

Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization

Wei Liu, Zhiying Deng, Zhongyu Niu et al.

ICLR 2025posterarXiv:2503.06202
9
citations
#2946

Decomposition Polyhedra of Piecewise Linear Functions

Marie-Charlotte Brandenburg, Moritz Grillo, Christoph Hertrich

ICLR 2025posterarXiv:2410.04907
9
citations
#2947

Physical Plausibility-aware Trajectory Prediction via Locomotion Embodiment

Hiromu Taketsugu, Takeru Oba, Takahiro Maeda et al.

CVPR 2025posterarXiv:2503.17267
9
citations
#2948

PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations

Benjamin Holzschuh, Qiang Liu, Georg Kohl et al.

ICML 2025oralarXiv:2505.24717
9
citations
#2949

HybridGS: High-Efficiency Gaussian Splatting Data Compression using Dual-Channel Sparse Representation and Point Cloud Encoder

Qi Yang, Le Yang, Geert Van der Auwera et al.

ICML 2025posterarXiv:2505.01938
9
citations
#2950

Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation

Shengqi Liu, Yuhao Cheng, Zhuo Chen et al.

ICCV 2025posterarXiv:2412.14453
9
citations
#2951

Diffusion Tree Sampling: Scalable inference-time alignment of diffusion models

Vineet Jain, Kusha Sareen, Mohammad Pedramfar et al.

NEURIPS 2025posterarXiv:2506.20701
9
citations
#2952

Scaling Collapse Reveals Universal Dynamics in Compute-Optimally Trained Neural Networks

Shikai Qiu, Lechao Xiao, Andrew Wilson et al.

ICML 2025oralarXiv:2507.02119
9
citations
#2953

Objective drives the consistency of representational similarity across datasets

Laure Ciernik, Lorenz Linhardt, Marco Morik et al.

ICML 2025posterarXiv:2411.05561
9
citations
#2954

ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Anurag Ghosh, Shen Zheng, Robert Tamburo et al.

ICCV 2025posterarXiv:2406.07661
9
citations
#2955

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang, Ge Zhang, Yue Wu et al.

ICML 2025posterarXiv:2410.02197
9
citations
#2956

SILO: Solving Inverse Problems with Latent Operators

Ron Raphaeli, Sean Man, Michael Elad

ICCV 2025posterarXiv:2501.11746
9
citations
#2957

StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition

Xin Ding, Hao Wu, Yifan Yang et al.

ICCV 2025posterarXiv:2503.06220
9
citations
#2958

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval

Yating Liu, Zimo Liu, Xiangyuan Lan et al.

AAAI 2025paperarXiv:2503.04144
9
citations
#2959

Diffusion Transformers as Open-World Spatiotemporal Foundation Models

Yuan Yuan, Chonghua Han, Jingtao Ding et al.

NEURIPS 2025oralarXiv:2411.12164
9
citations
#2960

OS-ATLAS: Foundation Action Model for Generalist GUI Agents

Zhiyong Wu, Zhenyu Wu, Fangzhi Xu et al.

ICLR 2025poster
9
citations
#2961

Perspective-Invariant 3D Object Detection

Alan Liang, Lingdong Kong, Dongyue Lu et al.

ICCV 2025posterarXiv:2507.17665
9
citations
#2962

MP-GUI: Modality Perception with MLLMs for GUI Understanding

Ziwei Wang, Weizhi Chen, Leyang Yang et al.

CVPR 2025posterarXiv:2503.14021
9
citations
#2963

UAVScenes: A Multi-Modal Dataset for UAVs

Sijie Wang, Siqi Li, Yawei Zhang et al.

ICCV 2025posterarXiv:2507.22412
9
citations
#2964

How Transformers Learn Structured Data: Insights From Hierarchical Filtering

Jerome Garnier-Brun, Marc Mezard, Emanuele Moscato et al.

ICML 2025posterarXiv:2408.15138
9
citations
#2965

Make Me Happier: Evoking Emotions Through Image Diffusion Models

Qing Lin, Jingfeng Zhang, Yew-Soon Ong et al.

ICCV 2025posterarXiv:2403.08255
9
citations
#2966

SurFhead: Affine Rig Blending for Geometrically Accurate 2D Gaussian Surfel Head Avatars

Jaeseong Lee, Taewoong Kang, Marcel Buehler et al.

ICLR 2025posterarXiv:2410.11682
9
citations
#2967

Can Transformers Do Enumerative Geometry?

Baran Hashemi, Roderic Corominas, Alessandro Giacchetto

ICLR 2025posterarXiv:2408.14915
9
citations
#2968

Diffusion models for Gaussian distributions: Exact solutions and Wasserstein errors

Emile Pierret, Bruno Galerne

ICML 2025posterarXiv:2405.14250
9
citations
#2969

BrainUICL: An Unsupervised Individual Continual Learning Framework for EEG Applications

Yangxuan Zhou, Sha Zhao, Jiquan Wang et al.

ICLR 2025poster
9
citations
#2970

Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution

Du Chen, Liyi Chen, Zhengqiang Zhang et al.

ICCV 2025posterarXiv:2501.06838
9
citations
#2971

MUDDFormer: Breaking Residual Bottlenecks in Transformers via Multiway Dynamic Dense Connections

Da Xiao, Qingye Meng, Shengping Li et al.

ICML 2025posterarXiv:2502.12170
9
citations
#2972

Discrete Codebook World Models for Continuous Control

Aidan Scannell, Mohammadreza Nakhaeinezhadfard, Kalle Kujanpää et al.

ICLR 2025posterarXiv:2503.00653
9
citations
#2973

MINERVA: Evaluating Complex Video Reasoning

Arsha Nagrani, Sachit Menon, Ahmet Iscen et al.

ICCV 2025posterarXiv:2505.00681
9
citations
#2974

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

Xiankang He, Guangkai Xu, Bo Zhang et al.

AAAI 2025paperarXiv:2405.15619
9
citations
#2975

Towards Federated RLHF with Aggregated Client Preference for LLMs

Feijie Wu, Xiaoze Liu, Haoyu Wang et al.

ICLR 2025posterarXiv:2407.03038
9
citations
#2976

MultiADS: Defect-aware Supervision for Multi-type Anomaly Detection and Segmentation in Zero-Shot Learning

Ylli Sadikaj, Hongkuan Zhou, Lavdim Halilaj et al.

ICCV 2025posterarXiv:2504.06740
9
citations
#2977

ReQFlow: Rectified Quaternion Flow for Efficient and High-Quality Protein Backbone Generation

Angxiao Yue, Zichong Wang, Hongteng Xu

ICML 2025posterarXiv:2502.14637
9
citations
#2978

MIB: A Mechanistic Interpretability Benchmark

Aaron Mueller, Atticus Geiger, Sarah Wiegreffe et al.

ICML 2025posterarXiv:2504.13151
9
citations
#2979

Learning to Adapt Frozen CLIP for Few-Shot Test-Time Domain Adaptation

Zhixiang Chi, Li Gu, Huan Liu et al.

ICLR 2025posterarXiv:2506.17307
9
citations
#2980

Graph Generative Pre-trained Transformer

Xiaohui Chen, Yinkai Wang, Jiaxing He et al.

ICML 2025posterarXiv:2501.01073
9
citations
#2981

Diff3DS: Generating View-Consistent 3D Sketch via Differentiable Curve Rendering

Yibo Zhang, Lihong Wang, Changqing Zou et al.

ICLR 2025posterarXiv:2405.15305
9
citations
#2982

Advancing Expert Specialization for Better MoE

Hongcan Guo, Haolang Lu, Guoshun Nan et al.

NEURIPS 2025oralarXiv:2505.22323
9
citations
#2983

Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network

Xiang Fang, Wanlong Fang, Changshuo Wang et al.

AAAI 2025paperarXiv:2412.15678
9
citations
#2984

Jailbreaking as a Reward Misspecification Problem

Zhihui Xie, Jiahui Gao, Lei Li et al.

ICLR 2025posterarXiv:2406.14393
9
citations
#2985

LeFusion: Controllable Pathology Synthesis via Lesion-Focused Diffusion Models

Hantao Zhang, Yuhe Liu, Jiancheng Yang et al.

ICLR 2025posterarXiv:2403.14066
9
citations
#2986

Accelerating LLM Inference with Lossless Speculative Decoding Algorithms for Heterogeneous Vocabularies

Nadav Timor, Jonathan Mamou, Daniel Korat et al.

ICML 2025oralarXiv:2502.05202
9
citations
#2987

Gap-Dependent Bounds for Q-Learning using Reference-Advantage Decomposition

Zhong Zheng, Haochen Zhang, Lingzhou Xue

ICLR 2025posterarXiv:2410.07574
9
citations
#2988

Wide Neural Networks Trained with Weight Decay Provably Exhibit Neural Collapse

Arthur Jacot, Peter Súkeník, Zihan Wang et al.

ICLR 2025posterarXiv:2410.04887
9
citations
#2989

PENCIL: Long Thoughts with Short Memory

Chenxiao Yang, Nati Srebro, David McAllester et al.

ICML 2025posterarXiv:2503.14337
9
citations
#2990

Repo2Run: Automated Building Executable Environment for Code Repository at Scale

Ruida Hu, Chao Peng, Xinchen Wang et al.

NEURIPS 2025spotlightarXiv:2502.13681
9
citations
#2991

Reasoning as an Adaptive Defense for Safety

Taeyoun Kim, Fahim Tajwar, Aditi Raghunathan et al.

NEURIPS 2025posterarXiv:2507.00971
9
citations
#2992

HELM: Hierarchical Encoding for mRNA Language Modeling

Mehdi Yazdani-Jahromi, Mangal Prakash, Tommaso Mansi et al.

ICLR 2025posterarXiv:2410.12459
9
citations
#2993

BodyGen: Advancing Towards Efficient Embodiment Co-Design

Haofei Lu, Zhe Wu, Junliang Xing et al.

ICLR 2025oralarXiv:2503.00533
9
citations
#2994

InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Yunhong Lu, Qichao Wang, Hengyuan Cao et al.

CVPR 2025highlightarXiv:2503.18454
9
citations
#2995

EAP-GP: Mitigating Saturation Effect in Gradient-based Automated Circuit Identification

Lin Zhang, Wenshuo Dong, Zhuoran Zhang et al.

NEURIPS 2025posterarXiv:2502.06852
9
citations
#2996

Rethinking Evaluation of Sparse Autoencoders through the Representation of Polysemous Words

Gouki Gouki, Hiroki Furuta, Yusuke Iwasawa et al.

ICLR 2025posterarXiv:2501.06254
9
citations
#2997

CAD-Coder: Text-to-CAD Generation with Chain-of-Thought and Geometric Reward

Yandong Guan, Xilin Wang, XiMing Xing et al.

NEURIPS 2025posterarXiv:2505.19713
9
citations
#2998

On the Training Convergence of Transformers for In-Context Classification of Gaussian Mixtures

Wei Shen, Ruida Zhou, Jing Yang et al.

ICML 2025posterarXiv:2410.11778
9
citations
#2999

Zeroth-Order Policy Gradient for Reinforcement Learning from Human Feedback without Reward Inference

Qining Zhang, Lei Ying

ICLR 2025posterarXiv:2409.17401
9
citations
#3000

The Non-Linear Representation Dilemma: Is Causal Abstraction Enough for Mechanistic Interpretability?

Denis Sutter, Julian Minder, Thomas Hofmann et al.

NEURIPS 2025spotlightarXiv:2507.08802
9
citations