Most Cited 2025 "style-content separation" Papers

22,274 papers found • Page 11 of 112

#2001

DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models

Dewei Zhou, Mingwei Li, Zongxin Yang et al.

ICCV 2025 • poster • arXiv:2503.12885
15 citations
#2002

Streamlining Redundant Layers to Compress Large Language Models

Xiaodong Chen, Yuxuan Hu, Jing Zhang et al.

ICLR 2025 • poster • arXiv:2403.19135
15 citations
#2003

Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment

Shuo Wang, Bokui Wang, Zhixiang Shen et al.

ICML 2025 • poster • arXiv:2502.02017
15 citations
#2004

Logically Consistent Language Models via Neuro-Symbolic Integration

Diego Calanzone, Stefano Teso, Antonio Vergari

ICLR 2025 • poster • arXiv:2409.13724
15 citations
#2005

Generating Multi-Image Synthetic Data for Text-to-Image Customization

Nupur Kumari, Xi Yin, Jun-Yan Zhu et al.

ICCV 2025 • poster • arXiv:2502.01720
15 citations
#2006

FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution

Junyang Chen, Jinshan Pan, Jiangxin Dong

CVPR 2025 • poster • arXiv:2411.18824
15 citations
#2007

CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference

Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi

ICLR 2025 • poster • arXiv:2310.10845
15 citations
#2008

ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy

Chenrui Tie, Yue Chen, Ruihai Wu et al.

ICLR 2025 • poster • arXiv:2411.03990
15 citations
#2009

Logical Consistency of Large Language Models in Fact-Checking

Bishwamittra Ghosh, Sarah Hasan, Naheed Anjum Arafat et al.

ICLR 2025 • poster • arXiv:2412.16100
15 citations
#2010

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Andy Zhou, Kevin Wu, Francesco Pinto et al.

NeurIPS 2025 • poster • arXiv:2503.15754
15 citations
#2011

TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts

Yu-Hao Huang, Chang Xu, Yueying Wu et al.

AAAI 2025 • paper • arXiv:2501.05403
15 citations
#2012

RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Yan Gong, Yiren Song, Yicheng Li et al.

NeurIPS 2025 • poster • arXiv:2506.02528
15 citations
#2013

GaussianFlowOcc: Sparse and Weakly Supervised Occupancy Estimation using Gaussian Splatting and Temporal Flow

Simon Boeder, Fabian Gigengack, Benjamin Risse

ICCV 2025 • poster • arXiv:2502.17288
15 citations
#2014

ILIAS: Instance-Level Image retrieval At Scale

Giorgos Kordopatis-Zilos, Vladan Stojnić, Anna Manko et al.

CVPR 2025 • poster • arXiv:2502.11748
15 citations
#2015

PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify

Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa

ICLR 2025 • poster • arXiv:2406.00259
15 citations
#2016

Dynamic Camera Poses and Where to Find Them

Chris Rockwell, Joseph Tung, Tsung-Yi Lin et al.

CVPR 2025 • poster • arXiv:2504.17788
15 citations
#2017

SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving

Xuesong Chen, Linjiang Huang, Tao Ma et al.

CVPR 2025 • poster • arXiv:2505.16805
15 citations
#2018

RoboScape: Physics-informed Embodied World Model

Yu Shang, Xin Zhang, Yinzhou Tang et al.

NeurIPS 2025 • oral • arXiv:2506.23135
15 citations
#2019

Spiking Vision Transformer with Saccadic Attention

Shuai Wang, Malu Zhang, Dehao Zhang et al.

ICLR 2025 • oral • arXiv:2502.12677
15 citations
#2020

Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

Yiran Guo, Lijie Xu, Jie Liu et al.

NeurIPS 2025 • poster • arXiv:2505.23564
15 citations
#2021

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

Ajay Jaiswal, Yifan Wang, Lu Yin et al.

ICML 2025 • poster • arXiv:2407.11239
15 citations
#2022

Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later

Han-Jia Ye, Huai-Hong Yin, De-Chuan Zhan et al.

ICLR 2025 • poster • arXiv:2407.03257
15 citations
#2023

DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation

Changdae Oh, Yixuan Li, Kyungwoo Song et al.

ICLR 2025 • poster • arXiv:2410.03782
15 citations
#2024

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach

Jiancong Xiao, Bojian Hou, Zhanliang Wang et al.

ICML 2025 • poster • arXiv:2505.01997
15 citations
#2025

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model

Jun Zhou, Jiahao Li, Zunnan Xu et al.

CVPR 2025 • poster • arXiv:2503.19839
15 citations
#2026

Stochastic Deep Restoration Priors for Imaging Inverse Problems

Yuyang Hu, Albert Peng, Weijie Gan et al.

ICML 2025 • poster • arXiv:2410.02057
15 citations
#2027

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Zhaolin Gao, Wenhao Zhan, Jonathan Chang et al.

ICLR 2025 • poster • arXiv:2410.04612
15 citations
#2028

Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos

Chris Pedersen, Laure Zanna, Joan Bruna

ICML 2025 • oral • arXiv:2503.18731
15 citations
#2029

MangaNinja: Line Art Colorization with Precise Reference Following

Zhiheng Liu, Ka Leong Cheng, Xi Chen et al.

CVPR 2025 • highlight • arXiv:2501.08332
15 citations
#2030

Is Artificial Intelligence Generated Image Detection a Solved Problem?

Ziqiang Li, Jiazhen Yan, Ziwen He et al.

NeurIPS 2025 • poster • arXiv:2505.12335
15 citations
#2031

Breaking the Low-Rank Dilemma of Linear Attention

Qihang Fan, Huaibo Huang, Ran He

CVPR 2025 • poster • arXiv:2411.07635
15 citations
#2032

Personalized Federated Learning for Spatio-Temporal Forecasting: A Dual Semantic Alignment-Based Contrastive Approach

Qingxiang Liu, Sheng Sun, Yuxuan Liang et al.

AAAI 2025 • paper • arXiv:2404.03702
15 citations
#2033

Reinforcement Learning Finetunes Small Subnetworks in Large Language Models

Sagnik Mukherjee, Lifan Yuan, Dilek Hakkani-Tur et al.

NeurIPS 2025 • poster • arXiv:2505.11711
15 citations
#2034

Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs

Sungmin Cha, Sungjun Cho, Dasol Hwang et al.

ICLR 2025 • poster • arXiv:2408.06621
15 citations
#2035

One-for-All Few-Shot Anomaly Detection via Instance-Induced Prompt Learning

Wenxi Lv, Qinliang Su, Wenchao Xu

ICLR 2025 • poster
15 citations
#2036

Black-Box Detection of Language Model Watermarks

Thibaud Gloaguen, Nikola Jovanović, Robin Staab et al.

ICLR 2025 • poster • arXiv:2405.20777
15 citations
#2037

VMBench: A Benchmark for Perception-Aligned Video Motion Generation

Xinran Ling, Chen Zhu, Meiqi Wu et al.

ICCV 2025 • poster • arXiv:2503.10076
15 citations
#2038

Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

Maksim Zhdanov, Max Welling, Jan-Willem van de Meent

ICML 2025 • poster • arXiv:2502.17019
15 citations
#2039

Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization

Zichen Miao, Zhengyuan Yang, Kevin Lin et al.

ICLR 2025 • poster • arXiv:2410.03190
15 citations
#2040

Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes

Isabella Liu, Hao Su, Xiaolong Wang

ICLR 2025 • oral • arXiv:2404.12379
15 citations
#2041

TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting

Songtao Huang, Zhen Zhao, Can Li et al.

ICLR 2025 • oral • arXiv:2502.06910
15 citations
#2042

Wasserstein Flow Matching: Generative Modeling Over Families of Distributions

Doron Haviv, Aram-Alexandre Pooladian, Dana Pe'er et al.

ICML 2025 • poster • arXiv:2411.00698
15 citations
#2043

BillBoard Splatting (BBSplat): Learnable Textured Primitives for Novel View Synthesis

David Svitov, Pietro Morerio, Lourdes Agapito et al.

ICCV 2025 • poster • arXiv:2411.08508
15 citations
#2044

PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models

Dhouib Mohamed, Davide Buscaldi, Vanier Sonia et al.

CVPR 2025 • poster • arXiv:2504.08966
15 citations
#2045

UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate

Julián Tachella, Mike Davies, Laurent Jacques

ICLR 2025 • poster • arXiv:2409.01985
15 citations
#2046

IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera

Jian Huang, Chengrui Dong, Xuanhua Chen et al.

CVPR 2025 • highlight • arXiv:2410.08107
15 citations
#2047

AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories

Yi Zeng, Yu Yang, Andy Zhou et al.

ICLR 2025 • poster
15 citations
#2048

AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling

Zhining Zhang, Chuanyang Jin, Mung Yao Jia et al.

NeurIPS 2025 • spotlight • arXiv:2502.15676
15 citations
#2049

Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards

Charles Arnal, Gaëtan Narozniak, Vivien Cabannes et al.

NeurIPS 2025 • poster • arXiv:2506.20520
15 citations
#2050

CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images

Chen Cheng, Jiacheng Wei, Tianrun Chen et al.

CVPR 2025 • poster • arXiv:2504.04753
15 citations
#2051

MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects

Lei Fan, Dongdong Fan, Zhiguang Hu et al.

CVPR 2025 • poster • arXiv:2412.04867
15 citations
#2052

FlowTok: Flowing Seamlessly Across Text and Image Tokens

Ju He, Qihang Yu, Qihao Liu et al.

ICCV 2025 • poster • arXiv:2503.10772
15 citations
#2053

SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Cheng-De Fan, Chen-Wei Chang, Yi-Ruei Liu et al.

CVPR 2025 • poster • arXiv:2410.17249
15 citations
#2054

Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking

Jiawen Zhu, Huayi Tang, Xin Chen et al.

AAAI 2025 • paper • arXiv:2503.00516
15 citations
#2055

AdaGrad under Anisotropic Smoothness

Yuxing Liu, Rui Pan, Tong Zhang

ICLR 2025 • poster • arXiv:2406.15244
14 citations
#2056

Interpreting the linear structure of vision-language model embedding spaces

Isabel Papadimitriou, Huangyuan Su, Thomas Fel et al.

COLM 2025 • paper • arXiv:2504.11695
14 citations
#2057

An Empirical Analysis of Uncertainty in Large Language Model Evaluations

Qiujie Xie, Qingqiu Li, Zhuohao Yu et al.

ICLR 2025 • poster • arXiv:2502.10709
14 citations
#2058

ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

Yupeng Hou, Jianmo Ni, Zhankui He et al.

ICML 2025 • spotlight • arXiv:2502.13581
14 citations
#2059

Presto! Distilling Steps and Layers for Accelerating Music Generation

Zachary Novack, Ge Zhu, Jonah Casebeer et al.

ICLR 2025 • poster • arXiv:2410.05167
14 citations
#2060

SITCOM: Step-wise Triple-Consistent Diffusion Sampling For Inverse Problems

Ismail Alkhouri, Shijun Liang, Cheng-Han Huang et al.

ICML 2025 • poster • arXiv:2410.04479
14 citations
#2061

Can A Society of Generative Agents Simulate Human Behavior and Inform Public Health Policy? A Case Study on Vaccine Hesitancy

Abe Bohan Hou, Hongru Du, Yichen Wang et al.

COLM 2025 • paper • arXiv:2503.09639
14 citations
#2062

Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization

XiangCheng Zhang, Fang Kong, Baoxiang Wang et al.

ICLR 2025 • poster • arXiv:2302.06834
14 citations
#2063

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

Julie Kallini, Shikhar Murty, Christopher Manning et al.

ICLR 2025 • poster • arXiv:2410.20771
14 citations
#2064

Efficient Track Anything

Yunyang Xiong, Chong Zhou, Xiaoyu Xiang et al.

ICCV 2025 • poster • arXiv:2411.18933
14 citations
#2065

Toward Understanding In-context vs. In-weight Learning

Bryan Chan, Xinyi Chen, Andras Gyorgy et al.

ICLR 2025 • poster • arXiv:2410.23042
14 citations
#2066

Diffusion on Language Model Encodings for Protein Sequence Generation

Viacheslav Meshchaninov, Pavel Strashnov, Andrey Shevtsov et al.

ICML 2025 • poster • arXiv:2403.03726
14 citations
#2067

Learning to Generate Unit Tests for Automated Debugging

Archiki Prasad, Elias Stengel-Eskin, Justin Chen et al.

COLM 2025 • paper • arXiv:2502.01619
14 citations
#2068

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.

NeurIPS 2025 • poster • arXiv:2506.08989
14 citations
#2069

When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

Junwei Luo, Yingying Zhang, Xue Yang et al.

ICCV 2025 • poster • arXiv:2503.07588
14 citations
#2070

CofCA: A Step-wise Counterfactual Multi-hop QA Benchmark

Jian Wu, Linyi Yang, Zhen Wang et al.

ICLR 2025 • poster • arXiv:2402.11924
14 citations
#2071

Optimizing Temperature for Language Models with Multi-Sample Inference

Weihua Du, Yiming Yang, Sean Welleck

ICML 2025 • poster • arXiv:2502.05234
14 citations
#2072

KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference

Xing Li, Zeyu Xing, Yiming Li et al.

ICML 2025 • poster • arXiv:2502.04420
14 citations
#2073

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving

Yuping Wang, Xiangyu Huang, Xiaokang Sun et al.

ICCV 2025 • poster • arXiv:2503.24381
14 citations
#2074

Refine Knowledge of Large Language Models via Adaptive Contrastive Learning

Yinghui Li, Haojing Huang, Jiayi Kuang et al.

ICLR 2025 • poster • arXiv:2502.07184
14 citations
#2075

MVSAnywhere: Zero-Shot Multi-View Stereo

Sergio Izquierdo, Mohamed Sayed, Michael Firman et al.

CVPR 2025 • poster • arXiv:2503.22430
14 citations
#2076

Explore In-Context Segmentation via Latent Diffusion Models

Chaoyang Wang, Xiangtai Li, Henghui Ding et al.

AAAI 2025 • paper • arXiv:2403.09616
14 citations
#2077

Optimal transport-based conformal prediction

Gauthier Thurin, Kimia Nadjahi, Claire Boyer

ICML 2025 • poster • arXiv:2501.18991
14 citations
#2078

Probabilistic Language-Image Pre-Training

Sanghyuk Chun, Wonjae Kim, Song Park et al.

ICLR 2025 • poster • arXiv:2410.18857
14 citations
#2079

Provably Accurate Shapley Value Estimation via Leverage Score Sampling

Christopher Musco, R. Teal Witter

ICLR 2025 • poster • arXiv:2410.01917
14 citations
#2080

Quantized Spike-driven Transformer

Xuerui Qiu, Malu Zhang, Jieyuan Zhang et al.

ICLR 2025 • poster • arXiv:2501.13492
14 citations
#2081

To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning

Tian Qin, David Alvarez-Melis, Samy Jelassi et al.

COLM 2025 • paper • arXiv:2504.07052
14 citations
#2082

R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning

Lijun Sheng, Jian Liang, Zilei Wang et al.

CVPR 2025 • poster • arXiv:2504.11195
14 citations
#2083

Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks

Hongyuan Tao, Ying Zhang, Zhenhao Tang et al.

NeurIPS 2025 • poster • arXiv:2505.16901
14 citations
#2084

Unified Parameter-Efficient Unlearning for LLMs

Chenlu Ding, Jiancan Wu, Yancheng Yuan et al.

ICLR 2025 • poster • arXiv:2412.00383
14 citations
#2085

Backdoor Attacks on Dense Retrieval via Public and Unintentional Triggers

Quanyu Long, Yue Deng, Leilei Gan et al.

COLM 2025 • paper • arXiv:2402.13532
14 citations
#2086

LONG3R: Long Sequence Streaming 3D Reconstruction

Zhuoguang Chen, Minghui Qin, Tianyuan Yuan et al.

ICCV 2025 • poster • arXiv:2507.18255
14 citations
#2087

STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Jiatao Gu, Tianrong Chen, David Berthelot et al.

NeurIPS 2025 • spotlight • arXiv:2506.06276
14 citations
#2088

FlowDec: A flow-based full-band general audio codec with high perceptual quality

Simon Welker, Matthew Le, Ricky T. Q. Chen et al.

ICLR 2025 • poster • arXiv:2503.01485
14 citations
#2089

AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis

Khiem Vuong, Anurag Ghosh, Deva Ramanan et al.

CVPR 2025 • poster • arXiv:2504.13157
14 citations
#2090

UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models

Xin Xu, Jiaxin Zhang, Tianhao Chen et al.

ICLR 2025 • poster • arXiv:2501.13766
14 citations
#2091

Can Transformers Learn Full Bayesian Inference in Context?

Arik Reuter, Tim G. J. Rudner, Vincent Fortuin et al.

ICML 2025 • poster • arXiv:2501.16825
14 citations
#2092

Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking

Mattia Segu, Luigi Piccinelli, Siyuan Li et al.

ICLR 2025 • oral • arXiv:2410.01806
14 citations
#2093

Pippo: High-Resolution Multi-View Humans from a Single Image

Yash Kant, Ethan Weber, Jin Kyu Kim et al.

CVPR 2025 • highlight • arXiv:2502.07785
14 citations
#2094

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Zhangbin Li, Jinxing Zhou, Jing Zhang et al.

AAAI 2025 • paper • arXiv:2412.10749
14 citations
#2095

DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling

Xin Xie, Dong Gong

CVPR 2025 • poster • arXiv:2412.00759
14 citations
#2096

MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation

Weijia Wu, Mingyu Liu, Zeyu Zhu et al.

CVPR 2025 • poster • arXiv:2411.15262
14 citations
#2097

Robust Function-Calling for On-Device Language Model via Function Masking

Qiqiang Lin, Muning Wen, Qiuying Peng et al.

ICLR 2025 • poster • arXiv:2410.04587
14 citations
#2098

Provable weak-to-strong generalization via benign overfitting

David Wu, Anant Sahai

ICLR 2025 • poster • arXiv:2410.04638
14 citations
#2099

HiLo: A Learning Framework for Generalized Category Discovery Robust to Domain Shifts

Hongjun Wang, Sagar Vaze, Kai Han

ICLR 2025 • poster • arXiv:2408.04591
14 citations
#2100

Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling

Dongyi Wang, Yuanwei Jiang, Zhenyi Zhang et al.

NeurIPS 2025 • poster • arXiv:2505.13413
14 citations
#2101

Reversible Decoupling Network for Single Image Reflection Removal

Hao Zhao, Mingjia Li, Qiming Hu et al.

CVPR 2025 • poster • arXiv:2410.08063
14 citations
#2102

FaceShot: Bring Any Character into Life

Junyao Gao, Yanan Sun, Fei Shen et al.

ICLR 2025 • poster • arXiv:2503.00740
14 citations
#2103

Position: Don't Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Sam Bowyer, Laurence Aitchison, Desi Ivanova

ICML 2025 • spotlight • arXiv:2503.01747
14 citations
#2104

GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities

Rao Fu, Dingxi Zhang, Alex Jiang et al.

CVPR 2025 • highlight • arXiv:2412.04244
14 citations
#2105

Personalized Preference Fine-tuning of Diffusion Models

Meihua Dang, Anikait Singh, Linqi Zhou et al.

CVPR 2025 • poster • arXiv:2501.06655
14 citations
#2106

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

Hao Tang, Chen-Wei Xie, Haiyang Wang et al.

NeurIPS 2025 • spotlight • arXiv:2503.01342
14 citations
#2107

Bridging Modalities: Improving Universal Multimodal Retrieval by Multimodal Large Language Models

Xin Zhang, Yanzhao Zhang, Wen Xie et al.

CVPR 2025 • poster
14 citations
#2108

Video Diffusion Models Are Strong Video Inpainter

Minhyeok Lee, Suhwan Cho, Chajin Shin et al.

AAAI 2025 • paper • arXiv:2408.11402
14 citations
#2109

NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou et al.

CVPR 2025 • poster • arXiv:2412.02030
14 citations
#2110

IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation

Yiren Song, Pei Yang, Hai Ci et al.

CVPR 2025 • poster • arXiv:2412.11638
14 citations
#2111

E-Valuating Classifier Two-Sample Tests

Tim Bakker, Christian A. Naesseth, Patrick Forré et al.

ICLR 2025 • poster • arXiv:2210.13027
14 citations
#2112

Assessing and Learning Alignment of Unimodal Vision and Language Models

Le Zhang, Qian Yang, Aishwarya Agrawal

CVPR 2025 • highlight • arXiv:2412.04616
14 citations
#2113

X-Dyna: Expressive Dynamic Human Image Animation

Di Chang, Hongyi Xu, You Xie et al.

CVPR 2025 • highlight • arXiv:2501.10021
14 citations
#2114

DRAWER: Digital Reconstruction and Articulation With Environment Realism

Hongchi Xia, Entong Su, Marius Memmel et al.

CVPR 2025 • poster • arXiv:2504.15278
14 citations
#2115

Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting

Runsong Zhu, Shi Qiu, Zhengzhe Liu et al.

CVPR 2025 • poster • arXiv:2503.14029
14 citations
#2116

Unleashing Vecset Diffusion Model for Fast Shape Generation

Zeqiang Lai, Zhao Yunfei, Zibo Zhao et al.

ICCV 2025 • highlight • arXiv:2503.16302
14 citations
#2117

WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models

Shengda Fan, Xin Cong, Yuepeng Fu et al.

ICLR 2025 • poster • arXiv:2411.05451
14 citations
#2118

DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes

Jinxiu Liu, Shaoheng Lin, Yinxiao Li et al.

CVPR 2025 • poster • arXiv:2412.11100
14 citations
#2119

FunBO: Discovering Acquisition Functions for Bayesian Optimization with FunSearch

Virginia Aglietti, Ira Ktena, Jessica Schrouff et al.

ICML 2025 • poster • arXiv:2406.04824
14 citations
#2120

Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP

Junsung Park, Jungbeom Lee, Jongyoon Song et al.

ICCV 2025 • poster • arXiv:2501.10913
14 citations
#2121

Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint

Junwei Zhou, Xueting Li, Lu Qi et al.

ICLR 2025 • poster • arXiv:2410.15391
14 citations
#2122

SAIST: Segment Any Infrared Small Target Model Guided by Contrastive Language-Image Pretraining

Mingjin Zhang, Xiaolong Li, Fei Gao et al.

CVPR 2025 • poster
14 citations
#2123

CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design

Wenji Fang, Shang Liu, Jing Wang et al.

ICLR 2025 • poster • arXiv:2505.02168
14 citations
#2124

Knowledge Editing with Dynamic Knowledge Graphs for Multi-Hop Question Answering

Yifan Lu, Yigeng Zhou, Jing Li et al.

AAAI 2025 • paper • arXiv:2412.13782
14 citations
#2125

Implicit Search via Discrete Diffusion: A Study on Chess

Jiacheng Ye, Zhenyu Wu, Jiahui Gao et al.

ICLR 2025 • poster • arXiv:2502.19805
14 citations
#2126

Weighted-Reward Preference Optimization for Implicit Model Fusion

Ziyi Yang, Fanqi Wan, Longguang Zhong et al.

ICLR 2025 • poster • arXiv:2412.03187
14 citations
#2127

Pitfalls of Evidence-Based AI Policy

Stephen Casper, David Krueger, Dylan Hadfield-Menell

ICLR 2025 • poster • arXiv:2502.09618
14 citations
#2128

A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules

Kairong Luo, Haodong Wen, Shengding Hu et al.

ICLR 2025 • poster • arXiv:2503.12811
14 citations
#2129

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

James Burgess, Jeffrey J Nirschl, Laura Bravo-Sánchez et al.

CVPR 2025 • poster • arXiv:2503.13399
14 citations
#2130

Multi-Turn Jailbreaking Large Language Models via Attention Shifting

Xiaohu Du, Fan Mo, Ming Wen et al.

AAAI 2025 • paper
14 citations
#2131

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections

Bo Wang, Qinyuan Cheng, Runyu Peng et al.

NeurIPS 2025 • poster • arXiv:2507.00018
14 citations
#2132

Docopilot: Improving Multimodal Models for Document-Level Understanding

Yuchen Duan, Zhe Chen, Yusong Hu et al.

CVPR 2025 • poster • arXiv:2507.14675
14 citations
#2133

Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective

Zeyu Gan, Yong Liu

ICLR 2025 • poster • arXiv:2410.01720
14 citations
#2134

Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability

Yingdong Shi, Changming Li, Yifan Wang et al.

CVPR 2025 • poster • arXiv:2503.20483
14 citations
#2135

HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection

Zijian Gu, Jianwei Ma, Yan Huang et al.

AAAI 2025 • paper • arXiv:2412.11489
14 citations
#2136

Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

Weilong Yan, Ming Li, Li Haipeng et al.

CVPR 2025 • poster • arXiv:2503.20211
14 citations
#2137

Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models

Shicheng Xu, Liang Pang, Yunchang Zhu et al.

ICLR 2025 • poster • arXiv:2410.12662
14 citations
#2138

DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

William June Suk Choi, Kyungmin Lee, Jongheon Jeong et al.

ICLR 2025 • poster • arXiv:2410.05694
14 citations
#2139

Mixture of Parrots: Experts improve memorization more than reasoning

Samy Jelassi, Clara Mohri, David Brandfonbrener et al.

ICLR 2025 • poster • arXiv:2410.19034
14 citations
#2140

AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks

Zekang Yang, Wang Zeng, Sheng Jin et al.

AAAI 2025 • paper • arXiv:2402.15351
14 citations
#2141

Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

Lucas Bandarkar, Benjamin Muller, Pritish Yuvraj et al.

ICLR 2025 • poster • arXiv:2410.01335
14 citations
#2142

CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI

Siyuan Cheng, Lingjuan Lyu, Zhenting Wang et al.

CVPR 2025 • poster • arXiv:2503.18286
14 citations
#2143

Nested Learning: The Illusion of Deep Learning Architectures

Ali Behrouz, Meisam Razaviyayn, Peilin Zhong et al.

NeurIPS 2025 • poster • arXiv:2512.24695
14 citations
#2144

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.

CVPR 2025 • poster • arXiv:2411.18674
14 citations
#2145

SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models

Haotian Xia, Zhengbang Yang, Junbo Zou et al.

ICLR 2025 • poster • arXiv:2410.08474
14 citations
#2146

DPCore: Dynamic Prompt Coreset for Continual Test-Time Adaptation

Yunbei Zhang, Akshay Mehra, Shuaicheng Niu et al.

ICML 2025 • poster • arXiv:2406.10737
14 citations
#2147

NEST: A Neuromodulated Small-world Hypergraph Trajectory Prediction Model for Autonomous Driving

Chengyue Wang, Haicheng Liao, Bonan Wang et al.

AAAI 2025 • paper • arXiv:2412.11682
14 citations
#2148

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Ziyang Luo, Haoning Wu, Dongxu Li et al.

CVPR 2025 • poster • arXiv:2411.13281
14 citations
#2149

Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Zhenxin Lei, Man Yao, Jiakui Hu et al.

AAAI 2025 • paper • arXiv:2412.14587
14 citations
#2150

SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration

Jipeng Cen, Jiaxin Liu, Zhixu Li et al.

AAAI 2025 • paper • arXiv:2406.13408
14 citations
#2151

SimpleTM: A Simple Baseline for Multivariate Time Series Forecasting

Hui Chen, Viet Luong, Lopamudra Mukherjee et al.

ICLR 2025 • oral
14 citations
#2152

Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Slava Elizarov, Ciara Rowles, Simon Donné

ICLR 2025 • poster • arXiv:2409.03718
14 citations
#2153

Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts

Minh Le, Chau Nguyen, Huy Nguyen et al.

ICLR 2025 • poster • arXiv:2410.02200
14 citations
#2154

Weak-to-Strong Generalization Through the Data-Centric Lens

Changho Shin, John Cooper, Frederic Sala

ICLR 2025 • poster • arXiv:2412.03881
14 citations
#2155

BingoGuard: LLM Content Moderation Tools with Risk Levels

Fan Yin, Philippe Laban, Xiangyu Peng et al.

ICLR 2025 • poster • arXiv:2503.06550
14 citations
#2156

InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption

Tiehan Fan, Kepan Nan, Rui Xie et al.

CVPR 2025 • poster • arXiv:2412.09283
14 citations
#2157

Deep Distributed Optimization for Large-Scale Quadratic Programming

Augustinos Saravanos, Hunter Kuperman, Alex Oshin et al.

ICLR 2025 • poster • arXiv:2412.12156
14 citations
#2158

A Second-Order Perspective on Model Compositionality and Incremental Learning

Angelo Porrello, Lorenzo Bonicelli, Pietro Buzzega et al.

ICLR 2025 • poster • arXiv:2405.16350
14 citations
#2159

The Power of LLM-Generated Synthetic Data for Stance Detection in Online Political Discussions

Stefan Sylvius Wagner, Maike Behrendt, Marc Ziegele et al.

ICLR 2025 • poster • arXiv:2406.12480
14 citations
#2160

SleeperMark: Towards Robust Watermark against Fine-Tuning Text-to-image Diffusion Models

Zilan Wang, Junfeng Guo, Jiacheng Zhu et al.

CVPR 2025 • poster • arXiv:2412.04852
14 citations
#2161

Overcoming Lower-Level Constraints in Bilevel Optimization: A Novel Approach with Regularized Gap Functions

Wei Yao, Haian Yin, Shangzhi Zeng et al.

ICLR 2025 • poster • arXiv:2406.01992
14 citations
#2162

Towards Universal Soccer Video Understanding

Jiayuan Rao, Haoning Wu, Hao Jiang et al.

CVPR 2025 • poster • arXiv:2412.01820
14 citations
#2163

Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

Jian Lang, Zhangtao Cheng, Ting Zhong et al.

AAAI 2025 • paper • arXiv:2501.01120
14 citations
#2164

SkillMimic: Learning Basketball Interaction Skills from Demonstrations

Yinhuai Wang, Qihan Zhao, Runyi Yu et al.

CVPR 2025 • highlight • arXiv:2408.15270
14 citations
#2165

Weak-to-Strong Preference Optimization: Stealing Reward from Weak Aligned Model

Wenhong Zhu, Zhiwei He, Xiaofeng Wang et al.

ICLR 2025 • poster • arXiv:2410.18640
14 citations
#2166

Block-Attention for Efficient Prefilling

Dongyang Ma, Yan Wang, Tian Lan

ICLR 2025 • poster • arXiv:2409.15355
14 citations
#2167

Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models

Zhihang Liu, Chen-Wei Xie, Pandeng Li et al.

CVPR 2025 • poster • arXiv:2503.16036
14 citations
#2168

Toward Adaptive Large Language Models Structured Pruning via Hybrid-grained Weight Importance Assessment

Jun Liu, Zhenglun Kong, Pu Zhao et al.

AAAI 2025 • paper • arXiv:2403.10799
14 citations
#2169

OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers

Ziqiao Peng, Jiwen Liu, Haoxian Zhang et al.

NeurIPS 2025 • oral • arXiv:2505.21448
14 citations
#2170

DPU: Dynamic Prototype Updating for Multimodal Out-of-Distribution Detection

Li Li, Huixian Gong, Hao Dong et al.

CVPR 2025 • highlight • arXiv:2411.08227
14 citations
#2171

RelGNN: Composite Message Passing for Relational Deep Learning

Tianlang Chen, Charilaos Kanatsoulis, Jure Leskovec

ICML 2025 • poster • arXiv:2502.06784
14 citations
#2172

MMRole: A Comprehensive Framework for Developing and Evaluating Multimodal Role-Playing Agents

Yanqi Dai, Huanran Hu, Lei Wang et al.

ICLR 2025 • poster • arXiv:2408.04203
14 citations
#2173

Large Language Model Meets Graph Neural Network in Knowledge Distillation

Shengxiang Hu, Guobing Zou, Song Yang et al.

AAAI 2025 • paper • arXiv:2402.05894
14 citations
#2174

Inference-Time Hyper-Scaling with KV Cache Compression

Adrian Łańcucki, Konrad Staniszewski, Piotr Nawrot et al.

NeurIPS 2025 • poster • arXiv:2506.05345
14 citations
#2175

Optimization with Access to Auxiliary Information

El Mahdi Chayti, Sai Karimireddy

ICLR 2025 • poster • arXiv:2206.00395
14 citations
#2176

NFIG: Multi-Scale Autoregressive Image Generation via Frequency Ordering

Zhihao Huang, Xi Qiu, Yukuo Ma et al.

NeurIPS 2025 • poster • arXiv:2503.07076
14 citations
#2177

Vision-Language Gradient Descent-driven All-in-One Deep Unfolding Networks

Haijin Zeng, Xiangming Wang, Yongyong Chen et al.

CVPR 2025 • poster • arXiv:2503.16930
14 citations
#2178

BotSim: LLM-Powered Malicious Social Botnet Simulation

Boyu Qiao, Kun Li, Wei Zhou et al.

AAAI 2025 • paper • arXiv:2412.13420
14 citations
#2179

Uni4D: Unifying Visual Foundation Models for 4D Modeling from a Single Video

David Yifan Yao, Albert J. Zhai, Shenlong Wang

CVPR 2025 • highlight • arXiv:2503.21761
14 citations
#2180

Lost in Translation, Found in Context: Sign Language Translation with Contextual Cues

Youngjoon Jang, Haran Raajesh, Liliane Momeni et al.

CVPR 2025 • poster • arXiv:2501.09754
14 citations
#2181

Geolocation Representation from Large Language Models Are Generic Enhancers for Spatio-Temporal Learning

Junlin He, Tong Nie, Wei Ma

AAAI 2025 • paper • arXiv:2408.12116
14 citations
#2182

Retrieving Semantics from the Deep: an RAG Solution for Gesture Synthesis

M. Hamza Mughal, Rishabh Dabral, Merel CJ Scholman et al.

CVPR 2025 • poster • arXiv:2412.06786
14 citations
#2183

Semantic Convergence: Harmonizing Recommender Systems via Two-Stage Alignment and Behavioral Semantic Tokenization

Guanghan Li, Xun Zhang, Yufei Zhang et al.

AAAI 2025 • paper • arXiv:2412.13771
14 citations
#2184

Ready-to-React: Online Reaction Policy for Two-Character Interaction Generation

Zhi Cen, Huaijin Pi, Sida Peng et al.

ICLR 2025 • poster • arXiv:2502.20370
14 citations
#2185

Adaptive Classifier-Free Guidance via Dynamic Low-Confidence Masking

Pengxiang Li, Shilin Yan, Jiayin Cai et al.

NeurIPS 2025 • poster • arXiv:2505.20199
14 citations
#2186

REArtGS: Reconstructing and Generating Articulated Objects via 3D Gaussian Splatting with Geometric and Motion Constraints

Di Wu, Liu Liu, Zhou Linli et al.

NeurIPS 2025 • poster • arXiv:2503.06677
14 citations
#2187

Omnia de EgoTempo: Benchmarking Temporal Understanding of Multi-Modal LLMs in Egocentric Videos

Chiara Plizzari, Alessio Tonioni, Yongqin Xian et al.

CVPR 2025 • poster • arXiv:2503.13646
14 citations
#2188

Mechanistic Permutability: Match Features Across Layers

Nikita Balagansky, Ian Maksimov, Daniil Gavrilov

ICLR 2025 • poster • arXiv:2410.07656
14 citations
#2189

Video Anomaly Detection with Motion and Appearance Guided Patch Diffusion Model

Hang Zhou, Jiale Cai, Yuteng Ye et al.

AAAI 2025 • paper • arXiv:2412.09026
14 citations
#2190

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

Shulin Huang, Linyi Yang, Yan Song et al.

NeurIPS 2025 • poster • arXiv:2502.16268
14 citations
#2191

4DGT: Learning a 4D Gaussian Transformer Using Real-World Monocular Videos

Zhen Xu, Zhengqin Li, Zhao Dong et al.

NeurIPS 2025 • spotlight • arXiv:2506.08015
14 citations
#2192

ReGenesis: LLMs can Grow into Reasoning Generalists via Self-Improvement

Xiangyu Peng, Congying Xia, Xinyi Yang et al.

ICLR 2025 • poster • arXiv:2410.02108
14 citations
#2193

Open Vision Reasoner: Transferring Linguistic Cognitive Behavior for Visual Reasoning

Yana Wei, Liang Zhao, Jianjian Sun et al.

NeurIPS 2025 • poster • arXiv:2507.05255
14 citations
#2194

OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics

Vineeth Dorna, Anmol Mekala, Wenlong Zhao et al.

NeurIPS 2025 • poster • arXiv:2506.12618
14 citations
#2195

Online Reasoning Video Segmentation with Just-in-Time Digital Twins

Yiqing Shen, Bohan Liu, Chenjia Li et al.

ICCV 2025 • poster • arXiv:2503.21056
14 citations
#2196

Proving Olympiad Inequalities by Synergizing LLMs and Symbolic Reasoning

Zenan Li, Zhaoyu Li, Wen Tang et al.

ICLR 2025 • poster • arXiv:2502.13834
14 citations
#2197

JamMa: Ultra-lightweight Local Feature Matching with Joint Mamba

Xiaoyong Lu, Songlin Du

CVPR 2025 • poster • arXiv:2503.03437
14 citations
#2198

Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing

Peter Lippmann, Gerrit Gerhartz, Roman Remme et al.

ICLR 2025 • poster • arXiv:2405.15389
14 citations
#2199

MedAgentBoard: Benchmarking Multi-Agent Collaboration with Conventional Methods for Diverse Medical Tasks

Yinghao Zhu, Ziyi He, Haoran Hu et al.

NeurIPS 2025 • poster • arXiv:2505.12371
14 citations
#2200

Low-Light Image Enhancement via Generative Perceptual Priors

Han Zhou, Wei Dong, Xiaohong Liu et al.

AAAI 2025 • paper • arXiv:2412.20916
14 citations