Most Cited 2025 "latent dimension alignment" Papers

22,274 papers found • Page 11 of 112

#2001

S2Gaussian: Sparse-View Super-Resolution 3D Gaussian Splatting

Yecong Wan, Mingwen Shao, Yuanshuo Cheng et al.

CVPR 2025 • poster • arXiv:2503.04314
15
citations
#2002

LeVo: High-Quality Song Generation with Multi-Preference Alignment

Shun Lei, Yaoxun Xu, Zhiwei Lin et al.

NEURIPS 2025 • poster • arXiv:2506.07520
15
citations
#2003

GotenNet: Rethinking Efficient 3D Equivariant Graph Neural Networks

Sarp Aykent, Tian Xia

ICLR 2025 • poster
15
citations
#2004

GGS: Generalizable Gaussian Splatting for Lane Switching in Autonomous Driving

Huasong Han, Kaixuan Zhou, Xiaoxiao Long et al.

AAAI 2025 • paper • arXiv:2409.02382
15
citations
#2005

Citations and Trust in LLM Generated Responses

Yifan Ding, Matthew Facciani, Ellen Joyce et al.

AAAI 2025 • paper • arXiv:2501.01303
15
citations
#2006

FreeTimeGS: Free Gaussian Primitives at Anytime Anywhere for Dynamic Scene Reconstruction

Yifan Wang, Peishan Yang, Zhen Xu et al.

CVPR 2025 • poster
15
citations
#2007

Falcon: Faster and Parallel Inference of Large Language Models Through Enhanced Semi-Autoregressive Drafting and Custom-Designed Decoding Tree

Xiangxiang Gao, Weisheng Xie, Yiwei Xiang et al.

AAAI 2025 • paper • arXiv:2412.12639
15
citations
#2008

Security Attacks on LLM-based Code Completion Tools

Wen Cheng, Ke Sun, Xinyu Zhang et al.

AAAI 2025 • paper • arXiv:2408.11006
15
citations
#2009

Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment

Shuo Wang, Bokui Wang, Zhixiang Shen et al.

ICML 2025 • poster • arXiv:2502.02017
15
citations
#2010

OmniPaint: Mastering Object-Oriented Editing via Disentangled Insertion-Removal Inpainting

Yongsheng Yu, Ziyun Zeng, Haitian Zheng et al.

ICCV 2025 • poster • arXiv:2503.08677
15
citations
#2011

LightPROF: A Lightweight Reasoning Framework for Large Language Model on Knowledge Graph

Tu Ao, Yanhua Yu, Yuling Wang et al.

AAAI 2025 • paper • arXiv:2504.03137
15
citations
#2012

Training-Free Efficient Video Generation via Dynamic Token Carving

Yuechen Zhang, Jinbo Xing, Bin Xia et al.

NEURIPS 2025 • poster • arXiv:2505.16864
15
citations
#2013

VideoHallu: Evaluating and Mitigating Multi-modal Hallucinations on Synthetic Video Understanding

Zongxia Li, Xiyang Wu, Guangyao Shi et al.

NEURIPS 2025 • poster • arXiv:2505.01481
15
citations
#2014

RocketEval: Efficient automated LLM evaluation via grading checklist

Tianjun Wei, Wei Wen, Ruizhi Qiao et al.

ICLR 2025 • poster • arXiv:2503.05142
15
citations
#2015

ChunkKV: Semantic-Preserving KV Cache Compression for Efficient Long-Context LLM Inference

Xiang Liu, Zhenheng Tang, Peijie Dong et al.

NEURIPS 2025 • poster • arXiv:2502.00299
15
citations
#2016

SPAM: Spike-Aware Adam with Momentum Reset for Stable LLM Training

Tianjin Huang, Ziquan Zhu, Gaojie Jin et al.

ICLR 2025 • poster • arXiv:2501.06842
15
citations
#2017

AGENTIF: Benchmarking Large Language Models Instruction Following Ability in Agentic Scenarios

Yunjia Qi, Hao Peng, Xiaozhi Wang et al.

NEURIPS 2025 • spotlight
15
citations
#2018

Recoverable Compression: A Multimodal Vision Token Recovery Mechanism Guided by Text Information

Yi Chen, Jian Xu, Xu-Yao Zhang et al.

AAAI 2025 • paper • arXiv:2409.01179
15
citations
#2019

MallowsPO: Fine-Tune Your LLM with Preference Dispersions

Haoxian Chen, Hanyang Zhao, Henry Lam et al.

ICLR 2025 • poster • arXiv:2405.14953
15
citations
#2020

MobileUse: A Hierarchical Reflection-Driven GUI Agent for Autonomous Mobile Operation

Ning Li, Xiangmou Qu, Jiamu Zhou et al.

NEURIPS 2025 • oral
15
citations
#2021

Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning

Qinghao Ye, Xianhan Zeng, Fu Li et al.

ICLR 2025 • poster • arXiv:2503.07906
15
citations
#2022

Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond)

Liwei Jiang, Yuanjun Chai, Margaret Li et al.

NEURIPS 2025 • oral • arXiv:2510.22954
15
citations
#2023

Exploring Sparse MoE in GANs for Text-conditioned Image Synthesis

Jiapeng Zhu, Ceyuan Yang, Kecheng Zheng et al.

CVPR 2025 • poster • arXiv:2309.03904
15
citations
#2024

Any6D: Model-free 6D Pose Estimation of Novel Object

Taeyeop Lee, Bowen Wen, Minjun Kang et al.

CVPR 2025 • poster • arXiv:2503.18673
15
citations
#2025

Transformers Struggle to Learn to Search

Abulhair Saparov, Srushti Ajay Pawar, Shreyas Pimpalgaonkar et al.

ICLR 2025 • poster • arXiv:2412.04703
15
citations
#2026

Prismatic Synthesis: Gradient-based Data Diversification Boosts Generalization in LLM Reasoning

Jaehun Jung, Seungju Han, Ximing Lu et al.

NEURIPS 2025 • spotlight • arXiv:2505.20161
15
citations
#2027

Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval

Sheryl Hsu, Omar Khattab, Chelsea Finn et al.

ICLR 2025 • poster • arXiv:2410.23214
15
citations
#2028

Identifiable Exchangeable Mechanisms for Causal Structure and Representation Learning

Patrik Reizinger, Siyuan Guo, Ferenc Huszar et al.

ICLR 2025 • poster • arXiv:2406.14302
15
citations
#2029

Multi-Layer Visual Feature Fusion in Multimodal LLMs: Methods, Analysis, and Best Practices

Junyan Lin, Haoran Chen, Yue Fan et al.

CVPR 2025 • poster • arXiv:2503.06063
15
citations
#2030

Scalable Discrete Diffusion Samplers: Combinatorial Optimization and Statistical Physics

Sebastian Sanokowski, Wilhelm Berghammer, Haoyu Wang et al.

ICLR 2025 • poster • arXiv:2502.08696
15
citations
#2031

VideoAutoArena: An Automated Arena for Evaluating Large Multimodal Models in Video Analysis through User Simulation

Ziyang Luo, Haoning Wu, Dongxu Li et al.

CVPR 2025 • poster • arXiv:2411.13281
15
citations
#2032

Continuous Ensemble Weather Forecasting with Diffusion models

Martin Andrae, Tomas Landelius, Joel Oskarsson et al.

ICLR 2025 • oral • arXiv:2410.05431
15
citations
#2033

Assessing Modality Bias in Video Question Answering Benchmarks with Multimodal Large Language Models

Jean Park, Kuk Jin Jang, Basam Alasaly et al.

AAAI 2025 • paper • arXiv:2408.12763
15
citations
#2034

Learning Efficient Positional Encodings with Graph Neural Networks

Charilaos Kanatsoulis, Evelyn Choi, Stefanie Jegelka et al.

ICLR 2025 • poster • arXiv:2502.01122
15
citations
#2035

The Dual-Route Model of Induction

Sheridan Feucht, Eric Todd, Byron C Wallace et al.

COLM 2025 • paper • arXiv:2504.03022
15
citations
#2036

Scaling Vision Pre-Training to 4K Resolution

Baifeng Shi, Boyi Li, Han Cai et al.

CVPR 2025 • highlight • arXiv:2503.19903
15
citations
#2037

SiReRAG: Indexing Similar and Related Information for Multihop Reasoning

Nan Zhang, Prafulla Kumar Choubey, Alexander Fabbri et al.

ICLR 2025 • poster • arXiv:2412.06206
15
citations
#2038

Universal Cross-Tokenizer Distillation via Approximate Likelihood Matching

Benjamin Minixhofer, Ivan Vulić, Edoardo Maria Ponti

NEURIPS 2025 • poster • arXiv:2503.20083
15
citations
#2039

Diffusion Models are Evolutionary Algorithms

Yanbo Zhang, Benedikt Hartl, Hananel Hazan et al.

ICLR 2025 • poster • arXiv:2410.02543
15
citations
#2040

The Pitfalls of Memorization: When Memorization Hurts Generalization

Reza Bayat, Mohammad Pezeshki, Elvis Dohmatob et al.

ICLR 2025 • poster • arXiv:2412.07684
15
citations
#2041

TabDPT: Scaling Tabular Foundation Models on Real Data

Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh et al.

NEURIPS 2025 • poster • arXiv:2410.18164
15
citations
#2042

Streamlining Redundant Layers to Compress Large Language Models

Xiaodong Chen, Yuxuan Hu, Jing Zhang et al.

ICLR 2025 • poster • arXiv:2403.19135
15
citations
#2043

DreamRenderer: Taming Multi-Instance Attribute Control in Large-Scale Text-to-Image Models

Dewei Zhou, Mingwei Li, Zongxin Yang et al.

ICCV 2025 • poster • arXiv:2503.12885
15
citations
#2044

FlowTok: Flowing Seamlessly Across Text and Image Tokens

Ju He, Qihang Yu, Qihao Liu et al.

ICCV 2025 • poster • arXiv:2503.10772
15
citations
#2045

PolyPythias: Stability and Outliers across Fifty Language Model Pre-Training Runs

Oskar van der Wal, Pietro Lesci, Max Müller-Eberstein et al.

ICLR 2025 • poster • arXiv:2503.09543
15
citations
#2046

ET-SEED: Efficient Trajectory-Level SE(3) Equivariant Diffusion Policy

Chenrui Tie, Yue Chen, Ruihai Wu et al.

ICLR 2025 • poster • arXiv:2411.03990
15
citations
#2047

TimeDP: Learning to Generate Multi-Domain Time Series with Domain Prompts

Yu-Hao Huang, Chang Xu, Yueying Wu et al.

AAAI 2025 • paper • arXiv:2501.05403
15
citations
#2048

FaithDiff: Unleashing Diffusion Priors for Faithful Image Super-resolution

Junyang Chen, Jinshan Pan, Jiangxin Dong

CVPR 2025 • poster • arXiv:2411.18824
15
citations
#2049

CoTFormer: A Chain of Thought Driven Architecture with Budget-Adaptive Computation Cost at Inference

Amirkeivan Mohtashami, Matteo Pagliardini, Martin Jaggi

ICLR 2025 • poster • arXiv:2310.10845
15
citations
#2050

Generating Multi-Image Synthetic Data for Text-to-Image Customization

Nupur Kumari, Xi Yin, Jun-Yan Zhu et al.

ICCV 2025 • poster • arXiv:2502.01720
15
citations
#2051

LongProc: Benchmarking Long-Context Language Models on Long Procedural Generation

Xi Ye, Fangcong Yin, Yinghui He et al.

COLM 2025 • paper
15
citations
#2052

AutoRedTeamer: Autonomous Red Teaming with Lifelong Attack Integration

Andy Zhou, Kevin Wu, Francesco Pinto et al.

NEURIPS 2025 • poster • arXiv:2503.15754
15
citations
#2053

RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers

Yan Gong, Yiren Song, Yicheng Li et al.

NEURIPS 2025 • poster • arXiv:2506.02528
15
citations
#2054

Segment Policy Optimization: Effective Segment-Level Credit Assignment in RL for Large Language Models

Yiran Guo, Lijie Xu, Jie Liu et al.

NEURIPS 2025 • poster • arXiv:2505.23564
15
citations
#2055

ILIAS: Instance-Level Image retrieval At Scale

Giorgos Kordopatis-Zilos, Vladan Stojnić, Anna Manko et al.

CVPR 2025 • poster • arXiv:2502.11748
15
citations
#2056

Spiking Vision Transformer with Saccadic Attention

Shuai Wang, Malu Zhang, Dehao Zhang et al.

ICLR 2025 • oral • arXiv:2502.12677
15
citations
#2057

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

Ajay Jaiswal, Yifan Wang, Lu Yin et al.

ICML 2025 • poster • arXiv:2407.11239
15
citations
#2058

Dynamic Camera Poses and Where to Find Them

Chris Rockwell, Joseph Tung, Tsung-Yi Lin et al.

CVPR 2025 • poster • arXiv:2504.17788
15
citations
#2059

PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify

Zhengqing Wang, Jiacheng Chen, Yasutaka Furukawa

ICLR 2025 • poster • arXiv:2406.00259
15
citations
#2060

GaussianFlowOcc: Sparse and Weakly Supervised Occupancy Estimation using Gaussian Splatting and Temporal Flow

Simon Boeder, Fabian Gigengack, Benjamin Risse

ICCV 2025 • poster • arXiv:2502.17288
15
citations
#2061

SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving

Xuesong Chen, Linjiang Huang, Tao Ma et al.

CVPR 2025 • poster • arXiv:2505.16805
15
citations
#2062

RoboScape: Physics-informed Embodied World Model

Yu Shang, Xin Zhang, Yinzhou Tang et al.

NEURIPS 2025 • oral • arXiv:2506.23135
15
citations
#2063

Logically Consistent Language Models via Neuro-Symbolic Integration

Diego Calanzone, Stefano Teso, Antonio Vergari

ICLR 2025 • poster • arXiv:2409.13724
15
citations
#2064

Logical Consistency of Large Language Models in Fact-Checking

Bishwamittra Ghosh, Sarah Hasan, Naheed Anjum Arafat et al.

ICLR 2025 • poster • arXiv:2412.16100
15
citations
#2065

Stochastic Deep Restoration Priors for Imaging Inverse Problems

Yuyang Hu, Albert Peng, Weijie Gan et al.

ICML 2025 • poster • arXiv:2410.02057
15
citations
#2066

Revisiting Nearest Neighbor for Tabular Data: A Deep Tabular Baseline Two Decades Later

Han-Jia Ye, Huai-Hong Yin, De-Chuan Zhan et al.

ICLR 2025 • poster • arXiv:2407.03257
15
citations
#2067

Restoring Calibration for Aligned Large Language Models: A Calibration-Aware Fine-Tuning Approach

Jiancong Xiao, Bojian Hou, Zhanliang Wang et al.

ICML 2025 • poster • arXiv:2505.01997
15
citations
#2068

DaWin: Training-free Dynamic Weight Interpolation for Robust Adaptation

Changdae Oh, Yixuan Li, Kyungwoo Song et al.

ICLR 2025 • poster • arXiv:2410.03782
15
citations
#2069

FireEdit: Fine-grained Instruction-based Image Editing via Region-aware Vision Language Model

Jun Zhou, Jiahao Li, Zunnan Xu et al.

CVPR 2025 • poster • arXiv:2503.19839
15
citations
#2070

Thermalizer: Stable autoregressive neural emulation of spatiotemporal chaos

Chris Pedersen, Laure Zanna, Joan Bruna

ICML 2025 • oral • arXiv:2503.18731
15
citations
#2071

Personalized Federated Learning for Spatio-Temporal Forecasting: A Dual Semantic Alignment-Based Contrastive Approach

Qingxiang Liu, Sheng Sun, Yuxuan Liang et al.

AAAI 2025 • paper • arXiv:2404.03702
15
citations
#2072

Regressing the Relative Future: Efficient Policy Optimization for Multi-turn RLHF

Zhaolin Gao, Wenhao Zhan, Jonathan Chang et al.

ICLR 2025 • poster • arXiv:2410.04612
15
citations
#2073

MangaNinja: Line Art Colorization with Precise Reference Following

Zhiheng Liu, Ka Leong Cheng, Xi Chen et al.

CVPR 2025 • highlight • arXiv:2501.08332
15
citations
#2074

ST3: Accelerating Multimodal Large Language Model by Spatial-Temporal Visual Token Trimming

Jiedong Zhuang, Lu Lu, Ming Dai et al.

AAAI 2025 • paper
15
citations
#2075

Is Artificial Intelligence Generated Image Detection a Solved Problem?

Ziqiang Li, Jiazhen Yan, Ziwen He et al.

NEURIPS 2025 • poster • arXiv:2505.12335
15
citations
#2076

Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking

Jiawen Zhu, Huayi Tang, Xin Chen et al.

AAAI 2025 • paper • arXiv:2503.00516
15
citations
#2077

Breaking the Low-Rank Dilemma of Linear Attention

Qihang Fan, Huaibo Huang, Ran He

CVPR 2025 • poster • arXiv:2411.07635
15
citations
#2078

Wasserstein Flow Matching: Generative Modeling Over Families of Distributions

Doron Haviv, Aram-Alexandre Pooladian, Dana Pe'er et al.

ICML 2025 • poster • arXiv:2411.00698
15
citations
#2079

Reinforcement Learning Finetunes Small Subnetworks in Large Language Models

Sagnik Mukherjee, Lifan Yuan, Dilek Hakkani-Tur et al.

NEURIPS 2025 • poster • arXiv:2505.11711
15
citations
#2080

Erwin: A Tree-based Hierarchical Transformer for Large-scale Physical Systems

Maksim Zhdanov, Max Welling, Jan-Willem van de Meent

ICML 2025 • poster • arXiv:2502.17019
15
citations
#2081

Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs

Sungmin Cha, Sungjun Cho, Dasol Hwang et al.

ICLR 2025 • poster • arXiv:2408.06621
15
citations
#2082

Tuning Timestep-Distilled Diffusion Model Using Pairwise Sample Optimization

Zichen Miao, Zhengyuan Yang, Kevin Lin et al.

ICLR 2025 • poster • arXiv:2410.03190
15
citations
#2083

One-for-All Few-Shot Anomaly Detection via Instance-Induced Prompt Learning

Wenxi Lv, Qinliang Su, Wenchao Xu

ICLR 2025 • poster
15
citations
#2084

Black-Box Detection of Language Model Watermarks

Thibaud Gloaguen, Nikola Jovanović, Robin Staab et al.

ICLR 2025 • poster • arXiv:2405.20777
15
citations
#2085

AI-Slop to AI-Polish? Aligning Language Models through Edit-Based Writing Rewards and Test-time computation

Tuhin Chakrabarty, Philippe Laban, Chien-Sheng Wu

COLM 2025 • paper • arXiv:2504.07532
15
citations
#2086

BillBoard Splatting (BBSplat): Learnable Textured Primitives for Novel View Synthesis

David Svitov, Pietro Morerio, Lourdes Agapito et al.

ICCV 2025 • poster • arXiv:2411.08508
15
citations
#2087

AutoToM: Scaling Model-based Mental Inference via Automated Agent Modeling

Zhining Zhang, Chuanyang Jin, Mung Yao Jia et al.

NEURIPS 2025 • spotlight • arXiv:2502.15676
15
citations
#2088

Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Dynamic Scenes

Isabella Liu, Hao Su, Xiaolong Wang

ICLR 2025 • oral • arXiv:2404.12379
15
citations
#2089

TimeKAN: KAN-based Frequency Decomposition Learning Architecture for Long-term Time Series Forecasting

Songtao Huang, Zhen Zhao, Can Li et al.

ICLR 2025 • oral • arXiv:2502.06910
15
citations
#2090

DiffMS: Diffusion Generation of Molecules Conditioned on Mass Spectra

Montgomery Bohde, Mrunali Manjrekar, Runzhong Wang et al.

ICML 2025 • poster • arXiv:2502.09571
15
citations
#2091

UNSURE: self-supervised learning with Unknown Noise level and Stein's Unbiased Risk Estimate

Julián Tachella, Mike Davies, Laurent Jacques

ICLR 2025 • poster • arXiv:2409.01985
15
citations
#2092

PACT: Pruning and Clustering-Based Token Reduction for Faster Visual Language Models

Dhouib Mohamed, Davide Buscaldi, Vanier Sonia et al.

CVPR 2025 • poster • arXiv:2504.08966
15
citations
#2093

IncEventGS: Pose-Free Gaussian Splatting from a Single Event Camera

Jian Huang, Chengrui Dong, Xuanhua Chen et al.

CVPR 2025 • highlight • arXiv:2410.08107
15
citations
#2094

AIR-BENCH 2024: A Safety Benchmark based on Regulation and Policies Specified Risk Categories

Yi Zeng, Yu Yang, Andy Zhou et al.

ICLR 2025 • poster
15
citations
#2095

Multimodal Class-aware Semantic Enhancement Network for Audio-Visual Video Parsing

Pengcheng Zhao, Jinxing Zhou, Yang Zhao et al.

AAAI 2025 • paper • arXiv:2412.11248
15
citations
#2096

VMBench: A Benchmark for Perception-Aligned Video Motion Generation

Xinran Ling, Chen Zhu, Meiqi Wu et al.

ICCV 2025 • poster • arXiv:2503.10076
15
citations
#2097

CADCrafter: Generating Computer-Aided Design Models from Unconstrained Images

Chen Cheng, Jiacheng Wei, Tianrun Chen et al.

CVPR 2025 • poster • arXiv:2504.04753
15
citations
#2098

MANTA: A Large-Scale Multi-View and Visual-Text Anomaly Detection Dataset for Tiny Objects

Lei Fan, Dongdong Fan, Zhiguang Hu et al.

CVPR 2025 • poster • arXiv:2412.04867
15
citations
#2099

SpectroMotion: Dynamic 3D Reconstruction of Specular Scenes

Cheng-De Fan, Chen-Wei Chang, Yi-Ruei Liu et al.

CVPR 2025 • poster • arXiv:2410.17249
15
citations
#2100

AdaGrad under Anisotropic Smoothness

Yuxing Liu, Rui Pan, Tong Zhang

ICLR 2025 • poster • arXiv:2406.15244
14
citations
#2101

SITCOM: Step-wise Triple-Consistent Diffusion Sampling For Inverse Problems

Ismail Alkhouri, Shijun Liang, Cheng-Han Huang et al.

ICML 2025 • poster • arXiv:2410.04479
14
citations
#2102

An Empirical Analysis of Uncertainty in Large Language Model Evaluations

Qiujie Xie, Qingqiu Li, Zhuohao Yu et al.

ICLR 2025 • poster • arXiv:2502.10709
14
citations
#2103

ActionPiece: Contextually Tokenizing Action Sequences for Generative Recommendation

Yupeng Hou, Jianmo Ni, Zhankui He et al.

ICML 2025 • spotlight • arXiv:2502.13581
14
citations
#2104

Presto! Distilling Steps and Layers for Accelerating Music Generation

Zachary Novack, Ge Zhu, Jonah Casebeer et al.

ICLR 2025 • poster • arXiv:2410.05167
14
citations
#2105

MrT5: Dynamic Token Merging for Efficient Byte-level Language Models

Julie Kallini, Shikhar Murty, Christopher Manning et al.

ICLR 2025 • poster • arXiv:2410.20771
14
citations
#2106

Efficient Track Anything

Yunyang Xiong, Chong Zhou, Xiaoyu Xiang et al.

ICCV 2025 • poster • arXiv:2411.18933
14
citations
#2107

Improved Regret Bounds for Linear Adversarial MDPs via Linear Optimization

XiangCheng Zhang, Fang Kong, Baoxiang Wang et al.

ICLR 2025 • poster • arXiv:2302.06834
14
citations
#2108

SwS: Self-aware Weakness-driven Problem Synthesis in Reinforcement Learning for LLM Reasoning

Xiao Liang, Zhong-Zhi Li, Yeyun Gong et al.

NEURIPS 2025 • poster • arXiv:2506.08989
14
citations
#2109

Learning to Generate Unit Tests for Automated Debugging

Archiki Prasad, Elias Stengel-Eskin, Justin Chen et al.

COLM 2025 • paper • arXiv:2502.01619
14
citations
#2110

Unified Parameter-Efficient Unlearning for LLMs

Chenlu Ding, Jiancan Wu, Yancheng Yuan et al.

ICLR 2025 • poster • arXiv:2412.00383
14
citations
#2111

Diffusion on Language Model Encodings for Protein Sequence Generation

Viacheslav Meshchaninov, Pavel Strashnov, Andrey Shevtsov et al.

ICML 2025 • poster • arXiv:2403.03726
14
citations
#2112

Toward Understanding In-context vs. In-weight Learning

Bryan Chan, Xinyi Chen, Andras Gyorgy et al.

ICLR 2025 • poster • arXiv:2410.23042
14
citations
#2113

When Large Vision-Language Model Meets Large Remote Sensing Imagery: Coarse-to-Fine Text-Guided Token Pruning

Junwei Luo, Yingying Zhang, Xue Yang et al.

ICCV 2025 • poster • arXiv:2503.07588
14
citations
#2114

Refine Knowledge of Large Language Models via Adaptive Contrastive Learning

Yinghui Li, Haojing Huang, Jiayi Kuang et al.

ICLR 2025 • poster • arXiv:2502.07184
14
citations
#2115

Optimizing Temperature for Language Models with Multi-Sample Inference

Weihua Du, Yiming Yang, Sean Welleck

ICML 2025 • poster • arXiv:2502.05234
14
citations
#2116

CofCA: A STEP-WISE Counterfactual Multi-hop QA benchmark

Jian Wu, Linyi Yang, Zhen Wang et al.

ICLR 2025 • poster • arXiv:2402.11924
14
citations
#2117

Backdoor Attacks on Dense Retrieval via Public and Unintentional Triggers

Quanyu Long, Yue Deng, Leilei Gan et al.

COLM 2025 • paper • arXiv:2402.13532
14
citations
#2118

TextToucher: Fine-Grained Text-to-Touch Generation

Jiahang Tu, Hao Fu, Fengyu Yang et al.

AAAI 2025 • paper • arXiv:2409.05427
14
citations
#2119

Optimal transport-based conformal prediction

Gauthier Thurin, Kimia Nadjahi, Claire Boyer

ICML 2025 • poster • arXiv:2501.18991
14
citations
#2120

KVTuner: Sensitivity-Aware Layer-Wise Mixed-Precision KV Cache Quantization for Efficient and Nearly Lossless LLM Inference

Xing Li, Zeyu Xing, Yiming Li et al.

ICML 2025 • poster • arXiv:2502.04420
14
citations
#2121

UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving

Yuping Wang, Xiangyu Huang, Xiaokang Sun et al.

ICCV 2025 • poster • arXiv:2503.24381
14
citations
#2122

MVSAnywhere: Zero-Shot Multi-View Stereo

Sergio Izquierdo, Mohamed Sayed, Michael Firman et al.

CVPR 2025 • poster • arXiv:2503.22430
14
citations
#2123

Explore In-Context Segmentation via Latent Diffusion Models

Chaoyang Wang, Xiangtai Li, Henghui Ding et al.

AAAI 2025 • paper • arXiv:2403.09616
14
citations
#2124

Provably Accurate Shapley Value Estimation via Leverage Score Sampling

Christopher Musco, R. Teal Witter

ICLR 2025 • poster • arXiv:2410.01917
14
citations
#2125

Probabilistic Language-Image Pre-Training

Sanghyuk Chun, Wonjae Kim, Song Park et al.

ICLR 2025 • poster • arXiv:2410.18857
14
citations
#2126

Quantized Spike-driven Transformer

Xuerui Qiu, Malu Zhang, Jieyuan Zhang et al.

ICLR 2025 • poster • arXiv:2501.13492
14
citations
#2127

Code Graph Model (CGM): A Graph-Integrated Large Language Model for Repository-Level Software Engineering Tasks

Hongyuan Tao, Ying Zhang, Zhenhao Tang et al.

NEURIPS 2025 • poster • arXiv:2505.16901
14
citations
#2128

R-TPT: Improving Adversarial Robustness of Vision-Language Models through Test-Time Prompt Tuning

Lijun Sheng, Jian Liang, Zilei Wang et al.

CVPR 2025 • poster • arXiv:2504.11195
14
citations
#2129

Joint Velocity-Growth Flow Matching for Single-Cell Dynamics Modeling

Dongyi Wang, Yuanwei Jiang, Zhenyi Zhang et al.

NEURIPS 2025 • poster • arXiv:2505.13413
14
citations
#2130

To Backtrack or Not to Backtrack: When Sequential Search Limits Model Reasoning

Tian Qin, David Alvarez-Melis, Samy Jelassi et al.

COLM 2025 • paper • arXiv:2504.07052
14
citations
#2131

STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis

Jiatao Gu, Tianrong Chen, David Berthelot et al.

NEURIPS 2025 • spotlight • arXiv:2506.06276
14
citations
#2132

LONG3R: Long Sequence Streaming 3D Reconstruction

Zhuoguang Chen, Minghui Qin, Tianyuan Yuan et al.

ICCV 2025 • poster • arXiv:2507.18255
14
citations
#2133

UGMathBench: A Diverse and Dynamic Benchmark for Undergraduate-Level Mathematical Reasoning with Large Language Models

Xin Xu, Jiaxin Zhang, Tianhao Chen et al.

ICLR 2025 • poster • arXiv:2501.13766
14
citations
#2134

Can Transformers Learn Full Bayesian Inference in Context?

Arik Reuter, Tim G. J. Rudner, Vincent Fortuin et al.

ICML 2025 • poster • arXiv:2501.16825
14
citations
#2135

FlowDec: A flow-based full-band general audio codec with high perceptual quality

Simon Welker, Matthew Le, Ricky T. Q. Chen et al.

ICLR 2025 • poster • arXiv:2503.01485
14
citations
#2136

AerialMegaDepth: Learning Aerial-Ground Reconstruction and View Synthesis

Khiem Vuong, Anurag Ghosh, Deva Ramanan et al.

CVPR 2025 • poster • arXiv:2504.13157
14
citations
#2137

Samba: Synchronized Set-of-Sequences Modeling for Multiple Object Tracking

Mattia Segu, Luigi Piccinelli, Siyuan Li et al.

ICLR 2025 • oral • arXiv:2410.01806
14
citations
#2138

Pippo: High-Resolution Multi-View Humans from a Single Image

Yash Kant, Ethan Weber, Jin Kyu Kim et al.

CVPR 2025 • highlight • arXiv:2502.07785
14
citations
#2139

Patch-level Sounding Object Tracking for Audio-Visual Question Answering

Zhangbin Li, Jinxing Zhou, Jing Zhang et al.

AAAI 2025 • paper • arXiv:2412.10749
14
citations
#2140

DyMO: Training-Free Diffusion Model Alignment with Dynamic Multi-Objective Scheduling

Xin Xie, Dong Gong

CVPR 2025 • poster • arXiv:2412.00759
14
citations
#2141

Position: Don't Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints

Sam Bowyer, Laurence Aitchison, Desi Ivanova

ICML 2025 • spotlight • arXiv:2503.01747
14
citations
#2142

MovieBench: A Hierarchical Movie Level Dataset for Long Video Generation

Weijia Wu, Mingyu Liu, Zeyu Zhu et al.

CVPR 2025 • poster • arXiv:2411.15262
14
citations
#2143

HiLo: A Learning Framework for Generalized Category Discovery Robust to Domain Shifts

Hongjun Wang, Sagar Vaze, Kai Han

ICLR 2025 • poster • arXiv:2408.04591
14
citations
#2144

Provable weak-to-strong generalization via benign overfitting

David Wu, Anant Sahai

ICLR 2025 • poster • arXiv:2410.04638
14
citations
#2145

Reversible Decoupling Network for Single Image Reflection Removal

Hao Zhao, Mingjia Li, Qiming Hu et al.

CVPR 2025 • poster • arXiv:2410.08063
14
citations
#2146

FaceShot: Bring Any Character into Life

Junyao Gao, Yanan Sun, Fei Shen et al.

ICLR 2025 • poster • arXiv:2503.00740
14
citations
#2147

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

Hao Tang, Chen-Wei Xie, Haiyang Wang et al.

NEURIPS 2025 • spotlight • arXiv:2503.01342
14
citations
#2148

GigaHands: A Massive Annotated Dataset of Bimanual Hand Activities

Rao Fu, Dingxi Zhang, Alex Jiang et al.

CVPR 2025 • highlight • arXiv:2412.04244
14
citations
#2149

E-Valuating Classifier Two-Sample Tests

Tim Bakker, Christian A. Naesseth, Patrick Forré et al.

ICLR 2025 • poster • arXiv:2210.13027
14
citations
#2150

Personalized Preference Fine-tuning of Diffusion Models

Meihua Dang, Anikait Singh, Linqi Zhou et al.

CVPR 2025 • poster • arXiv:2501.06655
14
citations
#2151

Bridging Modalities: Improving Universal Multimodal Retrieval by Multimodal Large Language Models

Xin Zhang, Yanzhao Zhang, Wen Xie et al.

CVPR 2025 • poster
14
citations
#2152

NitroFusion: High-Fidelity Single-Step Diffusion through Dynamic Adversarial Training

Dar-Yen Chen, Hmrishav Bandyopadhyay, Kai Zou et al.

CVPR 2025 • poster • arXiv:2412.02030
14
citations
#2153

LlavaGuard: An Open VLM-based Framework for Safeguarding Vision Datasets and Models

Lukas Helff, Felix Friedrich, Manuel Brack et al.

ICML 2025 • poster • arXiv:2406.05113
14
citations
#2154

IDProtector: An Adversarial Noise Encoder to Protect Against ID-Preserving Image Generation

Yiren Song, Pei Yang, Hai Ci et al.

CVPR 2025 • poster • arXiv:2412.11638
14
citations
#2155

Layout-your-3D: Controllable and Precise 3D Generation with 2D Blueprint

Junwei Zhou, Xueting Li, Lu Qi et al.

ICLR 2025 • poster • arXiv:2410.15391
14
citations
#2156

Assessing and Learning Alignment of Unimodal Vision and Language Models

Le Zhang, Qian Yang, Aishwarya Agrawal

CVPR 2025 • highlight • arXiv:2412.04616
14
citations
#2157

X-Dyna: Expressive Dynamic Human Image Animation

Di Chang, Hongyi Xu, You Xie et al.

CVPR 2025 • highlight • arXiv:2501.10021
14
citations
#2158

DRAWER: Digital Reconstruction and Articulation With Environment Realism

Hongchi Xia, Entong Su, Marius Memmel et al.

CVPR 2025 • poster • arXiv:2504.15278
14
citations
#2159

Rethinking End-to-End 2D to 3D Scene Segmentation in Gaussian Splatting

Runsong Zhu, Shi Qiu, Zhengzhe Liu et al.

CVPR 2025 • poster • arXiv:2503.14029
14
citations
#2160

Unleashing Vecset Diffusion Model for Fast Shape Generation

Zeqiang Lai, Zhao Yunfei, Zibo Zhao et al.

ICCV 2025 • highlight • arXiv:2503.16302
14
citations
#2161

Robust Function-Calling for On-Device Language Model via Function Masking

Qiqiang Lin, Muning Wen, Qiuying Peng et al.

ICLR 2025 • poster • arXiv:2410.04587
14
citations
#2162

Beyond Canonicalization: How Tensorial Messages Improve Equivariant Message Passing

Peter Lippmann, Gerrit Gerhartz, Roman Remme et al.

ICLR 2025 • poster • arXiv:2405.15389
14
citations
#2163

DynamicScaler: Seamless and Scalable Video Generation for Panoramic Scenes

Jinxiu Liu, Shaoheng Lin, Yinxiao Li et al.

CVPR 2025 • poster • arXiv:2412.11100
14
citations
#2164

Know "No" Better: A Data-Driven Approach for Enhancing Negation Awareness in CLIP

Junsung Park, Jungbeom Lee, Jongyoon Song et al.

ICCV 2025 • poster • arXiv:2501.10913
14
citations
#2165

Video Diffusion Models Are Strong Video Inpainter

Minhyeok Lee, Suhwan Cho, Chajin Shin et al.

AAAI 2025 • paper • arXiv:2408.11402
14
citations
#2166

SAIST: Segment Any Infrared Small Target Model Guided by Contrastive Language-Image Pretraining

Mingjin Zhang, Xiaolong Li, Fei Gao et al.

CVPR 2025 • poster
14
citations
#2167

WorkflowLLM: Enhancing Workflow Orchestration Capability of Large Language Models

Shengda Fan, Xin Cong, Yuepeng Fu et al.

ICLR 2025 • poster • arXiv:2411.05451
14
citations
#2168

Weighted-Reward Preference Optimization for Implicit Model Fusion

Ziyi Yang, Fanqi Wan, Longguang Zhong et al.

ICLR 2025 • poster • arXiv:2412.03187
14
citations
#2169

CircuitFusion: Multimodal Circuit Representation Learning for Agile Chip Design

Wenji Fang, Shang Liu, Jing Wang et al.

ICLR 2025 • poster • arXiv:2505.02168
14
citations
#2170

Implicit Search via Discrete Diffusion: A Study on Chess

Jiacheng Ye, Zhenyu Wu, Jiahui Gao et al.

ICLR 2025 • poster • arXiv:2502.19805
14
citations
#2171

DPCore: Dynamic Prompt Coreset for Continual Test-Time Adaptation

Yunbei Zhang, Akshay Mehra, Shuaicheng Niu et al.

ICML 2025 • poster • arXiv:2406.10737
14
citations
#2172

A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules

Kairong Luo, Haodong Wen, Shengding Hu et al.

ICLR 2025 • poster • arXiv:2503.12811
14
citations
#2173

Multi-Turn Jailbreaking Large Language Models via Attention Shifting

Xiaohu Du, Fan Mo, Ming Wen et al.

AAAI 2025 • paper
14
citations
#2174

Pitfalls of Evidence-Based AI Policy

Stephen Casper, David Krueger, Dylan Hadfield-Menell

ICLR 2025 • poster • arXiv:2502.09618
14
citations
#2175

Knowledge Editing with Dynamic Knowledge Graphs for Multi-Hop Question Answering

Yifan Lu, Yigeng Zhou, Jing Li et al.

AAAI 2025 • paper • arXiv:2412.13782
14
citations
#2176

MicroVQA: A Multimodal Reasoning Benchmark for Microscopy-Based Scientific Research

James Burgess, Jeffrey J Nirschl, Laura Bravo-Sánchez et al.

CVPR 2025 • poster • arXiv:2503.13399
14
citations
#2177

Towards a Theoretical Understanding of Synthetic Data in LLM Post-Training: A Reverse-Bottleneck Perspective

Zeyu Gan, Yong Liu

ICLR 2025 • poster • arXiv:2410.01720
14
citations
#2178

Docopilot: Improving Multimodal Models for Document-Level Understanding

Yuchen Duan, Zhe Chen, Yusong Hu et al.

CVPR 2025 • poster • arXiv:2507.14675
14
citations
#2179

DiffusionGuard: A Robust Defense Against Malicious Diffusion-based Image Editing

William June Suk Choi, Kyungmin Lee, Jongheon Jeong et al.

ICLR 2025 • poster • arXiv:2410.05694
14
citations
#2180

Dissecting and Mitigating Diffusion Bias via Mechanistic Interpretability

Yingdong Shi, Changming Li, Yifan Wang et al.

CVPR 2025 • poster • arXiv:2503.20483
14
citations
#2181

Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors

Weilong Yan, Ming Li, Li Haipeng et al.

CVPR 2025 • poster • arXiv:2503.20211
14
citations
#2182

Cross-Modal Safety Mechanism Transfer in Large Vision-Language Models

Shicheng Xu, Liang Pang, Yunchang Zhu et al.

ICLR 2025 • poster • arXiv:2410.12662
14
citations
#2183

Implicit Reward as the Bridge: A Unified View of SFT and DPO Connections

Bo Wang, Qinyuan Cheng, Runyu Peng et al.

NEURIPS 2025 • poster • arXiv:2507.00018
14
citations
#2184

HGSFusion: Radar-Camera Fusion with Hybrid Generation and Synchronization for 3D Object Detection

Zijian Gu, Jianwei Ma, Yan Huang et al.

AAAI 2025 • paper • arXiv:2412.11489
14
citations
#2185

Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models

Lucas Bandarkar, Benjamin Muller, Pritish Yuvraj et al.

ICLR 2025 • poster • arXiv:2410.01335
14
citations
#2186

Mixture of Parrots: Experts improve memorization more than reasoning

Samy Jelassi, Clara Mohri, David Brandfonbrener et al.

ICLR 2025 • poster • arXiv:2410.19034
14
citations
#2187

Spike2Former: Efficient Spiking Transformer for High-performance Image Segmentation

Zhenxin Lei, Man Yao, Jiakui Hu et al.

AAAI 2025 • paper • arXiv:2412.14587
14
citations
#2188

SPORTU: A Comprehensive Sports Understanding Benchmark for Multimodal Large Language Models

Haotian Xia, Zhengbang Yang, Junbo Zou et al.

ICLR 2025 • poster • arXiv:2410.08474
14
citations
#2189

CO-SPY: Combining Semantic and Pixel Features to Detect Synthetic Images by AI

Siyuan Cheng, Lingjuan Lyu, Zhenting Wang et al.

CVPR 2025 • poster • arXiv:2503.18286
14
citations
#2190

Nested Learning: The Illusion of Deep Learning Architectures

Ali Behrouz, Meisam Razaviyayn, Peilin Zhong et al.

NEURIPS 2025 • poster • arXiv:2512.24695
14
citations
#2191

Active Data Curation Effectively Distills Large-Scale Multimodal Models

Vishaal Udandarao, Nikhil Parthasarathy, Muhammad Ferjad Naeem et al.

CVPR 2025 • poster • arXiv:2411.18674
14
citations
#2192

Robust Self-Paced Hashing for Cross-Modal Retrieval with Noisy Labels

Ruitao Pu, Yuan Sun, Yang Qin et al.

AAAI 2025 • paper • arXiv:2501.01699
14
citations
#2193

NEST: A Neuromodulated Small-world Hypergraph Trajectory Prediction Model for Autonomous Driving

Chengyue Wang, Haicheng Liao, Bonan Wang et al.

AAAI 2025 • paper • arXiv:2412.11682
14
citations
#2194

AutoMMLab: Automatically Generating Deployable Models from Language Instructions for Computer Vision Tasks

Zekang Yang, Wang Zeng, Sheng Jin et al.

AAAI 2025 • paper • arXiv:2402.15351
14
citations
#2195

Geometry Image Diffusion: Fast and Data-Efficient Text-to-3D with Image-Based Surface Representation

Slava Elizarov, Ciara Rowles, Simon Donné

ICLR 2025 • poster • arXiv:2409.03718
14
citations
#2196

Revisiting Prefix-tuning: Statistical Benefits of Reparameterization among Prompts

Minh Le, Chau Nguyen, Huy Nguyen et al.

ICLR 2025 • poster • arXiv:2410.02200
14
citations
#2197

SQLFixAgent: Towards Semantic-Accurate Text-to-SQL Parsing via Consistency-Enhanced Multi-Agent Collaboration

Jipeng Cen, Jiaxin Liu, Zhixu Li et al.

AAAI 2025 • paper • arXiv:2406.13408
14
citations
#2198

RelGNN: Composite Message Passing for Relational Deep Learning

Tianlang Chen, Charilaos Kanatsoulis, Jure Leskovec

ICML 2025 • poster • arXiv:2502.06784
14
citations
#2199

Weak-to-Strong Generalization Through the Data-Centric Lens

Changho Shin, John Cooper, Frederic Sala

ICLR 2025 • poster • arXiv:2412.03881
14
citations
#2200

InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption

Tiehan Fan, Kepan Nan, Rui Xie et al.

CVPR 2025 • poster • arXiv:2412.09283
14
citations