Most Cited 2025 "invariant feature estimation" Papers

22,274 papers found • Page 26 of 112

#5001

Broken Tokens? Your Language Model can Secretly Handle Non-Canonical Tokenizations

Brian Zheng, Alisa Liu, Orevaoghene Ahia et al.

NEURIPS 2025 • spotlight • arXiv:2506.19004
9
citations
#5002

VRBench: A Benchmark for Multi-Step Reasoning in Long Narrative Videos

Jiashuo Yu, Yue Wu, Meng Chu et al.

ICCV 2025 • arXiv:2506.10857
9
citations
#5003

HiMTok: Learning Hierarchical Mask Tokens for Image Segmentation with Large Multimodal Model

Tao Wang, Changxu Cheng, Lingfeng Wang et al.

ICCV 2025 • arXiv:2503.13026
9
citations
#5004

Task Generalization with Autoregressive Compositional Structure: Can Learning from $D$ Tasks Generalize to $D^T$ Tasks?

Amirhesam Abedsoltan, Huaqing Zhang, Kaiyue Wen et al.

ICML 2025 • arXiv:2502.08991
9
citations
#5005

Relieving Universal Label Noise for Unsupervised Visible-Infrared Person Re-Identification by Inferring from Neighbors

Xiao Teng, Long Lan, Dingyao Chen et al.

AAAI 2025 • paper • arXiv:2412.12220
9
citations
#5006

MLE-STAR: Machine Learning Engineering Agent via Search and Targeted Refinement

Jaehyun Nam, Jinsung Yoon, Jiefeng Chen et al.

NEURIPS 2025 • arXiv:2506.15692
9
citations
#5007

DPC: Dual-Prompt Collaboration for Tuning Vision-Language Models

Haoyang Li, Liang Wang, Chao Wang et al.

CVPR 2025 • arXiv:2503.13443
9
citations
#5008

Compositional Risk Minimization

Divyat Mahajan, Mohammad Pezeshki, Charles Arnal et al.

ICML 2025 • arXiv:2410.06303
9
citations
#5009

Grounding Language with Vision: A Conditional Mutual Information Calibrated Decoding Strategy for Reducing Hallucinations in LVLMs

Hao Fang, Changle Zhou, Jiawei Kong et al.

NEURIPS 2025 • arXiv:2505.19678
9
citations
#5010

MUSE: Mamba Is Efficient Multi-scale Learner for Text-video Retrieval

Haoran Tang, Meng Cao, Jinfa Huang et al.

AAAI 2025 • paper • arXiv:2408.10575
9
citations
#5011

From Specificity to Generality: Revisiting Generalizable Artifacts in Detecting Face Deepfakes

Long Ma, Zhiyuan Yan, Jin Xu et al.

NEURIPS 2025 • arXiv:2504.04827
9
citations
#5012

Evaluating Large Language Models through Role-Guide and Self-Reflection: A Comparative Study

Lili Zhao, Yang Wang, Qi Liu et al.

ICLR 2025
9
citations
#5013

DepthCues: Evaluating Monocular Depth Perception in Large Vision Models

Duolikun Danier, Mehmet Aygun, Changjian Li et al.

CVPR 2025 • arXiv:2411.17385
9
citations
#5014

DELIFT: Data Efficient Language model Instruction Fine-Tuning

Ishika Agarwal, Krishnateja Killamsetty, Lucian Popa et al.

ICLR 2025 • arXiv:2411.04425
9
citations
#5015

Do Visual Imaginations Improve Vision-and-Language Navigation Agents?

Akhil Perincherry, Jacob Krantz, Stefan Lee

CVPR 2025 • arXiv:2503.16394
9
citations
#5016

TabFlex: Scaling Tabular Learning to Millions with Linear Attention

Yuchen Zeng, Tuan Dinh, Wonjun Kang et al.

ICML 2025 • spotlight • arXiv:2506.05584
9
citations
#5017

ManiVideo: Generating Hand-Object Manipulation Video with Dexterous and Generalizable Grasping

Youxin Pang, Ruizhi Shao, Jiajun Zhang et al.

CVPR 2025 • highlight • arXiv:2412.16212
9
citations
#5018

Neighbor Does Matter: Density-Aware Contrastive Learning for Medical Semi-supervised Segmentation

Feilong Tang, Zhongxing Xu, Ming Hu et al.

AAAI 2025 • paper • arXiv:2412.19871
9
citations
#5019

Combining Cost Constrained Runtime Monitors for AI Safety

Tim Hua, James Baskerville, Henri Lemoine et al.

NEURIPS 2025 • arXiv:2507.15886
9
citations
#5020

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Chongjian Ge, Chenfeng Xu, Yuanfeng Ji et al.

CVPR 2025 • arXiv:2410.20723
9
citations
#5021

Chimera: Improving Generalist Model with Domain-Specific Experts

Tianshuo Peng, Mingsheng Li, Jiakang Yuan et al.

ICCV 2025 • arXiv:2412.05983
9
citations
#5022

Hierarchical Vector Quantization for Unsupervised Action Segmentation

Federico Spurio, Emad Bahrami, Gianpiero Francesca et al.

AAAI 2025 • paper • arXiv:2412.17640
9
citations
#5023

RepLDM: Reprogramming Pretrained Latent Diffusion Models for High-Quality, High-Efficiency, High-Resolution Image Generation

Boyuan Cao, Jiaxin Ye, Yujie Wei et al.

NEURIPS 2025 • spotlight • arXiv:2410.06055
9
citations
#5024

HELM: Hierarchical Encoding for mRNA Language Modeling

Mehdi Yazdani-Jahromi, Mangal Prakash, Tommaso Mansi et al.

ICLR 2025 • arXiv:2410.12459
9
citations
#5025

DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image

Qingxuan Wu, Zhiyang Dou, Sirui Xu et al.

ICLR 2025 • arXiv:2406.17988
9
citations
#5026

Rapidly Adapting Policies to the Real-World via Simulation-Guided Fine-Tuning

Patrick Yin, Tyler Westenbroek, Ching-An Cheng et al.

ICLR 2025 • arXiv:2502.02705
9
citations
#5027

DIPO: Dual-State Images Controlled Articulated Object Generation Powered by Diverse Data

Ruiqi Wu, Xinjie Wang, Liu Liu et al.

NEURIPS 2025 • arXiv:2505.20460
9
citations
#5028

ActiveGAMER: Active GAussian Mapping through Efficient Rendering

Liyan Chen, Huangying Zhan, Kevin Chen et al.

CVPR 2025 • arXiv:2501.06897
9
citations
#5029

Mimic In-Context Learning for Multimodal Tasks

Yuchu Jiang, Jiale Fu, Chenduo Hao et al.

CVPR 2025 • arXiv:2504.08851
9
citations
#5030

BoA: Attention-aware Post-training Quantization without Backpropagation

Junhan Kim, Ho-young Kim, Eulrang Cho et al.

ICML 2025 • arXiv:2406.13474
9
citations
#5031

Multi-Focus Image Fusion via Explicit Defocus Blur Modelling

Yuhui Quan, Xi Wan, Zitao Tang et al.

AAAI 2025 • paper
9
citations
#5032

See It from My Perspective: How Language Affects Cultural Bias in Image Understanding

Amith Ananthram, Elias Stengel-Eskin, Mohit Bansal et al.

ICLR 2025 • arXiv:2406.11665
9
citations
#5033

Multi-Modal and Multi-Attribute Generation of Single Cells with CFGen

Alessandro Palma, Till Richter, Hanyi Zhang et al.

ICLR 2025 • arXiv:2407.11734
9
citations
#5034

Seurat: From Moving Points to Depth

Seokju Cho, Gabriel Huang, Seungryong Kim et al.

CVPR 2025 • highlight • arXiv:2504.14687
9
citations
#5035

TransPixeler: Advancing Text-to-Video Generation with Transparency

Luozhou Wang, Yijun Li, ZhiFei Chen et al.

CVPR 2025 • arXiv:2501.03006
9
citations
#5036

A General Adaptive Dual-level Weighting Mechanism for Remote Sensing Pansharpening

Jie Huang, Haorui Chen, Jiaxuan Ren et al.

CVPR 2025 • arXiv:2503.13214
9
citations
#5037

MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments

Matthieu Cord, Antonin Vobecky, Oriane Siméoni et al.

ICLR 2025 • arXiv:2307.09361
9
citations
#5038

Lightweight Dataset Pruning without Full Training via Example Difficulty and Prediction Uncertainty

Yeseul Cho, Baekrok Shin, Changmin Kang et al.

ICML 2025 • arXiv:2502.06905
9
citations
#5039

iMoT: Inertial Motion Transformer for Inertial Navigation

Son Minh Nguyen, Duc Viet Le, Paul Havinga

AAAI 2025 • paper • arXiv:2412.12190
9
citations
#5040

MP-SfM: Monocular Surface Priors for Robust Structure-from-Motion

Zador Pataki, Paul-Edouard Sarlin, Johannes Schönberger et al.

CVPR 2025 • arXiv:2504.20040
9
citations
#5041

ReVisionLLM: Recursive Vision-Language Model for Temporal Grounding in Hour-Long Videos

Tanveer Hannan, Md Mohaiminul Islam, Jindong Gu et al.

CVPR 2025 • arXiv:2411.14901
9
citations
#5042

From Debate to Equilibrium: Belief-Driven Multi-Agent LLM Reasoning via Bayesian Nash Equilibrium

Yi Xie, Zhanke Zhou, Chentao Cao et al.

ICML 2025 • arXiv:2506.08292
9
citations
#5043

PAL: Sample-Efficient Personalized Reward Modeling for Pluralistic Alignment

Daiwei Chen, Yi Chen, Aniket Rege et al.

ICLR 2025
9
citations
#5044

Progressive Compositionality in Text-to-Image Generative Models

Xu Han, Linghao Jin, Xiaofeng Liu et al.

ICLR 2025 • arXiv:2410.16719
9
citations
#5045

SegLLM: Multi-round Reasoning Segmentation with Large Language Models

Xudong Wang, Shaolun Zhang, Shufan Li et al.

ICLR 2025
9
citations
#5046

Ringmaster ASGD: The First Asynchronous SGD with Optimal Time Complexity

Artavazd Maranjyan, Alexander Tyurin, Peter Richtarik

ICML 2025 • arXiv:2501.16168
9
citations
#5047

Can Textual Gradient Work in Federated Learning?

Minghui Chen, Ruinan Jin, Wenlong Deng et al.

ICLR 2025 • arXiv:2502.19980
9
citations
#5048

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

Zhengxue Wang, Zhiqiang Yan, Jinshan Pan et al.

CVPR 2025 • arXiv:2410.11666
9
citations
#5049

ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation

Ali Athar, Xueqing Deng, Liang-Chieh Chen

CVPR 2025 • arXiv:2412.09754
9
citations
#5050

Toward a Unified Theory of Gradient Descent under Generalized Smoothness

Alexander Tyurin

ICML 2025 • arXiv:2412.11773
9
citations
#5051

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

Junha Lee, Chunghyun Park, Jaesung Choe et al.

CVPR 2025 • arXiv:2502.02548
9
citations
#5052

Adversarial Generative Flow Network for Solving Vehicle Routing Problems

Ni Zhang, Jingfeng Yang, Zhiguang Cao et al.

ICLR 2025 • arXiv:2503.01931
9
citations
#5053

Breaking AR’s Sampling Bottleneck: Provable Acceleration via Diffusion Language Models

Gen Li, Changxiao Cai

NEURIPS 2025 • arXiv:2505.21400
9
citations
#5054

REGENT: A Retrieval-Augmented Generalist Agent That Can Act In-Context in New Environments

Kaustubh Sridhar, Souradeep Dutta, Dinesh Jayaraman et al.

ICLR 2025 • arXiv:2412.04759
9
citations
#5055

vesselFM: A Foundation Model for Universal 3D Blood Vessel Segmentation

Bastian Wittmann, Yannick Wattenberg, Tamaz Amiranashvili et al.

CVPR 2025 • arXiv:2411.17386
9
citations
#5056

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model

Benlin Liu, Yuhao Dong, Yiqin Wang et al.

CVPR 2025 • arXiv:2408.00754
9
citations
#5057

Generalizing from SIMPLE to HARD Visual Reasoning: Can We Mitigate Modality Imbalance in VLMs?

Simon Park, Abhishek Panigrahi, Yun Cheng et al.

ICML 2025 • arXiv:2501.02669
9
citations
#5058

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation

Yuying Ge, Yizhuo Li, Yixiao Ge et al.

CVPR 2025 • arXiv:2412.04432
9
citations
#5059

Decomposition Polyhedra of Piecewise Linear Functions

Marie-Charlotte Brandenburg, Moritz Grillo, Christoph Hertrich

ICLR 2025 • arXiv:2410.04907
9
citations
#5060

Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding

Zining Wang, Tongkun Guan, Pei Fu et al.

CVPR 2025 • arXiv:2503.14140
9
citations
#5061

Offline Model-Based Optimization by Learning to Rank

Rong-Xi Tan, Ke Xue, Shen-Huan Lyu et al.

ICLR 2025 • arXiv:2410.11502
9
citations
#5062

EdgeTAM: On-Device Track Anything Model

Chong Zhou, Chenchen Zhu, Yunyang Xiong et al.

CVPR 2025 • arXiv:2501.07256
9
citations
#5063

DataEnvGym: Data Generation Agents in Teacher Environments with Student Feedback

Zaid Khan, Elias Stengel-Eskin, Jaemin Cho et al.

ICLR 2025 • arXiv:2410.06215
9
citations
#5064

MVREC: A General Few-shot Defect Classification Model Using Multi-View Region-Context

Shuai Lyu, Rongchen Zhang, Zeqi Ma et al.

AAAI 2025 • paper • arXiv:2412.16897
9
citations
#5065

DualOpt: A Dual Divide-and-Optimize Algorithm for the Large-scale Traveling Salesman Problem

Shipei Zhou, Yuandong Ding, Chi Zhang et al.

AAAI 2025 • paper • arXiv:2501.08565
9
citations
#5066

MOVIS: Enhancing Multi-Object Novel View Synthesis for Indoor Scenes

Ruijie Lu, Yixin Chen, Junfeng Ni et al.

CVPR 2025 • arXiv:2412.11457
9
citations
#5067

Rendering-Aware Reinforcement Learning for Vector Graphics Generation

Juan Rodriguez, Haotian Zhang, Abhay Puri et al.

NEURIPS 2025 • arXiv:2505.20793
9
citations
#5068

Thinking Racial Bias in Fair Forgery Detection: Models, Datasets and Evaluations

Decheng Liu, Zongqi Wang, Chunlei Peng et al.

AAAI 2025 • paper • arXiv:2407.14367
9
citations
#5069

Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection

Fanhu Zeng, Zhen Cheng, Fei Zhu et al.

ICLR 2025 • arXiv:2409.04796
9
citations
#5070

DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes

Yiyuan Liang, Zhiying Yan, Liqun Chen et al.

AAAI 2025 • paper • arXiv:2412.19458
9
citations
#5071

Show and Segment: Universal Medical Image Segmentation via In-Context Learning

Yunhe Gao, Di Liu, Zhuowei Li et al.

CVPR 2025 • arXiv:2503.19359
9
citations
#5072

Scent of Knowledge: Optimizing Search-Enhanced Reasoning with Information Foraging

Hongjin Qian, Zheng Liu

NEURIPS 2025 • spotlight • arXiv:2505.09316
9
citations
#5073

DreamPRM: Domain-reweighted Process Reward Model for Multimodal Reasoning

Qi Cao, Ruiyi Wang, Ruiyi Zhang et al.

NEURIPS 2025 • arXiv:2505.20241
9
citations
#5074

EventSplat: 3D Gaussian Splatting from Moving Event Cameras for Real-time Rendering

Toshiya Yura, Ashkan Mirzaei, Igor Gilitschenski

CVPR 2025 • arXiv:2412.07293
9
citations
#5075

Feature Denoising Diffusion Model for Blind Image Quality Assessment

Xudong Li, Yan Zhang, Yunhang Shen et al.

AAAI 2025 • paper • arXiv:2401.11949
9
citations
#5076

Near-Optimal Sample Complexity for MDPs via Anchoring

Jongmin Lee, Mario Bravo, Roberto Cominetti

ICML 2025 • arXiv:2502.04477
9
citations
#5077

Energy-based Backdoor Defense Against Federated Graph Learning

Guancheng Wan, Zitong Shi, Wenke Huang et al.

ICLR 2025
9
citations
#5078

Is Your Video Language Model a Reliable Judge?

Ming Liu, Wensheng Zhang

ICLR 2025 • arXiv:2503.05977
9
citations
#5079

AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation

Guanxing Lu, Tengbo Yu, Haoyuan Deng et al.

ICCV 2025 • arXiv:2412.06779
9
citations
#5080

PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

Cheng Zhang, Haofei Xu, Qianyi Wu et al.

CVPR 2025 • arXiv:2412.12096
9
citations
#5081

NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative

Asmar Nadeem, Faegheh Sardari, Robert Dawes et al.

ICLR 2025 • oral • arXiv:2406.06499
9
citations
#5082

FixTalk: Taming Identity Leakage for High-Quality Talking Head Generation in Extreme Cases

Shuai Tan, Bill Gong, Bin Ji et al.

ICCV 2025 • arXiv:2507.01390
9
citations
#5083

Gumbel Counterfactual Generation From Language Models

Shauli Ravfogel, Anej Svete, Vésteinn Snæbjarnarson et al.

ICLR 2025 • arXiv:2411.07180
9
citations
#5084

Shortcuts and Identifiability in Concept-based Models from a Neuro-Symbolic Lens

Samuele Bortolotti, Emanuele Marconato, Paolo Morettin et al.

NEURIPS 2025 • arXiv:2502.11245
9
citations
#5085

Superposition Yields Robust Neural Scaling

Yizhou Liu, Ziming Liu, Jeff Gore

NEURIPS 2025 • oral • arXiv:2505.10465
9
citations
#5086

MindAligner: Explicit Brain Functional Alignment for Cross-Subject Visual Decoding from Limited fMRI Data

Yuqin Dai, Zhouheng Yao, Chunfeng Song et al.

ICML 2025 • arXiv:2502.05034
9
citations
#5087

Towards Autonomous Micromobility through Scalable Urban Simulation

Wayne Wu, Honglin He, Chaoyuan Zhang et al.

CVPR 2025 • highlight • arXiv:2505.00690
9
citations
#5088

Rethinking Open-Vocabulary Segmentation of Radiance Fields in 3D Space

Hyunjee Lee, Youngsik Yun, Jeongmin Bae et al.

AAAI 2025 • paper • arXiv:2408.07416
9
citations
#5089

Highly Compressed Tokenizer Can Generate Without Training

Lukas Lao Beyer, Tianhong Li, Xinlei Chen et al.

ICML 2025 • arXiv:2506.08257
9
citations
#5090

VidChain: Chain-of-Tasks with Metric-based Direct Preference Optimization for Dense Video Captioning

Ji Soo Lee, Jongha Kim, Jeehye Na et al.

AAAI 2025 • paper • arXiv:2501.06761
9
citations
#5091

Correcting Deviations from Normality: A Reformulated Diffusion Model for Multi-Class Unsupervised Anomaly Detection

Farzad Beizaee, Gregory A. Lodygensky, Christian Desrosiers et al.

CVPR 2025 • arXiv:2503.19357
9
citations
#5092

Beyond Sequence: Impact of Geometric Context for RNA Property Prediction

Junjie Xu, Artem Moskalev, Tommaso Mansi et al.

ICLR 2025 • arXiv:2410.11933
9
citations
#5093

Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent

Sayan Banerjee, Krishna Balasubramanian, Promit Ghosal

ICLR 2025 • arXiv:2409.08469
9
citations
#5094

AudSemThinker: Enhancing Audio-Language Models Through Reasoning over Semantics of Sound

Gijs Wijngaard, Elia Formisano, Michele Esposito et al.

NEURIPS 2025 • arXiv:2505.14142
9
citations
#5095

RBench-V: A Primary Assessment for Visual Reasoning Models with Multimodal Outputs

Meng-Hao Guo, Xuanyu Chu, Qianrui Yang et al.

NEURIPS 2025
9
citations
#5096

StreamMind: Unlocking Full Frame Rate Streaming Video Dialogue through Event-Gated Cognition

Xin Ding, Hao Wu, Yifan Yang et al.

ICCV 2025 • arXiv:2503.06220
9
citations
#5097

VideoAuteur: Towards Long Narrative Video Generation

Junfei Xiao, Feng Cheng, Lu Qi et al.

ICCV 2025 • arXiv:2501.06173
9
citations
#5098

Multi-Granular Multimodal Clue Fusion for Meme Understanding

Li Zheng, Hao Fei, Ting Dai et al.

AAAI 2025 • paper • arXiv:2503.12560
9
citations
#5099

MAGE: Model-Level Graph Neural Networks Explanations via Motif-based Graph Generation

Zhaoning Yu, Hongyang Gao

ICLR 2025 • arXiv:2405.12519
9
citations
#5100

APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning

Azim Ospanov, Farzan Farnia, Roozbeh Yousefzadeh

NEURIPS 2025 • arXiv:2505.05758
9
citations
#5101

Human-centered Interactive Learning via MLLMs for Text-to-Image Person Re-identification

Yang Qin, Chao Chen, Zhihang Fu et al.

CVPR 2025 • arXiv:2506.11036
9
citations
#5102

DEEPTalk: Dynamic Emotion Embedding for Probabilistic Speech-Driven 3D Face Animation

Jisoo Kim, Jungbin Cho, Joonho Park et al.

AAAI 2025 • paper • arXiv:2408.06010
9
citations
#5103

Near, far: Patch-ordering enhances vision foundation models' scene understanding

Valentinos Pariza, Mohammadreza Salehi, Gertjan J Burghouts et al.

ICLR 2025 • arXiv:2408.11054
9
citations
#5104

Assessing the Creativity of LLMs in Proposing Novel Solutions to Mathematical Problems

Junyi Ye, Jingyi Gu, Xinyun Zhao et al.

AAAI 2025 • paper • arXiv:2410.18336
9
citations
#5105

Distilling Structural Representations into Protein Sequence Models

Jeffrey Ouyang-Zhang, Chengyue Gong, Yue Zhao et al.

ICLR 2025
9
citations
#5106

Understanding and Improving Length Generalization in Recurrent Models

Ricardo Buitrago Ruiz, Albert Gu

ICML 2025 • arXiv:2507.02782
9
citations
#5107

PDE-Transformer: Efficient and Versatile Transformers for Physics Simulations

Benjamin Holzschuh, Qiang Liu, Georg Kohl et al.

ICML 2025 • oral • arXiv:2505.24717
9
citations
#5108

MSE-Adapter: A Lightweight Plugin Endowing LLMs with the Capability to Perform Multimodal Sentiment Analysis and Emotion Recognition

Yang Yang, Xunde Dong, Yupeng Qiang

AAAI 2025 • paper • arXiv:2502.12478
9
citations
#5109

OpenAD: Open-World Autonomous Driving Benchmark for 3D Object Detection

Zhongyu Xia, Jishuo Li, Zhiwei Lin et al.

NEURIPS 2025 • arXiv:2411.17761
9
citations
#5110

Prioritized Generative Replay

Ren Wang, Kevin Frans, Pieter Abbeel et al.

ICLR 2025 • arXiv:2410.18082
9
citations
#5111

InPO: Inversion Preference Optimization with Reparametrized DDIM for Efficient Diffusion Model Alignment

Yunhong Lu, Qichao Wang, Hengyuan Cao et al.

CVPR 2025 • highlight • arXiv:2503.18454
9
citations
#5112

Value-Based Deep RL Scales Predictably

Oleh Rybkin, Michal Nauman, Preston Fu et al.

ICML 2025 • arXiv:2502.04327
9
citations
#5113

InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct

Yutong Wu, Di Huang, Wenxuan Shi et al.

AAAI 2025 • paper • arXiv:2407.05700
9
citations
#5114

VRVVC: Variable-Rate NeRF-Based Volumetric Video Compression

Qiang Hu, Houqiang Zhong, Zihan Zheng et al.

AAAI 2025 • paper • arXiv:2412.11362
9
citations
#5115

Everything Everywhere All at Once: LLMs can In-Context Learn Multiple Tasks in Superposition

Zheyang Xiong, Jack Cai, John Cooper et al.

ICML 2025 • spotlight • arXiv:2410.05603
9
citations
#5116

FreCaS: Efficient Higher-Resolution Image Generation via Frequency-aware Cascaded Sampling

Zhengqiang Zhang, Ruihuang Li, Lei Zhang

ICLR 2025 • arXiv:2410.18410
9
citations
#5117

Distilling Monocular Foundation Model for Fine-grained Depth Completion

Yingping Liang, Yutao Hu, Wenqi Shao et al.

CVPR 2025 • arXiv:2503.16970
9
citations
#5118

ReAttention: Training-Free Infinite Context with Finite Attention Scope

Xiaoran Liu, Ruixiao Li, Zhigeng Liu et al.

ICLR 2025 • arXiv:2407.15176
9
citations
#5119

An Interpretable N-gram Perplexity Threat Model for Large Language Model Jailbreaks

Valentyn Boreiko, Alexander Panfilov, Václav Voráček et al.

ICML 2025 • arXiv:2410.16222
9
citations
#5120

Information-Driven Design of Imaging Systems

Henry Pinkard, Leyla Kabuli, Eric Markley et al.

NEURIPS 2025 • arXiv:2405.20559
9
citations
#5121

NeRAF: 3D Scene Infused Neural Radiance and Acoustic Fields

Amandine Brunetto, Sascha Hornauer, Fabien Moutarde

ICLR 2025 • arXiv:2405.18213
9
citations
#5122

VidTwin: Video VAE with Decoupled Structure and Dynamics

Yuchi Wang, Junliang Guo, Xinyi Xie et al.

CVPR 2025 • arXiv:2412.17726
9
citations
#5123

WildFake: A Large-Scale and Hierarchical Dataset for AI-Generated Images Detection

Yan Hong, Jianming Feng, Haoxing Chen et al.

AAAI 2025 • paper
9
citations
#5124

DiffCalib: Reformulating Monocular Camera Calibration as Diffusion-Based Dense Incident Map Generation

Xiankang He, Guangkai Xu, Bo Zhang et al.

AAAI 2025 • paper • arXiv:2405.15619
9
citations
#5125

HOPE for a Robust Parameterization of Long-memory State Space Models

Annan Yu, Michael W Mahoney, N. Benjamin Erichson

ICLR 2025 • arXiv:2405.13975
9
citations
#5126

Kiss3DGen: Repurposing Image Diffusion Models for 3D Asset Generation

Jiantao Lin, Xin Yang, Meixi Chen et al.

CVPR 2025 • arXiv:2503.01370
9
citations
#5127

Probability Density Geodesics in Image Diffusion Latent Space

Qingtao Yu, Jaskirat Singh, Zhaoyuan Yang et al.

CVPR 2025 • arXiv:2504.06675
9
citations
#5128

Generative Inbetweening through Frame-wise Conditions-Driven Video Generation

Tianyi Zhu, Dongwei Ren, Qilong Wang et al.

CVPR 2025 • arXiv:2412.11755
9
citations
#5129

TEncDM: Understanding the Properties of the Diffusion Model in the Space of Language Model Encodings

Alexander Shabalin, Viacheslav Meshchaninov, Egor Chimbulatov et al.

AAAI 2025 • paper • arXiv:2402.19097
9
citations
#5130

PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions

Daeun Kyung, Hyunseung Chung, Seongsu Bae et al.

NEURIPS 2025 • spotlight • arXiv:2505.17818
9
citations
#5131

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

Jihyun Lee, Weipeng Xu, Alexander Richard et al.

CVPR 2025 • arXiv:2504.04956
9
citations
#5132

Preference-Guided Diffusion for Multi-Objective Offline Optimization

Yashas Annadani, Syrine Belakaria, Stefano Ermon et al.

NEURIPS 2025 • arXiv:2503.17299
9
citations
#5133

On the Optimization and Generalization of Two-layer Transformers with Sign Gradient Descent

Bingrui Li, Wei Huang, Andi Han et al.

ICLR 2025 • arXiv:2410.04870
9
citations
#5134

GENTEEL-NEGOTIATOR: LLM-Enhanced Mixture-of-Expert-Based Reinforcement Learning Approach for Polite Negotiation Dialogue

Priyanshu Priya, Rishikant Chigrupaatii, Mauajama Firdaus et al.

AAAI 2025 • paper
9
citations
#5135

SMARTIES: Spectrum-Aware Multi-Sensor Auto-Encoder for Remote Sensing Images

Gencer Sumbul, Chang Xu, Emanuele Dalsasso et al.

ICCV 2025 • arXiv:2506.19585
9
citations
#5136

Monet: Mixture of Monosemantic Experts for Transformers

Jungwoo Park, Young Jin Ahn, Kee-Eung Kim et al.

ICLR 2025 • arXiv:2412.04139
9
citations
#5137

Aligned Datasets Improve Detection of Latent Diffusion-Generated Images

Anirudh Sundara Rajan, Utkarsh Ojha, Jedidiah Schloesser et al.

ICLR 2025 • arXiv:2410.11835
9
citations
#5138

HaDeMiF: Hallucination Detection and Mitigation in Large Language Models

Xiaoling Zhou, Mingjie Zhang, Zhemg Lee et al.

ICLR 2025
9
citations
#5139

(Mis)Fitting Scaling Laws: A Survey of Scaling Law Fitting Techniques in Deep Learning

Margaret Li, Sneha Kudugunta, Luke Zettlemoyer

ICLR 2025
9
citations
#5140

Multi-Pair Temporal Sentence Grounding via Multi-Thread Knowledge Transfer Network

Xiang Fang, Wanlong Fang, Changshuo Wang et al.

AAAI 2025 • paper • arXiv:2412.15678
9
citations
#5141

Sample complexity of data-driven tuning of model hyperparameters in neural networks with structured parameter-dependent dual function

Maria-Florina Balcan, Anh Nguyen, Dravyansh Sharma

NEURIPS 2025 • arXiv:2501.13734
9
citations
#5142

Depth-Centric Dehazing and Depth-Estimation from Real-World Hazy Driving Video

Junkai Fan, Kun Wang, Zhiqiang Yan et al.

AAAI 2025 • paper • arXiv:2412.11395
9
citations
#5143

Fine-structure Preserved Real-world Image Super-resolution via Transfer VAE Training

Qiaosi Yi, Shuai Li, Rongyuan Wu et al.

ICCV 2025 • highlight • arXiv:2507.20291
9
citations
#5144

OS-ATLAS: Foundation Action Model for Generalist GUI Agents

Zhiyong Wu, Zhenyu Wu, Fangzhi Xu et al.

ICLR 2025
9
citations
#5145

Fast Think-on-Graph: Wider, Deeper and Faster Reasoning of Large Language Model on Knowledge Graph

Xujian Liang, Zhaoquan Gu

AAAI 2025 • paper • arXiv:2501.14300
9
citations
#5146

Spectral-Refiner: Accurate Fine-Tuning of Spatiotemporal Fourier Neural Operator for Turbulent Flows

Shuhao Cao, Francesco Brarda, Ruipeng Li et al.

ICLR 2025 • oral • arXiv:2405.17211
9
citations
#5147

Neural Approximate Mirror Maps for Constrained Diffusion Models

Berthy Feng, Ricardo Baptista, Katherine Bouman

ICLR 2025 • arXiv:2406.12816
9
citations
#5148

LOMA: Language-assisted Semantic Occupancy Network via Triplane Mamba

Yubo Cui, Zhiheng Li, Jiaqiang Wang et al.

AAAI 2025 • paper • arXiv:2412.08388
9
citations
#5149

MultiGO: Towards Multi-level Geometry Learning for Monocular 3D Textured Human Reconstruction

Gangjian Zhang, Nanjie Yao, Shunsi Zhang et al.

CVPR 2025 • arXiv:2412.03103
9
citations
#5150

Breaking Free from MMI: A New Frontier in Rationalization by Probing Input Utilization

Wei Liu, Zhiying Deng, Zhongyu Niu et al.

ICLR 2025 • arXiv:2503.06202
9
citations
#5151

Knowledge Distillation with Refined Logits

Wujie Sun, Defang Chen, Siwei Lyu et al.

ICCV 2025 • arXiv:2408.07703
9
citations
#5152

Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation

Gao Peng, Le Zhuo, Dongyang Liu et al.

ICLR 2025 • oral
9
citations
#5153

Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness

Rongzhe Wei, Peizhi Niu, Hans Hao-Hsun Hsu et al.

NEURIPS 2025 • arXiv:2506.05735
9
citations
#5154

SPARTAN: A Sparse Transformer World Model Attending to What Matters

Anson Lei, Bernhard Schölkopf, Ingmar Posner

NEURIPS 2025 • arXiv:2411.06890
9
citations
#5155

Queryable Prototype Multiple Instance Learning with Vision-Language Models for Incremental Whole Slide Image Classification

Jiaxiang Gou, Luping Ji, Pei Liu et al.

AAAI 2025 • paper • arXiv:2410.10573
9
citations
#5156

Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events

Aditya Chinchure, Sahithya Ravi, Raymond Ng et al.

CVPR 2025 • arXiv:2412.05725
9
citations
#5157

M3Net: Multimodal Multi-task Learning for 3D Detection, Segmentation, and Occupancy Prediction in Autonomous Driving

Xuesong Chen, Shaoshuai Shi, Tao Ma et al.

AAAI 2025 • paper • arXiv:2503.18100
9
citations
#5158

Does Editing Provide Evidence for Localization?

Zihao Wang, Victor Veitch

ICLR 2025 • arXiv:2502.11447
9
citations
#5159

An End-to-End Robust Point Cloud Semantic Segmentation Network with Single-Step Conditional Diffusion Models

Wentao Qu, Jing Wang, Yongshun Gong et al.

CVPR 2025 • arXiv:2411.16308
9
citations
#5160

LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning About Actions

Adam Ishay, Joohyung Lee

AAAI 2025 • paper • arXiv:2501.00830
9
citations
#5161

Measuring what Matters: Construct Validity in Large Language Model Benchmarks

Andrew M. Bean, Ryan Othniel Kearns, Angelika Romanou et al.

NEURIPS 2025 • arXiv:2511.04703
9
citations
#5162

A Training-Free Sub-quadratic Cost Transformer Model Serving Framework with Hierarchically Pruned Attention

Heejun Lee, Geon Park, Youngwan Lee et al.

ICLR 2025 • arXiv:2406.09827
9
citations
#5163

RNG: Relightable Neural Gaussians

Jiahui Fan, Fujun Luan, Jian Yang et al.

CVPR 2025 • arXiv:2409.19702
9
citations
#5164

Cross-View Referring Multi-Object Tracking

Sijia Chen, En Yu, Wenbing Tao

AAAI 2025 • paper • arXiv:2412.17807
9
citations
#5165

U-REPA: Aligning Diffusion U-Nets to ViTs

Yuchuan Tian, Hanting Chen, Mengyu Zheng et al.

NEURIPS 2025 • arXiv:2503.18414
9
citations
#5166

Learning Safety Constraints for Large Language Models

Xin Chen, Yarden As, Andreas Krause

ICML 2025 • spotlight • arXiv:2505.24445
9
citations
#5167

CustomContrast: A Multilevel Contrastive Perspective for Subject-Driven Text-to-Image Customization

Nan Chen, Mengqi Huang, Zhuowei Chen et al.

AAAI 2025 • paper • arXiv:2409.05606
9
citations
#5168

SimulPL: Aligning Human Preferences in Simultaneous Machine Translation

Donglei Yu, Yang Zhao, Jie Zhu et al.

ICLR 2025 • arXiv:2502.00634
9
citations
#5169

Simulation-Free Hierarchical Latent Policy Planning for Proactive Dialogues

Tao He, Lizi Liao, Yixin Cao et al.

AAAI 2025 • paper • arXiv:2412.14584
9
citations
#5170

ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation

Ling-An Zeng, Guohong Huang, Yi-Lin Wei et al.

CVPR 2025 • arXiv:2503.13130
9
citations
#5171

DeFT: Decoding with Flash Tree-attention for Efficient Tree-structured LLM Inference

Jinwei Yao, Kaiqi Chen, Kexun Zhang et al.

ICLR 2025 • arXiv:2404.00242
9
citations
#5172

BrainUICL: An Unsupervised Individual Continual Learning Framework for EEG Applications

Yangxuan Zhou, Sha Zhao, Jiquan Wang et al.

ICLR 2025
9
citations
#5173

Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation

Hadi Alzayer, Philipp Henzler, Jonathan T. Barron et al.

CVPR 2025 • highlight • arXiv:2412.15211
9
citations
#5174

ForestFormer3D: A Unified Framework for End-to-End Segmentation of Forest LiDAR 3D Point Clouds

Binbin Xiang, Maciej Wielgosz, Stefano Puliti et al.

ICCV 2025 • arXiv:2506.16991
9
citations
#5175

SAM2-LOVE: Segment Anything Model 2 in Language-aided Audio-Visual Scenes

Yuji Wang, Haoran Xu, Yong Liu et al.

CVPR 2025 • arXiv:2506.01558
9
citations
#5176

AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation

Jingkun An, Yinghao Zhu, Zongjian Li et al.

AAAI 2025 • paper • arXiv:2403.13352
9
citations
#5177

In-Context Deep Learning via Transformer Models

Weimin Wu, Maojiang Su, Jerry Yao-Chieh Hu et al.

ICML 2025 • arXiv:2411.16549
9
citations
#5178

Cafe-Talk: Generating 3D Talking Face Animation with Multimodal Coarse- and Fine-grained Control

Hejia Chen, Haoxian Zhang, Shoulong Zhang et al.

ICLR 2025 • oral • arXiv:2503.14517
9
citations
#5179

ALE-Bench: A Benchmark for Long-Horizon Objective-Driven Algorithm Engineering

Yuki Imajuku, Kohki Horie, Yoichi Iwata et al.

NEURIPS 2025 • arXiv:2506.09050
9
citations
#5180

Progressive distillation induces an implicit curriculum

Abhishek Panigrahi, Bingbin Liu, Sadhika Malladi et al.

ICLR 2025 • arXiv:2410.05464
9
citations
#5181

UAVScenes: A Multi-Modal Dataset for UAVs

Sijie Wang, Siqi Li, Yawei Zhang et al.

ICCV 2025 • arXiv:2507.22412
9
citations
#5182

DRESSing Up LLM: Efficient Stylized Question-Answering via Style Subspace Editing

Xinyu Ma, Yifeng Xu, Yang Lin et al.

ICLR 2025 • arXiv:2501.14371
9
citations
#5183

Decentralized Diffusion Models

David McAllister, Matthew Tancik, Jiaming Song et al.

CVPR 2025 • arXiv:2501.05450
9
citations
#5184

AlphaZero Neural Scaling and Zipf's Law: a Tale of Board Games and Power Laws

Oren Neumann, Claudius Gros

NEURIPS 2025 • spotlight • arXiv:2412.11979
9
citations
#5185

CP-Guard: Malicious Agent Detection and Defense in Collaborative Bird’s Eye View Perception

Senkang Hu, Yihang Tao, Guowen Xu et al.

AAAI 2025 • paper • arXiv:2412.12000
9
citations
#5186

Vision Foundation Models as Effective Visual Tokenizers for Autoregressive Generation

Zheng Anlin, Xin Wen, Xuanyang Zhang et al.

NEURIPS 2025
9
citations
#5187

Towards Accurate Binary Spiking Neural Networks: Learning with Adaptive Gradient Modulation Mechanism

Yu Liang, Wenjie Wei, Ammar Belatreche et al.

AAAI 2025 • paper • arXiv:2502.14344
9
citations
#5188

GVMGen: A General Video-to-Music Generation Model with Hierarchical Attentions

Heda Zuo, Weitao You, Junxian Wu et al.

AAAI 2025 • paper • arXiv:2501.09972
9
citations
#5189

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

Zichen Wen, Shaobo Wang, Yufa Zhou et al.

NEURIPS 2025 • arXiv:2510.00515
9
citations
#5190

MolSpectra: Pre-training 3D Molecular Representation with Multi-modal Energy Spectra

Liang Wang, Shaozhen Liu, Yu Rong et al.

ICLR 2025 • arXiv:2502.16284
9
citations
#5191

ALOcc: Adaptive Lifting-based 3D Semantic Occupancy and Cost Volume-based Flow Predictions

Dubing Chen, Jin Fang, Wencheng Han et al.

ICCV 2025 • arXiv:2411.07725
9
citations
#5192

MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions

Yekun Chai, Haoran Sun, Huang Fang et al.

ICLR 2025 • oral • arXiv:2410.02743
9
citations
#5193

MalCL: Leveraging GAN-Based Generative Replay to Combat Catastrophic Forgetting in Malware Classification

Jimin Park, AHyun Ji, Minji Park et al.

AAAI 2025 • paper • arXiv:2501.01110
9
citations
#5194

MAP: Unleashing Hybrid Mamba-Transformer Vision Backbone's Potential with Masked Autoregressive Pretraining

Yunze Liu, Li Yi

CVPR 2025 • arXiv:2410.00871
9
citations
#5195

Asynchronous Federated Clustering with Unknown Number of Clusters

Yunfan Zhang, Yiqun Zhang, Yang Lu et al.

AAAI 2025 • paper • arXiv:2412.20341
9
citations
#5196

Residual-MPPI: Online Policy Customization for Continuous Control

Pengcheng Wang, Chenran Li, Catherine Weaver et al.

ICLR 2025 • arXiv:2407.00898
9
citations
#5197

Compression via Pre-trained Transformers: A Study on Byte-Level Multimodal Data

David Heurtel-Depeiges, Anian Ruoss, Joel Veness et al.

ICML 2025 • arXiv:2410.05078
9
citations
#5198

BiDeV: Bilateral Defusing Verification for Complex Claim Fact-Checking

Yuxuan Liu, Hongda Sun, Wenya Guo et al.

AAAI 2025 • paper • arXiv:2502.16181
9
citations
#5199

GeoSplatting: Towards Geometry Guided Gaussian Splatting for Physically-based Inverse Rendering

Kai Ye, Chong Gao, Guanbin Li et al.

ICCV 2025 • arXiv:2410.24204
9
citations
#5200

Hotspot-Driven Peptide Design via Multi-Fragment Autoregressive Extension

Jiahan Li, Tong Chen, Shitong Luo et al.

ICLR 2025 • arXiv:2411.18463
9
citations