Most Cited 2025 "squared error loss" Papers

22,274 papers found • Page 53 of 112

#10401

Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion

Aleksandar Jevtić, Christoph Reich, Felix Wimbauer et al.

ICCV 2025 • arXiv:2507.06230
3
citations
#10402

ARMesh: Autoregressive Mesh Generation via Next-Level-of-Detail Prediction

Jiabao Lei, Kewei Shi, Zhihao Liang et al.

NEURIPS 2025 • arXiv:2509.20824
3
citations
#10403

Performative Validity of Recourse Explanations

Gunnar König, Hidde Fokkema, Timo Freiesleben et al.

NEURIPS 2025 • arXiv:2506.15366
3
citations
#10404

Dimension-Reduction Attack! Video Generative Models are Experts on Controllable Image Synthesis

Hengyuan Cao, Yutong Feng, Biao Gong et al.

NEURIPS 2025 (oral) • arXiv:2505.23325
3
citations
#10405

LightFair: Towards an Efficient Alternative for Fair T2I Diffusion via Debiasing Pre-trained Text Encoders

Boyu Han, Qianqian Xu, Shilong Bao et al.

NEURIPS 2025 • arXiv:2509.23639
3
citations
#10406

VFRTok: Variable Frame Rates Video Tokenizer with Duration-Proportional Information Assumption

Tianxiong Zhong, Xingye Tian, Boyuan Jiang et al.

NEURIPS 2025 (oral) • arXiv:2505.12053
3
citations
#10407

TITAN: A Trajectory-Informed Technique for Adaptive Parameter Freezing in Large-Scale VQE

Yifeng Peng, Xinyi Li, Samuel Yen-Chi Chen et al.

NEURIPS 2025 • arXiv:2509.15193
3
citations
#10408

A Hidden Stumbling Block in Generalized Category Discovery: Distracted Attention

Qiyu Xu, Zhanxuan Hu, Yu Duan et al.

ICCV 2025 • arXiv:2507.14315
3
citations
#10409

GenM3: Generative Pretrained Multi-path Motion Model for Text Conditional Human Motion Generation

Junyu Shi, Lijiang Liu, Yong Sun et al.

ICCV 2025
3
citations
#10410

PRVQL: Progressive Knowledge-guided Refinement for Robust Egocentric Visual Query Localization

Bing Fan, Yunhe Feng, Yapeng Tian et al.

ICCV 2025 • arXiv:2502.07707
3
citations
#10411

Learning long range dependencies through time reversal symmetry breaking

Guillaume Pourcel, Maxence Ernoult

NEURIPS 2025 (oral) • arXiv:2506.05259
3
citations
#10412

Enhancing Graph Of Thought: Enhancing Prompts with LLM Rationales and Dynamic Temperature Control

Sunguk Shin, Youngjoon Kim

ICLR 2025
3
citations
#10413

Aligning Transformers with Continuous Feedback via Energy Rank Alignment

Shriram Chennakesavalu, Frank Hu, Sebastian Ibarraran et al.

NEURIPS 2025 • arXiv:2405.12961
3
citations
#10414

IllumiCraft: Unified Geometry and Illumination Diffusion for Controllable Video Generation

Yuanze Lin, Yi-Wen Chen, Yi-Hsuan Tsai et al.

NEURIPS 2025 (oral) • arXiv:2506.03150
3
citations
#10415

When Lighting Deceives: Exposing Vision-Language Models' Illumination Vulnerability Through Illumination Transformation Attack

Hanqing Liu, Shouwei Ruan, Yao Huang et al.

ICCV 2025 • arXiv:2503.06903
3
citations
#10416

CLOVER: Cross-Layer Orthogonal Vectors Pruning and Fine-Tuning

Fanxu Meng, Muhan Zhang

ICLR 2025 • arXiv:2411.17426
3
citations
#10417

Scale Efficient Training for Large Datasets

Qing Zhou, Junyu Gao, Qi Wang

CVPR 2025 • arXiv:2503.13385
3
citations
#10418

STEP: A Unified Spiking Transformer Evaluation Platform for Fair and Reproducible Benchmarking

Sicheng Shen, Dongcheng Zhao, Linghao Feng et al.

NEURIPS 2025 (oral) • arXiv:2505.11151
3
citations
#10419

Measuring Scientific Capabilities of Language Models with a Systems Biology Dry Lab

Haonan Duan, Stephen Lu, Caitlin F Harrigan et al.

NEURIPS 2025 • arXiv:2507.02083
3
citations
#10420

Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models

Zerui Tao, Yuhta Takida, Naoki Murata et al.

ICCV 2025 • arXiv:2501.08727
3
citations
#10421

DiffDoctor: Diagnosing Image Diffusion Models Before Treating

Yiyang Wang, Xi Chen, Xiaogang Xu et al.

ICCV 2025 • arXiv:2501.12382
3
citations
#10422

Constructing Ophthalmic MLLM for Positioning-diagnosis Collaboration Through Clinical Cognitive Chain Reasoning

Xinyao Liu, Diping Song

ICCV 2025 • arXiv:2507.17539
3
citations
#10423

Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs

Liwei Che, Qingze T Liu, Jing Jia et al.

ICCV 2025 • arXiv:2503.07772
3
citations
#10424

ToVE: Efficient Vision-Language Learning via Knowledge Transfer from Vision Experts

Yuanchen Wu, Junlong Du, Ke Yan et al.

ICLR 2025 • arXiv:2504.00691
3
citations
#10425

A Differentiable Wave Optics Model for End-to-End Computational Imaging System Optimization

Chi-Jui Ho, Yash Belhe, Steve Rotenberg et al.

ICCV 2025 • arXiv:2412.09774
3
citations
#10426

Bring Your Rear Cameras for Egocentric 3D Human Pose Estimation

Hiroyasu Akada, Jian Wang, Vladislav Golyanik et al.

ICCV 2025 • arXiv:2503.11652
3
citations
#10427

FGBench: A Dataset and Benchmark for Molecular Property Reasoning at Functional Group-Level in Large Language Models

Xuan Liu, Siru Ouyang, Xianrui Zhong et al.

NEURIPS 2025 • arXiv:2508.01055
3
citations
#10428

SC-Captioner: Improving Image Captioning with Self-Correction by Reinforcement Learning

Lin Zhang, Xianfang Zeng, Kangcong Li et al.

ICCV 2025 • arXiv:2508.06125
3
citations
#10429

Free-Lunch Color-Texture Disentanglement for Stylized Image Generation

Jiang Qin, Alexandra Gomez-Villa, Senmao Li et al.

NEURIPS 2025 • arXiv:2503.14275
3
citations
#10430

Hallucination at a Glance: Controlled Visual Edits and Fine-Grained Multimodal Learning

Tianyi Bai, Yuxuan Fan, Qiu Jiantao et al.

NEURIPS 2025 • arXiv:2506.07227
3
citations
#10431

BoltzNCE: Learning likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation

Rishal Aggarwal, Jacky Chen, Nicholas Boffi et al.

NEURIPS 2025 • arXiv:2507.00846
3
citations
#10432

CSD-VAR: Content-Style Decomposition in Visual Autoregressive Models

Quang-Binh Nguyen, Minh Luu, Quang Nguyen et al.

ICCV 2025 • arXiv:2507.13984
3
citations
#10433

Self-Reinforcing Prototype Evolution with Dual-Knowledge Cooperation for Semi-Supervised Lifelong Person Re-Identification

Kunlun Xu, Fan Zhuo, Jiangmeng Li et al.

ICCV 2025 • arXiv:2507.01884
3
citations
#10434

On the Coexistence and Ensembling of Watermarks

Aleksandar Petrov, Shruti Agarwal, Philip Torr et al.

NEURIPS 2025 • arXiv:2501.17356
3
citations
#10435

Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving

Zixian Guo, Ming Liu, Qilong Wang et al.

ICCV 2025
3
citations
#10436

SMMILE: An expert-driven benchmark for multimodal medical in-context learning

Melanie Rieff, Maya Varma, Ossian Rabow et al.

NEURIPS 2025 • arXiv:2506.21355
3
citations
#10437

Latent Mixture of Symmetries for Sample-Efficient Dynamic Learning

Haoran Li, Chenhan Xiao, Muhao Guo et al.

NEURIPS 2025 (oral) • arXiv:2510.03578
3
citations
#10438

LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding

Yuchen Ma, Dennis Frauen, Jonas Schweisthal et al.

NEURIPS 2025 • arXiv:2507.02843
3
citations
#10439

Graph Neural Networks for Edge Signals: Orientation Equivariance and Invariance

Dominik Fuchsgruber, Tim Postuvan, Stephan Günnemann et al.

ICLR 2025 • arXiv:2410.16935
3
citations
#10440

Dist Loss: Enhancing Regression in Few-Shot Region through Distribution Distance Constraint

Guangkun Nie, Gongzheng Tang, Shenda Hong

ICLR 2025 • arXiv:2411.15216
3
citations
#10441

Multi-focal Conditioned Latent Diffusion for Person Image Synthesis

Jiaqi Liu, Jichao Zhang, Paolo Rota et al.

CVPR 2025 • arXiv:2503.15686
3
citations
#10442

Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables

Yu Gui, Cong Ma, Zongming Ma

NEURIPS 2025 • arXiv:2505.12473
3
citations
#10443

R-KV: Redundancy-aware KV Cache Compression for Reasoning Models

Zefan Cai, Wen Xiao, Hanshi Sun et al.

NEURIPS 2025 • arXiv:2505.24133
3
citations
#10444

Invisible Backdoor Attack against Self-supervised Learning

Hanrong Zhang, Zhenting Wang, Boheng Li et al.

CVPR 2025 • arXiv:2405.14672
3
citations
#10445

Feel-Good Thompson Sampling for Contextual Bandits: a Markov Chain Monte Carlo Showdown

Emile Anand, Sarah Liaw

NEURIPS 2025 • arXiv:2507.15290
3
citations
#10446

Joint Diffusion Models in Continual Learning

Paweł Skierś, Kamil Deja

ICCV 2025 • arXiv:2411.08224
3
citations
#10447

GT-Loc: Unifying When and Where in Images through a Joint Embedding Space

David G. Shatwell, Ishan Rajendrakumar Dave, Swetha Sirnam et al.

ICCV 2025 • arXiv:2507.10473
3
citations
#10448

Credal Prediction based on Relative Likelihood

Timo Löhr, Paul Hofman, Felix Mohr et al.

NEURIPS 2025 (spotlight) • arXiv:2505.22332
3
citations
#10449

ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment

Chong Xia, Shengjun Zhang, Fangfu Liu et al.

ICCV 2025 • arXiv:2507.19058
3
citations
#10450

Is 'Right' Right? Enhancing Object Orientation Understanding in Multimodal Large Language Models through Egocentric Instruction Tuning

JiHyeok Jung, EunTae Kim, SeoYeon Kim et al.

CVPR 2025 • arXiv:2411.16761
3
citations
#10451

ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents

Zhenyu Zhang, Tianyi Chen, Weiran Xu et al.

NEURIPS 2025 • arXiv:2510.23822
3
citations
#10452

Trans-EnV: A Framework for Evaluating the Linguistic Robustness of LLMs Against English Varieties

Jiyoung Lee, Seungho Kim, Jieun Han et al.

NEURIPS 2025 • arXiv:2505.20875
3
citations
#10453

Demystifying Spectral Feature Learning for Instrumental Variable Regression

Dimitri Meunier, Antoine Moulin, Jakub Wornbard et al.

NEURIPS 2025 • arXiv:2506.10899
3
citations
#10454

ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges

Jiaxin Ai, Pengfei Zhou, Xu Pan et al.

ICCV 2025 • arXiv:2503.06553
3
citations
#10455

Efficient Preference-Based Reinforcement Learning: Randomized Exploration meets Experimental Design

Andreas Schlaginhaufen, Reda Ouhamma, Maryam Kamgarpour

NEURIPS 2025 • arXiv:2506.09508
3
citations
#10456

T1: A Tool-Oriented Conversational Dataset for Multi-Turn Agentic Planning

NEURIPS 2025 • arXiv:2505.16986
3
citations
#10457

Latte: Collaborative Test-Time Adaptation of Vision-Language Models in Federated Learning

Wenxuan Bao, Ruxi Deng, Ruizhong Qiu et al.

ICCV 2025 • arXiv:2507.21494
3
citations
#10458

AutoSSVH: Exploring Automated Frame Sampling for Efficient Self-Supervised Video Hashing

Niu Lian, Jun Li, Jinpeng Wang et al.

CVPR 2025 • arXiv:2504.03587
3
citations
#10459

Efficient Speech Language Modeling via Energy Distance in Continuous Latent Space

Zhengrui Ma, Yang Feng, Chenze Shao et al.

NEURIPS 2025 • arXiv:2505.13181
3
citations
#10460

SGFormer: Satellite-Ground Fusion for 3D Semantic Scene Completion

Xiyue Guo, Jiarui Hu, Junjie Hu et al.

CVPR 2025 • arXiv:2503.16825
3
citations
#10461

Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation

Jiaer Xia, Bingkui Tong, Yuhang Zang et al.

ICCV 2025 (highlight) • arXiv:2507.02859
3
citations
#10462

ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests

Shiyi Xu, Hu Yiwen, Yingqian Min et al.

NEURIPS 2025 • arXiv:2506.04894
3
citations
#10463

Offline Goal-conditioned Reinforcement Learning with Quasimetric Representations

Vivek Myers, Bill Zheng, Benjamin Eysenbach et al.

NEURIPS 2025 (oral) • arXiv:2509.20478
3
citations
#10464

Video Individual Counting for Moving Drones

Yaowu Fan, Jia Wan, Tao Han et al.

ICCV 2025 (highlight) • arXiv:2503.10701
3
citations
#10465

Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video

Xueyang Yu, Cheng Shi, Yang Wang et al.

NEURIPS 2025 • arXiv:2510.14560
3
citations
#10466

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Jiaben Chen, Zixin Wang, Ailing Zeng et al.

NEURIPS 2025 • arXiv:2510.07249
3
citations
#10467

SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models

Ye Sun, Hao Zhang, Henghui Ding et al.

NEURIPS 2025 (oral) • arXiv:2505.18812
3
citations
#10468

LIFT: Latent Implicit Functions for Task- and Data-Agnostic Encoding

Amirhossein Kazerouni, Soroush Mehraban, Michael Brudno et al.

ICCV 2025 • arXiv:2503.15420
3
citations
#10469

CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

Zhihang Liu, Chen-Wei Xie, Bin Wen et al.

NEURIPS 2025 • arXiv:2502.14914
3
citations
#10470

MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation

Sankalp Sinha, Mohammad Sadil Khan, Muhammad Usama et al.

CVPR 2025 • arXiv:2411.17945
3
citations
#10471

Moderating the Generalization of Score-based Generative Model

Wan Jiang, He Wang, Xin Zhang et al.

ICCV 2025 • arXiv:2412.07229
3
citations
#10472

Charm: The Missing Piece in ViT Fine-Tuning for Image Aesthetic Assessment

Fatemeh Behrad, Tinne Tuytelaars, Johan Wagemans

CVPR 2025 • arXiv:2504.02522
3
citations
#10473

A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision

Chensheng Peng, Ido Sobol, Masayoshi Tomizuka et al.

ICCV 2025 • arXiv:2412.00623
3
citations
#10474

GraSS: Scalable Data Attribution with Gradient Sparsification and Sparse Projection

Pingbang Hu, Joseph Melkonian, Weijing Tang et al.

NEURIPS 2025 • arXiv:2505.18976
3
citations
#10475

Rethinking Multimodal Learning from the Perspective of Mitigating Classification Ability Disproportion

Qing-Yuan Jiang, Longfei Huang, Yang Yang

NEURIPS 2025 (oral) • arXiv:2502.20120
3
citations
#10476

PAC Bench: Do Foundation Models Understand Prerequisites for Executing Manipulation Policies?

Atharva Gundawar, Som Sagar, Ransalu Senanayake

NEURIPS 2025 • arXiv:2506.23725
3
citations
#10477

Open-set Cross Modal Generalization via Multimodal Unified Representation

Hai Huang, Yan Xia, Shulei Wang et al.

ICCV 2025 • arXiv:2507.14935
3
citations
#10478

KL-Regularized RLHF with Multiple Reference Models: Exact Solutions and Sample Complexity

Gholamali Aminian, Amir R. Asadi, Idan Shenfeld et al.

NEURIPS 2025 • arXiv:2502.01203
3
citations
#10479

Normalization in Attention Dynamics

Nikita Karagodin, Shu Ge, Yury Polyanskiy et al.

NEURIPS 2025 • arXiv:2510.22026
3
citations
#10480

Who You Are Matters: Bridging Interests and Social Roles via LLM-Enhanced Logic Recommendation

Qing Yu, Xiaobei Wang, Shuchang Liu et al.

NEURIPS 2025 (oral)
3
citations
#10481

Adaptive Frontier Exploration on Graphs with Applications to Network-Based Disease Testing

XianJun, Davin Choo, Yuqi Pan, Tonghan Wang et al.

NEURIPS 2025 • arXiv:2505.21671
3
citations
#10482

On the creation of narrow AI: hierarchy and nonlocality of neural network skills

Eric Michaud, Asher Parker-Sartori, Max Tegmark

NEURIPS 2025 • arXiv:2505.15811
3
citations
#10483

Towards Straggler-Resilient Split Federated Learning: An Unbalanced Update Approach

Dandan Liang, Jianing Zhang, Evan Chen et al.

NEURIPS 2025 • arXiv:2510.21155
3
citations
#10484

PETRA: Parallel End-to-end Training with Reversible Architectures

Stéphane Rivaud, Louis Fournier, Thomas Pumir et al.

ICLR 2025 • arXiv:2406.02052
3
citations
#10485

Who is a Better Talker: Subjective and Objective Quality Assessment for AI-Generated Talking Heads

Yingjie Zhou, Jiezhang Cao, Zicheng Zhang et al.

ICCV 2025 • arXiv:2507.23343
3
citations
#10486

Flatness is Necessary, Neural Collapse is Not: Rethinking Generalization via Grokking

Ting Han, Linara Adilova, Henning Petzka et al.

NEURIPS 2025 (oral) • arXiv:2509.17738
3
citations
#10487

Seeing the Abstract: Translating the Abstract Language for Vision Language Models

Davide Talon, Federico Girella, Ziyue Liu et al.

CVPR 2025 • arXiv:2505.03242
3
citations
#10488

A Statistical Theory of Contrastive Learning via Approximate Sufficient Statistics

Licong Lin, Song Mei

NEURIPS 2025 • arXiv:2503.17538
3
citations
#10489

VolumetricSMPL: A Neural Volumetric Body Model for Efficient Interactions, Contacts, and Collisions

Marko Mihajlovic, Siwei Zhang, Gen Li et al.

ICCV 2025 (highlight) • arXiv:2506.23236
3
citations
#10490

Hierarchical Retrieval: The Geometry and a Pretrain-Finetune Recipe

Chong You, Rajesh Jayaram, Ananda Theertha Suresh et al.

NEURIPS 2025 • arXiv:2509.16411
3
citations
#10491

Balanced Conic Rectified Flow

Kim Shin seong, Mingi Kwon, Jaeseok Jeong et al.

NEURIPS 2025 • arXiv:2510.25229
3
citations
#10492

Causal Head Gating: A Framework for Interpreting Roles of Attention Heads in Transformers

Andrew Nam, Henry Conklin, Yukang Yang et al.

NEURIPS 2025 • arXiv:2505.13737
3
citations
#10493

EA-KD: Entropy-based Adaptive Knowledge Distillation

Chi-Ping Su, Ching-Hsun Tseng, Bin Pu et al.

ICCV 2025 • arXiv:2311.13621
3
citations
#10494

Efficient Parametric SVD of Koopman Operator for Stochastic Dynamical Systems

Minchan Jeong, Jongha (Jon) Ryu, Se-Young Yun et al.

NEURIPS 2025 • arXiv:2507.07222
3
citations
#10495

AutoDiscovery: Open-ended Scientific Discovery via Bayesian Surprise

Dhruv Agarwal, Bodhisattwa Prasad Majumder, Reece Adamson et al.

NEURIPS 2025 (oral) • arXiv:2507.00310
3
citations
#10496

CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays

Hyungyung Lee, Geon Choi, Jung-Oh Lee et al.

NEURIPS 2025 (spotlight) • arXiv:2505.18087
3
citations
#10497

MuHBoost: Multi-Label Boosting For Practical Longitudinal Human Behavior Modeling

Nguyen Thach, Patrick Habecker, Anika Eisenbraun et al.

ICLR 2025
3
citations
#10498

Learning Sparse Approximate Inverse Preconditioners for Conjugate Gradient Solvers on GPUs

Zhehao Li, Zhehao Li, Kangbo Lyu et al.

NEURIPS 2025 • arXiv:2510.27517
3
citations
#10499

Reinforced Context Order Recovery for Adaptive Reasoning and Planning

Long Ma, Fangwei Zhong, Yizhou Wang

NEURIPS 2025 • arXiv:2508.13070
3
citations
#10500

I Am Big, You Are Little; I Am Right, You Are Wrong

David A Kelly, Akchunya Chanchal, Nathan Blake

ICCV 2025 • arXiv:2507.23509
3
citations
#10501

HCRMP: An LLM-Hinted Contextual Reinforcement Learning Framework for Autonomous Driving

Zhiwen Chen, Hanming Deng, Zhuoren Li et al.

NEURIPS 2025 • arXiv:2505.15793
3
citations
#10502

SparseDiT: Token Sparsification for Efficient Diffusion Transformer

Shuning Chang, Pichao Wang, Jiasheng Tang et al.

NEURIPS 2025 (oral) • arXiv:2412.06028
3
citations
#10503

GeoComplete: Geometry-Aware Diffusion for Reference-Driven Image Completion

Beibei Lin, Tingting Chen, Robby Tan

NEURIPS 2025 • arXiv:2510.03110
3
citations
#10504

Non-stationary Bandit Convex Optimization: A Comprehensive Study

Xiaoqi Liu, Dorian Baudry, Julian Zimmert et al.

NEURIPS 2025 • arXiv:2506.02980
3
citations
#10505

Martian World Model: Controllable Video Synthesis with Physically Accurate 3D Reconstructions

Longfei Li, Zhiwen Fan, Wenyan Cong et al.

NEURIPS 2025 • arXiv:2507.07978
3
citations
#10506

From Sequence to Structure: Uncovering Substructure Reasoning in Transformers

Xinnan Dai, Kai Yang, Jay Revolinsky et al.

NEURIPS 2025 • arXiv:2507.10435
3
citations
#10507

DiTaiListener: Controllable High Fidelity Listener Video Generation with Diffusion

Maksim Siniukov, Di Chang, Minh Tran et al.

ICCV 2025 • arXiv:2504.04010
3
citations
#10508

EA-Vit: Efficient Adaptation for Elastic Vision Transformer

Chen Zhu, Wangbo Zhao, Huiwen Zhang et al.

ICCV 2025 • arXiv:2507.19360
3
citations
#10509

Generate, Refine, and Encode: Leveraging Synthesized Novel Samples for On-the-Fly Fine-Grained Category Discovery

Xiao Liu, Nan Pu, Haiyang Zheng et al.

ICCV 2025 • arXiv:2507.04051
3
citations
#10510

Tiled Diffusion

Or Madar, Ohad Fried

CVPR 2025 • arXiv:2412.15185
3
citations
#10511

Memory-Efficient 4-bit Preconditioned Stochastic Optimization

Jingyang Li, Kuangyu Ding, Kim-chuan Toh et al.

ICCV 2025 • arXiv:2412.10663
3
citations
#10512

Taming generative video models for zero-shot optical flow extraction

Seungwoo Kim, Khai Loong Aw, Klemen Kotar et al.

NEURIPS 2025 (oral) • arXiv:2507.09082
3
citations
#10513

Associative Transformer

Yuwei Sun, Hideya Ochiai, Zhirong Wu et al.

CVPR 2025 • arXiv:2309.12862
3
citations
#10514

On the Surprising Effectiveness of Large Learning Rates under Standard Width Scaling

Moritz Haas, Sebastian Bordt, Ulrike Luxburg et al.

NEURIPS 2025 (spotlight) • arXiv:2505.22491
3
citations
#10515

Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration

Yuyang Hu, Kangfu Mei, Mojtaba Ardakani et al.

NEURIPS 2025 • arXiv:2507.05604
3
citations
#10516

AdaDrive: Self-Adaptive Slow-Fast System for Language-Grounded Autonomous Driving

Ruifei Zhang, Junlin Xie, Wei Zhang et al.

ICCV 2025 • arXiv:2511.06253
3
citations
#10517

PRISM: Privacy-Preserving Improved Stochastic Masking for Federated Generative Models

Kyeongkook Seo, Dong-Jun Han, Jaejun Yoo

ICLR 2025 • arXiv:2503.08085
3
citations
#10518

CodeCrash: Exposing LLM Fragility to Misleading Natural Language in Code Reasoning

Man Ho Lam, Chaozheng Wang, Jen-Tse Huang et al.

NEURIPS 2025 • arXiv:2504.14119
3
citations
#10519

Extending Foundational Monocular Depth Estimators to Fisheye Cameras with Calibration Tokens

Suchisrit Gangopadhyay, Jung Hee Kim, Xien Chen et al.

ICCV 2025 • arXiv:2508.04928
3
citations
#10520

HybridMQA: Exploring Geometry-Texture Interactions for Colored Mesh Quality Assessment

Armin Shafiee Sarvestani, Sheyang Tang, Zhou Wang

CVPR 2025 • arXiv:2412.01986
3
citations
#10521

Improved Regret Bounds for Gaussian Process Upper Confidence Bound in Bayesian Optimization

Shogo Iwazaki

NEURIPS 2025 (oral) • arXiv:2506.01393
3
citations
#10522

AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models

Kwan Yun, Seokhyeon Hong, Chaelin Kim et al.

CVPR 2025 • arXiv:2503.08417
3
citations
#10523

Graph-based Document Structure Analysis

Yufan Chen, Ruiping Liu, Junwei Zheng et al.

ICLR 2025 • arXiv:2502.02501
3
citations
#10524

LocDiff: Identifying Locations on Earth by Diffusing in the Hilbert Space

Zhangyu Wang, Zeping Liu, Jielu Zhang et al.

NEURIPS 2025 • arXiv:2503.18142
3
citations
#10525

Towards Fully FP8 GEMM LLM Training at Scale

Alejandro Hernández Cano, Dhia Garbaya, Imanol Schlag et al.

NEURIPS 2025 • arXiv:2505.20524
3
citations
#10526

On the Robustness of Transformers against Context Hijacking for Linear Classification

Tianle Li, Chenyang Zhang, Xingwu Chen et al.

NEURIPS 2025 • arXiv:2502.15609
3
citations
#10527

RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers

Ahmet Berke Gökmen, Yiğit Ekin, Bahri Batuhan Bilecen et al.

NEURIPS 2025 • arXiv:2505.13344
3
citations
#10528

Synthesize Privacy-Preserving High-Resolution Images via Private Textual Intermediaries

Haoxiang Wang, Zinan Lin, Da Yu et al.

NEURIPS 2025 • arXiv:2506.07555
3
citations
#10529

Generalization Guarantees for Representation Learning via Data-Dependent Gaussian Mixture Priors

Milad Sefidgaran, Abdellatif Zaidi, Piotr Krasnowski

ICLR 2025 • arXiv:2502.15540
3
citations
#10530

Improving the Euclidean Diffusion Generation of Manifold Data by Mitigating Score Function Singularity

Zichen Liu, Wei Zhang, Tiejun Li

NEURIPS 2025 • arXiv:2505.09922
3
citations
#10531

Towards General Modality Translation with Contrastive and Predictive Latent Diffusion Bridge

Nimrod Berman, Omkar Joglekar, Eitan Kosman et al.

NEURIPS 2025 • arXiv:2510.20819
3
citations
#10532

World-aware Planning Narratives Enhance Large Vision-Language Model Planner

Junhao Shi, Zhaoye Fei, Siyin Wang et al.

NEURIPS 2025 • arXiv:2506.21230
3
citations
#10533

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Zhiyuan Liang, Dongwen Tang, Yuhao Zhou et al.

NEURIPS 2025 • arXiv:2506.16406
3
citations
#10534

ODHSR: Online Dense 3D Reconstruction of Humans and Scenes from Monocular Videos

Zetong Zhang, Manuel Kaufmann, Lixin Xue et al.

CVPR 2025 • arXiv:2504.13167
3
citations
#10535

Probably Approximately Precision and Recall Learning

Lee Cohen, Yishay Mansour, Shay Moran et al.

NEURIPS 2025 • arXiv:2411.13029
3
citations
#10536

SCOUT: Teaching Pre-trained Language Models to Enhance Reasoning via Flow Chain-of-Thought

Guanghao Li, Wenhao Jiang, Mingfeng Chen et al.

NEURIPS 2025 • arXiv:2505.24181
3
citations
#10537

ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail

Chandan Yeshwanth, David Rozenberszki, Angela Dai

ICCV 2025 • arXiv:2503.17044
3
citations
#10538

LLM-Explorer: A Plug-in Reinforcement Learning Policy Exploration Enhancement Driven by Large Language Models

Qianyue Hao, Yiwen Song, Qingmin Liao et al.

NEURIPS 2025 (spotlight) • arXiv:2505.15293
3
citations
#10539

ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction

Danhui Chen, Ziquan Liu, Chuxi Yang et al.

ICCV 2025 • arXiv:2507.15803
3
citations
#10540

Scaling Language-centric Omnimodal Representation Learning

Chenghao Xiao, Hou Pong (Ken) Chan, Hao Zhang et al.

NEURIPS 2025 • arXiv:2510.11693
3
citations
#10541

FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning

Qian Feng, Jiahang Tu, Mintong Kang et al.

ICCV 2025 • arXiv:2601.13578
3
citations
#10542

Human-assisted Robotic Policy Refinement via Action Preference Optimization

Wenke Xia, Yichu Yang, Hongtao Wu et al.

NEURIPS 2025 • arXiv:2506.07127
3
citations
#10543

Underwater Visual SLAM with Depth Uncertainty and Medium Modeling

Rui Liu, Sheng Fan, Wenguan Wang et al.

ICCV 2025 (highlight)
3
citations
#10544

Low Rank Gradients and Where to Find Them

Rishi Sonthalia, Michael Murray, Guido Montufar

NEURIPS 2025 • arXiv:2510.01303
3
citations
#10545

EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions

Xiaorui Wu, Fei Li, Xiaofeng Mao et al.

NEURIPS 2025 • arXiv:2505.23473
3
citations
#10546

Efficient Policy Optimization in Robust Constrained MDPs with Iteration Complexity Guarantees

Sourav Ganguly, Kishan Panaganti, Arnob Ghosh et al.

NEURIPS 2025 • arXiv:2505.19238
3
citations
#10547

ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding

Qihang Peng, Henry Zheng, Gao Huang

CVPR 2025 • arXiv:2502.19247
3
citations
#10548

Conditional Balance: Improving Multi-Conditioning Trade-Offs in Image Generation

Nadav Z. Cohen, Oron Nir, Ariel Shamir

CVPR 2025 • arXiv:2412.19853
3
citations
#10549

AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees

Hongyi Zhou, Jin Zhu, Pingfan Su et al.

NEURIPS 2025 • arXiv:2510.01268
3
citations
#10550

Silencer: From Discovery to Mitigation of Self-Bias in LLM-as-Benchmark-Generator

Peiwen Yuan, Yiwei Li, Shaoxiong Feng et al.

NEURIPS 2025 • arXiv:2505.20738
3
citations
#10551

A Unified Framework for the Transportability of Population-Level Causal Measures

Ahmed Boughdiri, Clément Berenfeld, Julie Josse et al.

NEURIPS 2025 • arXiv:2505.13104
3
citations
#10552

Visual Modality Prompt for Adapting Vision-Language Object Detectors

Heitor Rapela Medeiros, Atif Belal, Srikanth Muralidharan et al.

ICCV 2025 • arXiv:2412.00622
3
citations
#10553

Efficient Unsupervised Shortcut Learning Detection and Mitigation in Transformers

Lukas Kuhn, Sari Sadiya, Jörg Schlötterer et al.

ICCV 2025 • arXiv:2501.00942
3
citations
#10554

Generalization Bounds for Canonicalization: A Comparative Study with Group Averaging

Behrooz Tahmasebi, Stefanie Jegelka

ICLR 2025
3
citations
#10555

Scaling Laws for Robust Comparison of Open Foundation Language-Vision Models and Datasets

Marianna Nezhurina, Tomer Porian, Giovanni Puccetti et al.

NEURIPS 2025 • arXiv:2506.04598
3
citations
#10556

Quantifying Cross-Modality Memorization in Vision-Language Models

Yuxin Wen, Yangsibo Huang, Tom Goldstein et al.

NEURIPS 2025 • arXiv:2506.05198
3
citations
#10557

Learning from positive and unlabeled examples - Finite size sample bounds

Farnam Mansouri, Shai Ben-David

NEURIPS 2025 • arXiv:2507.07354
3
citations
#10558

Mitigating Ambiguities in 3D Classification with Gaussian Splatting

Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.

CVPR 2025 • arXiv:2503.08352
3
citations
#10559

I2-NeRF: Learning Neural Radiance Fields Under Physically-Grounded Media Interactions

Shuhong Liu, Lin Gu, Ziteng Cui et al.

NEURIPS 2025 • arXiv:2510.22161
3
citations
#10560

AVCD: Mitigating Hallucinations in Audio-Visual Large Language Models through Contrastive Decoding

Chaeyoung Jung, Youngjoon Jang, Joon Son Chung

NEURIPS 2025 • arXiv:2505.20862
3
citations
#10561

Attention! Your Vision Language Model Could Be Maliciously Manipulated

Xiaosen Wang, Shaokang Wang, Zhijin Ge et al.

NEURIPS 2025 • arXiv:2505.19911
3
citations
#10562

Statistical Tractability of Off-policy Evaluation of History-dependent Policies in POMDPs

Yuheng Zhang, Nan Jiang

ICLR 2025 • arXiv:2503.01134
3
citations
#10563

ReDi: Rectified Discrete Flow

Jaehoon Yoo, Wonjung Kim, Seunghoon Hong

NEURIPS 2025 • arXiv:2507.15897
3
citations
#10564

Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models

Taha Entesari, Arman Hatami, Rinat Khaziev et al.

NEURIPS 2025 • arXiv:2506.05314
3
citations
#10565

BTL-UI: Blink-Think-Link Reasoning Model for GUI Agent

Shaojie Zhang, Ruoceng Zhang, Pei Fu et al.

NEURIPS 2025 • arXiv:2509.15566
3
citations
#10566

ElasticMM: Efficient Multimodal LLMs Serving with Elastic Multimodal Parallelism

Zedong Liu, Shenggan Cheng, Guangming Tan et al.

NEURIPS 2025 (oral) • arXiv:2507.10069
3
citations
#10567

Diffusion Models and the Manifold Hypothesis: Log-Domain Smoothing is Geometry Adaptive

Tyler Farghly, Peter Potaptchik, Samuel Howard et al.

NEURIPS 2025 • arXiv:2510.02305
3
citations
#10568

Bisimulation Metric for Model Predictive Control

Yutaka Shimizu, Masayoshi Tomizuka

ICLR 2025 • arXiv:2410.04553
3
citations
#10569

SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning

Weijian Mai, Jiamin Wu, Yu Zhu et al.

NEURIPS 2025 • arXiv:2508.10298
3
citations
#10570

Generalization Error Analysis for Selective State-Space Models Through the Lens of Attention

Arya Honarpisheh, Mustafa Bozdag, Octavia Camps et al.

NEURIPS 2025 • arXiv:2502.01473
3
citations
#10571

Who Reasons in the Large Language Models?

Jie Shao, Jianxin Wu

NEURIPS 2025 • arXiv:2505.20993
3
citations
#10572

RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction

Yufeng Zhong, Chengjian Feng, Feng Yan et al.

ICCV 2025 • arXiv:2503.18525
3
citations
#10573

BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset

Zhiheng Xi, Guanyu Li, Yutao Fan et al.

NEURIPS 2025 • arXiv:2507.03483
3
citations
#10574

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Tongyao Zhu, Qian Liu, Haonan Wang et al.

NEURIPS 2025 • arXiv:2503.15450
3
citations
#10575

VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding

Minchao Jiang, Shunyu Jia, Jiaming Gu et al.

ICCV 2025 • arXiv:2506.22799
3
citations
#10576

ATLAS: Decoupling Skeletal and Shape Parameters for Expressive Parametric Human Modeling

Jinhyung Park, Javier Romero, Shunsuke Saito et al.

ICCV 2025 • arXiv:2508.15767
3
citations
#10577

VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models

Haichao Zhang, Yun Fu

NEURIPS 2025 (oral) • arXiv:2503.16980
3
citations
#10578

On the Generalization of Representation Uncertainty in Earth Observation

Spyros Kondylatos, Nikolaos Ioannis Bountos, Dimitrios Michail et al.

ICCV 2025 • arXiv:2503.07082
3
citations
#10579

FRESA: Feedforward Reconstruction of Personalized Skinned Avatars from Few Images

Rong Wang, Fabian Prada, Ziyan Wang et al.

CVPR 2025 (highlight) • arXiv:2503.19207
3
citations
#10580

E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models

Jiaheng Dong, Hong Jia, Soumyajit Chatterjee et al.

NEURIPS 2025 • arXiv:2506.07078
3
citations
#10581

Predict-Optimize-Distill: A Self-Improving Cycle for 4D Object Understanding

Mingxuan Wu, Huang Huang, Justin Kerr et al.

ICCV 2025 • arXiv:2504.17441
3
citations
#10582

Distribution-Free Data Uncertainty for Neural Network Regression

Domokos M. Kelen, Ádám Jung, Péter Kersch et al.

ICLR 2025
3
citations
#10583

Learning (Approximately) Equivariant Networks via Constrained Optimization

Andrei Manolache, Luiz Chamon, Mathias Niepert

NEURIPS 2025 (oral) • arXiv:2505.13631
3
citations
#10584

Jigsaw++: Imagining Complete Shape Priors for Object Reassembly

Jiaxin Lu, Gang Hua, Qixing Huang

ICCV 2025 • arXiv:2410.11816
3
citations
#10585

Language Modeling by Language Models

Junyan Cheng, Peter Clark, Kyle Richardson

NEURIPS 2025 (spotlight) • arXiv:2506.20249
3
citations
#10586

Second-Order Convergence in Private Stochastic Non-Convex Optimization

Youming Tao, Zuyuan Zhang, Dongxiao Yu et al.

NEURIPS 2025 • arXiv:2505.15647
3
citations
#10587

Disentangled Clothed Avatar Generation with Layered Representation

Weitian Zhang, Yichao Yan, Sijing Wu et al.

ICCV 2025 (highlight) • arXiv:2501.04631
3
citations
#10588

Mixture-of-Experts Meets In-Context Reinforcement Learning

Wenhao Wu, Fuhong Liu, Haoru Li et al.

NEURIPS 2025 • arXiv:2506.05426
3
citations
#10589

HoliGS: Holistic Gaussian Splatting for Embodied View Synthesis

Xiaoyuan Wang, Yizhou Zhao, Botao Ye et al.

NEURIPS 2025 • arXiv:2506.19291
3
citations
#10590

ResQ: A Novel Framework to Implement Residual Neural Networks on Analog Rydberg Atom Quantum Computers

Nicholas DiBrita, Jason Han, Tirthak Patel

ICCV 2025 • arXiv:2506.21537
3
citations
#10591

Contrastive Representations for Temporal Reasoning

Alicja Ziarko, Michał Bortkiewicz, Michał Zawalski et al.

NEURIPS 2025 (oral) • arXiv:2508.13113
3
citations
#10592

Emergence of Linear Truth Encodings in Language Models

Shauli Ravfogel, Gilad Yehudai, Tal Linzen et al.

NEURIPS 2025 • arXiv:2510.15804
3
citations
#10593

Monitoring Risks in Test-Time Adaptation

Mona Schirmer, Metod Jazbec, Christian Andersson Naesseth et al.

NEURIPS 2025 • arXiv:2507.08721
3
citations
#10594

Color Matching Using Hypernetwork-Based Kolmogorov-Arnold Networks

Artem Nikonorov, Georgy Perevozchikov, Andrei Korepanov et al.

ICCV 2025 • arXiv:2503.11781
3
citations
#10595

Learning to price with resource constraints: from full information to machine-learned prices

Ruicheng Ao, Jiashuo Jiang, David Simchi-Levi

NEURIPS 2025 • arXiv:2501.14155
3
citations
#10596

Avoiding exp(R) scaling in RLHF through Preference-based Exploration

Mingyu Chen, Yiding Chen, Wen Sun et al.

NEURIPS 2025
3
citations
#10597

Hierarchical-aware Orthogonal Disentanglement Framework for Fine-grained Skeleton-based Action Recognition

Haochen Chang, Pengfei Ren, Haoyang Zhang et al.

ICCV 2025
3
citations
#10598

GyroSwin: 5D Surrogates for Gyrokinetic Plasma Turbulence Simulations

Fabian Paischer, Gianluca Galletti, William Hornsby et al.

NEURIPS 2025 • arXiv:2510.07314
3
citations
#10599

An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models

Binxu Wang, Cengiz Pehlevan

NEURIPS 2025 (spotlight) • arXiv:2503.03206
3
citations
#10600

More of the Same: Persistent Representational Harms Under Increased Representation

Jennifer Mickel, Maria De-Arteaga, Liu Leqi et al.

NEURIPS 2025 • arXiv:2503.00333
3
citations