Most Cited 2025 "margin penalty" Papers

22,274 papers found • Page 41 of 112

#8001

NightAdapter: Learning a Frequency Adapter for Generalizable Night-time Scene Segmentation

Qi Bi, Jingjun Yi, Huimin Huang et al.

CVPR 2025
5
citations
#8002

MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images

Aniruddha Ganguly, Debolina Chatterjee, Wentao Huang et al.

CVPR 2025 • arXiv:2412.02601
5
citations
#8003

3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement

Yihang Luo, Shangchen Zhou, Yushi Lan et al.

CVPR 2025 • arXiv:2412.18565
5
citations
#8004

ABC-Former: Auxiliary Bimodal Cross-domain Transformer with Interactive Channel Attention for White Balance

Yu-Cheng Chiu, Guan-Rong Chen, Zihao Chen et al.

CVPR 2025
5
citations
#8005

Vid-SME: Membership Inference Attacks against Large Video Understanding Models

Qi Li, Runpeng Yu, Xinchao Wang

NEURIPS 2025 • oral • arXiv:2506.03179
5
citations
#8006

BUFFER-X: Towards Zero-Shot Point Cloud Registration in Diverse Scenes

Minkyun Seo, Hyungtae Lim, Kanghee Lee et al.

ICCV 2025 • highlight • arXiv:2503.07940
5
citations
#8007

PanoWan: Lifting Diffusion Video Generation Models to 360$^\circ$ with Latitude/Longitude-aware Mechanisms

Yifei Xia, Shuchen Weng, Siqi Yang et al.

NEURIPS 2025
5
citations
#8008

Improved Balanced Classification with Theoretically Grounded Loss Functions

Corinna Cortes, Mehryar Mohri, Yutao Zhong

NEURIPS 2025 • arXiv:2512.23947
5
citations
#8009

Unraveling the Effects of Synthetic Data on End-to-End Autonomous Driving

Junhao Ge, Zuhong Liu, Longteng Fan et al.

ICCV 2025 • arXiv:2503.18108
5
citations
#8010

VisNumBench: Evaluating Number Sense of Multimodal Large Language Models

Tengjin Weng, Jingyi Wang, Wenhao Jiang et al.

ICCV 2025 • arXiv:2503.14939
5
citations
#8011

Spiral: Semantic-Aware Progressive LiDAR Scene Generation and Understanding

Dekai Zhu, Yixuan Hu, Youquan Liu et al.

NEURIPS 2025 • arXiv:2505.22643
5
citations
#8012

BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence

Xuewu Lin, Tianwei Lin, Alan Huang et al.

CVPR 2025 • arXiv:2411.14869
5
citations
#8013

DecoupledGaussian: Object-Scene Decoupling for Physics-Based Interaction

Miaowei Wang, Yibo Zhang, Rui Ma et al.

CVPR 2025 • arXiv:2503.05484
5
citations
#8014

IAAO: Interactive Affordance Learning for Articulated Objects in 3D Environments

Can Zhang, Gim Hee Lee

CVPR 2025 • arXiv:2504.06827
5
citations
#8015

MagicHOI: Leveraging 3D Priors for Accurate Hand-object Reconstruction from Short Monocular Video Clips

Shibo Wang, Haonan He, Maria Parelli et al.

ICCV 2025 • arXiv:2508.05506
5
citations
#8016

Deterministic Image-to-Image Translation via Denoising Brownian Bridge Models with Dual Approximators

Bohan Xiao, Peiyong Wang, Qisheng He et al.

CVPR 2025 • arXiv:2512.23463
5
citations
#8017

MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation

Kaixing Yang, Xulong Tang, Ziqiao Peng et al.

NEURIPS 2025 • arXiv:2505.17543
5
citations
#8018

Scendi Score: Prompt‑Aware Diversity Evaluation via Schur Complement of CLIP Embeddings

Azim Ospanov, Mohammad Jalali, Farzan Farnia

ICCV 2025 • highlight • arXiv:2412.18645
5
citations
#8019

BEDLAM2.0: Synthetic humans and cameras in motion

Joachim Tesch, Giorgio Becherini, Prerana Achar et al.

NEURIPS 2025 • oral • arXiv:2511.14394
5
citations
#8020

Constrained Optimization From a Control Perspective via Feedback Linearization

Runyu Zhang, Arvind Raghunathan, Jeff Shamma et al.

NEURIPS 2025 • arXiv:2503.12665
5
citations
#8021

MEMOIR: Lifelong Model Editing with Minimal Overwrite and Informed Retention for LLMs

Ke Wang, Yiming Qin, Nikolaos Dimitriadis et al.

NEURIPS 2025 • arXiv:2506.07899
5
citations
#8022

The Fluorescent Veil: A Stealthy and Effective Physical Adversarial Patch Against Traffic Sign Recognition

Shuai Yuan, Xingshuo Han, Hongwei Li et al.

NEURIPS 2025 • arXiv:2409.12394
5
citations
#8023

Learning to Highlight Audio by Watching Movies

Chao Huang, Ruohan Gao, J. M. F. Tsang et al.

CVPR 2025 • arXiv:2505.12154
5
citations
#8024

MaterialRefGS: Reflective Gaussian Splatting with Multi-view Consistent Material Inference

Wenyuan Zhang, Jimin Tang, Weiqi Zhang et al.

NEURIPS 2025 • arXiv:2510.11387
5
citations
#8025

Modeling Thousands of Human Annotators for Generalizable Text-to-Image Person Re-identification

Jiayu Jiang, Changxing Ding, Wentao Tan et al.

CVPR 2025 • highlight • arXiv:2503.09962
5
citations
#8026

Flow Matching-Based Autonomous Driving Planning with Advanced Interactive Behavior Modeling

Tianyi Tan, Yinan Zheng, Ruiming Liang et al.

NEURIPS 2025 • oral • arXiv:2510.11083
5
citations
#8027

Dynamic Dictionary Learning for Remote Sensing Image Segmentation

Xuechao Zou, Yue Li, Shun Zhang et al.

ICCV 2025 • arXiv:2503.06683
5
citations
#8028

WorldWeaver: Generating Long-Horizon Video Worlds via Rich Perception

Zhiheng Liu, Xueqing Deng, Shoufa Chen et al.

NEURIPS 2025 • oral • arXiv:2508.15720
5
citations
#8029

Context-Enhanced Memory-Refined Transformer for Online Action Detection

Zhanzhong Pang, Fadime Sener, Angela Yao

CVPR 2025 • arXiv:2503.18359
5
citations
#8030

Toward Robust Neural Reconstruction from Sparse Point Sets

Amine Ouasfi, Shubhendu Jena, Eric Marchand et al.

CVPR 2025 • arXiv:2412.16361
5
citations
#8031

Chain-of-Action: Trajectory Autoregressive Modeling for Robotic Manipulation

Wenbo Zhang, Tianrun Hu, Hanbo Zhang et al.

NEURIPS 2025 • oral • arXiv:2506.09990
5
citations
#8032

BecomingLit: Relightable Gaussian Avatars with Hybrid Neural Shading

Jonathan Schmidt, Simon Giebenhain, Matthias Niessner

NEURIPS 2025 • arXiv:2506.06271
5
citations
#8033

ETCH: Generalizing Body Fitting to Clothed Humans via Equivariant Tightness

Boqian Li, Zeyu Cai, Michael Black et al.

ICCV 2025 • highlight • arXiv:2503.10624
5
citations
#8034

Glocal Information Bottleneck for Time Series Imputation

Jie Yang, Kexin Zhang, Guibin Zhang et al.

NEURIPS 2025 • oral • arXiv:2510.04910
5
citations
#8035

Understanding and Rectifying Safety Perception Distortion in VLMs

Xiaohan Zou, Jian Kang, George Kesidis et al.

NEURIPS 2025 • arXiv:2502.13095
5
citations
#8036

HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories

Eric Hedlin, Munawar Hayat, Fatih Porikli et al.

CVPR 2025 • arXiv:2412.17040
5
citations
#8037

Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2025
5
citations
#8038

ExGra-Med: Extended Context Graph Alignment for Medical Vision-Language Models

Duy M. H. Nguyen, Nghiem Diep, Trung Nguyen et al.

NEURIPS 2025 • arXiv:2410.02615
5
citations
#8039

HyperLoRA: Parameter-Efficient Adaptive Generation for Portrait Synthesis

Mengtian Li, Jinshu Chen, Wanquan Feng et al.

CVPR 2025 • highlight • arXiv:2503.16944
5
citations
#8040

L-MTP: Leap Multi-Token Prediction Beyond Adjacent Context for Large Language Models

Xiaohao Liu, Xiaobo Xia, Weixiang Zhao et al.

NEURIPS 2025 • arXiv:2505.17505
5
citations
#8041

Zero-Shot 4D Lidar Panoptic Segmentation

Yushan Zhang, Aljoša Ošep, Laura Leal-Taixe et al.

CVPR 2025 • arXiv:2504.00848
5
citations
#8042

Precise Action-to-Video Generation Through Visual Action Prompts

Yuang Wang, Chao Wen, Haoyu Guo et al.

ICCV 2025 • arXiv:2508.13104
5
citations
#8043

Let Me Think! A Long Chain of Thought Can Be Worth Exponentially Many Short Ones

Parsa Mirtaheri, Ezra Edelman, Samy Jelassi et al.

NEURIPS 2025 • arXiv:2505.21825
5
citations
#8044

The Structural Complexity of Matrix-Vector Multiplication

Emile Anand, Jan van den Brand, Rose McCarty

NEURIPS 2025 • arXiv:2502.21240
5
citations
#8045

Improving Transferable Targeted Attacks with Feature Tuning Mixup

Kaisheng Liang, Xuelong Dai, Yanjie Li et al.

CVPR 2025 • arXiv:2411.15553
5
citations
#8046

The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation

Ruoyu Wang, Huayang Huang, Ye Zhu et al.

ICCV 2025 • highlight • arXiv:2412.05101
5
citations
#8047

VeriThoughts: Enabling Automated Verilog Code Generation using Reasoning and Formal Verification

Patrick Yubeaton, Andre Nakkab, Weihua Xiao et al.

NEURIPS 2025 • arXiv:2505.20302
5
citations
#8048

Transformer brain encoders explain human high-level visual responses

Hossein Adeli, Sun Minni, Nikolaus Kriegeskorte

NEURIPS 2025 • spotlight • arXiv:2505.17329
5
citations
#8049

Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis

Junyan Ye, Jun He, Weijia Li et al.

ICCV 2025 • arXiv:2408.01812
5
citations
#8050

Unleashing Diffusion Transformers for Visual Correspondence by Modulating Massive Activations

Chaofan Gan, Yuanpeng Tu, Xi Chen et al.

NEURIPS 2025 • arXiv:2505.18584
5
citations
#8051

Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs

Kejia Zhang, Keda Tao, Jiasheng Tang et al.

NEURIPS 2025 • arXiv:2501.19164
5
citations
#8052

Uni-Renderer: Unifying Rendering and Inverse Rendering Via Dual Stream Diffusion

ZhiFei Chen, Tianshuo Xu, Wenhang Ge et al.

CVPR 2025 • arXiv:2412.15050
5
citations
#8053

When Are Concepts Erased From Diffusion Models?

Kevin Lu, Nicky Kriplani, Rohit Gandikota et al.

NEURIPS 2025 • arXiv:2505.17013
5
citations
#8054

FedVLA: Federated Vision-Language-Action Learning with Dual Gating Mixture-of-Experts for Robotic Manipulation

Cui Miao, Tao Chang, Meihan Wu et al.

ICCV 2025 • arXiv:2508.02190
5
citations
#8055

FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads

Weijie Lyu, Yi Zhou, Ming-Hsuan Yang et al.

ICCV 2025 • arXiv:2412.17812
5
citations
#8056

AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play

Ran Xu, Yuchen Zhuang, Zihan Dong et al.

NEURIPS 2025 • spotlight • arXiv:2509.24193
5
citations
#8057

Information Density Principle for MLLM Benchmarks

Chunyi Li, Xiaozhe Li, Zicheng Zhang et al.

ICCV 2025 • arXiv:2503.10079
5
citations
#8058

Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy

Junhao Wei, Yu Zhe, Jun Sakuma

ICCV 2025 • arXiv:2503.07661
5
citations
#8059

DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation

Mu Chen, Liulei Li, Wenguan Wang et al.

CVPR 2025 • arXiv:2503.13957
5
citations
#8060

RAGNet: Large-scale Reasoning-based Affordance Segmentation Benchmark towards General Grasping

Dongming Wu, Yanping Fu, Saike Huang et al.

ICCV 2025 • arXiv:2507.23734
5
citations
#8061

Open-World Objectness Modeling Unifies Novel Object Detection

Shan Zhang, Yao Ni, Jinhao Du et al.

CVPR 2025
5
citations
#8062

FreeGave: 3D Physics Learning from Dynamic Videos by Gaussian Velocity

Jinxi Li, Ziyang Song, Siyuan Zhou et al.

CVPR 2025 • arXiv:2506.07865
5
citations
#8063

Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents

Yunseok Jang, Yeda Song, Sungryull Sohn et al.

CVPR 2025 • arXiv:2505.12632
5
citations
#8064

ZeroGrasp: Zero-Shot Shape Reconstruction Enabled Robotic Grasping

Shun Iwase, Muhammad Zubair Irshad, Katherine Liu et al.

CVPR 2025 • arXiv:2504.10857
5
citations
#8065

Geo-Sign: Hyperbolic Contrastive Regularisation for Geometrically Aware Sign Language Translation

Edward Fish, Richard Bowden

NEURIPS 2025 • oral • arXiv:2506.00129
5
citations
#8066

PerLA: Perceptive 3D Language Assistant

Guofeng Mei, Wei Lin, Luigi Riz et al.

CVPR 2025 • arXiv:2411.19774
5
citations
#8067

Motion Synthesis with Sparse and Flexible Keyjoint Control

Inwoo Hwang, Jinseok Bae, Donggeun Lim et al.

ICCV 2025 • arXiv:2503.15557
5
citations
#8068

R$^2$ec: Towards Large Recommender Models with Reasoning

Runyang You, Yongqi Li, Xinyu Lin et al.

NEURIPS 2025 • arXiv:2505.16994
5
citations
#8069

AlgoTune: Can Language Models Speed Up General-Purpose Numerical Programs?

Ori Press, Brandon Amos, Haoyu Zhao et al.

NEURIPS 2025 • arXiv:2507.15887
5
citations
#8070

Exploring the Visual Feature Space for Multimodal Neural Decoding

Weihao Xia, Cengiz Oztireli

ICCV 2025 • arXiv:2505.15755
5
citations
#8071

One2Any: One-Reference 6D Pose Estimation for Any Object

Mengya Liu, Siyuan Li, Ajad Chhatkuli et al.

CVPR 2025 • arXiv:2505.04109
5
citations
#8072

Lay-Your-Scene: Natural Scene Layout Generation with Diffusion Transformers

Divyansh Srivastava, Xiang Zhang, He Wen et al.

ICCV 2025 • arXiv:2505.04718
5
citations
#8073

Contrastive Test-Time Composition of Multiple LoRA Models for Image Generation

Tuna Meral, Enis Simsar, Federico Tombari et al.

ICCV 2025 • highlight • arXiv:2403.19776
5
citations
#8074

SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs

Jiahui Wang, Zuyan Liu, Yongming Rao et al.

ICCV 2025 • arXiv:2506.05344
5
citations
#8075

Adaptive Hyper-Graph Convolution Network for Skeleton-based Human Action Recognition with Virtual Connections

Youwei Zhou, Tianyang Xu, Cong Wu et al.

ICCV 2025 • arXiv:2411.14796
5
citations
#8076

FALCON: An ML Framework for Fully Automated Layout-Constrained Analog Circuit Design

Asal Mehradfar, Xuzhe Zhao, Yilun Huang et al.

NEURIPS 2025 • arXiv:2505.21923
5
citations
#8077

Vid2Sim: Generalizable, Video-based Reconstruction of Appearance, Geometry and Physics for Mesh-free Simulation

Chuhao Chen, Zhiyang Dou, Chen Wang et al.

CVPR 2025 • arXiv:2506.06440
5
citations
#8078

Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization

Lingyun Zhang, Yu Xie, Yanwei Fu et al.

CVPR 2025 • arXiv:2412.01244
5
citations
#8079

EgoExOR: An Ego-Exo-Centric Operating Room Dataset for Surgical Activity Understanding

Ege Özsoy, Arda Mamur, Felix Tristram et al.

NEURIPS 2025 • arXiv:2505.24287
5
citations
#8080

Convergence of Clipped SGD on Convex $(L_0,L_1)$-Smooth Functions

Ofir Gaash, Kfir Y. Levy, Yair Carmon

NEURIPS 2025 • arXiv:2502.16492
5
citations
#8081

On the Zero-shot Adversarial Robustness of Vision-Language Models: A Truly Zero-shot and Training-free Approach

Baoshun Tong, Hanjiang Lai, Yan Pan et al.

CVPR 2025
5
citations
#8082

Zero-AVSR: Zero-Shot Audio-Visual Speech Recognition with LLMs by Learning Language-Agnostic Speech Representations

Jeong Hun Yeo, Minsu Kim, Chae Won Kim et al.

ICCV 2025 • arXiv:2503.06273
5
citations
#8083

Birth and Death of a Rose

Chen Geng, Yunzhi Zhang, Shangzhe Wu et al.

CVPR 2025 • arXiv:2412.05278
5
citations
#8084

Exact: Exploring Space-Time Perceptive Clues for Weakly Supervised Satellite Image Time Series Semantic Segmentation

Hao Zhu, Yan Zhu, Jiayu Xiao et al.

CVPR 2025 • highlight • arXiv:2412.03968
5
citations
#8085

Incentivizing Reasoning for Advanced Instruction-Following of Large Language Models

Yulei Qin, Gang Li, Zongyi Li et al.

NEURIPS 2025 • arXiv:2506.01413
5
citations
#8086

MindGYM: What Matters in Question Synthesis for Thinking-Centric Fine-Tuning?

Zhe Xu, Daoyuan Chen, Zhenqing Ling et al.

NEURIPS 2025 • arXiv:2503.09499
5
citations
#8087

MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants

Hritik Bansal, Daniel Israel, Siyan Zhao et al.

NEURIPS 2025 • arXiv:2412.12661
5
citations
#8088

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Wei-Jin Huang, Yuan-Ming Li, Zhi-Wei Xia et al.

CVPR 2025 • arXiv:2503.22405
5
citations
#8089

Modeling Microenvironment Trajectories on Spatial Transcriptomics with NicheFlow

Kristiyan Sakalyan, Alessandro Palma, Filippo Guerranti et al.

NEURIPS 2025 • oral • arXiv:2511.00977
5
citations
#8090

RadarQA: Multi-modal Quality Analysis of Weather Radar Forecasts

Xuming He, Zhiyuan You, Junchao Gong et al.

NEURIPS 2025 • arXiv:2508.12291
5
citations
#8091

JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems

Yifan Wang, Jian Zhao, Zhaoxin Fan et al.

CVPR 2025
5
citations
#8092

Audio-Visual Semantic Graph Network for Audio-Visual Event Localization

Liang Liu, Shuaiyong Li, Yongqiang Zhu

CVPR 2025
5
citations
#8093

CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss

Dileepa Pitawela, Gustavo Carneiro, Hsiang-Ting Chen

CVPR 2025 • arXiv:2504.17813
5
citations
#8094

The Rich and the Simple: On the Implicit Bias of Adam and SGD

Bhavya Vasudeva, Jung Lee, Vatsal Sharan et al.

NEURIPS 2025 • arXiv:2505.24022
5
citations
#8095

CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation

Zhuoyan Luo, Yinghao Wu, Tianheng Cheng et al.

ICCV 2025 • arXiv:2405.15658
5
citations
#8096

Unsupervised Joint Learning of Optical Flow and Intensity with Event Cameras

Shuang Guo, Friedhelm Hamann, Guillermo Gallego

ICCV 2025 • highlight • arXiv:2503.17262
5
citations
#8097

MIRAGE: A Benchmark for Multimodal Information-Seeking and Reasoning in Agricultural Expert-Guided Conversations

Vardhan Dongre, Chi Gui, Shubham Garg et al.

NEURIPS 2025 • arXiv:2506.20100
5
citations
#8098

FDS: Frequency-Aware Denoising Score for Text-Guided Latent Diffusion Image Editing

Yufan Ren, Zicong Jiang, Tong Zhang et al.

CVPR 2025 • arXiv:2503.19191
5
citations
#8099

D^3-Human: Dynamic Disentangled Digital Human from Monocular Video

Honghu Chen, Bo Peng, Yunfan Tao et al.

CVPR 2025 • arXiv:2501.01589
5
citations
#8100

Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World

Bangyan Liao, Zhenjun Zhao, Haoang Li et al.

CVPR 2025 • arXiv:2505.04788
5
citations
#8101

Adversarial Domain Prompt Tuning and Generation for Single Domain Generalization

Zhipeng Xu, De Cheng, Xinyang Jiang et al.

CVPR 2025
5
citations
#8102

Better Language Model Inversion by Compactly Representing Next-Token Distributions

Murtaza Nazir, Matthew Finlayson, John Morris et al.

NEURIPS 2025 • arXiv:2506.17090
5
citations
#8103

Dynamic Stereotype Theory Induced Micro-expression Recognition with Oriented Deformation

Bohao Zhang, Xuejiao Wang, Changbo Wang et al.

CVPR 2025
5
citations
#8104

macOSWorld: A Multilingual Interactive Benchmark for GUI Agents

Pei Yang, Hai Ci, Mike Zheng Shou

NEURIPS 2025 • arXiv:2506.04135
5
citations
#8105

Interpretable Generative Models through Post-hoc Concept Bottlenecks

Akshay R. Kulkarni, Ge Yan, Chung-En Sun et al.

CVPR 2025 • arXiv:2503.19377
5
citations
#8106

Fine-Grained Preference Optimization Improves Spatial Reasoning in VLMs

Yifan Shen, Yuanzhe Liu, Jingyuan Zhu et al.

NEURIPS 2025 • arXiv:2506.21656
5
citations
#8107

Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Sangwon Jang, June Suk Choi, Jaehyeong Jo et al.

CVPR 2025 • arXiv:2503.09669
5
citations
#8108

VideoHandles: Editing 3D Object Compositions in Videos Using Video Generative Priors

Juil Koo, Paul Guerrero, Chun-Hao P. Huang et al.

CVPR 2025 • arXiv:2503.01107
5
citations
#8109

AnimeGamer: Infinite Anime Life Simulation with Next Game State Prediction

Junhao Cheng, Yuying Ge, Yixiao Ge et al.

ICCV 2025 • arXiv:2504.01014
5
citations
#8110

IDFace: Face Template Protection for Efficient and Secure Identification

Sunpill Kim, Seunghun Paik, Chanwoo Hwang et al.

ICCV 2025 • arXiv:2507.12050
5
citations
#8111

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Wenqi Zhang, Hang Zhang, Xin Li et al.

ICCV 2025 • highlight • arXiv:2501.00958
5
citations
#8112

Multi-Resolution Pathology-Language Pre-training Model with Text-Guided Visual Representation

Shahad Albastaki, Anabia Sohail, Iyyakutti Iyappan Ganapathi et al.

CVPR 2025 • arXiv:2504.18856
5
citations
#8113

Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation

Tal Zeevi, Ravid Shwartz-Ziv, Yann LeCun et al.

CVPR 2025 • arXiv:2412.07169
5
citations
#8114

CHOICE: Benchmarking the Remote Sensing Capabilities of Large Vision-Language Models

Xiao An, Jiaxing Sun, Zihan Gui et al.

NEURIPS 2025 • arXiv:2411.18145
5
citations
#8115

Conformal Prediction for Zero-Shot Models

Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz

CVPR 2025 • arXiv:2505.24693
5
citations
#8116

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

Shuangkang Fang, I-Chao Shen, Yufeng Wang et al.

ICCV 2025 • highlight • arXiv:2508.01242
5
citations
#8117

Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning

Huabin Liu, Filip Ilievski, Cees G. M. Snoek

CVPR 2025 • arXiv:2501.05069
5
citations
#8118

Single Domain Generalization for Few-Shot Counting via Universal Representation Matching

Xianing Chen, Si Huo, Borui Jiang et al.

CVPR 2025 • arXiv:2505.16778
5
citations
#8119

Flexible MOF Generation with Torsion-Aware Flow Matching

Nayoung Kim, Seongsu Kim, Sungsoo Ahn

NEURIPS 2025 • arXiv:2505.17914
5
citations
#8120

Dynamic Integration of Task-Specific Adapters for Class Incremental Learning

Jiashuo Li, Shaokun Wang, Bo Qian et al.

CVPR 2025 • arXiv:2409.14983
5
citations
#8121

WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion

Yang Wu, Yun Zhu, Kaihua Zhang et al.

CVPR 2025 • arXiv:2504.13561
5
citations
#8122

Joint Relational Database Generation via Graph-Conditional Diffusion Models

Mohamed Amine Ketata, David Lüdke, Leo Schwinn et al.

NEURIPS 2025 • arXiv:2505.16527
5
citations
#8123

ProbeSDF: Light Field Probes For Neural Surface Reconstruction

Briac Toussaint, Diego Thomas, Jean-Sébastien Franco

CVPR 2025 • arXiv:2412.10084
5
citations
#8124

Entropic Time Schedulers for Generative Diffusion Models

Dejan Stancevic, Florian Handke, Luca Ambrogioni

NEURIPS 2025 • arXiv:2504.13612
5
citations
#8125

Generating Computational Cognitive models using Large Language Models

Milena Rmus, Akshay Kumar Jagadish, Marvin Mathony et al.

NEURIPS 2025 • oral • arXiv:2502.00879
5
citations
#8126

SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis

Junho Kim, Hyunjun Kim, Hosu Lee et al.

CVPR 2025 • arXiv:2411.16173
5
citations
#8127

Logic.py: Bridging the Gap between LLMs and Constraint Solvers

Pascal Kesseli, Peter O'Hearn, Ricardo Cabral

NEURIPS 2025 • arXiv:2502.15776
5
citations
#8128

Mind the Trojan Horse: Image Prompt Adapter Enabling Scalable and Deceptive Jailbreaking

Junxi Chen, Junhao Dong, Xiaohua Xie

CVPR 2025 • highlight • arXiv:2504.05838
5
citations
#8129

Enhancing Facial Privacy Protection via Weakening Diffusion Purification

Ali Salar, Qing Liu, Yingli Tian et al.

CVPR 2025 • arXiv:2503.10350
5
citations
#8130

Generalization or Hallucination? Understanding Out-of-Context Reasoning in Transformers

Yixiao Huang, Hanlin Zhu, Tianyu Guo et al.

NEURIPS 2025 • arXiv:2506.10887
5
citations
#8131

Failure Prediction at Runtime for Generative Robot Policies

Ralf Römer, Adrian Kobras, Luca Worbis et al.

NEURIPS 2025 • arXiv:2510.09459
5
citations
#8132

MOSCATO: Predicting Multiple Object State Change Through Actions

Parnian Zameni, Yuhan Shen, Ehsan Elhamifar

ICCV 2025
5
citations
#8133

The Overthinker's DIET: Cutting Token Calories with DIfficulty-AwarE Training

Weize Chen, Jiarui Yuan, Jin Tailin et al.

NEURIPS 2025 • arXiv:2505.19217
5
citations
#8134

Scaling Up Liquid-Resistance Liquid-Capacitance Networks for Efficient Sequence Modeling

Mónika Farsang, Radu Grosu

NEURIPS 2025 • arXiv:2505.21717
5
citations
#8135

Sat2City: 3D City Generation from A Single Satellite Image with Cascaded Latent Diffusion

Tongyan Hua, Lutao Jiang, Ying-Cong Chen et al.

ICCV 2025 • arXiv:2507.04403
5
citations
#8136

Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving

Alexey Nekrasov, Malcolm Burdorf, Stewart Worrall et al.

CVPR 2025 • arXiv:2505.02148
5
citations
#8137

DeGauss: Dynamic-Static Decomposition with Gaussian Splatting for Distractor-free 3D Reconstruction

Rui Wang, Quentin Lohmeyer, Mirko Meboldt et al.

ICCV 2025 • arXiv:2503.13176
5
citations
#8138

Exploring Multimodal Diffusion Transformers for Enhanced Prompt-based Image Editing

Joonghyuk Shin, Alchan Hwang, Yujin Kim et al.

ICCV 2025 • arXiv:2508.07519
5
citations
#8139

AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning

Yuheng Xu, Shijie Yang, Xin Liu et al.

CVPR 2025 • arXiv:2503.01565
5
citations
#8140

TextSSR: Diffusion-based Data Synthesis for Scene Text Recognition

Xingsong Ye, Yongkun Du, Yunbo Tao et al.

ICCV 2025 • arXiv:2412.01137
5
citations
#8141

Manual-PA: Learning 3D Part Assembly from Instruction Diagrams

Jiahao Zhang, Anoop Cherian, Cristian Rodriguez-Opazo et al.

ICCV 2025 • arXiv:2411.18011
5
citations
#8142

Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs

Xingang Guo, Yaxin Li, XiangYi Kong et al.

NEURIPS 2025 • arXiv:2509.16204
5
citations
#8143

2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos

Marvin Heidinger, Snehal Jauhri, Vignesh Prasad et al.

ICCV 2025 • arXiv:2503.09320
5
citations
#8144

Efficient Quadratic Corrections for Frank-Wolfe Algorithms

Jannis Halbey, Seta Rakotomandimby, Mathieu Besançon et al.

NEURIPS 2025 • arXiv:2506.02635
5
citations
#8145

LongDiff: Training-Free Long Video Generation in One Go

Zhuoling Li, Hossein Rahmani, Qiuhong Ke et al.

CVPR 2025 • arXiv:2503.18150
5
citations
#8146

Predicting Empirical AI Research Outcomes with Language Models

Jiaxin Wen, Chenglei Si, Yueh-Han Chen et al.

NEURIPS 2025 • arXiv:2506.00794
5
citations
#8147

DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging

Zhu Liu, Zijun Wang, Jinyuan Liu et al.

CVPR 2025 • arXiv:2503.00905
5
citations
#8148

PoLAR: Polar-Decomposed Low-Rank Adapter Representation

Kai Lion, Liang Zhang, Bingcong Li et al.

NEURIPS 2025 • arXiv:2506.03133
5
citations
#8149

From Trial to Triumph: Advancing Long Video Understanding via Visual Context Sample Scaling and Self-reward Alignment

Yucheng Suo, Fan Ma, Linchao Zhu et al.

ICCV 2025 • arXiv:2503.20472
5
citations
#8150

Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

Shangquan Sun, Wenqi Ren, Juxiang Zhou et al.

CVPR 2025 • arXiv:2505.16811
5
citations
#8151

Minority-Focused Text-to-Image Generation via Prompt Optimization

Soobin Um, Jong Chul Ye

CVPR 2025 • arXiv:2410.07838
5
citations
#8152

DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations

Krishna Sri Ipsit Mantri, Carola-Bibiane Schönlieb, Bruno Ribeiro et al.

CVPR 2025 • arXiv:2502.06029
5
citations
#8153

Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising

Yongli Xiang, Ziming Hong, Lina Yao et al.

CVPR 2025 • arXiv:2503.17198
5
citations
#8154

Multiplayer Federated Learning: Reaching Equilibrium with Less Communication

TaeHo Yoon, Sayantan Choudhury, Nicolas Loizou

NEURIPS 2025 • arXiv:2501.08263
5
citations
#8155

PeRL: Permutation-Enhanced Reinforcement Learning for Interleaved Vision-Language Reasoning

Yizhen Zhang, Yang Ding, Shuoshuo Zhang et al.

NEURIPS 2025 • arXiv:2506.14907
5
citations
#8156

MIRA: Medical Time Series Foundation Model for Real-World Health Data

Hao Li, Bowen Deng, Chang Xu et al.

NEURIPS 2025 • oral • arXiv:2506.07584
5
citations
#8157

Efficient Data Selection at Scale via Influence Distillation

Mahdi Nikdan, Vincent Cohen-Addad, Dan Alistarh et al.

NEURIPS 2025 • arXiv:2505.19051
5
citations
#8158

Transformers Provably Learn Chain-of-Thought Reasoning with Length Generalization

Yu Huang, Zixin Wen, Aarti Singh et al.

NEURIPS 2025 • arXiv:2511.07378
5
citations
#8159

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

Sihan Yang, Runsen Xu, Chenhang Cui et al.

ICCV 2025 • arXiv:2508.05211
5
citations
#8160

Neural Motion Simulator Pushing the Limit of World Models in Reinforcement Learning

Chenjie Hao, Weyl Lu, Yifan Xu et al.

CVPR 2025 • arXiv:2504.07095
5
citations
#8161

DH-FaceVid-1K: A Large-Scale High-Quality Dataset for Face Video Generation

Donglin Di, He Feng, Wenzhang Sun et al.

ICCV 2025 • arXiv:2410.07151
5
citations
#8162

LightsOut: Diffusion-based Outpainting for Enhanced Lens Flare Removal

Shr-Ruei Tsai, Wei-Cheng Chang, Jie-Ying Lee et al.

ICCV 2025 • arXiv:2510.15868
5
citations
#8163

How Benchmark Prediction from Fewer Data Misses the Mark

Guanhua Zhang, Florian E. Dorner, Moritz Hardt

NEURIPS 2025 • arXiv:2506.07673
5
citations
#8164

DiffPortrait360: Consistent Portrait Diffusion for 360 View Synthesis

Yuming Gu, Phong Tran, Yujian Zheng et al.

CVPR 2025 • arXiv:2503.15667
5
citations
#8165

Top-H Decoding: Adapting the Creativity and Coherence with Bounded Entropy in Text Generation

Erfan Baghaei Potraghloo, Seyedarmin Azizi, Souvik Kundu et al.

NEURIPS 2025 • arXiv:2509.02510
5
citations
#8166

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Bikang Pan, Qun Li, Xiaoying Tang et al.

CVPR 2025 • highlight • arXiv:2412.01256
5
citations
#8167

Watermarking Autoregressive Image Generation

Nikola Jovanović, Ismail Labiad, Tomas Soucek et al.

NEURIPS 2025 • arXiv:2506.16349
5
citations
#8168

Pos3R: 6D Pose Estimation for Unseen Objects Made Easy

Weijian Deng, Dylan Campbell, Chunyi Sun et al.

CVPR 2025
5
citations
#8169

Street Gaussians without 3D Object Tracker

Ruida Zhang, Chengxi Li, Chenyangguang Zhang et al.

ICCV 2025 • arXiv:2412.05548
5
citations
#8170

egoPPG: Heart Rate Estimation from Eye-Tracking Cameras in Egocentric Systems to Benefit Downstream Vision Tasks

Björn Braun, Rayan Armani, Manuel Meier et al.

ICCV 2025 • arXiv:2502.20879
5
citations
#8171

Tight Lower Bounds and Improved Convergence in Performative Prediction

Pedram Khorsandi, Rushil Gupta, Mehrnaz Mofakhami et al.

NEURIPS 2025 • arXiv:2412.03671
5
citations
#8172

A Polarization-Aided Transformer for Image Deblurring via Motion Vector Decomposition

Duosheng Chen, Shihao Zhou, Jinshan Pan et al.

CVPR 2025 • highlight
5
citations
#8173

IterIS: Iterative Inference-Solving Alignment for LoRA Merging

Hongxu Chen, Zhen Wang, Runshi Li et al.

CVPR 2025 • arXiv:2411.15231
5
citations
#8174

Robust Message Embedding via Attention Flow-Based Steganography

Huayuan Ye, Shenzhuo Zhang, Shiqi Jiang et al.

CVPR 2025 • arXiv:2405.16414
5
citations
#8175

SphereUFormer: A U-Shaped Transformer for Spherical 360 Perception

Yaniv Benny, Lior Wolf

CVPR 2025 • arXiv:2412.06968
5
citations
#8176

Learning from Neighbors: Category Extrapolation for Long-Tail Learning

Shizhen Zhao, Xin Wen, Jiahui Liu et al.

CVPR 2025 • arXiv:2410.15980
5
citations
#8177

SpatialCrafter: Unleashing the Imagination of Video Diffusion Models for Scene Reconstruction from Limited Observations

Songchun Zhang, Huiyao Xu, Sitong Guo et al.

ICCV 2025 • arXiv:2505.11992
5
citations
#8178

Bringing CLIP to the Clinic: Dynamic Soft Labels and Negation-Aware Learning for Medical Analysis

Hanbin Ko, Chang Min Park

CVPR 2025 • arXiv:2505.22079
5
citations
#8179

L$^2$M: Mutual Information Scaling Law for Long-Context Language Modeling

Zhuo Chen, Oriol Comas, Zhuotao Jin et al.

NEURIPS 2025 • arXiv:2503.04725
5
citations
#8180

Frame In-N-Out: Unbounded Controllable Image-to-Video Generation

Boyang Wang, Xuweiyi Chen, Matheus Gadelha et al.

NEURIPS 2025 • oral • arXiv:2505.21491
5
citations
#8181

TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion

Yiran Wang, Jiaqi Li, Chaoyi Hong et al.

CVPR 2025 • arXiv:2504.11773
5
citations
#8182

Latent Drifting in Diffusion Models for Counterfactual Medical Image Synthesis

Yousef Yeganeh, Ioannis Charisiadis, Marta Hasny et al.

CVPR 2025 • highlight • arXiv:2412.20651
5
citations
#8183

Object-centric 3D Motion Field for Robot Learning from Human Videos

Zhao-Heng Yin, Sherry Yang, Pieter Abbeel

NEURIPS 2025 • spotlight • arXiv:2506.04227
5
citations
#8184

ChemPile: A 250 GB Diverse and Curated Dataset for Chemical Foundation Models

Adrian Mirza, Nawaf Alampara, Martiño Ríos-García et al.

NEURIPS 2025
5
citations
#8185

Detect Any Mirrors: Boosting Learning Reliability on Large-Scale Unlabeled Data with an Iterative Data Engine

Zhaohu Xing, Lihao Liu, Yijun Yang et al.

CVPR 2025
5
citations
#8186

Enhancing Testing-Time Robustness for Trusted Multi-View Classification in the Wild

Wei Liu, Yufei Chen, Xiaodong Yue

CVPR 2025
5
citations
#8187

BeliefMapNav: 3D Voxel-Based Belief Map for Zero-Shot Object Navigation

Zibo Zhou, Yue Hu, Lingkai Zhang et al.

NEURIPS 2025 • arXiv:2506.06487
5
citations
#8188

Towards Understanding the Mechanisms of Classifier-Free Guidance

Xiang Li, Rongrong Wang, Qing Qu

NEURIPS 2025 • spotlight • arXiv:2505.19210
5
citations
#8189

Understanding Prompt Tuning and In-Context Learning via Meta-Learning

Tim Genewein, Kevin Li, Jordi Grau-Moya et al.

NEURIPS 2025 • spotlight • arXiv:2505.17010
5
citations
#8190

DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

Leqi Shen, Guoqiang Gong, Tianxiang Hao et al.

CVPR 2025 • arXiv:2506.08887
5
citations
#8191

NoPain: No-box Point Cloud Attack via Optimal Transport Singular Boundary

Zezeng Li, Xiaoyu Du, Na Lei et al.

CVPR 2025 • arXiv:2503.00063
5
citations
#8192

Locality-Aware Zero-Shot Human-Object Interaction Detection

Sanghyun Kim, Deunsol Jung, Minsu Cho

CVPR 2025 • arXiv:2505.19503
5
citations
#8193

EnvPoser: Environment-aware Realistic Human Motion Estimation from Sparse Observations with Uncertainty Modeling

Songpengcheng Xia, Yu Zhang, Zhuo Su et al.

CVPR 2025 • arXiv:2412.10235
5
citations
#8194

RULE: Reinforcement UnLEarning Achieves Forget-retain Pareto Optimality

Chenlong Zhang, Zhuoran Jin, Hongbang Yuan et al.

NEURIPS 2025 • arXiv:2506.07171
5
citations
#8195

SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions

Xianzhe Fan, Xuhui Zhou, Chuanyang Jin et al.

NEURIPS 2025 • arXiv:2506.23046
5
citations
#8196

Enforcing Hard Linear Constraints in Deep Learning Models with Decision Rules

Gonzalo E. Constante, Hao Chen, Can Li

NEURIPS 2025 • arXiv:2505.13858
5
citations
#8197

Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation

Tianfu Wang, Mingyang Xie, Haoming Cai et al.

CVPR 2025 • arXiv:2501.00637
5
citations
#8198

Novel View Synthesis with Pixel-Space Diffusion Models

Noam Elata, Bahjat Kawar, Yaron Ostrovsky-Berman et al.

CVPR 2025 • arXiv:2411.07765
5
citations
#8199

VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification

Xianwei Zhuang, Zhihong Zhu, Yuxin Xie et al.

CVPR 2025 • arXiv:2501.06553
5
citations
#8200

OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP

Mohamad Hassan N C, Divyam Gupta, Mainak Singha et al.

CVPR 2025 • arXiv:2503.16106
5
citations