Most Cited 2025 "k-space data" Papers

22,274 papers found • Page 56 of 112

#11001

Bridge Frame and Event: Common Spatiotemporal Fusion for High-Dynamic Scene Optical Flow

Hanyu Zhou, Haonan Wang, Haoyue Liu et al.

CVPR 2025 • arXiv:2503.06992
3
citations
#11002

On Large Multimodal Models as Open-World Image Classifiers

Alessandro Conti, Massimiliano Mancini, Enrico Fini et al.

ICCV 2025 • arXiv:2503.21851
3
citations
#11003

Sample-efficient Learning of Concepts with Theoretical Guarantees: from Data to Concepts without Interventions

Hidde Fokkema, Tim van Erven, Sara Magliacane

NEURIPS 2025 • arXiv:2502.06536
3
citations
#11004

Detecting Adversarial Data Using Perturbation Forgery

Qian Wang, Chen Li, Yuchen Luo et al.

CVPR 2025 • arXiv:2405.16226
3
citations
#11005

C-NAV: Towards Self-Evolving Continual Object Navigation in Open World

MingMing Yu, Fei Zhu, Wenzhuo Liu et al.

NEURIPS 2025 • oral • arXiv:2510.20685
3
citations
#11006

GO-N3RDet: Geometry Optimized NeRF-enhanced 3D Object Detector

Zechuan Li, Hongshan Yu, Yihao Ding et al.

CVPR 2025 • arXiv:2503.15211
3
citations
#11007

Replicable Online Learning

Saba Ahmadi, Siddharth Bhandari, Avrim Blum

NEURIPS 2025 • arXiv:2411.13730
3
citations
#11008

Just One Layer Norm Guarantees Stable Extrapolation

Juliusz Ziomek, George Whittle, Michael A Osborne

NEURIPS 2025 • arXiv:2505.14512
3
citations
#11009

JanusDNA: A Powerful Bi-directional Hybrid DNA Foundation Model

Qihao Duan, Bingding Huang, Zhenqiao Song et al.

NEURIPS 2025 • arXiv:2505.17257
3
citations
#11010

Balanced Rate-Distortion Optimization in Learned Image Compression

Yichi Zhang, Zhihao Duan, Yuning Huang et al.

CVPR 2025 • highlight • arXiv:2502.20161
3
citations
#11011

PBCAT: Patch-Based Composite Adversarial Training against Physically Realizable Attacks on Object Detection

Xiao Li, Yiming Zhu, Yifan Huang et al.

ICCV 2025 • arXiv:2506.23581
3
citations
#11012

Evaluating multiple models using labeled and unlabeled data

Divya Shanmugam, Shuvom Sadhuka, Manish Raghavan et al.

NEURIPS 2025 • arXiv:2501.11866
3
citations
#11013

TREND: Unsupervised 3D Representation Learning via Temporal Forecasting for LiDAR Perception

Runjian Chen, Hyoungseob Park, Bo Zhang et al.

NEURIPS 2025 • oral • arXiv:2412.03054
3
citations
#11014

Martingale Score: An Unsupervised Metric for Bayesian Rationality in LLM Reasoning

Zhonghao He, Tianyi (Alex) Qiu, Hirokazu Shirado et al.

NEURIPS 2025 • arXiv:2512.02914
3
citations
#11015

Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval

Dohwan Ko, Ji Soo Lee, Minhyuk Choi et al.

ICCV 2025 • highlight • arXiv:2507.23284
3
citations
#11016

Bridge the Gap: From Weak to Full Supervision for Temporal Action Localization with PseudoFormer

Ziyi Liu, Yangcen Liu

CVPR 2025 • arXiv:2504.14860
3
citations
#11017

Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image

Jerred Chen, Ronald Clark

ICCV 2025 • arXiv:2503.17358
3
citations
#11018

ArcPro: Architectural Programs for Structured 3D Abstraction of Sparse Points

Qirui Huang, Runze Zhang, Kangjun Liu et al.

CVPR 2025 • highlight • arXiv:2503.02745
3
citations
#11019

Over-squashing in Spatiotemporal Graph Neural Networks

Ivan Marisca, Jacob Bamberger, Cesare Alippi et al.

NEURIPS 2025 • oral • arXiv:2506.15507
3
citations
#11020

MEgoHand: Multimodal Egocentric Hand-Object Interaction Motion Generation

Bohan Zhou, Yi Zhan, Zhongbin Zhang et al.

NEURIPS 2025 • oral • arXiv:2505.16602
3
citations
#11021

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025 • arXiv:2503.21140
3
citations
#11022

FIction: 4D Future Interaction Prediction from Video

Kumar Ashutosh, Georgios Pavlakos, Kristen Grauman

CVPR 2025 • highlight • arXiv:2412.00932
3
citations
#11023

Social Debiasing for Fair Multi-modal LLMs

Harry Cheng, Yangyang Guo, Qingpei Guo et al.

ICCV 2025 • arXiv:2408.06569
3
citations
#11024

ConText-CIR: Learning from Concepts in Text for Composed Image Retrieval

Eric Xing, Pranavi Kolouju, Robert Pless et al.

CVPR 2025 • arXiv:2505.20764
3
citations
#11025

Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation

Xiuyu Yang, Shuhan Tan, Philipp Kraehenbuehl

ICCV 2025 • arXiv:2506.17213
3
citations
#11026

Parallelizing MCMC Across the Sequence Length

David Zoltowski, Skyler Wu, Xavier Gonzalez et al.

NEURIPS 2025 • arXiv:2508.18413
3
citations
#11027

Mechanism and Emergence of Stacked Attention Heads in Multi-Layer Transformers

Tiberiu Mușat

ICLR 2025
3
citations
#11028

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy

Yiting Yang, Hao Luo, Yuan Sun et al.

ICCV 2025 • arXiv:2507.13260
3
citations
#11029

Self-Refining Language Model Anonymizers via Adversarial Distillation

Kyuyoung Kim, Hyunjun Jeon, Jinwoo Shin

NEURIPS 2025 • arXiv:2506.01420
3
citations
#11030

Reading Recognition in the Wild

Charig Yang, Samiul Alam, Shakhrul Iman Siam et al.

NEURIPS 2025 • arXiv:2505.24848
3
citations
#11031

UniPhy: Learning a Unified Constitutive Model for Inverse Physics Simulation

Himangi Mittal, Peiye Zhuang, Hsin-Ying Lee et al.

CVPR 2025 • arXiv:2505.16971
3
citations
#11032

BlockScan: Detecting Anomalies in Blockchain Transactions

Jiahao Yu, Xian Wu, Hao Liu et al.

NEURIPS 2025 • arXiv:2410.04039
3
citations
#11033

Gyro-based Neural Single Image Deblurring

Heemin Yang, Jaesung Rim, Seungyong Lee et al.

CVPR 2025 • arXiv:2404.00916
3
citations
#11034

Bridging Domain Generalization to Multimodal Domain Generalization via Unified Representations

Hai Huang, Yan Xia, Sashuai Zhou et al.

ICCV 2025 • arXiv:2507.03304
3
citations
#11035

Diving into the Fusion of Monocular Priors for Generalized Stereo Matching

Chengtang Yao, Lidong Yu, Zhidan Liu et al.

ICCV 2025 • arXiv:2505.14414
3
citations
#11036

Enhancing Image Restoration Transformer via Adaptive Translation Equivariance

JiaKui Hu, Zhengjian Yao, Lujia Jin et al.

ICCV 2025 • arXiv:2506.18520
3
citations
#11037

Progressive Test Time Energy Adaptation for Medical Image Segmentation

Xiaoran Zhang, Byung-Woo Hong, Hyoungseob Park et al.

ICCV 2025 • highlight • arXiv:2503.16616
3
citations
#11038

MonarchAttention: Zero-Shot Conversion to Fast, Hardware-Aware Structured Attention

Can Yaras, Alec Xu, Pierre Abillama et al.

NEURIPS 2025 • spotlight • arXiv:2505.18698
3
citations
#11039

MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning

Yuxuan Luo, Ryan Yuan, Junwen Chen et al.

NEURIPS 2025 • arXiv:2506.10963
3
citations
#11040

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier et al.

CVPR 2025 • arXiv:2502.10060
3
citations
#11041

Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction

Haonan Wang, Qixiang Zhang, Lehan Wang et al.

ICCV 2025 • arXiv:2503.11167
3
citations
#11042

STEER-ME: Assessing the Microeconomic Reasoning of Large Language Models

Narun Raman, Taylor Lundy, Thiago Amin et al.

NEURIPS 2025 • arXiv:2502.13119
3
citations
#11043

EDCFlow: Exploring Temporally Dense Difference Maps for Event-based Optical Flow Estimation

Daikun Liu, Lei Cheng, Teng Wang et al.

CVPR 2025 • arXiv:2506.03512
3
citations
#11044

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2025 • arXiv:2503.12124
3
citations
#11045

Seeing in the Dark: Benchmarking Egocentric 3D Vision with the Oxford Day-and-Night Dataset

Zirui Wang, Wenjing Bian, Xinghui Li et al.

NEURIPS 2025 • arXiv:2506.04224
3
citations
#11046

Rethinking Invariance Regularization in Adversarial Training to Improve Robustness-Accuracy Trade-off

Futa Waseda, Ching-Chun Chang, Isao Echizen

ICLR 2025 • arXiv:2402.14648
3
citations
#11047

From Imitation to Innovation: The Emergence of AI's Unique Artistic Styles and the Challenge of Copyright Protection

Zexi Jia, Chuanwei Huang, Hongyan Fei et al.

ICCV 2025 • arXiv:2507.04769
3
citations
#11048

Parameterized Blur Kernel Prior Learning for Local Motion Deblurring

Zhenxuan Fang, Fangfang Wu, Tao Huang et al.

CVPR 2025
3
citations
#11049

ATCTrack: Aligning Target-Context Cues with Dynamic Target States for Robust Vision-Language Tracking

Xiaokun Feng, Shiyu Hu, Xuchen Li et al.

ICCV 2025 • highlight • arXiv:2507.19875
3
citations
#11050

CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation

Xinran Wang, Songyu Xu, Shan Xiangxuan et al.

NEURIPS 2025 • arXiv:2505.15145
3
citations
#11051

MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations

Shaochen Zhong, Yifan (Louie) Lu, Lize Shao et al.

ICLR 2025
3
citations
#11052

Wide-Horizon Thinking and Simulation-Based Evaluation for Real-World LLM Planning with Multifaceted Constraints

Dongjie Yang, Chengqiang Lu, Qimeng Wang et al.

NEURIPS 2025 • spotlight • arXiv:2506.12421
3
citations
#11053

How To Make Your Cell Tracker Say "I dunno!"

Richard D Paul, Johannes Seiffarth, David Rügamer et al.

ICCV 2025
3
citations
#11054

HoGS: Unified Near and Far Object Reconstruction via Homogeneous Gaussian Splatting

Xinpeng Liu, Zeyi Huang, Fumio Okura et al.

CVPR 2025 • arXiv:2503.19232
3
citations
#11055

GPO: Learning from Critical Steps to Improve LLM Reasoning

Jiahao Yu, Zelei Cheng, Xian Wu et al.

NEURIPS 2025 • arXiv:2509.16456
3
citations
#11056

Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity

Susav Shrestha, Bradley Settlemyer, Nikoli Dryden et al.

NEURIPS 2025 • arXiv:2505.14884
3
citations
#11057

Differential-informed Sample Selection Accelerates Multimodal Contrastive Learning

Zihua Zhao, Feng Hong, Mengxi Chen et al.

ICCV 2025 • arXiv:2507.12998
3
citations
#11058

Open-ended Hierarchical Streaming Video Understanding with Vision Language Models

Hyolim Kang, Yunsu Park, Youngbeom Yoo et al.

ICCV 2025 • arXiv:2509.12145
3
citations
#11059

BATCLIP: Bimodal Online Test-Time Adaptation for CLIP

Sarthak Kumar Maharana, Baoming Zhang, Leonid Karlinsky et al.

ICCV 2025 • arXiv:2412.02837
3
citations
#11060

Image Referenced Sketch Colorization Based on Animation Creation Workflow

Dingkun Yan, Xinrui Wang, Zhuoru Li et al.

CVPR 2025 • arXiv:2502.19937
3
citations
#11061

Diverse Policies Recovering via Pointwise Mutual Information Weighted Imitation Learning

Hanlin Yang, Jian Yao, Weiming Liu et al.

ICLR 2025 • oral • arXiv:2410.15910
3
citations
#11062

TimeTracker: Event-based Continuous Point Tracking for Video Frame Interpolation with Non-linear Motion

Haoyue Liu, Jinghan Xu, Yi Chang et al.

CVPR 2025 • arXiv:2505.03116
3
citations
#11063

RLZero: Direct Policy Inference from Language Without In-Domain Supervision

Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.

NEURIPS 2025 • arXiv:2412.05718
3
citations
#11064

Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

Yuchen Liang, Yingbin Liang, Lifeng Lai et al.

NEURIPS 2025 • arXiv:2509.16756
3
citations
#11065

Monocular Semantic Scene Completion via Masked Recurrent Networks

Xuzhi Wang, Xinran Wu, Song Wang et al.

ICCV 2025 • arXiv:2507.17661
3
citations
#11066

Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning

Jiuyang Dong, Junjun Jiang, Kui Jiang et al.

CVPR 2025 • arXiv:2502.21130
3
citations
#11067

Option-aware Temporally Abstracted Value for Offline Goal-Conditioned Reinforcement Learning

Hongjoon Ahn, Heewoong Choi, Jisu Han et al.

NEURIPS 2025 • oral • arXiv:2505.12737
3
citations
#11068

UNICL-SAM: Uncertainty-Driven In-Context Segmentation with Part Prototype Discovery

Dianmo Sheng, Dongdong Chen, Zhentao Tan et al.

CVPR 2025
3
citations
#11069

BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts

Divya Jyoti Bajpai, Manjesh Kumar Hanawal

ICLR 2025 • arXiv:2502.00745
3
citations
#11070

Grouped Speculative Decoding for Autoregressive Image Generation

Junhyuk So, Juncheol Shin, Hyunho Kook et al.

ICCV 2025 • arXiv:2508.07747
3
citations
#11071

CrossAD: Time Series Anomaly Detection with Cross-scale Associations and Cross-window Modeling

Beibu Li, Qichao Shentu, Yang Shu et al.

NEURIPS 2025 • arXiv:2510.12489
3
citations
#11072

Regret Analysis of Average-Reward Unichain MDPs via an Actor-Critic Approach

Swetha Ganesh, Vaneet Aggarwal

NEURIPS 2025 • arXiv:2505.19986
3
citations
#11073

SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding

Zhao Jin, Rong-Cheng Tu, Jingyi Liao et al.

NEURIPS 2025 • arXiv:2506.21924
3
citations
#11074

Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping

Jingyi Lu, Kai Han

ICCV 2025 • arXiv:2509.04582
3
citations
#11075

FACE: Faithful Automatic Concept Extraction

Dipkamal Bhusal, Michael Clifford, Sara Rampazzi et al.

NEURIPS 2025 • arXiv:2510.11675
3
citations
#11076

Spectral Analysis of Representational Similarity with Limited Neurons

Hyunmo Kang, Abdulkadir Canatar, SueYeon Chung

NEURIPS 2025 • arXiv:2502.19648
3
citations
#11077

SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing

Sung-Hoon Yoon, Minghan Li, Gaspard Beaudouin et al.

NEURIPS 2025 • arXiv:2510.25970
3
citations
#11078

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

Zhipeng Huang, Wangbo Yu, Xinhua Cheng et al.

CVPR 2025 • arXiv:2412.16778
3
citations
#11079

PropVG: End-to-End Proposal-Driven Visual Grounding with Multi-Granularity Discrimination

Ming Dai, Wenxuan Cheng, Jiedong Zhuang et al.

ICCV 2025 • arXiv:2509.04833
3
citations
#11080

HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars

Byungjun Kim, Shunsuke Saito, Giljoo Nam et al.

ICCV 2025 • arXiv:2507.19481
3
citations
#11081

Semantic Equitable Clustering: A Simple and Effective Strategy for Clustering Vision Tokens

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

ICCV 2025 • arXiv:2405.13337
3
citations
#11082

SAUCE: Selective Concept Unlearning in Vision-Language Models with Sparse Autoencoders

Jiahui Geng, Qing Li

ICCV 2025 • arXiv:2503.14530
3
citations
#11083

InfiGFusion: Graph-on-Logits Distillation via Efficient Gromov-Wasserstein for Model Fusion

Yuanyi Wang, Zhaoyi Yan, Yiming Zhang et al.

NEURIPS 2025 • arXiv:2505.13893
3
citations
#11084

CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models

Kiet A. Nguyen, Adheesh Juvekar, Tianjiao Yu et al.

CVPR 2025 • arXiv:2412.19331
3
citations
#11085

PriOr-Flow: Enhancing Primitive Panoramic Optical Flow with Orthogonal View

Longliang Liu, Miaojie Feng, Junda Cheng et al.

ICCV 2025 • highlight • arXiv:2506.23897
3
citations
#11086

Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis

Chen Zhao, Xuan Wang, Tong Zhang et al.

ICCV 2025 • arXiv:2411.00144
3
citations
#11087

SP3D: Boosting Sparsely-Supervised 3D Object Detection via Accurate Cross-Modal Semantic Prompts

Shijia Zhao, Qiming Xia, Xusheng Guo et al.

CVPR 2025 • highlight • arXiv:2503.06467
3
citations
#11088

Inference-Time Reward Hacking in Large Language Models

Hadi Khalaf, Claudio Mayrink Verdun, Alex Oesterling et al.

NEURIPS 2025 • spotlight • arXiv:2506.19248
3
citations
#11089

Scale Your Instructions: Enhance the Instruction-Following Fidelity of Unified Image Generation Model by Self-Adaptive Attention Scaling

Chao Zhou, Tianyi Wei, Nenghai Yu

ICCV 2025 • arXiv:2507.16240
3
citations
#11090

Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation

Hao Li, Ju Dai, Xin Zhao et al.

CVPR 2025 • arXiv:2505.23290
3
citations
#11091

Memory Decoder: A Pretrained, Plug-and-Play Memory for Large Language Models

Jiaqi Cao, Jiarui Wang, Rubin Wei et al.

NEURIPS 2025 • arXiv:2508.09874
3
citations
#11092

Simultaneous Modeling of Protein Conformation and Dynamics via Autoregression

Yuning Shen, Lihao Wang, Huizhuo Yuan et al.

NEURIPS 2025 • oral • arXiv:2505.17478
3
citations
#11093

A Snapshot of Influence: A Local Data Attribution Framework for Online Reinforcement Learning

Yuzheng Hu, Fan Wu, Haotian Ye et al.

NEURIPS 2025 • oral • arXiv:2505.19281
3
citations
#11094

LLM Safety Alignment is Divergence Estimation in Disguise

Rajdeep Haldar, Ziyi Wang, Guang Lin et al.

NEURIPS 2025 • arXiv:2502.00657
3
citations
#11095

Escaping the SpuriVerse: Can Large Vision-Language Models Generalize Beyond Seen Spurious Correlations?

Yiwei Yang, Chung Peng Lee, Shangbin Feng et al.

NEURIPS 2025 • arXiv:2506.18322
3
citations
#11096

ODG: Occupancy Prediction Using Dual Gaussians

Yunxiao Shi, Yinhao Zhu, Herbert Cai et al.

NEURIPS 2025 • arXiv:2506.09417
3
citations
#11097

Learning Class Prototypes for Unified Sparse-Supervised 3D Object Detection

Yun Zhu, Le Hui, Hang Yang et al.

CVPR 2025 • highlight • arXiv:2503.21099
3
citations
#11098

SUM Parts: Benchmarking Part-Level Semantic Segmentation of Urban Meshes

Weixiao Gao, Liangliang Nan, Hugo Ledoux

CVPR 2025 • arXiv:2503.15300
3
citations
#11099

LibraGrad: Balancing Gradient Flow for Universally Better Vision Transformer Attributions

Faridoun Mehri, Mahdieh Baghshah, Mohammad Taher Pilehvar

CVPR 2025 • arXiv:2411.16760
3
citations
#11100

Bringing SAM to new heights: leveraging elevation data for tree crown segmentation from drone imagery

Mélisande Teng, Arthur Ouaknine, Etienne Laliberté et al.

NEURIPS 2025 • arXiv:2506.04970
3
citations
#11101

Your Scale Factors are My Weapon: Targeted Bit-Flip Attacks on Vision Transformers via Scale Factor Manipulation

Jialai Wang, Yuxiao Wu, Weiye Xu et al.

CVPR 2025
3
citations
#11102

Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update

Yu-Jie Zhang, Sheng-An Xu, Peng Zhao et al.

NEURIPS 2025 • arXiv:2507.11847
3
citations
#11103

Anti-Aliased 2D Gaussian Splatting

Mae Younes, Adnane Boukhayma

NEURIPS 2025 • arXiv:2506.11252
3
citations
#11104

COALA: Numerically Stable and Efficient Framework for Context-Aware Low-Rank Approximation

Uliana Parkina, Maxim Rakhuba

NEURIPS 2025 • arXiv:2507.07580
3
citations
#11105

Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation

Zhenjun Yu, Wenqiang Xu, Pengfei Xie et al.

ICCV 2025 • arXiv:2411.09572
3
citations
#11106

Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning

Yurun Yuan, Fan Chen, Zeyu Jia et al.

NEURIPS 2025 • arXiv:2505.15311
3
citations
#11107

Meta-Learning Objectives for Preference Optimization

Carlo Alfano, Silvia Sapora, Jakob Foerster et al.

NEURIPS 2025 • arXiv:2411.06568
3
citations
#11108

OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities

Suyoung Lee, Jaeyoung Chung, Kihoon Kim et al.

CVPR 2025 • highlight • arXiv:2412.16604
3
citations
#11109

EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching

Dongki Jung, Jaehoon Choi, Yonghan Lee et al.

CVPR 2025 • arXiv:2502.20685
3
citations
#11110

FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Shuai Liu, Mingyue Cui, Boyang Li et al.

CVPR 2025 • arXiv:2506.03714
3
citations
#11111

XIFBench: Evaluating Large Language Models on Multilingual Instruction Following

Zhenyu Li, Kehai Chen, Yunfei Long et al.

NEURIPS 2025 • arXiv:2503.07539
3
citations
#11112

Learning to Normalize on the SPD Manifold under Bures-Wasserstein Geometry

Rui Wang, Shaocheng Jin, Ziheng Chen et al.

CVPR 2025 • arXiv:2504.00660
3
citations
#11113

FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering

Liangyu Zhong, Fabio Philipp Rosenthal, Joachim Sicking et al.

NEURIPS 2025 • arXiv:2506.21710
3
citations
#11114

3D Equivariant Visuomotor Policy Learning via Spherical Projection

Boce Hu, Dian Wang, David Klee et al.

NEURIPS 2025 • spotlight • arXiv:2505.16969
3
citations
#11115

DERD-Net: Learning Depth from Event-based Ray Densities

Diego de Oliveira Hitzges, Suman Ghosh, Guillermo Gallego

NEURIPS 2025 • spotlight • arXiv:2504.15863
3
citations
#11116

OmniCast: A Masked Latent Diffusion Model for Weather Forecasting Across Time Scales

Tung Nguyen, Tuan Pham, Troy Arcomano et al.

NEURIPS 2025 • arXiv:2510.18707
3
citations
#11117

Gradient Multi-Normalization for Efficient LLM Training

Meyer Scetbon, Chao Ma, Wenbo Gong et al.

NEURIPS 2025
3
citations
#11118

Put CASH on Bandits: A Max K-Armed Problem for Automated Machine Learning

Amir Rezaei Balef, Claire Vernade, Katharina Eggensperger

NEURIPS 2025 • arXiv:2505.05226
3
citations
#11119

Hierarchical Features Matter: A Deep Exploration of Progressive Parameterization Method for Dataset Distillation

Xinhao Zhong, Hao Fang, Bin Chen et al.

CVPR 2025 • arXiv:2406.05704
3
citations
#11120

RvLLM: LLM Runtime Verification with Domain Knowledge

Yedi Zhang, Sun Emma, Annabelle En et al.

NEURIPS 2025 • arXiv:2505.18585
3
citations
#11121

PS-Diffusion: Photorealistic Subject-Driven Image Editing with Disentangled Control and Attention

Weicheng Wang, Guoli Jia, Zhongqi Zhang et al.

CVPR 2025
3
citations
#11122

Ravan: Multi-Head Low-Rank Adaptation for Federated Fine-Tuning

Arian Raje, Baris Askin, Divyansh Jhunjhunwala et al.

NEURIPS 2025 • arXiv:2506.05568
3
citations
#11123

Conformal Information Pursuit for Interactively Guiding Large Language Models

Kwan Ho Ryan Chan, Yuyan Ge, Edgar Dobriban et al.

NEURIPS 2025 • arXiv:2507.03279
3
citations
#11124

A machine learning approach that beats Rubik's cubes

Alexander Chervov, Kirill Khoruzhii, Nikita Bukhal et al.

NEURIPS 2025 • spotlight
3
citations
#11125

MoPFormer: Motion-Primitive Transformer for Wearable-Sensor Activity Recognition

Hao Zhang, Zhan Zhuang, Xuehao Wang et al.

NEURIPS 2025 • oral • arXiv:2505.20744
3
citations
#11126

Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models

Charvi Rastogi, Tian Huey Teh, Pushkar Mishra et al.

NEURIPS 2025 • spotlight • arXiv:2507.13383
3
citations
#11127

Unveiling the Learning Mind of Language Models: A Cognitive Framework and Empirical Study

Zhengyu Hu, Jianxun Lian, Zheyuan Xiao et al.

NEURIPS 2025 • arXiv:2506.13464
3
citations
#11128

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Yuhong Zhang, Guanlin Wu, Ling-Hao Chen et al.

CVPR 2025 • arXiv:2503.07597
3
citations
#11129

Online Language Splatting

Saimouli Katragadda, Cho-Ying Wu, Yuliang Guo et al.

ICCV 2025 • arXiv:2503.09447
3
citations
#11130

Stable Part Diffusion 4D: Multi-View RGB and Kinematic Parts Video Generation

Hao Zhang, Chun-Han Yao, Simon Donné et al.

NEURIPS 2025 • oral • arXiv:2509.10687
3
citations
#11131

Watch and Listen: Understanding Audio-Visual-Speech Moments with Multimodal LLM

Zinuo Li, Xian Zhang, Yongxin Guo et al.

NEURIPS 2025 • oral • arXiv:2505.18110
3
citations
#11132

FeedEdit: Text-Based Image Editing with Dynamic Feedback Regulation

Fengyi Fu, Lei Zhang, Mengqi Huang et al.

CVPR 2025
3
citations
#11133

Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization

Zixuan Huang, Yikun Ban, Lean Fu et al.

NEURIPS 2025 • arXiv:2506.17252
3
citations
#11134

THUNDER: Tile-level Histopathology image UNDERstanding benchmark

Pierre Marza, Leo Fillioux, Sofiène Boutaj et al.

NEURIPS 2025 • spotlight • arXiv:2507.07860
3
citations
#11135

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang et al.

CVPR 2025 • arXiv:2407.18914
3
citations
#11136

SViMo: Synchronized Diffusion for Video and Motion Generation in Hand-object Interaction Scenarios

Lingwei Dang, Ruizhi Shao, Hongwen Zhang et al.

NEURIPS 2025 • spotlight • arXiv:2506.02444
3
citations
#11137

Does Object Binding Naturally Emerge in Large Pretrained Vision Transformers?

Yihao Li, Saeed Salehi, Lyle Ungar et al.

NEURIPS 2025 • spotlight • arXiv:2510.24709
3
citations
#11138

GG-SSMs: Graph-Generating State Space Models

Nikola Zubic, Davide Scaramuzza

CVPR 2025
3
citations
#11139

Strassen Attention, Split VC Dimension and Compositionality in Transformers

Alexander Kozachinskiy, Felipe Urrutia, Hector Orellana et al.

NEURIPS 2025 • arXiv:2501.19215
3
citations
#11140

DV-Matcher: Deformation-based Non-rigid Point Cloud Matching Guided by Pre-trained Visual Features

Zhangquan Chen, Puhua Jiang, Ruqi Huang

CVPR 2025 • arXiv:2408.08568
3
citations
#11141

Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation

Enshu Liu, Qian Chen, Xuefei Ning et al.

NEURIPS 2025 • arXiv:2510.21003
3
citations
#11142

Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation

Xiaoyu Yue, ZiDong Wang, Yuqing Wang et al.

NEURIPS 2025 • arXiv:2509.15185
3
citations
#11143

LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion

Fangfu Liu, Hao Li, Jiawei Chi et al.

ICCV 2025 • arXiv:2507.02813
3
citations
#11144

DeCafNet: Delegate and Conquer for Efficient Temporal Grounding in Long Videos

Zijia Lu, ASM Iftekhar, Gaurav Mittal et al.

CVPR 2025 • arXiv:2505.16376
3
citations
#11145

SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism

Beitao Chen, Xinyu Lyu, Shengming Yuan et al.

NEURIPS 2025 • arXiv:2507.01513
3
citations
#11146

On Fairness of Unified Multimodal Large Language Model for Image Generation

Ming Liu, Hao Chen, Jindong Wang et al.

NEURIPS 2025 • arXiv:2502.03429
3
citations
#11147

KL Penalty Control via Perturbation for Direct Preference Optimization

Sangkyu Lee, Janghoon Han, Hosung Song et al.

NEURIPS 2025 • arXiv:2502.13177
3
citations
#11148

Uncertainty Weighted Gradients for Model Calibration

Jinxu Lin, Linwei Tao, Minjing Dong et al.

CVPR 2025 • arXiv:2503.22725
3
citations
#11149

Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness

Stephen Pfohl, Natalie Harris, Chirag Nagpal et al.

NEURIPS 2025 • arXiv:2506.04193
3
citations
#11150

Orientation-anchored Hyper-Gaussian for 4D Reconstruction from Casual Videos

Junyi Wu, Jiachen Tao, Haoxuan Wang et al.

NEURIPS 2025 • arXiv:2509.23492
3
citations
#11151

RobotSmith: Generative Robotic Tool Design for Acquisition of Complex Manipulation Skills

Chunru Lin, Haotian Yuan, Yian Wang et al.

NEURIPS 2025 • arXiv:2506.14763
3
citations
#11152

PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction

Eduard Poesina, Adriana Valentina Costache, Adrian-Gabriel Chifu et al.

CVPR 2025 • arXiv:2406.04746
3
citations
#11153

Zero-Shot Trajectory Planning for Signal Temporal Logic Tasks

Ruijia Liu, Ancheng Hou, Xiao Yu et al.

NEURIPS 2025 • oral • arXiv:2501.13457
3
citations
#11154

Noise Diffusion for Enhancing Semantic Faithfulness in Text-to-Image Synthesis

Boming Miao, Chunxiao Li, Xiaoxiao Wang et al.

CVPR 2025 • arXiv:2411.16503
3
citations
#11155

Adaptive Distraction: Probing LLM Contextual Robustness with Automated Tree Search

Yanbo Wang, Zixiang Xu, Yue Huang et al.

NEURIPS 2025 • arXiv:2502.01609
3
citations
#11156

GradMetaNet: An Equivariant Architecture for Learning on Gradients

Yoav Gelberg, Yam Eitan, Aviv Navon et al.

NEURIPS 2025 • arXiv:2507.01649
3
citations
#11157

A Lightweight UDF Learning Framework for 3D Reconstruction Based on Local Shape Functions

Jiangbei Hu, Yanggeng Li, Fei Hou et al.

CVPR 2025 • arXiv:2407.01330
3
citations
#11158

Detection, Pose Estimation and Segmentation for Multiple Bodies: Closing the Virtuous Circle

Miroslav Purkrabek, Jiri Matas

ICCV 2025 • arXiv:2412.01562
3
citations
#11159

Exact and Linear Convergence for Federated Learning under Arbitrary Client Participation is Attainable

Bicheng Ying, Zhe Li, Haibo Yang

NEURIPS 2025 • arXiv:2503.20117
3
citations
#11160

Generalizable Object Re-Identification via Visual In-Context Prompting

Zhizhong Huang, Xiaoming Liu

ICCV 2025 • arXiv:2508.21222
3
citations
#11161

Towards High-fidelity 3D Talking Avatar with Personalized Dynamic Texture

Xuanchen Li, Jianyu Wang, Yuhao Cheng et al.

CVPR 2025 • arXiv:2503.00495
3
citations
#11162

AdaLRS: Loss-Guided Adaptive Learning Rate Search for Efficient Foundation Model Pretraining

Hongyuan Dong, Dingkang Yang, Xiao Liang et al.

NEURIPS 2025 • arXiv:2506.13274
3
citations
#11163

Partial Gromov-Wasserstein Metric

Yikun Bai, Rocio Diaz Martin, Abihith Kothapalli et al.

ICLR 2025 • arXiv:2402.03664
3
citations
#11164

PoseTraj: Pose-Aware Trajectory Control in Video Diffusion

Longbin Ji, Lei Zhong, Pengfei Wei et al.

CVPR 2025 • arXiv:2503.16068
3
citations
#11165

How Different from the Past? Spatio-Temporal Time Series Forecasting with Self-Supervised Deviation Learning

Haotian Gao, Zheng Dong, Jiawei Yong et al.

NEURIPS 2025 • oral • arXiv:2510.04908
3
citations
#11166

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

Zelin Li, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight • arXiv:2503.02261
3
citations
#11167

Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation

Xingguang Zhang, Nicholas M Chimitt, Xijun Wang et al.

CVPR 2025 • highlight • arXiv:2504.02697
3
citations
#11168

Continual Release Moment Estimation with Differential Privacy

Nikita Kalinin, Jalaj Upadhyay, Christoph Lampert

NEURIPS 2025 • arXiv:2502.06597
3
citations
#11169

SemCoT: Accelerating Chain-of-Thought Reasoning through Semantically-Aligned Implicit Tokens

Yinhan He, Wendy Zheng, Yaochen Zhu et al.

NEURIPS 2025 • arXiv:2510.24940
3
citations
#11170

BRACE: A Benchmark for Robust Audio Caption Quality Evaluation

Tianyu Guo, Hongyu Chen, Hao Liang et al.

NEURIPS 2025 • arXiv:2512.10403
3
citations
#11171

Deep learning for continuous-time stochastic control with jumps

Patrick Cheridito, Jean-Loup Dupret, Donatien Hainaut

NEURIPS 2025 • arXiv:2505.15602
3
citations
#11172

Reparameterized LLM Training via Orthogonal Equivalence Transformation

Zeju Qiu, Simon Buchholz, Tim Xiao et al.

NEURIPS 2025 • arXiv:2506.08001
3
citations
#11173

Fast Direct: Query-Efficient Online Black-box Guidance for Diffusion-model Target Generation

Kim Yong Tan, Yueming Lyu, Ivor Tsang et al.

ICLR 2025 • arXiv:2502.01692
3
citations
#11174

Task Descriptors Help Transformers Learn Linear Models In-Context

Ruomin Huang, Rong Ge

ICLR 2025
3
citations
#11175

SocialMOIF: Multi-Order Intention Fusion for Pedestrian Trajectory Prediction

Kai Chen, Xiaodong Zhao, Yujie Huang et al.

CVPR 2025 • arXiv:2504.15616
3
citations
#11176

Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation

Riccardo Corvi, Davide Cozzolino, Ekta Prashnani et al.

NEURIPS 2025 • arXiv:2506.16802
3
citations
#11177

Stochastic Gradients under Nuisances

Facheng Yu, Ronak Mehta, Alex Luedtke et al.

NEURIPS 2025 • arXiv:2508.20326
2
citations
#11178

Triad: Empowering LMM-based Anomaly Detection with Expert-guided Region-of-Interest Tokenizer and Manufacturing Process

Yuanze Li, Shihao Yuan, Haolin Wang et al.

ICCV 2025
2
citations
#11179

SiM3D: Single-instance Multiview Multimodal and Multisetup 3D Anomaly Detection Benchmark

Alex Costanzino, Pierluigi Zama Ramirez, Luigi Lella et al.

ICCV 2025 • arXiv:2506.21549
2
citations
#11180

SUB: Benchmarking CBM Generalization via Synthetic Attribute Substitutions

Jessica Bader, Leander Girrbach, Stephan Alaniz et al.

ICCV 2025 • arXiv:2507.23784
2
citations
#11181

Refer to Any Segmentation Mask Group With Vision-Language Prompts

Shengcao Cao, Zijun Wei, Jason Kuen et al.

ICCV 2025 • arXiv:2506.05342
2
citations
#11182

Decouple to Reconstruct: High Quality UHD Restoration via Active Feature Disentanglement and Reversible Fusion

Yidi Liu, Dong Li, Yuxin Ma et al.

ICCV 2025 • arXiv:2503.12764
2
citations
#11183

Consensus-Driven Active Model Selection

Justin Kay, Grant Horn, Subhransu Maji et al.

ICCV 2025 • highlight • arXiv:2507.23771
2
citations
#11184

OpenAnimals: Revisiting Person Re-Identification for Animals Towards Better Generalization

Saihui Hou, Panjian Huang, Zengbin Wang et al.

ICCV 2025 • arXiv:2410.00204
2
citations
#11185

Backdoor Attacks on Neural Networks via One-Bit Flip

Xiang Li, Lannan Luo, Qiang Zeng

ICCV 2025
2
citations
#11186

FB-Diff: Fourier Basis-guided Diffusion for Temporal Interpolation of 4D Medical Imaging

Xin You, Runze Yang, Chuyan Zhang et al.

ICCV 2025 • arXiv:2507.04547
2
citations
#11187

Describe, Don’t Dictate: Semantic Image Editing with Natural Language Intent

En Ci, Shanyan Guan, Yanhao Ge et al.

ICCV 2025
2
citations
#11188

Enhance Multi-View Classification Through Multi-Scale Alignment and Expanded Boundary

Yuena Lin, Yiyuan Wang, Gengyu Lyu et al.

ICLR 2025
2
citations
#11189

SplArt: Articulation Estimation and Part-Level Reconstruction with 3D Gaussian Splatting

Shengjie Lin, Jiading Fang, Muhammad Zubair Irshad et al.

ICCV 2025 • arXiv:2506.03594
2
citations
#11190

Uncertainty-Driven Expert Control: Enhancing the Reliability of Medical Vision-Language Models

Xiao Liang, Di Wang, Zhicheng Jiao et al.

ICCV 2025 • arXiv:2507.09209
2
citations
#11191

Multimodal Prompt Alignment for Facial Expression Recognition

Fuyan Ma, Yiran He, Bin Sun et al.

ICCV 2025 • arXiv:2506.21017
2
citations
#11192

CABLD: Contrast-Agnostic Brain Landmark Detection with Consistency-Based Regularization

Soorena Salari, Arash Harirpoush, Hassan Rivaz et al.

ICCV 2025 • arXiv:2411.17845
2
citations
#11193

Stable Score Distillation

Haiming Zhu, Yangyang Xu, Chenshu Xu et al.

ICCV 2025 • arXiv:2507.09168
2
citations
#11194

BézierGS: Dynamic Urban Scene Reconstruction with Bézier Curve Gaussian Splatting

Zipei Ma, Junzhe Jiang, Yurui Chen et al.

ICCV 2025 • arXiv:2506.22099
2
citations
#11195

Diffusion Image Prior

Hamadi Chihaoui, Paolo Favaro

ICCV 2025 • arXiv:2503.21410
2
citations
#11196

Text2VDM: Text to Vector Displacement Maps for Expressive and Interactive 3D Sculpting

Hengyu Meng, Duotun Wang, Zhijing Shao et al.

ICCV 2025 • arXiv:2502.20045
2
citations
#11197

Federated Domain Generalization with Domain-specific Soft Prompts Generation

Jianhan Wu, Xiaoyang Qu, Zhangcheng Huang et al.

ICCV 2025 • arXiv:2509.20807
2
citations
#11198

Selective Contrastive Learning for Weakly Supervised Affordance Grounding

WonJun Moon, Hyun Seok Seong, Jae-Pil Heo

ICCV 2025 • arXiv:2508.07877
2
citations
#11199

CoST: Efficient Collaborative Perception From Unified Spatiotemporal Perspective

Zongheng Tang, Yi Liu, Yifan Sun et al.

ICCV 2025 • highlight • arXiv:2508.00359
2
citations
#11200

UniConvNet: Expanding Effective Receptive Field while Maintaining Asymptotically Gaussian Distribution for ConvNets of Any Scale

Yuhao Wang, Wei Xi

ICCV 2025 • arXiv:2508.09000
2
citations