Most Cited 2025 "hierarchical joint embedding" Papers

22,274 papers found • Page 51 of 112

#10001

ReferEverything: Towards Segmenting Everything We Can Speak of in Videos

Anurag Bagchi, Zhipeng Bao, Yu-Xiong Wang et al.

ICCV 2025 • arXiv:2410.23287

#10002

From Sequence to Structure: Uncovering Substructure Reasoning in Transformers

Xinnan Dai, Kai Yang, Jay Revolinsky et al.

NEURIPS 2025 • arXiv:2507.10435

#10003

Set Smoothness Unlocks Clarke Hyper-stationarity in Bilevel Optimization

He Chen, Jiajin Li, Anthony Man-Cho So

NEURIPS 2025 • spotlight • arXiv:2506.04587

#10004

SemAlign3D: Semantic Correspondence between RGB-Images through Aligning 3D Object-Class Representations

Krispin Wandel, Hesheng Wang

CVPR 2025 • arXiv:2503.22462

#10005

Towards Efficient Foundation Model for Zero-shot Amodal Segmentation

Zhaochen Liu, Limeng Qiao, Xiangxiang Chu et al.

CVPR 2025

#10006

Uncertainty-Aware Gradient Stabilization for Small Object Detection

Huixin Sun, Yanjing Li, Linlin Yang et al.

ICCV 2025 • arXiv:2303.01803

#10007

VoteSplat: Hough Voting Gaussian Splatting for 3D Scene Understanding

Minchao Jiang, Shunyu Jia, Jiaming Gu et al.

ICCV 2025 • arXiv:2506.22799

#10008

MoESD: Unveil Speculative Decoding's Potential for Accelerating Sparse MoE

Zongle Huang, Lei Zhu, ZongYuan Zhan et al.

NEURIPS 2025 • spotlight • arXiv:2505.19645

#10009

FlySearch: Exploring how vision-language models explore

Adam Pardyl, Dominik Matuszek, Mateusz Przebieracz et al.

NEURIPS 2025 • arXiv:2506.02896

#10010

Parallelizing MCMC Across the Sequence Length

David Zoltowski, Skyler Wu, Xavier Gonzalez et al.

NEURIPS 2025 • arXiv:2508.18413

#10011

Turbocharging Gaussian Process Inference with Approximate Sketch-and-Project

Pratik Rathore, Zachary Frangella, Sachin Garg et al.

NEURIPS 2025 • arXiv:2505.13723

#10012

Diffusion Classifiers Understand Compositionality, but Conditions Apply

Yujin Jeong, Arnas Uselis, Seong Joon Oh et al.

NEURIPS 2025 • arXiv:2505.17955

#10013

ScenePainter: Semantically Consistent Perpetual 3D Scene Generation with Concept Relation Alignment

Chong Xia, Shengjun Zhang, Fangfu Liu et al.

ICCV 2025 • arXiv:2507.19058

#10014

CoLMDriver: LLM-based Negotiation Benefits Cooperative Autonomous Driving

Changxing Liu, Genjia Liu, Zijun Wang et al.

ICCV 2025 • arXiv:2503.08683

#10015

Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction

Haonan Wang, Qixiang ZHANG, Lehan Wang et al.

ICCV 2025 • arXiv:2503.11167

#10016

MMMG: A Massive, Multidisciplinary, Multi-Tier Generation Benchmark for Text-to-Image Reasoning

Yuxuan Luo, Ryan Yuan, Junwen Chen et al.

NEURIPS 2025 • arXiv:2506.10963

#10017

I Am Big, You Are Little; I Am Right, You Are Wrong

David A Kelly, Akchunya Chanchal, Nathan Blake

ICCV 2025 • arXiv:2507.23509

#10018

Learning (Approximately) Equivariant Networks via Constrained Optimization

Andrei Manolache, Luiz Chamon, Mathias Niepert

NEURIPS 2025 • oral • arXiv:2505.13631

#10019

Learning to Normalize on the SPD Manifold under Bures-Wasserstein Geometry

Rui Wang, Shaocheng Jin, Ziheng Chen et al.

CVPR 2025 • arXiv:2504.00660

#10020

Sensitivity-Aware Efficient Fine-Tuning via Compact Dynamic-Rank Adaptation

Tianran Chen, Jiarui Chen, Baoquan Zhang et al.

CVPR 2025

#10021

Graph-KV: Breaking Sequence via Injecting Structural Biases into Large Language Models

Haoyu Wang, Peihao Wang, Mufei Li et al.

NEURIPS 2025 • arXiv:2506.07334

#10022

World-aware Planning Narratives Enhance Large Vision-Language Model Planner

Junhao Shi, Zhaoye Fei, Siyin Wang et al.

NEURIPS 2025 • arXiv:2506.21230

#10023

DiffDoctor: Diagnosing Image Diffusion Models Before Treating

Yiyang Wang, Xi Chen, Xiaogang Xu et al.

ICCV 2025 • arXiv:2501.12382

#10024

Understanding challenges to the interpretation of disaggregated evaluations of algorithmic fairness

Stephen Pfohl, Natalie Harris, Chirag Nagpal et al.

NEURIPS 2025 • arXiv:2506.04193

#10025

Less Attention is More: Prompt Transformer for Generalized Category Discovery

Wei Zhang, Baopeng Zhang, Zhu Teng et al.

CVPR 2025

#10026

Fine-Grained Image-Text Correspondence with Cost Aggregation for Open-Vocabulary Part Segmentation

Jiho Choi, Seonho Lee, Minhyun Lee et al.

CVPR 2025 • arXiv:2501.09688

#10027

SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal

Xinrui Wang, Lanqing Guo, Xiyu Wang et al.

CVPR 2025 • arXiv:2409.07041

#10028

DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation

Kefei Zhu, Fengshuo Bai, YuanHao Xiang et al.

NEURIPS 2025 • spotlight • arXiv:2509.23829

#10029

Towards General Modality Translation with Contrastive and Predictive Latent Diffusion Bridge

Nimrod Berman, Omkar Joglekar, Eitan Kosman et al.

NEURIPS 2025 • arXiv:2510.20819

#10030

Discrete Diffusion Models: Novel Analysis and New Sampler Guarantees

Yuchen Liang, Yingbin Liang, Lifeng LAI et al.

NEURIPS 2025 • arXiv:2509.16756

#10031

Taming generative video models for zero-shot optical flow extraction

Seungwoo Kim, Khai Loong Aw, Klemen Kotar et al.

NEURIPS 2025 • oral • arXiv:2507.09082

#10032

OmniCast: A Masked Latent Diffusion Model for Weather Forecasting Across Time Scales

Tung Nguyen, Tuan Pham, Troy Arcomano et al.

NEURIPS 2025 • arXiv:2510.18707

#10033

FROSS: Faster-Than-Real-Time Online 3D Semantic Scene Graph Generation from RGB-D Images

Hao-Yu Hou, Chun-Yi Lee, Motoharu Sonogashira et al.

ICCV 2025 • arXiv:2507.19993

#10034

Adaptive Batch-Wise Sample Scheduling for Direct Preference Optimization

Zixuan Huang, Yikun Ban, Lean Fu et al.

NEURIPS 2025 • arXiv:2506.17252

#10035

Polar Sparsity: High Throughput Batched LLM Inferencing with Scalable Contextual Sparsity

Susav Shrestha, Bradley Settlemyer, Nikoli Dryden et al.

NEURIPS 2025 • arXiv:2505.14884

#10036

Synthesizing Near-Boundary OOD Samples for Out-of-Distribution Detection

Jinglun Li, Kaixun Jiang, Zhaoyu Chen et al.

ICCV 2025 • highlight • arXiv:2507.10225

#10037

Which Viewpoint Shows it Best? Language for Weakly Supervising View Selection in Multi-view Instructional Videos

Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.

CVPR 2025 • highlight • arXiv:2411.08753

#10038

Reproducible Vision-Language Models Meet Concepts Out of Pre-Training

Ziliang Chen, Xin Huang, Xiaoxuan Fan et al.

CVPR 2025

#10039

3D Equivariant Visuomotor Policy Learning via Spherical Projection

Boce Hu, Dian Wang, David Klee et al.

NEURIPS 2025 • spotlight • arXiv:2505.16969

#10040

Visual Prompting for One-shot Controllable Video Editing without Inversion

Zhengbo Zhang, Yuxi Zhou, DUO PENG et al.

CVPR 2025 • arXiv:2504.14335

#10041

Position: AI Should Sense Better, Not Just Scale Bigger: Adaptive Sensing as a Paradigm Shift

Eunsu Baek, Keondo Park, Jeonggil Ko et al.

NEURIPS 2025

#10042

E-BATS: Efficient Backpropagation-Free Test-Time Adaptation for Speech Foundation Models

Jiaheng Dong, Hong Jia, Soumyajit Chatterjee et al.

NEURIPS 2025 • arXiv:2506.07078

#10043

LogoSP: Local-global Grouping of Superpoints for Unsupervised Semantic Segmentation of 3D Point Clouds

Zihui Zhang, Weisheng Dai, Hongtao Wen et al.

CVPR 2025 • arXiv:2506.07857

#10044

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang et al.

CVPR 2025 • arXiv:2407.18914

#10045

Long-term Traffic Simulation with Interleaved Autoregressive Motion and Scenario Generation

Xiuyu Yang, Shuhan Tan, Philipp Kraehenbuehl

ICCV 2025 • arXiv:2506.17213

#10046

BrepGiff: Lightweight Generation of Complex B-rep with 3D GAT Diffusion

Hao Guo, Xiaoshui Huang, Hao jiacheng et al.

CVPR 2025

#10047

Improving the Euclidean Diffusion Generation of Manifold Data by Mitigating Score Function Singularity

Zichen Liu, Wei Zhang, Tiejun Li

NEURIPS 2025 • arXiv:2505.09922

#10048

SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World

Chen Chen, Zhirui Wang, Taowei Sheng et al.

ICCV 2025 • arXiv:2503.16399

#10049

Learning Phase Distortion with Selective State Space Models for Video Turbulence Mitigation

Xingguang Zhang, Nicholas M Chimitt, Xijun Wang et al.

CVPR 2025 • highlight • arXiv:2504.02697

#10050

AdaDrive: Self-Adaptive Slow-Fast System for Language-Grounded Autonomous Driving

Ruifei Zhang, Junlin Xie, Wei Zhang et al.

ICCV 2025 • arXiv:2511.06253

#10051

Align Your Rhythm: Generating Highly Aligned Dance Poses with Gating-Enhanced Rhythm-Aware Feature Representation

Congyi Fan, Jian Guan, Xuanjia Zhao et al.

ICCV 2025 • arXiv:2503.17340

#10052

FOCUS: Internal MLLM Representations for Efficient Fine-Grained Visual Question Answering

Liangyu Zhong, Fabio Philipp Rosenthal, Joachim Sicking et al.

NEURIPS 2025 • arXiv:2506.21710

#10053

Bridge the Gap: From Weak to Full Supervision for Temporal Action Localization with PseudoFormer

Ziyi Liu, Yangcen Liu

CVPR 2025 • arXiv:2504.14860

#10054

Cascaded Language Models for Cost-Effective Human–AI Decision-Making

Claudio Fanconi, Mihaela van der Schaar

NEURIPS 2025 • arXiv:2506.11887

#10055

Low Rank Gradients and Where to Find Them

Rishi Sonthalia, Michael Murray, Guido Montufar

NEURIPS 2025 • arXiv:2510.01303

#10056

CReFT-CAD: Boosting Orthographic Projection Reasoning for CAD via Reinforcement Fine-Tuning

Ke Niu, Zhuofan Chen, Haiyang Yu et al.

NEURIPS 2025 • arXiv:2506.00568

#10057

Aligning Transformers with Continuous Feedback via Energy Rank Alignment

Shriram Chennakesavalu, Frank Hu, Sebastian Ibarraran et al.

NEURIPS 2025 • arXiv:2405.12961

#10058

Positive2Negative: Breaking the Information-Lossy Barrier in Self-Supervised Single Image Denoising

Tong Li, Lizhi Wang, Zhiyuan Xu et al.

CVPR 2025 • arXiv:2412.16460

#10059

Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval

Mankeerat Sidhu, Hetarth Chopra, Ansel Blume et al.

CVPR 2025 • arXiv:2409.18733

#10060

ArcPro: Architectural Programs for Structured 3D Abstraction of Sparse Points

Qirui Huang, Runze Zhang, Kangjun Liu et al.

CVPR 2025 • highlight • arXiv:2503.02745

#10061

On the Generalization of Handwritten Text Recognition Models

Carlos Garrido-Munoz, Jorge Calvo-Zaragoza

CVPR 2025 • arXiv:2411.17332

#10062

ShapeEmbed: a self-supervised learning framework for 2D contour quantification

Anna Foix-Romero, Craig Russell, Alexander Krull et al.

NEURIPS 2025 • arXiv:2507.01009

#10063

FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Shuai Liu, Mingyue Cui, Boyang Li et al.

CVPR 2025 • arXiv:2506.03714

#10064

FilmComposer: LLM-Driven Music Production for Silent Film Clips

Zhifeng Xie, Qile He, Youjia Zhu et al.

CVPR 2025 • arXiv:2503.08147

#10065

OmniSplat: Taming Feed-Forward 3D Gaussian Splatting for Omnidirectional Images with Editable Capabilities

Suyoung Lee, JAEYOUNG CHUNG, Kihoon Kim et al.

CVPR 2025 • highlight • arXiv:2412.16604

#10066

VQToken: Neural Discrete Token Representation Learning for Extreme Token Reduction in Video Large Language Models

Haichao Zhang, Yun Fu

NEURIPS 2025 • oral • arXiv:2503.16980

#10067

Exploiting Diffusion Prior for Task-driven Image Restoration

Jaeha Kim, Junghun Oh, Kyoung Mu Lee

ICCV 2025 • arXiv:2507.22459

#10068

Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving

Zixian Guo, Ming Liu, Qilong Wang et al.

ICCV 2025

#10069

Traversing Distortion-Perception Tradeoff using a Single Score-Based Generative Model

Yuhan Wang, Suzhi Bi, Ying-Jun Angela Zhang et al.

CVPR 2025 • arXiv:2503.20297

#10070

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge

Yaqi Zhao, Yuanyang Yin, Lin Li et al.

CVPR 2025 • arXiv:2411.16824

#10071

Seeing in the Dark: Benchmarking Egocentric 3D Vision with the Oxford Day-and-Night Dataset

Zirui Wang, Wenjing Bian, Xinghui Li et al.

NEURIPS 2025 • arXiv:2506.04224

#10072

LLaFEA: Frame-Event Complementary Fusion for Fine-Grained Spatiotemporal Understanding in LMMs

Hanyu Zhou, Gim Hee Lee

ICCV 2025 • arXiv:2503.06934

#10073

Unlocking hidden biomolecular conformational landscapes in diffusion models at inference time

Daniel D. Richman, Jessica Karaguesian, Carl-Mikael Suomivuori et al.

NEURIPS 2025 • spotlight • arXiv:2512.03312

#10074

Tiled Diffusion

Or Madar, Ohad Fried

CVPR 2025 • arXiv:2412.15185

#10075

seq-JEPA: Autoregressive Predictive Learning of Invariant-Equivariant World Models

Hafez Ghaemi, Eilif B. Muller, Shahab Bakhtiari

NEURIPS 2025 • arXiv:2505.03176

#10076

Differentiable Inverse Rendering with Interpretable Basis BRDFs

Hoon-Gyu Chung, Seokjun Choi, Seung-Hwan Baek

CVPR 2025 • arXiv:2411.17994

#10077

R-KV: Redundancy-aware KV Cache Compression for Reasoning Models

Zefan Cai, Wen Xiao, Hanshi Sun et al.

NEURIPS 2025 • arXiv:2505.24133

#10078

Discretized Gaussian Representation for Tomographic Reconstruction

Shaokai Wu, Yuxiang Lu, Yapan Guo et al.

ICCV 2025 • arXiv:2411.04844

#10079

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

ZELIN LI, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight • arXiv:2503.02261

#10080

Dark-ISP: Enhancing RAW Image Processing for Low-Light Object Detection

Jiasheng Guo, Xin Gao, Yuxiang Yan et al.

ICCV 2025 • arXiv:2509.09183

#10081

DriveScape: High-Resolution Driving Video Generation by Multi-View Feature Fusion

Wei Wu, Xi Guo, Weixuan TANG et al.

CVPR 2025

#10082

XIFBench: Evaluating Large Language Models on Multilingual Instruction Following

Zhenyu Li, Kehai Chen, Yunfei Long et al.

NEURIPS 2025 • arXiv:2503.07539

#10083

Asymptotic Theory of Geometric and Adaptive $k$-Means Clustering

Adam Quinn Jaffe

NEURIPS 2025 • arXiv:2202.13423

#10084

Memory-Efficient 4-bit Preconditioned Stochastic Optimization

Jingyang Li, Kuangyu Ding, Kim-chuan Toh et al.

ICCV 2025 • arXiv:2412.10663

#10085

BATCLIP: Bimodal Online Test-Time Adaptation for CLIP

Sarthak Kumar Maharana, Baoming Zhang, Leonid Karlinsky et al.

ICCV 2025 • arXiv:2412.02837

#10086

HumanSAM: Classifying Human-centric Forgery Videos in Human Spatial, Appearance, and Motion Anomaly

Chang Liu, Yunfan Ye, Fan Zhang et al.

ICCV 2025 • arXiv:2507.19924

#10087

PoseTraj: Pose-Aware Trajectory Control in Video Diffusion

longbin ji, Lei Zhong, Pengfei Wei et al.

CVPR 2025 • arXiv:2503.16068

#10088

ICPC-Eval: Probing the Frontiers of LLM Reasoning with Competitive Programming Contests

Shiyi Xu, Hu Yiwen, Yingqian Min et al.

NEURIPS 2025 • arXiv:2506.04894

#10089

Vivid4D: Improving 4D Reconstruction from Monocular Video by Video Inpainting

Jiaxin Huang, Sheng Miao, Bangbang Yang et al.

ICCV 2025 • arXiv:2504.11092

#10090

VideoAds for Fast-Paced Video Understanding

Zheyuan Zhang, Wanying Dou, Linkai Peng et al.

ICCV 2025 • arXiv:2504.09282

#10091

RLZero: Direct Policy Inference from Language Without In-Domain Supervision

Harshit Sushil Sikchi, Siddhant Agarwal, Pranaya Jajoo et al.

NEURIPS 2025 • arXiv:2412.05718

#10092

Strassen Attention, Split VC Dimension and Compositionality in Transformers

Alexander Kozachinskiy, Felipe Urrutia, Hector Orellana et al.

NEURIPS 2025 • arXiv:2501.19215

#10093

Towards a Golden Classifier-Free Guidance Path via Foresight Fixed Point Iterations

Kaibo Wang, Jianda Mao, Tong Wu et al.

NEURIPS 2025 • spotlight • arXiv:2510.21512

#10094

Balanced Rate-Distortion Optimization in Learned Image Compression

Yichi Zhang, Zhihao Duan, Yuning Huang et al.

CVPR 2025 • highlight • arXiv:2502.20161

#10095

More of the Same: Persistent Representational Harms Under Increased Representation

Jennifer Mickel, Maria De-Arteaga, Liu Leqi et al.

NEURIPS 2025 • arXiv:2503.00333

#10096

Improve Representation for Imbalanced Regression through Geometric Constraints

Zijian Dong, Yilei Wu, Chongyao Chen et al.

CVPR 2025 • arXiv:2503.00876

#10097

JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models

Jiaxin Song, Yixu Wang, Jie Li et al.

NEURIPS 2025 • arXiv:2505.19610

#10098

Leveraging SD Map to Augment HD Map-based Trajectory Prediction

Zhiwei Dong, Ran Ding, Wei Li et al.

CVPR 2025

#10099

VODiff: Controlling Object Visibility Order in Text-to-Image Generation

Dong Liang, Jinyuan Jia, Yuhao Liu et al.

CVPR 2025

#10100

DiMPLe - Disentangled Multi-Modal Prompt Learning: Enhancing Out-Of-Distribution Alignment with Invariant and Spurious Feature Separation

Umaima Rahman, Mohammad Yaqub, Dwarikanath Mahapatra

ICCV 2025 • arXiv:2506.21237

#10101

TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine

Jiacheng Xie, Yang Yu, Ziyang Zhang et al.

NEURIPS 2025 • arXiv:2505.24063

#10102

Ask a Strong LLM Judge when Your Reward Model is Uncertain

Zhenghao Xu, Qin Lu, Qingru Zhang et al.

NEURIPS 2025 • arXiv:2510.20369

#10103

SAMA: Towards Multi-Turn Referential Grounded Video Chat with Large Language Models

Ye Sun, Hao Zhang, Henghui Ding et al.

NEURIPS 2025 • oral • arXiv:2505.18812

#10104

FG-OrIU: Towards Better Forgetting via Feature-Gradient Orthogonality for Incremental Unlearning

qian feng, Jiahang Tu, Mintong Kang et al.

ICCV 2025 • arXiv:2601.13578

#10105

ICP: Immediate Compensation Pruning for Mid-to-high Sparsity

Xin Luo, Fu Xueming, Zihang Jiang et al.

CVPR 2025 • highlight

#10106

Sufficient Invariant Learning for Distribution Shift

Taero Kim, Subeen Park, Sungjun Lim et al.

CVPR 2025 • arXiv:2210.13533

#10107

Inpaint4Drag: Repurposing Inpainting Models for Drag-Based Image Editing via Bidirectional Warping

Jingyi Lu, Kai Han

ICCV 2025 • arXiv:2509.04582

#10108

Player-Centric Multimodal Prompt Generation for Large Language Model Based Identity-Aware Basketball Video Captioning

Zeyu Xi, Haoying Sun, Yaofei Wu et al.

ICCV 2025 • arXiv:2507.20163

#10109

ExCap3D: Expressive 3D Scene Understanding via Object Captioning with Varying Detail

Chandan Yeshwanth, David Rozenberszki, Angela Dai

ICCV 2025 • arXiv:2503.17044

#10110

AeSPa : Attention-guided Self-supervised Parallel Imaging for MRI Reconstruction

Jinho Joo, Hyeseong Kim, Hyeyeon Won et al.

CVPR 2025

#10111

Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering

Yuanhao Zou, Zhaozheng Yin

CVPR 2025 • arXiv:2510.08791

#10112

4DGCPro: Efficient Hierarchical 4D Gaussian Compression for Progressive Volumetric Video Streaming

Zihan Zheng, Zhenlong Wu, Houqiang Zhong et al.

NEURIPS 2025 • oral • arXiv:2509.17513

#10113

Multi-modal contrastive learning adapts to intrinsic dimensions of shared latent variables

Yu Gui, Cong Ma, Zongming Ma

NEURIPS 2025 • arXiv:2505.12473

#10114

Cognitive Mirrors: Exploring the Diverse Functional Roles of Attention Heads in LLM Reasoning

Xueqi Ma, Jun Wang, Yanbei Jiang et al.

NEURIPS 2025 • arXiv:2512.10978

#10115

Your Scale Factors are My Weapon: Targeted Bit-Flip Attacks on Vision Transformers via Scale Factor Manipulation

Jialai Wang, Yuxiao Wu, Weiye Xu et al.

CVPR 2025

#10116

Acquire and then Adapt: Squeezing out Text-to-Image Model for Image Restoration

Junyuan Deng, Xinyi Wu, Yongxing Yang et al.

CVPR 2025 • arXiv:2504.15159

#10117

From Panels to Prose: Generating Literary Narratives from Comics

Ragav Sachdeva, Andrew Zisserman

ICCV 2025 • arXiv:2503.23344

#10118

HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars

Byungjun Kim, Shunsuke Saito, Giljoo Nam et al.

ICCV 2025 • arXiv:2507.19481

#10119

Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations

Xunzhi Zheng, Dan Xu

CVPR 2025 • arXiv:2503.10464

#10120

Personalized Safety in LLMs: A Benchmark and A Planning-Based Agent Approach

Yuchen Wu, Edward Sun, Kaijie Zhu et al.

NEURIPS 2025 • oral • arXiv:2505.18882

#10121

Wavelet and Prototype Augmented Query-based Transformer for Pixel-level Surface Defect Detection

Feng Yan, Xiaoheng Jiang, Yang Lu et al.

CVPR 2025

#10122

Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes

Kaiwei Zhang, Dandan Zhu, Xiongkuo Min et al.

CVPR 2025 • arXiv:2504.01466

#10123

Effortless Active Labeling for Long-Term Test-Time Adaptation

Guowei Wang, Changxing Ding

CVPR 2025 • arXiv:2503.14564

#10124

FREE-Merging: Fourier Transform for Efficient Model Merging

Shenghe Zheng, Hongzhi Wang

ICCV 2025 • arXiv:2411.16815

#10125

Amodal Depth Anything: Amodal Depth Estimation in the Wild

Zhenyu Li, Mykola Lavreniuk, Jian Shi et al.

ICCV 2025 • arXiv:2412.02336

#10126

Martian World Model: Controllable Video Synthesis with Physically Accurate 3D Reconstructions

Longfei Li, Zhiwen Fan, Wenyan Cong et al.

NEURIPS 2025 • arXiv:2507.07978

#10127

Visual Modality Prompt for Adapting Vision-Language Object Detectors

Heitor Rapela Medeiros, Atif Belal, Srikanth Muralidharan et al.

ICCV 2025 • arXiv:2412.00622

#10128

CAP-Net: A Unified Network for 6D Pose and Size Estimation of Categorical Articulated Parts from a Single RGB-D Image

Jingshun Huang, Haitao Lin, Tianyu Wang et al.

CVPR 2025 • highlight • arXiv:2504.11230

#10129

Colors See Colors Ignore: Clothes Changing ReID with Color Disentanglement

Priyank Pathak, Yogesh Rawat

ICCV 2025 • arXiv:2507.07230

#10130

OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography

Li Caoshuo, Zengmao Ding, Xiaobin Hu et al.

ICCV 2025 • arXiv:2506.21101

#10131

Transformed Low-rank Adaptation via Tensor Decomposition and Its Applications to Text-to-image Models

Zerui Tao, Yuhta Takida, Naoki Murata et al.

ICCV 2025 • arXiv:2501.08727

#10132

Common Task Framework For a Critical Evaluation of Scientific Machine Learning Algorithms

Philippe Wyder, Judah Goldfeder, Alexey Yermakov et al.

NEURIPS 2025 • arXiv:2510.23166

#10133

COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation

Siqi Zhang, Yanyuan Qiao, Qunbo Wang et al.

ICCV 2025 • arXiv:2503.24065

#10134

Tracing Back the Malicious Clients in Poisoning Attacks to Federated Learning

Yuqi Jia, Minghong Fang, Hongbin Liu et al.

NEURIPS 2025 • arXiv:2407.07221

#10135

Splat-LOAM: Gaussian Splatting LiDAR Odometry and Mapping

Emanuele Giacomini, Luca Di Giammarino, Lorenzo De Rebotti et al.

ICCV 2025 • arXiv:2503.17491

#10136

Decoder Gradient Shield: Provable and High-Fidelity Prevention of Gradient-Based Box-Free Watermark Removal

Haonan An, Guang Hua, Zhengru Fang et al.

CVPR 2025 • arXiv:2502.20924

#10137

Associative Transformer

Yuwei Sun, Hideya Ochiai, Zhirong Wu et al.

CVPR 2025 • arXiv:2309.12862

#10138

From Head to Tail: Efficient Black-box Model Inversion Attack via Long-tailed Learning

Ziang Li, Hongguang Zhang, Juan Wang et al.

CVPR 2025 • arXiv:2503.16266

#10139

Multi-Scale Neighborhood Occupancy Masked Autoencoder for Self-Supervised Learning in LiDAR Point Clouds

Mohamed Abdelsamad, Michael Ulrich, Claudius Glaeser et al.

CVPR 2025 • arXiv:2502.20316

#10140

SparseDiT: Token Sparsification for Efficient Diffusion Transformer

Shuning Chang, Pichao WANG, Jiasheng Tang et al.

NEURIPS 2025 • oral • arXiv:2412.06028

#10141

Unified Algorithms for RL with Decision-Estimation Coefficients: PAC, Reward-Free, Preference-Based Learning, and Beyond

Fan Chen, Song Mei, Yu Bai

NEURIPS 2025 • arXiv:2209.11745

#10142

AION-1: Omnimodal Foundation Model for Astronomical Sciences

Liam Parker, Francois Lanusse, Jeff Shen et al.

NEURIPS 2025 • arXiv:2510.17960

#10143

AutoJudge: Judge Decoding Without Manual Annotation

Roman Garipov, Fedor Velikonivtsev, Ivan Ermakov et al.

NEURIPS 2025 • arXiv:2504.20039

#10144

Can't Slow Me Down: Learning Robust and Hardware-Adaptive Object Detectors against Latency Attacks for Edge Devices

Tianyi Wang, Zichen Wang, Cong Wang et al.

CVPR 2025 • arXiv:2412.02171

#10145

Joint Diffusion Models in Continual Learning

Paweł Skierś, Kamil Deja

ICCV 2025 • arXiv:2411.08224

#10146

Semantic Causality-Aware Vision-Based 3D Occupancy Prediction

Dubing Chen, Huan Zheng, Yucheng Zhou et al.

ICCV 2025 • arXiv:2509.08388

#10147

BoltzNCE: Learning likelihoods for Boltzmann Generation with Stochastic Interpolants and Noise Contrastive Estimation

Rishal Aggarwal, Jacky Chen, Nicholas Boffi et al.

NEURIPS 2025 • arXiv:2507.00846

#10148

Image Referenced Sketch Colorization Based on Animation Creation Workflow

Dingkun Yan, Xinrui Wang, Zhuoru Li et al.

CVPR 2025 • arXiv:2502.19937

#10149

TailedCore: Few-Shot Sampling for Unsupervised Long-Tail Noisy Anomaly Detection

Yoon Gyo Jung, Jaewoo Park, Jaeho Yoon et al.

CVPR 2025 • arXiv:2504.02775

#10150

GPAvatar: High-fidelity Head Avatars by Learning Efficient Gaussian Projections

Weiqi Feng, Dong Han, Zekang Zhou et al.

CVPR 2025

#10151

Meta-Learning Objectives for Preference Optimization

Carlo Alfano, Silvia Sapora, Jakob Foerster et al.

NEURIPS 2025 • arXiv:2411.06568

#10152

STRAP: Spatio-Temporal Pattern Retrieval for Out-of-Distribution Generalization

Haoyu Zhang, Wentao Zhang, Hao Miao et al.

NEURIPS 2025 • oral • arXiv:2505.19547

#10153

Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning

Yurun Yuan, Fan Chen, Zeyu Jia et al.

NEURIPS 2025 • arXiv:2505.15311

#10154

Gyro-based Neural Single Image Deblurring

Heemin Yang, Jaesung Rim, Seungyong Lee et al.

CVPR 2025 • arXiv:2404.00916

#10155

Composing Global Solutions to Reasoning Tasks via Algebraic Objects in Neural Nets

Yuandong Tian

NEURIPS 2025 • arXiv:2410.01779

#10156

TalkCuts: A Large-Scale Dataset for Multi-Shot Human Speech Video Generation

Jiaben Chen, Zixin Wang, AILING ZENG et al.

NEURIPS 2025 • arXiv:2510.07249

#10157

RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction

Yufeng Zhong, Chengjian Feng, Feng yan et al.

ICCV 2025 • arXiv:2503.18525

#10158

Learning Class Prototypes for Unified Sparse-Supervised 3D Object Detection

Yun Zhu, Le Hui, Hang Yang et al.

CVPR 2025 • highlight • arXiv:2503.21099

#10159

SkyLadder: Better and Faster Pretraining via Context Window Scheduling

Tongyao Zhu, Qian Liu, Haonan Wang et al.

NEURIPS 2025 • arXiv:2503.15450

#10160

CuMPerLay: Learning Cubical Multiparameter Persistence Vectorizations

Caner Korkmaz, Brighton Nuwagira, Baris Coskunuzer et al.

ICCV 2025 • arXiv:2510.12795

#10161

Bidirectional Likelihood Estimation with Multi-Modal Large Language Models for Text-Video Retrieval

Dohwan Ko, Ji Soo Lee, Minhyuk Choi et al.

ICCV 2025 • highlight • arXiv:2507.23284

#10162

Wav2Sem: Plug-and-Play Audio Semantic Decoupling for 3D Speech-Driven Facial Animation

Hao Li, Ju Dai, Xin Zhao et al.

CVPR 2025 • arXiv:2505.23290

#10163

UniPhy: Learning a Unified Constitutive Model for Inverse Physics Simulation

Himangi Mittal, Peiye Zhuang, Hsin-Ying Lee et al.

CVPR 2025 • arXiv:2505.16971

#10164

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy

Yiting Yang, Hao Luo, Yuan Sun et al.

ICCV 2025 • arXiv:2507.13260

#10165

Value Gradient Guidance for Flow Matching Alignment

Zhen Liu, Tim Xiao, Carles Domingo i Enrich et al.

NEURIPS 2025 • arXiv:2512.05116

#10166

Vision-Language Embodiment for Monocular Depth Estimation

Jinchang Zhang, Guoyu Lu

CVPR 2025 • arXiv:2503.16535

#10167

Reinforced Context Order Recovery for Adaptive Reasoning and Planning

Long Ma, Fangwei Zhong, Yizhou Wang

NEURIPS 2025 • arXiv:2508.13070

#10168

ImViD: Immersive Volumetric Videos for Enhanced VR Engagement

Zhengxian Yang, Shi Pan, Shengqi Wang et al.

CVPR 2025 • highlight • arXiv:2503.14359

#10169

Seeing What Matters: Generalizable AI-generated Video Detection with Forensic-Oriented Augmentation

Riccardo Corvi, Davide Cozzolino, Ekta Prashnani et al.

NEURIPS 2025 • arXiv:2506.16802

#10170

Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image

Jerred Chen, Ronald Clark

ICCV 2025 • arXiv:2503.17358

#10171

Task Vector Quantization for Memory-Efficient Model Merging

Youngeun Kim, Seunghwan Lee, Aecheon Jung et al.

ICCV 2025 • arXiv:2503.06921

#10172

Eyes Wide Open: Ego Proactive Video-LLM for Streaming Video

Xueyang Yu, Cheng Shi, Yang Wang et al.

NEURIPS 2025 • arXiv:2510.14560

#10173

Simpler Diffusion: 1.5 FID on ImageNet512 with Pixel-space Diffusion

Emiel Hoogeboom, Thomas Mensink, Jonathan Heek et al.

CVPR 2025

#10174

OmniVTON: Training-Free Universal Virtual Try-On

Zhaotong Yang, Yuhui Li, Shengfeng He et al.

ICCV 2025 • arXiv:2507.15037

#10175

Anti-Aliased 2D Gaussian Splatting

Mae Younes, Adnane Boukhayma

NEURIPS 2025 • arXiv:2506.11252

#10176

EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions

Xiaorui Wu, Fei Li, Xiaofeng Mao et al.

NEURIPS 2025 • arXiv:2505.23473

#10177

Reanimating Images using Neural Representations of Dynamic Stimuli

Jacob Yeung, Andrew Luo, Gabriel Sarch et al.

CVPR 2025 • arXiv:2406.02659

#10178

Probably Approximately Precision and Recall Learning

Lee Cohen, Yishay Mansour, Shay Moran et al.

NEURIPS 2025 • arXiv:2411.13029

#10179

Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion

Aleksandar Jevtić, Christoph Reich, Felix Wimbauer et al.

ICCV 2025 • arXiv:2507.06230

#10180

Auto-Connect: Connectivity-Preserving RigFormer with Direct Preference Optimization

jingfeng Guo, Jian Liu, Jinnan Chen et al.

NEURIPS 2025 • arXiv:2506.11430

#10181

HumanMM: Global Human Motion Recovery from Multi-shot Videos

Yuhong Zhang, Guanlin Wu, Ling-Hao Chen et al.

CVPR 2025 • arXiv:2503.07597

#10182

PI-HMR: Towards Robust In-bed Temporal Human Shape Reconstruction with Contact Pressure Sensing

Ziyu Wu, Yufan Xiong, Mengting Niu et al.

CVPR 2025 • arXiv:2503.00068

#10183

KL Penalty Control via Perturbation for Direct Preference Optimization

Sangkyu Lee, Janghoon Han, Hosung Song et al.

NEURIPS 2025 • arXiv:2502.13177

#10184

BMMR: A Large-Scale Bilingual Multimodal Multi-Discipline Reasoning Dataset

Zhiheng Xi, Guanyu Li, Yutao Fan et al.

NEURIPS 2025 • arXiv:2507.03483

#10185

Generalized Linear Bandits: Almost Optimal Regret with One-Pass Update

Yu-Jie Zhang, Sheng-An Xu, Peng Zhao et al.

NEURIPS 2025 • arXiv:2507.11847

#10186

TokensGen: Harnessing Condensed Tokens for Long Video Generation

Wenqi Ouyang, Zeqi Xiao, Danni Yang et al.

ICCV 2025 • arXiv:2507.15728

#10187

PGC: Physics-Based Gaussian Cloth from a Single Pose

Michelle Guo, Matt Jen-Yuan Chiang, Igor Santesteban et al.

CVPR 2025 • highlight • arXiv:2503.20779

#10188

Flat Channels to Infinity in Neural Loss Landscapes

Flavio Martinelli, Alexander van Meegen, Berfin Simsek et al.

NEURIPS 2025 • arXiv:2506.14951

#10189

BEAST: Efficient Tokenization of B-Splines Encoded Action Sequences for Imitation Learning

Hongyi Zhou, Weiran Liao, Xi Huang et al.

NEURIPS 2025 • arXiv:2506.06072

#10190

A Unified Framework for the Transportability of Population-Level Causal Measures

Ahmed Boughdiri, Clément Berenfeld, Julie Josse et al.

NEURIPS 2025 • arXiv:2505.13104

#10191

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2025 • arXiv:2503.12124

#10192

Joint Self-Supervised Video Alignment and Action Segmentation

Ali Shah Ali, Syed Ahmed Mahmood, Mubin Saeed et al.

ICCV 2025 • arXiv:2503.16832

#10193

Learning to Better Search with Language Models via Guided Reinforced Self-Training

Seungyong Moon, Bumsoo Park, Hyun Oh Song

NEURIPS 2025 • arXiv:2410.02992

#10194

NeRF Is a Valuable Assistant for 3D Gaussian Splatting

Shuangkang Fang, I-Chao Shen, Takeo Igarashi et al.

ICCV 2025 • arXiv:2507.23374

#10195

Precise Information Control in Long-Form Text Generation

Jacqueline He, Howard Yen, Margaret Li et al.

NEURIPS 2025 • arXiv:2506.06589

#10196

Adaptive Markup Language Generation for Contextually-Grounded Visual Document Understanding

Han Xiao, yina xie, Guanxin tan et al.

CVPR 2025 • arXiv:2505.05446

#10197

Mitigating Ambiguities in 3D Classification with Gaussian Splatting

Ruiqi Zhang, Hao Zhu, Jingyi Zhao et al.

CVPR 2025 • arXiv:2503.08352

#10198

PBCAT: Patch-Based Composite Adversarial Training against Physically Realizable Attacks on Object Detection

Xiao Li, Yiming Zhu, Yifan Huang et al.

ICCV 2025 • arXiv:2506.23581

#10199

StickMotion: Generating 3D Human Motions by Drawing a Stickman

Tao Wang, Zhihua Wu, Qiaozhi He et al.

CVPR 2025 • arXiv:2503.04829

#10200

Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning

Wang Yang, Zirui Liu, Hongye Jin et al.

NEURIPS 2025 • arXiv:2505.17315
