Most Cited 2025 "grasping motion generation" Papers

22,274 papers found • Page 46 of 112

#9001

Testing Causal Models with Hidden Variables in Polynomial Delay via Conditional Independencies

Hyunchai Jeong, Adiba Ejaz, Jin Tian et al.

AAAI 2025paperarXiv:2409.14593
4
citations
#9002

Graph-Guided Scene Reconstruction from Images with 3D Gaussian Splatting

Chong Cheng, Gaochao Song, Yiyang Yao et al.

ICLR 2025arXiv:2502.17377
4
citations
#9003

BSDB-Net: Band-Split Dual-Branch Network with Selective State Spaces Mechanism for Monaural Speech Enhancement

Cunhang Fan, Enrui Liu, Andong Li et al.

AAAI 2025paperarXiv:2412.19099
4
citations
#9004

Visually Consistent Hierarchical Image Classification

Seulki Park, Youren Zhang, Stella Yu et al.

ICLR 2025arXiv:2406.11608
4
citations
#9005

Factor Graph-based Interpretable Neural Networks

Yicong Li, Kuanjiu Zhou, Shuo Yu et al.

ICLR 2025arXiv:2502.14572
4
citations
#9006

Spectral Convolutional Conditional Neural Process

Peiman Mohseni, Nick Duffield

NEURIPS 2025
4
citations
#9007

Low-Rank Adapting Models for Sparse Autoencoders

Matthew Chen, Josh Engels, Max Tegmark

ICML 2025arXiv:2501.19406
4
citations
#9008

$InterLCM$: Low-Quality Images as Intermediate States of Latent Consistency Models for Effective Blind Face Restoration

Senmao Li, Kai Wang, Joost van de Weijer et al.

ICLR 2025arXiv:2502.02215
4
citations
#9009

CFP-Gen: Combinatorial Functional Protein Generation via Diffusion Language Models

Junbo Yin, Chao Zha, Wenjia He et al.

ICML 2025arXiv:2505.22869
4
citations
#9010

Reinforcement Learning for Quantum Control under Physical Constraints

Jan Ole Ernst, Aniket Chatterjee, Tim Franzmeyer et al.

ICML 2025arXiv:2501.14372
4
citations
#9011

End-to-End Vision Tokenizer Tuning

Wenxuan Wang, Fan Zhang, Yufeng Cui et al.

NEURIPS 2025arXiv:2505.10562
4
citations
#9012

Towards Generalizable Multi-Camera 3D Object Detection via Perspective Rendering

Hao Lu, Yunpeng Zhang, Guoqing Wang et al.

AAAI 2025paper
4
citations
#9013

Semi-Supervised Online Cross-Modal Hashing

Xiao Kang, Xingbo Liu, Xuening Zhang et al.

AAAI 2025paper
4
citations
#9014

Flow Distillation Sampling: Regularizing 3D Gaussians with Pre-trained Matching Priors

Lin-Zhuo Chen, Kangjie Liu, Youtian Lin et al.

ICLR 2025arXiv:2502.07615
4
citations
#9015

Enhancing SQL Query Generation with Neurosymbolic Reasoning

Henrijs Princis, Cristina David, Alan Mycroft

AAAI 2025paperarXiv:2408.13888
4
citations
#9016

Watch Video, Catch Keyword: Context-aware Keyword Attention for Moment Retrieval and Highlight Detection

Sung Jin Um, Dongjin Kim, Sangmin Lee et al.

AAAI 2025paperarXiv:2501.02504
4
citations
#9017

Efficient Action-Constrained Reinforcement Learning via Acceptance-Rejection Method and Augmented MDPs

Wei Hung, Shao-Hua Sun, Ping-Chun Hsieh

ICLR 2025arXiv:2503.12932
4
citations
#9018

Training with “Paraphrasing the Original Text” Teaches LLM to Better Retrieve in Long-Context Tasks

Yijiong Yu, Yongfeng Huang, Zhixiao Qi et al.

AAAI 2025paperarXiv:2312.11193
4
citations
#9019

When, Where and Why to Average Weights?

Niccolò Ajroldi, Antonio Orvieto, Jonas Geiping

ICML 2025arXiv:2502.06761
4
citations
#9020

UniMatch: Universal Matching from Atom to Task for Few-Shot Drug Discovery

Ruifeng Li, Mingqian Li, Wei Liu et al.

ICLR 2025arXiv:2502.12453
4
citations
#9021

Calibrating LLMs with Information-Theoretic Evidential Deep Learning

Yawei Li, David Rügamer, Bernd Bischl et al.

ICLR 2025arXiv:2502.06351
4
citations
#9022

Captured by Captions: On Memorization and its Mitigation in CLIP Models

Wenhao Wang, Adam Dziedzic, Grace Kim et al.

ICLR 2025arXiv:2502.07830
4
citations
#9023

Partially Observable Reinforcement Learning with Memory Traces

Onno Eberhard, Michael Muehlebach, Claire Vernade

ICML 2025arXiv:2503.15200
4
citations
#9024

Reliable and Diverse Evaluation of LLM Medical Knowledge Mastery

Yuxuan Zhou, Xien Liu, Chen Ning et al.

ICLR 2025arXiv:2409.14302
4
citations
#9025

CARTS: Advancing Neural Theorem Proving with Diversified Tactic Calibration and Bias-Resistant Tree Search

Xiao-Wen Yang, Zhi Zhou, Haiming Wang et al.

ICLR 2025
4
citations
#9026

IntersectionZoo: Eco-driving for Benchmarking Multi-Agent Contextual Reinforcement Learning

Vindula Jayawardana, Baptiste Freydt, Ao Qu et al.

ICLR 2025arXiv:2410.15221
4
citations
#9027

Zero-Shot Offline Imitation Learning via Optimal Transport

Thomas Rupf, Marco Bagatella, Nico Gürtler et al.

ICML 2025arXiv:2410.08751
4
citations
#9028

SplineGS: Learning Smooth Trajectories in Gaussian Splatting for Dynamic Scene Reconstruction

Jihwan Yoon, Sangbeom Han, Jaeseok Oh et al.

ICLR 2025oral
4
citations
#9029

Controllable Blur Data Augmentation Using 3D-Aware Motion Estimation

Insoo Kim, Hana Lee, Hyong-Euk Lee et al.

ICLR 2025
4
citations
#9030

Geometric Hyena Networks for Large-scale Equivariant Learning

Artem Moskalev, Mangal Prakash, Junjie Xu et al.

ICML 2025spotlightarXiv:2505.22560
4
citations
#9031

Is Limited Participant Diversity Impeding EEG-based Machine Learning?

Philipp Bomatter, Henry Gouk

NEURIPS 2025arXiv:2503.13497
4
citations
#9032

Clustering Properties of Self-Supervised Learning

Xi Weng, Jianing An, Xudong Ma et al.

ICML 2025arXiv:2501.18452
4
citations
#9033

Long-horizon Visual Instruction Generation with Logic and Attribute Self-reflection

Yucheng Suo, Fan Ma, Kaixin Shen et al.

ICLR 2025arXiv:2503.13500
4
citations
#9034

Noisy Test-Time Adaptation in Vision-Language Models

Chentao Cao, Zhun Zhong, (Andrew) Zhanke Zhou et al.

ICLR 2025arXiv:2502.14604
4
citations
#9035

Algorithms and SQ Lower Bounds for Robustly Learning Real-valued Multi-Index Models

Ilias Diakonikolas, Giannis Iakovidis, Daniel Kane et al.

NEURIPS 2025spotlightarXiv:2505.21475
4
citations
#9036

Relating Misfit to Gain in Weak-to-Strong Generalization Beyond the Squared Loss

Abhijeet Mulgund, Chirag Pabbaraju

ICML 2025arXiv:2501.19105
4
citations
#9037

Hypo3D: Exploring Hypothetical Reasoning in 3D

Ye Mao, Weixun Luo, Junpeng Jing et al.

ICML 2025arXiv:2502.00954
4
citations
#9038

KAES: Multi-aspect Shared Knowledge Finding and Aligning for Cross-prompt Automated Scoring of Essay Traits

Xia Li, Wenjing Pan

AAAI 2025paper
4
citations
#9039

BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions

Wonyong Seo, Jihyong Oh, Munchurl Kim

CVPR 2025arXiv:2412.11365
4
citations
#9040

Believing is Seeing: Unobserved Object Detection using Generative Models

Subhransu S. Bhattacharjee, Dylan Campbell, Rahul Shome

CVPR 2025arXiv:2410.05869
4
citations
#9041

MarkushGrapher: Joint Visual and Textual Recognition of Markush Structures

Lucas Morin, Valery Weber, Ahmed Nassar et al.

CVPR 2025arXiv:2503.16096
4
citations
#9042

A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning

Xin Wen, Bingchen Zhao, Yilun Chen et al.

CVPR 2025arXiv:2503.06960
4
citations
#9043

Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images

Jiuchen Chen, Xinyu Yan, Qizhi Xu et al.

CVPR 2025arXiv:2504.09621
4
citations
#9044

GCC: Generative Color Constancy via Diffusing a Color Checker

Chen-Wei Chang, Cheng-De Fan, Chia-Che Chang et al.

CVPR 2025arXiv:2502.17435
4
citations
#9045

AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction

Yuanbin Man, Ying Huang, Chengming Zhang et al.

CVPR 2025highlightarXiv:2411.12593
4
citations
#9046

LineArt: A Knowledge-guided Training-free High-quality Appearance Transfer for Design Drawing with Diffusion Model

Xi Wang, Hongzhen Li, Heng Fang et al.

CVPR 2025arXiv:2412.11519
4
citations
#9047

LidarGait++: Learning Local Features and Size Awareness from LiDAR Point Clouds for 3D Gait Recognition

Chuanfu Shen, Rui Wang, Lixin Duan et al.

CVPR 2025
4
citations
#9048

LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion.

Muchen Li, Sammy Christen, Chengde Wan et al.

CVPR 2025
4
citations
#9049

Nearly Zero-Cost Protection Against Mimicry by Personalized Diffusion Models

Namhyuk Ahn, KiYoon Yoo, Wonhyuk Ahn et al.

CVPR 2025arXiv:2412.11423
4
citations
#9050

GENIUS: A Generative Framework for Universal Multimodal Search

Sungyeon Kim, Xinliang Zhu, Xiaofan Lin et al.

CVPR 2025arXiv:2503.19868
4
citations
#9051

PersonaBooth: Personalized Text-to-Motion Generation

Boeun Kim, Hea In Jeong, JungHoon Sung et al.

CVPR 2025arXiv:2503.07390
4
citations
#9052

BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting

Jeongwan On, Kyeonghwan Gwak, Gunyoung Kang et al.

CVPR 2025arXiv:2504.09097
4
citations
#9053

Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation

Xiang Li, Zixuan Huang, Anh Thai et al.

CVPR 2025highlightarXiv:2411.17763
4
citations
#9054

Learning Affine Correspondences by Integrating Geometric Constraints

Pengju Sun, Banglei Guan, Zhenbao Yu et al.

CVPR 2025arXiv:2504.04834
4
citations
#9055

BG-Triangle: Bézier Gaussian Triangle for 3D Vectorization and Rendering

Minye Wu, Haizhao Dai, Kaixin Yao et al.

CVPR 2025arXiv:2503.13961
4
citations
#9056

Context-Aware Multimodal Pretraining

Karsten Roth, Zeynep Akata, Dima Damen et al.

CVPR 2025highlightarXiv:2411.15099
4
citations
#9057

Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion

Zhenglin Zhou, Fan Ma, Hehe Fan et al.

CVPR 2025arXiv:2503.15851
4
citations
#9058

Universal Scene Graph Generation

Shengqiong Wu, Hao Fei, Tat-seng Chua

CVPR 2025highlightarXiv:2503.15005
4
citations
#9059

CASP: Compression of Large Multimodal Models Based on Attention Sparsity

Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.

CVPR 2025highlightarXiv:2503.05936
4
citations
#9060

QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers

Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.

CVPR 2025highlightarXiv:2503.19718
4
citations
#9061

Resilient Sensor Fusion Under Adverse Sensor Failures via Multi-Modal Expert Fusion

Konyul Park, Yecheol Kim, Daehun Kim et al.

CVPR 2025arXiv:2503.19776
4
citations
#9062

Preserve or Modify? Context-Aware Evaluation for Balancing Preservation and Modification in Text-Guided Image Editing

Yoonjeon Kim, Soohyun Ryu, Yeonsung Jung et al.

CVPR 2025arXiv:2410.11374
4
citations
#9063

Self-supervised ControlNet with Spatio-Temporal Mamba for Real-world Video Super-resolution

Shijun Shi, Jing Xu, Lijing Lu et al.

CVPR 2025arXiv:2506.01037
4
citations
#9064

LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale

Joya Chen, Yiqi Lin, Ziyun Zeng et al.

CVPR 2025arXiv:2504.16030
4
citations
#9065

Period-LLM: Extending the Periodic Capability of Multimodal Large Language Model

Yuting Zhang, Hao Lu, Qingyong Hu et al.

CVPR 2025arXiv:2505.24476
4
citations
#9066

PSBD: Prediction Shift Uncertainty Unlocks Backdoor Detection

Wei Li, Pin-Yu Chen, Sijia Liu et al.

CVPR 2025arXiv:2406.05826
4
citations
#9067

MV-SSM: Multi-View State Space Modeling for 3D Human Pose Estimation

Aviral Chharia, Wenbo Gou, Haoye Dong

CVPR 2025arXiv:2509.00649
4
citations
#9068

Continuous Locomotive Crowd Behavior Generation

Inhwan Bae, Junoh Lee, Hae-Gon Jeon

CVPR 2025arXiv:2504.04756
4
citations
#9069

Token Cropr: Faster ViTs for Quite a Few Tasks

Benjamin Bergner, Christoph Lippert, Aravindh Mahendran

CVPR 2025arXiv:2412.00965
4
citations
#9070

Doppelgangers++: Improved Visual Disambiguation with Geometric 3D Features

Yuanbo Xiangli, Ruojin Cai, Hanyu Chen et al.

CVPR 2025highlightarXiv:2412.05826
4
citations
#9071

Common3D: Self-Supervised Learning of 3D Morphable Models for Common Objects in Neural Feature Space

Leonhard Sommer, Olaf Dünkel, Christian Theobalt et al.

CVPR 2025arXiv:2504.21749
4
citations
#9072

Learning 4D Panoptic Scene Graph Generation from Rich 2D Visual Scene

Shengqiong Wu, Hao Fei, Jingkang Yang et al.

CVPR 2025highlightarXiv:2503.15019
4
citations
#9073

Dual-view X-ray Detection: Can AI Detect Prohibited Items from Dual-view X-ray Images like Humans?

Renshuai Tao, Haoyu Wang, Yuzhe Guo et al.

CVPR 2025arXiv:2411.18082
4
citations
#9074

Scalable Autoregressive Monocular Depth Estimation

Jinhong Wang, Jintai Chen, Jian liu et al.

CVPR 2025arXiv:2411.11361
4
citations
#9075

Reasoning to Attend: Try to Understand How <SEG> Token Works

Rui Qian, Xin Yin, Dejing Dou

CVPR 2025arXiv:2412.17741
4
citations
#9076

Temporal Score Analysis for Understanding and Correcting Diffusion Artifacts

Yu Cao, Zengqun Zhao, Ioannis Patras et al.

CVPR 2025arXiv:2503.16218
4
citations
#9077

GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection

Jeffri Erwin Murrugarra Llerena, José Henrique Marques, Claudio Jung

CVPR 2025arXiv:2502.01565
4
citations
#9078

Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures

Guoxing Sun, Rishabh Dabral, Heming Zhu et al.

CVPR 2025highlightarXiv:2412.13183
4
citations
#9079

MixerMDM: Learnable Composition of Human Motion Diffusion Models

Pablo Ruiz-Ponce, German Barquero, Cristina Palmero et al.

CVPR 2025arXiv:2504.01019
4
citations
#9080

Localizing Events in Videos with Multimodal Queries

Gengyuan Zhang, Mang Ling Ada Fok, Jialu Ma et al.

CVPR 2025arXiv:2406.10079
4
citations
#9081

SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking

Wenrui Cai, Qingjie Liu, Yunhong Wang

CVPR 2025arXiv:2503.18338
4
citations
#9082

PolarFree: Polarization-based Reflection-Free Imaging

Mingde Yao, Menglu Wang, King Man Tam et al.

CVPR 2025arXiv:2503.18055
4
citations
#9083

H-MoRe: Learning Human-centric Motion Representation for Action Analysis

Zhanbo Huang, Xiaoming Liu, Yu Kong

CVPR 2025highlightarXiv:2504.10676
4
citations
#9084

SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning

Fida Mohammad Thoker, Letian Jiang, Chen Zhao et al.

CVPR 2025arXiv:2504.00527
4
citations
#9085

SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens

Chi Su, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2025arXiv:2411.19824
4
citations
#9086

Ges3ViG : Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding

Atharv Mahesh Mane, Dulanga Weerakoon, Vigneshwaran Subbaraju et al.

CVPR 2025arXiv:2504.09623
4
citations
#9087

Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations

Jeonghyeon Kim, Sangheum Hwang

CVPR 2025arXiv:2503.18817
4
citations
#9088

Learning Physics-Based Full-Body Human Reaching and Grasping from Brief Walking References

Yitang Li, Mingxian Lin, Zhuo Lin et al.

CVPR 2025arXiv:2503.07481
4
citations
#9089

TKG-DM: Training-free Chroma Key Content Generation Diffusion Model

Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser et al.

CVPR 2025highlightarXiv:2411.15580
4
citations
#9090

OODD: Test-time Out-of-Distribution Detection with Dynamic Dictionary

Yifeng Yang, Lin Zhu, Zewen Sun et al.

CVPR 2025arXiv:2503.10468
4
citations
#9091

GASP: Gaussian Avatars with Synthetic Priors

Jack Saunders, Charlie Hewitt, Yanan Jian et al.

CVPR 2025arXiv:2412.07739
4
citations
#9092

Faster Parameter-Efficient Tuning with Token Redundancy Reduction

Kwonyoung Kim, Jungin Park, Jin Kim et al.

CVPR 2025arXiv:2503.20282
4
citations
#9093

Test-Time Visual In-Context Tuning

Jiahao Xie, Alessio Tonioni, Nathalie Rauschmayr et al.

CVPR 2025arXiv:2503.21777
4
citations
#9094

FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting

Fangyu Wu, Yuhao Chen

CVPR 2025arXiv:2411.12089
4
citations
#9095

Prompt2Perturb (P2P): Text-Guided Diffusion-Based Adversarial Attack on Breast Ultrasound Images

Yasamin Medghalchi, Moein Heidari, Clayton Allard et al.

CVPR 2025arXiv:2412.09910
4
citations
#9096

VerbDiff: Text-Only Diffusion Models with Enhanced Interaction Awareness

SeungJu Cha, Kwanyoung Lee, Ye-Chan Kim et al.

CVPR 2025arXiv:2503.16406
4
citations
#9097

MultiMorph: On-demand Atlas Construction

Mazdak Abulnaga, Andrew Hoopes, Neel Dey et al.

CVPR 2025arXiv:2504.00247
4
citations
#9098

SURGEON: Memory-Adaptive Fully Test-Time Adaptation via Dynamic Activation Sparsity

Ke Ma, Jiaqi Tang, Bin Guo et al.

CVPR 2025highlightarXiv:2503.20354
4
citations
#9099

Rethinking Epistemic and Aleatoric Uncertainty for Active Open-Set Annotation: An Energy-Based Approach

Chen-Chen Zong, Sheng-Jun Huang

CVPR 2025arXiv:2502.19691
4
citations
#9100

Reconstructing Animals and the Wild

Peter Kulits, Michael J. Black, Silvia Zuffi

CVPR 2025arXiv:2411.18807
4
citations
#9101

Preserving Clusters in Prompt Learning for Unsupervised Domain Adaptation

Long Tung Vuong, Hoang Phan, Vy Vo et al.

CVPR 2025arXiv:2506.11493
4
citations
#9102

One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models

Senmao Li, Lei Wang, Kai Wang et al.

CVPR 2025
4
citations
#9103

Learning Dynamic Collaborative Network for Semi-supervised 3D Vessel Segmentation

Jiao Xu, Xin Chen, Lihe Zhang

CVPR 2025arXiv:2601.07377
4
citations
#9104

Subnet-Aware Dynamic Supernet Training for Neural Architecture Search

Jeimin Jeon, Youngmin Oh, Junghyup Lee et al.

CVPR 2025arXiv:2503.10740
4
citations
#9105

Geometry in Style: 3D Stylization via Surface Normal Deformation

Nam Anh Dinh, Itai Lang, Hyunwoo Kim et al.

CVPR 2025arXiv:2503.23241
4
citations
#9106

FlashSloth : Lightning Multimodal Large Language Models via Embedded Visual Compression

Bo Tong, Bokai Lai, Yiyi Zhou et al.

CVPR 2025arXiv:2412.04317
4
citations
#9107

MeshGen: Generating PBR Textured Mesh with Render-Enhanced Auto-Encoder and Generative Data Augmentation

Zilong Chen, Yikai Wang, Wenqiang Sun et al.

CVPR 2025highlightarXiv:2505.04656
4
citations
#9108

Radio Frequency Ray Tracing with Neural Object Representation for Enhanced RF Modeling

Xingyu Chen, Zihao Feng, Kun Qian et al.

CVPR 2025
4
citations
#9109

VoteFlow: Enforcing Local Rigidity in Self-Supervised Scene Flow

Yancong Lin, Shiming Wang, Liangliang Nan et al.

CVPR 2025arXiv:2503.22328
4
citations
#9110

PointLoRA: Low-Rank Adaptation with Token Selection for Point Cloud Learning

Song Wang, Xiaolu Liu, Lingdong Kong et al.

CVPR 2025arXiv:2504.16023
4
citations
#9111

Full-DoF Egomotion Estimation for Event Cameras Using Geometric Solvers

Ji Zhao, Banglei Guan, Zibin Liu et al.

CVPR 2025highlightarXiv:2503.03307
4
citations
#9112

SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost

Haiyang Mei, Pengyu Zhang, Mike Zheng Shou

CVPR 2025arXiv:2506.01304
4
citations
#9113

ForestLPR: LiDAR Place Recognition in Forests Attentioning Multiple BEV Density Images

Yanqing Shen, Turcan Tuna, Marco Hutter et al.

CVPR 2025highlightarXiv:2503.04475
4
citations
#9114

T-FAKE: Synthesizing Thermal Images for Facial Landmarking

Philipp Flotho, Moritz Piening, Anna Kukleva et al.

CVPR 2025arXiv:2408.15127
4
citations
#9115

Satellite to GroundScape - Large-scale Consistent Ground View Generation from Satellite Views

Ningli Xu, Rongjun Qin

CVPR 2025arXiv:2504.15786
4
citations
#9116

Tracktention: Leveraging Point Tracking to Attend Videos Faster and Better

Zihang Lai, Andrea Vedaldi

CVPR 2025highlightarXiv:2503.19904
4
citations
#9117

The Devil is in Low-Level Features for Cross-Domain Few-Shot Segmentation

Yuhan Liu, Yixiong Zou, Yuhua Li et al.

CVPR 2025arXiv:2503.21150
4
citations
#9118

Let Humanoids Hike! Integrative Skill Development on Complex Trails

Kwan-Yee Lin, Stella X. Yu

CVPR 2025arXiv:2505.06218
4
citations
#9119

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

Zhengfei Kuang, Tianyuan Zhang, Kai Zhang et al.

CVPR 2025arXiv:2411.17249
4
citations
#9120

Event Ellipsometer: Event-based Mueller-Matrix Video Imaging

Ryota Maeda, Yunseong Moon, Seung-Hwan Baek

CVPR 2025highlightarXiv:2411.17313
4
citations
#9121

Taste More, Taste Better: Diverse Data and Strong Model Boost Semi-Supervised Crowd Counting

Maochen Yang, Zekun Li, Jian Zhang et al.

CVPR 2025arXiv:2503.17984
4
citations
#9122

Balanced Direction from Multifarious Choices: Arithmetic Meta-Learning for Domain Generalization

Xiran Wang, Jian Zhang, Lei Qi et al.

CVPR 2025arXiv:2503.18987
4
citations
#9123

Spectral State Space Model for Rotation-Invariant Visual Representation Learning

Sahar Dastani, Ali Bahri, Moslem Yazdanpanah et al.

CVPR 2025arXiv:2503.06369
4
citations
#9124

Mono3DVLT: Monocular-Video-Based 3D Visual Language Tracking

Hongkai Wei, YANG YANG, Shijie Sun et al.

CVPR 2025
4
citations
#9125

What Makes a Good Dataset for Knowledge Distillation?

Logan Frank, Jim Davis

CVPR 2025arXiv:2411.12817
4
citations
#9126

Order-One Rolling Shutter Cameras

Marvin Anas Hahn, Kathlén Kohn, Orlando Marigliano et al.

CVPR 2025highlightarXiv:2403.11295
4
citations
#9127

DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching

Emanuele Aiello, Umberto Michieli, Diego Valsesia et al.

CVPR 2025arXiv:2411.17786
4
citations
#9128

Floxels: Fast Unsupervised Voxel Based Scene Flow Estimation

David T. Hoffmann, Syed Haseeb Raza, Hanqiu Jiang et al.

CVPR 2025arXiv:2503.04718
4
citations
#9129

Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition

Juncheng Wang, Chao Xu, Cheng Yu et al.

CVPR 2025arXiv:2503.06984
4
citations
#9130

PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution

Shian Du, Menghan Xia, Chang Liu et al.

CVPR 2025arXiv:2509.26025
4
citations
#9131

Optimizing for the Shortest Path in Denoising Diffusion Model

Ping Chen, Xingpeng Zhang, Zhaoxiang Liu et al.

CVPR 2025highlightarXiv:2503.03265
4
citations
#9132

Apply Hierarchical-Chain-of-Generation to Complex Attributes Text-to-3D Generation

Yiming Qin, Zhu Xu, Yang Liu

CVPR 2025arXiv:2505.05505
4
citations
#9133

Distribution Prototype Diffusion Learning for Open-set Supervised Anomaly Detection

Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.

CVPR 2025arXiv:2502.20981
4
citations
#9134

Not Only Text: Exploring Compositionality of Visual Representations in Vision-Language Models

Davide Berasi, Matteo Farina, Massimiliano Mancini et al.

CVPR 2025highlightarXiv:2503.17142
4
citations
#9135

HOP: Heterogeneous Topology-based Multimodal Entanglement for Co-Speech Gesture Generation

Hongye Cheng, Tianyu Wang, guangsi shi et al.

CVPR 2025arXiv:2503.01175
4
citations
#9136

CADRef: Robust Out-of-Distribution Detection via Class-Aware Decoupled Relative Feature Leveraging

Zhiwei Ling, Yachen Chang, Hailiang Zhao et al.

CVPR 2025arXiv:2503.00325
4
citations
#9137

DaCapo: Score Distillation as Stacked Bridge for Fast and High-quality 3D Editing

Yufei Huang, Bangyan Liao, Yuqi Hu et al.

CVPR 2025
4
citations
#9138

CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design

Weitao Feng, Hang Zhou, Jing Liao et al.

CVPR 2025highlightarXiv:2504.19478
4
citations
#9139

GIVEPose: Gradual Intra-class Variation Elimination for RGB-based Category-Level Object Pose Estimation

Ziqin Huang, Gu Wang, Chenyangguang Zhang et al.

CVPR 2025arXiv:2503.15110
4
citations
#9140

Precise Event Spotting in Sports Videos: Solving Long-Range Dependency and Class Imbalance

Sanchayan Santra, Vishal Chudasama, Pankaj Wasnik et al.

CVPR 2025arXiv:2503.00147
4
citations
#9141

Steady Progress Beats Stagnation: Mutual Aid of Foundation and Conventional Models in Mixed Domain Semi-Supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Zekun Li et al.

CVPR 2025arXiv:2503.16997
4
citations
#9142

ATP: Adaptive Threshold Pruning for Efficient Data Encoding in Quantum Neural Networks

Mohamed Afane, Gabrielle Ebbrecht, Ying Wang et al.

CVPR 2025arXiv:2503.21815
4
citations
#9143

dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis

Luyuan Xie, Tianyu Luan, Wenyuan Cai et al.

CVPR 2025arXiv:2503.10412
4
citations
#9144

Towards All-in-One Medical Image Re-Identification

Yuan Tian, Kaiyuan Ji, Rongzhao Zhang et al.

CVPR 2025arXiv:2503.08173
4
citations
#9145

WISE: A Framework for Gigapixel Whole-Slide-Image Lossless Compression

Yu Mao, Jun Wang, Nan Guan et al.

CVPR 2025arXiv:2503.18074
4
citations
#9146

Secret Lies in Color: Enhancing AI-Generated Images Detection with Color Distribution Analysis

Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.

CVPR 2025
4
citations
#9147

LATTE-MV: Learning to Anticipate Table Tennis Hits from Monocular Videos

Daniel Etaat, Dvij Rajesh Kalaria, Nima Rahmanian et al.

CVPR 2025arXiv:2503.20936
4
citations
#9148

MuTri: Multi-view Tri-alignment for OCT to OCTA 3D Image Translation

zhuangzhuang chen, hualiang wang, Chubin Ou et al.

CVPR 2025arXiv:2504.01428
4
citations
#9149

SPARS3R: Semantic Prior Alignment and Regularization for Sparse 3D Reconstruction

Yutao Tang, Yuxiang Guo, Deming Li et al.

CVPR 2025arXiv:2411.12592
4
citations
#9150

Gradient Inversion Attacks on Parameter-Efficient Fine-Tuning

Hasin Us Sami, Swapneel Sen, Amit K. Roy-Chowdhury et al.

CVPR 2025arXiv:2506.04453
4
citations
#9151

ESCAPE: Equivariant Shape Completion via Anchor Point Encoding

Burak Bekci, Nassir Navab, Federico Tombari et al.

CVPR 2025arXiv:2412.00952
4
citations
#9152

Learnable Infinite Taylor Gaussian for Dynamic View Rendering

Bingbing Hu, Yanyan Li, rui xie et al.

CVPR 2025arXiv:2412.04282
4
citations
#9153

Taxonomy-Aware Evaluation of Vision-Language Models

Vésteinn Snæbjarnarson, Kevin Du, Niklas Stoehr et al.

CVPR 2025arXiv:2504.05457
4
citations
#9154

Event Fields: Capturing Light Fields at High Speed, Resolution, and Dynamic Range

Ziyuan Qu, Zihao Zou, Vivek Boominathan et al.

CVPR 2025highlightarXiv:2412.06191
4
citations
#9155

CMMLoc: Advancing Text-to-PointCloud Localization with Cauchy-Mixture-Model Based Framework

Yanlong Xu, Haoxuan Qu, Jun Liu et al.

CVPR 2025arXiv:2503.02593
4
citations
#9156

Foundations of the Theory of Performance-Based Ranking

Sébastien Piérard, Anaïs Halin, Anthony Cioppa et al.

CVPR 2025arXiv:2412.04227
4
citations
#9157

SmartCLIP: Modular Vision-language Alignment with Identification Guarantees

Shaoan Xie, Lingjing Kong, Yujia Zheng et al.

CVPR 2025highlightarXiv:2507.22264
4
citations
#9158

OmniStereo: Real-time Omnidireactional Depth Estimation with Multiview Fisheye Cameras

Jiaxi Deng, Yushen Wang, Haitao Meng et al.

CVPR 2025
4
citations
#9159

Efficient Video Face Enhancement with Enhanced Spatial-Temporal Consistency

Yutong Wang, Jiajie Teng, Jiajiong Cao et al.

CVPR 2025arXiv:2411.16468
4
citations
#9160

Neural Hierarchical Decomposition for Single Image Plant Modeling

Zhihao Liu, Zhanglin Cheng, Naoto Yokoya

CVPR 2025
4
citations
#9161

Robust-MVTON: Learning Cross-Pose Feature Alignment and Fusion for Robust Multi-View Virtual Try-On

Nannan Zhang, Yijiang Li, Dong Du et al.

CVPR 2025
4
citations
#9162

Unity in Diversity: Video Editing via Gradient-Latent Purification

Junyu Gao, Kunlin Yang, Xuan Yao et al.

CVPR 2025
4
citations
#9163

Towards Generalizable Trajectory Prediction using Dual-Level Representation Learning and Adaptive Prompting

Kaouther Messaoud, Matthieu Cord, Alex Alahi

CVPR 2025arXiv:2501.04815
4
citations
#9164

A Regularization-Guided Equivariant Approach for Image Restoration

Yulu Bai, Jiahong Fu, Qi Xie et al.

CVPR 2025arXiv:2505.19799
4
citations
#9165

DPSeg: Dual-Prompt Cost Volume Learning for Open-Vocabulary Semantic Segmentation

Ziyu Zhao, Xiaoguang Li, Lingjia Shi et al.

CVPR 2025arXiv:2505.11676
4
citations
#9166

Segment Any-Quality Images with Generative Latent Space Enhancement

Guangqian Guo, Yong Guo, Xuehui Yu et al.

CVPR 2025arXiv:2503.12507
4
citations
#9167

Enhancing Online Continual Learning with Plug-and-Play State Space Model and Class-Conditional Mixture of Discretization

Sihao Liu, Yibo Yang, Xiaojie Li et al.

CVPR 2025arXiv:2412.18177
4
citations
#9168

SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering

Hanxiao Sun, Yupeng Gao, Jin Xie et al.

CVPR 2025arXiv:2504.06815
4
citations
#9169

Are Images Indistinguishable to Humans Also Indistinguishable to Classifiers?

Zebin You, Xinyu Zhang, Hanzhong Guo et al.

CVPR 2025arXiv:2405.18029
4
citations
#9170

GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking

Weikang Bian, Zhaoyang Huang, Xiaoyu Shi et al.

CVPR 2025
4
citations
#9171

Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels

Qiming Xia, Wenkai Lin, Haoen Xiang et al.

CVPR 2025arXiv:2503.08421
4
citations
#9172

Evaluating Vision-Language Models as Evaluators in Path Planning

Mohamed Aghzal, Xiang Yue, Erion Plaku et al.

CVPR 2025arXiv:2411.18711
4
citations
#9173

Dynamic Motion Blending for Versatile Motion Editing

Nan Jiang, Hongjie Li, Ziye Yuan et al.

CVPR 2025arXiv:2503.20724
4
citations
#9174

Reconstructing In-the-Wild Open-Vocabulary Human-Object Interactions

Boran Wen, Dingbang Huang, Zichen Zhang et al.

CVPR 2025arXiv:2503.15898
4
citations
#9175

LightLoc: Learning Outdoor LiDAR Localization at Light Speed

Wen Li, Chen Liu, Shangshu Yu et al.

CVPR 2025arXiv:2503.17814
4
citations
#9176

PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Sinisa Stekovic, Arslan Artykov, Stefan Ainetter et al.

CVPR 2025arXiv:2404.10620
4
citations
#9177

FRAMES-VQA: Benchmarking Fine-Tuning Robustness across Multi-Modal Shifts in Visual Question Answering

Chengyue Huang, Brisa Maneechotesuwan, Shivang Chopra et al.

CVPR 2025arXiv:2505.21755
4
citations
#9178

Scaling Down Text Encoders of Text-to-Image Diffusion Models

Lifu Wang, Daqing Liu, Xinchen Liu et al.

CVPR 2025arXiv:2503.19897
4
citations
#9179

Efficient Transfer Learning for Video-language Foundation Models

Haoxing Chen, Zizheng Huang, Yan Hong et al.

CVPR 2025arXiv:2411.11223
4
citations
#9180

3D Dental Model Segmentation with Geometrical Boundary Preserving

Shufan Xi, Zexian Liu, Junlin Chang et al.

CVPR 2025arXiv:2503.23702
4
citations
#9181

CryptoFace: End-to-End Encrypted Face Recognition

Wei Ao, Vishnu Naresh Boddeti

CVPR 2025arXiv:2509.00332
4
citations
#9182

LiVOS: Light Video Object Segmentation with Gated Linear Matching

Qin Liu, Jianfeng Wang, Zhengyuan Yang et al.

CVPR 2025arXiv:2411.02818
4
citations
#9183

ViiNeuS: Volumetric Initialization for Implicit Neural Surface Reconstruction of Urban Scenes with Limited Image Overlap

Hala Djeghim, Nathan Piasco, Moussab Bennehar et al.

CVPR 2025arXiv:2403.10344
4
citations
#9184

STiL: Semi-supervised Tabular-Image Learning for Comprehensive Task-Relevant Information Exploration in Multimodal Classification

Siyi Du, Xinzhe Luo, Declan ORegan et al.

CVPR 2025arXiv:2503.06277
4
citations
#9185

DNF: Unconditional 4D Generation with Dictionary-based Neural Fields

Xinyi Zhang, Naiqi Li, Angela Dai

CVPR 2025arXiv:2412.05161
4
citations
#9186

Generating 3D-Consistent Videos from Unposed Internet Photos

Gene Chou, Kai Zhang, Sai Bi et al.

CVPR 2025arXiv:2411.13549
4
citations
#9187

Reasoning in Visual Navigation of End-to-end Trained Agents: A Dynamical Systems Approach

Steeven JANNY, Hervé Poirier, Leonid Antsfeld et al.

CVPR 2025highlightarXiv:2503.08306
4
citations
#9188

Simplification Is All You Need against Out-of-Distribution Overconfidence

Keke Tang, Chao Hou, Weilong Peng et al.

CVPR 2025
4
citations
#9189

DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation

Amin Karimi, Charalambos Poullis

CVPR 2025arXiv:2503.04006
4
citations
#9190

ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis

Yun Chang, Leonor Fermoselle, Duy Ta et al.

CVPR 2025arXiv:2504.06553
4
citations
#9191

Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations

Haitong Liu, Kuofeng Gao, Yang Bai et al.

CVPR 2025arXiv:2503.21824
4
citations
#9192

Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications

Tong Bu, Maohua Li, Zhaofei Yu

CVPR 2025arXiv:2409.03368
4
citations
#9193

End-to-End Implicit Neural Representations for Classification

Alexander Gielisse, Jan van Gemert

CVPR 2025arXiv:2503.18123
4
citations
#9194

Understanding Multi-Task Activities from Single-Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2025highlight
4
citations
#9195

CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation

Jungsoo Lee, Debasmit Das, Munawar Hayat et al.

CVPR 2025arXiv:2503.18244
4
citations
#9196

Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs

Lucas Ventura, Antoine Yang, Cordelia Schmid et al.

CVPR 2025arXiv:2504.00072
4
citations
#9197

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025arXiv:2506.05563
4
citations
#9198

DistinctAD: Distinctive Audio Description Generation in Contexts

Bo Fang, Wenhao Wu, Qiangqiang Wu et al.

CVPR 2025highlightarXiv:2411.18180
4
citations
#9199

RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges

Thibaut Loiseau, Guillaume Bourmaud

CVPR 2025arXiv:2502.19955
4
citations
#9200

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

Gehui Li, Bin Chen, Chen Zhao et al.

CVPR 2025arXiv:2411.15255
4
citations