Most Cited CVPR "single-view prediction" Papers
5,589 papers found • Page 19 of 28
Conference
ANNEXE: Unified Analyzing, Answering, and Pixel Grounding for Egocentric Interaction
YUEJIAO SU, Yi Wang, Qiongyang Hu et al.
ProbeSDF: Light Field Probes For Neural Surface Reconstruction
Briac Toussaint, Diego Thomas, Jean-Sébastien Franco
SVDC: Consistent Direct Time-of-Flight Video Depth Completion with Frequency Selective Fusion
Xuan Zhu, Jijun Xiang, Xianqi Wang et al.
IterIS: Iterative Inference-Solving Alignment for LoRA Merging
Hongxu chen, Zhen Wang, Runshi Li et al.
Adventurer: Optimizing Vision Mamba Architecture Designs for Efficiency
Feng Wang, Timing Yang, Yaodong Yu et al.
Concept Replacer: Replacing Sensitive Concepts in Diffusion Models via Precision Localization
lingyun zhang, Yu Xie, Yanwei Fu et al.
Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models
Nikita Starodubcev, Dmitry Baranchuk, Artem Fedorov et al.
Efficient ANN-Guided Distillation: Aligning Rate-based Features of Spiking Neural Networks through Hybrid Block-wise Replacement
Shu Yang, Chengting Yu, Lei Liu et al.
DIMAT: Decentralized Iterative Merging-And-Training for Deep Learning Models
Nastaran Saadati, Minh Pham, Nasla Saleem et al.
Self-supervised Debiasing Using Low Rank Regularization
Geon Yeong Park, Chanyong Jung, Sangmin Lee et al.
DiffVsgg: Diffusion-Driven Online Video Scene Graph Generation
Mu Chen, Liulei Li, Wenguan Wang et al.
ACT-Diffusion: Efficient Adversarial Consistency Training for One-step Diffusion Models
Fei Kong, Jinhao Duan, Lichao Sun et al.
AIpparel: A Multimodal Foundation Model for Digital Garments
Kiyohiro Nakayama, Jan Ackermann, Timur Levent Kesdogan et al.
ITA-MDT: Image-Timestep-Adaptive Masked Diffusion Transformer Framework for Image-Based Virtual Try-On
Ji Woo Hong, Tri Ton, Trung X. Pham et al.
Learning Visual Generative Priors without Text
Shuailei Ma, Kecheng Zheng, Ying Wei et al.
FADA: Fast Diffusion Avatar Synthesis with Mixed-Supervised Multi-CFG Distillation
Tianyun Zhong, Chao Liang, Jianwen Jiang et al.
FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation
Pengchong Qiao, Lei Shang, Chang Liu et al.
AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning
Yuheng Xu, Shijie Yang, Xin Liu et al.
GeoAuxNet: Towards Universal 3D Representation Learning for Multi-sensor Point Clouds
Shengjun Zhang, Xin Fei, Yueqi Duan
Spotting the Unexpected (STU): A 3D LiDAR Dataset for Anomaly Segmentation in Autonomous Driving
Alexey Nekrasov, Malcolm Burdorf, Stewart Worrall et al.
SALOVA: Segment-Augmented Long Video Assistant for Targeted Retrieval and Routing in Long-Form Video Analysis
Junho Kim, Hyunjun Kim, Hosu Lee et al.
WeGen: A Unified Model for Interactive Multimodal Generation as We Chat
Zhipeng Huang, Shaobin Zhuang, Canmiao Fu et al.
HeMoRa: Unsupervised Heuristic Consensus Sampling for Robust Point Cloud Registration
Shaocheng Yan, Yiming Wang, Kaiyan Zhao et al.
Open-World Objectness Modeling Unifies Novel Object Detection
Shan Zhang, Yao Ni, Jinhao Du et al.
CLOC: Contrastive Learning for Ordinal Classification with Multi-Margin N-pair Loss
Dileepa Pitawela, Gustavo Carneiro, Hsiang-Ting Chen
Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World
Bangyan Liao, Zhenjun Zhao, Haoang Li et al.
Boosting Adversarial Transferability through Augmentation in Hypothesis Space
Yu Guo, Weiquan Liu, Qingshan Xu et al.
V-Stylist: Video Stylization via Collaboration and Reflection of MLLM Agents
Zhengrong Yue, Shaobin Zhuang, Kunchang Li et al.
Pseudo Label Refinery for Unsupervised Domain Adaptation on Cross-dataset 3D Object Detection
Zhanwei Zhang, Minghao Chen, Shuai Xiao et al.
Distilling Long-tailed Datasets
Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang et al.
OCRT: Boosting Foundation Models in the Open World with Object-Concept-Relation Triad
Luyao Tang, Chaoqi Chen, Yuxuan Yuan et al.
EdgeDiff: Edge-aware Diffusion Network for Building Reconstruction from Point Clouds
Yujun Liu, Ruisheng Wang, Shangfeng Huang et al.
3DEnhancer: Consistent Multi-View Diffusion for 3D Enhancement
Yihang Luo, Shangchen Zhou, Yushi Lan et al.
Exploration-Driven Generative Interactive Environments
Nedko Savov, Naser Kazemi, Mohammad Mahdi et al.
Extreme Rotation Estimation in the Wild
Hana Bezalel, Dotan Ankri, Ruojin Cai et al.
Generalizable Face Landmarking Guided by Conditional Face Warping
Jiayi Liang, Haotian Liu, Hongteng Xu et al.
Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency
Xu Yingjie, Bangzhen Liu, Hao Tang et al.
Seg2Reg: Differentiable 2D Segmentation to 1D Regression Rendering for 360 Room Layout Reconstruction
Cheng Sun, Wei-En Tai, Yu-Lin Shih et al.
Comprehensive Relighting: Generalizable and Consistent Monocular Human Relighting and Harmonization
Junying Wang, Jingyuan Liu, Xin Sun et al.
NeISF++: Neural Incident Stokes Field for Polarized Inverse Rendering of Conductors and Dielectrics
Chenhao Li, Taishi Ono, Takeshi Uemori et al.
Zero-shot 3D Question Answering via Voxel-based Dynamic Token Compression
Hsiang-Wei Huang, Fu-Chen Chen, Wenhao Chai et al.
Generalized Diffusion Detector: Mining Robust Features from Diffusion Models for Domain-Generalized Detection
Boyong He, Yuxiang Ji, Qianwen Ye et al.
EFormer: Enhanced Transformer towards Semantic-Contour Features of Foreground for Portraits Matting
Zitao Wang, Qiguang Miao, Yue Xi et al.
One2Any: One-Reference 6D Pose Estimation for Any Object
Mengya Liu, Siyuan Li, Ajad Chhatkuli et al.
KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception
Yunpeng Qu, Kun Yuan, Qizhi Xie et al.
Improving Adversarial Transferability on Vision Transformers via Forward Propagation Refinement
Yuchen Ren, Zhengyu Zhao, Chenhao Lin et al.
Learning Group Activity Features Through Person Attribute Prediction
Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita
FaceLift: Semi-supervised 3D Facial Landmark Localization
David Ferman, Pablo Garrido, Gaurav Bharaj
Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents
Yunseok Jang, Yeda Song, Sungryull Sohn et al.
Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos
Leonhard Sommer, Artur Jesslen, Eddy Ilg et al.
Shape My Moves: Text-Driven Shape-Aware Synthesis of Human Motions
Ting-Hsuan Liao, Yi Zhou, Yu Shen et al.
Unleashing Network Potentials for Semantic Scene Completion
Fengyun Wang, Qianru Sun, Dong Zhang et al.
Context-Enhanced Memory-Refined Transformer for Online Action Detection
Zhanzhong Pang, Fadime Sener, Angela Yao
Prosody-Enhanced Acoustic Pre-training and Acoustic-Disentangled Prosody Adapting for Movie Dubbing
Zhedong Zhang, Liang Li, Chenggang Yan et al.
Toward Robust Neural Reconstruction from Sparse Point Sets
Amine Ouasfi, Shubhendu Jena, Eric Marchand et al.
Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking
Phuc Nguyen, Minh Luu, Anh Tran et al.
Finding Local Diffusion Schrödinger Bridge using Kolmogorov-Arnold Network
Xingyu Qiu, Mengying Yang, Xinghua Ma et al.
DiaLoc: An Iterative Approach to Embodied Dialog Localization
Chao Zhang, Mohan Li, Ignas Budvytis et al.
HUSH: Holistic Panoramic 3D Scene Understanding using Spherical Harmonics
Jongsung Lee, HARIN PARK, Byeong-Uk Lee et al.
VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection
Zihua Liu, Hiroki Sakuma, Masatoshi Okutomi
OVMR: Open-Vocabulary Recognition with Multi-Modal References
Zehong Ma, Shiliang Zhang, Longhui Wei et al.
Unsegment Anything by Simulating Deformation
Jiahao Lu, Xingyi Yang, Xinchao Wang
Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks
Wei-Jin Huang, Yuan-Ming Li, Zhi-Wei Xia et al.
EFHQ: Multi-purpose ExtremePose-Face-HQ dataset
Trung Dao, Duc H Vu, Cuong Pham et al.
BerfScene: Bev-conditioned Equivariant Radiance Fields for Infinite 3D Scene Generation
Qihang Zhang, Yinghao Xu, Yujun Shen et al.
ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation
Zirun Guo, Tao Jin
FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis
Jiangtong Tan, Hu Yu, Jie Huang et al.
DEAL: Data-Efficient Adversarial Learning for High-Quality Infrared Imaging
Zhu Liu, Zijun Wang, Jinyuan Liu et al.
JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems
Yifan Wang, Jian Zhao, Zhaoxin Fan et al.
OpenSDI: Spotting Diffusion-Generated Images in the Open World
Yabin Wang, Zhiwu Huang, Xiaopeng Hong
Flow-Guided Online Stereo Rectification for Wide Baseline Stereo
Anush Kumar, Fahim Mannan, Omid Hosseini Jafari et al.
Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields
Shijie Zhou, Hui Ren, Yijia Weng et al.
StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN
Jongwoo Choi, Kwanggyoon Seo, Amirsaman Ashtari et al.
Point-to-Region Loss for Semi-Supervised Point-Based Crowd Counting
Wei Lin, Chenyang ZHAO, Antoni B. Chan
MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images
Aniruddha Ganguly, Debolina Chatterjee, Wentao Huang et al.
Move-in-2D: 2D-Conditioned Human Motion Generation
Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang et al.
SCULPT: Shape-Conditioned Unpaired Learning of Pose-dependent Clothed and Textured Human Meshes
Soubhik Sanyal, Partha Ghosh, Jinlong Yang et al.
Understanding Fine-tuning CLIP for Open-vocabulary Semantic Segmentation in Hyperbolic Space
Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.
DecoupledGaussian: Object-Scene Decoupling for Physics-Based Interaction
Miaowei Wang, Yibo Zhang, Rui Ma et al.
ShapeWalk: Compositional Shape Editing Through Language-Guided Chains
Habib Slim, Mohamed Elhoseiny
Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis
Tianci Bi, Xiaoyi Zhang, Zhizheng Zhang et al.
OSLoPrompt: Bridging Low-Supervision Challenges and Open-Set Domain Generalization in CLIP
Mohamad Hassan N C, Divyam Gupta, Mainak Singha et al.
SimVS: Simulating World Inconsistencies for Robust View Synthesis
Alex Trevithick, Roni Paiss, Philipp Henzler et al.
Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining
Shangquan Sun, Wenqi Ren, Juxiang Zhou et al.
Minimal Perspective Autocalibration
Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri et al.
AToM: Aligning Text-to-Motion Model at Event-Level with GPT-4Vision Reward
Haonan Han, Xiangzuo Wu, Huan Liao et al.
Spatial Transport Optimization by Repositioning Attention Map for Training-Free Text-to-Image Synthesis
Woojung Han, Yeonkyung Lee, Chanyoung Kim et al.
Improving Transferable Targeted Attacks with Feature Tuning Mixup
Kaisheng Liang, Xuelong Dai, Yanjie Li et al.
Long Video Diffusion Generation with Segmented Cross-Attention and Content-Rich Video Data Curation
Xin Yan, Yuxuan Cai, Qiuyue Wang et al.
ATP: Adaptive Threshold Pruning for Efficient Data Encoding in Quantum Neural Networks
Mohamed Afane, Gabrielle Ebbrecht, Ying Wang et al.
Segment Any-Quality Images with Generative Latent Space Enhancement
Guangqian Guo, Yong Guo, Xuehui Yu et al.
GCC: Generative Color Constancy via Diffusing a Color Checker
Chen-Wei Chang, Cheng-De Fan, Chia-Che Chang et al.
SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers
Jonathan F. Carter, Joao Jorge, Oliver Gibson et al.
Improving Subject-Driven Image Synthesis with Subject-Agnostic Guidance
Kelvin C.K. Chan, Yang Zhao, Xuhui Jia et al.
Satellite to GroundScape - Large-scale Consistent Ground View Generation from Satellite Views
Ningli Xu, Rongjun Qin
Learning Dynamic Collaborative Network for Semi-supervised 3D Vessel Segmentation
Jiao Xu, Xin Chen, Lihe Zhang
Optimizing for the Shortest Path in Denoising Diffusion Model
Ping Chen, Xingpeng Zhang, Zhaoxiang Liu et al.
Multi-modal Medical Diagnosis via Large-small Model Collaboration
Wanyi Chen, Zihua Zhao, Jiangchao Yao et al.
Enhancing Quality of Compressed Images by Mitigating Enhancement Bias Towards Compression Domain
Qunliang Xing, Mai Xu, Shengxi Li et al.
dFLMoE: Decentralized Federated Learning via Mixture of Experts for Medical Data Analysis
Luyuan Xie, Tianyu Luan, Wenyuan Cai et al.
Enhancing Online Continual Learning with Plug-and-Play State Space Model and Class-Conditional Mixture of Discretization
Sihao Liu, Yibo Yang, Xiaojie Li et al.
SVG-IR: Spatially-Varying Gaussian Splatting for Inverse Rendering
Hanxiao Sun, Yupeng Gao, Jin Xie et al.
UniAP: Unifying Inter- and Intra-Layer Automatic Parallelism by Mixed Integer Quadratic Programming
Hao Lin, Ke Wu, Jie Li et al.
Knowledge Bridger: Towards Training-Free Missing Modality Completion
Guanzhou Ke, Shengfeng He, Xiao-Li Wang et al.
Are Images Indistinguishable to Humans Also Indistinguishable to Classifiers?
Zebin You, Xinyu Zhang, Hanzhong Guo et al.
Ges3ViG : Incorporating Pointing Gestures into Language-Based 3D Visual Grounding for Embodied Reference Understanding
Atharv Mahesh Mane, Dulanga Weerakoon, Vigneshwaran Subbaraju et al.
T-FAKE: Synthesizing Thermal Images for Facial Landmarking
Philipp Flotho, Moritz Piening, Anna Kukleva et al.
One-Way Ticket: Time-Independent Unified Encoder for Distilling Text-to-Image Diffusion Models
Senmao Li, Lei Wang, Kai Wang et al.
Atom-Level Optical Chemical Structure Recognition with Limited Supervision
Martijn Oldenhof, Edward De Brouwer, Adam Arany et al.
Anatomically Constrained Implicit Face Models
Prashanth Chandran, Gaspard Zoss
CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
Yang Yue, Yulin Wang, Chenxin Tao et al.
GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking
Weikang Bian, Zhaoyang Huang, Xiaoyu Shi et al.
Preserving Clusters in Prompt Learning for Unsupervised Domain Adaptation
Long Tung Vuong, Hoang Phan, Vy Vo et al.
Multi-Label Prototype Visual Spatial Search for Weakly Supervised Semantic Segmentation
Songsong Duan, Xi Yang, Nannan Wang
SAT-HMR: Real-Time Multi-Person 3D Mesh Estimation via Scale-Adaptive Tokens
Chi Su, Xiaoxuan Ma, Jiajun Su et al.
ViiNeuS: Volumetric Initialization for Implicit Neural Surface Reconstruction of Urban Scenes with Limited Image Overlap
Hala Djeghim, Nathan Piasco, Moussab Bennehar et al.
ASHiTA: Automatic Scene-grounded HIerarchical Task Analysis
Yun Chang, Leonor Fermoselle, Duy Ta et al.
SPIDeRS: Structured Polarization for Invisible Depth and Reflectance Sensing
Tomoki Ichikawa, Shohei Nobuhara, Ko Nishino
BIGS: Bimanual Category-agnostic Interaction Reconstruction from Monocular Videos via 3D Gaussian Splatting
Jeongwan On, Kyeonghwan Gwak, Gunyoung Kang et al.
Symmetry Strikes Back: From Single-Image Symmetry Detection to 3D Generation
Xiang Li, Zixuan Huang, Anh Thai et al.
SMILE: Infusing Spatial and Motion Semantics in Masked Video Learning
Fida Mohammad Thoker, Letian Jiang, Chen Zhao et al.
LidarGait++: Learning Local Features and Size Awareness from LiDAR Point Clouds for 3D Gait Recognition
Chuanfu Shen, Rui Wang, Lixin Duan et al.
VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction
Ziyue Zhu, Shenlong Wang, Jin Xie et al.
Bayesian Differentiable Physics for Cloth Digitalization
Deshan Gong, Ningtao Mao, He Wang
Reconstructing Animals and the Wild
Peter Kulits, Michael J. Black, Silvia Zuffi
Learning Affine Correspondences by Integrating Geometric Constraints
Pengju Sun, Banglei Guan, Zhenbao Yu et al.
HiFi-Portrait: Zero-shot Identity-preserved Portrait Generation with High-fidelity Multi-face Fusion
Yifang Xu, BenXiang Zhai, Yunzhuo Sun et al.
Traffic Scene Parsing through the TSP6K Dataset
Peng-Tao Jiang, Yuqi Yang, Yang Cao et al.
H-MoRe: Learning Human-centric Motion Representation for Action Analysis
Zhanbo Huang, Xiaoming Liu, Yu Kong
Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs
Lucas Ventura, Antoine Yang, Cordelia Schmid et al.
On Train-Test Class Overlap and Detection for Image Retrieval
Chull Hwan Song, Jooyoung Yoon, Taebaek Hwang et al.
PolarFree: Polarization-based Reflection-Free Imaging
Mingde Yao, Menglu Wang, King Man Tam et al.
SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking
Wenrui Cai, Qingjie Liu, Yunhong Wang
Learnable Infinite Taylor Gaussian for Dynamic View Rendering
Bingbing Hu, Yanyan Li, rui xie et al.
CustomKD: Customizing Large Vision Foundation for Edge Model Improvement via Knowledge Distillation
Jungsoo Lee, Debasmit Das, Munawar Hayat et al.
RICCARDO: Radar Hit Prediction and Convolution for Camera-Radar 3D Object Detection
Yunfei Long, Abhinav Kumar, Xiaoming Liu et al.
Localizing Events in Videos with Multimodal Queries
Gengyuan Zhang, Mang Ling Ada Fok, Jialu Ma et al.
Morpheus: Text-Driven 3D Gaussian Splat Shape and Color Stylization
Jamie Wynn, Zawar Qureshi, Jakub Powierza et al.
Pose Adapted Shape Learning for Large-Pose Face Reenactment
Gee-Sern Hsu, Jie-Ying Zhang, Yu-Hsiang Huang et al.
PatchVSR: Breaking Video Diffusion Resolution Limits with Patch-wise Video Super-Resolution
Shian Du, Menghan Xia, Chang Liu et al.
Rethinking Epistemic and Aleatoric Uncertainty for Active Open-Set Annotation: An Energy-Based Approach
Chen-Chen Zong, Sheng-Jun Huang
MixerMDM: Learnable Composition of Human Motion Diffusion Models
Pablo Ruiz-Ponce, German Barquero, Cristina Palmero et al.
Tokenize Image Patches: Global Context Fusion for Effective Haze Removal in Large Images
Jiuchen Chen, Xinyu Yan, Qizhi Xu et al.
Generalizable Novel-View Synthesis using a Stereo Camera
Haechan Lee, Wonjoon Jin, Seung-Hwan Baek et al.
Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures
Guoxing Sun, Rishabh Dabral, Heming Zhu et al.
Infrared Adversarial Car Stickers
Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu et al.
Secret Lies in Color: Enhancing AI-Generated Images Detection with Color Distribution Analysis
Zexi Jia, Chuanwei Huang, Yeshuang Zhu et al.
Understanding Multi-Task Activities from Single-Task Videos
Yuhan Shen, Ehsan Elhamifar
GENIUS: A Generative Framework for Universal Multimodal Search
Sungyeon Kim, Xinliang Zhu, Xiaofan Lin et al.
STiL: Semi-supervised Tabular-Image Learning for Comprehensive Task-Relevant Information Exploration in Multimodal Classification
Siyi Du, Xinzhe Luo, Declan ORegan et al.
Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language
Yicheng Chen, Xiangtai Li, Yining Li et al.
BG-Triangle: Bézier Gaussian Triangle for 3D Vectorization and Rendering
Minye Wu, Haizhao Dai, Kaixin Yao et al.
Interpretable Measures of Conceptual Similarity by Complexity-Constrained Descriptive Auto-Encoding
Alessandro Achille, Greg Ver Steeg, Tian Yu Liu et al.
Learning to Detect Objects from Multi-Agent LiDAR Scans without Manual Labels
Qiming Xia, Wenkai Lin, Haoen Xiang et al.
Evaluating Vision-Language Models as Evaluators in Path Planning
Mohamed Aghzal, Xiang Yue, Erion Plaku et al.
GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection
Jeffri Erwin Murrugarra Llerena, José Henrique Marques, Claudio Jung
End-to-End Implicit Neural Representations for Classification
Alexander Gielisse, Jan van Gemert
Context-Aware Multimodal Pretraining
Karsten Roth, Zeynep Akata, Dima Damen et al.
Towards All-in-One Medical Image Re-Identification
Yuan Tian, Kaiyuan Ji, Rongzhao Zhang et al.
Synchronized Video-to-Audio Generation via Mel Quantization-Continuum Decomposition
Juncheng Wang, Chao Xu, Cheng Yu et al.
Zero-1-to-A: Zero-Shot One Image to Animatable Head Avatars Using Video Diffusion
Zhenglin Zhou, Fan Ma, Hehe Fan et al.
Enhanced then Progressive Fusion with View Graph for Multi-View Clustering
Zhibin Dong, Meng Liu, Siwei Wang et al.
CASAGPT: Cuboid Arrangement and Scene Assembly for Interior Design
Weitao Feng, Hang Zhou, Jing Liao et al.
3DFIRES: Few Image 3D REconstruction for Scenes with Hidden Surfaces
Linyi Jin, Nilesh Kulkarni, David Fouhey
DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation
Amin Karimi, Charalambos Poullis
GASP: Gaussian Avatars with Synthetic Priors
Jack Saunders, Charlie Hewitt, Yanan Jian et al.
Temporal Score Analysis for Understanding and Correcting Diffusion Artifacts
Yu Cao, Zengqun Zhao, Ioannis Patras et al.
Dynamic Motion Blending for Versatile Motion Editing
Nan Jiang, Hongjie Li, Ziye Yuan et al.
Ev-3DOD: Pushing the Temporal Boundaries of 3D Object Detection with Event Cameras
Hoonhee Cho, Jae-Young Kang, Youngho Kim et al.
Enhancing Intrinsic Features for Debiasing via Investigating Class-Discerning Common Attributes in Bias-Contrastive Pair
Jeonghoon Park, Chaeyeon Chung, Jaegul Choo
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
Xin Wen, Bingchen Zhao, Yilun Chen et al.
ESCAPE: Equivariant Shape Completion via Anchor Point Encoding
Burak Bekci, Nassir Navab, Federico Tombari et al.
LatentHOI: On the Generalizable Hand Object Motion Generation with Latent Hand Diffusion.
Muchen Li, Sammy Christen, Chengde Wan et al.
Efficient Motion-Aware Video MLLM
Zijia Zhao, Yuqi Huo, Tongtian Yue et al.
Probing the Mid-level Vision Capabilities of Self-Supervised Learning
Xuweiyi Chen, Markus Marks, Zezhou Cheng
ZeroVO: Visual Odometry with Minimal Assumptions
Lei Lai, Zekai Yin, Eshed Ohn-Bar
Leak and Learn: An Attacker's Cookbook to Train Using Leaked Data from Federated Learning
Joshua C. Zhao, Ahaan Dabholkar, Atul Sharma et al.
BiM-VFI: Bidirectional Motion Field-Guided Frame Interpolation for Video with Non-uniform Motions
Wonyong Seo, Jihyong Oh, Munchurl Kim
Universal Scene Graph Generation
Shengqiong Wu, Hao Fei, Tat-seng Chua
IReNe: Instant Recoloring of Neural Radiance Fields
Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.
CASP: Compression of Large Multimodal Models Based on Attention Sparsity
Mohsen Gholami, Mohammad Akbari, Kevin Cannons et al.
Protecting Your Video Content: Disrupting Automated Video-based LLM Annotations
Haitong Liu, Kuofeng Gao, Yang Bai et al.
ForestLPR: LiDAR Place Recognition in Forests Attentioning Multiple BEV Density Images
Yanqing Shen, Turcan Tuna, Marco Hutter et al.
SAM-I2V: Upgrading SAM to Support Promptable Video Segmentation with Less than 0.2% Training Cost
Haiyang Mei, Pengyu Zhang, Mike Zheng Shou
Reconstructing In-the-Wild Open-Vocabulary Human-Object Interactions
Boran Wen, Dingbang Huang, Zichen Zhang et al.
OODD: Test-time Out-of-Distribution Detection with Dynamic Dictionary
Yifeng Yang, Lin Zhu, Zewen Sun et al.
Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications
Tong Bu, Maohua Li, Zhaofei Yu
Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation
Guangyang Wu, Xiaohong Liu, Jun Jia et al.
TKG-DM: Training-free Chroma Key Content Generation Diffusion Model
Ryugo Morita, Stanislav Frolov, Brian Bernhard Moser et al.
AdaCM^2: On Understanding Extremely Long-Term Video with Adaptive Cross-Modality Memory Reduction
Yuanbin Man, Ying Huang, Chengming Zhang et al.
LPSNet: End-to-End Human Pose and Shape Estimation with Lensless Imaging
Haoyang Ge, Qiao Feng, Hailong Jia et al.
Non-autoregressive Sequence-to-Sequence Vision-Language Models
Kunyu Shi, Qi Dong, Luis Goncalves et al.
LeGO: Leveraging a Surface Deformation Network for Animatable Stylized Face Generation with One Example
Soyeon Yoon, Kwan Yun, Kwanggyoon Seo et al.
Unsupervised Feature Learning with Emergent Data-Driven Prototypicality
Yunhui Guo, Youren Zhang, Yubei Chen et al.
Full-DoF Egomotion Estimation for Event Cameras Using Geometric Solvers
Ji Zhao, Banglei Guan, Zibin Liu et al.
DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image
Hyeongjin Nam, Donghwan Kim, Jeongtaek Oh et al.
ReRAW: RGB-to-RAW Image Reconstruction via Stratified Sampling for Efficient Object Detection on the Edge
Radu Berdan, Beril Besbinar, Christoph Reinders et al.
LightLoc: Learning Outdoor LiDAR Localization at Light Speed
Wen Li, Chen Liu, Shangshu Yu et al.
QuCOOP: A Versatile Framework for Solving Composite and Binary-Parametrised Problems on Quantum Annealers
Natacha Kuete Meli, Vladislav Golyanik, Marcel Seelbach Benkner et al.
FruitNinja: 3D Object Interior Texture Generation with Gaussian Splatting
Fangyu Wu, Yuhao Chen