Most Cited ICCV "human-centric generation" Papers
LDIP: Long Distance Information Propagation for Video Super-Resolution
Michael Bernasconi, Abdelaziz Djelouah, Yang Zhang et al.
ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning
Yuanlin Wang, Ruiqin Xiong, Rui Zhao et al.
Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation
Nairouz Mrabah, Nicolas Richet, Ismail Ayed et al.
Continuous-Time Human Motion Field from Event Cameras
Ziyun Wang, Ruijun Zhang, Zi-Yan Liu et al.
Neural Architecture Search Driven by Locally Guided Diffusion for Personalized Federated Learning
Peng Liao, Xilu Wang, Yaochu Jin et al.
Hierarchical 3D Scene Graphs Construction Outdoors
Jon Nyffeler, Federico Tombari, Daniel Barath
Cycle-Consistent Learning for Joint Layout-to-Image Generation and Object Detection
Xinhao Cai, Qiuxia Lai, Gensheng Pei et al.
WarpHE4D: Dense 4D Head Map toward Full Head Reconstruction
Jongseob Yun, Yong-Hoon Kwon, Min-Gyu Park et al.
Federated Representation Angle Learning
Liping Yi, Han Yu, Gang Wang et al.
Bridging Local Inductive Bias and Long-Range Dependencies with Pixel-Mamba for End-to-end Whole Slide Image Analysis
Zhongwei Qiu, Hanqing Chao, Tiancheng Lin et al.
Neuroverse3D: Developing In-Context Learning Universal Model for Neuroimaging in 3D
Jiesi Hu, Hanyang Peng, Yanwu Yang et al.
Laboring on less labors: RPCA Paradigm for Pan-sharpening
Honghui Xu, Chuangjie Fang, Yibin Wang et al.
Punching Bag vs. Punching Person: Motion Transferability in Videos
Raiyaan Abdullah, Jared Claypoole, Michael Cogswell et al.
Robust Test-Time Adaptation for Single Image Denoising Using Deep Gaussian Prior
Qing Ma, Pengwei Liang, Xiong Zhou et al.
Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis
Lei-lei Li, Jianwu Fang, Junbin Xiao et al.
Incremental Few-Shot Semantic Segmentation via Multi-Level Switchable Visual Prompts
Maoxian Wan, Kaige Li, Qichuan Geng et al.
TrackVerse: A Large-Scale Object-Centric Video Dataset for Image-Level Representation Learning
Yibing Wei, Samuel Church, Victor Suciu et al.
DGTalker: Disentangled Generative Latent Space Learning for Audio-Driven Gaussian Talking Heads
Xiaoxi Liang, Yanbo Fan, Qiya Yang et al.
StyleSRN: Scene Text Image Super-Resolution with Text Style Embedding
Shengrong Yuan, Runmin Wang, Ke Hao et al.
Frequency-Guided Diffusion for Training-Free Text-Driven Image Translation
Zheng Gao, Jifei Song, Zhensong Zhang et al.
Less is More: Improving Motion Diffusion Models with Sparse Keyframes
Jinseok Bae, Inwoo Hwang, Young-Yoon Lee et al.
Drawing Developmental Trajectory from Cortical Surface Reconstruction
Wenxuan Wu, Ruowen Qu, Zhongliang Liu et al.
Frequency-Semantic Enhanced Variational Autoencoder for Zero-Shot Skeleton-based Action Recognition
Wenhan Wu, Zhishuai Guo, Chen Chen et al.
How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game
Ziyue Wang, Yurui Dong, Fuwen Luo et al.
Towards Human-like Virtual Beings: Simulating Human Behavior in 3D Scenes
Chen Liang, Wenguan Wang, Yi Yang
Cross-Category Subjectivity Generalization for Style-Adaptive Sketch Re-ID
Zechao Hu, Zhengwei Yang, Hao Li et al.
S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction
Guangting Zheng, Jiajun Deng, Xiaomeng Chu et al.
The Source Image is the Best Attention for Infrared and Visible Image Fusion
Song Wang, Xie Han, Liqun Kuang et al.
Blind Noisy Image Deblurring Using Residual Guidance Strategy
Heyan Liu, Jianing Sun, Jun Liu et al.
MonSTeR: a Unified Model for Motion, Scene, Text Retrieval
Luca Collorone, Matteo Gioia, Massimiliano Pappa et al.
E-NeMF: Event-based Neural Motion Field for Novel Space-time View Synthesis of Dynamic Scenes
Yan Liu, Zehao Chen, Haojie Yan et al.
Learnable Fractional Reaction-Diffusion Dynamics for Under-Display ToF Imaging and Beyond
Xin Qiao, Matteo Poggi, Xing Wei et al.
Multi-modal Identity Extraction
Ryan Webster, Teddy Furon
Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion
Byeonghun Lee, Hyunmin Cho, Honggyu Choi et al.
Deep Adaptive Unfolded Network via Spatial Morphology Stripping and Spectral Filtration for Pan-sharpening
Hebaixu Wang, Jiayi Ma
Open-World Skill Discovery from Unsegmented Demonstration Videos
Jingwen Deng, Zihao Wang, Shaofei Cai et al.
InteractAvatar: Modeling Hand-Face Interaction in Photorealistic Avatars with Deformable Gaussians
Kefan Chen, Sergiu Oprea, Justin Theiss et al.
Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer
Md Ashiqur Rahman, Chiao-An Yang, Michael N Cheng et al.
Task-Oriented Human Grasp Synthesis via Context- and Task-Aware Diffusers
An Lun Liu, Yu-Wei Chao, Yi-Ting Chen
Wave-MambaAD: Wavelet-driven State Space Model for Multi-class Unsupervised Anomaly Detection
Qiao Zhang, Mingwen Shao, Xinyuan Chen et al.
3D Test-time Adaptation via Graph Spectral Driven Point Shift
Xin Wei, Qin Yang, Yijie Fang et al.
Task-Decoupled Bézier Surface Constraint for Uneven Low-Light Image Enhancement
Xingxiang Zhou, Xiangdong Su, Haoran Zhang et al.
EMoTive: Event-guided Trajectory Modeling for 3D Motion Estimation
Zengyu Wan, Wei Zhai, Yang Cao et al.
RoboAnnotatorX: A Comprehensive and Universal Annotation Framework for Accurate Understanding of Long-horizon Robot Demonstration
Longxin Kou, Fei Ni, Jianye Hao et al.
What If: Understanding Motion Through Sparse Interactions
Stefan A. Baumann, Nick Stracke, Timy Phan et al.
GeoDistill: Geometry-Guided Self-Distillation for Weakly Supervised Cross-View Localization
Shaowen Tong, Zimin Xia, Alexandre Alahi et al.
KDA: Knowledge Diffusion Alignment with Enhanced Context for Video Temporal Grounding
Ran Ran, Jiwei Wei, Shiyuan He et al.
Error Recognition in Procedural Videos using Generalized Task Graph
Shih-Po Lee, Ehsan Elhamifar
STEP-DETR: Advancing DETR-based Semi-Supervised Object Detection with Super Teacher and Pseudo-Label Guided Text Queries
Tahira Shehzadi, Khurram Azeem Hashmi, Shalini Sarode et al.
Text-to-Any-Skeleton Motion Generation Without Retargeting
Qingyuan Liu, Ke Lv, Kun Dong et al.
Completing 3D Partial Assemblies with View-Consistent 2D-3D Correspondence
Weihao Wang, Yu Lan, Mingyu You et al.
Aligning Global Semantics and Local Textures in Generative Video Enhancement
Zhikai Chen, Fuchen Long, Zhaofan Qiu et al.
Simulating Dual-Pixel Images From Ray Tracing For Depth Estimation
Fengchen He, Dayang Zhao, Hao Xu et al.
WalkVLM: Aid Visually Impaired People Walking by Vision Language Model
Zhiqiang Yuan, Ting Zhang, Yeshuang Zhu et al.
Unified Category-Level Object Detection and Pose Estimation from RGB Images using 3D Prototypes
Tom Fischer, Xiaojie Zhang, Eddy Ilg
Proactive Scene Decomposition and Reconstruction
Baicheng Li, Zike Yan, Dong Wu et al.
PASD: A Pixel-Adaptive Swarm Dynamics Approach for Unsupervised Low-Light Image Enhancement
Shuai Jin, Yuhua Qian, Feijiang Li et al.
Bridging the Sky and Ground: Towards View-Invariant Feature Learning for Aerial-Ground Person Re-Identification
Wajahat Khalid, Bin Liu, Xulin Li et al.
Combinative Matching for Geometric Shape Assembly
Nahyuk Lee, Juhong Min, Junhong Lee et al.
Dream-to-Recon: Monocular 3D Reconstruction with Diffusion-Depth Distillation from Single Images
Philipp Wulff, Felix Wimbauer, Dominik Muhle et al.
Auto-Regressive Transformation for Image Alignment
Kanggeon Lee, Soochahn Lee, Kyoung Mu Lee
Training-Free Industrial Defect Generation with Diffusion Models
Ruyi Xu, Yen-Tzu Chiu, Tai-I Chen et al.
GECO: Geometrically Consistent Embedding with Lightspeed Inference
Regine Hartwig, Dominik Muhle, Riccardo Marin et al.
GenHaze: Pioneering Controllable One-Step Realistic Haze Generation for Real-World Dehazing
Sixiang Chen, Tian Ye, Yunlong Lin et al.
WINS: Winograd Structured Pruning for Fast Winograd Convolution
Cheonjun Park, Hyunjae Oh, Mincheol Park et al.
DuET: Dual Incremental Object Detection via Exemplar-Free Task Arithmetic
Munish Monga, Vishal Chudasama, Pankaj Wasnik et al.
ReCoT: Reflective Self-Correction Training for Mitigating Confirmation Bias in Large Vision-Language Models
Mengxue Qu, Yibo Hu, Kunyang Han et al.
A View-consistent Sampling Method for Regularized Training of Neural Radiance Fields
Aoxiang Fan, Corentin Dumery, Nicolas Talabot et al.
MixRI: Mixing Features of Reference Images for Novel Object Pose Estimation
Xinhang Liu, Jiawei Shi, Zheng Dang et al.
RnGCam: High-speed video from rolling & global shutter measurements
Kevin Tandi, Xiang Dai, Chinmay Talegaonkar et al.
Test-Time Retrieval-Augmented Adaptation for Vision-Language Models
Xinqi Fan, Xueli Chen, Luoxiao Yang et al.
SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures
Yi Qin, Rui Wang, Tao Huang et al.
CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector
Abhinav Kumar, Yuliang Guo, Zhihao Zhang et al.
Environment-Agnostic Pose: Generating Environment-independent Object Representations for 6D Pose Estimation
Shaobo Zhang, Yuhang Huang, Wanqing Zhao et al.
Spatial Alignment and Temporal Matching Adapter for Video-Radar Remote Physiological Measurement
Qian Liang, Ruixu Geng, Jinbo Chen et al.
Gaze-Language Alignment for Zero-Shot Prediction of Visual Search Targets from Human Gaze Scanpaths
Sounak Mondal, Naveen Sendhilnathan, Ting Zhang et al.
PS-Mamba: Spatial-Temporal Graph Mamba for Pose Sequence Refinement
Haoye Dong, Gim Hee Lee
Event-guided Unified Framework for Low-light Video Enhancement, Frame Interpolation, and Deblurring
Taewoo Kim, Kuk-Jin Yoon
Beyond Pixel Uncertainty: Bounding the OoD Objects in Road Scenes
Huachao Zhu, Zelong Liu, Zhichao Sun et al.
A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds
Jizong Peng, Tze Ho Elden Tse, Kai Xu et al.
Conditional Visual Autoregressive Modeling for Pathological Image Restoration
Ziyi Liu, Zhe Xu, Jiabo Ma et al.
LGA-Net: Learning Local and Global Affinities for Sparse Scribble based Image Colorization
Hongjin Lyu, Bo Li, Paul Rosin et al.
Medical World Model
Yijun Yang, Zhao-Yang Wang, Qiuping Liu et al.
ImageGem: In-the-wild Generative Image Interaction Dataset for Generative Model Personalization
Yuanhe Guo, Linxi Xie, Zhuoran Chen et al.
EYE3: Turn Anything into Naked-eye 3D
Yingde Song, Zongyuan Yang, Baolin Liu et al.
C2MIL: Synchronizing Semantic and Topological Causalities in Multiple Instance Learning for Robust and Interpretable Survival Analysis
Min Cen, Zhenfeng Zhuang, Yuzhe Zhang et al.
Partially Matching Submap Helps: Uncertainty Modeling and Propagation for Text to Point Cloud Localization
Mingtao Feng, Longlong Mei, Zijie Wu et al.
TopicGeo: An Efficient Unified Framework for Geolocation
Xin Wang, Xinlin Wang, Shuiping Gou
High-Resolution Spatiotemporal Modeling with Global-Local State Space Models for Video-Based Human Pose Estimation
Runyang Feng, Hyung Jin Chang, Tze Ho Elden Tse et al.
Learning Visual Proxy for Compositional Zero-Shot Learning
Shiyu Zhang, Cheng Yan, Yang Liu et al.
CObL: Toward Zero-Shot Ordinal Layering without User Prompting
Aneel Damaraju, Dean Hazineh, Todd Zickler
Adapting Vehicle Detectors for Aerial Imagery to Unseen Domains with Weak Supervision
Xiao Fang, Minhyek Jeon, Zheyang Qin et al.
One Look is Enough: Seamless Patchwise Refinement for Zero-Shot Monocular Depth Estimation on High-Resolution Images
Byeongjun Kwon, Munchurl Kim
Background Invariance Testing According to Semantic Proximity
Zukang Liao, Min Chen
Clink! Chop! Thud! - Learning Object Sounds from Real-World Interactions
Mengyu Yang, Yiming Chen, Haozheng Pei et al.
TryOn-Refiner: Conditional Rectified-flow-based TryOn Refiner for More Accurate Detail Reconstruction
Wen Qian
EquiCaps: Predictor-Free Pose-Aware Pre-Trained Capsule Networks
Athinoulla Konstantinou, Georgios Leontidis, Mamatha Thota et al.
Mitigating Catastrophic Overfitting in Fast Adversarial Training via Label Information Elimination
Chao Pan, Ke Tang, Li Qing et al.
AJAHR: Amputated Joint Aware 3D Human Mesh Recovery
Hyunjin Cho, Giyun Choi, Jongwon Choi
SpikeDiff: Zero-shot High-Quality Video Reconstruction from Chromatic Spike Camera and Sub-millisecond Spike Streams
Siqi Yang, Jinxiu Liang, Zhaojun Huang et al.
Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos
Yuang Feng, Shuyong Gao, Fuzhen Yan et al.
Global Motion Corresponder for 3D Point-Based Scene Interpolation under Large Motion
Junru Lin, Chirag Vashist, Mikaela Uy et al.
TESPEC: Temporally-Enhanced Self-Supervised Pretraining for Event Cameras
Mohammad Mohammadi, Ziyi Wu, Igor Gilitschenski
From Abyssal Darkness to Blinding Glare: A Benchmark on Extreme Exposure Correction in Real World
Bo Wang, Huiyuan Fu, Zhiye Huang et al.
Diffusion-based Source-biased Model for Single Domain Generalized Object Detection
Han Jiang, Wenfei Yang, Tianzhu Zhang et al.
H3R: Hybrid Multi-view Correspondence for Generalizable 3D Reconstruction
Heng Jia, Na Zhao, Linchao Zhu
Estimating 2D Camera Motion with Hybrid Motion Basis
Haipeng Li, Tianhao Zhou, Zhanglei Yang et al.
MonoSOWA: Scalable monocular 3D Object detector Without human Annotations
Jan Skvrna, Lukas Neumann
Discovering Divergent Representations between Text-to-Image Models
Lisa Dunlap, Trevor Darrell, Joseph Gonzalez et al.
ScoreHOI: Physically Plausible Reconstruction of Human-Object Interaction via Score-Guided Diffusion
Ao Li, Jinpeng Liu, Yixuan Zhu et al.
PHD: Personalized 3D Human Body Fitting with Point Diffusion
Hsuan-I Ho, Chen Guo, Po-Chen Wu et al.
Understanding Personal Concept in Open-Vocabulary Semantic Segmentation
Sunghyun Park, Jungsoo Lee, Shubhankar Borse et al.
FuXi-RTM: A Physics-Guided Prediction Framework with Radiative Transfer Modeling
Qiusheng Huang, Xiaohui Zhong, Xu Fan et al.
Geometric Alignment and Prior Modulation for View-Guided Point Cloud Completion on Unseen Categories
Jingqiao Xiu, Yicong Li, Na Zhao et al.
Visual Interestingness Decoded: How GPT-4o Mirrors Human Interests
Fitim Abdullahu, Helmut Grabner
Motion-2-to-3: Leveraging 2D Motion Data for 3D Motion Generations
Ruoxi Guo, Huaijin Pi, Zehong Shen et al.
Image-Guided Shape-from-Template Using Mesh Inextensibility Constraints
Dinh-Vinh-Thuy Tran, Ruochen Chen, Shaifali Parashar
Towards Performance Consistency in Multi-Level Model Collaboration
Qi Li, Runpeng Yu, Xinchao Wang
AutoComPose: Automatic Generation of Pose Transition Descriptions for Composed Pose Retrieval Using Multimodal LLMs
Yi-Ting Shen, Sungmin Eum, Doheon Lee et al.
Tracking Tiny Drones against Clutter: Large-Scale Infrared Benchmark with Motion-Centric Adaptive Algorithm
Jiahao Zhang, Zongli Jiang, Gang Wang et al.
FVGen: Accelerating Novel-View Synthesis with Adversarial Video Diffusion Distillation
Wenbin Teng, Gonglin Chen, Haiwei Chen et al.
PVMamba: Parallelizing Vision Mamba via Dynamic State Aggregation
Fei Xie, Zhongdao Wang, Weijia Zhang et al.
SummDiff: Generative Modeling of Video Summarization with Diffusion
Kwanseok Kim, Jaehoon Hahm, Sumin Kim et al.
CoralSRT: Revisiting Coral Reef Semantic Segmentation by Feature Rectifying via Self-supervised Guidance
Zheng Ziqiang, Wong Kwan, Binh-Son Hua et al.
Feature Decomposition-Recomposition in Large Vision-Language Model for Few-Shot Class-Incremental Learning
Zongyao Xue, Meina Kan, Shiguang Shan et al.
Diagnosing Pretrained Models for Out-of-distribution Detection
Haipeng Xiong, Kai Xu, Angela Yao
RALoc: Enhancing Outdoor LiDAR Localization via Rotation Awareness
Yuyang Yang, Wen Li, Sheng Ao et al.
CHORDS: Diffusion Sampling Accelerator with Multi-core Hierarchical ODE Solvers
Jiaqi Han, Haotian Ye, Puheng Li et al.
Enhanced Pansharpening via Quaternion Spatial-Spectral Interactions
Dong Li, Chunhui Luo, Yuanfei Bao et al.
Adversarial Training for Probabilistic Robustness
Yi Zhang, Yuhang Chen, Zhen Chen et al.
Learning to See Inside Opaque Liquid Containers using Speckle Vibrometry
Matan Kichler, Shai Bagon, Mark Sheinin
Scaling Omni-modal Pretraining with Multimodal Context: Advancing Universal Representation Learning Across Modalities
Yiyuan Zhang, Handong Li, Jing Liu et al.
LightBSR: Towards Lightweight Blind Super-Resolution via Discriminative Implicit Degradation Representation Learning
Jiang Yuan, Ji Ma, Bo Wang et al.
When Pixel Difference Patterns Meet ViT: PiDiViT for Few-Shot Object Detection
Hongliang Zhou, Yongxiang Liu, Canyu Mo et al.
VOccl3D: A Video Benchmark Dataset for 3D Human Pose and Shape Estimation under real Occlusions
Yash Garg, Saketh Bachu, Arindam Dutta et al.
Exploring View Consistency for Scene-Adaptive Low-Light Light Field Image Enhancement
Shuo Zhang, Chen Gao, Youfang Lin
Learning Normals of Noisy Points by Local Gradient-Aware Surface Filtering
Qing Li, Huifang Feng, Xun Gong et al.
HccePose (BF): Predicting Front & Back Surfaces to Construct Ultra-Dense 2D-3D Correspondences for Pose Estimation
Yulin Wang, Mengting Hu, Hongli Li et al.
Bayesian-Inspired Space-Time Superpixels
Kent Gauen, Stanley Chan
Progressive Distribution Bridging: Unsupervised Adaptation for Large-scale Pre-trained Models via Adaptive Auxiliary Data
Weinan He, Yixin Zhang, Zilei Wang
Kaleidoscopic Background Attack: Disrupting Pose Estimation with Multi-Fold Radial Symmetry Textures
Xinlong Ding, Hongwei Yu, Jiawei Li et al.
Client2Vec: Improving Federated Learning by Distribution Shifts Aware Client Indexing
Yongxin Guo, Lin Wang, Xiaoying Tang et al.
4DSegStreamer: Streaming 4D Panoptic Segmentation via Dual Threads
Ling Liu, Jun Tian, Li Yi
INSTINCT: Instance-Level Interaction Architecture for Query-Based Collaborative Perception
Yunjiang Xu, Yupeng Ouyang, Lingzhi Li et al.
SPD: Shallow Backdoor Protecting Deep Backdoor Against Backdoor Detection
Shunjie Yuan, Xinghua Li, Xuelin Cao et al.
StableDepth: Scene-Consistent and Scale-Invariant Monocular Depth
Zheng Zhang, Lihe Yang, Tianyu Yang et al.
Layer-wise Vision Injection with Disentangled Attention for Efficient LVLMs
Xuange Zhang, Dengjie Li, Bo Liu et al.
Rethinking DPO-style Diffusion Aligning Frameworks
Xun Wu, Shaohan Huang, Lingjie Jiang et al.
Debiased Curriculum Adaptation for Safe Transfer Learning in Chest X-ray Classification
Mingyang Liu, Xinyang Chen, Yang Shu et al.
Weakly-Supervised Learning of Dense Functional Correspondences
Stefan Stojanov, Linan Zhao, Yunzhi Zhang et al.
End-to-End Entity-Predicate Association Reasoning for Dynamic Scene Graph Generation
LiWei Wang, YanDuo Zhang, Tao Lu et al.
PlaneRAS: Learning Planar Primitives for 3D Plane Recovery
Fang Zhang, Wenzhao Zheng, Linqing Zhao et al.
Unleashing the Temporal Potential of Stereo Event Cameras for Continuous-Time 3D Object Detection
Jae Young Kang, Hoonhee Cho, Kuk-Jin Yoon
Ensemble Foreground Management for Unsupervised Object Discovery
Ziling Wu, Armaghan Moemeni, Praminda Caleb-Solly
Forensic-MoE: Exploring Comprehensive Synthetic Image Detection Traces with Mixture of Experts
Mingqi Fang, Ziguang Li, Lingyun Yu et al.
AR-VRM: Imitating Human Motions for Visual Robot Manipulation with Analogical Reasoning
Dejie Yang, Zijing Zhao, Yang Liu
Imbalance in Balance: Online Concept Balancing in Generation Models
Yukai Shi, Jiarong Ou, Rui Chen et al.
Entropy-Adaptive Diffusion Policy Optimization with Dynamic Step Alignment
Renye Yan, Jikang Cheng, Yaozhong Gan et al.
A Simple yet Mighty Hartley Diffusion Versatilist for Generalizable Dense Vision Tasks
Qi Bi, Jingjun Yi, Huimin Huang et al.
SpatialTrackerV2: Advancing 3D Point Tracking with Explicit Camera Motion
Yuxi Xiao, Jianyuan Wang, Nan Xue et al.
Focal Plane Visual Feature Generation and Matching on a Pixel Processor Array
Hongyi Zhang, Laurie Bose, Jianing Chen et al.
Leveraging Panoptic Scene Graph for Evaluating Fine-Grained Text-to-Image Generation
Xueqing Deng, Linjie Yang, Qihang Yu et al.
PASG: A Closed-Loop Framework for Automated Geometric Primitive Extraction and Semantic Anchoring in Robotic Manipulation
Zhihao Zhu, Yifan Zheng, Siyu Pan et al.
GloPER: Unsupervised Animal Pattern Extraction from Local Reconstruction
Bowen Chen, Yun Sing Koh, Gillian Dobbie
Epipolar Consistent Attention Aggregation Network for Unsupervised Light Field Disparity Estimation
Chen Gao, Shuo Zhang, Youfang Lin
Physical Degradation Model-Guided Interferometric Hyperspectral Reconstruction with Unfolding Transformer
Yuansheng Li, Yunhao Zou, Linwei Chen et al.
VPR-Cloak: A First Look at Privacy Cloak Against Visual Place Recognition
Shuting Dong, Mingzhi Chen, Feng Lu et al.
Multi-Modal Multi-Task Unified Embedding Model (M3T-UEM): A Task-Adaptive Representation Learning Framework
Rohan Sharma, Changyou Chen, Feng-Ju Chang et al.
Instance-Level Video Depth in Groups Beyond Occlusions
Yuan Liang, Yang Zhou, Ziming Sun et al.
Hierarchical Variational Test-Time Prompt Generation for Zero-Shot Generalization
Zhaoyang Wu, Fang Liu, Licheng Jiao et al.
Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation
Pengfei Ren, Jingyu Wang, Haifeng Sun et al.
DRaM-LHM: A Quaternion Framework for Iterative Camera Pose Estimation
Chen Lin, Weizhi Du, Zhixiang Min et al.
CO2-Net: A Physics-Informed Spatio-Temporal Model for Global Surface CO2 Reconstruction
Hao Zheng, Yuting Zheng, Hanbo Huang et al.
Scaling 3D Compositional Models for Robust Classification and Pose Estimation
Xiaoding Yuan, Prakhar Kaushik, Guofeng Zhang et al.
HOMO-Feature: Cross-Arbitrary-Modal Image Matching with Homomorphism of Organized Major Orientation
Chenzhong Gao, Wei Li, Desheng Weng
OCSplats: Observation Completeness Quantification and Label Noise Separation in 3DGS
Han Ling, Yinghui Sun, Xian Xu et al.
GSOT3D: Towards Generic 3D Single Object Tracking in the Wild
Yifan Jiao, Yunhao Li, Junhua Ding et al.
OVA-Fields: Weakly Supervised Open-Vocabulary Affordance Fields for Robot Operational Part Detection
Heng Su, Mengying Xie, Nieqing Cao et al.
Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers
Yunshan Zhong, Yuyao Zhou, Yuxin Zhang et al.
GeoDiffusion: A Training-Free Framework for Accurate 3D Geometric Conditioning in Image Generation
Phillip Mueller, Talip Ünlü, Sebastian Schmidt et al.
NavQ: Learning a Q-Model for Foresighted Vision-and-Language Navigation
Peiran Xu, Xicheng Gong, Yadong Mu
Motal: Unsupervised 3D Object Detection by Modality and Task-specific Knowledge Transfer
Hai Wu, Hongwei Lin, Xusheng Guo et al.
Zero-shot Inexact CAD Model Alignment from a Single Image
Pattaramanee Arsomngern, Sasikarn Khwanmuang, Matthias Nießner et al.
MGSfM: Multi-Camera Geometry Driven Global Structure-from-Motion
Peilin Tao, Hainan Cui, Diantao Tu et al.
Learning Large Motion Estimation from Intermediate Representations with a High-Resolution Optical Flow Dataset Featuring Long-Range Dynamic Motion
Hoonhee Cho, Yuhwan Jeong, Kuk-Jin Yoon
WAVE: Warp-Based View Guidance for Consistent Novel View Synthesis Using a Single Image
Jiwoo Park, Tae Choi, Youngjun Jun et al.
Flow Stochastic Segmentation Networks
Fabio De Sousa Ribeiro, Omar Todd, Charles Jones et al.
MDD: A Dataset for Text-and-Music Conditioned Duet Dance Generation
Prerit Gupta, Jason Alexander Fotso-Puepi, Zhengyuan Li et al.
GeoExplorer: Active Geo-localization with Curiosity-Driven Exploration
Li Mi, Manon Béchaz, Zeming Chen et al.
Is Tracking really more challenging in First Person Egocentric Vision?
Matteo Dunnhofer, Zaira Manigrasso, Christian Micheloni
Stochastic Interpolants for Revealing Stylistic Flows across the History of Art
Pingchuan Ma, Ming Gui, Johannes Schusterbauer et al.
Real3D: Towards Scaling Large Reconstruction Models with Real Images
Hanwen Jiang, Qixing Huang, Georgios Pavlakos
CaliMatch: Adaptive Calibration for Improving Safe Semi-supervised Learning
Jinsoo Bae, Seoung Bum Kim, Hyungrok Do
SAC-GNC: SAmple Consensus for adaptive Graduated Non-Convexity
Valter Piedade, Chitturi Sidhartha, José Gaspar et al.
DiffuMatch: Category-Agnostic Spectral Diffusion Priors for Robust Non-rigid Shape Matching
Emery Pierson, Lei Li, Angela Dai et al.
Invisible Watermarks, Visible Gains: Steering Machine Unlearning with Bi-Level Watermarking Design
Yuhao Sun, Yihua Zhang, Gaowen Liu et al.
Diffusion-Based Extreme High-speed Scenes Reconstruction with the Complementary Vision Sensor
Yapeng Meng, Yihan Lin, Taoyi Wang et al.
Lightweight and Fast Real-time Image Enhancement via Decomposition of the Spatial-aware Lookup Tables
Wontae Kim, Keuntek Lee, Nam Ik Cho
Future-Aware Interaction Network For Motion Forecasting
Shijie Li, Chunyu Liu, Xun Xu et al.
CAT: A Unified Click-and-Track Framework for Realistic Tracking
Yongsheng Yuan, Jie Zhao, Dong Wang et al.