Most Cited CVPR "human pose recognition" Papers
Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation
Taeyoung Yun, Dinghuai Zhang, Jinkyoo Park et al.
Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation
Chuandong Liu, Xingxing Weng, Shuguo Jiang et al.
Revisiting Audio-Visual Segmentation with Vision-Centric Transformer
Shaofei Huang, Rui Ling, Tianrui Hui et al.
Less is More: Efficient Image Vectorization with Adaptive Parameterization
Kaibo Zhao, Liang Bao, Yufei Li et al.
CARL: A Framework for Equivariant Image Registration
Hastings Greer, Lin Tian, François-Xavier Vialard et al.
Perceptual Inductive Bias Is What You Need Before Contrastive Learning
Junru Zhao, Tianqin Li, Dunhan Jiang et al.
Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks
Han Wang, Gang Wang, Huan Zhang
Dynamic Prompt Optimizing for Text-to-Image Generation
Wenyi Mo, Tianyu Zhang, Yalong Bai et al.
PEER Pressure: Model-to-Model Regularization for Single Source Domain Generalization
Dongkyu Cho, Inwoo Hwang, Sanghack Lee
AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation
Datao Tang, Xiangyong Cao, Xuan Wu et al.
UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines
Chen Tang, Xinzhu Ma, Encheng Su et al.
Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking
Chaocan Xue, Bineng Zhong, Qihua Liang et al.
Robust Audio-Visual Segmentation via Audio-Guided Visual Convergent Alignment
Chen Liu, Peike Li, Liying Yang et al.
Unified Dense Prediction of Video Diffusion
Lehan Yang, Lu Qi, Xiangtai Li et al.
RENO: Real-Time Neural Compression for 3D LiDAR Point Clouds
Kang You, Tong Chen, Dandan Ding et al.
Joint Scheduling of Causal Prompts and Tasks for Multi-Task Learning
Chaoyang Li, Jianyang Qin, Jinhao Cui et al.
Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation
Yiping Wang, Xuehai He, Kuan Wang et al.
Diffusion Self-Distillation for Zero-Shot Customized Image Generation
Shengqu Cai, Eric Ryan Chan, Yunzhi Zhang et al.
DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI
Sangmin Lee, Sungyong Park, Heewon Kim
EnvPoser: Environment-aware Realistic Human Motion Estimation from Sparse Observations with Uncertainty Modeling
Songpengcheng Xia, Yu Zhang, Zhuo Su et al.
DefMamba: Deformable Visual State Space Model
Leiye Liu, Miao Zhang, Jihao Yin et al.
MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks
Yifei Liu, Zhihang Zhong, Yifan Zhan et al.
Forensics Adapter: Adapting CLIP for Generalizable Face Forgery Detection
Xinjie Cui, Yuezun Li, Ao Luo et al.
SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation
Aleksei Bokhovkin, Quan Meng, Shubham Tulsiani et al.
VideoGEM: Training-free Action Grounding in Videos
Felix Vogel, Walid Bousselham, Anna Kukleva et al.
ProReflow: Progressive Reflow with Decomposed Velocity
Lei Ke, Haohang Xu, Xuefei Ning et al.
Co-Speech Gesture Video Generation with Implicit Motion-Audio Entanglement
Xinjie Li, Ziyi Chen, Xinlu Yu et al.
Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation
Chengyue Wu, Xiaokang Chen, Zhiyu Wu et al.
FATE: Full-head Gaussian Avatar with Textural Editing from Monocular Video
Jiawei Zhang, Zijian Wu, Zhiyang Liang et al.
Event-Equalized Dense Video Captioning
Kangyi Wu, Pengna Li, Jingwen Fu et al.
Unsupervised Template-assisted Point Cloud Shape Correspondence Network
Jiacheng Deng, Jiahao Lu, Tianzhu Zhang
X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition
Shuofeng Sun, Yongming Rao, Jiwen Lu et al.
Spectral and Polarization Vision: Spectro-polarimetric Real-world Dataset
Yujin Jeon, Eunsue Choi, Youngchan Kim et al.
Efficient Model Stealing Defense with Noise Transition Matrix
Dong-Dong Wu, Chilin Fu, Weichang Wu et al.
HOIAnimator: Generating Text-prompt Human-object Animations using Novel Perceptive Diffusion Models
Wenfeng Song, Xinyu Zhang, Shuai Li et al.
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval
Bowen Zhang, Xiaojie Jin, Weibo Gong et al.
Diffusion Models Without Attention
Jing Nathan Yan, Jiatao Gu, Alexander Rush
GazeGene: Large-scale Synthetic Gaze Dataset with 3D Eyeball Annotations
Yiwei Bao, Zhiming Wang, Feng Lu
HDQMF: Holographic Feature Decomposition Using Quantum Algorithms
Prathyush Poduval, Zhuowen Zou, Mohsen Imani
DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes
Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan et al.
Feature Information Driven Position Gaussian Distribution Estimation for Tiny Object Detection
Jinghao Bian, Mingtao Feng, Weisheng Dong et al.
H-ViT: A Hierarchical Vision Transformer for Deformable Image Registration
Morteza Ghahremani, Mohammad Khateri, Bailiang Jian et al.
PRaDA: Projective Radial Distortion Averaging
Daniil Sinitsyn, Linus Härenstam-Nielsen, Daniel Cremers
Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models
Huimin Huang, Yawen Huang, Lanfen Lin et al.
Structure from Collision
Takuhiro Kaneko
MR-VNet: Media Restoration using Volterra Networks
Siddharth Roheda, Amit Unde, Loay Rashid
OmniParser: A Unified Framework for Text Spotting Key Information Extraction and Table Recognition
Jianqiang Wan, Sibo Song, Wenwen Yu et al.
PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization
Xu Peng, Junwei Zhu, Boyuan Jiang et al.
MULAN: A Multi Layer Annotated Dataset for Controllable Text-to-Image Generation
Petru-Daniel Tudosiu, Yongxin Yang, Shifeng Zhang et al.
Dr.Hair: Reconstructing Scalp-Connected Hair Strands without Pre-Training via Differentiable Rendering of Line Segments
Yusuke Takimoto, Hikari Takehara, Hiroyuki Sato et al.
Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications
Tong Bu, Maohua Li, Zhaofei Yu
Language-Guided Salient Object Ranking
Fang Liu, Yuhao Liu, Ke Xu et al.
CroSel: Cross Selection of Confident Pseudo Labels for Partial-Label Learning
Shiyu Tian, Hongxin Wei, Yiqun Wang et al.
PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild
Kun Yuan, Hongbo Liu, Mading Li et al.
Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization
You Shen, Zhipeng Zhang, Xinyang Li et al.
Improved Self-Training for Test-Time Adaptation
Jing Ma
Mudslide: A Universal Nuclear Instance Segmentation Method
Jun Wang
MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations
Ziyang Zhang, Yang Yu, Yucheng Chen et al.
Image is All You Need to Empower Large-scale Diffusion Models for In-Domain Generation
Pu Cao, Feng Zhou, Lu Yang et al.
Collaborative Learning of Anomalies with Privacy (CLAP) for Unsupervised Video Anomaly Detection: A New Baseline
Anas Al-lahham, Muhammad Zaigham Zaheer, Nurbek Tastan et al.
Cache Me if You Can: Accelerating Diffusion Models through Block Caching
Felix Wimbauer, Bichen Wu, Edgar Schoenfeld et al.
Rewrite the Stars
Xu Ma, Xiyang Dai, Yue Bai et al.
Beyond Generation: A Diffusion-based Low-level Feature Extractor for Detecting AI-generated Images
Nan Zhong, Haoyu Chen, Yiran Xu et al.
Virtual Immunohistochemistry Staining for Histological Images Assisted by Weakly-supervised Learning
Jiahan Li, Jiuyang Dong, Shenjin Huang et al.
S2D-LFE: Sparse-to-Dense Light Field Event Generation
Yutong Liu, Wenming Weng, Yueyi Zhang et al.
3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features
Chenfeng Xu, Huan Ling, Sanja Fidler et al.
Model Adaptation for Time Constrained Embodied Control
Jaehyun Song, Minjong Yoo, Honguk Woo
Open-Canopy: Towards Very High Resolution Forest Monitoring
Fajwel Fogel, Yohann Perron, Nikola Besic et al.
G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation
Tianxing Chen, Yao Mu, Zhixuan Liang et al.
DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh
Jingyu Zhuang, Di Kang, Linchao Bao et al.
DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data
Chengxiang Fan, Muzhi Zhu, Hao Chen et al.
SPAD: Spatially Aware Multi-View Diffusers
Yash Kant, Aliaksandr Siarohin, Ziyi Wu et al.
SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation
Kejia Yin, Varshanth Rao, Ruowei Jiang et al.
DiffPerformer: Iterative Learning of Consistent Latent Guidance for Diffusion-based Human Video Generation
Chenyang Wang, Zerong Zheng, Tao Yu et al.
SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction
Pin Tang, Zhongdao Wang, Guoqing Wang et al.
Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion
Litu Rout, Yujia Chen, Abhishek Kumar et al.
Unsupervised Video Domain Adaptation with Masked Pre-Training and Collaborative Self-Training
Arun Reddy, William Paul, Corban Rivera et al.
Tiled Diffusion
Or Madar, Ohad Fried
MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting
Jun Huang, Ting Liu, Yihang Wu et al.
RCBEVDet: Radar-camera Fusion in Bird's Eye View for 3D Object Detection
Zhiwei Lin, Zhe Liu, Zhongyu Xia et al.
FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment
Jinglin Xu, Sibo Yin, Guohao Zhao et al.
SceneFun3D: Fine-Grained Functionality and Affordance Understanding in 3D Scenes
Alexandros Delitzas, Ayça Takmaz, Federico Tombari et al.
ShiftwiseConv: Small Convolutional Kernel with Large Kernel Effect
Dachong Li, Li Li, Zhuangzhuang Chen et al.
MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding
Xu Cao, Tong Zhou, Yunsheng Ma et al.
Do Vision and Language Encoders Represent the World Similarly?
Mayug Maniparambil, Raiymbek Akshulakov, Yasser Abdelaziz Dahou Djilali et al.
Reversing Flow for Image Restoration
Haina Qin, Wenyang Luo, Bing Li et al.
Weakly Supervised Point Cloud Semantic Segmentation via Artificial Oracle
Hyeokjun Kweon, Jihun Kim, Kuk-Jin Yoon
Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training
Runze He, Shaofei Huang, Xuecheng Nie et al.
Construct to Associate: Cooperative Context Learning for Domain Adaptive Point Cloud Segmentation
Guangrui Li
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
Hao Li, Xue Yang, Zhaokai Wang et al.
Wavelet-based Fourier Information Interaction with Frequency Diffusion Adjustment for Underwater Image Restoration
Chen Zhao, Weiling Cai, Chenyu Dong et al.
Generating Content for HDR Deghosting from Frequency View
Tao Hu, Qingsen Yan, Yuankai Qi et al.
PolarNeXt: Rethink Instance Segmentation with Polar Representation
Jiacheng Sun, Xinghong Zhou, Yiqiang Wu et al.
Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion
Yuanxun Lu, Jingyang Zhang, Shiwei Li et al.
Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers
Sheng Yang, Jiawang Bai, Kuofeng Gao et al.
LL3DA: Visual Interactive Instruction Tuning for Omni-3D Understanding Reasoning and Planning
Sijin Chen, Xin Chen, Chi Zhang et al.
Implicit Bias Injection Attacks against Text-to-Image Diffusion Models
Huayang Huang, Xiangye Jin, Jiaxu Miao et al.
GenTron: Diffusion Transformers for Image and Video Generation
Shoufa Chen, Mengmeng Xu, Jiawei Ren et al.
Quaffure: Real-Time Quasi-Static Neural Hair Simulation
Tuur Stuyck, Gene Wei-Chin Lin, Egor Larionov et al.
Map-Relative Pose Regression for Visual Re-Localization
Shuai Chen, Tommaso Cavallari, Victor Adrian Prisacariu et al.
Gradient-based Parameter Selection for Efficient Fine-Tuning
Zhi Zhang, Qizhe Zhang, Zijun Gao et al.
Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis
Willi Menapace, Aliaksandr Siarohin, Ivan Skorokhodov et al.
Label Shift Meets Online Learning: Ensuring Consistent Adaptation with Universal Dynamic Regret
Yucong Dai, Shilin Gu, Ruidong Fan et al.
Backpropagation-free Network for 3D Test-time Adaptation
Yanshuo Wang, Ali Cheraghian, Zeeshan Hayder et al.
TransNeXt: Robust Foveal Visual Perception for Vision Transformers
Dai Shi
4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models
Wanhua Li, Renping Zhou, Jiawei Zhou et al.
InstructDiffusion: A Generalist Modeling Interface for Vision Tasks
Zigang Geng, Binxin Yang, Tiankai Hang et al.
HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation
Linglin Jing, Yiming Ding, Yunpeng Gao et al.
Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction
Wenke Xia, Ruoxuan Feng, Dong Wang et al.
Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences
Minyoung Hwang, Luca Weihs, Chanwoo Park et al.
ROLL: Robust Noisy Pseudo-label Learning for Multi-View Clustering with Noisy Correspondence
Yuan Sun, Yongxiang Li, Zhenwen Ren et al.
Fourier Priors-Guided Diffusion for Zero-Shot Joint Low-Light Enhancement and Deblurring
Xiaoqian Lv, Shengping Zhang, Chenyang Wang et al.
Towards General Robustness Verification of MaxPool-based Convolutional Neural Networks via Tightening Linear Approximation
Yuan Xiao, Shiqing Ma, Juan Zhai et al.
Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need
Qiang Wang, Xiang Song, Yuhang He et al.
RichDreamer: A Generalizable Normal-Depth Diffusion Model for Detail Richness in Text-to-3D
Lingteng Qiu, Guanying Chen, Xiaodong Gu et al.
Bridging Viewpoint Gaps: Geometric Reasoning Boosts Semantic Correspondence
Qiyang Qian, Hansheng Chen, Masayoshi Tomizuka et al.
Robust Synthetic-to-Real Transfer for Stereo Matching
Jiawei Zhang, Jiahe Li, Lei Huang et al.
Understanding and Improving Source-free Domain Adaptation from a Theoretical Perspective
Yu Mitsuzumi, Akisato Kimura, Hisashi Kashima
From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding
Yonglu Li, Xiaoqian Wu, Xinpeng Liu et al.
SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model
Chunlin Yu, Hanqing Wang, Ye Shi et al.
LowRankOcc: Tensor Decomposition and Low-Rank Recovery for Vision-based 3D Semantic Occupancy Prediction
Linqing Zhao, Xiuwei Xu, Ziwei Wang et al.
Overcoming Generic Knowledge Loss with Selective Parameter Update
Wenxuan Zhang, Paul Janson, Rahaf Aljundi et al.
Asynchronous Collaborative Graph Representation for Frames and Events
Dianze Li, Jianing Li, Xu Liu et al.
BT-Adapter: Video Conversation is Feasible Without Video Instruction Tuning
Ruyang Liu, Chen Li, Yixiao Ge et al.
Video Frame Interpolation via Direct Synthesis with the Event-based Reference
Yuhan Liu, Yongjian Deng, Hao Chen et al.
Lane2Seq: Towards Unified Lane Detection via Sequence Generation
Kunyang Zhou
CorrMatch: Label Propagation via Correlation Matching for Semi-Supervised Semantic Segmentation
Bo-Yuan Sun, Yuqi Yang, Le Zhang et al.
Rethinking Boundary Discontinuity Problem for Oriented Object Detection
Hang Xu, Xinyuan Liu, Haonan Xu et al.
Theory-Inspired Deep Multi-View Multi-Label Learning with Incomplete Views and Noisy Labels
Quanjiang Li, Tingjin Luo, Jiahui Liao
MCNet: Rethinking the Core Ingredients for Accurate and Efficient Homography Estimation
Haokai Zhu, Si-Yuan Cao, Jianxin Hu et al.
Move-in-2D: 2D-Conditioned Human Motion Generation
Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang et al.
UniDepth: Universal Monocular Metric Depth Estimation
Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis et al.
Diffusion Model Alignment Using Direct Preference Optimization
Bram Wallace, Meihua Dang, Rafael Rafailov et al.
SD4Match: Learning to Prompt Stable Diffusion Model for Semantic Matching
Xinghui Li, Jingyi Lu, Kai Han et al.
Uncertainty-Guided Never-Ending Learning to Drive
Lei Lai, Eshed Ohn-Bar, Sanjay Arora et al.
Feedback-Guided Autonomous Driving
Jimuyang Zhang, Zanming Huang, Arijit Ray et al.
Small Steps and Level Sets: Fitting Neural Surface Models with Point Guidance
Chamin Hewa Koneputugodage, Yizhak Ben-Shabat, Dylan Campbell et al.
Adapt or Perish: Adaptive Sparse Transformer with Attentive Feature Refinement for Image Restoration
Shihao Zhou, Duosheng Chen, Jinshan Pan et al.
3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos
Jiakai Sun, Han Jiao, Guangyuan Li et al.
Conformal Prediction and MLLM aided Uncertainty Quantification in Scene Graph Generation
Sayak Nag, Udita Ghosh, Calvin-Khang Ta et al.
LTM: Lightweight Textured Mesh Extraction and Refinement of Large Unbounded Scenes for Efficient Storage and Real-time Rendering
Jaehoon Choi, Rajvi Shah, Qinbo Li et al.
Improving Semi-Supervised Semantic Segmentation with Sliced-Wasserstein Feature Alignment and Uniformity
Chen Yi Lu, Kasra Derakhshandeh, Somali Chaterji
Hierarchical Adaptive Filtering Network for Text Image Specular Highlight Removal
Zhi Jiang, Jingbo Hu, Ling Zhang et al.
Geometry Transfer for Stylizing Radiance Fields
Hyunyoung Jung, Seonghyeon Nam, Nikolaos Sarafianos et al.
3D Human Pose Perception from Egocentric Stereo Videos
Hiroyasu Akada, Jian Wang, Vladislav Golyanik et al.
FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis
Jiangtong Tan, Hu Yu, Jie Huang et al.
HERA: Hybrid Explicit Representation for Ultra-Realistic Head Avatars
Hongrui Cai, Yuting Xiao, Xuan Wang et al.
QN-Mixer: A Quasi-Newton MLP-Mixer Model for Sparse-View CT Reconstruction
Ishak Ayad, Nicolas Larue, Mai K. Nguyen
Check Locate Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation
Biao Gong, Siteng Huang, Yutong Feng et al.
Prompt3D: Random Prompt Assisted Weakly-Supervised 3D Object Detection
Xiaohong Zhang, Huisheng Ye, Jingwen Li et al.
Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation
Keonhee Han, Dominik Muhle, Felix Wimbauer et al.
Volumetric Environment Representation for Vision-Language Navigation
Liu, Wenguan Wang, Yi Yang
CrossKD: Cross-Head Knowledge Distillation for Object Detection
JiaBao Wang, Yuming Chen, Zhaohui Zheng et al.
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
Jiaming Liu, Ran Xu, Senqiao Yang et al.
TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing
Sherry X. Chen, Yaron Vaxman, Elad Ben Baruch et al.
Leveraging Camera Triplets for Efficient and Accurate Structure-from-Motion
Lalit Manam, Venu Madhav Govindu
CG-HOI: Contact-Guided 3D Human-Object Interaction Generation
Christian Diller, Angela Dai
Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers
Haoran You, Connelly Barnes, Yuqian Zhou et al.
SGSST: Scaling Gaussian Splatting Style Transfer
Bruno Galerne, Jianling Wang, Lara Raad et al.
Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?
Hanxin Zhu, Tianyu He, Xin Li et al.
Resurrecting Old Classes with New Data for Exemplar-Free Continual Learning
Dipam Goswami, Albin Soutif, Yuyang Liu et al.
DIEM: Decomposition-Integration Enhancing Multimodal Insights
Xinyi Jiang, Guoming Wang, Junhao Guo et al.
Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters
Jiazuo Yu, Yunzhi Zhuge, Lu Zhang et al.
Unified Medical Lesion Segmentation via Self-referring Indicator
Shijie Chang, Xiaoqi Zhao, Lihe Zhang et al.
HOI-M^3: Capture Multiple Humans and Objects Interaction within Contextual Environment
Juze Zhang, Jingyan Zhang, Zining Song et al.
CORES: Convolutional Response-based Score for Out-of-distribution Detection
Keke Tang, Chao Hou, Weilong Peng et al.
Equivariant Multi-Modality Image Fusion
Zixiang Zhao, Haowen Bai, Jiangshe Zhang et al.
PDF: A Probability-Driven Framework for Open World 3D Point Cloud Semantic Segmentation
Jinfeng Xu, Siyuan Yang, Xianzhi Li et al.
NeISF: Neural Incident Stokes Field for Geometry and Material Estimation
Chenhao Li, Taishi Ono, Takeshi Uemori et al.
PromptKD: Unsupervised Prompt Distillation for Vision-Language Models
Zheng Li, Xiang Li, Xinyi Fu et al.
Generative Modeling of Class Probability for Multi-Modal Representation Learning
JungKyoo Shin, Bumsoo Kim, Eunwoo Kim
DeMatch: Deep Decomposition of Motion Field for Two-View Correspondence Learning
Shihua Zhang, Zizhuo Li, Yuan Gao et al.
Domain Gap Embeddings for Generative Dataset Augmentation
Yinong Oliver Wang, Younjoon Chung, Chen Henry Wu et al.
Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation
Zhekai Du, Xinyao Li, Fengling Li et al.
TransLoc4D: Transformer-based 4D Radar Place Recognition
Guohao Peng, Heshan Li, Yangyang Zhao et al.
GRAE-3DMOT: Geometry Relation-Aware Encoder for Online 3D Multi-Object Tracking
Hyunseop Kim, Hyo-Jun Lee, Yonguk Lee et al.
Higher-order Relational Reasoning for Pedestrian Trajectory Prediction
Sungjune Kim, Hyung-gun Chi, Hyerin Lim et al.
Learn to Rectify the Bias of CLIP for Unsupervised Semantic Segmentation
Jingyun Wang, Guoliang Kang
Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging
Bo Wang, Dingwei Tan, Yen-Ling Kuo et al.
RefPose: Leveraging Reference Geometric Correspondences for Accurate 6D Pose Estimation of Unseen Objects
Jaeguk Kim, Jaewoo Park, Keuntek Lee et al.
Absolute Pose from One or Two Scaled and Oriented Features
Jonathan Ventura, Zuzana Kukelova, Torsten Sattler et al.
Draw Step by Step: Reconstructing CAD Construction Sequences from Point Clouds via Multimodal Diffusion
Weijian Ma, Shuaiqi Chen, Yunzhong Lou et al.
DSGG: Dense Relation Transformer for an End-to-end Scene Graph Generation
Zeeshan Hayder, Xuming He
Open-Vocabulary 3D Semantic Segmentation with Foundation Models
Li Jiang, Shaoshuai Shi, Bernt Schiele
Training Vision Transformers for Semi-Supervised Semantic Segmentation
Xinting Hu, Li Jiang, Bernt Schiele
APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation
Weizhao He, Yang Zhang, Wei Zhuo et al.
AnyEdit: Mastering Unified High-Quality Image Editing for Any Idea
Qifan Yu, Wei Chow, Zhongqi Yue et al.
Navigating the Unseen: Zero-shot Scene Graph Generation via Capsule-Based Equivariant Features
Wenhuan Huang, Yi Ji, Guiqian Zhu et al.
Design2Cloth: 3D Cloth Generation from 2D Masks
Jiali Zheng, Rolandos Alexandros Potamias, Stefanos Zafeiriou
Non-Natural Image Understanding with Advancing Frequency-based Vision Encoders
Wang Lin, Qingsong Wang, Yueying Feng et al.
S-DyRF: Reference-Based Stylized Radiance Fields for Dynamic Scenes
Xingyi Li, Zhiguo Cao, Yizheng Wu et al.
SatSynth: Augmenting Image-Mask Pairs through Diffusion Models for Aerial Semantic Segmentation
Aysim Toker, Marvin Eisenberger, Daniel Cremers et al.
Dual-Consistency Model Inversion for Non-Exemplar Class Incremental Learning
Zihuan Qiu, Yi Xu, Fanman Meng et al.
MeshArt: Generating Articulated Meshes with Structure-Guided Transformers
Daoyi Gao, Mohd Yawar Nihal Siddiqui, Lei Li et al.
DS-NeRV: Implicit Neural Video Representation with Decomposed Static and Dynamic Codes
Hao Yan, Zhihui Ke, Xiaobo Zhou et al.
Rolling Shutter Correction with Intermediate Distortion Flow Estimation
Mingdeng Cao, Sidi Yang, Yujiu Yang et al.
Towards Transferable Targeted 3D Adversarial Attack in the Physical World
Yao Huang, Yinpeng Dong, Shouwei Ruan et al.
Hybrid Functional Maps for Crease-Aware Non-Isometric Shape Matching
Lennart Bastian, Yizheng Xie, Nassir Navab et al.
Class Tokens Infusion for Weakly Supervised Semantic Segmentation
Sung-Hoon Yoon, Hoyong Kwon, Hyeonseong Kim et al.
SFOD: Spiking Fusion Object Detector
Yimeng Fan, Wei Zhang, Changsong Liu et al.