Most Cited 2024 "3d hand estimation" Papers

12,324 papers found • Page 54 of 62

#10601

Preserving Fairness Generalization in Deepfake Detection

Li Lin, Xinan He, Yan Ju et al.

CVPR 2024posterarXiv:2402.17229
#10602

RepViT: Revisiting Mobile CNN From ViT Perspective

Ao Wang, Hui Chen, Zijia Lin et al.

CVPR 2024posterarXiv:2307.09283
#10603

Improved Implicit Neural Representation with Fourier Reparameterized Training

Kexuan Shi, Xingyu Zhou, Shuhang Gu

CVPR 2024posterarXiv:2401.07402
#10604

Gradient Alignment for Cross-Domain Face Anti-Spoofing

MINH BINH LE, Simon Woo

CVPR 2024posterarXiv:2402.18817
#10605

U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation

You Wu, Kean Liu, Xiaoyue Mi et al.

CVPR 2024posterarXiv:2403.20231
#10606

Insights from the Use of Previously Unseen Neural Architecture Search Datasets

Rob Geada, David Towers, Matthew Forshaw et al.

CVPR 2024posterarXiv:2404.02189
#10607

A Pedestrian is Worth One Prompt: Towards Language Guidance Person Re-Identification

Zexian Yang, Dayan Wu, Chenming Wu et al.

CVPR 2024highlight
#10608

Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

Qilong Zhangli, Jindong Jiang, Di Liu et al.

CVPR 2024posterarXiv:2406.01062
#10609

CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation

Kangfu Mei, Mauricio Delbracio, Hossein Talebi et al.

CVPR 2024posterarXiv:2310.01407
#10610

From SAM to CAMs: Exploring Segment Anything Model for Weakly Supervised Semantic Segmentation

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2024poster
#10611

Vlogger: Make Your Dream A Vlog

Shaobin Zhuang, Kunchang Li, Xinyuan Chen et al.

CVPR 2024posterarXiv:2401.09414
#10612

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

Zehuan Huang, Hao Wen, Junting Dong et al.

CVPR 2024posterarXiv:2312.06725
#10613

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

Yushuang Wu, Luyue Shi, Junhao Cai et al.

CVPR 2024highlightarXiv:2404.00269
#10614

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Bo He, Hengduo Li, Young Kyun Jang et al.

CVPR 2024posterarXiv:2404.05726
#10615

SVGDreamer: Text Guided SVG Generation with Diffusion Model

XiMing Xing, Chuang Wang, Haitao Zhou et al.

CVPR 2024posterarXiv:2312.16476
#10616

Dual Prototype Attention for Unsupervised Video Object Segmentation

Suhwan Cho, Minhyeok Lee, Seunghoon Lee et al.

CVPR 2024posterarXiv:2211.12036
#10617

R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization

Kennard Chan, Fayao Liu, Guosheng Lin et al.

CVPR 2024poster
#10618

Contrastive Mean-Shift Learning for Generalized Category Discovery

Sua Choi, Dahyun Kang, Minsu Cho

CVPR 2024posterarXiv:2404.09451
#10619

Panacea: Panoramic and Controllable Video Generation for Autonomous Driving

Yuqing Wen, Yucheng Zhao, Yingfei Liu et al.

CVPR 2024posterarXiv:2408.07605
#10620

Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention Alignment and Prompt Tuning

Leslie Ching Ow Tiong, Dick Sigmund, Chen-Hui Chan et al.

CVPR 2024poster
#10621

Towards Variable and Coordinated Holistic Co-Speech Motion Generation

Yifei Liu, Qiong Cao, Yandong Wen et al.

CVPR 2024posterarXiv:2404.00368
#10622

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

Zihua Liu, Hiroki Sakuma, Masatoshi Okutomi

CVPR 2024posterarXiv:2404.00149
#10623

Class Incremental Learning with Multi-Teacher Distillation

Haitao Wen, Lili Pan, Yu Dai et al.

CVPR 2024poster
#10624

Parameter Efficient Self-Supervised Geospatial Domain Adaptation

Linus Scheibenreif, Michael Mommert, Damian Borth

CVPR 2024poster
#10625

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi, Svetlana Orlova, Daan de Geus et al.

CVPR 2024posterarXiv:2406.09936
#10626

Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning

Nirat Saini, Khoi Pham, Abhinav Shrivastava

CVPR 2024poster
#10627

Scaling Laws of Synthetic Images for Model Training ... for Now

Lijie Fan, Kaifeng Chen, Dilip Krishnan et al.

CVPR 2024posterarXiv:2312.04567
#10628

UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion

Junsheng Zhou, Weiqi Zhang, Baorui Ma et al.

CVPR 2024posterarXiv:2404.06851
#10629

Learning Group Activity Features Through Person Attribute Prediction

Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita

CVPR 2024posterarXiv:2403.02753
#10630

MICap: A Unified Model for Identity-Aware Movie Descriptions

Haran Raajesh, Naveen Reddy Desanur, Zeeshan Khan et al.

CVPR 2024posterarXiv:2405.11483
#10631

UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity

Jialong Zuo, Hanyu Zhou, Ying Nie et al.

CVPR 2024posterarXiv:2312.03441
#10632

Test-Time Zero-Shot Temporal Action Localization

Benedetta Liberatori, Alessandro Conti, Paolo Rota et al.

CVPR 2024posterarXiv:2404.05426
#10633

FreeU: Free Lunch in Diffusion U-Net

Chenyang Si, Ziqi Huang, Yuming Jiang et al.

CVPR 2024posterarXiv:2309.11497
#10634

Towards Text-guided 3D Scene Composition

Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin et al.

CVPR 2024posterarXiv:2312.08885
#10635

Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

Xiaohan Lei, Min Wang, Wengang Zhou et al.

CVPR 2024posterarXiv:2402.17587
#10636

AnyScene: Customized Image Synthesis with Composited Foreground

Ruidong Chen, Lanjun Wang, Weizhi Nie et al.

CVPR 2024poster
#10637

Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform

Chunghyun Park, Seungwook Kim, Jaesik Park et al.

CVPR 2024posterarXiv:2404.11156
#10638

Color Shift Estimation-and-Correction for Image Enhancement

Yiyu Li, Ke Xu, Gerhard Hancke et al.

CVPR 2024posterarXiv:2405.17725
#10639

TE-TAD: Towards Full End-to-End Temporal Action Detection via Time-Aligned Coordinate Expression

Ho-Joong Kim, Jung-Ho Hong, Heejo Kong et al.

CVPR 2024posterarXiv:2404.02405
#10640

Endow SAM with Keen Eyes: Temporal-spatial Prompt Learning for Video Camouflaged Object Detection

Wenjun Hui, Zhenfeng Zhu, Shuai Zheng et al.

CVPR 2024poster
#10641

NICE: Neurogenesis Inspired Contextual Encoding for Replay-free Class Incremental Learning

Mustafa B Gurbuz, Jean Moorman, Constantine Dovrolis

CVPR 2024poster
#10642

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Peihao Wang, Dejia Xu, Zhiwen Fan et al.

CVPR 2024posterarXiv:2401.00909
#10643

FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders

Soumen Basu, Mayuna Gupta, Chetan Madan et al.

CVPR 2024posterarXiv:2403.08848
#10644

Noisy One-point Homographies are Surprisingly Good

Yaqing Ding, Jonathan Astermark, Magnus Oskarsson et al.

CVPR 2024poster
#10645

CSTA: CNN-based Spatiotemporal Attention for Video Summarization

Jaewon Son, Jaehun Park, Kwangsu Kim

CVPR 2024posterarXiv:2405.11905
#10646

SUGAR: Pre-training 3D Visual Representations for Robotics

Shizhe Chen, Ricardo Garcia Pinel, Ivan Laptev et al.

CVPR 2024posterarXiv:2404.01491
#10647

SnAG: Scalable and Accurate Video Grounding

Fangzhou Mu, Sicheng Mo, Yin Li

CVPR 2024posterarXiv:2404.02257
#10648

GLaMM: Pixel Grounding Large Multimodal Model

Hanoona Rasheed, Muhammad Maaz, Sahal Shaji Mullappilly et al.

CVPR 2024posterarXiv:2311.03356
#10649

ManiFPT: Defining and Analyzing Fingerprints of Generative Models

Hae Jin Song, Mahyar Khayatkhoei, Wael AbdAlmageed

CVPR 2024posterarXiv:2402.10401
#10650

Self-Calibrating Vicinal Risk Minimisation for Model Calibration

Jiawei Liu, Changkun Ye, Ruikai Cui et al.

CVPR 2024poster
#10651

Cinematic Behavior Transfer via NeRF-based Differentiable Filming

Xuekun Jiang, Anyi Rao, Jingbo Wang et al.

CVPR 2024posterarXiv:2311.17754
#10652

Finding Lottery Tickets in Vision Models via Data-driven Spectral Foresight Pruning

Leonardo Iurada, Marco Ciccone, Tatiana Tommasi

CVPR 2024posterarXiv:2406.01820
#10653

CORE-MPI: Consistency Object Removal with Embedding MultiPlane Image

Donggeun Yoon, Donghyeon Cho

CVPR 2024poster
#10654

ScoreHypo: Probabilistic Human Mesh Estimation with Hypothesis Scoring

Yuan Xu, Xiaoxuan Ma, Jiajun Su et al.

CVPR 2024poster
#10655

HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models

Nataniel Ruiz, Yuanzhen Li, Varun Jampani et al.

CVPR 2024posterarXiv:2307.06949
#10656

Intelligent Grimm - Open-ended Visual Storytelling via Latent Diffusion Models

Chang Liu, Haoning Wu, Yujie Zhong et al.

CVPR 2024posterarXiv:2306.00973
#10657

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

Haimei Zhao, Jing Zhang, Zhuo Chen et al.

CVPR 2024posterarXiv:2404.05145
#10658

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma, Can Cui, Xu Cao et al.

CVPR 2024posterarXiv:2312.04372
#10659

Dynamic Policy-Driven Adaptive Multi-Instance Learning for Whole Slide Image Classification

Tingting Zheng, Kui Jiang, Hongxun Yao

CVPR 2024highlightarXiv:2403.07939
#10660

BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks

Shangqian Gao, Yanfu Zhang, Feihu Huang et al.

CVPR 2024poster
#10661

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

Jiayi Guo, Xingqian Xu, Yifan Pu et al.

CVPR 2024posterarXiv:2312.04410
#10662

Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents

Yuxi Wei, Zi Wang, Yifan Lu et al.

CVPR 2024highlightarXiv:2402.05746
#10663

Learning Continuous 3D Words for Text-to-Image Generation

Ta-Ying Cheng, Matheus Gadelha, Thibault Groueix et al.

CVPR 2024posterarXiv:2402.08654
#10664

Content-Adaptive Non-Local Convolution for Remote Sensing Pansharpening

Yule Duan, Xiao Wu, Haoyu Deng et al.

CVPR 2024posterarXiv:2404.07543
#10665

A Conditional Denoising Diffusion Probabilistic Model for Point Cloud Upsampling

Wentao Qu, Yuantian Shao, Lingwu Meng et al.

CVPR 2024posterarXiv:2312.02719
#10666

APISR: Anime Production Inspired Real-World Anime Super-Resolution

Boyang Wang, Fengyu Yang, Xihang Yu et al.

CVPR 2024posterarXiv:2403.01598
#10667

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

Weizhen He, Yiheng Deng, SHIXIANG TANG et al.

CVPR 2024posterarXiv:2306.07520
#10668

Device-Wise Federated Network Pruning

Shangqian Gao, Junyi Li, Zeyu Zhang et al.

CVPR 2024poster
#10669

SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers

Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.

CVPR 2024highlightarXiv:2312.00648
#10670

MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation

Yuelong Li, Yafei Mao, Raja Bala et al.

CVPR 2024posterarXiv:2403.08019
#10671

Rethinking FID: Towards a Better Evaluation Metric for Image Generation

Sadeep Jayasumana, Srikumar Ramalingam, Andreas Veit et al.

CVPR 2024highlightarXiv:2401.09603
#10672

Progress-Aware Online Action Segmentation for Egocentric Procedural Task Videos

Yuhan Shen, Ehsan Elhamifar

CVPR 2024poster
#10673

MOHO: Learning Single-view Hand-held Object Reconstruction with Multi-view Occlusion-Aware Supervision

Chenyangguang Zhang, Guanlong Jiao, Yan Di et al.

CVPR 2024posterarXiv:2310.11696
#10674

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Youngjoon Jang, Jihoon Kim, Junseok Ahn et al.

CVPR 2024posterarXiv:2405.10272
#10675

Learning to Segment Referred Objects from Narrated Egocentric Videos

Yuhan Shen, Huiyu Wang, Xitong Yang et al.

CVPR 2024poster
#10676

EGTR: Extracting Graph from Transformer for Scene Graph Generation

Jinbae Im, JeongYeon Nam, Nokyung Park et al.

CVPR 2024posterarXiv:2404.02072
#10677

Distributionally Generative Augmentation for Fair Facial Attribute Classification

Fengda Zhang, Qianpei He, Kun Kuang et al.

CVPR 2024posterarXiv:2403.06606
#10678

PikeLPN: Mitigating Overlooked Inefficiencies of Low-Precision Neural Networks

Marina Neseem, Conor McCullough, Randy Hsin et al.

CVPR 2024posterarXiv:2404.00103
#10679

THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

Prannay Kaul, Zhizhong Li, Hao Yang et al.

CVPR 2024posterarXiv:2405.05256
#10680

UniVS: Unified and Universal Video Segmentation with Prompts as Queries

Minghan LI, Shuai Li, Xindong Zhang et al.

CVPR 2024posterarXiv:2402.18115
#10681

Inlier Confidence Calibration for Point Cloud Registration

Yongzhe Yuan, Yue Wu, Xiaolong Fan et al.

CVPR 2024poster
#10682

CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow

Chenbin Pan, Burhan Yaman, Senem Velipasalar et al.

CVPR 2024posterarXiv:2403.08919
#10683

ADFactory: An Effective Framework for Generalizing Optical Flow with NeRF

Han Ling, Quansen Sun, Yinghui Sun et al.

CVPR 2024posterarXiv:2311.04246
#10684

3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions

Weijia Li, Haote Yang, Zhenghao Hu et al.

CVPR 2024posterarXiv:2404.04823
#10685

In Search of a Data Transformation That Accelerates Neural Field Training

Junwon Seo, Sangyoon Lee, Kwang In Kim et al.

CVPR 2024posterarXiv:2311.17094
#10686

FastMAC: Stochastic Spectral Sampling of Correspondence Graph

Yifei Zhang, Hao Zhao, Hongyang Li et al.

CVPR 2024posterarXiv:2403.08770
#10687

PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution

Honghao Chen, Xiangxiang Chu, Renyongjian et al.

CVPR 2024posterarXiv:2403.07589
#10688

Towards Generalizing to Unseen Domains with Few Labels

Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana et al.

CVPR 2024posterarXiv:2403.11674
#10689

Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts

Jialin Wu, Xia Hu, Yaqing Wang et al.

CVPR 2024highlightarXiv:2312.00968
#10690

Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

CVPR 2024posterarXiv:2312.13216
#10691

Learning Degradation-Independent Representations for Camera ISP Pipelines

Yanhui Guo, Fangzhou Luo, Xiaolin Wu

CVPR 2024posterarXiv:2307.00761
#10692

A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion

Feng Yu, Teng Zhang, Gilad Lerman

CVPR 2024posterarXiv:2404.11590
#10693

Low-Resource Vision Challenges for Foundation Models

Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek

CVPR 2024posterarXiv:2401.04716
#10694

Low-Latency Neural Stereo Streaming

Qiqi Hou, Farzad Farhadzadeh, Amir Said et al.

CVPR 2024posterarXiv:2403.17879
#10695

Your Transferability Barrier is Fragile: Free-Lunch for Transferring the Non-Transferable Learning

Ziming Hong, Li Shen, Tongliang Liu

CVPR 2024highlight
#10696

ARTrackV2: Prompting Autoregressive Tracker Where to Look and How to Describe

Yifan Bai, Zeyang Zhao, Yihong Gong et al.

CVPR 2024posterarXiv:2312.17133
#10697

DPHMs: Diffusion Parametric Head Models for Depth-based Tracking

Jiapeng Tang, Angela Dai, Yinyu Nie et al.

CVPR 2024posterarXiv:2312.01068
#10698

MaxQ: Multi-Axis Query for N:M Sparsity Network

Jingyang Xiang, Siqi Li, Junhao Chen et al.

CVPR 2024poster
#10699

The Mirrored Influence Hypothesis: Efficient Data Influence Estimation by Harnessing Forward Passes

Myeongseob Ko, Feiyang Kang, Weiyan Shi et al.

CVPR 2024posterarXiv:2402.08922
#10700

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Haoning Wu, Zicheng Zhang, Erli Zhang et al.

CVPR 2024posterarXiv:2311.06783
#10701

Efficient Scene Recovery Using Luminous Flux Prior

ZhongYu Li, Lei Zhang

CVPR 2024poster
#10702

Structured Gradient-based Interpretations via Norm-Regularized Adversarial Training

Shizhan Gong, Qi Dou, Farzan Farnia

CVPR 2024posterarXiv:2404.04647
#10703

Revisiting Global Translation Estimation with Feature Tracks

Peilin Tao, Hainan Cui, Mengqi Rong et al.

CVPR 2024poster
#10704

Forgery-aware Adaptive Transformer for Generalizable Synthetic Image Detection

Huan Liu, Zichang Tan, Chuangchuang Tan et al.

CVPR 2024posterarXiv:2312.16649
#10705

MeaCap: Memory-Augmented Zero-shot Image Captioning

Zequn Zeng, Yan Xie, Hao Zhang et al.

CVPR 2024posterarXiv:2403.03715
#10706

MuseChat: A Conversational Music Recommendation System for Videos

Zhikang Dong, Bin Chen, Xiulong Liu et al.

CVPR 2024highlightarXiv:2310.06282
#10707

Novel View Synthesis with View-Dependent Effects from a Single Image

Juan Luis Gonzalez Bello, Munchurl Kim

CVPR 2024posterarXiv:2312.08071
#10708

Orchestrate Latent Expertise: Advancing Online Continual Learning with Multi-Level Supervision and Reverse Self-Distillation

Hongwei Yan, Liyuan Wang, Kaisheng Ma et al.

CVPR 2024posterarXiv:2404.00417
#10709

DisCo: Disentangled Control for Realistic Human Dance Generation

Tan Wang, Linjie Li, Kevin Lin et al.

CVPR 2024posterarXiv:2307.00040
#10710

Suppress and Rebalance: Towards Generalized Multi-Modal Face Anti-Spoofing

Xun Lin, Shuai Wang, RIZHAO CAI et al.

CVPR 2024highlightarXiv:2402.19298
#10711

Constructing and Exploring Intermediate Domains in Mixed Domain Semi-supervised Medical Image Segmentation

Qinghe Ma, Jian Zhang, Lei Qi et al.

CVPR 2024posterarXiv:2404.08951
#10712

LAMP: Learn A Motion Pattern for Few-Shot Video Generation

Rui-Qi Wu, Liangyu Chen, Tong Yang et al.

CVPR 2024poster
#10713

PixelLM: Pixel Reasoning with Large Multimodal Model

Zhongwei Ren, Zhicheng Huang, Yunchao Wei et al.

CVPR 2024posterarXiv:2312.02228
#10714

Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency

Yuqi Zhang, Han Luo, Yinjie Lei

CVPR 2024poster
#10715

iKUN: Speak to Trackers without Retraining

Yunhao Du, Cheng Lei, Zhicheng Zhao et al.

CVPR 2024posterarXiv:2312.16245
#10716

Neural Fields as Distributions: Signal Processing Beyond Euclidean Space

Daniel Rebain, Soroosh Yazdani, Kwang Moo Yi et al.

CVPR 2024poster
#10717

Weakly-Supervised Emotion Transition Learning for Diverse 3D Co-speech Gesture Generation

Xingqun Qi, Jiahao Pan, Peng Li et al.

CVPR 2024posterarXiv:2311.17532
#10718

LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

Yunpeng Luo, Junlong Du, Ke Yan et al.

CVPR 2024posterarXiv:2403.17465
#10719

Stratified Avatar Generation from Sparse Observations

Han Feng, Wenchao Ma, Quankai Gao et al.

CVPR 2024posterarXiv:2405.20786
#10720

Few-shot Learner Parameterization by Diffusion Time-steps

Zhongqi Yue, Pan Zhou, Richang Hong et al.

CVPR 2024posterarXiv:2403.02649
#10721

Global and Hierarchical Geometry Consistency Priors for Few-shot NeRFs in Indoor Scenes

Xiaotian Sun, Qingshan Xu, Xinjie Yang et al.

CVPR 2024poster
#10722

Lookahead Exploration with Neural Radiance Representation for Continuous Vision-Language Navigation

Zihan Wang, Xiangyang Li, Jiahao Yang et al.

CVPR 2024highlightarXiv:2404.01943
#10723

Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis

Simon Niedermayr, Josef Stumpfegger, rüdiger westermann

CVPR 2024posterarXiv:2401.02436
#10724

The STVchrono Dataset: Towards Continuous Change Recognition in Time

Yanjun Sun, Yue Qiu, Mariia Khan et al.

CVPR 2024poster
#10725

Unleashing Channel Potential: Space-Frequency Selection Convolution for SAR Object Detection

Ke Li, Di Wang, Zhangyuan Hu et al.

CVPR 2024poster
#10726

Motion Blur Decomposition with Cross-shutter Guidance

Xiang Ji, Haiyang Jiang, Yinqiang Zheng

CVPR 2024posterarXiv:2404.01120
#10727

Mind Marginal Non-Crack Regions: Clustering-Inspired Representation Learning for Crack Segmentation

zhuangzhuang chen, Zhuonan Lai, Jie Chen et al.

CVPR 2024poster
#10728

LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge

Gongwei Chen, Leyang Shen, Rui Shao et al.

CVPR 2024posterarXiv:2311.11860
#10729

Pixel-Aligned Language Model

Jiarui Xu, Xingyi Zhou, Shen Yan et al.

CVPR 2024poster
#10730

Eclipse: Disambiguating Illumination and Materials using Unintended Shadows

Dor Verbin, Ben Mildenhall, Peter Hedman et al.

CVPR 2024posterarXiv:2305.16321
#10731

ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis

Muhammad Hamza Mughal, Rishabh Dabral, Ikhsanul Habibie et al.

CVPR 2024posterarXiv:2403.17936
#10732

2S-UDF: A Novel Two-stage UDF Learning Method for Robust Non-watertight Model Reconstruction from Multi-view Images

Junkai Deng, Fei Hou, Xuhui Chen et al.

CVPR 2024posterarXiv:2303.15368
#10733

ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models

Lukas Höllein, Aljaž Božič, Norman Müller et al.

CVPR 2024posterarXiv:2403.01807
#10734

Taming Stable Diffusion for Text to 360 Panorama Image Generation

Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.

CVPR 2024highlightarXiv:2404.07949
#10735

CAMEL: CAusal Motion Enhancement Tailored for Lifting Text-driven Video Editing

Guiwei Zhang, Tianyu Zhang, Guanglin Niu et al.

CVPR 2024poster
#10736

DuPL: Dual Student with Trustworthy Progressive Learning for Robust Weakly Supervised Semantic Segmentation

Yuanchen Wu, Xichen Ye, KequanYang et al.

CVPR 2024posterarXiv:2403.11184
#10737

A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction

Jin Gong, Runzhao Yang, Weihang Zhang et al.

CVPR 2024poster
#10738

NAPGuard: Towards Detecting Naturalistic Adversarial Patches

Siyang Wu, Jiakai Wang, Jiejie Zhao et al.

CVPR 2024poster
#10739

Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning

Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis

CVPR 2024posterarXiv:2311.13612
#10740

A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

Xiaoyang Xu, Mengda Yang, Wenzhe Yi et al.

CVPR 2024posterarXiv:2405.04115
#10741

Bootstrapping SparseFormers from Vision Foundation Models

Ziteng Gao, Zhan Tong, Kevin Qinghong Lin et al.

CVPR 2024posterarXiv:2312.01987
#10742

Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity

Ruijie Quan, Wenguan Wang, Zhibo Tian et al.

CVPR 2024posterarXiv:2403.20022
#10743

G-NeRF: Geometry-enhanced Novel View Synthesis from Single-View Images

Zixiong Huang, Qi Chen, Libo Sun et al.

CVPR 2024posterarXiv:2404.07474
#10744

Active Prompt Learning in Vision Language Models

Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee

CVPR 2024posterarXiv:2311.11178
#10745

Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline

Yu chen, Fei Gao, YanguangZhang et al.

CVPR 2024poster
#10746

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.

CVPR 2024posterarXiv:2404.08540
#10747

SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model

Inhwan Bae, Young-Jae Park, Hae-Gon Jeon

CVPR 2024posterarXiv:2403.18452
#10748

Domain Separation Graph Neural Networks for Saliency Object Ranking

Zijian Wu, Jun Lu, Jing Han et al.

CVPR 2024poster
#10749

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

Xinzi Cao, Xiawu Zheng, Guanhong Wang et al.

CVPR 2024posterarXiv:2501.05272
#10750

Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

Yuqi Wang, Jiawei He, Lue Fan et al.

CVPR 2024posterarXiv:2311.17918
#10751

WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models

Changhoon Kim, Kyle Min, Maitreya Patel et al.

CVPR 2024posterarXiv:2306.04744
#10752

MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Junwen Huang, Hao Yu, Kuan-Ting Yu et al.

CVPR 2024posterarXiv:2403.01517
#10753

Resource-Efficient Transformer Pruning for Finetuning of Large Models

Fatih Ilhan, Gong Su, Selim Tekin et al.

CVPR 2024poster
#10754

Link-Context Learning for Multimodal LLMs

Yan Tai, Weichen Fan, Zhao Zhang et al.

CVPR 2024posterarXiv:2308.07891
#10755

The Manga Whisperer: Automatically Generating Transcriptions for Comics

Ragav Sachdeva, Andrew Zisserman

CVPR 2024posterarXiv:2401.10224
#10756

Deep-TROJ: An Inference Stage Trojan Insertion Algorithm through Efficient Weight Replacement Attack

Sabbir Ahmed, RANYANG ZHOU, Shaahin Angizi et al.

CVPR 2024poster
#10757

Dynamic LiDAR Re-simulation using Compositional Neural Fields

Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger et al.

CVPR 2024highlightarXiv:2312.05247
#10758

Language-aware Visual Semantic Distillation for Video Question Answering

Bo Zou, Chao Yang, Yu Qiao et al.

CVPR 2024poster
#10759

3DInAction: Understanding Human Actions in 3D Point Clouds

Yizhak Ben-Shabat, Oren Shrout, Stephen Gould

CVPR 2024highlightarXiv:2303.06346
#10760

DiLiGenRT: A Photometric Stereo Dataset with Quantified Roughness and Translucency

Heng Guo, Jieji Ren, Feishi Wang et al.

CVPR 2024poster
#10761

StyLitGAN: Image-Based Relighting via Latent Control

Anand Bhattad, James Soole, David Forsyth

CVPR 2024poster
#10762

Label-Efficient Group Robustness via Out-of-Distribution Concept Curation

Yiwei Yang, Anthony Liu, Robert Wolfe et al.

CVPR 2024poster
#10763

Unsupervised Universal Image Segmentation

XuDong Wang, Dantong Niu, Xinyang Han et al.

CVPR 2024posterarXiv:2312.17243
#10764

Batch Normalization Alleviates the Spectral Bias in Coordinate Networks

Zhicheng Cai, Hao Zhu, Qiu Shen et al.

CVPR 2024poster
#10765

Not All Classes Stand on Same Embeddings: Calibrating a Semantic Distance with Metric Tensor

Jae Hyeon Park, Gyoomin Lee, Seunggi Park et al.

CVPR 2024poster
#10766

CodedEvents: Optimal Point-Spread-Function Engineering for 3D-Tracking with Event Cameras

Sachin Shah, Matthew Chan, Haoming Cai et al.

CVPR 2024posterarXiv:2406.09409
#10767

Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling

Zhe Li, Zerong Zheng, Lizhen Wang et al.

CVPR 2024poster
#10768

Retrieval-Augmented Open-Vocabulary Object Detection

Jooyeon Kim, Eulrang Cho, Sehyung Kim et al.

CVPR 2024posterarXiv:2404.05687
#10769

NB-GTR: Narrow-Band Guided Turbulence Removal

Yifei Xia, Chu Zhou, Chengxuan Zhu et al.

CVPR 2024poster
#10770

LangSplat: 3D Language Gaussian Splatting

Minghan Qin, Wanhua Li, Jiawei ZHOU et al.

CVPR 2024highlightarXiv:2312.16084
#10771

Positive-Unlabeled Learning by Latent Group-Aware Meta Disambiguation

Lin Long, Haobo Wang, Zhijie Jiang et al.

CVPR 2024poster
#10772

Text-conditional Attribute Alignment across Latent Spaces for 3D Controllable Face Image Synthesis

FeiFan Xu, Rui Li, Si Wu et al.

CVPR 2024poster
#10773

SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Antoine Guédon, Vincent Lepetit

CVPR 2024posterarXiv:2311.12775
#10774

DiffusionPoser: Real-time Human Motion Reconstruction From Arbitrary Sparse Sensors Using Autoregressive Diffusion

Tom Van Wouwe, Seunghwan Lee, Antoine Falisse et al.

CVPR 2024posterarXiv:2308.16682
#10775

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Jingbo Zhang, Xiaoyu Li, Qi Zhang et al.

CVPR 2024posterarXiv:2311.16961
#10776

CurveCloudNet: Processing Point Clouds with 1D Structure

Colton Stearns, Alex Fu, Jiateng Liu et al.

CVPR 2024posterarXiv:2303.12050
#10777

Harnessing Meta-Learning for Improving Full-Frame Video Stabilization

Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim et al.

CVPR 2024posterarXiv:2403.03662
#10778

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

Junhao Zheng, Chenhao Lin, Jiahao Sun et al.

CVPR 2024posterarXiv:2403.17301
#10779

SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects

Abhinav Kumar, Yuliang Guo, Xinyu Huang et al.

CVPR 2024posterarXiv:2403.20318
#10780

MoML: Online Meta Adaptation for 3D Human Motion Prediction

Xiaoning Sun, Huaijiang Sun, Bin Li et al.

CVPR 2024poster
#10781

Learning with Structural Labels for Learning with Noisy Labels

Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee

CVPR 2024poster
#10782

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.

CVPR 2024posterarXiv:2310.06627
#10783

Incremental Nuclei Segmentation from Histopathological Images via Future-class Awareness and Compatibility-inspired Distillation

Huyong Wang, Huisi Wu, Jing Qin

CVPR 2024poster
#10784

Model Inversion Robustness: Can Transfer Learning Help?

Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran et al.

CVPR 2024posterarXiv:2405.05588
#10785

Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection

Xiaowei Zhao, Xianglong Liu, Duorui Wang et al.

CVPR 2024poster
#10786

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan et al.

CVPR 2024posterarXiv:2312.05849
#10787

MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection

Boyang Peng, Sanqing Qu, Yong Wu et al.

CVPR 2024posterarXiv:2403.04149
#10788

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

Xin Zhou, Dingkang Liang, Wei Xu et al.

CVPR 2024posterarXiv:2403.01439
#10789

EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation

Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu

CVPR 2024posterarXiv:2405.06880
#10790

On Exact Inversion of DPM-Solvers

Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.

CVPR 2024posterarXiv:2311.18387
#10791

Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models

Bin Fu, Fanghua Yu, Anran Liu et al.

CVPR 2024poster
#10792

A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals

Jiangnan Tang, Jingya Wang, Kaiyang Ji et al.

CVPR 2024posterarXiv:2404.04890
#10793

MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning

Mohamed Abdelfattah, Mariam Hassan, Alex Alahi

CVPR 2024poster
#10794

D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

Dinh Phat Do, Taehoon Kim, JAEMIN NA et al.

CVPR 2024posterarXiv:2403.09359
#10795

MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying

Ryan Burgert, Brian Price, Jason Kuen et al.

CVPR 2024poster
#10796

Intrinsic Image Diffusion for Indoor Single-view Material Estimation

Peter Kocsis, Vincent Sitzmann, Matthias Nießner

CVPR 2024posterarXiv:2312.12274
#10797

Prompt Highlighter: Interactive Control for Multi-Modal LLMs

Yuechen Zhang, Shengju Qian, Bohao Peng et al.

CVPR 2024posterarXiv:2312.04302
#10798

Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?

Zhengyue Zhao, Jinhao Duan, Kaidi Xu et al.

CVPR 2024posterarXiv:2312.00084
#10799

NetTrack: Tracking Highly Dynamic Objects with a Net

Guangze Zheng, Shijie Lin, Haobo Zuo et al.

CVPR 2024posterarXiv:2403.11186
#10800

Scaling Up Video Summarization Pretraining with Large Language Models

Dawit Argaw Argaw, Seunghyun Yoon, Fabian Caba Heilbron et al.

CVPR 2024posterarXiv:2404.03398