Most Cited CVPR "junction trees" Papers

#4401

Few-shot Learner Parameterization by Diffusion Time-steps

Zhongqi Yue, Pan Zhou, Richang Hong et al.

CVPR 2024 poster arXiv:2403.02649
#4402

LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge

Gongwei Chen, Leyang Shen, Rui Shao et al.

CVPR 2024 poster arXiv:2311.11860
#4403

Eclipse: Disambiguating Illumination and Materials using Unintended Shadows

Dor Verbin, Ben Mildenhall, Peter Hedman et al.

CVPR 2024 poster arXiv:2305.16321
#4404

ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis

Muhammad Hamza Mughal, Rishabh Dabral, Ikhsanul Habibie et al.

CVPR 2024 poster arXiv:2403.17936
#4405

Taming Stable Diffusion for Text to 360 Panorama Image Generation

Cheng Zhang, Qianyi Wu, Camilo Cruz Gambardella et al.

CVPR 2024 highlight arXiv:2404.07949
#4406

A Physics-informed Low-rank Deep Neural Network for Blind and Universal Lens Aberration Correction

Jin Gong, Runzhao Yang, Weihang Zhang et al.

CVPR 2024 poster
#4407

Descriptor and Word Soups: Overcoming the Parameter Efficiency Accuracy Tradeoff for Out-of-Distribution Few-shot Learning

Christopher Liao, Theodoros Tsiligkaridis, Brian Kulis

CVPR 2024 poster arXiv:2311.13612
#4408

A Stealthy Wrongdoer: Feature-Oriented Reconstruction Attack against Split Learning

Xiaoyang Xu, Mengda Yang, Wenzhe Yi et al.

CVPR 2024 poster arXiv:2405.04115
#4409

Psychometry: An Omnifit Model for Image Reconstruction from Human Brain Activity

Ruijie Quan, Wenguan Wang, Zhibo Tian et al.

CVPR 2024 poster arXiv:2403.20022
#4410

Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

Yuqi Wang, Jiawei He, Lue Fan et al.

CVPR 2024 poster arXiv:2311.17918
#4411

Resource-Efficient Transformer Pruning for Finetuning of Large Models

Fatih Ilhan, Gong Su, Selim Tekin et al.

CVPR 2024 poster
#4412

Link-Context Learning for Multimodal LLMs

Yan Tai, Weichen Fan, Zhao Zhang et al.

CVPR 2024 poster arXiv:2308.07891
#4413

Deep-TROJ: An Inference Stage Trojan Insertion Algorithm through Efficient Weight Replacement Attack

Sabbir Ahmed, RANYANG ZHOU, Shaahin Angizi et al.

CVPR 2024 poster
#4414

Dynamic LiDAR Re-simulation using Compositional Neural Fields

Hanfeng Wu, Xingxing Zuo, Stefan Leutenegger et al.

CVPR 2024 highlight arXiv:2312.05247
#4415

DiLiGenRT: A Photometric Stereo Dataset with Quantified Roughness and Translucency

Heng Guo, Jieji Ren, Feishi Wang et al.

CVPR 2024 poster
#4416

Batch Normalization Alleviates the Spectral Bias in Coordinate Networks

Zhicheng Cai, Hao Zhu, Qiu Shen et al.

CVPR 2024 poster
#4417

NB-GTR: Narrow-Band Guided Turbulence Removal

Yifei Xia, Chu Zhou, Chengxuan Zhu et al.

CVPR 2024 poster
#4418

Positive-Unlabeled Learning by Latent Group-Aware Meta Disambiguation

Lin Long, Haobo Wang, Zhijie Jiang et al.

CVPR 2024 poster
#4419

SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Antoine Guédon, Vincent Lepetit

CVPR 2024 poster arXiv:2311.12775
#4420

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Jingbo Zhang, Xiaoyu Li, Qi Zhang et al.

CVPR 2024 poster arXiv:2311.16961
#4421

Harnessing Meta-Learning for Improving Full-Frame Video Stabilization

Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim et al.

CVPR 2024 poster arXiv:2403.03662
#4422

MoML: Online Meta Adaptation for 3D Human Motion Prediction

Xiaoning Sun, Huaijiang Sun, Bin Li et al.

CVPR 2024 poster
#4423

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.

CVPR 2024 poster arXiv:2310.06627
#4424

Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection

Xiaowei Zhao, Xianglong Liu, Duorui Wang et al.

CVPR 2024 poster
#4425

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan et al.

CVPR 2024 poster arXiv:2312.05849
#4426

MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection

Boyang Peng, Sanqing Qu, Yong Wu et al.

CVPR 2024 poster arXiv:2403.04149
#4427

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

Xin Zhou, Dingkang Liang, Wei Xu et al.

CVPR 2024 poster arXiv:2403.01439
#4428

EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation

Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu

CVPR 2024 poster arXiv:2405.06880
#4429

On Exact Inversion of DPM-Solvers

Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.

CVPR 2024 poster arXiv:2311.18387
#4430

A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals

Jiangnan Tang, Jingya Wang, Kaiyang Ji et al.

CVPR 2024 poster arXiv:2404.04890
#4431

Prompt Highlighter: Interactive Control for Multi-Modal LLMs

Yuechen Zhang, Shengju Qian, Bohao Peng et al.

CVPR 2024 poster arXiv:2312.04302
#4432

Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?

Zhengyue Zhao, Jinhao Duan, Kaidi Xu et al.

CVPR 2024 poster arXiv:2312.00084
#4433

Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy

Gengyu Zhang, Hao Tang, Yan Yan

CVPR 2024 poster arXiv:2404.02176
#4434

Improving Depth Completion via Depth Feature Upsampling

Yufei Wang, Ge Zhang, Shaoqian Wang et al.

CVPR 2024 poster
#4435

OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning

Noor Ahmed, Anna Kukleva, Bernt Schiele

CVPR 2024 highlight arXiv:2403.18550
#4436

LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation

Ke Guo, Zhenwei Miao, Wei Jing et al.

CVPR 2024 poster arXiv:2403.17601
#4437

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Fengyuan Shi, Jiaxi Gu, Hang Xu et al.

CVPR 2024 poster arXiv:2312.02813
#4438

SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation

Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu et al.

CVPR 2024 poster arXiv:2401.08053
#4439

Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception

Haoming Chen, Zhizhong Zhang, Yanyun Qu et al.

CVPR 2024 poster arXiv:2405.07201
#4440

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

Haoyi Jiang, Tianheng Cheng, Naiyu Gao et al.

CVPR 2024 poster arXiv:2306.15670
#4441

AV-RIR: Audio-Visual Room Impulse Response Estimation

Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar et al.

CVPR 2024 poster arXiv:2312.00834
#4442

Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Zhiyuan Min, Yawei Luo, Wei Yang et al.

CVPR 2024 poster arXiv:2311.11845
#4443

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen et al.

CVPR 2024 poster arXiv:2403.01693
#4444

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Bin Xiao, Haiping Wu, Weijian Xu et al.

CVPR 2024 poster arXiv:2311.06242
#4445

DMR: Decomposed Multi-Modality Representations for Frames and Events Fusion in Visual Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2024 poster
#4446

Frequency-aware Event-based Video Deblurring for Real-World Motion Blur

Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon

CVPR 2024 poster
#4447

Classes Are Not Equal: An Empirical Study on Image Recognition Fairness

Jiequan Cui, Beier Zhu, Xin Wen et al.

CVPR 2024 poster arXiv:2402.18133
#4448

Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors

Yu Zhang, Songpengcheng Xia, Lei Chu et al.

CVPR 2024 poster arXiv:2312.02196
#4449

Tri-Perspective View Decomposition for Geometry-Aware Depth Completion

Zhiqiang Yan, Yuankai Lin, Kun Wang et al.

CVPR 2024 poster arXiv:2403.15008
#4450

MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric

Haokun Lin, Haoli Bai, Zhili Liu et al.

CVPR 2024 poster arXiv:2403.07839
#4451

Tumor Micro-environment Interactions Guided Graph Learning for Survival Analysis of Human Cancers from Whole-slide Pathological Images

WEI SHAO, YangYang Shi, Daoqiang Zhang et al.

CVPR 2024 poster
#4452

Fooling Polarization-Based Vision using Locally Controllable Polarizing Projection

Zhuoxiao Li, Zhihang Zhong, Shohei Nobuhara et al.

CVPR 2024 poster arXiv:2303.17890
#4453

Diffusion-based Blind Text Image Super-Resolution

Yuzhe Zhang, jiawei zhang, Hao Li et al.

CVPR 2024 poster arXiv:2312.08886
#4454

AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring

Xintian Mao, Xiwen Gao, Yan Wang

CVPR 2024 poster arXiv:2406.09135
#4455

MS-DETR: Efficient DETR Training with Mixed Supervision

Chuyang Zhao, Yifan Sun, Wenhao Wang et al.

CVPR 2024 poster arXiv:2401.03989
#4456

LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP

Yunshi HUANG, Fereshteh Shakeri, Jose Dolz et al.

CVPR 2024 poster arXiv:2404.02285
#4457

PoNQ: a Neural QEM-based Mesh Representation

Nissim Maruani, Maks Ovsjanikov, Pierre Alliez et al.

CVPR 2024 poster arXiv:2403.12870
#4458

Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss

Jaeha Kim, Junghun Oh, Kyoung Mu Lee

CVPR 2024 poster arXiv:2404.01692
#4459

3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow

Felix Taubner, Prashant Raina, Mathieu Tuli et al.

CVPR 2024 poster arXiv:2404.09819
#4460

HIT: Estimating Internal Human Implicit Tissues from the Body Surface

Marilyn Keller, Vaibhav ARORA, Abdelmouttaleb Dakri et al.

CVPR 2024 poster
#4461

LEDITS++: Limitless Image Editing using Text-to-Image Models

Manuel Brack, Felix Friedrich, Katharina Kornmeier et al.

CVPR 2024 poster arXiv:2311.16711
#4462

Situational Awareness Matters in 3D Vision Language Reasoning

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2024 poster arXiv:2406.07544
#4463

SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction

Zechuan Zhang, Zongxin Yang, Yi Yang

CVPR 2024 highlight arXiv:2312.06704
#4464

MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

Yixin Liu, Chenrui Fan, Yutong Dai et al.

CVPR 2024 poster arXiv:2311.13127
#4465

ModaVerse: Efficiently Transforming Modalities with LLMs

Xinyu Wang, Bohan Zhuang, Qi Wu

CVPR 2024 poster arXiv:2401.06395
#4466

Hierarchical Histogram Threshold Segmentation – Auto-terminating High-detail Oversegmentation

Thomas Chang, Simon Seibt, Bartosz von Rymon Lipinski

CVPR 2024 poster
#4467

CogAgent: A Visual Language Model for GUI Agents

Wenyi Hong, Weihan Wang, Qingsong Lv et al.

CVPR 2024 highlight arXiv:2312.08914
#4468

Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers

Sanghyeok Lee, Joonmyung Choi, Hyunwoo J. Kim

CVPR 2024 poster arXiv:2403.10030
#4469

PeVL: Pose-Enhanced Vision-Language Model for Fine-Grained Human Action Recognition

Haosong Zhang, Mei Leong, Liyuan Li et al.

CVPR 2024 poster
#4470

Look-Up Table Compression for Efficient Image Restoration

Yinglong Li, Jiacheng Li, Zhiwei Xiong

CVPR 2024 highlight
#4471

Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation

Wenhao Li, Mengyuan Liu, Hong Liu et al.

CVPR 2024 highlight arXiv:2311.12028
#4472

Dense Vision Transformer Compression with Few Samples

Hanxiao Zhang, Yifan Zhou, Guo-Hua Wang

CVPR 2024 poster arXiv:2403.18708
#4473

Bilateral Adaptation for Human-Object Interaction Detection with Occlusion-Robustness

Guangzhi Wang, Yangyang Guo, Ziwei Xu et al.

CVPR 2024 poster
#4474

RMT: Retentive Networks Meet Vision Transformers

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

CVPR 2024 poster arXiv:2309.11523
#4475

Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs

Lin Song, Yukang Chen, Shuai Yang et al.

CVPR 2024 poster
#4476

CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization

Yao Ni, Piotr Koniusz

CVPR 2024 poster arXiv:2404.00521
#4477

OneFormer3D: One Transformer for Unified Point Cloud Segmentation

Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin et al.

CVPR 2024 poster arXiv:2311.14405
#4478

One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

Minghua Liu, Ruoxi Shi, Linghao Chen et al.

CVPR 2024 poster arXiv:2311.07885
#4479

C2KD: Bridging the Modality Gap for Cross-Modal Knowledge Distillation

Fushuo Huo, Wenchao Xu, Jingcai Guo et al.

CVPR 2024 highlight
#4480

CLOAF: CoLlisiOn-Aware Human Flow

Andrey Davydov, Martin Engilberge, Mathieu Salzmann et al.

CVPR 2024 poster arXiv:2403.09050
#4481

Abductive Ego-View Accident Video Understanding for Safe Driving Perception

Jianwu Fang, Lei-lei Li, Junfei Zhou et al.

CVPR 2024 highlight arXiv:2403.00436
#4482

SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image

Yunhao Li, Xiaodong Wang, Ping Wang et al.

CVPR 2024 highlight arXiv:2403.20018
#4483

Regressor-Segmenter Mutual Prompt Learning for Crowd Counting

Mingyue Guo, Li Yuan, Zhaoyi Yan et al.

CVPR 2024 poster arXiv:2312.01711
#4484

Vector Graphics Generation via Mutually Impulsed Dual-domain Diffusion

Zhongyin Zhao, Ye Chen, Zhangli Hu et al.

CVPR 2024 poster
#4485

Spectral Meets Spatial: Harmonising 3D Shape Matching and Interpolation

Dongliang Cao, Marvin Eisenberger, Nafie El Amrani et al.

CVPR 2024 poster arXiv:2402.18920
#4486

Learning to Transform Dynamically for Better Adversarial Transferability

Rongyi Zhu, Zeliang Zhang, Susan Liang et al.

CVPR 2024 poster arXiv:2405.14077
#4487

Learning to Select Views for Efficient Multi-View Understanding

Yunzhong Hou, Stephen Gould, Liang Zheng

CVPR 2024 poster
#4488

UniGS: Unified Representation for Image Generation and Segmentation

Lu Qi, Lehan Yang, Weidong Guo et al.

CVPR 2024 poster arXiv:2312.01985
#4489

UV-IDM: Identity-Conditioned Latent Diffusion Model for Face UV-Texture Generation

Hong Li, Yutang Feng, Song Xue et al.

CVPR 2024 poster
#4490

Open-Vocabulary Segmentation with Semantic-Assisted Calibration

Yong Liu, Sule Bai, Guanbin Li et al.

CVPR 2024 poster arXiv:2312.04089
#4491

RCooper: A Real-world Large-scale Dataset for Roadside Cooperative Perception

Ruiyang Hao, Siqi Fan, Yingru Dai et al.

CVPR 2024 poster arXiv:2403.10145
#4492

EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension

Jiaxuan Li, Duc Minh Vo, Akihiro Sugimoto et al.

CVPR 2024 poster arXiv:2311.15879
#4493

Geometrically-driven Aggregation for Zero-shot 3D Point Cloud Understanding

Guofeng Mei, Luigi Riz, Yiming Wang et al.

CVPR 2024 highlight arXiv:2312.02244
#4494

L0-Sampler: An L0 Model Guided Volume Sampling for NeRF

Liangchen Li, Juyong Zhang

CVPR 2024 poster
#4495

Context-based and Diversity-driven Specificity in Compositional Zero-Shot Learning

Yun Li, Zhe Liu, Hang Chen et al.

CVPR 2024 poster arXiv:2402.17251
#4496

VISTA-LLAMA: Reducing Hallucination in Video Language Models via Equal Distance to Visual Tokens

Fan Ma, Xiaojie Jin, Heng Wang et al.

CVPR 2024 poster arXiv:2312.08870
#4497

FC-GNN: Recovering Reliable and Accurate Correspondences from Interferences

Haobo Xu, Jun Zhou, Hua Yang et al.

CVPR 2024 poster
#4498

CapsFusion: Rethinking Image-Text Data at Scale

Qiying Yu, Quan Sun, Xiaosong Zhang et al.

CVPR 2024 poster arXiv:2310.20550
#4499

A General and Efficient Training for Transformer via Token Expansion

Wenxuan Huang, Yunhang Shen, Jiao Xie et al.

CVPR 2024 poster arXiv:2404.00672
#4500

Breathing Life Into Sketches Using Text-to-Video Priors

Rinon Gal, Yael Vinker, Yuval Alaluf et al.

CVPR 2024 highlight arXiv:2311.13608
#4501

Byzantine-robust Decentralized Federated Learning via Dual-domain Clustering and Trust Bootstrapping

Peng Sun, Xinyang Liu, Zhibo Wang et al.

CVPR 2024 poster
#4502

Towards Calibrated Multi-label Deep Neural Networks

Jiacheng Cheng, Nuno Vasconcelos

CVPR 2024 poster
#4503

TIM: A Time Interval Machine for Audio-Visual Action Recognition

Jacob Chalk, Jaesung Huh, Evangelos Kazakos et al.

CVPR 2024 poster arXiv:2404.05559
#4504

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI

Yandan Yang, Baoxiong Jia, Peiyuan Zhi et al.

CVPR 2024 highlight arXiv:2404.09465
#4505

Multi-modal In-Context Learning Makes an Ego-evolving Scene Text Recognizer

Zhen Zhao, Jingqun Tang, Chunhui Lin et al.

CVPR 2024 poster arXiv:2311.13120
#4506

Selective Nonlinearities Removal from Digital Signals

Krzysztof Maliszewski, Magdalena Urbanska, Varvara Vetrova et al.

CVPR 2024 poster arXiv:2403.09731
#4507

Towards a Perceptual Evaluation Framework for Lighting Estimation

Justine Giroux, Mohammad Reza Karimi Dastjerdi, Yannick Hold-Geoffroy et al.

CVPR 2024 poster arXiv:2312.04334
#4508

From Correspondences to Pose: Non-minimal Certifiably Optimal Relative Pose without Disambiguation

Javier Tirado-Garín, Javier Civera

CVPR 2024 highlight arXiv:2312.05995
#4509

Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing

Boqiang Zhang, Hongtao Xie, Zuan Gao et al.

CVPR 2024 poster arXiv:2405.04377
#4510

EASE-DETR: Easing the Competition among Object Queries

Yulu Gao, Yifan Sun, Xudong Ding et al.

CVPR 2024 poster
#4511

Transcriptomics-guided Slide Representation Learning in Computational Pathology

Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya et al.

CVPR 2024 poster arXiv:2405.11618
#4512

Active Open-Vocabulary Recognition: Let Intelligent Moving Mitigate CLIP Limitations

Lei Fan, Jianxiong Zhou, Xiaoying Xing et al.

CVPR 2024 poster arXiv:2311.17938
#4513

SAOR: Single-View Articulated Object Reconstruction

Mehmet Aygun, Oisin Mac Aodha

CVPR 2024 poster
#4514

GenHowTo: Learning to Generate Actions and State Transformations from Instructional Videos

Tomas Soucek, Dima Damen, Michael Wray et al.

CVPR 2024 poster
#4515

MoST: Multi-Modality Scene Tokenization for Motion Prediction

Norman Mu, Jingwei Ji, Zhenpei Yang et al.

CVPR 2024 poster arXiv:2404.19531
#4516

Multi-Scale Video Anomaly Detection by Multi-Grained Spatio-Temporal Representation Learning

Menghao Zhang, Jingyu Wang, Qi Qi et al.

CVPR 2024 highlight
#4517

PTT: Point-Trajectory Transformer for Efficient Temporal 3D Object Detection

Kuan-Chih Huang, Weijie Lyu, Ming-Hsuan Yang et al.

CVPR 2024 poster arXiv:2312.08371
#4518

TextNeRF: A Novel Scene-Text Image Synthesis Method based on Neural Radiance Fields

Jialei Cui, Jianwei Du, Wenzhuo Liu et al.

CVPR 2024 poster
#4519

An Asymmetric Augmented Self-Supervised Learning Method for Unsupervised Fine-Grained Image Hashing

Feiran Hu, Chenlin Zhang, Jiangliang GUO et al.

CVPR 2024 poster
#4520

DiffusionTrack: Point Set Diffusion Model for Visual Object Tracking

Fei Xie, Zhongdao Wang, Chao Ma

CVPR 2024 poster
#4521

Brush2Prompt: Contextual Prompt Generator for Object Inpainting

Mang Tik Chiu, Yuqian Zhou, Lingzhi Zhang et al.

CVPR 2024 poster
#4522

G^3-LQ: Marrying Hyperbolic Alignment with Explicit Semantic-Geometric Modeling for 3D Visual Grounding

Yuan Wang, Yali Li, Shengjin Wang

CVPR 2024 poster
#4523

Sparse Views Near Light: A Practical Paradigm for Uncalibrated Point-light Photometric Stereo

Mohammed Brahimi, Bjoern Haefner, Zhenzhang Ye et al.

CVPR 2024 poster arXiv:2404.00098
#4524

Total Selfie: Generating Full-Body Selfies

Bowei Chen, Brian Curless, Ira Kemelmacher-Shlizerman et al.

CVPR 2024 highlight arXiv:2308.14740
#4525

LayoutFormer: Hierarchical Text Detection Towards Scene Text Understanding

Min Liang, Jia-Wei Ma, Xiaobin Zhu et al.

CVPR 2024 poster
#4526

On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm

Peng Sun, Bei Shi, Daiwei Yu et al.

CVPR 2024 poster arXiv:2312.03526
#4527

PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding

Xuesong Nie, Haoyuan Jin, Yunfeng Yan et al.

CVPR 2024 poster
#4528

Seeing the Unseen: Visual Common Sense for Semantic Placement

Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra et al.

CVPR 2024 poster arXiv:2401.07770
#4529

Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion

Junjiao Tian, Lavisha Aggarwal, Andrea Colaco et al.

CVPR 2024 poster
#4530

WonderJourney: Going from Anywhere to Everywhere

Hong-Xing Yu, Haoyi Duan, Junhwa Hur et al.

CVPR 2024 poster arXiv:2312.03884
#4531

A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation

Qucheng Peng, Ce Zheng, Chen Chen

CVPR 2024 poster arXiv:2403.11310
#4532

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Kristen Grauman, Andrew Westbury, Lorenzo Torresani et al.

CVPR 2024 poster arXiv:2311.18259
#4533

CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention

Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali et al.

CVPR 2024 highlight arXiv:2402.17678
#4534

LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection

Dat NGUYEN, Nesryne Mejri, Inder Pal Singh et al.

CVPR 2024 poster arXiv:2401.13856
#4535

Adaptive Random Feature Regularization on Fine-tuning Deep Neural Networks

Shin'ya Yamaguchi, Sekitoshi Kanai et al.

CVPR 2024 poster
#4536

MRFP: Learning Generalizable Semantic Segmentation from Sim-2-Real with Multi-Resolution Feature Perturbation

Sumanth Udupa, Prajwal Gurunath, Aniruddh Sikdar et al.

CVPR 2024 poster arXiv:2311.18331
#4537

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Jack Urbanek, Florian Bordes, Pietro Astolfi et al.

CVPR 2024 poster arXiv:2312.08578
#4538

An Interactive Navigation Method with Effect-oriented Affordance

Xiaohan Wang, Yuehu LIU, Xinhang Song et al.

CVPR 2024 poster
#4539

MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection

Jakub Micorek, Horst Possegger, Dominik Narnhofer et al.

CVPR 2024 poster arXiv:2403.14497
#4540

PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics

Tianyi Xie, Zeshun Zong, Yuxing Qiu et al.

CVPR 2024 highlight arXiv:2311.12198
#4541

Infrared Adversarial Car Stickers

Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu et al.

CVPR 2024 poster arXiv:2405.09924
#4542

Implicit Event-RGBD Neural SLAM

Delin Qu, Chi Yan, Dong Wang et al.

CVPR 2024 highlight arXiv:2311.11013
#4543

Retraining-Free Model Quantization via One-Shot Weight-Coupling Learning

Chen Tang, Yuan Meng, Jiacheng Jiang et al.

CVPR 2024 poster arXiv:2401.01543
#4544

Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation

Wenxiao Deng, Wenbin Li, Tianyu Ding et al.

CVPR 2024 poster arXiv:2404.00563
#4545

Privacy-Preserving Face Recognition Using Trainable Feature Subtraction

Yuxi Mi, Zhizhou Zhong, Yuge Huang et al.

CVPR 2024 poster arXiv:2403.12457
#4546

Unified Entropy Optimization for Open-Set Test-Time Adaptation

Zhengqing Gao, Xu-Yao Zhang, Cheng-Lin Liu

CVPR 2024 poster arXiv:2404.06065
#4547

Poly Kernel Inception Network for Remote Sensing Detection

Xinhao Cai, Qiuxia Lai, Yuwei Wang et al.

CVPR 2024 poster arXiv:2403.06258
#4548

MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark

Sanghyun Woo, Kwanyong Park, Inkyu Shin et al.

CVPR 2024 poster arXiv:2403.20225
#4549

ParameterNet: Parameters Are All You Need for Large-scale Visual Pretraining of Mobile Networks

Kai Han, Yunhe Wang, Jianyuan Guo et al.

CVPR 2024 poster
#4550

LASO: Language-guided Affordance Segmentation on 3D Object

Yicong Li, Na Zhao, Junbin Xiao et al.

CVPR 2024 poster
#4551

Dispersed Structured Light for Hyperspectral 3D Imaging

Suhyun Shin, Seokjun Choi, Felix Heide et al.

CVPR 2024 poster arXiv:2311.18287
#4552

Behind the Veil: Enhanced Indoor 3D Scene Reconstruction with Occluded Surfaces Completion

Su Sun, Cheng Zhao, Yuliang Guo et al.

CVPR 2024 poster arXiv:2404.03070
#4553

ActiveDC: Distribution Calibration for Active Finetuning

Wenshuai Xu, Zhenghui Hu, Yu Lu et al.

CVPR 2024 poster arXiv:2311.07634
#4554

AUEditNet: Dual-Branch Facial Action Unit Intensity Manipulation with Implicit Disentanglement

Shiwei Jin, Zhen Wang, Lei Wang et al.

CVPR 2024 poster arXiv:2404.05063
#4555

VecFusion: Vector Font Generation with Diffusion

Vikas Thamizharasan, Difan Liu, Shantanu Agarwal et al.

CVPR 2024 highlight arXiv:2312.10540
#4556

SeMoLi: What Moves Together Belongs Together

Jenny Seidenschwarz, Aljoša Ošep, Francesco Ferroni et al.

CVPR 2024 poster arXiv:2402.19463
#4557

HINTED: Hard Instance Enhanced Detector with Mixed-Density Feature Fusion for Sparsely-Supervised 3D Object Detection

Qiming Xia, Wei Ye, Hai Wu et al.

CVPR 2024 poster
#4558

Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

Yeonguk Yu, Sungho Shin, Seunghyeok Back et al.

CVPR 2024 poster arXiv:2404.10966
#4559

LLMs are Good Sign Language Translators

Jia Gong, Lin Geng Foo, Yixuan He et al.

CVPR 2024 poster arXiv:2404.00925
#4560

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

Sicheng Mo, Fangzhou Mu, Kuan Heng Lin et al.

CVPR 2024 poster arXiv:2312.07536
#4561

G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping

Junfeng Cheng, Tania Stathaki

CVPR 2024 poster arXiv:2405.06828
#4562

Masked AutoDecoder is Effective Multi-Task Vision Generalist

Han Qiu, Jiaxing Huang, Peng Gao et al.

CVPR 2024 poster arXiv:2403.07692
#4563

Deciphering ‘What’ and ‘Where’ Visual Pathways from Spectral Clustering of Layer-Distributed Neural Representations

Xiao Zhang, David Yunis, Michael Maire

CVPR 2024 highlight arXiv:2312.06716
#4564

Probabilistic Speech-Driven 3D Facial Motion Synthesis: New Benchmarks, Methods, and Applications

Karren Yang, Anurag Ranjan, Jen-Hao Rick Chang et al.

CVPR 2024 poster
#4565

From Feature to Gaze: A Generalizable Replacement of Linear Layer for Gaze Estimation

Yiwei Bao, Feng Lu

CVPR 2024 highlight
#4566

Language Models as Black-Box Optimizers for Vision-Language Models

Shihong Liu, Samuel Yu, Zhiqiu Lin et al.

CVPR 2024 poster arXiv:2309.05950
#4567

Transferable Structural Sparse Adversarial Attack Via Exact Group Sparsity Training

Di Ming, Peng Ren, Yunlong Wang et al.

CVPR 2024 poster
#4568

ASH: Animatable Gaussian Splats for Efficient and Photoreal Human Rendering

Haokai Pang, Heming Zhu, Adam Kortylewski et al.

CVPR 2024 poster arXiv:2312.05941
#4569

Equivariant Plug-and-Play Image Reconstruction

Matthieu Terris, Thomas Moreau, Nelly Pustelnik et al.

CVPR 2024 poster arXiv:2312.01831
#4570

DiffusionAvatars: Deferred Diffusion for High-fidelity 3D Head Avatars

Tobias Kirschstein, Simon Giebenhain, Matthias Nießner

CVPR 2024 poster arXiv:2311.18635
#4571

Learning Background Prompts to Discover Implicit Knowledge for Open Vocabulary Object Detection

Jiaming Li, Jiacheng Zhang, Jichang Li et al.

CVPR 2024 poster arXiv:2406.00510
#4572

AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection

Trevine Oorloff, Surya Koppisetti, Nicolo Bonettini et al.

CVPR 2024 poster arXiv:2406.02951
#4573

Brain Decodes Deep Nets

Huzheng Yang, James Gee, Jianbo Shi

CVPR 2024 highlight arXiv:2312.01280
#4574

A Unified Framework for Microscopy Defocus Deblur with Multi-Pyramid Transformer and Contrastive Learning

Yuelin Zhang, Pengyu Zheng, Wanquan Yan et al.

CVPR 2024 poster arXiv:2403.02611
#4575

SchurVINS: Schur Complement-Based Lightweight Visual Inertial Navigation System

Yunfei Fan, Tianyu Zhao, Guidong Wang

CVPR 2024 poster arXiv:2312.01616
#4576

360DVD: Controllable Panorama Video Generation with 360-Degree Video Diffusion Model

Qian Wang, Weiqi Li, Chong Mou et al.

CVPR 2024 poster arXiv:2401.06578
#4577

Alpha Invariance: On Inverse Scaling Between Distance and Volume Density in Neural Radiance Fields

Joshua Ahn, Haochen Wang, Raymond A. Yeh et al.

CVPR 2024 poster arXiv:2404.02155
#4578

ANIM: Accurate Neural Implicit Model for Human Reconstruction from a single RGB-D Image

Marco Pesavento, Yuanlu Xu, Nikolaos Sarafianos et al.

CVPR 2024 poster arXiv:2403.10357
#4579

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024 poster arXiv:2403.13347
#4580

Weakly-Supervised Audio-Visual Video Parsing with Prototype-based Pseudo-Labeling

Kranthi Kumar Rachavarapu, Kalyan Ramakrishnan, A. N. Rajagopalan

CVPR 2024 poster
#4581

DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking

Cheng Huang, Shoudong Han, Mengyu He et al.

CVPR 2024 poster
#4582

ChatPose: Chatting about 3D Human Pose

Yao Feng, Jing Lin, Sai Kumar Dwivedi et al.

CVPR 2024 poster arXiv:2311.18836
#4583

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

Jingyao Xu, Yuetong Lu, Yandong Li et al.

CVPR 2024 poster arXiv:2404.15081
#4584

ReGenNet: Towards Human Action-Reaction Synthesis

Liang Xu, Yizhou Zhou, Yichao Yan et al.

CVPR 2024 poster arXiv:2403.11882
#4585

ZONE: Zero-Shot Instruction-Guided Local Editing

Shanglin Li, Bohan Zeng, Yutang Feng et al.

CVPR 2024 poster arXiv:2312.16794
#4586

Part-aware Unified Representation of Language and Skeleton for Zero-shot Action Recognition

Anqi Zhu, Qiuhong Ke, Mingming Gong et al.

CVPR 2024 poster arXiv:2406.13327
#4587

Diffusion-ES: Gradient-free Planning with Diffusion for Autonomous and Instruction-guided Driving

Brian Yang, Huangyuan Su, Nikolaos Gkanatsios et al.

CVPR 2024 poster
#4588

Structured Model Probing: Empowering Efficient Transfer Learning by Structured Regularization

Zhi-Fan Wu, Chaojie Mao, Xue Wang et al.

CVPR 2024 poster
#4589

TULIP: Transformer for Upsampling of LiDAR Point Clouds

Bin Yang, Patrick Pfreundschuh, Roland Siegwart et al.

CVPR 2024 poster arXiv:2312.06733
#4590

Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

Xiao Ma, Sumit Patidar, Iain Haughton et al.

CVPR 2024 poster arXiv:2403.03890
#4591

ODCR: Orthogonal Decoupling Contrastive Regularization for Unpaired Image Dehazing

Zhongze Wang, Haitao Zhao, Jingchao Peng et al.

CVPR 2024 poster arXiv:2404.17825
#4592

Unsupervised Learning of Category-Level 3D Pose from Object-Centric Videos

Leonhard Sommer, Artur Jesslen, Eddy Ilg et al.

CVPR 2024 poster arXiv:2407.04384
#4593

PICTURE: PhotorealistIC virtual Try-on from UnconstRained dEsigns

Shuliang Ning, Duomin Wang, Yipeng Qin et al.

CVPR 2024 poster arXiv:2312.04534
#4594

Learned Representation-Guided Diffusion Models for Large-Image Generation

Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le et al.

CVPR 2024 poster arXiv:2312.07330
#4595

REWIND: Real-Time Egocentric Whole-Body Motion Diffusion with Exemplar-Based Identity Conditioning

Jian Wang, Zhe Cao, Diogo Luvizon et al.

CVPR 2024 poster
#4596

Spectral and Polarization Vision: Spectro-polarimetric Real-world Dataset

Yujin Jeon, Eunsue Choi, Youngchan Kim et al.

CVPR 2024 highlight arXiv:2311.17396
#4597

Diffusion Models Without Attention

Jing Nathan Yan, Jiatao Gu, Alexander Rush

CVPR 2024 poster arXiv:2311.18257
#4598

H-ViT: A Hierarchical Vision Transformer for Deformable Image Registration

Morteza Ghahremani, Mohammad Khateri, Bailiang Jian et al.

CVPR 2024 highlight
#4599

Going Beyond Multi-Task Dense Prediction with Synergy Embedding Models

Huimin Huang, Yawen Huang, Lanfen Lin et al.

CVPR 2024 poster
#4600

MR-VNet: Media Restoration using Volterra Networks

Siddharth Roheda, Amit Unde, Loay Rashid

CVPR 2024 poster