Most Cited 2024 Poster Papers

12,324 papers found • Page 39 of 62

#7601

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

Haimei Zhao, Jing Zhang, Zhuo Chen et al.

CVPR 2024posterarXiv:2404.05145
#7602

BilevelPruning: Unified Dynamic and Static Channel Pruning for Convolutional Neural Networks

Shangqian Gao, Yanfu Zhang, Feihu Huang et al.

CVPR 2024poster
#7603

Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models

Jiayi Guo, Xingqian Xu, Yifan Pu et al.

CVPR 2024posterarXiv:2312.04410
#7604

Editable Scene Simulation for Autonomous Driving via Collaborative LLM-Agents

Yuxi Wei, Zi Wang, Yifan Lu et al.

CVPR 2024highlightarXiv:2402.05746
#7605

Instruct-ReID: A Multi-purpose Person Re-identification Task with Instructions

Weizhen He, Yiheng Deng, SHIXIANG TANG et al.

CVPR 2024posterarXiv:2306.07520
#7606

SPOT: Self-Training with Patch-Order Permutation for Object-Centric Learning with Autoregressive Transformers

Ioannis Kakogeorgiou, Spyros Gidaris, Konstantinos Karantzalos et al.

CVPR 2024highlightarXiv:2312.00648
#7607

MRC-Net: 6-DoF Pose Estimation with MultiScale Residual Correlation

Yuelong Li, Yafei Mao, Raja Bala et al.

CVPR 2024posterarXiv:2403.08019
#7608

Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Youngjoon Jang, Jihoon Kim, Junseok Ahn et al.

CVPR 2024posterarXiv:2405.10272
#7609

Learning to Segment Referred Objects from Narrated Egocentric Videos

Yuhan Shen, Huiyu Wang, Xitong Yang et al.

CVPR 2024poster
#7610

CLIP-BEVFormer: Enhancing Multi-View Image-Based BEV Detector with Ground Truth Flow

Chenbin Pan, Burhan Yaman, Senem Velipasalar et al.

CVPR 2024posterarXiv:2403.08919
#7611

ADFactory: An Effective Framework for Generalizing Optical Flow with NeRF

Han Ling, Quansen Sun, Yinghui Sun et al.

CVPR 2024posterarXiv:2311.04246
#7612

FastMAC: Stochastic Spectral Sampling of Correspondence Graph

Yifei Zhang, Hao Zhao, Hongyang Li et al.

CVPR 2024posterarXiv:2403.08770
#7613

PeLK: Parameter-efficient Large Kernel ConvNets with Peripheral Convolution

Honghao Chen, Xiangxiang Chu, Renyongjian et al.

CVPR 2024posterarXiv:2403.07589
#7614

Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps

Octave Mariotti, Oisin Mac Aodha, Hakan Bilen

CVPR 2024posterarXiv:2312.13216
#7615

Learning Degradation-Independent Representations for Camera ISP Pipelines

Yanhui Guo, Fangzhou Luo, Xiaolin Wu

CVPR 2024posterarXiv:2307.00761
#7616

Low-Resource Vision Challenges for Foundation Models

Yunhua Zhang, Hazel Doughty, Cees G. M. Snoek

CVPR 2024posterarXiv:2401.04716
#7617

MaxQ: Multi-Axis Query for N:M Sparsity Network

Jingyang Xiang, Siqi Li, Junhao Chen et al.

CVPR 2024poster
#7618

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Haoning Wu, Zicheng Zhang, Erli Zhang et al.

CVPR 2024posterarXiv:2311.06783
#7619

Efficient Scene Recovery Using Luminous Flux Prior

ZhongYu Li, Lei Zhang

CVPR 2024poster
#7620

Towards CLIP-driven Language-free 3D Visual Grounding via 2D-3D Relational Enhancement and Consistency

Yuqi Zhang, Han Luo, Yinjie Lei

CVPR 2024poster
#7621

LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection

Yunpeng Luo, Junlong Du, Ke Yan et al.

CVPR 2024posterarXiv:2403.17465
#7622

Stratified Avatar Generation from Sparse Observations

Han Feng, Wenchao Ma, Quankai Gao et al.

CVPR 2024posterarXiv:2405.20786
#7623

Compressed 3D Gaussian Splatting for Accelerated Novel View Synthesis

Simon Niedermayr, Josef Stumpfegger, rüdiger westermann

CVPR 2024posterarXiv:2401.02436
#7624

Motion Blur Decomposition with Cross-shutter Guidance

Xiang Ji, Haiyang Jiang, Yinqiang Zheng

CVPR 2024posterarXiv:2404.01120
#7625

NAPGuard: Towards Detecting Naturalistic Adversarial Patches

Siyang Wu, Jiakai Wang, Jiejie Zhao et al.

CVPR 2024poster
#7626

Active Prompt Learning in Vision Language Models

Jihwan Bang, Sumyeong Ahn, Jae-Gil Lee

CVPR 2024posterarXiv:2311.11178
#7627

Generating Handwritten Mathematical Expressions From Symbol Graphs: An End-to-End Pipeline

Yu chen, Fei Gao, YanguangZhang et al.

CVPR 2024poster
#7628

On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation

Agneet Chatterjee, Tejas Gokhale, Chitta Baral et al.

CVPR 2024posterarXiv:2404.08540
#7629

SingularTrajectory: Universal Trajectory Predictor Using Diffusion Model

Inhwan Bae, Young-Jae Park, Hae-Gon Jeon

CVPR 2024posterarXiv:2403.18452
#7630

Solving the Catastrophic Forgetting Problem in Generalized Category Discovery

Xinzi Cao, Xiawu Zheng, Guanhong Wang et al.

CVPR 2024posterarXiv:2501.05272
#7631

MatchU: Matching Unseen Objects for 6D Pose Estimation from RGB-D Images

Junwen Huang, Hao Yu, Kuan-Ting Yu et al.

CVPR 2024posterarXiv:2403.01517
#7632

The Manga Whisperer: Automatically Generating Transcriptions for Comics

Ragav Sachdeva, Andrew Zisserman

CVPR 2024posterarXiv:2401.10224
#7633

3DInAction: Understanding Human Actions in 3D Point Clouds

Yizhak Ben-Shabat, Oren Shrout, Stephen Gould

CVPR 2024highlightarXiv:2303.06346
#7634

StyLitGAN: Image-Based Relighting via Latent Control

Anand Bhattad, James Soole, David Forsyth

CVPR 2024poster
#7635

Unsupervised Universal Image Segmentation

XuDong Wang, Dantong Niu, Xinyang Han et al.

CVPR 2024posterarXiv:2312.17243
#7636

Not All Classes Stand on Same Embeddings: Calibrating a Semantic Distance with Metric Tensor

Jae Hyeon Park, Gyoomin Lee, Seunggi Park et al.

CVPR 2024poster
#7637

Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling

Zhe Li, Zerong Zheng, Lizhen Wang et al.

CVPR 2024poster
#7638

Retrieval-Augmented Open-Vocabulary Object Detection

Jooyeon Kim, Eulrang Cho, Sehyung Kim et al.

CVPR 2024posterarXiv:2404.05687
#7639

LangSplat: 3D Language Gaussian Splatting

Minghan Qin, Wanhua Li, Jiawei ZHOU et al.

CVPR 2024highlightarXiv:2312.16084
#7640

SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects

Abhinav Kumar, Yuliang Guo, Xinyu Huang et al.

CVPR 2024posterarXiv:2403.20318
#7641

Learning with Structural Labels for Learning with Noisy Labels

Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee

CVPR 2024poster
#7642

Incremental Nuclei Segmentation from Histopathological Images via Future-class Awareness and Compatibility-inspired Distillation

Huyong Wang, Huisi Wu, Jing Qin

CVPR 2024poster
#7643

Model Inversion Robustness: Can Transfer Learning Help?

Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran et al.

CVPR 2024posterarXiv:2405.05588
#7644

MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning

Mohamed Abdelfattah, Mariam Hassan, Alex Alahi

CVPR 2024poster
#7645

D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

Dinh Phat Do, Taehoon Kim, JAEMIN NA et al.

CVPR 2024posterarXiv:2403.09359
#7646

Intrinsic Image Diffusion for Indoor Single-view Material Estimation

Peter Kocsis, Vincent Sitzmann, Matthias Nießner

CVPR 2024posterarXiv:2312.12274
#7647

NetTrack: Tracking Highly Dynamic Objects with a Net

Guangze Zheng, Shijie Lin, Haobo Zuo et al.

CVPR 2024posterarXiv:2403.11186
#7648

FADES: Fair Disentanglement with Sensitive Relevance

Taeuk Jang, Xiaoqian Wang

CVPR 2024poster
#7649

Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption

Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii et al.

CVPR 2024posterarXiv:2303.17166
#7650

HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

Haithem Turki, Vasu Agrawal, Samuel Rota Bulò et al.

CVPR 2024highlightarXiv:2312.03160
#7651

IIRP-Net: Iterative Inference Residual Pyramid Network for Enhanced Image Registration

Tai Ma, zhangsuwei, Jiafeng Li et al.

CVPR 2024poster
#7652

SEED-Bench: Benchmarking Multimodal Large Language Models

Bohao Li, Yuying Ge, Yixiao Ge et al.

CVPR 2024poster
#7653

Style Aligned Image Generation via Shared Attention

Amir Hertz, Andrey Voynov, Shlomi Fruchter et al.

CVPR 2024posterarXiv:2312.02133
#7654

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows

Zhenggang Tang, Jason Ren, Xiaoming Zhao et al.

CVPR 2024posterarXiv:2406.10543
#7655

LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

Qihao Zhao, Yalun Dai, Hao Li et al.

CVPR 2024posterarXiv:2403.05854
#7656

How to Train Neural Field Representations: A Comprehensive Study and Benchmark

Samuele Papa, Riccardo Valperga, David Knigge et al.

CVPR 2024posterarXiv:2312.10531
#7657

Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector

Yifu Ding, Weilun Feng, Chuyan Chen et al.

CVPR 2024poster
#7658

FREE: Faster and Better Data-Free Meta-Learning

Yongxian Wei, Zixuan Hu, Zhenyi Wang et al.

CVPR 2024posterarXiv:2405.00984
#7659

You Only Need Less Attention at Each Stage in Vision Transformers

Shuoxi Zhang, Hanpeng Liu, Stephen Lin et al.

CVPR 2024posterarXiv:2406.00427
#7660

Hierarchical Patch Diffusion Models for High-Resolution Video Generation

Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin et al.

CVPR 2024posterarXiv:2406.07792
#7661

CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

Sajid Javed, Arif Mahmood, IYYAKUTTI IYAPPAN GANAPATHI et al.

CVPR 2024posterarXiv:2406.05205
#7662

Towards Memorization-Free Diffusion Models

Chen Chen, Daochang Liu, Chang Xu

CVPR 2024posterarXiv:2404.00922
#7663

FedHCA2: Towards Hetero-Client Federated Multi-Task Learning

Yuxiang Lu, Suizhi Huang, Yuwen Yang et al.

CVPR 2024poster
#7664

Improving Unsupervised Hierarchical Representation with Reinforcement Learning

Ruyi An, Yewen Li, Xu He et al.

CVPR 2024poster
#7665

Global Latent Neural Rendering

Thomas Tanay, Matteo Maggioni

CVPR 2024posterarXiv:2312.08338
#7666

Data Poisoning based Backdoor Attacks to Contrastive Learning

Jinghuai Zhang, Hongbin Liu, Jinyuan Jia et al.

CVPR 2024posterarXiv:2211.08229
#7667

ERMVP: Communication-Efficient and Collaboration-Robust Multi-Vehicle Perception in Challenging Environments

Jingyu Zhang, Kun Yang, Yilei Wang et al.

CVPR 2024poster
#7668

GRAM: Global Reasoning for Multi-Page VQA

Itshak Blau, Sharon Fogel, Roi Ronen et al.

CVPR 2024posterarXiv:2401.03411
#7669

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

Qifan Yu, Juncheng Li, Longhui Wei et al.

CVPR 2024posterarXiv:2311.13614
#7670

Perception-Oriented Video Frame Interpolation via Asymmetric Blending

Guangyang Wu, Xin Tao, Changlin Li et al.

CVPR 2024posterarXiv:2404.06692
#7671

Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation

Qi Yang, Xing Nie, Tong Li et al.

CVPR 2024highlightarXiv:2312.06462
#7672

Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names

Yapeng Li, Yong Luo, Zengmao Wang et al.

CVPR 2024poster
#7673

Continual Learning for Motion Prediction Model via Meta-Representation Learning and Optimal Memory Buffer Retention Strategy

Dae Jun Kang, Dongsuk Kum, Sanmin Kim

CVPR 2024poster
#7674

3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

Zhiyin Qian, Shaofei Wang, Marko Mihajlovic et al.

CVPR 2024posterarXiv:2312.09228
#7675

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

Haofeng Liu, Chenshu Xu, Yifei Yang et al.

CVPR 2024posterarXiv:2404.01050
#7676

Building Vision-Language Models on Solid Foundations with Masked Distillation

Sepehr Sameni, Kushal Kafle, Hao Tan et al.

CVPR 2024poster
#7677

FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation

Pengchong Qiao, Lei Shang, Chang Liu et al.

CVPR 2024posterarXiv:2403.06775
#7678

OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang et al.

CVPR 2024posterarXiv:2404.00678
#7679

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye

CVPR 2024posterarXiv:2312.00845
#7680

M3-UDA: A New Benchmark for Unsupervised Domain Adaptive Fetal Cardiac Structure Detection

Bin Pu, Liwen Wang, Jiewen Yang et al.

CVPR 2024poster
#7681

Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning

Da-Wei Zhou, Hai-Long Sun, Han-Jia Ye et al.

CVPR 2024posterarXiv:2403.12030
#7682

Multiway Point Cloud Mosaicking with Diffusion and Global Optimization

Shengze Jin, Iro Armeni, Marc Pollefeys et al.

CVPR 2024posterarXiv:2404.00429
#7683

NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images

Yufei Han, Heng Guo, Koki Fukai et al.

CVPR 2024posterarXiv:2406.07111
#7684

Beyond Text: Frozen Large Language Models in Visual Signal Comprehension

Lei Zhu, Fangyun Wei, Yanye Lu

CVPR 2024posterarXiv:2403.07874
#7685

CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective

Shunsuke Yasuki, Masato Taki

CVPR 2024posterarXiv:2403.06676
#7686

Regularized Parameter Uncertainty for Improving Generalization in Reinforcement Learning

Pehuen Moure, Longbiao Cheng, Joachim Ott et al.

CVPR 2024poster
#7687

Robust Noisy Correspondence Learning with Equivariant Similarity Consistency

Yuchen Yang, Erkun Yang, Likai Wang et al.

CVPR 2024poster
#7688

OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning

Siddharth Srivastava, Gaurav Sharma

CVPR 2024posterarXiv:2507.13364
#7689

Utility-Fairness Trade-Offs and How to Find Them

Sepehr Dehdashtian, Bashir Sadeghi, Vishnu Naresh Boddeti

CVPR 2024posterarXiv:2404.09454
#7690

Fitting Flats to Flats

Gabriel Dogadov, Ugo Finnendahl, Marc Alexa

CVPR 2024poster
#7691

HOIST-Former: Hand-held Objects Identification Segmentation and Tracking in the Wild

Supreeth Narasimhaswamy, Huy Anh Nguyen, Lihan Huang et al.

CVPR 2024poster
#7692

WaveFace: Authentic Face Restoration with Efficient Frequency Recovery

Yunqi Miao, Jiankang Deng, Jungong Han

CVPR 2024posterarXiv:2403.12760
#7693

Learning Adaptive Spatial Coherent Correlations for Speech-Preserving Facial Expression Manipulation

Tianshui Chen, Jianman Lin, Zhijing Yang et al.

CVPR 2024highlight
#7694

UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and Unfavorable Sets

Youngju Na, Woo Jae Kim, Kyu Han et al.

CVPR 2024posterarXiv:2403.05086
#7695

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Lihe Yang, Bingyi Kang, Zilong Huang et al.

CVPR 2024posterarXiv:2401.10891
#7696

EventDance: Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition

Xu Zheng, Addison, Lin Wang

CVPR 2024posterarXiv:2403.14082
#7697

Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization

Insoo Kim, Jae Seok Choi, Geonseok Seo et al.

CVPR 2024posterarXiv:2404.12168
#7698

HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting

Hongyu Zhou, Jiahao Shao, Lu Xu et al.

CVPR 2024posterarXiv:2403.12722
#7699

Human Motion Prediction Under Unexpected Perturbation

Jiangbei Yue, Baiyi Li, Julien Pettré et al.

CVPR 2024highlightarXiv:2403.15891
#7700

SynFog: A Photo-realistic Synthetic Fog Dataset based on End-to-end Imaging Simulation for Advancing Real-World Defogging in Autonomous Driving

Yiming Xie, Henglu Wei, Zhenyi Liu et al.

CVPR 2024posterarXiv:2403.17094
#7701

IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing

Shaofei Wang, Bozidar Antic, Andreas Geiger et al.

CVPR 2024posterarXiv:2312.05210
#7702

All in One Framework for Multimodal Re-identification in the Wild

He Li, Mang Ye, Ming Zhang et al.

CVPR 2024posterarXiv:2405.04741
#7703

FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning

Gihun Lee, Minchan Jeong, SangMook Kim et al.

CVPR 2024posterarXiv:2308.12532
#7704

Don't Look into the Dark: Latent Codes for Pluralistic Image Inpainting

Haiwei Chen, Yajie Zhao

CVPR 2024posterarXiv:2403.18186
#7705

WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion

Soyong Shin, Juyong Kim, Eni Halilaj et al.

CVPR 2024posterarXiv:2312.07531
#7706

FedUV: Uniformity and Variance for Heterogeneous Federated Learning

Ha Min Son, Moon-Hyun Kim, Tai-Myoung Chung et al.

CVPR 2024posterarXiv:2402.18372
#7707

Language-driven Grasp Detection

An Dinh Vuong, Minh Nhat VU, Baoru Huang et al.

CVPR 2024posterarXiv:2406.09489
#7708

Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation

Ziyang Chen, Yongsheng Pan, Yiwen Ye et al.

CVPR 2024posterarXiv:2311.18363
#7709

Prompting Vision Foundation Models for Pathology Image Analysis

CHONG YIN, Siqi Liu, Kaiyang Zhou et al.

CVPR 2024poster
#7710

Unmixing Before Fusion: A Generalized Paradigm for Multi-Source-based Hyperspectral Image Synthesis

Yang Yu, Erting Pan, Xinya Wang et al.

CVPR 2024poster
#7711

Navigating Beyond Dropout: An Intriguing Solution towards Generalizable Image Super Resolution

Hongjun Wang, Jiyuan Chen, Yinqiang Zheng et al.

CVPR 2024posterarXiv:2402.18929
#7712

Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

Chen Cheng, Xiaofeng Yang, Fan Yang et al.

CVPR 2024posterarXiv:2403.09140
#7713

Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships

Rangel Daroya, Aaron Sun, Subhransu Maji

CVPR 2024highlightarXiv:2403.17173
#7714

ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations

Rwiddhi Chakraborty, Adrian de Sena Sletten, Michael C. Kampffmeyer

CVPR 2024posterarXiv:2403.13870
#7715

DUDF: Differentiable Unsigned Distance Fields with Hyperbolic Scaling

Miguel Fainstein, Viviana Siless, Emmanuel Iarussi

CVPR 2024posterarXiv:2402.08876
#7716

PBWR: Parametric-Building-Wireframe Reconstruction from Aerial LiDAR Point Clouds

Shangfeng Huang, Ruisheng Wang, Bo Guo et al.

CVPR 2024poster
#7717

GenH2R: Learning Generalizable Human-to-Robot Handover via Scalable Simulation Demonstration and Imitation

Zifan Wang, Junyu Chen, Ziqing Chen et al.

CVPR 2024poster
#7718

Content-Style Decoupling for Unsupervised Makeup Transfer without Generating Pseudo Ground Truth

Zhaoyang Sun, Shengwu Xiong, Yaxiong Chen et al.

CVPR 2024posterarXiv:2405.17240
#7719

UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs

Yanwu Xu, Yang Zhao, Zhisheng Xiao et al.

CVPR 2024highlightarXiv:2311.09257
#7720

SelfPose3d: Self-Supervised Multi-Person Multi-View 3d Pose Estimation

Keqi Chen, vinkle srivastav, Nicolas Padoy

CVPR 2024posterarXiv:2404.02041
#7721

Context-Aware Integration of Language and Visual References for Natural Language Tracking

Yanyan Shao, Shuting He, Qi Ye et al.

CVPR 2024posterarXiv:2403.19975
#7722

PointBeV: A Sparse Approach for BeV Predictions

Loick Chambon, Éloi Zablocki, Mickaël Chen et al.

CVPR 2024posterarXiv:2312.00703
#7723

Contextual Augmented Global Contrast for Multimodal Intent Recognition

Kaili Sun, Zhiwen Xie, Mang Ye et al.

CVPR 2024poster
#7724

Intraoperative 2D/3D Image Registration via Differentiable X-ray Rendering

Vivek Gopalakrishnan, Neel Dey, Polina Golland

CVPR 2024posterarXiv:2312.06358
#7725

LLM4SGG: Large Language Models for Weakly Supervised Scene Graph Generation

Kibum Kim, Kanghoon Yoon, Jaehyeong Jeon et al.

CVPR 2024posterarXiv:2310.10404
#7726

FreeKD: Knowledge Distillation via Semantic Frequency Prompt

Yuan Zhang, Tao Huang, Jiaming Liu et al.

CVPR 2024posterarXiv:2311.12079
#7727

Enhanced Motion-Text Alignment for Image-to-Video Transfer Learning

Wei Zhang, Chaoqun Wan, Tongliang Liu et al.

CVPR 2024poster
#7728

SNIDA: Unlocking Few-Shot Object Detection with Non-linear Semantic Decoupling Augmentation

Yanjie Wang, Xu Zou, Luxin Yan et al.

CVPR 2024poster
#7729

Diffusion 3D Features (Diff3F): Decorating Untextured Shapes with Distilled Semantic Features

Niladri Shekhar Dutt, Sanjeev Muralikrishnan, Niloy J. Mitra

CVPR 2024posterarXiv:2311.17024
#7730

Unsupervised Occupancy Learning from Sparse Point Cloud

Amine Ouasfi, Adnane Boukhayma

CVPR 2024highlightarXiv:2404.02759
#7731

Hierarchical Intra-modal Correlation Learning for Label-free 3D Semantic Segmentation

Xin Kang, Lei Chu, Jiahao Li et al.

CVPR 2024poster
#7732

Adaptive Hyper-graph Aggregation for Modality-Agnostic Federated Learning

Fan Qi, Shuai Li

CVPR 2024poster
#7733

Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment

Angchi Xu, Wei-Shi Zheng

CVPR 2024posterarXiv:2403.19225
#7734

MoMask: Generative Masked Modeling of 3D Human Motions

chuan guo, Yuxuan Mu, Muhammad Gohar Javed et al.

CVPR 2024posterarXiv:2312.00063
#7735

Event-based Visible and Infrared Fusion via Multi-task Collaboration

Mengyue Geng, Lin Zhu, Lizhi Wang et al.

CVPR 2024poster
#7736

MotionEditor: Editing Video Motion via Content-Aware Diffusion

Shuyuan Tu, Qi Dai, Zhi-Qi Cheng et al.

CVPR 2024posterarXiv:2311.18830
#7737

State Space Models for Event Cameras

Nikola Zubic, Mathias Gehrig, Davide Scaramuzza

CVPR 2024poster
#7738

Test-Time Linear Out-of-Distribution Detection

Ke Fan, Tong Liu, Xingyu Qiu et al.

CVPR 2024poster
#7739

Exploiting Style Latent Flows for Generalizing Deepfake Video Detection

Jongwook Choi, Taehoon Kim, Yonghyun Jeong et al.

CVPR 2024posterarXiv:2403.06592
#7740

Unsupervised Semantic Segmentation Through Depth-Guided Feature Correlation and Sampling

Leon Sick, Dominik Engel, Pedro Hermosilla et al.

CVPR 2024posterarXiv:2309.12378
#7741

Generalizing 6-DoF Grasp Detection via Domain Prior Knowledge

Haoxiang Ma, Modi Shi, Boyang GAO et al.

CVPR 2024posterarXiv:2404.01727
#7742

Generalized Event Cameras

Varun Sundar, Matthew Dutson, Andrei Ardelean et al.

CVPR 2024posterarXiv:2407.02683
#7743

Dual Pose-invariant Embeddings: Learning Category and Object-specific Discriminative Representations for Recognition and Retrieval

Rohan Sarkar, Avinash Kak

CVPR 2024posterarXiv:2403.00272
#7744

NIVeL: Neural Implicit Vector Layers for Text-to-Vector Generation

Vikas Thamizharasan, Difan Liu, Matthew Fisher et al.

CVPR 2024posterarXiv:2405.15217
#7745

Hyperbolic Anomaly Detection

Huimin Li, Zhentao Chen, Yunhao Xu et al.

CVPR 2024poster
#7746

DanceCamera3D: 3D Camera Movement Synthesis with Music and Dance

Zixuan Wang, Jia Jia, Shikun Sun et al.

CVPR 2024posterarXiv:2403.13667
#7747

HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative

CONG MA, Qiao Lei, Chengkai Zhu et al.

CVPR 2024posterarXiv:2403.02640
#7748

What Sketch Explainability Really Means for Downstream Tasks?

Hmrishav Bandyopadhyay, Pinaki Nath Chowdhury, Ayan Kumar Bhunia et al.

CVPR 2024posterarXiv:2403.09480
#7749

CommonCanvas: Open Diffusion Models Trained on Creative-Commons Images

Aaron Gokaslan, A. Feder Cooper, Jasmine Collins et al.

CVPR 2024poster
#7750

Memory-based Adapters for Online 3D Scene Perception

Xiuwei Xu, Chong Xia, Ziwei Wang et al.

CVPR 2024posterarXiv:2403.06974
#7751

Cross-spectral Gated-RGB Stereo Depth Estimation

Samuel Brucker, Stefanie Walz, Mario Bijelic et al.

CVPR 2024highlightarXiv:2405.12759
#7752

Readout Guidance: Learning Control from Diffusion Features

Grace Luo, Trevor Darrell, Oliver Wang et al.

CVPR 2024highlightarXiv:2312.02150
#7753

SurroundSDF: Implicit 3D Scene Understanding Based on Signed Distance Field

Lizhe Liu, Bohua Wang, Hongwei Xie et al.

CVPR 2024highlightarXiv:2403.14366
#7754

MLIP: Enhancing Medical Visual Representation with Divergence Encoder and Knowledge-guided Contrastive Learning

Zhe Li, Laurence Yang, Bocheng Ren et al.

CVPR 2024posterarXiv:2402.02045
#7755

Density-Adaptive Model Based on Motif Matrix for Multi-Agent Trajectory Prediction

Di Wen, Haoran Xu, Zhaocheng He et al.

CVPR 2024poster
#7756

Towards Accurate Post-training Quantization for Diffusion Models

Changyuan Wang, Ziwei Wang, Xiuwei Xu et al.

CVPR 2024highlightarXiv:2305.18723
#7757

Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling

Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang et al.

CVPR 2024highlightarXiv:2406.03723
#7758

MultiDiff: Consistent Novel View Synthesis from a Single Image

Norman Müller, Katja Schwarz, Barbara Roessle et al.

CVPR 2024posterarXiv:2406.18524
#7759

Uncertainty-aware Action Decoupling Transformer for Action Anticipation

Hongji Guo, Nakul Agarwal, Shao-Yuan Lo et al.

CVPR 2024highlight
#7760

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Haoxin Chen, Yong Zhang, Xiaodong Cun et al.

CVPR 2024posterarXiv:2401.09047
#7761

EmoGen: Emotional Image Content Generation with Text-to-Image Diffusion Models

Jingyuan Yang, Jiawei Feng, Hui Huang

CVPR 2024posterarXiv:2401.04608
#7762

3D Facial Expressions through Analysis-by-Neural-Synthesis

George Retsinas, Panagiotis Filntisis, Radek Danecek et al.

CVPR 2024posterarXiv:2404.04104
#7763

Visual Layout Composer: Image-Vector Dual Diffusion Model for Design Layout Generation

Mohammad Amin Shabani, Zhaowen Wang, Difan Liu et al.

CVPR 2024poster
#7764

A Simple Recipe for Language-guided Domain Generalized Segmentation

Mohammad Fahes, TUAN-HUNG VU, Andrei Bursuc et al.

CVPR 2024posterarXiv:2311.17922
#7765

An Edit Friendly DDPM Noise Space: Inversion and Manipulations

Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli

CVPR 2024posterarXiv:2304.06140
#7766

Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection

Taeheon Kim, Sebin Shin, Youngjoon Yu et al.

CVPR 2024posterarXiv:2403.01300
#7767

GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation

Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani et al.

CVPR 2024posterarXiv:2404.06609
#7768

HRVDA: High-Resolution Visual Document Assistant

Chaohu Liu, Kun Yin, Haoyu Cao et al.

CVPR 2024posterarXiv:2404.06918
#7769

Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model

Runmin Dong, Shuai Yuan, Bin Luo et al.

CVPR 2024posterarXiv:2403.17460
#7770

Resolution Limit of Single-Photon LiDAR

Stanley H. Chan, Hashan K Weerasooriya, Weijian Zhang et al.

CVPR 2024posterarXiv:2403.17719
#7771

Generating Enhanced Negatives for Training Language-Based Object Detectors

Shiyu Zhao, Long Zhao, Vijay Kumar BG et al.

CVPR 2024posterarXiv:2401.00094
#7772

Object Recognition as Next Token Prediction

Kaiyu Yue, Bor-Chun Chen, Jonas Geiping et al.

CVPR 2024highlightarXiv:2312.02142
#7773

MuGE: Multiple Granularity Edge Detection

Caixia Zhou, Yaping Huang, Mengyang Pu et al.

CVPR 2024poster
#7774

LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning

Siyuan Cheng, Guanhong Tao, Yingqi Liu et al.

CVPR 2024posterarXiv:2403.17188
#7775

Unsupervised Salient Instance Detection

Xin Tian, Ke Xu, Rynson W.H. Lau

CVPR 2024poster
#7776

Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception

Junwen He, Yifan Wang, Lijun Wang et al.

CVPR 2024highlightarXiv:2403.02969
#7777

PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation

Yuqi Wang, Yuntao Chen, Xingyu Liao et al.

CVPR 2024posterarXiv:2306.10013
#7778

XFibrosis: Explicit Vessel-Fiber Modeling for Fibrosis Staging from Liver Pathology Images

CHONG YIN, Siqi Liu, Fei Lyu et al.

CVPR 2024poster
#7779

Discriminative Pattern Calibration Mechanism for Source-Free Domain Adaptation

Haifeng Xia, Siyu Xia, Zhengming Ding

CVPR 2024poster
#7780

RAM-Avatar: Real-time Photo-Realistic Avatar from Monocular Videos with Full-body Control

xiang deng, Zerong Zheng, Yuxiang Zhang et al.

CVPR 2024poster
#7781

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Zhen Li, Mingdeng Cao, Xintao Wang et al.

CVPR 2024posterarXiv:2312.04461
#7782

3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling

Chaokang Jiang, Guangming Wang, Jiuming Liu et al.

CVPR 2024posterarXiv:2402.18146
#7783

CPR-Coach: Recognizing Composite Error Actions based on Single-class Training

Shunli Wang, Shuaibing Wang, Dingkang Yang et al.

CVPR 2024posterarXiv:2309.11718
#7784

Restoration by Generation with Constrained Priors

Zheng Ding, Xuaner Zhang, Zhuowen Tu et al.

CVPR 2024highlightarXiv:2312.17161
#7785

Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2024posterarXiv:2311.17112
#7786

Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering

Zhaohe Liao, Jiangtong Li, Li Niu et al.

CVPR 2024posterarXiv:2407.03008
#7787

Communication-Efficient Collaborative Perception via Information Filling with Codebook

Yue Hu, Juntong Peng, Sifei Liu et al.

CVPR 2024posterarXiv:2405.04966
#7788

QUADify: Extracting Meshes with Pixel-level Details and Materials from Images

Maximilian Frühauf, Hayko Riemenschneider, Markus Gross et al.

CVPR 2024highlight
#7789

Focus on Hiders: Exploring Hidden Threats for Enhancing Adversarial Training

Qian Li, Yuxiao Hu, Yinpeng Dong et al.

CVPR 2024posterarXiv:2312.07067
#7790

Any-Shift Prompting for Generalization over Distributions

Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani et al.

CVPR 2024posterarXiv:2402.10099
#7791

Revisiting Counterfactual Problems in Referring Expression Comprehension

Zhihan Yu, Ruifan Li

CVPR 2024poster
#7792

VMINer: Versatile Multi-view Inverse Rendering with Near- and Far-field Light Sources

Fan Fei, Jiajun Tang, Ping Tan et al.

CVPR 2024highlight
#7793

Generating Non-Stationary Textures using Self-Rectification

Yang Zhou, Rongjun Xiao, Dani Lischinski et al.

CVPR 2024posterarXiv:2401.02847
#7794

OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising

Haichao Zhang, Yi Xu, Hongsheng Lu et al.

CVPR 2024posterarXiv:2404.02227
#7795

Rethinking Interactive Image Segmentation with Low Latency High Quality and Diverse Prompts

Qin Liu, Jaemin Cho, Mohit Bansal et al.

CVPR 2024posterarXiv:2404.00741
#7796

NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation

Minh-Tuan Tran, Trung Le, Xuan-May Le et al.

CVPR 2024posterarXiv:2310.00258
#7797

Revamping Federated Learning Security from a Defender's Perspective: A Unified Defense with Homomorphic Encrypted Data Space

Naveen Kumar Kummari, Reshmi Mitra, Krishna Mohan Chalavadi

CVPR 2024poster
#7798

PanoRecon: Real-Time Panoptic 3D Reconstruction from Monocular Video

Dong Wu, Zike Yan, Hongbin Zha

CVPR 2024poster
#7799

TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding

Yun Liu, Haolin Yang, Xu Si et al.

CVPR 2024posterarXiv:2401.08399
#7800

Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning

Rongjie Li, Yu Wu, Xuming He

CVPR 2024posterarXiv:2404.00909