CVPR Papers


Inversion-Free Image Editing with Language-Guided Diffusion Models

Sihan Xu, Yidong Huang, Jiayi Pan et al.

CVPR 2024 • poster • 32 citations

Investigating and Mitigating the Side Effects of Noisy Views for Self-Supervised Clustering Algorithms in Practical Multi-View Scenarios

Jie Xu, Yazhou Ren, Xiaolong Wang et al.

CVPR 2024 • poster • arXiv:2303.17245 • 41 citations

Investigating Compositional Challenges in Vision-Language Models for Visual Grounding

Yunan Zeng, Yan Huang, Jinjin Zhang et al.

CVPR 2024 • highlight

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

Yushuang Wu, Luyue Shi, Junhao Cai et al.

CVPR 2024 • highlight • arXiv:2404.00269

IQ-VFI: Implicit Quadratic Motion Estimation for Video Frame Interpolation

Mengshun Hu, Kui Jiang, Zhihang Zhong et al.

CVPR 2024 • poster

IReNe: Instant Recoloring of Neural Radiance Fields

Alessio Mazzucchelli, Adrian Garcia-Garcia, Elena Garces et al.

CVPR 2024 • poster • arXiv:2405.19876

Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?

Zhiqi Li, Zhiding Yu, Shiyi Lan et al.

CVPR 2024 • poster • arXiv:2312.03031 • 169 citations

IS-Fusion: Instance-Scene Collaborative Fusion for Multimodal 3D Object Detection

Junbo Yin, Wenguan Wang, Runnan Chen et al.

CVPR 2024 • highlight • arXiv:2403.15241 • 81 citations

Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?

Hanxin Zhu, Tianyu He, Xin Li et al.

CVPR 2024 • poster • arXiv:2403.06092

Iterated Learning Improves Compositionality in Large Vision-Language Models

Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi et al.

CVPR 2024 • poster • arXiv:2404.02145 • 16 citations

iToF-flow-based High Frame Rate Depth Imaging

Yu Meng, Zhou Xue, Xu Chang et al.

CVPR 2024 • poster

It's All About Your Sketch: Democratising Sketch Control in Diffusion Models

Subhadeep Koley, Ayan Kumar Bhunia, Deeptanshu Sekhri et al.

CVPR 2024 • poster • arXiv:2403.07234 • 28 citations

Jack of All Tasks, Master of Many: Designing General-Purpose Coarse-to-Fine Vision-Language Model

Shraman Pramanick, Guangxing Han, Rui Hou et al.

CVPR 2024 • highlight • arXiv:2312.12423 • 50 citations

JDEC: JPEG Decoding via Enhanced Continuous Cosine Coefficients

Woo Kyoung Han, Sunghoon Im, Jaedeok Kim et al.

CVPR 2024 • poster • arXiv:2404.05558 • 3 citations

JeDi: Joint-Image Diffusion Models for Finetuning-Free Personalized Text-to-Image Generation

Yu Zeng, Vishal M. Patel, Haochen Wang et al.

CVPR 2024 • poster • arXiv:2407.06187 • 47 citations

JoAPR: Cleaning the Lens of Prompt Learning for Vision-Language Models

Yuncheng Guo, Xiaodong Gu

CVPR 2024 • poster

Joint2Human: High-Quality 3D Human Generation via Compact Spherical Embedding of 3D Joints

Muxin Zhang, Qiao Feng, Zhuo Su et al.

CVPR 2024 • poster • arXiv:2312.08591

Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment

Alireza Ganjdanesh, Shangqian Gao, Heng Huang

CVPR 2024 • poster • arXiv:2403.19490 • 15 citations

Joint Reconstruction of 3D Human and Object via Contact-Based Refinement Transformer

Hyeongjin Nam, Daniel Jung, Gyeongsik Moon et al.

CVPR 2024 • poster • arXiv:2404.04819 • 19 citations

JointSQ: Joint Sparsification-Quantization for Distributed Learning

Weiying Xie, Haowei Li, Ma Jitao et al.

CVPR 2024 • poster

Joint-Task Regularization for Partially Labeled Multi-Task Learning

Kento Nishi, Junsik Kim, Wanhua Li et al.

CVPR 2024 • poster • arXiv:2404.01976

JRDB-PanoTrack: An Open-world Panoptic Segmentation and Tracking Robotic Dataset in Crowded Human Environments

Duy Tho Le, Chenhui Gou, Stavya Datta et al.

CVPR 2024 • poster • arXiv:2404.01686

JRDB-Social: A Multifaceted Robotic Dataset for Understanding of Context and Dynamics of Human Interactions Within Social Groups

Simindokht Jahangard, Zhixi Cai, Shiki Wen et al.

CVPR 2024 • poster • arXiv:2404.04458 • 13 citations

Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living

Dominick Reilly, Srijan Das

CVPR 2024 • poster

Kandinsky Conformal Prediction: Efficient Calibration of Image Segmentation Algorithms

Joren Brunekreef, Eric Marcus, Ray Sheombarsing et al.

CVPR 2024 • poster • arXiv:2311.11837 • 15 citations

KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling

Yu Wang, Xin Li, Shengzhao Wen et al.

CVPR 2024 • poster • arXiv:2211.08071 • 16 citations

Kernel Adaptive Convolution for Scene Text Detection via Distance Map Prediction

Jinzhi Zheng, Heng Fan, Libo Zhang

CVPR 2024 • poster

KeyPoint Relative Position Encoding for Face Recognition

Minchul Kim, Feng Liu, Yiyang Su et al.

CVPR 2024 • poster • arXiv:2403.14852

KITRO: Refining Human Mesh by 2D Clues and Kinematic-tree Rotation

Fengyuan Yang, Kerui Gu, Angela Yao

CVPR 2024 • poster • arXiv:2405.19833 • 7 citations

Knowledge-Enhanced Dual-stream Zero-shot Composed Image Retrieval

Yucheng Suo, Fan Ma, Linchao Zhu et al.

CVPR 2024 • poster • arXiv:2403.16005 • 45 citations

Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

Rui Li, Tobias Fischer, Mattia Segu et al.

CVPR 2024 • poster • arXiv:2404.03658

Koala: Key Frame-Conditioned Long Video-LLM

Reuben Tan, Ximeng Sun, Ping Hu et al.

CVPR 2024 • highlight • arXiv:2404.04346 • 62 citations

KPConvX: Modernizing Kernel Point Convolution with Kernel Attention

Hugues Thomas, Yao-Hung Hubert Tsai, Timothy Barfoot et al.

CVPR 2024 • poster • arXiv:2405.13194

KP-RED: Exploiting Semantic Keypoints for Joint 3D Shape Retrieval and Deformation

Ruida Zhang, Chenyangguang Zhang, Yan Di et al.

CVPR 2024 • poster • arXiv:2403.10099 • 3 citations

KTPFormer: Kinematics and Trajectory Prior Knowledge-Enhanced Transformer for 3D Human Pose Estimation

Jihua Peng, Yanghong Zhou, Tracy P Y Mok

CVPR 2024 • poster • arXiv:2404.00658

KVQ: Kwai Video Quality Assessment for Short-form Videos

Yiting Lu, Xin Li, Yajing Pei et al.

CVPR 2024 • poster • arXiv:2402.07220

L0-Sampler: An L0 Model Guided Volume Sampling for NeRF

Liangchen Li, Juyong Zhang

CVPR 2024 • poster

L2B: Learning to Bootstrap Robust Models for Combating Label Noise

Yuyin Zhou, Xianhang Li, Fengze Liu et al.

CVPR 2024 • poster • arXiv:2202.04291

L4D-Track: Language-to-4D Modeling Towards 6-DoF Tracking and Shape Reconstruction in 3D Point Cloud Stream

Jingtao Sun, Yaonan Wang, Mingtao Feng et al.

CVPR 2024 • poster

LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection

Dat Nguyen, Nesryne Mejri, Inder Pal Singh et al.

CVPR 2024 • poster • arXiv:2401.13856

Label-Efficient Group Robustness via Out-of-Distribution Concept Curation

Yiwei Yang, Anthony Liu, Robert Wolfe et al.

CVPR 2024 • poster

Label Propagation for Zero-shot Classification with Vision-Language Models

Vladan Stojnić, Yannis Kalantidis, Giorgos Tolias

CVPR 2024 • poster • arXiv:2404.04072

LAENeRF: Local Appearance Editing for Neural Radiance Fields

Lukas Radl, Michael Steiner, Andreas Kurz et al.

CVPR 2024 • poster • arXiv:2312.09913

LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition

Zhonglin Sun, Chen Feng, Ioannis Patras et al.

CVPR 2024 • poster • arXiv:2403.08161

LAKE-RED: Camouflaged Images Generation by Latent Background Knowledge Retrieval-Augmented Diffusion

Pancheng Zhao, Peng Xu, Pengda Qin et al.

CVPR 2024 • poster • arXiv:2404.00292 • 19 citations

LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs

Yunsheng Ma, Can Cui, Xu Cao et al.

CVPR 2024 • poster • arXiv:2312.04372

LAMP: Learn A Motion Pattern for Few-Shot Video Generation

Rui-Qi Wu, Liangyu Chen, Tong Yang et al.

CVPR 2024 • poster

Lane2Seq: Towards Unified Lane Detection via Sequence Generation

Kunyang Zhou

CVPR 2024 • poster • arXiv:2402.17172

LaneCPP: Continuous 3D Lane Detection using Physical Priors

Maximilian Pittner, Joel Janai, Alexandru Paul Condurache

CVPR 2024 • poster • arXiv:2406.08381 • 28 citations

LangSplat: 3D Language Gaussian Splatting

Minghan Qin, Wanhua Li, Jiawei Zhou et al.

CVPR 2024 • highlight • arXiv:2312.16084