Most Cited CVPR "neural network analysis" Papers

5,589 papers found • Page 21 of 28

#4001

Robotic Visual Instruction

Yanbang Li, ZiYang Gong, Haoyang Li et al.

CVPR 2025 • poster • arXiv:2505.00693
#4002

Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data

Haoxin Li, Boyang Li

CVPR 2025 • poster • arXiv:2503.01167
#4003

AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes

Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.

CVPR 2025 • poster
#4004

Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems

Alejandro Castañeda Garcia, Jan Warchocki, Jan van Gemert et al.

CVPR 2025 • poster • arXiv:2410.01376
#4005

DiffusionSfM: Predicting Structure and Motion via Ray Origin and Endpoint Diffusion

Qitao Zhao, Amy Lin, Jeff Tan et al.

CVPR 2025 • poster • arXiv:2505.05473
#4006

Navigating Image Restoration with VAR’s Distribution Alignment Prior

Siyang Wang, Naishan Zheng, Jie Huang et al.

CVPR 2025 • poster • arXiv:2412.21063
#4007

Neuron: Learning Context-Aware Evolving Representations for Zero-Shot Skeleton Action Recognition

Yang Chen, Jingcai Guo, Song Guo et al.

CVPR 2025 • poster • arXiv:2411.11288
#4008

Two by Two: Learning Multi-Task Pairwise Objects Assembly for Generalizable Robot Manipulation

Yu Qi, Yuanchen Ju, Tianming Wei et al.

CVPR 2025 • poster • arXiv:2504.06961
#4009

ALIEN: Implicit Neural Representations for Human Motion Prediction under Arbitrary Latency

Dong Wei, Xiaoning Sun, Xizhan Gao et al.

CVPR 2025 • highlight
#4010

Decentralized Diffusion Models

David McAllister, Matthew Tancik, Jiaming Song et al.

CVPR 2025 • poster • arXiv:2501.05450
#4011

TinyFusion: Diffusion Transformers Learned Shallow

Gongfan Fang, Kunjun Li, Xinyin Ma et al.

CVPR 2025 • highlight • arXiv:2412.01199
#4012

Poly-Autoregressive Prediction for Modeling Interactions

Neerja Thakkar, Tara Sadjadpour, Jathushan Rajasegaran et al.

CVPR 2025 • poster • arXiv:2502.08646
#4013

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier et al.

CVPR 2025 • poster • arXiv:2502.10060
#4014

UVGS: Reimagining Unstructured 3D Gaussian Splatting using UV Mapping

Aashish Rai, Dilin Wang, Mihir Jain et al.

CVPR 2025 • poster • arXiv:2502.01846
#4015

Z-Magic: Zero-shot Multiple Attributes Guided Image Creator

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2025 • poster • arXiv:2503.12124
#4016

Science-T2I: Addressing Scientific Illusions in Image Synthesis

Jialuo Li, Wenhao Chai, Xingyu Fu et al.

CVPR 2025 • poster • arXiv:2504.13129
#4017

Minding Fuzzy Regions: A Data-driven Alternating Learning Paradigm for Stable Lesion Segmentation

Lexin Fang, Yunyang Xu, Xiang Ma et al.

CVPR 2025 • poster • arXiv:2503.11140
#4018

LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Xiaoyan Xing, Konrad Groh, Sezer Karaoglu et al.

CVPR 2025 • poster • arXiv:2412.00177
#4019

ProjAttacker: A Configurable Physical Adversarial Attack for Face Recognition via Projector

Yuanwei Liu, Hui Wei, Chengyu Jia et al.

CVPR 2025 • poster
#4020

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Guotao Liang, Baoquan Zhang, Zhiyuan Wen et al.

CVPR 2025 • highlight • arXiv:2503.01261
#4021

StyleMaster: Stylize Your Video with Artistic Generation and Translation

Zixuan Ye, Huijuan Huang, Xintao Wang et al.

CVPR 2025 • poster • arXiv:2412.07744
#4022

BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing

Yunqi Gu, Ian Huang, Jihyeon Je et al.

CVPR 2025 • highlight • arXiv:2504.01786
#4023

Accelerating Diffusion Transformer via Increment-Calibrated Caching with Channel-Aware Singular Value Decomposition

Zhiyuan Chen, Keyi Li, Yifan Jia et al.

CVPR 2025 • poster • arXiv:2505.05829
#4024

RoomPainter: View-Integrated Diffusion for Consistent Indoor Scene Texturing

Zhipeng Huang, Wangbo Yu, Xinhua Cheng et al.

CVPR 2025 • poster • arXiv:2412.16778
#4025

A Simple Data Augmentation for Feature Distribution Skewed Federated Learning

Yunlu Yan, Huazhu Fu, Yuexiang Li et al.

CVPR 2025 • poster • arXiv:2306.09363
#4026

Graph-Embedded Structure-Aware Perceptual Hashing for Neural Network Protection and Piracy Detection

Ruiheng Liu, Haozhe Chen, Boyao Zhao et al.

CVPR 2025 • poster
#4027

Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation

Dingcheng Zhen, Shunshun Yin, Shiyang Qin et al.

CVPR 2025 • poster • arXiv:2503.18429
#4028

K-LoRA: Unlocking Training-Free Fusion of Any Subject and Style LoRAs

Ziheng Ouyang, Zhen Li, Qibin Hou

CVPR 2025 • poster • arXiv:2502.18461
#4029

Less is More: Efficient Model Merging with Binary Task Switch

Biqing Qi, Fangyuan Li, Zhen Wang et al.

CVPR 2025 • highlight • arXiv:2412.00054
#4030

Cross-Modal 3D Representation with Multi-View Images and Point Clouds

Ziyang Zhou, Pinghui Wang, Zi Liang et al.

CVPR 2025 • poster
#4031

HiPART: Hierarchical Pose AutoRegressive Transformer for Occluded 3D Human Pose Estimation

Hongwei Zheng, Han Li, Wenrui Dai et al.

CVPR 2025 • poster • arXiv:2503.23331
#4032

Exploring Contextual Attribute Density in Referring Expression Counting

Zhicheng Wang, Zhiyu Pan, Zhan Peng et al.

CVPR 2025 • poster • arXiv:2503.12460
#4033

FSHNet: Fully Sparse Hybrid Network for 3D Object Detection

Shuai Liu, Mingyue Cui, Boyang Li et al.

CVPR 2025 • poster • arXiv:2506.03714
#4034

DiverseFlow: Sample-Efficient Diverse Mode Coverage in Flows

Mashrur M. Morshed, Vishnu Naresh Boddeti

CVPR 2025 • poster • arXiv:2504.07894
#4035

Learning Compatible Multi-Prize Subnetworks for Asymmetric Retrieval

Yushuai Sun, Zikun Zhou, Dongmei Jiang et al.

CVPR 2025 • poster • arXiv:2504.11879
#4036

Opportunistic Single-Photon Time of Flight

Sotiris Nousias, Mian Wei, Howard Xiao et al.

CVPR 2025 • poster
#4037

Volumetric Surfaces: Representing Fuzzy Geometries with Layered Meshes

Stefano Esposito, Anpei Chen, Christian Reiser et al.

CVPR 2025 • poster • arXiv:2409.02482
#4038

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Hanwen Jiang, Zexiang Xu, Desai Xie et al.

CVPR 2025 • poster • arXiv:2412.14166
#4039

Pose Priors from Language Models

Sanjay Subramanian, Evonne Ng, Lea Müller et al.

CVPR 2025 • poster • arXiv:2405.03689
#4040

Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction

Seungtae Nam, Xiangyu Sun, Gyeongjin Kang et al.

CVPR 2025 • highlight • arXiv:2412.06234
#4041

Towards Optimizing Large-Scale Multi-Graph Matching in Bioimaging

Max Kahl, Sebastian Stricker, Lisa Hutschenreiter et al.

CVPR 2025 • poster
#4042

Image Quality Assessment: From Human to Machine Preference

Chunyi Li, Yuan Tian, Xiaoyue Ling et al.

CVPR 2025 • highlight • arXiv:2503.10078
#4043

MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision

Ruicheng Wang, Sicheng Xu, Cassie Lee Dai et al.

CVPR 2025 • poster • arXiv:2410.19115
#4044

DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

Bencheng Liao, Shaoyu Chen, Haoran Yin et al.

CVPR 2025 • highlight • arXiv:2411.15139
#4045

Higher-Order Ratio Cycles for Fast and Globally Optimal Shape Matching

Paul Roetzer, Viktoria Ehm, Daniel Cremers et al.

CVPR 2025 • poster
#4046

Acc3D: Accelerating Single Image to 3D Diffusion Models via Edge Consistency Guided Score Distillation

Kendong Liu, Zhiyu Zhu, Hui Liu et al.

CVPR 2025 • poster • arXiv:2503.15975
#4047

ReDiffDet: Rotation-equivariant Diffusion Model for Oriented Object Detection

Jiaqi Zhao, Zeyu Ding, Yong Zhou et al.

CVPR 2025 • poster
#4048

SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs

Junsheng Wang, Nieqing Cao, Yan Ding et al.

CVPR 2025 • poster
#4049

Continuous Adverse Weather Removal via Degradation-Aware Distillation

Xin Lu, Jie Xiao, Yurui Zhu et al.

CVPR 2025 • poster
#4050

Subspace Constraint and Contribution Estimation for Heterogeneous Federated Learning

Xiangtao Zhang, Sheng Li, Ao Li et al.

CVPR 2025 • poster
#4051

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios

Kai Wang, Zekai Li, Zhi-Qi Cheng et al.

CVPR 2025 • poster • arXiv:2410.17193
#4052

Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Pengcheng Xu, Boyuan Jiang, Xiaobin Hu et al.

CVPR 2025 • poster • arXiv:2411.15843
#4053

SeqMvRL: A Sequential Fusion Framework for Multi-view Representation Learning

Ren Wang, Haoliang Sun, Yuxiu Lin et al.

CVPR 2025 • poster
#4054

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Yicheng Chen, Xiangtai Li, Yining Li et al.

CVPR 2025 • poster • arXiv:2406.20085
#4055

Structure-from-Motion with a Non-Parametric Camera Model

Yihan Wang, Linfei Pan, Marc Pollefeys et al.

CVPR 2025 • highlight
#4056

Sea-ing in Low-light

Nisha Varghese, A. N. Rajagopalan

CVPR 2025 • poster
#4057

Towards Autonomous Micromobility through Scalable Urban Simulation

Wayne Wu, Honglin He, Chaoyuan Zhang et al.

CVPR 2025 • highlight • arXiv:2505.00690
#4058

Learning-enabled Polynomial Lyapunov Function Synthesis via High-Accuracy Counterexample-Guided Framework

Hanrui Zhao, Niuniu Qi, Mengxin Ren et al.

CVPR 2025 • poster
#4059

Advancing Adversarial Robustness in GNeRFs: The IL2-NeRF Attack

Nicole Meng, Caleb Manicke, Ronak Sahu et al.

CVPR 2025 • poster
#4060

Spiking Transformer: Introducing Accurate Addition-Only Spiking Self-Attention for Transformer

Yufei Guo, Xiaode Liu, Yuanpei Chen et al.

CVPR 2025 • poster
#4061

NoiseCtrl: A Sampling-Algorithm-Agnostic Conditional Generation Method for Diffusion Models

Longquan Dai, He Wang, Jinhui Tang

CVPR 2025 • poster
#4062

PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models

Junhyuk So, Jiwoong Shin, Chaeyeon Jang et al.

CVPR 2025 • poster • arXiv:2503.19731
#4063

Towards Precise Scaling Laws for Video Diffusion Transformers

Yuanyang Yin, Yaqi Zhao, Mingwu Zheng et al.

CVPR 2025 • poster • arXiv:2411.17470
#4064

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation

Xiaozhong Ji, Xiaobin Hu, Zhihong Xu et al.

CVPR 2025 • poster • arXiv:2411.16331
#4065

T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

Changsheng Lv, Mengshi Qi, Liang Liu et al.

CVPR 2025 • poster • arXiv:2411.18894
#4066

MODfinity: Unsupervised Domain Adaptation with Multimodal Information Flow Intertwining

Shanglin Liu, Jianming Lv, Jingdan Kang et al.

CVPR 2025 • poster
#4067

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

Zelin Li, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight • arXiv:2503.02261
#4068

DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking

Mingzhe Guo, Weiping Tan, Wenyu Ran et al.

CVPR 2025 • poster
#4069

Learned Image Compression with Dictionary-based Entropy Model

Jingbo Lu, Leheng Zhang, Xingyu Zhou et al.

CVPR 2025 • poster • arXiv:2504.00496
#4070

Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training?

Yuechen Xie, Jie Song, Huiqiong Wang et al.

CVPR 2025 • poster • arXiv:2503.09122
#4071

Shadow Generation Using Diffusion Model with Geometry Prior

Haonan Zhao, Qingyang Liu, Xinhao Tao et al.

CVPR 2025 • poster
#4072

How to Merge Your Multimodal Models Over Time?

Sebastian Dziadzio, Vishaal Udandarao, Karsten Roth et al.

CVPR 2025 • poster • arXiv:2412.06712
#4073

Active Hyperspectral Imaging Using an Event Camera

Bohan Yu, Jinxiu Liang, Zhuofeng Wang et al.

CVPR 2025 • highlight
#4074

Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression

Lucas Relic, Roberto Azevedo, Yang Zhang et al.

CVPR 2025 • poster • arXiv:2504.02579
#4075

InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception

Haijie Li, Yanmin Wu, Jiarui Meng et al.

CVPR 2025 • poster • arXiv:2411.19235
#4076

Online Task-Free Continual Learning via Dynamic Expansionable Memory Distribution

Fei Ye, Adrian Bors

CVPR 2025 • poster
#4077

Plug-and-Play Interpretable Responsible Text-to-Image Generation via Dual-Space Multi-facet Concept Control

Basim Azam, Naveed Akhtar

CVPR 2025 • poster • arXiv:2503.18324
#4078

Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals

Changhao Peng

CVPR 2025 • poster • arXiv:2506.09510
#4079

HSI-GPT: A General-Purpose Large Scene-Motion-Language Model for Human Scene Interaction

Yuan Wang, Yali Li, Lixiang Li et al.

CVPR 2025 • highlight
#4080

SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal

Xinrui Wang, Lanqing Guo, Xiyu Wang et al.

CVPR 2025 • poster • arXiv:2409.07041
#4081

Towards Satellite Image Road Graph Extraction: A Global-Scale Dataset and A Novel Method

Pan Yin, Kaiyu Li, Xiangyong Cao et al.

CVPR 2025 • poster • arXiv:2411.16733
#4082

ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning

Kailin Li, Puhao Li, Tengyu Liu et al.

CVPR 2025 • poster • arXiv:2503.21860
#4083

COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting

Jiaxin Zhang, Junjun Jiang, Youyu Chen et al.

CVPR 2025 • poster • arXiv:2503.19443
#4084

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

Jieming Cui, Tengyu Liu, Ziyu Meng et al.

CVPR 2025 • poster • arXiv:2504.04191
#4085

OPTICAL: Leveraging Optimal Transport for Contribution Allocation in Dataset Distillation

Xiao Cui, Yulei Qin, Wengang Zhou et al.

CVPR 2025 • highlight
#4086

Incremental Object Keypoint Learning

Mingfu Liang, Jiahuan Zhou, Xu Zou et al.

CVPR 2025 • poster • arXiv:2503.20248
#4087

SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model

Shuhan Tan, John Wheatley Lambert, Hong Jeon et al.

CVPR 2025 • poster • arXiv:2506.21976
#4088

Learning Extremely High Density Crowds as Active Matters

Feixiang He, Jiangbei Yue, Jialin Zhu et al.

CVPR 2025 • poster • arXiv:2503.12168
#4089

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

Junha Lee, Chunghyun Park, Jaesung Choe et al.

CVPR 2025 • poster • arXiv:2502.02548
#4090

MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

Zifan Wang, Ziqing Chen, Junyu Chen et al.

CVPR 2025 • poster • arXiv:2501.04595
#4091

Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body

Zeqing Wang, Qingyang Ma, Wentao Wan et al.

CVPR 2025 • highlight • arXiv:2411.14205
#4092

Shape and Texture: What Influences Reliable Optical Flow Estimation?

Libo Long, Xiao Hu, Jochen Lang

CVPR 2025 • poster
#4093

Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation

Tal Zeevi, Ravid Shwartz-Ziv, Yann LeCun et al.

CVPR 2025 • poster • arXiv:2412.07169
#4094

RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins

Yao Mu, Tianxing Chen, Zanxin Chen et al.

CVPR 2025 • highlight • arXiv:2504.13059
#4095

Beyond Background Shift: Rethinking Instance Replay in Continual Semantic Segmentation

Hongmei Yin, Tingliang Feng, Fan Lyu et al.

CVPR 2025 • poster • arXiv:2503.22136
#4096

DKC: Differentiated Knowledge Consolidation for Cloth-Hybrid Lifelong Person Re-identification

Zhenyu Cui, Jiahuan Zhou, Yuxin Peng

CVPR 2025 • poster
#4097

Rectification-specific Supervision and Constrained Estimator for Online Stereo Rectification

Rui Gong, Kim-Hui Yap, Weide Liu et al.

CVPR 2025 • poster
#4098

Dual Focus-Attention Transformer for Robust Point Cloud Registration

Kexue Fu, Ming'zhi Yuan, Changwei Wang et al.

CVPR 2025 • poster
#4099

Traversing Distortion-Perception Tradeoff using a Single Score-Based Generative Model

Yuhan Wang, Suzhi Bi, Ying-Jun Angela Zhang et al.

CVPR 2025 • poster • arXiv:2503.20297
#4100

SLVR: Super-Light Visual Reconstruction via Blueprint Controllable Convolutions and Exploring Feature Diversity Representation

Ning Ni, Libao Zhang

CVPR 2025 • poster
#4101

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Haoyang He, Jiangning Zhang, Yuxuan Cai et al.

CVPR 2025 • poster • arXiv:2411.15941
#4102

Learning Endogenous Attention for Incremental Object Detection

Xiang Song, Yuhang He, Jingyuan Li et al.

CVPR 2025 • poster
#4103

Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation

Fangyun Wei, Jinjing Zhao, Kun Yan et al.

CVPR 2025 • poster
#4104

DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image

Hyeongjin Nam, Donghwan Kim, Jeongtaek Oh et al.

CVPR 2025 • poster • arXiv:2503.19373
#4105

Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation

Taeyoung Yun, Dinghuai Zhang, Jinkyoo Park et al.

CVPR 2025 • poster • arXiv:2502.11477
#4106

Perceptual Inductive Bias Is What You Need Before Contrastive Learning

Junru Zhao, Tianqin Li, Dunhan Jiang et al.

CVPR 2025 • poster • arXiv:2506.01201
#4107

Diffusion Self-Distillation for Zero-Shot Customized Image Generation

Shengqu Cai, Eric Ryan Chan, Yunzhi Zhang et al.

CVPR 2025 • poster • arXiv:2411.18616
#4108

EnvPoser: Environment-aware Realistic Human Motion Estimation from Sparse Observations with Uncertainty Modeling

Songpengcheng Xia, Yu Zhang, Zhuo Su et al.

CVPR 2025 • poster • arXiv:2412.10235
#4109

HOIAnimator: Generating Text-prompt Human-object Animations using Novel Perceptive Diffusion Models

Wenfeng Song, Xinyu Zhang, Shuai Li et al.

CVPR 2024 • poster
#4110

HDQMF: Holographic Feature Decomposition Using Quantum Algorithms

Prathyush Poduval, Zhuowen Zou, Mohsen Imani

CVPR 2024 • poster
#4111

DrivingGaussian: Composite Gaussian Splatting for Surrounding Dynamic Autonomous Driving Scenes

Xiaoyu Zhou, Zhiwei Lin, Xiaojun Shan et al.

CVPR 2024 • poster • arXiv:2312.07920
#4112

PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization

Xu Peng, Junwei Zhu, Boyuan Jiang et al.

CVPR 2024 • poster • arXiv:2312.06354
#4113

Dr.Hair: Reconstructing Scalp-Connected Hair Strands without Pre-Training via Differentiable Rendering of Line Segments

Yusuke Takimoto, Hikari Takehara, Hiroyuki Sato et al.

CVPR 2024 • highlight • arXiv:2403.17496
#4114

Cache Me if You Can: Accelerating Diffusion Models through Block Caching

Felix Wimbauer, Bichen Wu, Edgar Schoenfeld et al.

CVPR 2024 • poster • arXiv:2312.03209
#4115

SparseOcc: Rethinking Sparse Latent Representation for Vision-Based Semantic Occupancy Prediction

Pin Tang, Zhongdao Wang, Guoqing Wang et al.

CVPR 2024 • poster • arXiv:2404.09502
#4116

Beyond First-Order Tweedie: Solving Inverse Problems using Latent Diffusion

Litu Rout, Yujia Chen, Abhishek Kumar et al.

CVPR 2024 • poster • arXiv:2312.00852
#4117

MAPLM: A Real-World Large-Scale Vision-Language Benchmark for Map and Traffic Scene Understanding

Xu Cao, Tong Zhou, Yunsheng Ma et al.

CVPR 2024 • poster
#4118

Customize your NeRF: Adaptive Source Driven 3D Scene Editing via Local-Global Iterative Training

Runze He, Shaofei Huang, Xuecheng Nie et al.

CVPR 2024 • poster • arXiv:2312.01663
#4119

Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft

Hao Li, Xue Yang, Zhaokai Wang et al.

CVPR 2024 • poster • arXiv:2312.09238
#4120

Direct2.5: Diverse Text-to-3D Generation via Multi-view 2.5D Diffusion

Yuanxun Lu, Jingyang Zhang, Shiwei Li et al.

CVPR 2024 • poster • arXiv:2311.15980
#4121

Gradient-based Parameter Selection for Efficient Fine-Tuning

Zhi Zhang, Qizhe Zhang, Zijun Gao et al.

CVPR 2024 • poster • arXiv:2312.10136
#4122

HPL-ESS: Hybrid Pseudo-Labeling for Unsupervised Event-based Semantic Segmentation

Linglin Jing, Yiming Ding, Yunpeng Gao et al.

CVPR 2024 • poster • arXiv:2403.16788
#4123

Promptable Behaviors: Personalizing Multi-Objective Rewards from Human Preferences

Minyoung Hwang, Luca Weihs, Chanwoo Park et al.

CVPR 2024 • poster • arXiv:2312.09337
#4124

Fourier Priors-Guided Diffusion for Zero-Shot Joint Low-Light Enhancement and Deblurring

Xiaoqian Lv, Shengping Zhang, Chenyang Wang et al.

CVPR 2024 • poster
#4125

From Isolated Islands to Pangea: Unifying Semantic Space for Human Action Understanding

Yonglu Li, Xiaoqian Wu, Xinpeng Liu et al.

CVPR 2024 • highlight • arXiv:2304.00553
#4126

LowRankOcc: Tensor Decomposition and Low-Rank Recovery for Vision-based 3D Semantic Occupancy Prediction

Linqing Zhao, Xiuwei Xu, Ziwei Wang et al.

CVPR 2024 • poster
#4127

UniDepth: Universal Monocular Metric Depth Estimation

Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis et al.

CVPR 2024 • highlight • arXiv:2403.18913
#4128

Small Steps and Level Sets: Fitting Neural Surface Models with Point Guidance

Chamin Hewa Koneputugodage, Yizhak Ben-Shabat, Dylan Campbell et al.

CVPR 2024 • poster
#4129

Adapt or Perish: Adaptive Sparse Transformer with Attentive Feature Refinement for Image Restoration

Shihao Zhou, Duosheng Chen, Jinshan Pan et al.

CVPR 2024 • poster
#4130

3D Human Pose Perception from Egocentric Stereo Videos

Hiroyasu Akada, Jian Wang, Vladislav Golyanik et al.

CVPR 2024 • highlight • arXiv:2401.00889
#4131

Check Locate Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation

Biao Gong, Siteng Huang, Yutong Feng et al.

CVPR 2024 • poster
#4132

Volumetric Environment Representation for Vision-Language Navigation

Liu, Wenguan Wang, Yi Yang

CVPR 2024 • highlight • arXiv:2403.14158
#4133

Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation

Jiaming Liu, Ran Xu, Senqiao Yang et al.

CVPR 2024 • poster • arXiv:2312.12480
#4134

Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?

Hanxin Zhu, Tianyu He, Xin Li et al.

CVPR 2024 • poster • arXiv:2403.06092
#4135

DIEM: Decomposition-Integration Enhancing Multimodal Insights

Xinyi Jiang, Guoming Wang, Junhao Guo et al.

CVPR 2024 • poster
#4136

DeMatch: Deep Decomposition of Motion Field for Two-View Correspondence Learning

Shihua Zhang, Zizhuo Li, Yuan Gao et al.

CVPR 2024 • poster
#4137

Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation

Zhekai Du, Xinyao Li, Fengling Li et al.

CVPR 2024 • poster • arXiv:2403.02899
#4138

Absolute Pose from One or Two Scaled and Oriented Features

Jonathan Ventura, Zuzana Kukelova, Torsten Sattler et al.

CVPR 2024 • highlight
#4139

Training Vision Transformers for Semi-Supervised Semantic Segmentation

Xinting Hu, Li Jiang, Bernt Schiele

CVPR 2024 • poster
#4140

APSeg: Auto-Prompt Network for Cross-Domain Few-Shot Semantic Segmentation

Weizhao He, Yang Zhang, Wei Zhuo et al.

CVPR 2024 • poster • arXiv:2406.08372
#4141

SFOD: Spiking Fusion Object Detector

Yimeng Fan, Wei Zhang, Changsong Liu et al.

CVPR 2024 • poster • arXiv:2403.15192
#4142

InstanceDiffusion: Instance-level Control for Image Generation

XuDong Wang, Trevor Darrell, Sai Saketh Rambhatla et al.

CVPR 2024 • poster • arXiv:2402.03290
#4143

Robust Emotion Recognition in Context Debiasing

Dingkang Yang, Kun Yang, Mingcheng Li et al.

CVPR 2024 • poster • arXiv:2403.05963
#4144

Balancing Act: Distribution-Guided Debiasing in Diffusion Models

Rishubh Parihar, Abhijnya Bhat, Abhipsa Basu et al.

CVPR 2024 • poster • arXiv:2402.18206
#4145

Sieve: Multimodal Dataset Pruning using Image Captioning Models

Anas Mahmoud, Mostafa Elhoushi, Amro Abbas et al.

CVPR 2024 • poster • arXiv:2310.02110
#4146

Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation

Song Wang, Jiawei Yu, Wentong Li et al.

CVPR 2024 • poster • arXiv:2404.11958
#4147

Towards Fairness-Aware Adversarial Learning

Yanghao Zhang, Tianle Zhang, Ronghui Mu et al.

CVPR 2024 • poster • arXiv:2402.17729
#4148

SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge

Andong Wang, Bo Wu, Sunli Chen et al.

CVPR 2024 • poster • arXiv:2405.09713
#4149

MuRF: Multi-Baseline Radiance Fields

Haofei Xu, Anpei Chen, Yuedong Chen et al.

CVPR 2024 • poster • arXiv:2312.04565
#4150

Hide in Thicket: Generating Imperceptible and Rational Adversarial Perturbations on 3D Point Clouds

Tianrui Lou, Xiaojun Jia, Jindong Gu et al.

CVPR 2024 • poster • arXiv:2403.05247
#4151

Retrieval-Augmented Egocentric Video Captioning

Jilan Xu, Yifei Huang, Junlin Hou et al.

CVPR 2024 • poster • arXiv:2401.00789
#4152

Low-Rank Knowledge Decomposition for Medical Foundation Models

Yuhang Zhou, Haolin Li, Siyuan Du et al.

CVPR 2024 • poster • arXiv:2404.17184
#4153

Pixel-level Semantic Correspondence through Layout-aware Representation Learning and Multi-scale Matching Integration

Yixuan Sun, Zhangyue Yin, Haibo Wang et al.

CVPR 2024 • poster
#4154

Event-assisted Low-Light Video Object Segmentation

Li Hebei, Jin Wang, Jiahui Yuan et al.

CVPR 2024 • poster • arXiv:2404.01945
#4155

3DToonify: Creating Your High-Fidelity 3D Stylized Avatar Easily from 2D Portrait Images

Yifang Men, Hanxi Liu, Yuan Yao et al.

CVPR 2024 • poster
#4156

CAT-Seg: Cost Aggregation for Open-Vocabulary Semantic Segmentation

Seokju Cho, Heeseong Shin, Sunghwan Hong et al.

CVPR 2024 • highlight • arXiv:2303.11797
#4157

PIE-NeRF: Physics-based Interactive Elastodynamics with NeRF

Yutao Feng, Yintong Shang, Xuan Li et al.

CVPR 2024 • poster • arXiv:2311.13099
#4158

MAFA: Managing False Negatives for Vision-Language Pre-training

Jaeseok Byun, Dohoon Kim, Taesup Moon

CVPR 2024 • poster • arXiv:2312.06112
#4159

ChatScene: Knowledge-Enabled Safety-Critical Scenario Generation for Autonomous Vehicles

Jiawei Zhang, Chejian Xu, Bo Li

CVPR 2024 • poster • arXiv:2405.14062
#4160

MMSum: A Dataset for Multimodal Summarization and Thumbnail Generation of Videos

Jielin Qiu, Jiacheng Zhu, William Han et al.

CVPR 2024 • highlight • arXiv:2306.04216
#4161

Learning Structure-from-Motion with Graph Attention Networks

Lucas Brynte, José Pedro Iglesias, Carl Olsson et al.

CVPR 2024 • poster • arXiv:2308.15984
#4162

SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection

Peng Qi, Zehong Yan, Wynne Hsu et al.

CVPR 2024 • poster • arXiv:2403.03170
#4163

Spatial-Aware Regression for Keypoint Localization

Dongkai Wang, Shiliang Zhang

CVPR 2024 • highlight
#4164

EMOPortraits: Emotion-enhanced Multimodal One-shot Head Avatars

Nikita Drobyshev, Antoni Bigata Casademunt, Konstantinos Vougioukas et al.

CVPR 2024 • poster • arXiv:2404.19110
#4165

Latent Modulated Function for Computational Optimal Continuous Image Representation

Zongyao He, Zhi Jin

CVPR 2024 • highlight • arXiv:2404.16451
#4166

Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation

Jiapeng Su, Qi Fan, Wenjie Pei et al.

CVPR 2024 • poster • arXiv:2404.10322
#4167

L2B: Learning to Bootstrap Robust Models for Combating Label Noise

Yuyin Zhou, Xianhang Li, Fengze Liu et al.

CVPR 2024 • poster • arXiv:2202.04291
#4168

OED: Towards One-stage End-to-End Dynamic Scene Graph Generation

Guan Wang, Zhimin Li, Qingchao Chen et al.

CVPR 2024 • poster • arXiv:2405.16925
#4169

ViewFusion: Towards Multi-View Consistency via Interpolated Denoising

Xianghui Yang, Gil Avraham, Yan Zuo et al.

CVPR 2024 • poster • arXiv:2402.18842
#4170

Delving into the Trajectory Long-tail Distribution for Muti-object Tracking

Sijia Chen, En Yu, Jinyang Li et al.

CVPR 2024 • poster • arXiv:2403.04700
#4171

Streaming Dense Video Captioning

Xingyi Zhou, Anurag Arnab, Shyamal Buch et al.

CVPR 2024 • poster • arXiv:2404.01297
#4172

On the Scalability of Diffusion-based Text-to-Image Generation

Hao Li, Yang Zou, Ying Wang et al.

CVPR 2024 • poster • arXiv:2404.02883
#4173

Bootstrapping Autonomous Driving Radars with Self-Supervised Learning

Yiduo Hao, Sohrab Madani, Junfeng Guan et al.

CVPR 2024 • poster • arXiv:2312.04519
#4174

OneLLM: One Framework to Align All Modalities with Language

Jiaming Han, Kaixiong Gong, Yiyuan Zhang et al.

CVPR 2024 • poster • arXiv:2312.03700
#4175

LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition

Zhonglin Sun, Chen Feng, Ioannis Patras et al.

CVPR 2024 • poster • arXiv:2403.08161
#4176

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Fei Deng, Qifei Wang, Wei Wei et al.

CVPR 2024 • poster • arXiv:2402.08714
#4177

MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors

He Zhang, Shenghao Ren, Haolei Yuan et al.

CVPR 2024 • poster • arXiv:2403.17610
#4178

Tuning Stable Rank Shrinkage: Aiming at the Overlooked Structural Risk in Fine-tuning

Sicong Shen, Yang Zhou, Bingzheng Wei et al.

CVPR 2024 • poster
#4179

Cam4DOcc: Benchmark for Camera-Only 4D Occupancy Forecasting in Autonomous Driving Applications

Junyi Ma, Xieyuanli Chen, Jiawei Huang et al.

CVPR 2024 • poster • arXiv:2311.17663
#4180

Relightable and Animatable Neural Avatar from Sparse-View Video

Zhen Xu, Sida Peng, Chen Geng et al.

CVPR 2024 • highlight • arXiv:2308.07903
#4181

Objects as Volumes: A Stochastic Geometry View of Opaque Solids

Bailey Miller, Hanyu Chen, Alice Lai et al.

CVPR 2024 • poster • arXiv:2312.15406
#4182

Do You Remember? Dense Video Captioning with Cross-Modal Memory Retrieval

Minkuk Kim, Hyeon Bae Kim, Jinyoung Moon et al.

CVPR 2024 • poster • arXiv:2404.07610
#4183

SwitchLight: Co-design of Physics-driven Architecture and Pre-training Framework for Human Portrait Relighting

Hoon Kim, Minje Jang, Wonjun Yoon et al.

CVPR 2024 • highlight • arXiv:2402.18848
#4184

CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor

Shuyang Sun, Runjia Li, Philip H.S. Torr et al.

CVPR 2024 • poster • arXiv:2312.07661
#4185

Image Neural Field Diffusion Models

Yinbo Chen, Oliver Wang, Richard Zhang et al.

CVPR 2024 • highlight • arXiv:2406.07480
#4186

Dual-View Visual Contextualization for Web Navigation

Jihyung Kil, Chan Hee Song, Boyuan Zheng et al.

CVPR 2024 • poster • arXiv:2402.04476
#4187

Improving the Generalization of Segmentation Foundation Model under Distribution Shift via Weakly Supervised Adaptation

Haojie Zhang, Yongyi Su, Xun Xu et al.

CVPR 2024 • poster • arXiv:2312.03502
#4188

Language-guided Image Reflection Separation

Haofeng Zhong, Yuchen Hong, Shuchen Weng et al.

CVPR 2024 • poster • arXiv:2402.11874
#4189

CrowdDiff: Multi-hypothesis Crowd Density Estimation using Diffusion Models

Yasiru Ranasinghe, Nithin Gopalakrishnan Nair, Wele Gedara Chaminda Bandara et al.

CVPR 2024 • poster • arXiv:2303.12790
#4190

Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer

Yuwen Tan, Qinhao Zhou, Xiang Xiang et al.

CVPR 2024 • poster
#4191

Physics-guided Shape-from-Template: Monocular Video Perception through Neural Surrogate Models

David Stotko, Nils Wandel, Reinhard Klein

CVPR 2024 • poster • arXiv:2311.12796
#4192

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

Tong Wu, Guandao Yang, Zhibing Li et al.

CVPR 2024 • poster • arXiv:2401.04092
#4193

Orthogonal Adaptation for Modular Customization of Diffusion Models

Ryan Po, Guandao Yang, Kfir Aberman et al.

CVPR 2024 • highlight • arXiv:2312.02432
#4194

End-to-End Spatio-Temporal Action Localisation with Video Transformers

Alexey Gritsenko, Xuehan Xiong, Josip Djolonga et al.

CVPR 2024 • poster • arXiv:2304.12160
#4195

TRINS: Towards Multimodal Language Models that Can Read

Ruiyi Zhang, Yanzhe Zhang, Jian Chen et al.

CVPR 2024 • poster • arXiv:2406.06730
#4196

Unlocking Pre-trained Image Backbones for Semantic Image Synthesis

Tariq Berrada, Jakob Verbeek, Camille Couprie et al.

CVPR 2024 • poster • arXiv:2312.13314
#4197

Infer from What You Have Seen Before: Temporally-dependent Classifier for Semi-supervised Video Segmentation

Jiafan Zhuang, Zilei Wang, Yixin Zhang et al.

CVPR 2024 • poster
#4198

RegionGPT: Towards Region Understanding Vision Language Model

Qiushan Guo, Shalini De Mello, Danny Yin et al.

CVPR 2024 • poster • arXiv:2403.02330
#4199

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Xiang Yue, Yuansheng Ni, Kai Zhang et al.

CVPR 2024 • poster • arXiv:2311.16502
#4200

Navigate Beyond Shortcuts: Debiased Learning Through the Lens of Neural Collapse

Yining Wang, Junjie Sun, Chenyue Wang et al.

CVPR 2024 • highlight • arXiv:2405.05587