Peng Wang

101
Papers
1,823
Total Citations

Papers (101)

MVDream: Multi-view Diffusion for 3D Generation

ICLR 2024
871
citations

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model

ICLR 2024
227
citations

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

AAAI 2024
156
citations

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

ICLR 2024
154
citations

SURGE: Surface Regularized Geometry Estimation from a Single Image

NeurIPS 2016
98
citations

Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks

ECCV 2020
85
citations

Open-Vocabulary Video Anomaly Detection

CVPR 2024
64
citations

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

ICCV 2025
42
citations

Towards Continual Knowledge Graph Embedding via Incremental Distillation

AAAI 2024
39
citations

MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors

ICLR 2025
26
citations

COCONut: Modernizing COCO Segmentation

CVPR 2024
22
citations

Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach

CVPR 2024
19
citations

Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

NeurIPS 2025
6
citations

PoseLLaVA: Pose Centric Multimodal LLM for Fine-Grained 3D Pose Manipulation

AAAI 2025
5
citations

Unify Named Entity Recognition Scenarios via Contrastive Real-Time Updating Prototype

AAAI 2024
4
citations

Attention-Only Transformers via Unrolled Subspace Denoising

ICML 2025
3
citations

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

ECCV 2024
2
citations

Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

CVPR 2024
0
citations

Generalized Neural Collapse for a Large Number of Classes

ICML 2024
0
citations

Symmetric Matrix Completion with ReLU Sampling

ICML 2024
0
citations

Image Fusion via Vision-Language Model

ICML 2024
0
citations

The Emergence of Reproducibility and Consistency in Diffusion Models

ICML 2024
0
citations

Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

ICML 2024
0
citations

A Global Geometric Analysis of Maximal Coding Rate Reduction

ICML 2024
0
citations

Towards Unified Depth and Semantic Prediction From a Single Image

CVPR 2015
0
citations

Efficient SDP Inference for Fully-Connected CRFs Based on Low-Rank Decomposition

CVPR 2015
0
citations

What's Wrong With That Object? Identifying Images of Unusual Objects by Modelling the Detection Score Distribution

CVPR 2016
0
citations

Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge From External Sources

CVPR 2016
0
citations

The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions

CVPR 2017
0
citations

Multi-Attention Network for One Shot Learning

CVPR 2017
0
citations

Joint Multi-Person Pose Estimation and Semantic Part Segmentation

CVPR 2017
0
citations

LEGO: Learning Edge With Geometry All at Once by Watching Videos

CVPR 2018
0
citations

MaskLab: Instance Segmentation by Refining Object Detection With Semantic and Direction Features

CVPR 2018
0
citations

View Extrapolation of Human Body From a Single Image

CVPR 2018
0
citations

Occlusion Aware Unsupervised Learning of Optical Flow

CVPR 2018
0
citations

DeLS-3D: Deep Localization and Segmentation With a 3D Semantic Map

CVPR 2018
0
citations

Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning

CVPR 2018
0
citations

Visual Question Answering With Memory-Augmented Networks

CVPR 2018
0
citations

Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks

CVPR 2019
0
citations

Multi-Label Image Recognition With Graph Convolutional Networks

CVPR 2019
0
citations

ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving

CVPR 2019
0
citations

Visual Question Answering as Reading Comprehension

CVPR 2019
0
citations

UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos

CVPR 2019
0
citations

Anisotropic Convolutional Networks for 3D Semantic Scene Completion

CVPR 2020
0
citations

Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs

CVPR 2020
0
citations

Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension

CVPR 2020
0
citations

NAS-FCOS: Fast Neural Architecture Search for Object Detection

CVPR 2020
0
citations

3D Part Guided Image Editing for Fine-Grained Object Understanding

CVPR 2020
0
citations

Contrastive Learning Based Hybrid Networks for Long-Tailed Image Classification

CVPR 2021
0
citations

HR-NAS: Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers

CVPR 2021
0
citations

Neural Rays for Occlusion-Aware Image-Based Rendering

CVPR 2022
0
citations

Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization

CVPR 2022
0
citations

Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification

CVPR 2022
0
citations

Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer

CVPR 2022
0
citations

NightLab: A Dual-Level Architecture With Hardness Detection for Segmentation at Night

CVPR 2022
0
citations

HOP: History-and-Order Aware Pre-Training for Vision-and-Language Navigation

CVPR 2022
0
citations

Pushing the Performance Limit of Scene Text Recognizer Without Human Annotation

CVPR 2022
0
citations

BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields

CVPR 2023
0
citations

Revisiting Prototypical Network for Cross Domain Few-Shot Learning

CVPR 2023
0
citations

Learning Conditional Attributes for Compositional Zero-Shot Learning

CVPR 2023
0
citations

Glocal Energy-Based Learning for Few-Shot Open-Set Recognition

CVPR 2023
0
citations

Unlocking Generalization Power in LiDAR Point Cloud Registration

CVPR 2025
0
citations

NeuralUDF: Learning Unsigned Distance Fields for Multi-View Reconstruction of Surfaces With Arbitrary Topologies

CVPR 2023
0
citations

A New Comprehensive Benchmark for Semi-Supervised Video Anomaly Detection and Anticipation

CVPR 2023
0
citations

S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning

CVPR 2023
0
citations

Joint Object and Part Segmentation Using Deep Learned Potentials

ICCV 2015
0
citations

Towards End-To-End Text Spotting With Convolutional Recurrent Neural Networks

ICCV 2017
0
citations

Vehicle Re-Identification in Aerial Imagery: Dataset and Approach

ICCV 2019
0
citations

Continual Neural Mapping: Learning an Implicit Scene Representation From Sequential Observations

ICCV 2021
0
citations

AerialVLN: Vision-and-Language Navigation for UAVs

ICCV 2023
0
citations

Batch-based Model Registration for Fast 3D Sherd Reconstruction

ICCV 2023
0
citations

LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition

ICCV 2023
0
citations

Dynamically Transformed Instance Normalization Network for Generalizable Person Re-identification

ECCV 2022
0
citations

Levenshtein OCR

ECCV 2022
0
citations

Multi-Granularity Prediction for Scene Text Recognition

ECCV 2022
0
citations

NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors

ECCV 2022
0
citations

SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views

ECCV 2022
0
citations

DistPro: Searching a Fast Knowledge Distillation Process via Meta Optimization

ECCV 2022
0
citations

A Simple and Robust Correlation Filtering Method for Text-Based Person Search

ECCV 2022
0
citations

F2-NeRF: Fast Neural Radiance Field Training With Free Camera Trajectories

CVPR 2023
0
citations

SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks

CVPR 2025
0
citations

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

CVPR 2025
0
citations

Dual Diffusion for Unified Image Generation and Understanding

CVPR 2025
0
citations

Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding

CVPR 2025
0
citations

LA-MOTR: End-to-End Multi-Object Tracking by Learnable Association

ICCV 2025
0
citations

RayZer: A Self-supervised Large View Synthesis Model

ICCV 2025
0
citations

A Unified Framework for Industrial Cel-Animation Colorization with Temporal-Structural Awareness

ICCV 2025
0
citations

Implicit Counterfactual Learning for Audio-Visual Segmentation

ICCV 2025
0
citations

Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning

ICCV 2025
0
citations

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy

ICCV 2025
0
citations

DoGA: Enhancing Grounded Object Detection via Grouped Pre-Training with Attributes

AAAI 2025
0
citations

VarCMP: Adapting Cross-Modal Pre-Training Models for Video Anomaly Retrieval

AAAI 2025
0
citations

A Lightweight Sparse Interaction Network for Time Series Forecasting

AAAI 2025
0
citations

OntoFact: Unveiling Fantastic Fact-Skeleton of LLMs via Ontology-Driven Reinforcement Learning

AAAI 2024
0
citations

ConsistNER: Towards Instructive NER Demonstrations for LLMs with the Consistency of Ontology and Context

AAAI 2024
0
citations

MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval

CVPR 2024
0
citations

NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction

NeurIPS 2021
0
citations

HumanLiker: A Human-like Object Detector to Model the Manual Labeling Process

NeurIPS 2022
0
citations

Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold

NeurIPS 2022
0
citations

MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion

NeurIPS 2023
0
citations

Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing

NeurIPS 2023
0
citations