Peng Wang
101 Papers
1,823 Total Citations
Papers (101)
MVDream: Multi-view Diffusion for 3D Generation
ICLR 2024
871
citations
DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model
ICLR 2024
227
citations
VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection
AAAI 2024 (arXiv)
156
citations
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction
ICLR 2024
154
citations
SURGE: Surface Regularized Geometry Estimation from a Single Image
NeurIPS 2016
98
citations
Semi-Supervised Crowd Counting via Self-Training on Surrogate Tasks
ECCV 2020
85
citations
Open-Vocabulary Video Anomaly Detection
CVPR 2024
64
citations
CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy
ICCV 2025 (arXiv)
42
citations
Towards Continual Knowledge Graph Embedding via Incremental Distillation
AAAI 2024 (arXiv)
39
citations
MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors
ICLR 2025
26
citations
COCONut: Modernizing COCO Segmentation
CVPR 2024
22
citations
Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach
CVPR 2024
19
citations
Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling
NeurIPS 2025
6
citations
PoseLLaVA: Pose Centric Multimodal LLM for Fine-Grained 3D Pose Manipulation
AAAI 2025
5
citations
Unify Named Entity Recognition Scenarios via Contrastive Real-Time Updating Prototype
AAAI 2024
4
citations
Attention-Only Transformers via Unrolled Subspace Denoising
ICML 2025
3
citations
Platypus: A Generalized Specialist Model for Reading Text in Various Forms
ECCV 2024
2
citations
Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences
CVPR 2024
0
citations
Generalized Neural Collapse for a Large Number of Classes
ICML 2024
0
citations
Symmetric Matrix Completion with ReLU Sampling
ICML 2024
0
citations
Image Fusion via Vision-Language Model
ICML 2024
0
citations
The Emergence of Reproducibility and Consistency in Diffusion Models
ICML 2024
0
citations
Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation
ICML 2024
0
citations
A Global Geometric Analysis of Maximal Coding Rate Reduction
ICML 2024
0
citations
Towards Unified Depth and Semantic Prediction From a Single Image
CVPR 2015
0
citations
Efficient SDP Inference for Fully-Connected CRFs Based on Low-Rank Decomposition
CVPR 2015
0
citations
What's Wrong With That Object? Identifying Images of Unusual Objects by Modelling the Detection Score Distribution
CVPR 2016
0
citations
Ask Me Anything: Free-Form Visual Question Answering Based on Knowledge From External Sources
CVPR 2016
0
citations
The VQA-Machine: Learning How to Use Existing Vision Algorithms to Answer New Questions
CVPR 2017
0
citations
Multi-Attention Network for One Shot Learning
CVPR 2017
0
citations
Joint Multi-Person Pose Estimation and Semantic Part Segmentation
CVPR 2017 (arXiv)
0
citations
LEGO: Learning Edge With Geometry All at Once by Watching Videos
CVPR 2018 (arXiv)
0
citations
MaskLab: Instance Segmentation by Refining Object Detection With Semantic and Direction Features
CVPR 2018 (arXiv)
0
citations
View Extrapolation of Human Body From a Single Image
CVPR 2018 (arXiv)
0
citations
Occlusion Aware Unsupervised Learning of Optical Flow
CVPR 2018 (arXiv)
0
citations
DeLS-3D: Deep Localization and Segmentation With a 3D Semantic Map
CVPR 2018 (arXiv)
0
citations
Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning
CVPR 2018 (arXiv)
0
citations
Visual Question Answering With Memory-Augmented Networks
CVPR 2018 (arXiv)
0
citations
Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks
CVPR 2019
0
citations
Multi-Label Image Recognition With Graph Convolutional Networks
CVPR 2019
0
citations
ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving
CVPR 2019
0
citations
Visual Question Answering as Reading Comprehension
CVPR 2019
0
citations
UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos
CVPR 2019
0
citations
Anisotropic Convolutional Networks for 3D Semantic Scene Completion
CVPR 2020 (arXiv)
0
citations
Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs
CVPR 2020 (arXiv)
0
citations
Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension
CVPR 2020
0
citations
NAS-FCOS: Fast Neural Architecture Search for Object Detection
CVPR 2020
0
citations
3D Part Guided Image Editing for Fine-Grained Object Understanding
CVPR 2020
0
citations
Contrastive Learning Based Hybrid Networks for Long-Tailed Image Classification
CVPR 2021 (arXiv)
0
citations
HR-NAS: Searching Efficient High-Resolution Neural Architectures With Lightweight Transformers
CVPR 2021
0
citations
Neural Rays for Occlusion-Aware Image-Based Rendering
CVPR 2022 (arXiv)
0
citations
Node Representation Learning in Graph via Node-to-Neighbourhood Mutual Information Maximization
CVPR 2022 (arXiv)
0
citations
Label Relation Graphs Enhanced Hierarchical Residual Network for Hierarchical Multi-Granularity Classification
CVPR 2022 (arXiv)
0
citations
Semi-Supervised Wide-Angle Portraits Correction by Multi-Scale Transformer
CVPR 2022 (arXiv)
0
citations
NightLab: A Dual-Level Architecture With Hardness Detection for Segmentation at Night
CVPR 2022 (arXiv)
0
citations
HOP: History-and-Order Aware Pre-Training for Vision-and-Language Navigation
CVPR 2022 (arXiv)
0
citations
Pushing the Performance Limit of Scene Text Recognizer Without Human Annotation
CVPR 2022 (arXiv)
0
citations
BAD-NeRF: Bundle Adjusted Deblur Neural Radiance Fields
CVPR 2023
0
citations
Revisiting Prototypical Network for Cross Domain Few-Shot Learning
CVPR 2023
0
citations
Learning Conditional Attributes for Compositional Zero-Shot Learning
CVPR 2023 (arXiv)
0
citations
Glocal Energy-Based Learning for Few-Shot Open-Set Recognition
CVPR 2023 (arXiv)
0
citations
Unlocking Generalization Power in LiDAR Point Cloud Registration
CVPR 2025
0
citations
NeuralUDF: Learning Unsigned Distance Fields for Multi-View Reconstruction of Surfaces With Arbitrary Topologies
CVPR 2023 (arXiv)
0
citations
A New Comprehensive Benchmark for Semi-Supervised Video Anomaly Detection and Anticipation
CVPR 2023
0
citations
S3C: Semi-Supervised VQA Natural Language Explanation via Self-Critical Learning
CVPR 2023
0
citations
Joint Object and Part Segmentation Using Deep Learned Potentials
ICCV 2015
0
citations
Towards End-To-End Text Spotting With Convolutional Recurrent Neural Networks
ICCV 2017 (arXiv)
0
citations
Vehicle Re-Identification in Aerial Imagery: Dataset and Approach
ICCV 2019
0
citations
Continual Neural Mapping: Learning an Implicit Scene Representation From Sequential Observations
ICCV 2021 (arXiv)
0
citations
AerialVLN: Vision-and-Language Navigation for UAVs
ICCV 2023 (arXiv)
0
citations
Batch-based Model Registration for Fast 3D Sherd Reconstruction
ICCV 2023 (arXiv)
0
citations
LISTER: Neighbor Decoding for Length-Insensitive Scene Text Recognition
ICCV 2023 (arXiv)
0
citations
Dynamically Transformed Instance Normalization Network for Generalizable Person Re-identification
ECCV 2022
0
citations
Levenshtein OCR
ECCV 2022
0
citations
Multi-Granularity Prediction for Scene Text Recognition
ECCV 2022
0
citations
NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors
ECCV 2022
0
citations
SparseNeuS: Fast Generalizable Neural Surface Reconstruction from Sparse Views
ECCV 2022
0
citations
DistPro: Searching a Fast Knowledge Distillation Process via Meta Optimization
ECCV 2022
0
citations
A Simple and Robust Correlation Filtering Method for Text-Based Person Search
ECCV 2022
0
citations
F2-NeRF: Fast Neural Radiance Field Training With Free Camera Trajectories
CVPR 2023
0
citations
SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks
CVPR 2025
0
citations
CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model
CVPR 2025
0
citations
Dual Diffusion for Unified Image Generation and Understanding
CVPR 2025
0
citations
Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding
CVPR 2025
0
citations
LA-MOTR: End-to-End Multi-Object Tracking by Learnable Association
ICCV 2025
0
citations
RayZer: A Self-supervised Large View Synthesis Model
ICCV 2025
0
citations
A Unified Framework for Industrial Cel-Animation Colorization with Temporal-Structural Awareness
ICCV 2025
0
citations
Implicit Counterfactual Learning for Audio-Visual Segmentation
ICCV 2025
0
citations
Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning
ICCV 2025
0
citations
Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy
ICCV 2025
0
citations
DoGA: Enhancing Grounded Object Detection via Grouped Pre-Training with Attributes
AAAI 2025
0
citations
VarCMP: Adapting Cross-Modal Pre-Training Models for Video Anomaly Retrieval
AAAI 2025
0
citations
A Lightweight Sparse Interaction Network for Time Series Forecasting
AAAI 2025
0
citations
OntoFact: Unveiling Fantastic Fact-Skeleton of LLMs via Ontology-Driven Reinforcement Learning
AAAI 2024
0
citations
ConsistNER: Towards Instructive NER Demonstrations for LLMs with the Consistency of Ontology and Context
AAAI 2024
0
citations
MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval
CVPR 2024
0
citations
NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
NeurIPS 2021
0
citations
HumanLiker: A Human-like Object Detector to Model the Manual Labeling Process
NeurIPS 2022
0
citations
Neural Collapse with Normalized Features: A Geometric Analysis over the Riemannian Manifold
NeurIPS 2022
0
citations
MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion
NeurIPS 2023
0
citations
Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing
NeurIPS 2023
0
citations