Peng Wang

39
Papers
1,640
Total Citations

Papers (39)

MVDream: Multi-view Diffusion for 3D Generation

ICLR 2024
871
citations

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model

ICLR 2024
227
citations

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection

AAAI 2024 (arXiv)
156
citations

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

ICLR 2024
154
citations

Open-Vocabulary Video Anomaly Detection

CVPR 2024
64
citations

CC-OCR: A Comprehensive and Challenging OCR Benchmark for Evaluating Large Multimodal Models in Literacy

ICCV 2025 (arXiv)
42
citations

Towards Continual Knowledge Graph Embedding via Incremental Distillation

AAAI 2024 (arXiv)
39
citations

MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos with Depth Priors

ICLR 2025
26
citations

COCONut: Modernizing COCO Segmentation

CVPR 2024
22
citations

Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach

CVPR 2024
19
citations

Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

NeurIPS 2025 (arXiv)
6
citations

PoseLLaVA: Pose Centric Multimodal LLM for Fine-Grained 3D Pose Manipulation

AAAI 2025
5
citations

Unify Named Entity Recognition Scenarios via Contrastive Real-Time Updating Prototype

AAAI 2024
4
citations

Attention-Only Transformers via Unrolled Subspace Denoising

ICML 2025
3
citations

Platypus: A Generalized Specialist Model for Reading Text in Various Forms

ECCV 2024 (arXiv)
2
citations

ConsistNER: Towards Instructive NER Demonstrations for LLMs with the Consistency of Ontology and Context

AAAI 2024
0
citations

MV-Adapter: Multimodal Video Transfer Learning for Video Text Retrieval

CVPR 2024
0
citations

Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

CVPR 2024
0
citations

Generalized Neural Collapse for a Large Number of Classes

ICML 2024
0
citations

Symmetric Matrix Completion with ReLU Sampling

ICML 2024
0
citations

Image Fusion via Vision-Language Model

ICML 2024
0
citations

The Emergence of Reproducibility and Consistency in Diffusion Models

ICML 2024
0
citations

Compressible Dynamics in Deep Overparameterized Low-Rank Learning & Adaptation

ICML 2024
0
citations

Unlocking Generalization Power in LiDAR Point Cloud Registration

CVPR 2025
0
citations

A Global Geometric Analysis of Maximal Coding Rate Reduction

ICML 2024
0
citations

SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks

CVPR 2025
0
citations

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

CVPR 2025
0
citations

Dual Diffusion for Unified Image Generation and Understanding

CVPR 2025
0
citations

Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding

CVPR 2025
0
citations

LA-MOTR: End-to-End Multi-Object Tracking by Learnable Association

ICCV 2025
0
citations

RayZer: A Self-supervised Large View Synthesis Model

ICCV 2025
0
citations

A Unified Framework for Industrial Cel-Animation Colorization with Temporal-Structural Awareness

ICCV 2025
0
citations

Implicit Counterfactual Learning for Audio-Visual Segmentation

ICCV 2025
0
citations

Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning

ICCV 2025
0
citations

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy

ICCV 2025
0
citations

DoGA: Enhancing Grounded Object Detection via Grouped Pre-Training with Attributes

AAAI 2025
0
citations

VarCMP: Adapting Cross-Modal Pre-Training Models for Video Anomaly Retrieval

AAAI 2025
0
citations

A Lightweight Sparse Interaction Network for Time Series Forecasting

AAAI 2025
0
citations

OntoFact: Unveiling Fantastic Fact-Skeleton of LLMs via Ontology-Driven Reinforcement Learning

AAAI 2024
0
citations