Wenwei Zhang

25
Papers
522
Total Citations

Papers (25)

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities

ICCV 2025
127
citations

OMG-Seg: Is One Model Good Enough For All Segmentation?

CVPR 2024
106
citations

CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction

ICLR 2024
104
citations

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

ICLR 2024
100
citations

Harmonizing Visual Representations for Unified Multimodal Understanding and Generation

ICCV 2025
33
citations

CLIM: Contrastive Language-Image Mosaic for Region Representation

AAAI 2024
24
citations

F-LMM: Grounding Frozen Large Multimodal Models

CVPR 2025
21
citations

Rethinking Verification for LLM Code Generation: From Generation to Testing

NeurIPS 2025
7
citations

Dense Distinct Query for End-to-End Object Detection

CVPR 2023
0
citations

Robust Multi-Modality Multi-Object Tracking

ICCV 2019
0
citations

Robo3D: Towards Robust and Reliable 3D Perception against Corruptions

ICCV 2023
0
citations

Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation

ICCV 2023
0
citations

Side-Aware Boundary Localization for More Precise Object Detection

ECCV 2020
0
citations

Dense Siamese Network for Dense Unsupervised Learning

ECCV 2022
0
citations

Seesaw Loss for Long-Tailed Instance Segmentation

CVPR 2021
0
citations

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data and Metric Perspectives

ICCV 2025
0
citations

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

CVPR 2024
0
citations

Can AI Assistants Know What They Don't Know?

ICML 2024
0
citations

EcoNAS: Finding Proxies for Economical Neural Architecture Search

CVPR 2020
0
citations

Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation

CVPR 2022
0
citations

Aligning Bag of Regions for Open-Vocabulary Object Detection

CVPR 2023
0
citations

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

CVPR 2023
0
citations

K-Net: Towards Unified Image Segmentation

NeurIPS 2021
0
citations

Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

NeurIPS 2023
0
citations

OV-PARTS: Towards Open-Vocabulary Part Segmentation

NeurIPS 2023
0
citations