Lewei Lu

22 Papers
2,418 Total Citations

Papers (22)

InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks

CVPR 2024
2,210
citations

The All-Seeing Project V2: Towards General Relation Comprehension of the Open World

ECCV 2024
86
citations

ControlLLM: Augment Language Models with Tools by Searching on Graphs

ECCV 2024 (arXiv)
57
citations

SynerGen-VL: Towards Synergistic Image Understanding and Generation with Vision Experts and Token Folding

CVPR 2025
34
citations

Docopilot: Improving Multimodal Models for Document-Level Understanding

CVPR 2025
14
citations

Weakly Supervised Monocular 3D Detection with a Single-View Image

CVPR 2024
12
citations

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding

CVPR 2025
5
citations

Masked AutoDecoder is Effective Multi-Task Vision Generalist

CVPR 2024
0
citations

Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information

CVPR 2023 (arXiv)
0
citations

BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision

CVPR 2023
0
citations

Distilling Focal Knowledge From Imperfect Expert for 3D Object Detection

CVPR 2023
0
citations

InternImage: Exploring Large-Scale Vision Foundation Models With Deformable Convolutions

CVPR 2023 (arXiv)
0
citations

Planning-Oriented Autonomous Driving

CVPR 2023 (arXiv)
0
citations

FuseFormer: Fusing Fine-Grained Information in Transformers for Video Inpainting

ICCV 2021 (arXiv)
0
citations

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

CVPR 2025
0
citations

Scene as Occupancy

ICCV 2023 (arXiv)
0
citations

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

CVPR 2025
0
citations

Spatial Preference Rewarding for MLLMs Spatial Understanding

ICCV 2025
0
citations

Enhancing the Outcome Reward-based RL Training of MLLMs with Self-Consistency Sampling

NeurIPS 2025
0
citations

Modeling Continuous Motion for 3D Point Cloud Object Tracking

AAAI 2024 (arXiv)
0
citations

Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft

CVPR 2024
0
citations

Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications

CVPR 2024
0
citations