Wei Wu

54 Papers
137 Total Citations

Papers (54)

Language-Image Pre-training with Long Captions

ECCV 2024
63
citations

Theoretical Benefit and Limitation of Diffusion Language Model

NeurIPS 2025 (arXiv)
27
citations

Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning

CVPR 2025
18
citations

FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models

ECCV 2024
11
citations

OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models

ICCV 2025
6
citations

ProtCLIP: Function-Informed Protein Multi-Modal Learning

AAAI 2025
5
citations

Learning Visual Generative Priors without Text

CVPR 2025
4
citations

DriveScape: High-Resolution Driving Video Generation by Multi-View Feature Fusion

CVPR 2025
3
citations

SwiftPillars: High-Efficiency Pillar Encoder for Lidar-Based 3D Detection

AAAI 2024
0
citations

HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative

CVPR 2024
0
citations

End-to-End Flow Correlation Tracking With Spatial-Temporal Attention

CVPR 2018 (arXiv)
0
citations

Practical Block-Wise Neural Network Architecture Generation

CVPR 2018 (arXiv)
0
citations

High Performance Visual Tracking With Siamese Region Proposal Network

CVPR 2018
0
citations

Feedback Network for Image Super-Resolution

CVPR 2019
0
citations

SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks

CVPR 2019
0
citations

IRLAS: Inverse Reinforcement Learning for Architecture Search

CVPR 2019
0
citations

Selective Sensor Fusion for Neural Visual-Inertial Odometry

CVPR 2019
0
citations

Adaptive Dilated Network With Self-Correction Supervision for Counting

CVPR 2020
0
citations

Improving One-Shot NAS by Suppressing the Posterior Fading

CVPR 2020 (arXiv)
0
citations

Hierarchical Feature Embedding for Attribute Recognition

CVPR 2020 (arXiv)
0
citations

Temporal Context Aggregation Network for Temporal Action Proposal Refinement

CVPR 2021 (arXiv)
0
citations

Learning Statistical Texture for Semantic Segmentation

CVPR 2021 (arXiv)
0
citations

Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos

CVPR 2021 (arXiv)
0
citations

Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection

CVPR 2022 (arXiv)
0
citations

Unsupervised Learning of Accurate Siamese Tracking

CVPR 2022 (arXiv)
0
citations

Learning Video Representations of Human Motion From Synthetic Data

CVPR 2022
0
citations

Cross Domain Object Detection by Target-Perceived Dual Branch Distillation

CVPR 2022 (arXiv)
0
citations

Temporal Complementarity-Guided Reinforcement Learning for Image-to-Video Person Re-Identification

CVPR 2022
0
citations

MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos

CVPR 2023
0
citations

LidarGait: Benchmarking 3D Gait Recognition With Point Clouds

CVPR 2023 (arXiv)
0
citations

STM: SpatioTemporal and Motion Encoding for Action Recognition

ICCV 2019
0
citations

Dynamic Curriculum Learning for Imbalanced Data Classification

ICCV 2019
0
citations

Online Hyper-Parameter Learning for Auto-Augmentation Strategy

ICCV 2019
0
citations

RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments

CVPR 2025
0
citations

Incorporating Convolution Designs Into Visual Transformers

ICCV 2021 (arXiv)
0
citations

Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-Trained Vision-Language Models

ICCV 2023 (arXiv)
0
citations

ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation

ICCV 2023 (arXiv)
0
citations

Scalable Video Object Segmentation with Simplified Framework

ICCV 2023 (arXiv)
0
citations

Class-wise Dynamic Graph Convolution for Semantic Segmentation

ECCV 2020
0
citations

L-Tracing: Fast Light Visibility Estimation on Neural Surfaces by Sphere Tracing

ECCV 2022
0
citations

Backbone Is All Your Need: A Simplified Architecture for Visual Object Tracking

ECCV 2022
0
citations

AM-LFS: AutoML for Loss Function Search

ICCV 2019
0
citations

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

CVPR 2025
0
citations

InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation

ICCV 2025
0
citations

GeoFormer: Geometry Point Encoder for 3D Object Detection with Graph-based Transformer

ICCV 2025
0
citations

FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers

ICCV 2025
0
citations

DynaAct: Large Language Model Reasoning with Dynamic Action Spaces

NeurIPS 2025
0
citations

Causal Inference over Visual-Semantic-Aligned Graph for Image Classification

AAAI 2025
0
citations

Synergy of GFlowNet and Protein Language Model Makes a Diverse Antibody Designer

AAAI 2025
0
citations

PointCNN: Convolution On X-Transformed Points

NeurIPS 2018
0
citations

Glyce: Glyph-vectors for Chinese Character Representations

NeurIPS 2019
0
citations

Zero-Resource Knowledge-Grounded Dialogue Generation

NeurIPS 2020
0
citations

Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models

NeurIPS 2022
0
citations

Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis

ICML 2018
0
citations