Wei Wu
Papers: 54
Total Citations: 137
Papers (54)
Language-Image Pre-training with Long Captions
ECCV 2024
63
citations
Theoretical Benefit and Limitation of Diffusion Language Model
NeurIPS 2025 (arXiv)
27
citations
Benchmarking Large Vision-Language Models via Directed Scene Graph for Comprehensive Image Captioning
CVPR 2025
18
citations
FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models
ECCV 2024
11
citations
OmniCache: A Trajectory-Oriented Global Perspective on Training-Free Cache Reuse for Diffusion Transformer Models
ICCV 2025
6
citations
ProtCLIP: Function-Informed Protein Multi-Modal Learning
AAAI 2025
5
citations
Learning Visual Generative Priors without Text
CVPR 2025
4
citations
DriveScape: High-Resolution Driving Video Generation by Multi-View Feature Fusion
CVPR 2025
3
citations
SwiftPillars: High-Efficiency Pillar Encoder for Lidar-Based 3D Detection
AAAI 2024
0
citations
HoloVIC: Large-scale Dataset and Benchmark for Multi-Sensor Holographic Intersection and Vehicle-Infrastructure Cooperative
CVPR 2024
0
citations
End-to-End Flow Correlation Tracking With Spatial-Temporal Attention
CVPR 2018 (arXiv)
0
citations
Practical Block-Wise Neural Network Architecture Generation
CVPR 2018 (arXiv)
0
citations
High Performance Visual Tracking With Siamese Region Proposal Network
CVPR 2018
0
citations
Feedback Network for Image Super-Resolution
CVPR 2019
0
citations
SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks
CVPR 2019
0
citations
IRLAS: Inverse Reinforcement Learning for Architecture Search
CVPR 2019
0
citations
Selective Sensor Fusion for Neural Visual-Inertial Odometry
CVPR 2019
0
citations
Adaptive Dilated Network With Self-Correction Supervision for Counting
CVPR 2020
0
citations
Improving One-Shot NAS by Suppressing the Posterior Fading
CVPR 2020 (arXiv)
0
citations
Hierarchical Feature Embedding for Attribute Recognition
CVPR 2020 (arXiv)
0
citations
Temporal Context Aggregation Network for Temporal Action Proposal Refinement
CVPR 2021 (arXiv)
0
citations
Learning Statistical Texture for Semantic Segmentation
CVPR 2021 (arXiv)
0
citations
Spatial-Temporal Correlation and Topology Learning for Person Re-Identification in Videos
CVPR 2021 (arXiv)
0
citations
Target-Relevant Knowledge Preservation for Multi-Source Domain Adaptive Object Detection
CVPR 2022 (arXiv)
0
citations
Unsupervised Learning of Accurate Siamese Tracking
CVPR 2022 (arXiv)
0
citations
Learning Video Representations of Human Motion From Synthetic Data
CVPR 2022
0
citations
Cross Domain Object Detection by Target-Perceived Dual Branch Distillation
CVPR 2022 (arXiv)
0
citations
Temporal Complementarity-Guided Reinforcement Learning for Image-to-Video Person Re-Identification
CVPR 2022
0
citations
MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos
CVPR 2023
0
citations
LidarGait: Benchmarking 3D Gait Recognition With Point Clouds
CVPR 2023 (arXiv)
0
citations
STM: SpatioTemporal and Motion Encoding for Action Recognition
ICCV 2019
0
citations
Dynamic Curriculum Learning for Imbalanced Data Classification
ICCV 2019
0
citations
Online Hyper-Parameter Learning for Auto-Augmentation Strategy
ICCV 2019
0
citations
RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments
CVPR 2025
0
citations
Incorporating Convolution Designs Into Visual Transformers
ICCV 2021 (arXiv)
0
citations
Regularized Mask Tuning: Uncovering Hidden Knowledge in Pre-Trained Vision-Language Models
ICCV 2023 (arXiv)
0
citations
ActFormer: A GAN-based Transformer towards General Action-Conditioned 3D Human Motion Generation
ICCV 2023 (arXiv)
0
citations
Scalable Video Object Segmentation with Simplified Framework
ICCV 2023 (arXiv)
0
citations
Class-wise Dynamic Graph Convolution for Semantic Segmentation
ECCV 2020
0
citations
L-Tracing: Fast Light Visibility Estimation on Neural Surfaces by Sphere Tracing
ECCV 2022
0
citations
Backbone Is All Your Need: A Simplified Architecture for Visual Object Tracking
ECCV 2022
0
citations
AM-LFS: AutoML for Loss Function Search
ICCV 2019
0
citations
UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection
CVPR 2025
0
citations
InstaDrive: Instance-Aware Driving World Models for Realistic and Consistent Video Generation
ICCV 2025
0
citations
GeoFormer: Geometry Point Encoder for 3D Object Detection with Graph-based Transformer
ICCV 2025
0
citations
FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers
ICCV 2025
0
citations
DynaAct: Large Language Model Reasoning with Dynamic Action Spaces
NeurIPS 2025
0
citations
Causal Inference over Visual-Semantic-Aligned Graph for Image Classification
AAAI 2025
0
citations
Synergy of GFlowNet and Protein Language Model Makes a Diverse Antibody Designer
AAAI 2025
0
citations
PointCNN: Convolution On X-Transformed Points
NeurIPS 2018
0
citations
Glyce: Glyph-vectors for Chinese Character Representations
NeurIPS 2019
0
citations
Zero-Resource Knowledge-Grounded Dialogue Generation
NeurIPS 2020
0
citations
Moderate-fitting as a Natural Backdoor Defender for Pre-trained Language Models
NeurIPS 2022
0
citations
Orthogonality-Promoting Distance Metric Learning: Convex Relaxation and Theoretical Analysis
ICML 2018
0
citations