Lei Zhang
73
Papers
1,256
Total Citations
Papers (73)
SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution
CVPR 2024
256
citations
Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization
ECCV 2024
234
citations
Osprey: Pixel Understanding with Visual Instruction Tuning
CVPR 2024
147
citations
DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation
ICLR 2024
78
citations
ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data
AAAI 2025
72
citations
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts
ICLR 2024
54
citations
Visual In-Context Prompting
CVPR 2024
52
citations
Implicit Discriminative Knowledge Learning for Visible-Infrared Person Re-Identification
CVPR 2024
51
citations
Scaling Speech-Text Pre-training with Synthetic Interleaved Data
ICLR 2025
39
citations
CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility
AAAI 2025
38
citations
Open-World Human-Object Interaction Detection via Multi-modal Prompts
CVPR 2024
31
citations
ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation
ECCV 2024
26
citations
Adversarial Diffusion Compression for Real-World Image Super-Resolution
CVPR 2025
25
citations
Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption
CVPR 2025
16
citations
Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs
AAAI 2025
15
citations
Self-Supervised Video Desmoking for Laparoscopic Surgery
ECCV 2024
15
citations
Referring to Any Person
ICCV 2025
13
citations
ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention
ECCV 2024
13
citations
Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM
CVPR 2024
12
citations
SkillMimic: Learning Basketball Interaction Skills from Demonstrations
CVPR 2025
12
citations
Neural Super-Resolution for Real-time Rendering with Radiance Demodulation
CVPR 2024
9
citations
Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution
ICCV 2025
9
citations
Symbol as Points: Panoptic Symbol Spotting via Point-based Representation
ICLR 2024
9
citations
Generalizable Sensor-Based Activity Recognition via Categorical Concept Invariant Learning
AAAI 2025
7
citations
D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation
CVPR 2025
6
citations
HandOS: 3D Hand Reconstruction in One Stage
CVPR 2025
5
citations
HumanMM: Global Human Motion Recovery from Multi-shot Videos
CVPR 2025
3
citations
Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving
ICCV 2025
3
citations
SyncNoise: Geometrically Consistent Noise Prediction for Instruction-based 3D Editing
AAAI 2025
2
citations
PASS: Path-selective State Space Model for Event-based Recognition
NeurIPS 2025
1
citation
Reverse Convolution and Its Applications to Image Restoration
ICCV 2025
1
citation
Multi-Edge Reinforced Collaborative Data Acquisition for Continuous Video Analytics by Prioritizing Quality over Quantity
AAAI 2025
1
citation
The Underappreciated Power of Vision Models for Graph Structural Understanding
NeurIPS 2025
1
citation
Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
CVPR 2024
0
citations
Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment
CVPR 2024
0
citations
UniVS: Unified and Universal Video Segmentation with Prompts as Queries
CVPR 2024
0
citations
Efficient Scene Recovery Using Luminous Flux Prior
CVPR 2024
0
citations
Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer
CVPR 2024
0
citations
State-Constrained Zero-Sum Differential Games with One-Sided Information
ICML 2024
0
citations
DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation
ICML 2024
0
citations
HumanTOMATO: Text-aligned Whole-body Motion Generation
ICML 2024
0
citations
Low-Biased General Annotated Dataset Generation
CVPR 2025
0
citations
RORem: Training a Robust Object Remover with Human-in-the-Loop
CVPR 2025
0
citations
Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach
CVPR 2025
0
citations
MaSS13K: A Matting-level Semantic Segmentation Benchmark
CVPR 2025
0
citations
Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data
CVPR 2025
0
citations
LeanGaussian: Breaking Pixel or Point Cloud Correspondence in Modeling 3D Gaussians
CVPR 2025
0
citations
OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction
CVPR 2025
0
citations
FeedEdit: Text-Based Image Editing with Dynamic Feedback Regulation
CVPR 2025
0
citations
Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation
ICCV 2025
0
citations
FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models
ICCV 2025
0
citations
Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection
ICCV 2025
0
citations
UniGS: Modeling Unitary 3D Gaussians for Novel View Synthesis from Sparse-view Images
ICCV 2025
0
citations
ForgeLens: Data-Efficient Forgery Focus for Generalizable Forgery Image Detection
ICCV 2025
0
citations
Fine-structure Preserved Real-world Image Super-resolution via Transfer VAE Training
ICCV 2025
0
citations
Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning
ICCV 2025
0
citations
Hierarchy-Aware Pseudo Word Learning with Text Adaptation for Zero-Shot Composed Image Retrieval
ICCV 2025
0
citations
Dual-Temporal Exemplar Representation Network for Video Semantic Segmentation
ICCV 2025
0
citations
InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction
ICCV 2025
0
citations
Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models
ICCV 2025
0
citations
Polyline Path Masked Attention for Vision Transformer
NeurIPS 2025
0
citations
SLRL: Semi-Supervised Local Community Detection Based on Reinforcement Learning
AAAI 2025
0
citations
CustomContrast: A Multilevel Contrastive Perspective for Subject-Driven Text-to-Image Customization
AAAI 2025
0
citations
GaussianSR: High Fidelity 2D Gaussian Splatting for Arbitrary-Scale Image Super-Resolution
AAAI 2025
0
citations
Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence
AAAI 2025
0
citations
MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis
AAAI 2025
0
citations
GapMatch: Bridging Instance and Model Perturbations for Enhanced Semi-Supervised Medical Image Segmentation
AAAI 2025
0
citations
Adversarial Contrastive Graph Augmentation with Counterfactual Regularization
AAAI 2025
0
citations
Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection
AAAI 2025
0
citations
Fine-Tuning Language Models with Collaborative and Semantic Experts
AAAI 2025
0
citations
Dynamic Weighted Combiner for Mixed-Modal Image Retrieval
AAAI 2024
0
citations
Identification of Necessary Semantic Undertakers in the Causal View for Image-Text Matching
AAAI 2024
0
citations
Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing
AAAI 2024
0
citations