Lei Zhang

73
Papers
1,256
Total Citations

Papers (73)

SeeSR: Towards Semantics-Aware Real-World Image Super-Resolution

CVPR 2024
256
citations

Pixel-Aware Stable Diffusion for Realistic Image Super-Resolution and Personalized Stylization

ECCV 2024
234
citations

Osprey: Pixel Understanding with Visual Instruction Tuning

CVPR 2024
147
citations

DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation

ICLR 2024
78
citations

ChatTime: A Unified Multimodal Time Series Foundation Model Bridging Numerical and Textual Data

AAAI 2025
72
citations

Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts

ICLR 2024
54
citations

Visual In-Context Prompting

CVPR 2024
52
citations

Implicit Discriminative Knowledge Learning for Visible-Infrared Person Re-Identification

CVPR 2024
51
citations

Scaling Speech-Text Pre-training with Synthetic Interleaved Data

ICLR 2025
39
citations

CoCoCo: Improving Text-Guided Video Inpainting for Better Consistency, Controllability and Compatibility

AAAI 2025
38
citations

Open-World Human-Object Interaction Detection via Multi-modal Prompts

CVPR 2024
31
citations

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

ECCV 2024
26
citations

Adversarial Diffusion Compression for Real-World Image Super-Resolution

CVPR 2025
25
citations

Toward Generalized Image Quality Assessment: Relaxing the Perfect Reference Quality Assumption

CVPR 2025
16
citations

Hierarchical Context Pruning: Optimizing Real-World Code Completion with Repository-Level Pretrained Code LLMs

AAAI 2025
15
citations

Self-Supervised Video Desmoking for Laparoscopic Surgery

ECCV 2024
15
citations

Referring to Any Person

ICCV 2025
13
citations

ScatterFormer: Efficient Voxel Transformer with Scattered Linear Attention

ECCV 2024
13
citations

Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM

CVPR 2024
12
citations

SkillMimic: Learning Basketball Interaction Skills from Demonstrations

CVPR 2025
12
citations

Neural Super-Resolution for Real-time Rendering with Radiance Demodulation

CVPR 2024
9
citations

Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution

ICCV 2025
9
citations

Symbol as Points: Panoptic Symbol Spotting via Point-based Representation

ICLR 2024
9
citations

Generalizable Sensor-Based Activity Recognition via Categorical Concept Invariant Learning

AAAI 2025
7
citations

D^2iT: Dynamic Diffusion Transformer for Accurate Image Generation

CVPR 2025
6
citations

HandOS: 3D Hand Reconstruction in One Stage

CVPR 2025
5
citations

HumanMM: Global Human Motion Recovery from Multi-shot Videos

CVPR 2025
3
citations

Integrating Visual Interpretation and Linguistic Reasoning for Geometric Problem Solving

ICCV 2025
3
citations

SyncNoise: Geometrically Consistent Noise Prediction for Instruction-based 3D Editing

AAAI 2025
2
citations

PASS: Path-selective State Space Model for Event-based Recognition

NeurIPS 2025
1
citation

Reverse Convolution and Its Applications to Image Restoration

ICCV 2025
1
citation

Multi-Edge Reinforced Collaborative Data Acquisition for Continuous Video Analytics by Prioritizing Quality over Quantity

AAAI 2025
1
citation

The Underappreciated Power of Vision Models for Graph Structural Understanding

NeurIPS 2025
1
citation

Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models

CVPR 2024
0
citations

Linguistic-Aware Patch Slimming Framework for Fine-grained Cross-Modal Alignment

CVPR 2024
0
citations

UniVS: Unified and Universal Video Segmentation with Prompts as Queries

CVPR 2024
0
citations

Efficient Scene Recovery Using Luminous Flux Prior

CVPR 2024
0
citations

Uncertainty-Aware Source-Free Adaptive Image Super-Resolution with Wavelet Augmentation Transformer

CVPR 2024
0
citations

State-Constrained Zero-Sum Differential Games with One-Sided Information

ICML 2024
0
citations

DNA-SE: Towards Deep Neural-Nets Assisted Semiparametric Estimation

ICML 2024
0
citations

HumanTOMATO: Text-aligned Whole-body Motion Generation

ICML 2024
0
citations

Low-Biased General Annotated Dataset Generation

CVPR 2025
0
citations

RORem: Training a Robust Object Remover with Human-in-the-Loop

CVPR 2025
0
citations

Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach

CVPR 2025
0
citations

MaSS13K: A Matting-level Semantic Segmentation Benchmark

CVPR 2025
0
citations

Progressive Rendering Distillation: Adapting Stable Diffusion for Instant Text-to-Mesh Generation without 3D Data

CVPR 2025
0
citations

LeanGaussian: Breaking Pixel or Point Cloud Correspondence in Modeling 3D Gaussians

CVPR 2025
0
citations

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

CVPR 2025
0
citations

FeedEdit: Text-Based Image Editing with Dynamic Feedback Regulation

CVPR 2025
0
citations

Prior-aware Dynamic Temporal Modeling Framework for Sequential 3D Hand Pose Estimation

ICCV 2025
0
citations

FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models

ICCV 2025
0
citations

Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection

ICCV 2025
0
citations

UniGS: Modeling Unitary 3D Gaussians for Novel View Synthesis from Sparse-view Images

ICCV 2025
0
citations

ForgeLens: Data-Efficient Forgery Focus for Generalizable Forgery Image Detection

ICCV 2025
0
citations

Fine-structure Preserved Real-world Image Super-resolution via Transfer VAE Training

ICCV 2025
0
citations

Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning

ICCV 2025
0
citations

Hierarchy-Aware Pseudo Word Learning with Text Adaptation for Zero-Shot Composed Image Retrieval

ICCV 2025
0
citations

Dual-Temporal Exemplar Representation Network for Video Semantic Segmentation

ICCV 2025
0
citations

InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction

ICCV 2025
0
citations

Perceive, Understand and Restore: Real-World Image Super-Resolution with Autoregressive Multimodal Generative Models

ICCV 2025
0
citations

Polyline Path Masked Attention for Vision Transformer

NeurIPS 2025
0
citations

SLRL: Semi-Supervised Local Community Detection Based on Reinforcement Learning

AAAI 2025
0
citations

CustomContrast: A Multilevel Contrastive Perspective for Subject-Driven Text-to-Image Customization

AAAI 2025
0
citations

GaussianSR: High Fidelity 2D Gaussian Splatting for Arbitrary-Scale Image Super-Resolution

AAAI 2025
0
citations

Manta: Enhancing Mamba for Few-Shot Action Recognition of Long Sub-Sequence

AAAI 2025
0
citations

MARS: Mixture of Auto-Regressive Models for Fine-grained Text-to-image Synthesis

AAAI 2025
0
citations

GapMatch: Bridging Instance and Model Perturbations for Enhanced Semi-Supervised Medical Image Segmentation

AAAI 2025
0
citations

Adversarial Contrastive Graph Augmentation with Counterfactual Regularization

AAAI 2025
0
citations

Imitate Before Detect: Aligning Machine Stylistic Preference for Machine-Revised Text Detection

AAAI 2025
0
citations

Fine-Tuning Language Models with Collaborative and Semantic Experts

AAAI 2025
0
citations

Dynamic Weighted Combiner for Mixed-Modal Image Retrieval

AAAI 2024
0
citations

Identification of Necessary Semantic Undertakers in the Causal View for Image-Text Matching

AAAI 2024
0
citations

Gradual Residuals Alignment: A Dual-Stream Framework for GAN Inversion and Image Attribute Editing

AAAI 2024
0
citations