Zhenyu Zhang

34
Papers
82
Total Citations

Papers (34)

Exploiting Multimodal Spatial-temporal Patterns for Video Object Tracking

AAAI 2025
38
citations

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

ICCV 2025
22
citations

Anywhere: A Multi-Agent Framework for User-Guided, Reliable, and Diverse Foreground-Conditioned Image Generation

AAAI 2025
7
citations

Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization

CVPR 2025
7
citations

AltNeRF: Learning Robust Neural Radiance Field via Alternating Depth-Pose Optimization

AAAI 2024 (arXiv)
4
citations

Describe, Don’t Dictate: Semantic Image Editing with Natural Language Intent

ICCV 2025
2
citations

ReCAP: Recursive Context-Aware Reasoning and Planning for Large Language Model Agents

NeurIPS 2025
1
citations

StrandHead: Text to Hair-Disentangled 3D Head Avatars Using Human-Centric Priors

ICCV 2025
1
citations

Pattern-Structure Diffusion for Multi-Task Learning

CVPR 2020
0
citations

Online Depth Learning Against Forgetting in Monocular Videos

CVPR 2020
0
citations

Learning To Restore 3D Face From In-the-Wild Degraded Images

CVPR 2022
0
citations

Quarantine: Sparsity Can Uncover the Trojan Attack Trigger for Free

CVPR 2022 (arXiv)
0
citations

The Principle of Diversity: Training Stronger Vision Transformers Calls for Reducing All Levels of Redundancy

CVPR 2022 (arXiv)
0
citations

Physically-Guided Disentangled Implicit Rendering for 3D Face Modeling

CVPR 2022
0
citations

Graph Transformer GANs for Graph-Constrained House Generation

CVPR 2023 (arXiv)
0
citations

ERNIE-ViLG 2.0: Improving Text-to-Image Diffusion Model With Knowledge-Enhanced Mixture-of-Denoising-Experts

CVPR 2023
0
citations

Learning To Measure the Point Cloud Reconstruction Loss in a Representation Space

CVPR 2023
0
citations

Learning Neural Proto-Face Field for Disentangled 3D Face Modeling in the Wild

CVPR 2023
0
citations

Regularizing Nighttime Weirdness: Efficient Self-Supervised Monocular Depth Estimation in the Dark

ICCV 2021 (arXiv)
0
citations

Learning Versatile 3D Shape Generation with Improved Auto-regressive Models

ICCV 2023
0
citations

Multi-modal Masked Pre-training for Monocular Panoramic Depth Completion

ECCV 2022
0
citations

RigNet: Repetitive Image Guided Network for Depth Completion

ECCV 2022
0
citations

Learning To Aggregate and Personalize 3D Face From In-the-Wild Photo Collection

CVPR 2021 (arXiv)
0
citations

Tri-Perspective View Decomposition for Geometry-Aware Depth Completion

CVPR 2024
0
citations

Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference

ICML 2024
0
citations

Sparse Cocktail: Every Sparse Pattern Every Sparse Ratio All At Once

ICML 2024
0
citations

CaM: Cache Merging for Memory-efficient LLMs Inference

ICML 2024
0
citations

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection

ICML 2024
0
citations

Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity

ICML 2024
0
citations

Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation

CVPR 2019
0
citations

You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership

NeurIPS 2021
0
citations

Sparse Winning Tickets are Data-Efficient Image Recognizers

NeurIPS 2022
0
citations

Randomized Channel Shuffling: Minimal-Overhead Backdoor Attack Detection without Clean Datasets

NeurIPS 2022
0
citations

H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models

NeurIPS 2023
0
citations