Fan Zhang

Papers: 35

Total Citations: 1,482

Papers (35)

VBench: Comprehensive Benchmark Suite for Video Generative Models

Generative Multimodal Models are In-Context Learners

Atlantis: Enabling Underwater Depth Estimation with Stable Diffusion

HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

Fine-grained Prototypical Voting with Heterogeneous Mixup for Semi-supervised 2D-3D Cross-modal Retrieval

HumanSAM: Classifying Human-centric Forgery Videos in Human Spatial, Appearance, and Motion Anomaly

CULTURE3D: A Large-Scale and Diverse Dataset of Cultural Landmarks and Terrains for Gaussian-Based Scene Rendering

DREAM: Decoupled Discriminative Learning with Bigraph-aware Alignment for Semi-supervised 2D-3D Cross-modal Retrieval

LDMVFI: Video Frame Interpolation with Latent Diffusion Models

LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

CapsFusion: Rethinking Image-Text Data at Scale

Casual Stereoscopic Panorama Stitching

Fusing Subcategory Probabilities for Texture Classification

High-Speed Tracking With Multi-Kernel Correlation Filters

Noise-Tolerant Paradigm for Training Face Recognition CNNs

Unsupervised Instance Segmentation in Microscopy Images via Panoptic Domain Adaptation and Task Re-Weighting

Learning Temporal Consistency for Low Light Video Enhancement From Single Images

ST-MFNet: A Spatio-Temporal Multi-Flow Network for Frame Interpolation

Locally-Transferred Fisher Vectors for Texture Classification

ACFNet: Attentional Class Feature Network for Semantic Segmentation

MDCS: More Diverse Experts with Consistency Self-distillation for Long-tailed Recognition

Learning Rain Location Prior for Nighttime Deraining

HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos

Subspace Constraint and Contribution Estimation for Heterogeneous Federated Learning

GauUpdate: New Object Insertion in 3D Gaussian Fields with Consistent Global Illumination

GIViC: Generative Implicit Video Compression

AdaptiveAE: An Adaptive Exposure Strategy for HDR Capturing in Dynamic Scenes

OneGT: One-Shot Geometry-Texture Neural Rendering for Head Avatars

Blind Video Super-Resolution based on Implicit Kernels

PNVC: Towards Practical INR-based Video Compression

SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation

Distributionally Robust Local Non-parametric Conditional Estimation

HiNeRV: Video Compression with Hierarchical Encoding-based Neural Representation