Kai Zhang
45
Papers
1,404
Total Citations
1
Affiliation
Affiliations
Department of Computer Science and Engineering, The Ohio State University
Papers (45)
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
ICLR 2024
252
citations
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting
ECCV 2024
245
citations
DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model
ICLR 2024
227
citations
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction
ICLR 2024
154
citations
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
ICLR 2025
86
citations
RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval
NeurIPS 2025
81
citations
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
CVPR 2025
61
citations
Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats
ICCV 2025
56
citations
DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception
CVPR 2024
39
citations
UMIE: Unified Multimodal Information Extraction with Instruction Tuning
AAAI 2024
29
citations
Deep Equilibrium Diffusion Restoration with Parallel Sampling
CVPR 2024
23
citations
STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution
ICCV 2025
22
citations
CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology
CVPR 2025
22
citations
Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising
CVPR 2024
18
citations
Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction
ICCV 2025
16
citations
DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model
CVPR 2024
14
citations
ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression
CVPR 2025
13
citations
RelitLRM: Generative Relightable Radiance for Large Reconstruction Models
ICLR 2025
11
citations
Gaussian Mixture Flow Matching Models
ICML 2025
8
citations
Turbo3D: Ultra-fast Text-to-3D Generation
CVPR 2025
6
citations
DATENeRF: Depth-Aware Text-based Editing of NeRFs
ECCV 2024
5
citations
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
CVPR 2025
4
citations
Generating 3D-Consistent Videos from Unposed Internet Photos
CVPR 2025
4
citations
A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking
NeurIPS 2025
4
citations
Enhancing Low-Light Images: A Synthetic Data Perspective on Practical and Generalizable Solutions
AAAI 2025
2
citations
Reverse Convolution and Its Applications to Image Restoration
ICCV 2025
1
citation
Intent Oriented Contrastive Learning for Sequential Recommendation
AAAI 2025
1
citation
PathAsst: A Generative Foundation AI Assistant towards Artificial General Intelligence of Pathology
AAAI 2024
0
citations
Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation
AAAI 2025
0
citations
Equivariant Multi-Modality Image Fusion
CVPR 2024
0
citations
GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation
CVPR 2024
0
citations
MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
CVPR 2024
0
citations
Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection
AAAI 2025
0
citations
Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling
CVPR 2024
0
citations
RayZer: A Self-supervised Large View Synthesis Model
ICCV 2025
0
citations
DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution
CVPR 2025
0
citations
MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data
CVPR 2025
0
citations
High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion
ICML 2024
0
citations
Federated Self-Explaining GNNs with Anti-shortcut Augmentations
ICML 2024
0
citations
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
ICML 2024
0
citations
Lightweight Image Super-Resolution via Flexible Meta Pruning
ICML 2024
0
citations
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers
AAAI 2025
0
citations
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
ICML 2024
0
citations
Adaptive Multimodal Fusion: Dynamic Attention Allocation for Intent Recognition
AAAI 2025
0
citations
DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images
AAAI 2024
0
citations