Kai Zhang

45
Papers
1,404
Total Citations
1
Affiliation

Affiliations

Department of Computer Science and Engineering, The Ohio State University

Papers (45)

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

ICLR 2024
252
citations

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

ECCV 2024
245
citations

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model

ICLR 2024
227
citations

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

ICLR 2024
154
citations

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

ICLR 2025
86
citations

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

NeurIPS 2025
81
citations

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

CVPR 2025
61
citations

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

ICCV 2025
56
citations

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

CVPR 2024
39
citations

UMIE: Unified Multimodal Information Extraction with Instruction Tuning

AAAI 2024, arXiv
29
citations

Deep Equilibrium Diffusion Restoration with Parallel Sampling

CVPR 2024
23
citations

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

ICCV 2025
22
citations

CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

CVPR 2025, arXiv
22
citations

Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising

CVPR 2024
18
citations

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction

ICCV 2025, arXiv
16
citations

DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model

CVPR 2024
14
citations

ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression

CVPR 2025, arXiv
13
citations

RelitLRM: Generative Relightable Radiance for Large Reconstruction Models

ICLR 2025, arXiv
11
citations

Gaussian Mixture Flow Matching Models

ICML 2025
8
citations

Turbo3D: Ultra-fast Text-to-3D Generation

CVPR 2025
6
citations

DATENeRF: Depth-Aware Text-based Editing of NeRFs

ECCV 2024
5
citations

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

CVPR 2025
4
citations

Generating 3D-Consistent Videos from Unposed Internet Photos

CVPR 2025
4
citations

A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking

NeurIPS 2025
4
citations

Enhancing Low-Light Images: A Synthetic Data Perspective on Practical and Generalizable Solutions

AAAI 2025
2
citations

Reverse Convolution and Its Applications to Image Restoration

ICCV 2025, arXiv
1
citation

Intent Oriented Contrastive Learning for Sequential Recommendation

AAAI 2025
1
citation

PathAsst: A Generative Foundation AI Assistant towards Artificial General Intelligence of Pathology

AAAI 2024, arXiv
0
citations

Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

AAAI 2025
0
citations

Equivariant Multi-Modality Image Fusion

CVPR 2024
0
citations

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

CVPR 2024
0
citations

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

CVPR 2024
0
citations

Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection

AAAI 2025
0
citations

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

CVPR 2024
0
citations

RayZer: A Self-supervised Large View Synthesis Model

ICCV 2025
0
citations

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

CVPR 2025
0
citations

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

CVPR 2025
0
citations

High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion

ICML 2024
0
citations

Federated Self-Explaining GNNs with Anti-shortcut Augmentations

ICML 2024
0
citations

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

ICML 2024
0
citations

Lightweight Image Super-Resolution via Flexible Meta Pruning

ICML 2024
0
citations

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

AAAI 2025
0
citations

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

ICML 2024
0
citations

Adaptive Multimodal Fusion: Dynamic Attention Allocation for Intent Recognition

AAAI 2025
0
citations

DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images

AAAI 2024
0
citations