Kai Zhang

Papers: 79
Total Citations: 1,458
Affiliations: 1

Affiliations

Department of Computer Science and Engineering, The Ohio State University

Papers (79)

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

ICLR 2024
252
citations

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

ECCV 2024
245
citations

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model

ICLR 2024
227
citations

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

ICLR 2024
154
citations

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

ICLR 2025
86
citations

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

NeurIPS 2025
81
citations

Distributed Flexible Nonlinear Tensor Factorization

NeurIPS 2016 (arXiv)
65
citations

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

CVPR 2025
61
citations

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

ICCV 2025
56
citations

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

CVPR 2024
39
citations

UMIE: Unified Multimodal Information Extraction with Instruction Tuning

AAAI 2024 (arXiv)
29
citations

Deep Equilibrium Diffusion Restoration with Parallel Sampling

CVPR 2024
23
citations

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

ICCV 2025
22
citations

CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

CVPR 2025 (arXiv)
22
citations

Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising

CVPR 2024
18
citations

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction

ICCV 2025 (arXiv)
16
citations

DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model

CVPR 2024
14
citations

ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression

CVPR 2025 (arXiv)
13
citations

Gaussian Mixture Flow Matching Models

ICML 2025
8
citations

Turbo3D: Ultra-fast Text-to-3D Generation

CVPR 2025
6
citations

DATENeRF: Depth-Aware Text-based Editing of NeRFs

ECCV 2024
5
citations

A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking

NeurIPS 2025
4
citations

Generating 3D-Consistent Videos from Unposed Internet Photos

CVPR 2025
4
citations

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

CVPR 2025
4
citations

Enhancing Low-Light Images: A Synthetic Data Perspective on Practical and Generalizable Solutions

AAAI 2025
2
citations

Intent Oriented Contrastive Learning for Sequential Recommendation

AAAI 2025
1
citations

Reverse Convolution and Its Applications to Image Restoration

ICCV 2025 (arXiv)
1
citations

PhySG: Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing and Relighting

CVPR 2021 (arXiv)
0
citations

GMOT-40: A Benchmark for Generic Multiple Object Tracking

CVPR 2021
0
citations

PQA: Perceptual Question Answering

CVPR 2021 (arXiv)
0
citations

The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures

CVPR 2021 (arXiv)
0
citations

IRON: Inverse Rendering by Optimizing Neural SDFs and Materials From Photometric Images

CVPR 2022 (arXiv)
0
citations

ClusterGNN: Cluster-Based Coarse-To-Fine Graph Neural Network for Efficient Feature Matching

CVPR 2022 (arXiv)
0
citations

Event-Based Frame Interpolation With Ad-Hoc Deblurring

CVPR 2023 (arXiv)
0
citations

CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution

CVPR 2023 (arXiv)
0
citations

Designing a Practical Degradation Model for Deep Blind Image Super-Resolution

ICCV 2021 (arXiv)
0
citations

Towards Flexible Blind JPEG Artifacts Removal

ICCV 2021 (arXiv)
0
citations

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution

ICCV 2021 (arXiv)
0
citations

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling

ICCV 2021 (arXiv)
0
citations

Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation

ICCV 2023 (arXiv)
0
citations

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

ICCV 2023 (arXiv)
0
citations

DHP: Differentiable Meta Pruning via HyperNetworks

ECCV 2020
0
citations

Reference-Based Image Super-Resolution with Deformable Attention Transformer

ECCV 2022
0
citations

Towards Interpretable Video Super-Resolution via Alternating Optimization

ECCV 2022
0
citations

ARF: Artistic Radiance Fields

ECCV 2022
0
citations

Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style

ICCV 2019
0
citations

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

CVPR 2025
0
citations

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

CVPR 2025
0
citations

RayZer: A Self-supervised Large View Synthesis Model

ICCV 2025
0
citations

Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection

AAAI 2025
0
citations

Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

AAAI 2025
0
citations

Adaptive Multimodal Fusion: Dynamic Attention Allocation for Intent Recognition

AAAI 2025
0
citations

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

AAAI 2025
0
citations

DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images

AAAI 2024
0
citations

PathAsst: A Generative Foundation AI Assistant towards Artificial General Intelligence of Pathology

AAAI 2024 (arXiv)
0
citations

Equivariant Multi-Modality Image Fusion

CVPR 2024
0
citations

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

CVPR 2024
0
citations

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

CVPR 2024
0
citations

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

CVPR 2024
0
citations

High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion

ICML 2024
0
citations

Federated Self-Explaining GNNs with Anti-shortcut Augmentations

ICML 2024
0
citations

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

ICML 2024
0
citations

Lightweight Image Super-Resolution via Flexible Meta Pruning

ICML 2024
0
citations

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

ICML 2024
0
citations

Learning Deep CNN Denoiser Prior for Image Restoration

CVPR 2017 (arXiv)
0
citations

Learning a Single Convolutional Super-Resolution Network for Multiple Degradations

CVPR 2018 (arXiv)
0
citations

Deep Plug-And-Play Super-Resolution for Arbitrary Blur Kernels

CVPR 2019
0
citations

Toward Convolutional Blind Denoising of Real Photographs

CVPR 2019
0
citations

Deep Unfolding Network for Image Super-Resolution

CVPR 2020 (arXiv)
0
citations

Depth Sensing Beyond LiDAR Range

CVPR 2020 (arXiv)
0
citations

Neural Blind Deconvolution Using Deep Priors

CVPR 2020 (arXiv)
0
citations

Flow-Based Kernel Prior With Application to Blind Super-Resolution

CVPR 2021 (arXiv)
0
citations

Recurrent Video Restoration Transformer with Guided Deformable Attention

NeurIPS 2022
0
citations

WT-MVSNet: Window-based Transformers for Multi-view Stereo

NeurIPS 2022
0
citations

SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization

NeurIPS 2022
0
citations

APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction

NeurIPS 2022
0
citations

AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking

NeurIPS 2023
0
citations

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

NeurIPS 2023
0
citations

Greedy Orthogonal Pivoting Algorithm for Non-Negative Matrix Factorization

ICML 2019
0
citations