Kai Zhang

79 Papers

1,458 Total Citations

1 Affiliation

Affiliations

Department of Computer Science and Engineering, The Ohio State University

Papers (79)

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model

PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval

Distributed Flexible Nonlinear Tensor Factorization

NeurIPS 2016

RandAR: Decoder-only Autoregressive Visual Generation in Random Orders

Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats

DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception

UMIE: Unified Multimodal Information Extraction with Instruction Tuning

Deep Equilibrium Diffusion Restoration with Parallel Sampling

STAR: Spatial-Temporal Augmentation with Text-to-Video Models for Real-World Video Super-Resolution

CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

Unmixing Diffusion for Self-Supervised Hyperspectral Image Denoising

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction

DiffSCI: Zero-Shot Snapshot Compressive Imaging via Iterative Spectral Diffusion Model

ECVC: Exploiting Non-Local Correlations in Multiple Frames for Contextual Video Compression

Gaussian Mixture Flow Matching Models

Turbo3D: Ultra-fast Text-to-3D Generation

DATENeRF: Depth-Aware Text-based Editing of NeRFs

A Unified Solution to Video Fusion: From Multi-Frame Learning to Benchmarking

Generating 3D-Consistent Videos from Unposed Internet Photos

Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors

Enhancing Low-Light Images: A Synthetic Data Perspective on Practical and Generalizable Solutions

Intent Oriented Contrastive Learning for Sequential Recommendation

Reverse Convolution and Its Applications to Image Restoration

PhySG: Inverse Rendering With Spherical Gaussians for Physics-Based Material Editing and Relighting

GMOT-40: A Benchmark for Generic Multiple Object Tracking

PQA: Perceptual Question Answering

The Heterogeneity Hypothesis: Finding Layer-Wise Differentiated Network Architectures

IRON: Inverse Rendering by Optimizing Neural SDFs and Materials From Photometric Images

ClusterGNN: Cluster-Based Coarse-To-Fine Graph Neural Network for Efficient Feature Matching

Event-Based Frame Interpolation With Ad-Hoc Deblurring

CiaoSR: Continuous Implicit Attention-in-Attention Network for Arbitrary-Scale Image Super-Resolution

Designing a Practical Degradation Model for Deep Blind Image Super-Resolution

Towards Flexible Blind JPEG Artifacts Removal

Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution

Hierarchical Conditional Flow: A Unified Framework for Image Super-Resolution and Image Rescaling

Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation

DDFM: Denoising Diffusion Model for Multi-Modality Image Fusion

DHP: Differentiable Meta Pruning via HyperNetworks

Reference-Based Image Super-Resolution with Deformable Attention Transformer

Towards Interpretable Video Super-Resolution via Alternating Optimization

ARF: Artistic Radiance Fields

Exploring Overall Contextual Information for Image Captioning in Human-Like Cognitive Style

MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

DORNet: A Degradation Oriented and Regularized Network for Blind Depth Super-Resolution

RayZer: A Self-supervised Large View Synthesis Model

Spin: Diffusion-based Semantic Image Painting Through Independent Information Injection

Harnessing Multimodal Large Language Models for Multimodal Sequential Recommendation

Adaptive Multimodal Fusion: Dynamic Attention Allocation for Intent Recognition

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers

DiffRAW: Leveraging Diffusion Model to Generate DSLR-Comparable Perceptual Quality sRGB from Smartphone RAW Images

PathAsst: A Generative Foundation AI Assistant towards Artificial General Intelligence of Pathology

Equivariant Multi-Modality Image Fusion

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

High-Order Contrastive Learning with Fine-grained Comparative Levels for Sparse Ordinal Tensor Completion

Federated Self-Explaining GNNs with Anti-shortcut Augmentations

MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions

Lightweight Image Super-Resolution via Flexible Meta Pruning

TravelPlanner: A Benchmark for Real-World Planning with Language Agents

Learning Deep CNN Denoiser Prior for Image Restoration

Learning a Single Convolutional Super-Resolution Network for Multiple Degradations

Deep Plug-And-Play Super-Resolution for Arbitrary Blur Kernels

Toward Convolutional Blind Denoising of Real Photographs

Deep Unfolding Network for Image Super-Resolution

Depth Sensing Beyond LiDAR Range

Neural Blind Deconvolution Using Deep Priors

Flow-Based Kernel Prior With Application to Blind Super-Resolution

Recurrent Video Restoration Transformer with Guided Deformable Attention

WT-MVSNet: Window-based Transformers for Multi-view Stereo

SAViT: Structure-Aware Vision Transformer Pruning via Collaborative Optimization

APG: Adaptive Parameter Generation Network for Click-Through Rate Prediction

AdaptSSR: Pre-training User Model with Augmentation-Adaptive Self-Supervised Ranking

MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing

Greedy Orthogonal Pivoting Algorithm for Non-Negative Matrix Factorization