Zhuo Chen

Papers: 18

Total Citations: 225

Papers (18)

ELLA-V: Stable Neural Codec Language Modeling with Alignment-Guided Sequence Reordering

Structure-CLIP: Towards Scene Graph Knowledge to Enhance Multi-Modal Structured Representations

Language Model Can Listen While Speaking

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning

Multimodal Latent Diffusion Model for Complex Sewing Pattern Generation

3D-Aware Face Editing via Warping-Guided Latent Direction Learning

AniSDF: Fused-Granularity Neural Surfaces with Anisotropic Encoding for High-Fidelity 3D Reconstruction

Infer the Whole from a Glimpse of a Part: Keypoint-Based Knowledge Graph for Vehicle Re-Identification

TENG: Time-Evolving Natural Gradient for Solving PDEs With Deep Neural Nets Toward Machine Precision

One-for-More: Continual Diffusion Model for Anomaly Detection

Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems

Dataset Distillation as Data Compression: A Rate-Utility Perspective

AMDANet: Attention-Driven Multi-Perspective Discrepancy Alignment for RGB-Infrared Image Fusion and Segmentation

K-ON: Stacking Knowledge on the Head Layer of Large Language Model

Tokenization, Fusion, and Augmentation: Towards Fine-grained Multi-modal Entity Representation

UniMix: Towards Domain Adaptive and Generalizable LiDAR Semantic Segmentation in Adverse Weather

Scaling Mesh Generation via Compressive Tokenization