Yujiu Yang

Papers: 39

Total Citations: 319

Papers (39)

Improving Video Generation with Human Feedback

CoSeR: Bridging Image and Language for Cognitive Super-Resolution

IDOL: Instant Photorealistic 3D Human Creation from a Single Image

Spurious Feature Diversification Improves Out-of-distribution Generalization

Universal Segmentation at Arbitrary Granularity with Language Instruction

Scalable Image Tokenization with Index Backpropagation Quantization

InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models

IDA-VLM: Towards Movie Understanding via ID-Aware Large Vision-Language Model

CoHD: A Counting-Aware Hierarchical Decoding Framework for Generalized Referring Expression Segmentation

HyperSeg: Hybrid Segmentation Assistant with Fine-grained Visual Perceiver

Accelerating Neural Network Optimization Through an Automated Control Theory Lens

Seeing What You Miss: Vision-Language Pre-Training With Semantic Completion Learning

3D GAN Inversion With Facial Symmetry Prior

GLeaD: Improving GANs With a Generator-Leading Task

RIFormer: Keep Your Vision Backbone Effective but Removing Token Mixer

MAP: Multimodal Uncertainty-Aware Vision-Language Pre-Training Model

Global Knowledge Calibration for Fast Open-Vocabulary Segmentation

Masked Autoencoders Are Stronger Knowledge Distillers

ToonTalker: Cross-Domain Face Reenactment

UniKD: Universal Knowledge Distillation for Mimicking Homogeneous or Heterogeneous Object Detectors

Sparse Adversarial Attack via Perturbation Factorization

High-Fidelity GAN Inversion with Padding Space

StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pre-trained StyleGAN

Learning Quality-Aware Dynamic Memory for Video Object Segmentation

Global Spectral Filter Memory Network for Video Object Segmentation

Learning Adaptive Warping for Real-World Rolling Shutter Correction

DnLUT: Ultra-Efficient Color Image Denoising via Channel-Aware Lookup Tables

ProReflow: Progressive Reflow with Decomposed Velocity

Advancing Visual Large Language Model for Multi-granular Versatile Perception

Rolling Shutter Correction with Intermediate Distortion Flow Estimation

Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

Incremental Residual Concept Bottleneck Models

Compressing Convolutional Neural Networks via Factorized Convolutional Filters

TediGAN: Text-Guided Diverse Face Image Generation and Manipulation

Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation

Adder Attention for Vision Transformer

Rethinking Alignment in Video Super-Resolution Transformers

SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation

Assessor360: Multi-sequence Network for Blind Omnidirectional Image Quality Assessment