Rongjie Huang
10
Papers
159
Total Citations
Papers (10)
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
ICLR 2025
125
citations
TechSinger: Technique Controllable Multilingual Singing Voice Synthesis via Flow Matching
AAAI 2025
16
citations
OmniSep: Unified Omni-Modality Sound Separation with Query-Mixup
ICLR 2025
10
citations
Lumina-T2X: Scalable Flow-based Large Diffusion Transformer for Flexible Resolution Generation
ICLR 2025
8
citations
UniAudio: Towards Universal Audio Generation with Large Language Models
ICML 2024
0
citations
InstructSpeech: Following Speech Editing Instructions via Large Language Models
ICML 2024
0
citations
FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion
ICML 2024
0
citations
MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition
ICCV 2023
0
citations
M4Singer: A Multi-Style, Multi-Singer and Musical Score Provided Mandarin Singing Corpus
NeurIPS 2022
0
citations
GenerSpeech: Towards Style Transfer for Generalizable Out-Of-Domain Text-to-Speech
NeurIPS 2022
0
citations