Poster "code generation tasks" Papers
2 papers found
BigDocs: An Open Dataset for Training Multimodal Models on Document and Code Tasks
Juan A. Rodriguez, Xiangru Jian, Siba Smarak Panigrahi et al.
ICLR 2025, poster, arXiv:2412.04626
5 citations
Getting the most out of your tokenizer for pre-training and domain adaptation
Gautier Dagan, Gabriel Synnaeve, Baptiste Roziere
ICML 2024, poster