Submitted by akhaliq 38 NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models · 19 authors 237 3
Submitted by akhaliq 18 Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters · 7 authors 69 1
Submitted by akhaliq 17 MathScale: Scaling Instruction Tuning for Mathematical Reasoning · 4 authors 2
Submitted by akhaliq 15 MAGID: An Automated Pipeline for Generating Synthetic Multi-modal Datasets · 10 authors 2 1
Submitted by akhaliq 13 EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs · 6 authors 3
Submitted by akhaliq 11 Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models · 6 authors 246 1
Submitted by akhaliq 11 Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use · 13 authors 1
Submitted by akhaliq 9 RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches · 13 authors 1