amphion/Emilia-Dataset
Anime-Llasa-3B is a text-to-speech (TTS) model fine-tuned for Japanese. It is based on HKUSTAudio/Llasa-3B.
You can try a demo on Hugging Face Spaces: Anime-Llasa-3B-Demo
The primary improvement in this version is a much larger training set: approximately 33,000 hours trained for 1 epoch, up from approximately 14,000 hours trained for 3 epochs.
The larger dataset is intended to further improve the model's expressiveness and overall stability.
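To make the change in training scale concrete, the short sketch below works through the arithmetic using only the approximate figures quoted above. The variable and function names are illustrative and not part of the model's code. It assumes that "hours seen" is simply unique hours multiplied by epochs.

```python
# Approximate figures taken from the description above.
PREV_HOURS, PREV_EPOCHS = 14_000, 3
NEW_HOURS, NEW_EPOCHS = 33_000, 1


def hours_seen(unique_hours: int, epochs: int) -> int:
    """Total audio hours processed during training (unique hours x epochs)."""
    return unique_hours * epochs


prev_seen = hours_seen(PREV_HOURS, PREV_EPOCHS)
new_seen = hours_seen(NEW_HOURS, NEW_EPOCHS)
unique_ratio = NEW_HOURS / PREV_HOURS  # growth in unique training audio

print(f"Previous version: {prev_seen} hours seen")
print(f"This version:     {new_seen} hours seen")
print(f"Unique data grew by a factor of about {unique_ratio:.2f}")
```

Under that assumption, the amount of unique audio grows by more than a factor of two, while the total hours processed is somewhat lower because this version trains for a single epoch.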
This model is licensed under CC-BY-NC-4.0.
Base model
meta-llama/Llama-3.2-3B-Instruct