Sparse Mixture of Experts Language Model from Scratch: Extending makeMoE with Expert Capacity • AviSoori1x • Mar 18, 2024 • 14
makeMoE: Implement a Sparse Mixture of Experts Language Model from Scratch • AviSoori1x • May 7, 2024 • 121
Train 400x faster Static Embedding Models with Sentence Transformers • tomaarsen • Jan 15, 2025 • 230
Training and Finetuning Embedding Models with Sentence Transformers • tomaarsen • May 28, 2024 • 274
Training and Finetuning Reranker Models with Sentence Transformers • tomaarsen • Mar 26, 2025 • 194
Faster Text Generation with Self-Speculative Decoding • ariG23498, melhoushi, pcuenq, reach-vb • Nov 20, 2024 • 65
Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo • ariG23498, aerdem4 • Dec 23, 2024 • 51
From PyTorch DDP to Accelerate to Trainer, mastery of distributed training with ease • muellerzr • Oct 21, 2022 • 44