mrinaalarora's Collections

124M-Base-Experiments

Checkpoints from my first 124M-parameter LLM pre-training project, covering from-scratch training, continued pre-training, and SFT experiments.