Add model card with training config and eval metrics
README.md
CHANGED
|
@@ -18,6 +18,8 @@ Part of a course project evaluating per-step weighted loss functions for trainin
|
|
| 18 |
EAGLE3 draft models. Full pipeline and source:
|
| 19 |
**https://github.com/XLOverflow/anlp_course_project**
|
| 20 |
| 21 |
## Training
|
| 22 |
|
| 23 |
- **Framework:** [SpecForge](https://github.com/sgl-project/SpecForge) (our fork: https://github.com/XLOverflow/SpecForge)
|
|
@@ -29,15 +31,23 @@ EAGLE3 draft models. Full pipeline and source:
|
|
| 29 |
- Additional epochs: 1
|
| 30 |
- β_s profiled offline via `scripts/train/profile_beta.py`
|
| 31 |
| 32 |
|
| 33 |
## Files
|
| 34 |
|
| 35 |
- `model.safetensors` — draft model weights (~763 MB)
|
| 36 |
- `config.json` — model config
|
| 37 |
-
-
|
| 38 |
|
| 39 |
-
Optimizer state (
|
| 40 |
-
repo's training scripts to resume from scratch if needed.
|
| 41 |
|
| 42 |
## Usage
|
| 43 |
|
|
|
|
| 18 |
EAGLE3 draft models. Full pipeline and source:
|
| 19 |
**https://github.com/XLOverflow/anlp_course_project**
|
| 20 |
|
+ Collection: [Qwen3 EAGLE3 — Weighted Loss Variants](https://huggingface.co/collections/XLOverflow/qwen3-eagle3-weighted-loss-variants)
+
|
| 23 |
## Training
|
| 24 |
|
| 25 |
- **Framework:** [SpecForge](https://github.com/sgl-project/SpecForge) (our fork: https://github.com/XLOverflow/SpecForge)
|
|
|
|
| 31 |
- Additional epochs: 1
|
| 32 |
- β_s profiled offline via `scripts/train/profile_beta.py`
|
| 33 |
|
+ ## Evaluation (Qwen3-8B target)
+
+ | Dataset | τ (accept. length) | Speedup | Accuracy |
+ |---|---|---|---|
+ | GSM8K | 7.359 | 4.588× | 95.15% |
+ | MATH500 | 7.326 | 4.606× | 95.20% |
+
+ Baselines for reference: Vanilla ≈ 1× speedup, EAGLE-orig ≈ 2× speedup.
|
| 42 |
+
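To put the reported speedups next to the EAGLE-orig baseline, the arithmetic can be sketched as below. This is only an illustration: the speedup values are copied from the evaluation table, the baseline is listed as approximately 2×, and treating it as exactly 2.0 is an assumption. The names `measured_speedup`, `EAGLE_ORIG_SPEEDUP`, and `relative_gain` are hypothetical and do not come from the project repo.

```python
# Speedup figures copied from the evaluation table (Qwen3-8B target).
measured_speedup = {"GSM8K": 4.588, "MATH500": 4.606}

# Assumption: the "approximately 2x" EAGLE-orig baseline is treated as exactly 2.0.
EAGLE_ORIG_SPEEDUP = 2.0


def relative_gain(speedup: float, baseline: float = EAGLE_ORIG_SPEEDUP) -> float:
    """Return how many times faster a run is than the given baseline."""
    return speedup / baseline


# Ratio of each reported speedup to the assumed baseline.
gains = {name: relative_gain(s) for name, s in measured_speedup.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:.2f}x over EAGLE-orig")
```

Because the baseline itself is approximate, the resulting ratios are rough indications rather than measured results.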
|
| 43 |
|
| 44 |
## Files
|
| 45 |
|
| 46 |
- `model.safetensors` — draft model weights (~763 MB)
|
| 47 |
- `config.json` — model config
|
+ - Corresponds to: `outputs/eagle3-accrate/epoch_0_step_17026` in the original training output

+ Optimizer state (~3 GB) is not uploaded; use the project repo's training scripts to retrain from scratch if needed.
|
|
|
|
| 51 |
|
| 52 |
## Usage
|
| 53 |
|