XLOverflow committed on
Commit 1a28e76 · verified · 1 Parent(s): ea7d27f

Add model card with training config and eval metrics

Files changed (1)
  1. README.md +13 -3
README.md CHANGED
@@ -18,6 +18,8 @@ Part of a course project evaluating per-step weighted loss functions for trainin
18
  EAGLE3 draft models. Full pipeline and source:
19
  **https://github.com/XLOverflow/anlp_course_project**
20
 
21
  ## Training
22
 
23
  - **Framework:** [SpecForge](https://github.com/sgl-project/SpecForge) (our fork: https://github.com/XLOverflow/SpecForge)
@@ -29,15 +31,23 @@ EAGLE3 draft models. Full pipeline and source:
29
  - Additional epochs: 1
30
  - β_s profiled offline via `scripts/train/profile_beta.py`
31
 
32
 
33
  ## Files
34
 
35
  - `model.safetensors` — draft model weights (~763 MB)
36
  - `config.json` — model config
37
- - Checkpoint corresponds to: `outputs/eagle3-accrate/epoch_0_step_17026` in the original training output
38
 
39
- Optimizer state (`training_state.pt`, ~3 GB) is not uploaded — use the project
40
- repo's training scripts to resume from scratch if needed.
41
 
42
  ## Usage
43
 
 
18
  EAGLE3 draft models. Full pipeline and source:
19
  **https://github.com/XLOverflow/anlp_course_project**
20
 
21
+ Collection: [Qwen3 EAGLE3 — Weighted Loss Variants](https://huggingface.co/collections/XLOverflow/qwen3-eagle3-weighted-loss-variants)
22
+
23
  ## Training
24
 
25
  - **Framework:** [SpecForge](https://github.com/sgl-project/SpecForge) (our fork: https://github.com/XLOverflow/SpecForge)
 
31
  - Additional epochs: 1
32
  - β_s profiled offline via `scripts/train/profile_beta.py`
33
 
34
+ ## Evaluation (Qwen3-8B target)
35
+
36
+ | Dataset | τ (accept. length) | Speedup | Accuracy |
37
+ |---|---|---|---|
38
+ | GSM8K | 7.359 | 4.588× | 95.15% |
39
+ | MATH500 | 7.326 | 4.606× | 95.20% |
40
+
41
+ Baselines for reference: Vanilla ≈ 1× speedup, EAGLE-orig ≈ 2× speedup.
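
As a reading aid for the table in this hunk (a sketch, not part of the committed README): the speedups appear to be measured against vanilla decoding at 1×, so the gain over the EAGLE-orig baseline can be estimated by dividing by its speedup. The README gives that baseline only as "≈ 2×", so the value `2.0` below is an assumption and the results are approximate. The helper name `relative_gain` is hypothetical.

```python
# Speedups copied from the evaluation table (assumed relative to vanilla = 1x).
reported = {"GSM8K": 4.588, "MATH500": 4.606}

# Assumption: the README states only "EAGLE-orig ≈ 2× speedup", so 2.0 is a
# rounded placeholder, not a measured per-dataset baseline.
EAGLE_ORIG_SPEEDUP = 2.0


def relative_gain(speedup: float, baseline: float) -> float:
    """Return how many times faster `speedup` is than `baseline`."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return speedup / baseline


# Estimated gain of this checkpoint over the EAGLE-orig baseline, per dataset.
gains = {name: relative_gain(s, EAGLE_ORIG_SPEEDUP) for name, s in reported.items()}

for name, gain in gains.items():
    print(f"{name}: about {gain:.2f}x over EAGLE-orig")
```

Under that assumption, both datasets come out at roughly 2.3× over EAGLE-orig. A measured baseline would replace the constant.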
42
+
43
 
44
  ## Files
45
 
46
  - `model.safetensors` — draft model weights (~763 MB)
47
  - `config.json` — model config
48
+ - Corresponds to: `outputs/eagle3-accrate/epoch_0_step_17026` in the original training output
49
 
50
+ Optimizer state (~3 GB) is not uploaded — use the project repo's training scripts to resume from scratch if needed.
 
51
 
52
  ## Usage
53