EAGLE-ft (Uniform): EAGLE3 Draft Model for Qwen3-8B

EAGLE3 draft model fine-tuned on our training data with uniform per-step loss weights. This is the baseline for the weighted-loss variants.

Part of a course project evaluating per-step weighted loss functions for training EAGLE3 draft models. Full pipeline and source: https://github.com/XLOverflow/anlp_course_project

Collection: Qwen3 EAGLE3 - Weighted Loss Variants

Training

  • Framework: SpecForge (our fork: https://github.com/XLOverflow/SpecForge)
  • Target model: Qwen/Qwen3-8B
  • Draft init: AngelSlim/Qwen3-8B_eagle3
  • Data: ShareGPT-style reasoning traces (see scripts/data/ in project repo)
  • Loss weight per step: uniform (w_s = 1)
  • Epochs: 5
  • LR: 3e-5
  • Batch size: 1
  • Max length: 4096
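The uniform weighting listed above can be sketched as follows. This is a minimal illustration, not the SpecForge implementation: the helper name `weighted_step_loss` and the choice to normalize by the sum of the weights are assumptions made for the example.

```python
from typing import Sequence


def weighted_step_loss(step_losses: Sequence[float], weights: Sequence[float]) -> float:
    """Combine per-draft-step losses with per-step weights w_s.

    Hypothetical helper: normalizing by sum(weights) is an assumption,
    not necessarily what the training code does.
    """
    if len(step_losses) != len(weights):
        raise ValueError("need one weight per draft step")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * l for w, l in zip(weights, step_losses)) / total_weight


# Uniform baseline: w_s = 1 for every step, so the result is the plain mean.
losses = [2.0, 1.0, 0.5, 0.5]  # made-up per-step losses, for illustration only
uniform = [1.0] * len(losses)
print(weighted_step_loss(losses, uniform))
```

With uniform weights every draft step contributes equally. A non-uniform variant would pass a different `weights` list and leave the rest unchanged.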

Evaluation (Qwen3-8B target)

Dataset    τ (accept. length)   Speedup   Accuracy
GSM8K      6.909                4.275×    95.60%
MATH500    6.758                4.245×    94.60%

Baselines for reference: Vanilla ≈ 1× speedup, EAGLE-orig ≈ 2× speedup.
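For readers unfamiliar with the two metrics, the sketch below shows one common way to compute them. It assumes that τ is the mean number of tokens committed per draft-and-verify cycle and that speedup is the ratio of decoding throughput with and without the draft model. The function names and sample numbers are hypothetical and are not taken from the project's evaluation scripts.

```python
from typing import Sequence


def mean_accept_length(tokens_per_cycle: Sequence[int]) -> float:
    """Average tokens committed per draft-and-verify cycle (assumed definition of tau)."""
    if not tokens_per_cycle:
        raise ValueError("need at least one cycle")
    return sum(tokens_per_cycle) / len(tokens_per_cycle)


def speedup(spec_tokens: int, spec_seconds: float,
            base_tokens: int, base_seconds: float) -> float:
    """Throughput with speculative decoding divided by vanilla decoding throughput."""
    return (spec_tokens / spec_seconds) / (base_tokens / base_seconds)


# Made-up numbers, for illustration only (not measurements from this model).
print(mean_accept_length([7, 6, 8, 7]))
print(speedup(400, 2.0, 400, 8.0))
```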

Files

  • model.safetensors: draft model weights (~763 MB)
  • config.json: model config
  • Corresponds to: outputs/eagle3-baseline-uniform/epoch_4_step_82000 in the original training output

Optimizer state (~3 GB) is not uploaded, so training cannot be resumed exactly from this checkpoint. Use the project repo's training scripts to retrain from scratch if needed.

Usage

from huggingface_hub import snapshot_download
draft_path = snapshot_download(repo_id="XLOverflow/qwen3-eagle3-baseline-uniform")
# Then load with EAGLE's EaModel; see scripts/eval/eval_combined.py in the project repo.
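
Before wiring the checkpoint into an evaluation script, it can help to check what was actually downloaded. The sketch below uses only the standard library. `describe_draft_dir` is a hypothetical helper, and it assumes only that the snapshot directory contains the `config.json` listed under Files.

```python
import json
from pathlib import Path


def describe_draft_dir(draft_path: str) -> dict:
    """Return the parsed config.json plus file sizes (in MB) for a local snapshot."""
    root = Path(draft_path)
    config = json.loads((root / "config.json").read_text(encoding="utf-8"))
    sizes_mb = {
        p.name: round(p.stat().st_size / 1e6, 1)
        for p in sorted(root.iterdir())
        if p.is_file()
    }
    return {"config": config, "sizes_mb": sizes_mb}


# Usage, with draft_path as returned by snapshot_download:
# info = describe_draft_dir(draft_path)
# print(info["sizes_mb"])
```

If the download completed, the reported size of `model.safetensors` should be close to the ~763 MB noted under Files.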