Update README.md
README.md
CHANGED
@@ -22,7 +22,7 @@ metrics:
 
 **Llama-CRAFTS** (**C**ontext **R**ich **A**nd **F**ine-**T**uned On **S**ynthetic Data) is a Llama-3-8B model fine-tuned for the **Builder Action Prediction (BAP)** task in Minecraft. The model predicts a sequence of block placements or removals based on the current game context.
 
-This model establishes a new **state-of-the-art** on the task, achieving an F1 score of **53.0**, a 6-point improvement over the previous SOTA (Nebula). Its development is part of a holistic re-examination of the BAP task itself, introducing an improved evaluation framework, new synthetic datasets, and enhanced modeling techniques, thereby forming **BAP v2**, an enhanced task framework.
+This model establishes a new **state-of-the-art** on the task, achieving an F1 score of **53.0**, a 6-point improvement over the previous SOTA ([Nebula](https://arxiv.org/abs/2406.18164)). Its development is part of a holistic re-examination of the BAP task itself, introducing an improved evaluation framework, new synthetic datasets, and enhanced modeling techniques, thereby forming **BAP v2**, an enhanced task framework.
 
 ### Key Features:
 * **State-of-the-Art Performance**: Achieves the highest score on the BAP v2 benchmark.