Gemma-3-1B-it-Int8

This version of Gemma-3-1B-it has been converted to run on the Axera NPU using w8a16 quantization.

Compatible with Pulsar2 version: 4.0

Conversion tool links:

If you are interested in model conversion, you can try exporting the axmodel from the original repo:

Supported Platforms

AX650
- M4N-Dock(爱芯派Pro)

How to use

Download all files from this repository to the device.

Using AX650 Board

ai@ai-bj ~/yongqiang/Gemma-3-1B-it $ tree -L 1
.
├── config.json
├── gemma3_axmodel
├── gemma3_tokenizer
├── infer_axmodel.py
├── README.md
└── utils

3 directories, 3 files
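
Before running inference, it can help to confirm that the download is complete. The following is a minimal sketch, not part of this repository: the helper name `missing_entries` is hypothetical, and the expected entries are simply copied from the `tree -L 1` listing above.

```python
from pathlib import Path

# Top-level entries copied from the `tree -L 1` listing above.
EXPECTED_DIRS = ("gemma3_axmodel", "gemma3_tokenizer", "utils")
EXPECTED_FILES = ("config.json", "infer_axmodel.py", "README.md")


def missing_entries(root):
    """Return the expected top-level entries that are absent under `root`.

    Hypothetical helper for illustration; directories are reported with a
    trailing slash so they are easy to tell apart from files.
    """
    root = Path(root)
    missing = [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    missing += [name for name in EXPECTED_FILES if not (root / name).is_file()]
    return missing


# Example usage (run from the directory that holds the downloaded files):
#   problems = missing_entries(".")
#   print("layout OK" if not problems else f"missing: {problems}")
```

The check only looks at the top level; it does not verify the contents of `gemma3_axmodel` or `gemma3_tokenizer`.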

Inference on an AX650 host, such as the M4N-Dock (爱芯派Pro) or the AX650N demo board

Text Generation

Input text (the Chinese query means "Please introduce yourself in Chinese."):

$ python3 infer_axmodel.py -q "请用中文介绍一下你自己."

Log output (the answer is in Chinese, as the query requested):

[INFO] Compiler version: 5.0-patch1-dirty 93949955-dirty
Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 26/26 [00:21<00:00,  1.18it/s]
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.0-patch1-dirty 93949955-dirty
Model loaded successfully!
slice_indices: [0]
Slice prefill done: 0
answer >> 您好！我是一个大型语言模型，由 Google 训练。

简单来说，我可以帮你做很多事情，比如：

*   **回答你的问题：** 无论你问什么，我都会尽力用清晰、准确的语言来回答。
*   **生成文本：** 比如写诗歌、故事、邮件、代码等等。
*   **翻译语言：** 我可以将一种语言翻译成另一种语言。
*   **总结文本：** 我可以帮你快速阅读一段文字，提取关键信息。
*   **进行创意写作：** 我们可以一起头脑风暴，一起创作故事或文章。

我还在不断学习和进步，所以我的能力也在不断提升。

我是一个工具，可以帮助你，但不能代替人类的思考和判断。

希望我能帮到你！ 你有什么想问的或者想让我做什么吗？ 😊
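
To call the script from another Python program, one option is to run it as a subprocess and keep only the text after the `answer >> ` marker seen in the log above. This is a rough sketch under stated assumptions: the helpers `build_command`, `extract_answer`, and `ask` are hypothetical, only the `-q` flag shown in this README is used, and the answer is assumed to be written to standard output (if the script logs elsewhere, the capture would need adjusting).

```python
import subprocess

ANSWER_MARKER = "answer >> "  # prefix that precedes the reply in the log above


def build_command(question, script="infer_axmodel.py"):
    """Build the argv list for the command shown in this README."""
    return ["python3", script, "-q", question]


def extract_answer(log_text):
    """Return the text after the first 'answer >> ' marker, or None if absent."""
    _, found, answer = log_text.partition(ANSWER_MARKER)
    if not found:
        return None
    # Drop ASCII control characters (below U+0020) except newline and tab.
    cleaned = "".join(ch for ch in answer if ch in "\n\t" or ch >= " ")
    return cleaned.strip()


def ask(question):
    """Run the script once and return the extracted answer.

    Assumption: the reply is printed to stdout; this is not confirmed here.
    """
    result = subprocess.run(
        build_command(question), capture_output=True, text=True, check=True
    )
    return extract_answer(result.stdout)


# Example usage on the board, from the repository directory:
#   print(ask("请用中文介绍一下你自己."))
```

Each call to `ask` starts a new process and therefore reloads the model, so this approach suits occasional queries rather than interactive use.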
