# Gemma-3-1B-it-Int8

This version of Gemma-3-1B-it has been converted to run on the Axera NPU using w8a16 quantization (int8 weights, fp16 activations).
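To illustrate what w8a16 means in practice, here is a minimal NumPy sketch of per-channel symmetric int8 weight quantization with fp16 activations and scales. This is an illustration of the general scheme only, not the actual Pulsar2 quantizer; the function names are our own.

```python
import numpy as np

def quantize_w8a16(w):
    """Per-output-channel symmetric int8 quantization of a weight matrix.

    Only the weights are quantized to int8 ("w8"); activations and the
    per-channel scales stay in fp16 ("a16").
    """
    amax = np.abs(w).max(axis=1, keepdims=True)   # per-row max magnitude
    scale = amax / 127.0                          # map [-amax, amax] -> [-127, 127]
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale.astype(np.float16)

def dequantize(q, scale):
    # Reconstruct approximate fp16 weights for the matmul with fp16 activations.
    return q.astype(np.float16) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, s = quantize_w8a16(w)
w_hat = dequantize(q, s)
print(np.abs(w - w_hat.astype(np.float32)).max())  # small reconstruction error
```

The reconstruction error per element is bounded by half a quantization step (roughly `amax / 254` per channel), which is why w8a16 preserves accuracy well for LLM weights.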

Compatible with Pulsar2 version: 4.0

## Convert tools links

If you are interested in model conversion, you can try exporting the axmodel from the original repo:
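For reference, a Pulsar2 LLM build invocation typically looks like the sketch below. The paths and flag values here are assumptions for illustration (quantization-related options omitted); take the exact command from the Pulsar2 4.0 documentation.

```shell
pulsar2 llm_build \
  --input_path google/gemma-3-1b-it \
  --output_path gemma3_axmodel \
  --hidden_state_type bf16 \
  --prefill_len 128 \
  --kv_cache_len 1023 \
  --chip AX650
```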

## Supported Platforms

## How to use

Download all files from this repository to the device.

### Using AX650 Board

```
ai@ai-bj ~/yongqiang/Gemma-3-1B-it $ tree -L 1
.
├── config.json
├── gemma3_axmodel
├── gemma3_tokenizer
├── infer_axmodel.py
├── README.md
└── utils

3 directories, 3 files
```

Run inference on an AX650 host, such as the M4N-Dock (爱芯派Pro) or the AX650N DEMO board.

### Text Generation

Input text (the prompt asks the model to introduce itself in Chinese):

```
$ python3 infer_axmodel.py -q "请用中文介绍一下你自己."
```

Log output (model answer translated to English):

```
[INFO] Compiler version: 5.0-patch1-dirty 93949955-dirty
Init InferenceSession: 100%|██████████████████████████████████████████████████████████| 26/26 [00:21<00:00,  1.18it/s]
[INFO] Using provider: AxEngineExecutionProvider
[INFO] Model type: 2 (triple core)
[INFO] Compiler version: 5.0-patch1-dirty 93949955-dirty
Model loaded successfully!
slice_indices: [0]
Slice prefill done: 0
answer >> Hello! I am a large language model, trained by Google.

Simply put, I can help you with many things, for example:

*   **Answering your questions:** Whatever you ask, I will do my best to answer in clear, accurate language.
*   **Generating text:** For example, writing poems, stories, emails, code, and so on.
*   **Translating languages:** I can translate one language into another.
*   **Summarizing text:** I can help you quickly read a passage and extract the key information.
*   **Creative writing:** We can brainstorm together and co-write stories or articles.

I am still learning and improving, so my abilities keep growing.

I am a tool that can help you, but I cannot replace human thinking and judgment.

I hope I can help you! Is there anything you would like to ask, or anything you would like me to do? 😊
```
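The `slice_indices: [0]` and `Slice prefill done: 0` lines suggest that the prompt is prefilled in fixed-length slices (a single slice here, since the prompt is short). A minimal sketch of that chunking, assuming a hypothetical prefill length fixed at conversion time (the function name is our own):

```python
def prefill_slices(token_ids, prefill_len=128):
    # Split prompt tokens into fixed-length slices for chunked prefill;
    # the final slice may be shorter than prefill_len.
    return [token_ids[i:i + prefill_len]
            for i in range(0, len(token_ids), prefill_len)]

print(prefill_slices(list(range(5)), prefill_len=2))  # [[0, 1], [2, 3], [4]]
```

A short prompt fits in one slice, which matches the single `slice_indices: [0]` entry in the log above.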