| | --- |
| | library_name: mlc-llm |
| | base_model: Qwen/Qwen2.5-Math-72B-Instruct |
| | tags: |
| | - mlc-llm |
| | - web-llm |
| | --- |
| | |
| | # Qwen2.5-Math-72B-Instruct-q0f16-MLC |
| |
|
| | This is the [Qwen2.5-Math-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Math-72B-Instruct) model in MLC format `q0f16`. |
| | The model can be used for projects [MLC-LLM](https://github.com/mlc-ai/mlc-llm) and [WebLLM](https://github.com/mlc-ai/web-llm). |
| |
|
| | ## Example Usage |
| |
|
| | Here are some examples of using this model in MLC LLM. |
| | Before running the examples, please install MLC LLM by following the [installation documentation](https://llm.mlc.ai/docs/install/mlc_llm.html#install-mlc-packages). |
| |
|
| | ### Chat |
| |
|
| | In command line, run |
| | ```bash |
| | mlc_llm chat HF://mlc-ai/Qwen2.5-Math-72B-Instruct-q0f16-MLC |
| | ``` |
| |
|
| | ### REST Server |
| |
|
| | In command line, run |
| | ```bash |
| | mlc_llm serve HF://mlc-ai/Qwen2.5-Math-72B-Instruct-q0f16-MLC |
| | ``` |
| |
|
| | ### Python API |
| |
|
| | ```python |
| | from mlc_llm import MLCEngine |
| | |
| | # Create engine |
| | model = "HF://mlc-ai/Qwen2.5-Math-72B-Instruct-q0f16-MLC" |
| | engine = MLCEngine(model) |
| | |
| | # Run chat completion in OpenAI API. |
| | for response in engine.chat.completions.create( |
| | messages=[{"role": "user", "content": "What is the meaning of life?"}], |
| | model=model, |
| | stream=True, |
| | ): |
| | for choice in response.choices: |
| | print(choice.delta.content, end="", flush=True) |
| | print("\n") |
| | |
| | engine.terminate() |
| | ``` |
| |
|
| | ## Documentation |
| |
|
| | For more information on MLC LLM project, please visit our [documentation](https://llm.mlc.ai/docs/) and [GitHub repo](http://github.com/mlc-ai/mlc-llm). |
| |
|