Instructions for using ginipick/GLM-4.6 with the Transformers library and with local serving apps (vLLM, SGLang, Docker Model Runner).

Libraries

How to use ginipick/GLM-4.6 with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="ginipick/GLM-4.6")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("ginipick/GLM-4.6")
model = AutoModelForCausalLM.from_pretrained("ginipick/GLM-4.6")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
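
The final line above decodes only the newly generated tokens: `generate` returns the prompt followed by the continuation, so the slice drops the first `inputs["input_ids"].shape[-1]` positions. A minimal sketch of that slicing with plain lists follows; the token ids and the helper name `strip_prompt` are made up for illustration and do not come from the real tokenizer or from Transformers.

```python
# Illustrative token ids only; real ids come from the tokenizer.
prompt_ids = [101, 7592, 2088]               # stands in for inputs["input_ids"][0]
generated = prompt_ids + [2023, 2003, 102]   # stands in for outputs[0]


def strip_prompt(sequence, prompt_length):
    """Return only the tokens that follow the first `prompt_length` positions."""
    return sequence[prompt_length:]


new_tokens = strip_prompt(generated, len(prompt_ids))
print(new_tokens)
```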

Local Apps

vLLM

How to use ginipick/GLM-4.6 with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "ginipick/GLM-4.6"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ginipick/GLM-4.6",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker

docker model run hf.co/ginipick/GLM-4.6

SGLang

How to use ginipick/GLM-4.6 with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "ginipick/GLM-4.6" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ginipick/GLM-4.6",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "ginipick/GLM-4.6" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "ginipick/GLM-4.6",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
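
The curl calls above all post the same JSON body to `/v1/chat/completions`; only the port differs (8000 for vLLM, 30000 for SGLang). As a rough Python equivalent, the sketch below builds that request with the standard library. The helper names `build_chat_request` and `send_chat_request` are hypothetical, and the commented-out response handling assumes the usual OpenAI-style `choices[0].message.content` layout.

```python
import json
import urllib.request


def build_chat_request(base_url, model, user_text):
    """Build (but do not send) a POST request mirroring the curl examples."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_chat_request(request):
    """Send the request and return the decoded JSON reply (needs a running server)."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)


request = build_chat_request(
    "http://localhost:30000",  # SGLang port from above; use 8000 for vLLM
    "ginipick/GLM-4.6",
    "What is the capital of France?",
)
# Uncomment once a server is running:
# reply = send_chat_request(request)
# print(reply["choices"][0]["message"]["content"])
```

Building and sending are kept separate so the payload can be inspected without a live server.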

Docker Model Runner
How to use ginipick/GLM-4.6 with Docker Model Runner:
```
docker model run hf.co/ginipick/GLM-4.6
```

GLM-4.6 / chat_template.jinja

ginipick

Duplicate from zai-org/GLM-4.6

6312839 verified 7 months ago

3.24 kB

	[gMASK]<sop>
	{%- if tools -%}
	<|system|>
	# Tools

	You may call one or more functions to assist with the user query.

	You are provided with function signatures within <tools></tools> XML tags:
	<tools>
	{% for tool in tools %}
	{{ tool | tojson(ensure_ascii=False) }}
	{% endfor %}
	</tools>

	For each function call, output the function name and arguments within the following XML format:
	<tool_call>{function-name}
	<arg_key>{arg-key-1}</arg_key>
	<arg_value>{arg-value-1}</arg_value>
	<arg_key>{arg-key-2}</arg_key>
	<arg_value>{arg-value-2}</arg_value>
	...
	</tool_call>{%- endif -%}
	{%- macro visible_text(content) -%}
	{%- if content is string -%}
	{{- content }}
	{%- elif content is iterable and content is not mapping -%}
	{%- for item in content -%}
	{%- if item is mapping and item.type == 'text' -%}
	{{- item.text }}
	{%- elif item is string -%}
	{{- item }}
	{%- endif -%}
	{%- endfor -%}
	{%- else -%}
	{{- content }}
	{%- endif -%}
	{%- endmacro -%}
	{%- set ns = namespace(last_user_index=-1) %}
	{%- for m in messages %}
	{%- if m.role == 'user' %}
	{% set ns.last_user_index = loop.index0 -%}
	{%- endif %}
	{%- endfor %}
	{% for m in messages %}
	{%- if m.role == 'user' -%}<|user|>
	{{ visible_text(m.content) }}
	{{- '/nothink' if (enable_thinking is defined and not enable_thinking and not visible_text(m.content).endswith("/nothink")) else '' -}}
	{%- elif m.role == 'assistant' -%}
	<|assistant|>
	{%- set reasoning_content = '' %}
	{%- set content = visible_text(m.content) %}
	{%- if m.reasoning_content is string %}
	{%- set reasoning_content = m.reasoning_content %}
	{%- else %}
	{%- if '</think>' in content %}
	{%- set reasoning_content = content.split('</think>')[0].rstrip('\n').split('<think>')[-1].lstrip('\n') %}
	{%- set content = content.split('</think>')[-1].lstrip('\n') %}
	{%- endif %}
	{%- endif %}
	{%- if loop.index0 > ns.last_user_index and reasoning_content -%}
	{{ '\n<think>' + reasoning_content.strip() + '</think>'}}
	{%- else -%}
	{{ '\n<think></think>' }}
	{%- endif -%}
	{%- if content.strip() -%}
	{{ '\n' + content.strip() }}
	{%- endif -%}
	{% if m.tool_calls %}
	{% for tc in m.tool_calls %}
	{%- if tc.function %}
	{%- set tc = tc.function %}
	{%- endif %}
	{{ '\n<tool_call>' + tc.name }}
	{% set _args = tc.arguments %}
	{% for k, v in _args.items() %}
	<arg_key>{{ k }}</arg_key>
	<arg_value>{{ v | tojson(ensure_ascii=False) if v is not string else v }}</arg_value>
	{% endfor %}
	</tool_call>{% endfor %}
	{% endif %}
	{%- elif m.role == 'tool' -%}
	{%- if m.content is string -%}
	{%- if loop.first or (messages[loop.index0 - 1].role != "tool") %}
	{{- '<|observation|>' }}
	{%- endif %}
	{{- '\n<tool_response>\n' }}
	{{- m.content }}
	{{- '\n</tool_response>' }}
	{%- else -%}
	<|observation|>{% for tr in m.content %}

	<tool_response>
	{{ tr.output if tr.output is defined else tr }}
	</tool_response>{% endfor -%}
	{% endif -%}
	{%- elif m.role == 'system' -%}
	<|system|>
	{{ visible_text(m.content) }}
	{%- endif -%}
	{%- endfor -%}
	{%- if add_generation_prompt -%}
	<|assistant|>{{- '\n<think></think>' if (enable_thinking is defined and not enable_thinking) else '' -}}
	{%- endif -%}
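
Two steps in the assistant branch of the template are easier to follow in plain Python: recovering reasoning from inline `<think>` tags when no `reasoning_content` field is supplied, and the `<arg_key>`/`<arg_value>` layout used for tool calls. The sketch below mirrors the template's split expressions and reads that tool-call layout back into a dict. The function names `split_reasoning` and `parse_tool_calls` are hypothetical and not part of any library, and whitespace handling is simplified compared with the real rendering.

```python
import json
import re
from typing import Dict, List, Tuple


def split_reasoning(content: str) -> Tuple[str, str]:
    """Mirror the template's handling of inline <think>...</think> text.

    Returns (reasoning, answer). When no closing tag is present the
    reasoning is empty and the content is returned unchanged.
    """
    if "</think>" not in content:
        return "", content
    before_close = content.split("</think>")[0].rstrip("\n")
    reasoning = before_close.split("<think>")[-1].lstrip("\n")
    answer = content.split("</think>")[-1].lstrip("\n")
    return reasoning, answer


_TOOL_CALL = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)
_ARG_PAIR = re.compile(
    r"<arg_key>(.*?)</arg_key>\s*<arg_value>(.*?)</arg_value>", re.DOTALL
)


def parse_tool_calls(text: str) -> List[Dict]:
    """Read tool calls written in the <tool_call> format back into dicts.

    The template writes string arguments raw and everything else as JSON,
    so a raw string that happens to look like JSON (for example "3")
    cannot be told apart from a number here. This is an assumption-laden
    reading aid, not a production parser.
    """
    calls = []
    for body in _TOOL_CALL.findall(text):
        name = body.split("<arg_key>", 1)[0].strip()
        arguments = {}
        for key, raw in _ARG_PAIR.findall(body):
            raw = raw.strip()
            try:
                arguments[key.strip()] = json.loads(raw)
            except json.JSONDecodeError:
                arguments[key.strip()] = raw  # plain string argument
        calls.append({"name": name, "arguments": arguments})
    return calls


reasoning, answer = split_reasoning("<think>\nadd 2 and 2\n</think>\n4")
print(reasoning, "|", answer)
```

Note that the template keeps the recovered reasoning only for assistant turns after the last user message; earlier turns are rendered with an empty `<think></think>` pair.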