Instructions to use skt/A.X-4.0 with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use skt/A.X-4.0 with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="skt/A.X-4.0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("skt/A.X-4.0")
model = AutoModelForCausalLM.from_pretrained("skt/A.X-4.0")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
# Decode only the newly generated tokens, skipping the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use skt/A.X-4.0 with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "skt/A.X-4.0"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "skt/A.X-4.0",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker
docker model run hf.co/skt/A.X-4.0
- SGLang
How to use skt/A.X-4.0 with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "skt/A.X-4.0" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "skt/A.X-4.0",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
Use Docker images
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "skt/A.X-4.0" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "skt/A.X-4.0",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ]
  }'
- Docker Model Runner
How to use skt/A.X-4.0 with Docker Model Runner:
docker model run hf.co/skt/A.X-4.0
fix chat template
#3
by minpeter - opened
- Combined conditional checks for add_generation_prompt, message.reasoning_content, and </think> to reduce nesting.
- Ensured content is explicitly set when the condition is not met.
- Updated tool output block to avoid string concatenation, outputting message.content as a separate template expression for better readability.
minpeter changed pull request status to open
In the current template, when both a tool call and a system prompt are used, "<|im_start|><|system|>" seems to be rendered twice. Is this the intended behavior?
e.g.,
<|im_start|><|system|>당신은 도구 호출 기능을 갖춘 유용한 도우미입니다. 사용자의 요청을 처리하기 위해서 필요한 도구가 주어진 목록에 있는 경우 도구 호출로 응답하세요.
필요한 도구가 목록에 없는 경우에는 도구 호출 없이 사용자가 요구한 정보를 제공하세요.
필요한 도구가 목록에 있지만 해당 도구를 호출하는데 필요한 argument 정보가 부족한 경우 해당 정보를 사용자에게 요청하세요.
사용자의 요청을 처리하기 위해 여러번 도구를 호출할 수 있어야 합니다.
도구 호출 이후 도구 실행 결과를 입력으로 받으면 해당 결과를 활용하여 답변을 생성하세요.
다음은 접근할 수 있는 도구들의 목록 입니다:
<tools>
{"name": "get_weather", "description": "Get current weather information for a location", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "The city and state, e.g. San Francisco, CA"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "The unit of temperature to use"}}, "required": ["location"]}}
</tools>
도구를 호출하려면 아래의 JSON으로 응답하세요.
도구 호출 형식: <tool_call>{"name": 도구 이름, "arguments": dictionary 형태의 도구 인자값}</tool_call><|im_end|><|im_start|><|system|>SYSTEM PROMPT - SYSTEM PROMPT<|im_end|>
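
The duplication shown above can be checked mechanically by counting system blocks in a rendered prompt. The following is a minimal sketch, not part of the model repository: the helper name system_blocks and the sample string are hypothetical, and the rendered text is assumed to come from elsewhere (for example tokenizer.apply_chat_template called with tools=... and tokenize=False). Only the <|im_start|><|system|> and <|im_end|> markers are taken from the output quoted in this thread.

```python
from typing import List

# Markers copied from the rendered output quoted in this thread.
SYSTEM_OPEN = "<|im_start|><|system|>"
TURN_END = "<|im_end|>"


def system_blocks(rendered: str) -> List[str]:
    """Return the body of every system block found in a rendered prompt.

    Hypothetical helper for illustration only.
    """
    blocks = []
    pos = 0
    while True:
        start = rendered.find(SYSTEM_OPEN, pos)
        if start == -1:
            break
        body_start = start + len(SYSTEM_OPEN)
        end = rendered.find(TURN_END, body_start)
        if end == -1:
            # Unterminated block: keep whatever follows the opener.
            blocks.append(rendered[body_start:])
            break
        blocks.append(rendered[body_start:end])
        pos = end + len(TURN_END)
    return blocks


# Hypothetical stand-in for a prompt rendered with both tools and a system message.
sample = (
    "<|im_start|><|system|>TOOL INSTRUCTIONS<|im_end|>"
    "<|im_start|><|system|>SYSTEM PROMPT<|im_end|>"
)
print(len(system_blocks(sample)))  # prints 2
```

A count greater than one for a conversation with a single system message would confirm that the template emits the tool instructions and the user-supplied system prompt as separate system turns. Whether that is intended remains a question for the model authors.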
