Instructions to use pushkarsharma/LegalSahayk_q4_k_m with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use pushkarsharma/LegalSahayk_q4_k_m with llama-cpp-python:
# !pip install llama-cpp-python
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="pushkarsharma/LegalSahayk_q4_k_m",
    filename="LegalSahyak_q4_k_m.gguf",
)

llm.create_chat_completion(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- llama.cpp
How to use pushkarsharma/LegalSahayk_q4_k_m with llama.cpp:
Install from brew
brew install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp

# Start a local OpenAI-compatible server with a web UI:
llama-server -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M

# Run inference directly in the terminal:
llama-cli -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Use pre-built binary
# Download a pre-built binary from:
# https://github.com/ggerganov/llama.cpp/releases

# Start a local OpenAI-compatible server with a web UI:
./llama-server -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M

# Run inference directly in the terminal:
./llama-cli -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
cmake -B build
cmake --build build -j --target llama-server llama-cli

# Start a local OpenAI-compatible server with a web UI:
./build/bin/llama-server -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M

# Run inference directly in the terminal:
./build/bin/llama-cli -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Use Docker
docker model run hf.co/pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
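Once llama-server is running, it exposes the standard OpenAI-compatible REST API (default port 8080), so any HTTP client can call it. A minimal Python sketch using only the standard library; `build_chat_request` is an illustrative helper, not part of llama.cpp:

```python
import json
import urllib.request

def build_chat_request(model: str, messages: list) -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {"model": model, "messages": messages, "temperature": 0.0}

payload = build_chat_request(
    "pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M",
    [{"role": "user", "content": "Explain a notice period clause simply."}],
)

# Uncomment once llama-server is running locally:
# req = urllib.request.Request(
#     "http://localhost:8080/v1/chat/completions",
#     data=json.dumps(payload).encode("utf-8"),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```

The same payload shape works against the vLLM server below; only the port differs.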
- LM Studio
- Jan
- vLLM
How to use pushkarsharma/LegalSahayk_q4_k_m with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "pushkarsharma/LegalSahayk_q4_k_m"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "pushkarsharma/LegalSahayk_q4_k_m",
        "messages": [
            {"role": "user", "content": "What is the capital of France?"}
        ]
    }'
Use Docker
docker model run hf.co/pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
- Ollama
How to use pushkarsharma/LegalSahayk_q4_k_m with Ollama:
ollama run hf.co/pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
- Unsloth Studio
How to use pushkarsharma/LegalSahayk_q4_k_m with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for pushkarsharma/LegalSahayk_q4_k_m to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex

# Run Unsloth Studio:
unsloth studio -H 0.0.0.0 -p 8888

# Then open http://localhost:8888 in your browser
# Search for pushkarsharma/LegalSahayk_q4_k_m to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required
# Open https://huggingface.co/spaces/unsloth/studio in your browser
# Search for pushkarsharma/LegalSahayk_q4_k_m to start chatting
- Pi
How to use pushkarsharma/LegalSahayk_q4_k_m with Pi:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Configure the model in Pi
# Install Pi:
npm install -g @mariozechner/pi-coding-agent

# Add to ~/.pi/agent/models.json:
{
  "providers": {
    "llama-cpp": {
      "baseUrl": "http://localhost:8080/v1",
      "api": "openai-completions",
      "apiKey": "none",
      "models": [
        { "id": "pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M" }
      ]
    }
  }
}
Run Pi
# Start Pi in your project directory: pi
- Hermes Agent
How to use pushkarsharma/LegalSahayk_q4_k_m with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp:
brew install llama.cpp

# Start a local OpenAI-compatible server:
llama-server -hf pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Configure Hermes
# Install Hermes:
curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash
hermes setup

# Point Hermes at the local server:
hermes config set model.provider custom
hermes config set model.base_url http://127.0.0.1:8080/v1
hermes config set model.default pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use pushkarsharma/LegalSahayk_q4_k_m with Docker Model Runner:
docker model run hf.co/pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
- Lemonade
How to use pushkarsharma/LegalSahayk_q4_k_m with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/
lemonade pull pushkarsharma/LegalSahayk_q4_k_m:Q4_K_M
Run and chat with the model
lemonade run user.LegalSahayk_q4_k_m-Q4_K_M
List all available models
lemonade list
LegalSahyak (Q4_K_M GGUF)
Model Description
LegalSahyak_q4_k_m.gguf is a quantized GGUF model intended for local legal question answering workflows, especially when paired with retrieval over contracts and Indian statutes.
- Base model: unsloth/Meta-Llama-3.1-8B-Instruct
- Adaptation: LoRA fine-tuning (rank r=128) and merge
- Quantization: q4_k_m GGUF
- Primary runtime target: llama.cpp / llama-cpp-python
Intended Use
- Contract clause explanation and extraction
- Statute-grounded legal QA in a retrieval-augmented (RAG) pipeline
- Local/offline inference where low memory usage is needed
This model should be used with retrieval and human review for any high-stakes legal scenario.
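The retrieval-plus-review pattern amounts to pinning retrieved clause or statute text into the system prompt so the model answers only from it. A minimal sketch; `build_rag_prompt` and the sample clause are illustrative, not part of this repository:

```python
def build_rag_prompt(question: str, passages: list) -> list:
    """Build chat messages that ground the answer in retrieved text."""
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    system = (
        "You are a legal assistant. Answer ONLY from the context below. "
        "If the context is insufficient, say so.\n\n" + context
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_rag_prompt(
    "What is the notice period?",
    ["Clause 7: Either party may terminate with 30 days' written notice."],
)
```

The resulting `messages` list can be passed directly to `create_chat_completion`, as in the inference example below.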
Out-of-Scope Use
- Autonomous legal advice without human oversight
- Any use requiring guaranteed legal correctness or jurisdictional completeness
- Sensitive decisions where model hallucinations can cause harm
Training Data
The training pipeline in models/train.py uses two public datasets:
- Prarabdha/indian-legal-supervised-fine-tuning-data
- opennyaiorg/aalap_instruction_dataset
Training was performed in two stages:
- Knowledge injection on legal supervised examples
- Behavioral alignment on instruction-following data
Training Procedure (Summary)
- Context length: up to 8192
- Precision during training: bfloat16
- LoRA target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Optimizer: adamw_8bit
- Scheduler: cosine
- Export: merged weights -> GGUF quantized as q4_k_m
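Each LoRA-adapted linear layer of shape d_in x d_out adds r * (d_in + d_out) trainable parameters (an r x d_in matrix A and a d_out x r matrix B). A back-of-envelope count for the configuration above, assuming standard Llama-3.1-8B shapes (hidden 4096, MLP intermediate 14336, 8 KV heads of dim 128, 32 layers); these shapes are an assumption, not stated in this repository:

```python
# Per-module LoRA parameters: r * (d_in + d_out) for matrices A and B.
R = 128
HIDDEN = 4096         # assumed Llama-3.1-8B hidden size
INTERMEDIATE = 14336  # assumed MLP intermediate size
KV_DIM = 1024         # assumed: 8 KV heads * 128 head dim (GQA)
LAYERS = 32           # assumed number of transformer blocks

module_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

per_layer = sum(R * (d_in + d_out) for d_in, d_out in module_shapes.values())
total = per_layer * LAYERS
print(f"{total:,} trainable LoRA parameters (~{total / 1e6:.0f}M)")
# 335,544,320 trainable LoRA parameters (~336M)
```

Under those assumptions, r=128 over all seven projection modules trains roughly 336M parameters, about 4% of the 8B base, which the merge step then folds back into the full weights.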
Inference
Example with llama-cpp-python:
from llama_cpp import Llama
llm = Llama(
    model_path="LegalSahyak_q4_k_m.gguf",
    n_ctx=4096,       # context window; the model was trained with up to 8192
    n_gpu_layers=20,  # layers to offload to GPU; set to 0 for CPU-only
    verbose=False,
)
resp = llm.create_chat_completion(
messages=[
{"role": "system", "content": "You are a legal assistant. Use provided context only."},
{"role": "user", "content": "Explain the notice period clause in simple words."},
],
max_tokens=512,
temperature=0.0,
)
print(resp["choices"][0]["message"]["content"])
Model File Details
- Filename: LegalSahyak_q4_k_m.gguf
- Size (bytes): 4920738464
- Approx size: 4.58 GiB
- SHA256: F32460DD8E7DC927B3CF33065D1E753FC1F85ED102A678512C8A5F520F544405
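The published digest can be checked after download with a short standard-library helper; `sha256_hex` is an illustrative name:

```python
import hashlib

def sha256_hex(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 and return the uppercase hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest().upper()

EXPECTED = "F32460DD8E7DC927B3CF33065D1E753FC1F85ED102A678512C8A5F520F544405"
# After downloading the GGUF file:
# assert sha256_hex("LegalSahyak_q4_k_m.gguf") == EXPECTED
```

Streaming in 1 MiB chunks keeps memory flat even though the file is ~4.6 GiB.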
Limitations
- Can produce plausible but incorrect legal text
- Performance depends heavily on retrieval quality and prompt constraints
- May not reflect the latest statutory amendments
- Not a substitute for licensed legal counsel
Bias, Risk, and Safety
- Dataset and model biases may propagate into outputs
- Should not be used as the sole basis for legal, compliance, or policy decisions
- Recommended controls:
- Ground responses in retrieved sources
- Log model outputs and review manually
- Add refusal/uncertainty handling when context is missing
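The last control can be as simple as a gate that refuses before the model is ever called when retrieval returns nothing. A hypothetical sketch; `answer_with_guard` and the refusal text are illustrative:

```python
REFUSAL = "I don't have enough source material to answer this reliably."

def answer_with_guard(question: str, passages: list, generate) -> str:
    """Refuse when no grounding context was retrieved; otherwise delegate
    to generate(question, passages), i.e. the actual model call."""
    if not any(p.strip() for p in passages):
        return REFUSAL
    return generate(question, passages)

# With an empty retrieval result, the model is never invoked:
print(answer_with_guard("What is clause 9?", [], generate=lambda q, p: "..."))
```

Placing the check outside the model keeps the refusal deterministic and easy to log.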
Citation
If you use this model in research or products, cite:
- The base model (Meta-Llama-3.1)
- The datasets listed above
- This repository (Legalsahyak)