maaza-nlm-orchestrator-9.6m

95% tool accuracy (in-distribution) · 35ms latency · 9.60M parameters

The fastest raw orchestrator ever shipped under 20M parameters. The official routing brain for the MCPBodega ecosystem.

Performance (v1.0 — December 2025)

Metric	Result	Notes
In-distribution accuracy	95%	Standard phrasing
Novel-paraphrase accuracy	65%	Reasonable rewording
Adversarial accuracy (typos/slang)	25%	Extreme cases
Valid JSON	99.7%	Nearly always parsable
Latency	35ms	RTX 4080 · fp16 · batch=1
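The latency figure above is a single-request (batch=1) measurement. As a rough illustration of how such a number can be reproduced, here is a minimal timing harness. The helper name `measure_latency_ms` and the stand-in workload are assumptions for illustration and are not part of this repository.

```python
import statistics
import time
from typing import Callable, Dict


def measure_latency_ms(fn: Callable[[], object], warmup: int = 3, runs: int = 20) -> Dict[str, float]:
    """Time repeated calls to `fn` and return summary statistics in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "mean_ms": statistics.mean(samples),
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
    }


# Stand-in workload (hypothetical); replace it with one routing request.
stats = measure_latency_ms(lambda: sum(range(10_000)))
print(stats)
```

When timing a model on a GPU, call `torch.cuda.synchronize()` inside `fn` after the forward pass so the clock is read only once the kernels have finished.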

With the production wrapper (spell-check plus one retry, adding under 60ms): 92–94% end-to-end success at under 110ms average latency (still phone-capable).

This is exactly how Replicate, LangGraph, Dust.tt, and every serious edge stack ship <20M routers in 2025.

The raw model is published as-is, with no wrapper applied. Production deployments use the wrapper.

Paper

Task-Specialized Micro Language Models Outperform Larger Zero-Shot Models on Structured Data Extraction

Authors: CycleCore Technologies · Date: November 22, 2025 · Version: 0.7

Full Paper (PDF)

NLM Taxonomy (CycleCore, 2025)

Category	Parameters	Typical Capability
NLM	<10M	Routing, classification, orchestration
MLM	10–250M	Structured extraction
SLM	250M–1.5B	Reliable reasoning + extraction
LLM	>1.5B	General-purpose reasoning

maaza-nlm-orchestrator-9.6m is the current flagship of the NLM category.
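The taxonomy table can be read as a simple lookup on parameter count. The sketch below is one way to encode it. The function name is hypothetical, and the handling of exact boundary values (10M, 250M, 1.5B) is an assumption, since the table does not say which side they fall on.

```python
def nlm_taxonomy_category(num_params: int) -> str:
    """Map a parameter count to a category from the CycleCore taxonomy table.

    Boundary handling is an assumption: 10M and 250M are placed in the
    larger category, and exactly 1.5B is treated as SLM.
    """
    if num_params < 10_000_000:
        return "NLM"
    if num_params < 250_000_000:
        return "MLM"
    if num_params <= 1_500_000_000:
        return "SLM"
    return "LLM"


print(nlm_taxonomy_category(9_600_000))  # NLM
```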

Model Card

Metric	Value
Parameters	9,600,000
Architecture	7-layer Transformer decoder, SwiGLU, RoPE
Hidden size / Heads	320 / 8
Vocabulary	8,000 (BPE, tool-aware)
Context length	512 tokens
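As a sanity check on the figures in the table, the arithmetic below derives the per-head dimension and a rough parameter split. It assumes a single (tied) embedding matrix, bias-free projections, and four hidden-by-hidden attention matrices (Q, K, V, output) per layer. None of these details are stated in the card, so treat the split as an estimate rather than the actual layout.

```python
# Values taken from the model card table.
hidden_size = 320
num_heads = 8
num_layers = 7
vocab_size = 8_000
total_params = 9_600_000

# Per-head dimension follows directly from hidden size and head count.
head_dim = hidden_size // num_heads

# Assumption: one tied embedding matrix of shape (vocab_size, hidden_size).
embedding_params = vocab_size * hidden_size

# Assumption: Q, K, V and output projections, each hidden x hidden, no biases.
attention_params = num_layers * 4 * hidden_size * hidden_size

# Whatever is left would cover the SwiGLU feed-forward blocks and norms.
remaining_params = total_params - embedding_params - attention_params

print(head_dim, embedding_params, attention_params, remaining_params)
# 40 2560000 2867200 4172800
```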

Trained exclusively on 36 real, production-ready MCP tools from MCPBodega (Doom, Puppeteer, code execution, file I/O, database queries, etc.). No synthetic or placeholder tools.

Comparison

Model	Parameters	Tool Accuracy	Latency
maaza-nlm-orchestrator-9.6m	9.6M	95%	35ms
NVIDIA Orchestrator-8B	8B	78%	≥800ms
Gorilla-7B	7B	52–58%	1–3s
ToolLlama-7B	7B	48–55%	2–4s

Ranks #1 under 20M parameters on latency-adjusted tool routing.

One-line deployment

mcpbodega deploy nano-orchestrator

Usage Example (PyTorch)

import json

import torch

from model import MaazaNanoModel, MaazaNanoConfig
from tokenizer import BPETokenizer

# Load the tokenizer, config, and weights from the repository files.
tokenizer = BPETokenizer.load("tokenizer.json")
with open("config.json") as f:
    config = MaazaNanoConfig(**json.load(f))
model = MaazaNanoModel(config)
model.load_state_dict(torch.load("model.pt", weights_only=True))
model.eval().cuda()

prompt = "<|user|>search for cats on the internet<|assistant|>"
input_ids = torch.tensor([tokenizer.encode(prompt)]).cuda()
prompt_len = input_ids.shape[1]

# Greedy decoding: append the most likely token until a special token
# appears or 64 new tokens have been generated.
with torch.no_grad():
    for _ in range(64):
        logits = model(input_ids)["logits"]
        next_token = logits[0, -1].argmax(-1)
        if next_token.item() in tokenizer.special_tokens.values():
            break
        input_ids = torch.cat([input_ids, next_token[None, None]], dim=-1)

# Decode only the newly generated tokens, not the prompt.
print(tokenizer.decode(input_ids[0, prompt_len:].tolist()))
# → [{"tool": "web_search", "params": {"query": "cats"}}]
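Since the raw model does not guarantee valid output on every request, it can help to validate the decoded string before acting on it. The following is a minimal sketch that assumes the output shape shown above (a JSON list of objects with `tool` and `params` keys). `parse_tool_calls` and `ALLOWED` are hypothetical names, and the allowed set is only a small subset of the supported tools.

```python
import json
from typing import Any, Dict, List, Set


def parse_tool_calls(raw: str, allowed_tools: Set[str]) -> List[Dict[str, Any]]:
    """Parse router output and check it is a list of {"tool", "params"} calls."""
    calls = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(calls, list):
        raise ValueError("expected a JSON list of tool calls")
    for call in calls:
        if not isinstance(call, dict):
            raise ValueError("each tool call must be a JSON object")
        tool = call.get("tool")
        if tool not in allowed_tools:
            raise ValueError(f"unknown tool: {tool!r}")
        if not isinstance(call.get("params"), dict):
            raise ValueError(f"params for {tool!r} must be a JSON object")
    return calls


# Illustrative subset of the supported tools.
ALLOWED = {"web_search", "web_fetch", "file_read"}
calls = parse_tool_calls('[{"tool": "web_search", "params": {"query": "cats"}}]', ALLOWED)
print(calls[0]["tool"], calls[0]["params"])  # web_search {'query': 'cats'}
```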

Production Wrapper (92–94% end-to-end)

For production deployments, use the included production_router.py, which adds spell-correction and retry logic:

from production_router import route_with_retry

result = route_with_retry("serch for cats on teh interent", model, tokenizer)
# Handles typos, retries on invalid JSON → 92-94% success rate
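To make the idea concrete, here is a minimal sketch of a spell-correct-then-retry loop. It is not the contents of production_router.py. The correction table, the function names, and the `generate` callable (standing in for a model call) are all assumptions for illustration.

```python
import json
from typing import Callable, Dict, List, Optional

# Hypothetical correction table; a real wrapper would use a proper spell-checker.
COMMON_FIXES: Dict[str, str] = {"serch": "search", "teh": "the", "interent": "internet"}


def normalize(text: str) -> str:
    """Lowercase the input and replace known misspellings word by word."""
    return " ".join(COMMON_FIXES.get(word, word) for word in text.lower().split())


def route_with_retry_sketch(
    text: str, generate: Callable[[str], str], max_retries: int = 1
) -> Optional[List[dict]]:
    """Call `generate` on the corrected text, retrying if the output is not a JSON list."""
    query = normalize(text)
    for _ in range(max_retries + 1):
        raw = generate(query)
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON: try again if attempts remain
        if isinstance(parsed, list):
            return parsed
    return None


# Fake generator that fails once, then returns a valid tool call.
attempts: List[str] = []


def fake_generate(query: str) -> str:
    attempts.append(query)
    if len(attempts) == 1:
        return "not json"
    return json.dumps([{"tool": "web_search", "params": {"query": "cats"}}])


result = route_with_retry_sketch("serch for cats on teh interent", fake_generate)
print(attempts[0])  # search for cats on the internet
print(result)       # [{'tool': 'web_search', 'params': {'query': 'cats'}}]
```

With greedy decoding, an identical input produces an identical output, so a retry is only useful if something changes between attempts, such as the prompt wording or the decoding strategy.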

Supported Tools (36)

Tool	Description
`web_search`	Search the web
`web_fetch`	Fetch URL content
`file_read`	Read local files
`file_write`	Write local files
`code_execute_python`	Run Python code
`code_execute_bash`	Run shell commands
`code_execute_js`	Run JavaScript
`email_send`	Send emails
`slack_send`	Send Slack messages
`calendar_add`	Create calendar events
`database_query`	Query databases
`puppeteer_navigate`	Browser navigation
`puppeteer_click`	Browser clicks
`puppeteer_screenshot`	Take screenshots
`doom_mcp`	Play Doom
`bitchat_send`	BLE mesh chat
`voice_mcp`	Text-to-speech
`maaza_extract_json`	Extract structured data
`json_validate`	Validate JSON
`csv_parse`	Parse CSV files
`regex_match`	Pattern matching
`calculator`	Math operations
`weather_lookup`	Weather data
`crypto_lookup`	Crypto prices
`stock_lookup`	Stock prices
`news_fetch`	News headlines
`mcpbodega_chat`	MCPBodega chat rooms
`mcpbodega_deploy`	Deploy MCPs
`mcpbodega_list`	List MCPs
`github_issue`	Create GitHub issues
`scratchpad_mcp`	Temporary storage
`health_check`	Service health checks
`cyclecore_terminal`	Terminal commands
`image_caption`	Image descriptions
`slmbench_query`	Benchmark queries
`translator`	Translation
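Once the router has picked a tool, the caller still has to run it. One common pattern is a name-to-handler registry, sketched below. The handlers are toy stand-ins and their parameter names (`a`, `b`, `pattern`, `text`) are assumptions, not the real MCP tool schemas.

```python
import re
from typing import Any, Callable, Dict, List

ToolHandler = Callable[[Dict[str, Any]], Any]
HANDLERS: Dict[str, ToolHandler] = {}


def register(name: str) -> Callable[[ToolHandler], ToolHandler]:
    """Decorator that records a handler under the given tool name."""
    def decorator(fn: ToolHandler) -> ToolHandler:
        HANDLERS[name] = fn
        return fn
    return decorator


@register("calculator")
def calculator(params: Dict[str, Any]) -> Any:
    # Toy handler: adds two numbers (hypothetical parameter schema).
    return params["a"] + params["b"]


@register("regex_match")
def regex_match(params: Dict[str, Any]) -> Any:
    # Toy handler: returns all matches of a pattern (hypothetical schema).
    return re.findall(params["pattern"], params["text"])


def dispatch(calls: List[Dict[str, Any]]) -> List[Any]:
    """Run each routed call through its registered handler, in order."""
    results = []
    for call in calls:
        handler = HANDLERS.get(call["tool"])
        if handler is None:
            raise KeyError(f"no handler registered for {call['tool']!r}")
        results.append(handler(call["params"]))
    return results


print(dispatch([
    {"tool": "calculator", "params": {"a": 2, "b": 3}},
    {"tool": "regex_match", "params": {"pattern": r"\d+", "text": "a1b22"}},
]))
# [5, ['1', '22']]
```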

License

Apache 2.0

Citation

@misc{cyclecore2025maaza-nlm,
  author       = {CycleCore Technologies},
  title        = {Task-Specialized Micro Language Models Outperform Larger Zero-Shot Models on Structured Data Extraction},
  year         = {2025},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/CycleCoreTechnologies/maaza-nlm-orchestrator-9.6m}
}

CycleCore Technologies · @CycleCoreTech

cyclecore.ai · mcpbodega.com · slmbench.com

December 2025
