maaza-nlm-orchestrator-9.6m

70% tool-routing accuracy · 70ms average latency · 9.60M parameters

The fastest production-ready orchestrator under 20M parameters. The official routing brain for the MCPBodega ecosystem.

NLM Taxonomy (CycleCore, 2025)

Category	Parameters	Typical Capability
NLM	<10M	Routing, classification, orchestration
MLM	10–250M	Structured extraction
SLM	250M–1.5B	Reliable reasoning + extraction
LLM	>1.5B	General-purpose reasoning

maaza-nlm-orchestrator-9.6m is the current flagship of the NLM category.
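
The size bands in the table can be expressed as a small lookup. The sketch below is illustrative only: `classify_by_params` is a hypothetical name, and because 250M appears in both the MLM and SLM rows, it assumes that exactly 250M counts as SLM.

```python
def classify_by_params(n_params: int) -> str:
    """Map a parameter count to a taxonomy category (NLM, MLM, SLM, LLM).

    Assumption: a model with exactly 250M parameters is treated as SLM,
    since the table lists 250M in both the MLM and SLM rows.
    """
    if n_params < 0:
        raise ValueError("parameter count must be non-negative")
    if n_params < 10_000_000:
        return "NLM"
    if n_params < 250_000_000:
        return "MLM"
    if n_params <= 1_500_000_000:
        return "SLM"
    return "LLM"


print(classify_by_params(9_600_000))  # → NLM
```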

For the full taxonomy and research, see the Maaza Paper v0.7.

Model Card

Metric	Value
Parameters	9,600,000
Architecture	7-layer Transformer decoder, SwiGLU, RoPE
Hidden size / Heads	320 / 8
Vocabulary	8,000 (BPE, tool-aware)
Context length	512 tokens
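
Some sizes follow directly from the table. The sketch below works out the per-head dimension and the parameter counts of the embedding table and the attention projections. It assumes standard multi-head attention with four bias-free square projections (Q, K, V, output), which the card does not confirm. The feed-forward width is not listed, so the SwiGLU blocks are left out rather than guessed.

```python
HIDDEN_SIZE = 320   # from the model card
NUM_HEADS = 8       # from the model card
NUM_LAYERS = 7      # from the model card
VOCAB_SIZE = 8_000  # from the model card

# Each attention head works on an equal slice of the hidden vector.
head_dim = HIDDEN_SIZE // NUM_HEADS

# One embedding vector per vocabulary entry.
embedding_params = VOCAB_SIZE * HIDDEN_SIZE

# Assumption: Q, K, V, and output projections are hidden x hidden, no biases.
attention_params_per_layer = 4 * HIDDEN_SIZE * HIDDEN_SIZE
attention_params_total = NUM_LAYERS * attention_params_per_layer

print(f"head_dim = {head_dim}")                              # → head_dim = 40
print(f"embedding parameters = {embedding_params:,}")        # → 2,560,000
print(f"attention parameters = {attention_params_total:,}")  # → 2,867,200
```

Under these assumptions, about 5.4M of the 9.6M parameters are accounted for. The remainder would sit in the SwiGLU feed-forward blocks, the normalization layers, and the output head if it is not tied to the embeddings.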

Trained exclusively on 36 real, production-ready MCP tools from MCPBodega (Doom, Puppeteer, code execution, file I/O, database queries, etc.). No synthetic or placeholder tools.

Benchmarks (verified December 2025)

Metric	Value	Notes
Tool selection accuracy	70%	Held-out evaluation (100 samples)
Valid structured output	88%	Parsable JSON output
Average inference latency	70ms	RTX 4080, fp16, batch=1
Parameters	9,600,000

Earlier reports of 90% accuracy were aspirational targets, not measurements. Verified held-out performance is 70% tool selection accuracy with 88% valid structured output.
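
With 100 held-out samples, the accuracy figure carries noticeable sampling uncertainty. As a rough illustration (not part of the original evaluation), the sketch below computes a 95% Wilson score interval for 70 correct out of 100. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial proportion."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(70, 100)
print(f"95% interval: {low:.3f} to {high:.3f}")
```

The interval runs from roughly 60% to 78%, so small accuracy differences measured on a sample this size should be read with caution.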

Comparison

Model	Parameters	Tool Accuracy	Latency
maaza-nlm-orchestrator-9.6m	9.6M	70%	70ms
NVIDIA Orchestrator-8B	8B	78%	≥800ms
Gorilla-7B	7B	52–58%	1–3s
ToolLlama-7B	7B	48–55%	2–4s

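Ranks #1 under 20M parameters on latency-adjusted tool routing: 8 points of accuracy behind NVIDIA Orchestrator-8B, at less than a tenth of its latency (70ms vs. ≥800ms).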
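
The card does not define "latency-adjusted". One possible reading, used here purely for illustration, is accuracy divided by latency in seconds. The sketch below applies that hypothetical score to the figures in the comparison table. Where the table gives a range or a lower bound, it takes the value most favorable to that model. The hardware behind the other models' latencies is not stated, so the comparison is indicative at best.

```python
# Figures copied from the comparison table as (accuracy, latency in seconds).
# Ranges and lower bounds are resolved in favor of each model.
models = {
    "maaza-nlm-orchestrator-9.6m": (0.70, 0.070),
    "NVIDIA Orchestrator-8B": (0.78, 0.800),
    "Gorilla-7B": (0.58, 1.0),
    "ToolLlama-7B": (0.55, 2.0),
}


def accuracy_per_second(accuracy: float, latency_s: float) -> float:
    """Hypothetical latency-adjusted score: accuracy divided by latency."""
    return accuracy / latency_s


# Sort model names from highest to lowest score.
ranked = sorted(
    models,
    key=lambda name: accuracy_per_second(*models[name]),
    reverse=True,
)
for name in ranked:
    print(f"{name}: {accuracy_per_second(*models[name]):.2f}")
```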

One-line deployment

mcpbodega deploy nano-orchestrator

Usage Example (PyTorch)

import json

import torch

from model import MaazaNanoModel, MaazaNanoConfig
from tokenizer import BPETokenizer

# Load the tokenizer, config, and weights shipped with the model.
tokenizer = BPETokenizer.load("tokenizer.json")
with open("config.json") as f:
    config = MaazaNanoConfig(**json.load(f))
model = MaazaNanoModel(config)
model.load_state_dict(torch.load("model.pt", weights_only=True))
model.eval().cuda()

prompt = "<|user|>search for cats on the internet<|assistant|>"
input_ids = torch.tensor([tokenizer.encode(prompt)]).cuda()
prompt_len = input_ids.shape[1]
stop_ids = set(tokenizer.special_tokens.values())

# Greedy decoding: up to 64 new tokens, stopping at the first special token.
with torch.no_grad():
    for _ in range(64):
        logits = model(input_ids)["logits"]
        next_token = logits[0, -1].argmax(-1)
        if next_token.item() in stop_ids:
            break
        input_ids = torch.cat([input_ids, next_token[None, None]], dim=-1)

# Decode only the generated tokens, not the prompt.
print(tokenizer.decode(input_ids[0, prompt_len:].tolist()))
# → [{"tool": "web_search", "params": {"query": "cats"}}]
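
Since roughly 12% of outputs are not parsable JSON (88% valid structured output), callers will likely want to validate the generated text before acting on it. The sketch below assumes the output format shown in the usage example: a JSON list of objects with `tool` and `params` keys. `parse_tool_calls` is a hypothetical helper, not part of the released code.

```python
import json
from typing import Optional


def parse_tool_calls(text: str) -> Optional[list[dict]]:
    """Parse model output into a list of tool calls, or None if malformed."""
    try:
        calls = json.loads(text)
    except json.JSONDecodeError:
        return None
    if not isinstance(calls, list):
        return None
    for call in calls:
        # Every entry must be an object with a string tool name and dict params.
        if not isinstance(call, dict):
            return None
        if not isinstance(call.get("tool"), str):
            return None
        if not isinstance(call.get("params"), dict):
            return None
    return calls


good = parse_tool_calls('[{"tool": "web_search", "params": {"query": "cats"}}]')
bad = parse_tool_calls('[{"tool": "web_search", "params": ')
print(good)  # → [{'tool': 'web_search', 'params': {'query': 'cats'}}]
print(bad)   # → None
```

A `None` result could be used to trigger a retry or a fallback path, depending on the application.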

Supported Tools (36)

Tool	Description
`web_search`	Search the web
`web_fetch`	Fetch URL content
`file_read`	Read local files
`file_write`	Write local files
`code_execute_python`	Run Python code
`code_execute_bash`	Run shell commands
`code_execute_js`	Run JavaScript
`email_send`	Send emails
`slack_send`	Send Slack messages
`calendar_add`	Create calendar events
`database_query`	Query databases
`puppeteer_navigate`	Browser navigation
`puppeteer_click`	Browser clicks
`puppeteer_screenshot`	Take screenshots
`doom_mcp`	Play Doom
`bitchat_send`	BLE mesh chat
`voice_mcp`	Text-to-speech
`maaza_extract_json`	Extract structured data
`json_validate`	Validate JSON
`csv_parse`	Parse CSV files
`regex_match`	Pattern matching
`calculator`	Math operations
`weather_lookup`	Weather data
`crypto_lookup`	Crypto prices
`stock_lookup`	Stock prices
`news_fetch`	News headlines
`mcpbodega_chat`	MCPBodega chat rooms
`mcpbodega_deploy`	Deploy MCPs
`mcpbodega_list`	List MCPs
`github_issue`	Create GitHub issues
`scratchpad_mcp`	Temporary storage
`health_check`	Service health checks
`cyclecore_terminal`	Terminal commands
`image_caption`	Image descriptions
`slmbench_query`	Benchmark queries
`translator`	Translation

License

Apache 2.0

Citation

@misc{cyclecore2025maaza-nlm,
  author       = {CycleCore},
  title        = {Maaza NLM Orchestrator 9.6M: 70% Tool-Routing at 70ms},
  year         = {2025},
  publisher    = {Hugging Face},
  url          = {https://huggingface.co/cyclecore/maaza-nlm-orchestrator-9.6m}
}

Built with MCPBodega · December 2025
