🌊 Overflow-1T

Overflow-1T is a next-generation large language model with 1.03 trillion parameters, built on a custom 1.5-bit ternary ({-1, 0, 1}) architecture.

The Overflow architecture targets massive-scale reasoning while remaining computationally efficient: advanced weight packing and specialized C++ inference kernels are designed to let the model run on consumer-grade hardware.
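The card does not publish the packing scheme, but ternary weights are commonly stored five to a byte via base-3 encoding, since 3^5 = 243 fits in the 256 values of one byte. A minimal illustrative sketch (the helper names `pack5`/`unpack5` are hypothetical, not from the Overflow codebase):

```python
# Hypothetical helpers: pack five ternary weights {-1, 0, 1} into one
# byte using base-3 encoding (3**5 = 243 <= 256), and unpack them again.

def pack5(weights):
    """Pack exactly five ternary weights into a single byte."""
    assert len(weights) == 5 and all(w in (-1, 0, 1) for w in weights)
    byte = 0
    for w in reversed(weights):       # little-endian base-3 digits
        byte = byte * 3 + (w + 1)     # map {-1, 0, 1} -> {0, 1, 2}
    return byte

def unpack5(byte):
    """Recover the five ternary weights from one packed byte."""
    weights = []
    for _ in range(5):
        weights.append(byte % 3 - 1)  # map {0, 1, 2} -> {-1, 0, 1}
        byte //= 3
    return weights

print(unpack5(pack5([-1, 0, 1, 1, -1])))  # [-1, 0, 1, 1, -1]
```

This round-trips losslessly and yields an effective rate of 1.6 bits per weight.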

πŸš€ Key Specifications

  • Parameters: 1,030,000,000,000 (1.03T)
  • Precision: 1.5-bit ternary (5 weights packed per byte)
  • Architecture: OverflowForCausalLM
  • Layers: 128
  • Hidden Size: 16,384
  • Attention: Grouped Query Attention (GQA) with 16 KV heads
  • Format: .safetensors
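The specifications above imply a rough packed-weight footprint. A back-of-the-envelope estimate, assuming the 1.03T parameter count and 5 weights per byte from the card (and ignoring embeddings, scale factors, and safetensors metadata overhead):

```python
# Rough packed-weight footprint from the spec sheet (overhead ignored).
params = 1.03e12                     # 1.03T ternary weights
bytes_total = params / 5             # 5 weights packed into each byte
print(f"{bytes_total / 1e9:.0f} GB")  # 206 GB
```

That ~206 GB total is what the 10 shards being uploaded would need to cover between them.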

πŸ›  Project Status: Initial Sharding

We are currently in the process of sharding the 1.5-bit weights to the Hugging Face Hub.

  • Progress: Shard 2 of 10 currently uploading.
  • Estimated Completion: March 2026.

🧠 Why 1.5-bit?

Unlike standard binary 1-bit models, Overflow-1T adds a 0 state (a neutral weight). This allows the model to effectively "silence" noise across its 1T-parameter space, leading to significantly higher stability in Chain-of-Thought (CoT) reasoning and logic tasks.
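The role of the 0 state can be seen in a toy matrix–vector product: a +1 weight adds its input, a −1 weight subtracts it, and a 0 weight drops it entirely, so the whole product needs no multiplications. A minimal Python sketch of the idea (illustrative only, not the actual C++ kernel):

```python
def ternary_matvec(W, x):
    """Multiply a ternary weight matrix W (rows of {-1, 0, 1}) by vector x.

    Only additions and subtractions are used; 0 weights are skipped,
    which is how the neutral state silences an input entirely.
    """
    out = []
    for row in W:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi
            elif w == -1:
                acc -= xi
            # w == 0: contributes nothing
        out.append(acc)
    return out

# Example: the middle input (5.0) is silenced by the 0 weights.
print(ternary_matvec([[1, 0, -1], [0, 0, 1]], [2.0, 5.0, 3.0]))  # [-1.0, 3.0]
```

A pure 1-bit model has no such "off" option: every weight must either add or subtract its input, so unwanted inputs can only be cancelled indirectly.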


Created by CooLLaMACEO
