🌊 Overflow-1T

Overflow-1T is a next-generation large language model with 1.03 trillion parameters, built on a custom 1.5-bit ternary ({-1, 0, 1}) architecture.

The Overflow architecture targets massive-scale reasoning while remaining computationally efficient: advanced weight packing and specialized C++ inference kernels are designed to let the model run on consumer-grade hardware.
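The card does not publish the packing scheme, but ternary weights are commonly stored five to a byte via base-3 encoding, since 3^5 = 243 fits in the 256 values of one byte. A minimal illustrative sketch (the helper names `pack5`/`unpack5` are hypothetical, not from the Overflow codebase):

```python
# Hypothetical helpers: pack five ternary weights {-1, 0, 1} into one
# byte using base-3 encoding (3**5 = 243 <= 256), and unpack them again.

def pack5(weights):
    """Pack exactly five ternary weights into a single byte."""
    assert len(weights) == 5 and all(w in (-1, 0, 1) for w in weights)
    byte = 0
    for w in reversed(weights):       # little-endian base-3 digits
        byte = byte * 3 + (w + 1)     # map {-1, 0, 1} -> {0, 1, 2}
    return byte

def unpack5(byte):
    """Recover the five ternary weights from one packed byte."""
    weights = []
    for _ in range(5):
        weights.append(byte % 3 - 1)  # map {0, 1, 2} -> {-1, 0, 1}
        byte //= 3
    return weights

print(unpack5(pack5([-1, 0, 1, 1, -1])))  # [-1, 0, 1, 1, -1]
```

This round-trips losslessly and yields an effective rate of 1.6 bits per weight.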

πŸš€ Key Specifications

  • Parameters: 1,030,000,000,000 (1.03T)
  • Precision: 1.5-bit ternary (5 weights packed per byte)
  • Architecture: OverflowForCausalLM
  • Layers: 128
  • Hidden Size: 16,384
  • Attention: Grouped Query Attention (GQA) with 16 KV heads
  • Format: .safetensors
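The specifications above imply a rough packed-weight footprint. A back-of-the-envelope estimate, assuming the 1.03T parameter count and 5 weights per byte from the card (and ignoring embeddings, scale factors, and safetensors metadata overhead):

```python
# Rough packed-weight footprint from the spec sheet (overhead ignored).
params = 1.03e12                     # 1.03T ternary weights
bytes_total = params / 5             # 5 weights packed into each byte
print(f"{bytes_total / 1e9:.0f} GB")  # 206 GB
```

That ~206 GB total is what the 10 shards being uploaded would need to cover between them.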

πŸ›  Project Status: Initial Sharding

We are currently in the process of sharding the 1.5-bit weights to the Hugging Face Hub.

  • Progress: Shard 2 of 10 currently uploading.
  • Estimated Completion: March 2026.

🧠 Why 1.5-bit?

Unlike standard binary 1-bit models, Overflow-1T adds a 0 state (a neutral weight). This allows the model to effectively "silence" noise across its 1T-parameter space, leading to significantly higher stability in Chain-of-Thought (CoT) reasoning and logic tasks.
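The role of the 0 state can be seen in a toy matrix–vector product: a +1 weight adds its input, a −1 weight subtracts it, and a 0 weight drops it entirely, so the whole product needs no multiplications. A minimal Python sketch of the idea (illustrative only, not the actual C++ kernel):

```python
def ternary_matvec(W, x):
    """Multiply a ternary weight matrix W (rows of {-1, 0, 1}) by vector x.

    Only additions and subtractions are used; 0 weights are skipped,
    which is how the neutral state silences an input entirely.
    """
    out = []
    for row in W:
        acc = 0.0
        for w, xi in zip(row, x):
            if w == 1:
                acc += xi
            elif w == -1:
                acc -= xi
            # w == 0: contributes nothing
        out.append(acc)
    return out

# Example: the middle input (5.0) is silenced by the 0 weights.
print(ternary_matvec([[1, 0, -1], [0, 0, 1]], [2.0, 5.0, 3.0]))  # [-1.0, 3.0]
```

A pure 1-bit model has no such "off" option: every weight must either add or subtract its input, so unwanted inputs can only be cancelled indirectly.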


Created by CooLLaMACEO
