rockylynnstein committed
Commit b9b611b · verified · 1 parent: 5488720

Update README.md

Files changed (1): README.md (+0 −6)
README.md CHANGED

@@ -139,7 +139,6 @@ pip install torch>=2.1.0 transformers>=4.40.0 accelerate compressed-tensors
 | **Base Model** | [microsoft/NextCoder-14B](https://huggingface.co/microsoft/NextCoder-14B) |
 | **Quantization Method** | FP8 E4M3 weight-only |
 | **Framework** | llm-compressor + compressed_tensors |
-| **Calibration Samples** | 2048 (8x industry standard) |
 | **Storage Size** | ~14GB (sharded safetensors) |
 | **VRAM (vLLM)** | ~14GB |
 | **VRAM (Transformers)** | ~28GB+ (decompressed to BF16) |

@@ -188,12 +187,7 @@ The 14B model offers significant improvements over 7B:
 
 **With vLLM**, the 14B model fits comfortably on a single RTX 4090 (24GB) or RTX 5000 Ada (32GB).
 
-## 🔬 Quality Assurance
 
-- **High-quality calibration:** 2048 diverse code samples (8x industry standard of 256)
-- **Validation:** Tested on code generation benchmarks
-- **Format:** Standard compressed_tensors for broad compatibility
-- **Optimization:** Fine-tuned calibration for code-specific patterns
 
 ## 📚 Original Model
 
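The storage and VRAM figures in the diff's table follow directly from per-weight byte counts. A minimal back-of-envelope sketch, assuming roughly 14e9 weight parameters and ignoring KV cache, activations, and any layers kept in higher precision:

```python
# Rough memory math behind the table above.
# Assumption: ~14e9 weight parameters; KV cache, activations, and any
# non-quantized layers (e.g. embeddings) are ignored.
PARAMS = 14e9

fp8_gb = PARAMS * 1 / 1e9   # FP8 E4M3: 1 byte per weight
bf16_gb = PARAMS * 2 / 1e9  # BF16: 2 bytes per weight

print(f"FP8 (storage / vLLM):  ~{fp8_gb:.0f} GB")    # matches the ~14GB rows
print(f"BF16 (decompressed):   ~{bf16_gb:.0f} GB")   # matches the ~28GB+ row
```

This is why the Transformers path needs roughly twice the VRAM: without a runtime FP8 kernel, the compressed_tensors weights are decompressed back to BF16 on load.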