Static quantization of Mistral-Large-Instruct-2411

File notes: the GGUF is split into three parts (PART 1, PART 2, PART 3); download all three before loading the model (see the sketch below the quantization notes).
- Q6_K with token embedding, output, and some other tensors quantized to Q8_0
- 6.65 bpw, a ~1.3% increase in size relative to plain Q6_K (6.65 / 6.5625 ≈ 1.013 against Q6_K's nominal 6.5625 bpw)
- Quantized from F32
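To run the model, all three parts need to end up in the same local directory. A minimal sketch using huggingface_hub and llama-cpp-python, assuming the parts follow llama.cpp's standard gguf-split naming; the exact filenames are hypothetical, so check the repo's file list first:

```python
# A minimal sketch: download the three parts and load with llama-cpp-python.
# The part filenames below are hypothetical; check the repo's file list.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

REPO_ID = "Valeciela/Mistral-Large-Instruct-2411-Q6_K_L-GGUF"

# Assumed llama.cpp split naming: ...-00001-of-00003.gguf etc.
part_names = [
    f"Mistral-Large-Instruct-2411-Q6_K_L-{i:05d}-of-00003.gguf"
    for i in (1, 2, 3)
]
# hf_hub_download caches all parts in the same snapshot directory,
# so the loader can find the siblings of the first part.
paths = [hf_hub_download(repo_id=REPO_ID, filename=n) for n in part_names]

# With standard split naming, pointing at the first part is enough:
# llama.cpp resolves and loads the remaining parts itself.
llm = Llama(
    model_path=paths[0],
    n_ctx=4096,       # context window; raise it if you have the memory
    n_gpu_layers=-1,  # offload every layer to the GPU if one is available
)

reply = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Say hello in one sentence."}]
)
print(reply["choices"][0]["message"]["content"])
```

If the parts are raw byte splits rather than gguf-split output, concatenate them into a single .gguf before loading, since llama.cpp only resolves splits produced by its gguf-split tool.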
Model details
- Format: GGUF
- Model size: 123B params
- Architecture: llama
- Precision: 6-bit
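The page's hardware estimate is login-gated, but a rough weight footprint follows from the card's own numbers (123B params at 6.65 bpw). A minimal sketch of the arithmetic, ignoring GGUF metadata, the KV cache, and runtime buffers:

```python
# Back-of-the-envelope weight footprint: parameter count x bits per weight.
# Rough only: ignores GGUF metadata, the KV cache, and runtime buffers.
params = 123e9   # parameter count from this card
bpw = 6.65       # bits per weight for this Q6_K_L quant
size_gib = params * bpw / 8 / 2**30
print(f"~{size_gib:.1f} GiB of weights")  # prints ~95.2 GiB
```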


Model tree: Valeciela/Mistral-Large-Instruct-2411-Q6_K_L-GGUF, one of 32 quantized variants of the base model.