"Not all quantized models perform well." Serving frameworks: Ollama uses an NVIDIA GPU; llama.cpp uses the CPU with AVX & AMX.
v1k
xbruce22
AI & ML interests
None yet
Recent Activity
liked a model 4 days ago: Tongyi-MAI/Z-Image-Turbo
upvoted an article 4 days ago: New in llama.cpp: Model Management
commented on an article 4 days ago: New in llama.cpp: Model Management