📋 Model Description


base_model:
  - zai-org/GLM-4.5-Air
base_model_relation: quantized
quantized_by: ddh0

GLM-4.5-Air-GGUF

This repository contains several custom GGUF quantizations of GLM-4.5-Air, to be used with llama.cpp:

| Filename | Size (GiB) | Average BPW | Direct link |
|---|---|---|---|
| GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf | 57.43 | 4.47 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf | 63.86 | 4.97 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q5_1.gguf | 67.82 | 5.27 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q8_0.gguf | 77.71 | 6.04 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf | 85.63 | 6.66 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf | 94.04 | 7.31 | Download |
| GLM-4.5-Air-Q8_0.gguf | 109.39 | 8.50 | Download |
| GLM-4.5-Air-bf16.gguf | 205.81 | 16.00 | Download |

These quantizations use Q8_0 for all tensors by default; only the dense FFN block and the conditional experts are downgraded. The shared expert is always kept in Q8_0.
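
The Average BPW column can be roughly cross-checked against the file sizes. The sketch below is illustrative only and is not how the table was produced: it assumes the bf16 file stores every weight at exactly 16 bits and that GGUF metadata overhead is negligible, derives an approximate parameter count from that file, and then estimates bits per weight for the other files. The names `FILES`, `estimate_param_count`, and `estimate_bpw` are hypothetical helpers introduced here.

```python
# Sizes (GiB) and average BPW copied from the table above.
FILES = {
    "GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf": (57.43, 4.47),
    "GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf": (63.86, 4.97),
    "GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q5_1.gguf": (67.82, 5.27),
    "GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q8_0.gguf": (77.71, 6.04),
    "GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf": (85.63, 6.66),
    "GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf": (94.04, 7.31),
    "GLM-4.5-Air-Q8_0.gguf": (109.39, 8.50),
    "GLM-4.5-Air-bf16.gguf": (205.81, 16.00),
}

GIB = 2**30  # bytes per GiB


def estimate_param_count(size_gib: float, bpw: float) -> float:
    """Approximate weight count implied by a file size and its bits per weight."""
    return size_gib * GIB * 8 / bpw


def estimate_bpw(size_gib: float, n_params: float) -> float:
    """Approximate bits per weight implied by a file size and a weight count."""
    return size_gib * GIB * 8 / n_params


# Assumption: the bf16 file is a clean 16-bit-per-weight reference.
ref_size, ref_bpw = FILES["GLM-4.5-Air-bf16.gguf"]
N_PARAMS = estimate_param_count(ref_size, ref_bpw)

ESTIMATES = {name: estimate_bpw(size, N_PARAMS) for name, (size, _) in FILES.items()}

if __name__ == "__main__":
    print(f"approximate weight count: {N_PARAMS:.3e}")
    for name, (size, listed) in FILES.items():
        print(f"{name}: listed {listed:.2f}, estimated {ESTIMATES[name]:.2f}")
```

Under these assumptions the estimates land within about 0.01 BPW of the listed values. The small differences are expected, since the published sizes are rounded and real files carry some metadata.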

📂 GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf
Recommended LFS Q3
57.44 GB Download
GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf
LFS Q4
63.87 GB Download
GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q5_1.gguf
LFS Q4
67.83 GB Download
GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q8_0.gguf
LFS Q4
77.72 GB Download
GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf
LFS Q5
85.64 GB Download
GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf
LFS Q6
94.05 GB Download
GLM-4.5-Air-Q8_0.gguf
LFS Q8
109.39 GB Download
GLM-4.5-Air-bf16.gguf
LFS FP16
205.82 GB Download