Model Description
base_model:
- zai-org/GLM-4.5-Air
GLM-4.5-Air-GGUF
This repository contains several custom GGUF quantizations of GLM-4.5-Air, to be used with llama.cpp.
The naming scheme for these custom quantizations is as follows:
ModelName-DefaultType-FFN-UpType-GateType-DownType.gguf
Here, DefaultType is the default tensor type, and UpType, GateType, and DownType are the tensor types used for the ffn_up_exps, ffn_gate_exps, and ffn_down_exps tensors, respectively.
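The naming scheme can be unpacked mechanically. The following is a minimal Python sketch, not part of llama.cpp or of this repository's tooling. The function name `parse_quant_filename`, the optional `-vN` suffix, and the fallback for files that carry no `-FFN-` group are assumptions inferred from the filenames listed in this repository.

```python
import re

# Pattern for ModelName-DefaultType[-FFN-UpType-GateType-DownType][-vN].gguf.
# The optional FFN group and the optional "-vN" suffix are inferred from the
# filenames in this repository, not from a formal specification.
_TYPE = r"[A-Za-z0-9_]+"
_PATTERN = re.compile(
    rf"(?P<model>.+?)-(?P<default>{_TYPE})"
    rf"(?:-FFN-(?P<up>{_TYPE})-(?P<gate>{_TYPE})-(?P<down>{_TYPE}))?"
    r"(?:-(?P<version>v\d+))?\.gguf"
)


def parse_quant_filename(filename: str) -> dict:
    """Split a quantization filename into its naming-scheme components."""
    match = _PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"filename does not follow the naming scheme: {filename}")
    parts = match.groupdict()
    # Assumption: a file without an FFN group (e.g. a plain Q8_0 or bf16 file)
    # uses the default type for the expert tensors as well.
    for key in ("up", "gate", "down"):
        if parts[key] is None:
            parts[key] = parts["default"]
    return parts


if __name__ == "__main__":
    info = parse_quant_filename("GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf")
    for field, value in info.items():
        print(f"{field}: {value}")
```

The model name is matched lazily because it contains hyphens itself (GLM-4.5-Air), so the first segment that can serve as DefaultType ends the model name.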
Original quantizations
These quantizations use Q8_0 for all tensors by default; only the dense FFN block and the conditional experts are downgraded. The shared expert is always kept in Q8_0. They were quantized using bartowski's imatrix.
| Filename | Size (GB) | Size (GiB) | Average BPW | Direct link |
|---|---|---|---|---|
| GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf | 61.66 | 57.43 | 4.47 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf | 68.56 | 63.86 | 4.97 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q5_1.gguf | 72.82 | 67.82 | 5.27 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q8_0.gguf | 83.44 | 77.71 | 6.04 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf | 91.94 | 85.63 | 6.66 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf | 100.97 | 94.04 | 7.31 | Download |
| GLM-4.5-Air-Q8_0.gguf | 117.45 | 109.39 | 8.50 | Download |
| GLM-4.5-Air-bf16.gguf | 220.98 | 205.81 | 16.00 | Download |
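The size and BPW columns are related by simple arithmetic. The sketch below assumes that GB means 10^9 bytes, GiB means 2^30 bytes, and average BPW is the total number of file bits divided by the number of weights. The weight count is inferred from the bf16 row rather than read from the model's metadata, and the helper names are hypothetical, so results can differ from the table by about 0.01.

```python
GIB = 2**30  # bytes per GiB


def gb_to_gib(size_gb: float) -> float:
    """Convert decimal gigabytes (10^9 bytes) to binary gibibytes (2^30 bytes)."""
    return size_gb * 1e9 / GIB


def implied_weight_count(size_gb: float, bpw: float) -> float:
    """Estimate the number of weights from a file size and its average BPW."""
    return size_gb * 1e9 * 8 / bpw


def bits_per_weight(size_gb: float, n_weights: float) -> float:
    """Average bits per weight, ignoring metadata overhead in the file."""
    return size_gb * 1e9 * 8 / n_weights


if __name__ == "__main__":
    # Assumption: the bf16 file stores every weight in exactly 16 bits.
    n_weights = implied_weight_count(220.98, 16.00)

    print(f"{gb_to_gib(61.66):.2f} GiB")                    # 57.43 GiB
    print(f"{bits_per_weight(117.45, n_weights):.2f} BPW")  # 8.50 BPW
```

Because the inputs are themselves rounded to two decimals, some rows reproduce the published GiB or BPW figure only to within the last digit.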
v2 quantizations
These quantizations use Q8_0 for all tensors by default, including the dense FFN block. Only the conditional experts are downgraded. The shared expert is always kept in Q8_0. They were quantized using my own imatrix (the calibration text corpus can be found here).
| Filename | Size (GB) | Size (GiB) | Average BPW | Direct link |
|---|---|---|---|---|
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf | 60.94 | 56.76 | 4.41 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-IQ4_NL-v2.gguf | 64.39 | 59.97 | 4.66 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0-v2.gguf | 68.63 | 63.92 | 4.97 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q8_0-v2.gguf | 81.36 | 75.78 | 5.89 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0-v2.gguf | 91.97 | 85.66 | 6.66 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0-v2.gguf | 100.99 | 94.06 | 7.31 | Download |
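The v2 rule (everything in Q8_0 except the conditional experts) can be written as a small lookup. This is an illustrative Python sketch only; it is not how the files were produced and it does not call llama.cpp. The function name `v2_tensor_type` is hypothetical, and the example tensor names (`blk.<layer>.<tensor>.weight`, with `ffn_up_shexp` for the shared expert and `ffn_up` for the dense FFN block) are assumed to follow llama.cpp's usual GGUF naming.

```python
def v2_tensor_type(tensor_name: str, up: str, gate: str, down: str,
                   default: str = "Q8_0") -> str:
    """Return the tensor type a v2 quantization would use for one tensor."""
    expert_types = {
        "ffn_up_exps": up,
        "ffn_gate_exps": gate,
        "ffn_down_exps": down,
    }
    # Compare whole dot-separated components so that shared-expert tensors
    # (assumed to be named like "ffn_up_shexp") never match an expert entry.
    for component in tensor_name.split("."):
        if component in expert_types:
            return expert_types[component]
    # Dense FFN, shared expert, attention, and all other tensors stay at default.
    return default


if __name__ == "__main__":
    # Types taken from GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf
    types = dict(up="IQ4_XS", gate="IQ3_S", down="IQ4_NL")
    for name in (
        "blk.10.ffn_up_exps.weight",
        "blk.10.ffn_gate_exps.weight",
        "blk.10.ffn_down_exps.weight",
        "blk.10.ffn_up_shexp.weight",
        "blk.0.ffn_up.weight",
    ):
        print(name, "->", v2_tensor_type(name, **types))
```

For the original (non-v2) quantizations the dense FFN block is downgraded as well, so this lookup would need an extra rule for those tensors.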
GGUF File List
| Filename | Quant class | Size | Notes | Download |
|---|---|---|---|---|
| GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf | Q3 | 57.44 GiB | Recommended | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf | Q3 | 56.76 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-IQ4_NL-v2.gguf | Q4 | 59.98 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0-v2.gguf | Q4 | 63.93 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf | Q4 | 63.87 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q8_0-v2.gguf | Q4 | 75.79 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q5_1.gguf | Q4 | 67.83 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q8_0.gguf | Q4 | 77.72 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0-v2.gguf | Q5 | 85.67 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf | Q5 | 85.64 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0-v2.gguf | Q6 | 94.07 GiB | | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf | Q6 | 94.05 GiB | | Download |
| GLM-4.5-Air-Q8_0.gguf | Q8 | 109.39 GiB | | Download |
| GLM-4.5-Air-bf16.gguf | BF16 | 205.82 GiB | | Download |
| GLM-4.5-Air-ddh0_v2-imatrix.gguf | | 217.81 MiB | imatrix file | Download |