📋 Model Description


base_model:
  - zai-org/GLM-4.5-Air
base_model_relation: quantized
quantized_by: ddh0
license: mit

GLM-4.5-Air-GGUF

This repository contains several custom GGUF quantizations of GLM-4.5-Air, to be used with llama.cpp.

The naming scheme for these custom quantizations is as follows:

ModelName-DefaultType-FFN-UpType-GateType-DownType.gguf

Where DefaultType refers to the default tensor type, and UpType, GateType, and DownType refer to the tensor types used for the ffn_up_exps, ffn_gate_exps, and ffn_down_exps tensors respectively.
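To make the naming scheme concrete, here is a minimal sketch of a parser that splits a filename into its parts. The function name `parse_quant_filename` and the `QuantName` tuple are hypothetical helpers, not part of llama.cpp or this repository. The optional `-v2` suffix is handled because it appears in the v2 filenames, even though the scheme above does not list it. The sketch assumes type names never contain a hyphen, which holds for every file listed here.

```python
from typing import NamedTuple, Optional


class QuantName(NamedTuple):
    """Parts of a filename following the repository's naming scheme (hypothetical helper)."""
    model: str
    default_type: str
    up_type: Optional[str]
    gate_type: Optional[str]
    down_type: Optional[str]
    version: Optional[str]


def parse_quant_filename(filename: str) -> QuantName:
    """Split 'Model-Default-FFN-Up-Gate-Down[-v2].gguf' into its components."""
    stem = filename[:-5] if filename.endswith(".gguf") else filename
    # Everything before "-FFN-" is "<model>-<default type>".
    head, sep, tail = stem.partition("-FFN-")
    model, _, default_type = head.rpartition("-")
    if not sep:
        # Uniform files such as GLM-4.5-Air-Q8_0.gguf carry no per-tensor overrides.
        return QuantName(model, default_type, None, None, None, None)
    parts = tail.split("-")
    if len(parts) not in (3, 4):
        raise ValueError(f"unexpected FFN suffix in {filename!r}")
    up_type, gate_type, down_type = parts[:3]
    version = parts[3] if len(parts) == 4 else None
    return QuantName(model, default_type, up_type, gate_type, down_type, version)


if __name__ == "__main__":
    print(parse_quant_filename("GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf"))
```

The imatrix file (GLM-4.5-Air-ddh0_v2-imatrix.gguf) does not follow this scheme and is not meant to be passed to the parser.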

Original quantizations

These quantizations use Q8_0 for all tensors by default; only the dense FFN block and the conditional experts are downgraded. The shared expert is always kept in Q8_0. They were quantized using bartowski's imatrix.

| Filename | Size (GB) | Size (GiB) | Average BPW | Direct link |
|---|---|---|---|---|
| GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf | 61.66 | 57.43 | 4.47 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf | 68.56 | 63.86 | 4.97 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q5_1.gguf | 72.82 | 67.82 | 5.27 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q8_0.gguf | 83.44 | 77.71 | 6.04 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf | 91.94 | 85.63 | 6.66 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf | 100.97 | 94.04 | 7.31 | Download |
| GLM-4.5-Air-Q8_0.gguf | 117.45 | 109.39 | 8.50 | Download |
| GLM-4.5-Air-bf16.gguf | 220.98 | 205.81 | 16.00 | Download |
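The three numeric columns are related by simple arithmetic, which the following sketch illustrates. It assumes that GB means 10^9 bytes and GiB means 2^30 bytes, that the bf16 file stores exactly 16 bits per weight, and that metadata overhead is negligible. The parameter count is therefore only an estimate derived from the bf16 row, and the results agree with the table to within rounding rather than exactly. All function names are hypothetical.

```python
GIB = 2 ** 30  # bytes per GiB


def gb_to_gib(size_gb: float) -> float:
    """Convert decimal gigabytes (10^9 bytes) to binary gibibytes (2^30 bytes)."""
    return size_gb * 1e9 / GIB


def estimate_param_count(size_gb: float, bpw: float) -> float:
    """Estimate the number of weights from a file size and its average bits per weight."""
    return size_gb * 1e9 * 8 / bpw


def average_bpw(size_gb: float, n_params: float) -> float:
    """Average bits per weight for a file of the given size."""
    return size_gb * 1e9 * 8 / n_params


if __name__ == "__main__":
    # Assumption: the bf16 row (220.98 GB at 16.00 BPW) fixes the parameter count.
    n_params = estimate_param_count(220.98, 16.00)
    for name, size_gb in [
        ("Q8_0-FFN-IQ3_S-IQ3_S-Q5_0", 61.66),
        ("Q8_0", 117.45),
    ]:
        print(name, round(gb_to_gib(size_gb), 2), round(average_bpw(size_gb, n_params), 2))
```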

v2 quantizations

These quantizations use Q8_0 for all tensors by default, including the dense FFN block. Only the conditional experts are downgraded. The shared expert is always kept in Q8_0. They were quantized using my own imatrix (the calibration text corpus can be found here).

| Filename | Size (GB) | Size (GiB) | Average BPW | Direct link |
|---|---|---|---|---|
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf | 60.94 | 56.76 | 4.41 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-IQ4_NL-v2.gguf | 64.39 | 59.97 | 4.66 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0-v2.gguf | 68.63 | 63.92 | 4.97 | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q8_0-v2.gguf | 81.36 | 75.78 | 5.89 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0-v2.gguf | 91.97 | 85.66 | 6.66 | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0-v2.gguf | 100.99 | 94.06 | 7.31 | Download |
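A recipe of this shape (one default type, with only the conditional-expert tensors overridden) could plausibly be expressed as a llama-quantize invocation. The exact commands used for these files are not given here, so treat the following as a sketch under assumptions. It relies on llama-quantize's `--imatrix` and `--tensor-type` options, the input and output paths are placeholders, and the helper name is hypothetical. Check `llama-quantize --help` on your build before relying on it.

```python
from typing import List


def build_quantize_command(
    src: str,
    dst: str,
    imatrix: str,
    default_type: str,
    up_type: str,
    gate_type: str,
    down_type: str,
) -> List[str]:
    """Build an argv list for llama-quantize (hypothetical helper, nothing is executed)."""
    return [
        "llama-quantize",
        "--imatrix", imatrix,
        # Per-tensor overrides for the conditional experts; types are lowercased
        # here as an assumption about what the option accepts most reliably.
        "--tensor-type", f"ffn_up_exps={up_type.lower()}",
        "--tensor-type", f"ffn_gate_exps={gate_type.lower()}",
        "--tensor-type", f"ffn_down_exps={down_type.lower()}",
        src,
        dst,
        default_type,  # default type for every tensor that is not overridden
    ]


if __name__ == "__main__":
    cmd = build_quantize_command(
        src="GLM-4.5-Air-bf16.gguf",
        dst="GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf",
        imatrix="GLM-4.5-Air-ddh0_v2-imatrix.gguf",
        default_type="Q8_0",
        up_type="IQ4_XS",
        gate_type="IQ3_S",
        down_type="IQ4_NL",
    )
    print(" ".join(cmd))
```

The list can be passed to `subprocess.run` once the paths point at real files. The original (non-v2) files also downgrade the dense FFN block, which this sketch does not cover.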

📂 GGUF File List

| Filename | Tags | Size | Download |
|---|---|---|---|
| GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf | Recommended, LFS, Q3 | 57.44 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf | LFS, Q3 | 56.76 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-IQ4_NL-v2.gguf | LFS, Q4 | 59.98 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0-v2.gguf | LFS, Q4 | 63.93 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf | LFS, Q4 | 63.87 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q8_0-v2.gguf | LFS, Q4 | 75.79 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q5_1.gguf | LFS, Q4 | 67.83 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-Q4_K-Q4_K-Q8_0.gguf | LFS, Q4 | 77.72 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0-v2.gguf | LFS, Q5 | 85.67 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf | LFS, Q5 | 85.64 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0-v2.gguf | LFS, Q6 | 94.07 GB | Download |
| GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf | LFS, Q6 | 94.05 GB | Download |
| GLM-4.5-Air-Q8_0.gguf | LFS, Q8 | 109.39 GB | Download |
| GLM-4.5-Air-bf16.gguf | LFS, FP16 | 205.82 GB | Download |
| GLM-4.5-Air-ddh0_v2-imatrix.gguf | LFS | 217.81 MB | Download |
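When choosing among these files, one simple approach is to pick the largest one that fits a memory budget. The sketch below does that over a subset of the list, using the sizes exactly as shown there. The function name and the `headroom` value (space reserved for context and runtime buffers) are hypothetical, and real memory use depends on context length and backend, so this is a rough first filter rather than a guarantee.

```python
from typing import Dict, Optional

# Sizes copied from the file list above, in the units shown there.
FILE_SIZES: Dict[str, float] = {
    "GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ3_S-IQ4_NL-v2.gguf": 56.76,
    "GLM-4.5-Air-Q8_0-FFN-IQ3_S-IQ3_S-Q5_0.gguf": 57.44,
    "GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-IQ4_NL-v2.gguf": 59.98,
    "GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0-v2.gguf": 63.93,
    "GLM-4.5-Air-Q8_0-FFN-IQ4_XS-IQ4_XS-Q8_0-v2.gguf": 75.79,
    "GLM-4.5-Air-Q8_0-FFN-Q5_K-Q5_K-Q8_0-v2.gguf": 85.67,
    "GLM-4.5-Air-Q8_0-FFN-Q6_K-Q6_K-Q8_0-v2.gguf": 94.07,
    "GLM-4.5-Air-Q8_0.gguf": 109.39,
}


def largest_fitting(
    sizes: Dict[str, float], budget: float, headroom: float = 4.0
) -> Optional[str]:
    """Return the largest file whose size plus headroom fits the budget, or None."""
    candidates = {name: size for name, size in sizes.items() if size + headroom <= budget}
    if not candidates:
        return None
    return max(candidates, key=lambda name: candidates[name])


if __name__ == "__main__":
    print(largest_fitting(FILE_SIZES, budget=64.0))
    print(largest_fitting(FILE_SIZES, budget=128.0))
```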