---
base_model:
  - ArliAI/GLM-4.5-Air-Derestricted
base_model_relation: quantized
quantized_by: ddh0
license: mit
---

# GLM-4.5-Air-Derestricted-GGUF

This repository contains several custom GGUF quantizations of ArliAI/GLM-4.5-Air-Derestricted, to be used with llama.cpp.

The naming scheme for these custom quantizations is as follows:

`ModelName-DefaultType-FFN-UpType-GateType-DownType.gguf`

Where `DefaultType` refers to the default tensor type, and `UpType`, `GateType`, and `DownType` refer to the tensor types used for the `ffn_up_exps`, `ffn_gate_exps`, and `ffn_down_exps` tensors, respectively.
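The naming scheme above can be decoded mechanically. Here is a small sketch of such a parser (a hypothetical helper, not part of this repo or llama.cpp):

```python
def parse_quant_filename(filename: str) -> dict:
    """Parse a filename of the form
    ModelName-DefaultType-FFN-UpType-GateType-DownType.gguf
    into its parts. When there is no -FFN- override segment, all
    three expert tensor types fall back to the default type."""
    stem = filename.removesuffix(".gguf")
    if "-FFN-" in stem:
        prefix, ffn = stem.split("-FFN-")
        up, gate, down = ffn.split("-")
        model, default = prefix.rsplit("-", 1)
        return {"model": model, "default": default,
                "ffn_up_exps": up, "ffn_gate_exps": gate,
                "ffn_down_exps": down}
    model, default = stem.rsplit("-", 1)
    return {"model": model, "default": default,
            "ffn_up_exps": default, "ffn_gate_exps": default,
            "ffn_down_exps": default}
```

For example, `GLM-4.5-Air-Derestricted-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf` parses to default `Q8_0` with `ffn_up_exps`/`ffn_gate_exps` at `IQ4_XS` and `ffn_down_exps` at `Q5_0`.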

## Quantizations

These quantizations use Q8_0 for all tensors by default, including the dense FFN block. Only the conditional experts are downgraded; the shared expert is always kept at Q8_0. They were quantized using my own imatrix (the calibration text corpus can be found here).
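A mix like this can be reproduced with recent llama.cpp builds, which let `llama-quantize` override the type of tensors matching a name pattern. The exact flag spelling and paths below are assumptions; check `llama-quantize --help` on your build:

```shell
# Sketch: Q8_0 default, with only the conditional expert tensors
# downgraded via per-tensor overrides. The shared-expert tensors
# (ffn_*_shexp) do not match these patterns, so they stay at the
# Q8_0 default. Paths and imatrix file are placeholders.
./llama-quantize \
    --imatrix imatrix.dat \
    --tensor-type ffn_up_exps=iq4_xs \
    --tensor-type ffn_gate_exps=iq4_xs \
    --tensor-type ffn_down_exps=q5_0 \
    GLM-4.5-Air-Derestricted-bf16.gguf \
    GLM-4.5-Air-Derestricted-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf \
    Q8_0
```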

| Filename | Size (GB) | Size (GiB) | Average BPW | Direct link |
|---|---|---|---|---|
| GLM-4.5-Air-Derestricted-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf (recommended) | 68.63 | 63.92 | 4.97 | Download |
| GLM-4.5-Air-Derestricted-Q8_0-FFN-Q5_K-Q5_K-Q8_0.gguf | 91.97 | 85.66 | 6.66 | Download |
| GLM-4.5-Air-Derestricted-Q8_0-FFN-Q6_K-Q6_K-Q8_0.gguf | 100.99 | 94.06 | 7.31 | Download |
| GLM-4.5-Air-Derestricted-Q8_0.gguf | 117.45 | 109.38 | 8.51 | Download |
| GLM-4.5-Air-Derestricted-bf16.gguf | 220.98 | 205.81 | 16.00 | Download 1/2 Β· Download 2/2 |
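Any of these files can be run directly with llama.cpp. A minimal invocation sketch (filename, context size, and GPU layer count are placeholders to adjust for your hardware):

```shell
# Run the recommended quant with llama.cpp's llama-cli.
./llama-cli \
    -m GLM-4.5-Air-Derestricted-Q8_0-FFN-IQ4_XS-IQ4_XS-Q5_0.gguf \
    -ngl 99 -c 8192 \
    -p "Hello"

# For the split bf16 files, point -m at the first shard
# (GLM-4.5-Air-Derestricted-bf16-00001-of-00002.gguf);
# llama.cpp picks up the remaining shards automatically.
```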
