Model Description
base_model:
- Qwen/Qwen3.5-397B-A17B
This repo contains specialized MoE quants for Qwen3.5-397B-A17B. The idea is that, because the FFN tensors are huge compared to the rest of the tensors in the model,
it should be possible to achieve better quality at a smaller overall model size than a comparable naive quantization.
To that end, the default quantization type is kept at high quality, while the FFN up, FFN gate, and FFN down tensors are quantized down.

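
As a rough illustration of the per-tensor idea described above, the sketch below maps a tensor name to a quant type. It is not the script used to build these files. The `blk.<layer>.ffn_*_exps.weight` naming pattern, the helper `quant_type_for`, and the reading of the table's Mixture column as default / FFN up / FFN gate / FFN down are all assumptions made for the example.

```python
import re

# Mixture of the 4.93 BPW row in the table, written with llama.cpp type names.
# Assumed column order: default / FFN up / FFN gate / FFN down.
MIXTURE = {
    "default": "Q8_0",
    "ffn_up": "Q4_K",
    "ffn_gate": "Q4_K",
    "ffn_down": "Q5_K",
}

# Assumption: routed-expert FFN tensors follow the GGUF naming pattern
# "blk.<layer>.ffn_{up,gate,down}_exps.weight".
EXPERT_FFN = re.compile(r"^blk\.\d+\.(ffn_up|ffn_gate|ffn_down)_exps\.weight$")


def quant_type_for(tensor_name: str, mixture: dict = MIXTURE) -> str:
    """Return the quant type a tensor would get under the given mixture."""
    match = EXPERT_FFN.match(tensor_name)
    if match:
        return mixture[match.group(1)]
    # Everything else (attention, router, embeddings, ...) keeps the default.
    return mixture["default"]


for name in (
    "blk.0.attn_q.weight",
    "blk.0.ffn_gate_inp.weight",
    "blk.0.ffn_up_exps.weight",
    "blk.0.ffn_down_exps.weight",
):
    print(name, "->", quant_type_for(name))
```

Anchoring the pattern on the full tensor name keeps the router tensor (`ffn_gate_inp`) on the default type instead of accidentally matching the `ffn_gate` rule.
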
| Quant | Size | Mixture (default / FFN up / FFN gate / FFN down) | PPL | Mean PPL(Q)/PPL(base) - 1 | KLD |
|---|---|---|---|---|---|
| Q5_K_M | 273.49 GiB (5.93 BPW) | Q8_0 / Q5_K / Q5_K / Q6_K | 4.617400 ± 0.057235 | +0.0156% | 0.002553 ± 0.000078 |
| Q5_K_S | 257.55 GiB (5.58 BPW) | Q8_0 / Q5_K / Q5_K / Q5_K | 4.620864 ± 0.057279 | +0.0907% | 0.002903 ± 0.000085 |
| Q4_K_M | 227.55 GiB (4.93 BPW) | Q8_0 / Q4_K / Q4_K / Q5_K | 4.624688 ± 0.057341 | +0.1735% | 0.004496 ± 0.000117 |
| IQ4_XS | 176.92 GiB (3.83 BPW) | Q8_0 / IQ3_S / IQ3_S / IQ4_XS | 4.653226 ± 0.057738 | +0.7916% | 0.011963 ± 0.000309 |
| IQ3_S | 136.31 GiB (2.95 BPW) | Q6_K / IQ2_S / IQ2_S / IQ3_S | 4.745153 ± 0.059208 | +2.7828% | 0.033163 ± 0.000791 |
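
The columns of the table can be cross-checked with a little arithmetic. The sketch below assumes that sizes are in GiB (2^30 bytes), that BPW means bits per weight averaged over the whole file, and that the percentage column is the relative perplexity increase over the baseline, which is what the positive signs suggest. The helper names are made up for this example.

```python
GIB = 2**30  # bytes per GiB


def implied_weights(size_gib: float, bpw: float) -> float:
    """Number of weights implied by a file size and its bits per weight."""
    return size_gib * GIB * 8 / bpw


def implied_base_ppl(ppl_q: float, delta_percent: float) -> float:
    """Baseline PPL implied by a quant's PPL and its relative increase in percent."""
    return ppl_q / (1 + delta_percent / 100)


# (quant, size in GiB, BPW, PPL, relative PPL increase in %), copied from the table.
ROWS = [
    ("Q5_K_M", 273.49, 5.93, 4.617400, 0.0156),
    ("Q5_K_S", 257.55, 5.58, 4.620864, 0.0907),
    ("Q4_K_M", 227.55, 4.93, 4.624688, 0.1735),
    ("IQ4_XS", 176.92, 3.83, 4.653226, 0.7916),
    ("IQ3_S", 136.31, 2.95, 4.745153, 2.7828),
]

for quant, size_gib, bpw, ppl, delta in ROWS:
    weights = implied_weights(size_gib, bpw)
    base = implied_base_ppl(ppl, delta)
    print(f"{quant}: ~{weights / 1e9:.0f}B weights, implied base PPL {base:.4f}")
```

Every row implies a weight count a little under 400 billion, in line with the 397B in the model name, and the same baseline perplexity, so the size, BPW, and percentage columns are mutually consistent under these assumptions.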