📋 Model Description


base_model:
  • Qwen/Qwen3.5-397B-A17B

This repo contains specialized MoE quants for Qwen3.5-397B-A17B. The FFN tensors are far larger than the rest of the tensors in the model,
so it should be possible to achieve better quality at a smaller overall model size than a comparable naive quantization.
To that end, the default quantization type is kept at high quality, and the FFN UP, FFN GATE, and FFN DOWN tensors are quantized down.
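The size argument above can be sketched as a weighted average of bits per weight (BPW). This is only a rough estimate under stated assumptions: the `FFN_FRACTION` value is hypothetical rather than measured from the model, the three FFN tensor groups are assumed to be equally sized, and every non-FFN tensor is assumed to use the default type. The per-type BPW values follow from the llama.cpp block layouts (for example, Q8_0 stores 32 weights in 34 bytes).

```python
# Bits per weight for a few llama.cpp quantization types
# (block size in bytes * 8 / weights per block).
BPW = {"Q8_0": 8.5, "Q6_K": 6.5625, "Q5_K": 5.5, "Q4_K": 4.5}


def mixture_bpw(default, ffn_up, ffn_gate, ffn_down, ffn_fraction):
    """Estimate the average bits per weight of a mixed quantization.

    ffn_fraction is the assumed share of all weights that sit in the
    FFN up/gate/down tensors, split evenly between the three groups.
    """
    ffn_bpw = (BPW[ffn_up] + BPW[ffn_gate] + BPW[ffn_down]) / 3
    return (1 - ffn_fraction) * BPW[default] + ffn_fraction * ffn_bpw


# Hypothetical share of weights in the FFN expert tensors (an assumption).
FFN_FRACTION = 0.97

mixed = mixture_bpw("Q8_0", "Q5_K", "Q5_K", "Q6_K", FFN_FRACTION)
uniform_q8 = mixture_bpw("Q8_0", "Q8_0", "Q8_0", "Q8_0", FFN_FRACTION)

print(f"Q8_0 / Q5_K / Q5_K / Q6_K: {mixed:.2f} BPW")
print(f"uniform Q8_0:              {uniform_q8:.2f} BPW")
```

With an FFN share this large, the default type contributes little to the total size, which is why it can stay at Q8_0 without costing much.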

| Quant | Size | Mixture | PPL | 1-(Mean PPL(Q)/PPL(base)) | KLD |
| --- | --- | --- | --- | --- | --- |
| Q5_K_M | 273.49 GiB (5.93 BPW) | Q8_0 / Q5_K / Q5_K / Q6_K | 4.617400 ± 0.057235 | +0.0156% | 0.002553 ± 0.000078 |
| Q5_K_S | 257.55 GiB (5.58 BPW) | Q8_0 / Q5_K / Q5_K / Q5_K | 4.620864 ± 0.057279 | +0.0907% | 0.002903 ± 0.000085 |
| Q4_K_M | 227.55 GiB (4.93 BPW) | Q8_0 / Q4_K / Q4_K / Q5_K | 4.624688 ± 0.057341 | +0.1735% | 0.004496 ± 0.000117 |
| IQ4_XS | 176.92 GiB (3.83 BPW) | Q8_0 / IQ3_S / IQ3_S / IQ4_XS | 4.653226 ± 0.057738 | +0.7916% | 0.011963 ± 0.000309 |
| IQ3_S | 136.31 GiB (2.95 BPW) | Q6_K / IQ2_S / IQ2_S / IQ3_S | 4.745153 ± 0.059208 | +2.7828% | 0.033163 ± 0.000791 |
[Figures: KLD graph, PPL graph]
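The figures in the table can be cross-checked with a little arithmetic. The sketch below assumes that BPW means total file bits divided by total weights, and that the percentage column is the relative perplexity increase over the unquantized baseline, i.e. PPL(Q)/PPL(base) minus 1. Both are interpretations of the table, not something the repo states, and the helper names are illustrative only.

```python
GIB = 2**30  # bytes per GiB

# (quant, size in GiB, BPW, PPL, PPL increase in percent), copied from the table.
ROWS = [
    ("Q5_K_M", 273.49, 5.93, 4.617400, 0.0156),
    ("Q5_K_S", 257.55, 5.58, 4.620864, 0.0907),
    ("Q4_K_M", 227.55, 4.93, 4.624688, 0.1735),
    ("IQ4_XS", 176.92, 3.83, 4.653226, 0.7916),
    ("IQ3_S", 136.31, 2.95, 4.745153, 2.7828),
]


def implied_params(size_gib, bpw):
    """Weight count implied by file size and bits per weight."""
    return size_gib * GIB * 8 / bpw


def implied_base_ppl(ppl, increase_percent):
    """Baseline perplexity implied by a quant's PPL and its relative increase."""
    return ppl / (1 + increase_percent / 100)


for name, size_gib, bpw, ppl, increase in ROWS:
    params_b = implied_params(size_gib, bpw) / 1e9
    base = implied_base_ppl(ppl, increase)
    print(f"{name:7s} ~{params_b:.1f}B weights, implied base PPL {base:.4f}")
```

If the assumptions hold, every row should imply roughly the same weight count and the same baseline perplexity, which makes this a quick consistency check on the table.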

📂 GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
mmproj-Qwen3.5-397B-A17B-BF16.gguf
Recommended LFS FP16
879.01 MB Download
mmproj-Qwen3.5-397B-A17B-F16.gguf
LFS FP16
875.63 MB Download
mmproj-Qwen3.5-397B-A17B-F32.gguf
LFS
1.7 GB Download
mmproj-Qwen3.5-397B-A17B-Q8_0.gguf
LFS Q8
595.31 MB Download