Model Description


base_model: Qwen/Qwen3.5-122B-A10B

This repo contains specialized MoE quants for Qwen3.5-122B-A10B. Because the FFN tensors are huge compared to the rest of the tensors in the model, it should be possible to achieve better quality at a smaller overall model size than a comparable naive quantization. To that end, the default quantization type is kept at high quality, while the FFN UP and FFN GATE tensors are quantized more aggressively, along with the FFN DOWN tensors.
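With recent llama.cpp builds, this kind of mixture can be sketched using per-tensor overrides on `llama-quantize`. The flag names, tensor patterns, and type spellings below are assumptions — verify them against `llama-quantize --help` for your build:

```shell
# Sketch only: an IQ4XS-style mixture (default q8_0, FFN tensors quantized lower).
# Tensor patterns and type names are assumptions; check --help before running.
./llama-quantize \
  --tensor-type ffn_up=iq3_s \
  --tensor-type ffn_gate=iq3_s \
  --tensor-type ffn_down=iq4_xs \
  Qwen3.5-122B-A10B-BF16.gguf Qwen3.5-122B-A10B-IQ4XS.gguf q8_0
```

Any tensor not matched by an override falls back to the type given as the last argument, which is what keeps the non-FFN weights at high quality.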

| Quant | Size | Mixture | PPL | 1-(Mean PPL(Q)/PPL(base)) | KLD |
|---|---|---|---|---|---|
| Q80 | 120.94 GiB (8.51 BPW) | Q80 | 5.733978 ± 0.075548 | -0.0146% | 0.002545 ± 0.000078 |
| Q5KM | 85.22 GiB (6.00 BPW) | Q80 / Q5K / Q5K / Q6K | 5.740017 ± 0.075671 | +0.0907% | 0.003674 ± 0.000078 |
| Q4KM | 71.44 GiB (5.03 BPW) | Q80 / Q4K / Q4K / Q5K | 5.742536 ± 0.075656 | +0.1347% | 0.006429 ± 0.000197 |
| IQ4XS | 56.25 GiB (3.96 BPW) | Q80 / IQ3S / IQ3S / IQ4_XS | 5.799691 ± 0.076499 | +1.1313% | 0.016301 ± 0.000344 |
| IQ3S | 43.35 GiB (3.05 BPW) | Q6K / IQ2S / IQ2S / IQ3_S | 5.928605 ± 0.078470 | +3.3792% | 0.040833 ± 0.000741 |
*(Figures: KLD graph, PPL graph.)*
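The BPW figures in the table are simply total file bits divided by parameter count. A small sketch (assuming roughly 122.1B parameters, inferred from the model name) reproduces them from the sizes above:

```python
GIB = 1024 ** 3  # sizes in the table are GiB (binary gigabytes)

def bits_per_weight(size_gib: float, n_params: float) -> float:
    """Effective bit-width: total file size in bits / parameter count."""
    return size_gib * GIB * 8 / n_params

# ~122.1e9 parameters is an assumption based on the model name.
for name, size_gib in [("Q80", 120.94), ("Q5KM", 85.22), ("IQ3S", 43.35)]:
    print(f"{name}: {bits_per_weight(size_gib, 122.1e9):.2f} BPW")
```

This recovers 8.51, 6.00, and 3.05 BPW for the three rows, matching the table.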

GGUF File List

| Filename | Size | Notes |
|---|---|---|
| mmproj-Qwen3.5-122B-A10B-BF16.gguf | 870 MB | Recommended |
| mmproj-Qwen3.5-122B-A10B-F16.gguf | 866.63 MB | |
| mmproj-Qwen3.5-122B-A10B-F32.gguf | 1.68 GB | |
| mmproj-Qwen3.5-122B-A10B-Q8_0.gguf | 590.53 MB | |