Model Description


pipeline_tag: text-generation
base_model:
  - moonshotai/Kimi-Linear-48B-A3B-Instruct

This is an MXFP4_MOE quantization of the model Kimi-Linear-48B-A3B-Instruct.

The mainline standard is to use MXFP4 for the MoE tensors and Q8 for the rest.
So I created two new variants where the remaining tensors are kept in BF16 or F16 instead of Q8.
The order of preference is BF16, then F16.
On some architectures BF16 will be slower, but it is the highest quality: those tensors are copied over from the original model unquantized.
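If you want to verify how the tensors were quantized, a minimal sketch using the `gguf` Python package that ships with llama.cpp (install with `pip install gguf`) is shown below; the file path is a placeholder for whichever variant you downloaded:

```python
# Count how many tensors use each quantization type in the GGUF file.
from collections import Counter

from gguf import GGUFReader

# Placeholder path: point this at your local copy of the GGUF file.
reader = GGUFReader("Kimi-Linear-48B-A3B-Instruct-MXFP4_MOE_BF16.gguf")

counts = Counter(t.tensor_type.name for t in reader.tensors)
for quant_type, n in counts.most_common():
    print(f"{quant_type}: {n} tensors")

# Expected: the MoE expert tensors report MXFP4, the remaining tensors BF16
# (or F16 in the F16 variant).
```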

GGUF File List

| 📁 Filename | 📦 Size | Notes |
| --- | --- | --- |
| Kimi-Linear-48B-A3B-Instruct-MXFP4_MOE_BF16.gguf | 27.05 GB | Recommended |
| Kimi-Linear-48B-A3B-Instruct-MXFP4_MOE_F16.gguf | 27.05 GB | |
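A quick way to fetch and try a variant is with `huggingface_hub` and `llama-cpp-python`. This is a sketch only: the `repo_id` below is a placeholder for this repository, and it assumes your llama.cpp build is recent enough to support the Kimi-Linear architecture and MXFP4.

```python
# Download the recommended BF16 variant and run a short generation.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="your-username/Kimi-Linear-48B-A3B-Instruct-MXFP4_MOE",  # placeholder repo id
    filename="Kimi-Linear-48B-A3B-Instruct-MXFP4_MOE_BF16.gguf",
)

llm = Llama(model_path=gguf_path, n_ctx=4096, n_gpu_layers=-1)
out = llm("Explain MXFP4 quantization in one sentence.", max_tokens=128)
print(out["choices"][0]["text"])
```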