---
license: apache-2.0
tags:
  - gguf
  - qwen
  - qwen3
  - qwen3-coder
  - qwen3-coder-30B
  - qwen3-coder-30B-gguf
  - llama.cpp
  - quantized
  - text-generation
  - reasoning
  - agent
  - multilingual
base_model: Qwen/Qwen3-Coder-30B-A3B-Instruct
author: geoffmunn
pipeline_tag: text-generation
language:
  - en
  - zh
  - es
  - fr
  - de
  - ru
  - ar
  - ja
  - ko
  - hi
---

# Qwen3-Coder-30B-A3B-Instruct-f16-GGUF

## πŸ“‹ Model Description

This is a GGUF-quantized version of the Qwen/Qwen3-Coder-30B-A3B-Instruct language model.

Converted for use with llama.cpp, LM Studio, OpenWebUI, GPT4All, and more.

## πŸ’‘ Key Features

Qwen3-Coder-30B-A3B-Instruct is a Mixture-of-Experts coding model (~30B total parameters, ~3B active per token, hence "A3B") tuned for code generation, agentic tool use, and multilingual instruction following.

## Available Quantizations (from f16)

| Level | Quality | Speed | Size | Recommendation |
|-------|---------|-------|------|----------------|
| Q2_K | Minimal | ⚑ Fast | 11.30 GB | Only on severely memory-constrained systems. |
| Q3_K_S | Low-Medium | ⚑ Fast | 13.30 GB | Minimal viability; avoid unless space-limited. |
| Q3_K_M | Low-Medium | ⚑ Fast | 14.70 GB | Acceptable for basic interaction. |
| Q4_K_S | Practical | ⚑ Fast | 17.50 GB | Good balance for mobile/embedded platforms. |
| Q4_K_M | Practical | ⚑ Fast | 18.60 GB | Best overall choice for most users. |
| Q5_K_S | Max Reasoning | 🐒 Medium | 21.10 GB | Slight quality gain; good for testing. |
| Q5_K_M | Max Reasoning | 🐒 Medium | 21.70 GB | Best quality available. Recommended. |
| Q6_K | Near-FP16 | 🐌 Slow | 25.10 GB | Diminishing returns. Only if RAM allows. |
| Q8_0 | Lossless* | 🐌 Slow | 32.50 GB | Maximum fidelity. Ideal for archival. |

## πŸ’‘ Recommendations by Use Case

- πŸ’» Standard Laptop (i5/M1 Mac): Q5_K_M (optimal quality)
- 🧠 Reasoning, Coding, Math: Q5_K_M or Q6_K
- πŸ” RAG, Retrieval, Precision Tasks: Q6_K or Q8_0
- πŸ€– Agent & Tool Integration: Q5_K_M
- πŸ› οΈ Development & Testing: Test from Q4_K_M up to Q8_0

## Usage

Load this model using:

- OpenWebUI – self-hosted AI interface with RAG & tools
- LM Studio – desktop app with GPU support
- GPT4All – private, offline AI chatbot
- Or directly via llama.cpp

Each quantized model includes its own README.md and shares a common MODELFILE.
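If you prefer scripting instead of a GUI, a minimal sketch using llama-cpp-python (an assumption; any GGUF-capable loader works, and the filename below is the recommended quant from the file list):

```python
# Hedged sketch: load a downloaded quant with llama-cpp-python (assumed
# installed via `pip install llama-cpp-python`). Guarded so it only runs
# the model when the file is actually present locally.
import os

MODEL = "Qwen3-Coder-30B-A3B-Instruct-f16:Q4_K_M.gguf"

if os.path.exists(MODEL):
    from llama_cpp import Llama

    llm = Llama(model_path=MODEL, n_ctx=4096)
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Write a Python hello world."}]
    )
    print(out["choices"][0]["message"]["content"])
else:
    print(f"download {MODEL} first")
```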

## Author

πŸ‘€ Geoff Munn (@geoffmunn)
πŸ”— Hugging Face Profile

## Disclaimer

This is a community conversion for local inference. Not affiliated with Alibaba Cloud or the Qwen team.

## πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
Qwen3-Coder-30B-A3B-Instruct-f16-imatrix-4697-coder.gguf
LFS FP16
116.38 MB Download
Qwen3-Coder-30B-A3B-Instruct-f16-imatrix-4697-generic.gguf
LFS FP16
116.38 MB Download
Qwen3-Coder-30B-A3B-Instruct-f16-imatrix:Q3_K_HIFI.gguf
LFS Q3
19.05 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16-imatrix:Q3_K_M.gguf
LFS Q3
17.28 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16-imatrix:Q3_K_S.gguf
LFS Q3
16.26 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q2_K.gguf
LFS Q2
10.49 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q3_K_HIFI.gguf
LFS Q3
15.69 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q3_K_M.gguf
LFS Q3
13.7 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q3_K_S.gguf
LFS Q3
12.38 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q4_K_HIFI.gguf
LFS Q4
19.05 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q4_K_M.gguf
Recommended LFS Q4
17.28 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q4_K_S.gguf
LFS Q4
16.26 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q5_K_M.gguf
LFS Q5
20.23 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q5_K_S.gguf
LFS Q5
19.63 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q6_K.gguf
LFS Q6
23.37 GB Download
Qwen3-Coder-30B-A3B-Instruct-f16:Q8_0.gguf
LFS Q8
30.25 GB Download
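After downloading, you can sanity-check that a file really is GGUF by reading its header: the format starts with the 4-byte magic `GGUF` followed by a little-endian uint32 version. A minimal sketch (the synthetic file below is only for demonstration):

```python
import os
import struct
import tempfile

def read_gguf_header(path: str) -> tuple[str, int]:
    """Read the 4-byte magic and uint32 version from a GGUF file."""
    with open(path, "rb") as f:
        magic = f.read(4)
        if magic != b"GGUF":
            raise ValueError(f"not a GGUF file: magic={magic!r}")
        (version,) = struct.unpack("<I", f.read(4))
    return magic.decode("ascii"), version

# Demonstrate with a synthetic header (current GGUF files use version 3):
with tempfile.NamedTemporaryFile(delete=False, suffix=".gguf") as tmp:
    tmp.write(b"GGUF" + struct.pack("<I", 3))
    path = tmp.name
print(read_gguf_header(path))  # -> ('GGUF', 3)
os.unlink(path)
```

A truncated or interrupted download will usually fail this check immediately, before you spend time loading a 17+ GB file.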