---
license: apache-2.0
language:
- en
- fr
- es
- de
- it
- pt
- ru
- ja
- ko
- zh
- ar
tags:
- mistral
- mistral3
- vision
- llama.cpp
- quantized
- heretic
- reasoning
- conversational
base_model:
- mistralai/Ministral-3-14B-Reasoning-2512
- coder3101/Ministral-3-14B-Reasoning-2512-heretic
---

# Ministral-3-14B-Reasoning-2512-heretic-GGUF

GGUF quantizations of coder3101/Ministral-3-14B-Reasoning-2512-heretic for use with llama.cpp and compatible tools.

## Model Description

This is a fine-tuned version of Mistral's Ministral-3-14B-Reasoning-2512 vision-language model. It supports:

- Text generation with reasoning capabilities (uses `[THINK]` tokens)
- Vision/image understanding (requires the mmproj file)
- Tool/function calling
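
Because the model interleaves its reasoning with the final reply, downstream code usually wants to separate the two. A minimal sketch in Python, assuming the completion uses the `[THINK]...[/THINK]` markers described in the chat-template section (plain string parsing, no tokenizer required):

```python
# Split a raw completion into (reasoning, answer) using the
# [THINK]...[/THINK] markers the model emits before its final reply.
def split_reasoning(completion: str) -> tuple[str, str]:
    start, end = "[THINK]", "[/THINK]"
    i = completion.find(start)
    j = completion.find(end)
    if i == -1 or j == -1:
        # No reasoning block found; treat the whole output as the answer.
        return "", completion.strip()
    reasoning = completion[i + len(start):j].strip()
    answer = completion[j + len(end):].strip()
    return reasoning, answer

raw = ("[THINK]The user is asking for a simple arithmetic calculation. "
       "2+2=4.[/THINK]The answer is 4.")
thinking, answer = split_reasoning(raw)
```

This is only a convenience for scripting against raw completions; the llama-server chat endpoint may already separate reasoning depending on version and flags.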

## Available Quantizations

| Quantization | Size | Description |
|---|---|---|
| BF16 | 26 GB | Full precision (bfloat16) |
| Q8_0 | 14 GB | 8-bit quantization |
| Q5_K_M | 9.0 GB | 5-bit K-quant (medium) |
| Q4_K_M | 7.7 GB | 4-bit K-quant (medium), **recommended** |

## Vision Support

For vision/image understanding, you need to download the mmproj (multimodal projector) file:

- `Ministral-3-14B-Reasoning-2512-heretic-mmproj-bf16.gguf` (847 MB)

## Chat Template

The model includes a custom chat template with reasoning support. The format uses:

- `[SYSTEMPROMPT]...[/SYSTEMPROMPT]` - system message
- `[INST]...[/INST]` - user messages
- `[THINK]...[/THINK]` - the model's reasoning/thinking process
- `[IMG]` - image placeholder for vision inputs
- `[TOOLCALLS]` and `[TOOLRESULTS]` - for function calling

Example conversation:

```
[SYSTEMPROMPT]You are a helpful assistant.[/SYSTEMPROMPT][INST]What is 2+2?[/INST][THINK]The user is asking for a simple arithmetic calculation. 2+2=4.[/THINK]The answer is 4.
```
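
For clients that drive a raw completion endpoint rather than the chat API, the template above can be rendered by hand. A rough sketch, covering only plain text system/user/assistant turns (no images or tool calls); verify against the chat template embedded in the GGUF before relying on it:

```python
# Render a messages list into the Ministral-style prompt format shown
# above. Only plain text turns are handled; [IMG] and the tool-calling
# tokens are intentionally omitted from this sketch.
def render_prompt(messages: list[dict]) -> str:
    parts = []
    for msg in messages:
        role, content = msg["role"], msg["content"]
        if role == "system":
            parts.append(f"[SYSTEMPROMPT]{content}[/SYSTEMPROMPT]")
        elif role == "user":
            parts.append(f"[INST]{content}[/INST]")
        elif role == "assistant":
            parts.append(content)
    return "".join(parts)

prompt = render_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
])
```

In practice, letting llama-server apply the model's embedded template via the chat-completions endpoint is safer than hand-rolling the format.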

## Usage

### Text-only (CLI)

```bash
llama-cli -m Ministral-3-14B-Reasoning-2512-heretic-Q4_K_M.gguf \
  -p "[INST]What is the capital of France?[/INST]" \
  -n 256
```

### With Vision Support

```bash
llama-mtmd-cli \
  -m Ministral-3-14B-Reasoning-2512-heretic-Q4_K_M.gguf \
  --mmproj Ministral-3-14B-Reasoning-2512-heretic-mmproj-bf16.gguf \
  -p "Describe this image in detail." \
  --image /path/to/image.jpg
```

### With llama-server (OpenAI-compatible API)

```bash
llama-server \
  -m Ministral-3-14B-Reasoning-2512-heretic-Q4_K_M.gguf \
  --mmproj Ministral-3-14B-Reasoning-2512-heretic-mmproj-bf16.gguf \
  --port 8080
```

Then query the API:

```bash
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ministral", "messages": [{"role": "user", "content": "What is 2+2?"}]}'
```
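
The same endpoint can be called from Python with only the standard library. A minimal sketch, assuming the host and port from the llama-server command above; the `"ministral"` model name is a placeholder, since llama-server serves whichever model it has loaded:

```python
import json
import urllib.request

def build_payload(prompt: str) -> dict:
    # Same JSON body as the curl example above.
    return {
        "model": "ministral",
        "messages": [{"role": "user", "content": prompt}],
    }

def chat(prompt: str, base_url: str = "http://localhost:8080") -> str:
    req = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style response shape: first choice's message content.
    return body["choices"][0]["message"]["content"]
```

Any OpenAI-compatible client library pointed at `http://localhost:8080/v1` should work the same way.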

## Original Model

This GGUF is based on coder3101/Ministral-3-14B-Reasoning-2512-heretic, a fine-tune of mistralai/Ministral-3-14B-Reasoning-2512.

## License

Apache 2.0

## GGUF File List

| Filename | Size |
|---|---|
| Ministral-3-14B-Reasoning-2512-heretic-BF16.gguf | 25.17 GB |
| Ministral-3-14B-Reasoning-2512-heretic-Q4_K_M.gguf (recommended) | 7.67 GB |
| Ministral-3-14B-Reasoning-2512-heretic-Q5_K_M.gguf | 8.96 GB |
| Ministral-3-14B-Reasoning-2512-heretic-Q8_0.gguf | 13.37 GB |
| Ministral-3-14B-Reasoning-2512-heretic-mmproj-bf16.gguf | 846.53 MB |