---
license: mit
base_model: MiniMaxAI/MiniMax-M2.5
tags:
- gguf
- moe
- minimax
- llama.cpp
- applesilicon
- reasoning
- conversational
---

# MiniMax-M2.5-GGUF (230B MoE)

## Model Description
High-precision GGUF quants of the MiniMax-M2.5 (230B parameters) Mixture of Experts model. These versions are specifically optimized for local inference on high-RAM setups, particularly Apple Silicon (M3 Max/Ultra).
## Perplexity Validation (WikiText-2)

- Final PPL: 8.2213 +/- 0.09
- Context: 4096 tokens, 32 chunks
- Outcome: The Q3KL quantization maintains high logical coherence while boosting speed to 28.7 t/s, with minimal quality degradation for a ~20 GB size reduction versus Q4KM.
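The figures above can be reproduced with llama.cpp's `llama-perplexity` tool; a minimal sketch, assuming the Q3KL file and a local WikiText-2 raw test split (the paths shown are placeholders, not part of this release):

```shell
# Perplexity run matching this card's settings: 4096-token context, 32 chunks.
# Adjust -m and -f to wherever the model and dataset live on your machine.
./llama-perplexity \
  -m minimax-m2.5-Q3KL.gguf \
  -f wikitext-2-raw/wiki.test.raw \
  -c 4096 \
  --chunks 32
```

Expect the reported PPL to land near the value above; small run-to-run differences are normal across hardware and builds.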
## Available Quants

| File Name | Method | Size | Use Case |
|---|---|---|---|
| minimax-m2.5-Q4KM.gguf | Q4KM | 138 GB | Highest logic preservation. Requires >128 GB RAM or SSD swap. |
| minimax-m2.5-Q3KL.gguf | Q3KL | ~110 GB | Sweet spot for 128 GB Macs. Runs natively in RAM at ~28 t/s on an M3 Max. |
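As a quick sanity check on the table above, the effective bits per weight of each quant can be derived from its file size and the ~230B parameter count given on this card; a minimal sketch:

```python
def bits_per_weight(file_size_gb: float, n_params_b: float) -> float:
    """Effective bits per weight: file bytes * 8 / total parameter count."""
    return file_size_gb * 1e9 * 8 / (n_params_b * 1e9)

# Sizes from the table above; 230B total parameters.
for name, size_gb in [("Q4KM", 138), ("Q3KL", 110)]:
    print(f"{name}: {bits_per_weight(size_gb, 230):.2f} bits/weight")
# → Q4KM: 4.80 bits/weight
# → Q3KL: 3.83 bits/weight
```

These effective figures sit slightly above the nominal bit widths of the quant methods because GGUF files also carry higher-precision tensors (embeddings, norms) and metadata.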
## Model Details
- Architecture: MiniMax-M2 (Mixture of Experts) with 256 experts (8 active per token).
- Parameters: ~230B total.
- Quantization Process: Generated directly from a full F16 GGUF master (457 GB) rather than via automated conversion scripts, minimizing error accumulation during K-quantization.
- Context Window: Up to 196k tokens (Native support).
- Chat Template: Includes the official Jinja chat template for proper handling of interleaved reasoning tags, separating the model's reasoning from its final response.
## Usage
Requires llama.cpp build 8022 or higher.
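A typical interactive invocation might look like the sketch below; the context size and flag choices are assumptions to adapt to your setup, not part of this release:

```shell
# Interactive chat with the Q3KL quant on Apple Silicon.
# --jinja applies the chat template embedded in the GGUF (keeps reasoning
# separate from the final reply); -ngl 99 offloads all layers to Metal.
# -c 32768 is a conservative context; raise it toward 196k if RAM allows.
./llama-cli -m minimax-m2.5-Q3KL.gguf --jinja -c 32768 -ngl 99 --color -i
```

For serving an OpenAI-compatible endpoint instead, `llama-server` accepts the same `-m`, `--jinja`, `-c`, and `-ngl` flags.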