Model Description
base_model:
- mistralai/Voxtral-Mini-4B-Realtime-2602
Voxtral-Mini-4B-Realtime-2602 (GGUF)
This repository contains GGUF weights for Voxtral Realtime 4B, a high-performance speech-to-text (STT) model optimized for low-latency, real-time inference.
These weights are converted from the original mistralai/Voxtral-Mini-4B-Realtime-2602 model.
Voxtral is designed to process streaming audio with minimal delay, making it ideal for live transcription, voice assistants, and interactive applications.
Model Details
- Model Type: Speech Recognition / Transcription
- Parameters: ~4 Billion
- Architecture: Hybrid Encoder-Decoder with Log-Mel preprocessing
- Format: GGUF (optimized for ggml)
- Sample Rate: 16,000 Hz (Mono)
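Input audio must match the sample rate and channel count listed above. As a quick sanity check, the sketch below (an illustration using Python's standard `wave` module, not part of this repository) verifies that a WAV file is 16 kHz mono before you feed it to the model:

```python
import wave

def is_voxtral_ready(path: str) -> bool:
    """Return True if the WAV file is 16,000 Hz mono, as Voxtral expects."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate() == 16000 and wf.getnchannels() == 1
```

Files that fail this check should be resampled and downmixed (e.g., with any standard audio tool) before transcription.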
Inference with voxtral.cpp
For the fastest inference performance on CPU and GPU, use the voxtral.cpp (https://github.com/andrijdavid/voxtral.cpp) repository. It provides a lightweight C++ implementation based on ggml.
Getting Started
1. Clone the repository and build
git clone https://github.com/andrijdavid/voxtral.cpp
cd voxtral.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
2. Download a quantized model
Use the provided script to download your preferred quantization (e.g., Q4_0):
./tools/download_model.sh Q4_0
3. Run Transcription
Prepare a 16kHz mono WAV file and run inference:
./build/voxtral \
--model models/voxtral/Q4_0.gguf \
--audio input.wav \
--threads 8
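If you don't have a 16 kHz mono WAV on hand, the sketch below (an illustration using only Python's standard library, not part of voxtral.cpp) writes a short sine-tone file in the required format, which is handy for smoke-testing the binary:

```python
import math
import struct
import wave

def write_test_wav(path: str, seconds: float = 1.0) -> None:
    """Write a 440 Hz sine tone as 16 kHz, mono, 16-bit PCM WAV."""
    rate = 16000  # sample rate Voxtral expects
    n = int(rate * seconds)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(rate)
        frames = b"".join(
            struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * 440 * i / rate)))
            for i in range(n)
        )
        wf.writeframes(frames)
```

The resulting file can be passed directly as `--audio` to confirm the pipeline runs end to end (a tone will of course produce no meaningful transcript).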
For more advanced usage, including streaming examples and conversion scripts, please visit the voxtral.cpp GitHub repository (https://github.com/andrijdavid/voxtral.cpp).
GGUF File List
| Filename | Quant | Size |
|---|---|---|
| Q2_K.gguf | Q2 | 1.37 GB |
| Q3_K.gguf | Q3 | 1.79 GB |
| Q4_0.gguf (recommended) | Q4 | 2.33 GB |
| Q4_1.gguf | Q4 | 2.59 GB |
| Q4_K.gguf | Q4 | 2.34 GB |
| Q4_K_M.gguf | Q4 | 2.7 GB |
| Q5_0.gguf | Q5 | 2.85 GB |
| Q5_1.gguf | Q5 | 3.1 GB |
| Q5_K.gguf | Q5 | 2.85 GB |
| Q6_K.gguf | Q6 | 3.4 GB |
| Q8_0.gguf | Q8 | 4.39 GB |