Model Description
base_model:
- mistralai/Voxtral-Mini-4B-Realtime-2602
Voxtral-Mini-4B-Realtime-2602 (GGUF)
This repository contains GGUF weights for Voxtral Realtime 4B, a high-performance speech-to-text (STT) model optimized for low-latency, real-time inference.
These weights are converted from the original mistralai/Voxtral-Mini-4B-Realtime-2602 model.
Voxtral is designed to process streaming audio with minimal delay, making it ideal for live transcription, voice assistants, and interactive applications.
Model Details
- Model Type: Speech Recognition / Transcription
- Parameters: ~4 Billion
- Architecture: Hybrid Encoder-Decoder with Log-Mel preprocessing
- Format: GGUF (optimized for ggml)
- Sample Rate: 16,000 Hz (Mono)
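Input audio must match the sample rate and channel count listed above. As a quick sanity check, the sketch below (an illustration using Python's standard `wave` module, not part of this repository) verifies that a WAV file is 16 kHz mono before you feed it to the model:

```python
import wave

def is_voxtral_ready(path: str) -> bool:
    """Return True if the WAV file is 16,000 Hz mono, as Voxtral expects."""
    with wave.open(path, "rb") as wf:
        return wf.getframerate() == 16000 and wf.getnchannels() == 1
```

Files that fail this check should be resampled and downmixed (e.g., with any standard audio tool) before transcription.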
Inference with voxtral.cpp
For the fastest inference performance on CPU and GPU, use the voxtral.cpp (https://github.com/andrijdavid/voxtral.cpp) repository. It provides a lightweight C++ implementation based on ggml.
Getting Started
1. Clone the repository and build
git clone https://github.com/andrijdavid/voxtral.cpp
cd voxtral.cpp
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
2. Download a quantized model
Use the provided script to download your preferred quantization (e.g., Q4_0):
./tools/download_model.sh Q4_0
3. Run Transcription
Prepare a 16kHz mono WAV file and run inference:
./build/voxtral \
--model models/voxtral/Q4_0.gguf \
--audio input.wav \
--threads 8
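If you don't have a 16 kHz mono WAV on hand, the sketch below (an illustration using only Python's standard library, not part of voxtral.cpp) writes a short sine-tone file in the required format, which is handy for smoke-testing the binary:

```python
import math
import struct
import wave

def write_test_wav(path: str, seconds: float = 1.0) -> None:
    """Write a 440 Hz sine tone as 16 kHz, mono, 16-bit PCM WAV."""
    rate = 16000  # sample rate Voxtral expects
    n = int(rate * seconds)
    with wave.open(path, "wb") as wf:
        wf.setnchannels(1)   # mono
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(rate)
        frames = b"".join(
            struct.pack("<h", int(0.3 * 32767 * math.sin(2 * math.pi * 440 * i / rate)))
            for i in range(n)
        )
        wf.writeframes(frames)
```

The resulting file can be passed directly as `--audio` to confirm the pipeline runs end to end (a tone will of course produce no meaningful transcript).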
For more advanced usage, including streaming examples and conversion scripts, please visit the voxtral.cpp GitHub repository (https://github.com/andrijdavid/voxtral.cpp).
GGUF File List
| Filename | Quant | Size |
|---|---|---|
| Q2_K.gguf | Q2 | 1.37 GB |
| Q3_K.gguf | Q3 | 1.79 GB |
| Q4_0.gguf (recommended) | Q4 | 2.33 GB |
| Q4_1.gguf | Q4 | 2.59 GB |
| Q4_K.gguf | Q4 | 2.34 GB |
| Q4_K_M.gguf | Q4 | 2.7 GB |
| Q5_0.gguf | Q5 | 2.85 GB |
| Q5_1.gguf | Q5 | 3.1 GB |
| Q5_K.gguf | Q5 | 2.85 GB |
| Q6_K.gguf | Q6 | 3.4 GB |
| Q8_0.gguf | Q8 | 4.39 GB |