Model Description
license: apache-2.0
language:
- en
- audio-text-to-text
- chat
- audio
- GGUF
Qwen2-Audio

We're bringing Qwen2-Audio to run locally on edge devices with Nexa-SDK, offering various GGUF quantization options.
Qwen2-Audio is a SOTA small-scale multimodal model (AudioLM) that handles audio and text inputs, so you can have voice interactions without a separate ASR module. Qwen2-Audio supports English, Chinese, and major European languages, and provides voice chat and audio analysis capabilities for local use cases like:
- Speaker identification and response
- Speech translation and transcription
- Mixed audio and noise detection
- Music and sound analysis
Demo
See more demos in our blogs
How to Run Locally On Device
The steps below show how to run Qwen2-Audio locally on your device.
Step 1: Install Nexa-SDK (local on-device inference framework)
Nexa-SDK is an open-source, local on-device inference framework that supports text generation, image generation, vision-language models (VLM), audio-language models, speech-to-text (ASR), and text-to-speech (TTS). It can be installed as a Python package or with an executable installer.
Step 2: Run the following command in your terminal
nexa run qwen2audio
This runs the default q4KM quantization.
In the terminal:
- Drag and drop your audio file into the terminal (or enter the file path on Linux)
- Add a text prompt to guide the analysis, or leave it empty for direct voice input
Or, to use the local UI (Streamlit):
nexa run qwen2audio -st
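If you want to start the model from a script rather than typing the command by hand, the two invocations above can be wrapped in a few lines of Python. This is only a minimal sketch: it uses nothing beyond the `nexa run qwen2audio` command and the `-st` flag shown in this card, and the helper names `build_nexa_command` and `launch` are hypothetical, not part of Nexa-SDK.

```python
import shutil
import subprocess


def build_nexa_command(use_streamlit: bool = False) -> list[str]:
    """Build the CLI invocation shown in this card as an argument list."""
    command = ["nexa", "run", "qwen2audio"]
    if use_streamlit:
        # -st is the flag this card gives for the local Streamlit UI.
        command.append("-st")
    return command


def launch(use_streamlit: bool = False) -> int:
    """Run the command in the foreground and return its exit code.

    Hypothetical helper: it only checks that a `nexa` executable is on PATH.
    """
    if shutil.which("nexa") is None:
        raise FileNotFoundError("nexa executable not found; install Nexa-SDK first")
    # The CLI is interactive, so stdin/stdout are inherited from this process.
    return subprocess.run(build_nexa_command(use_streamlit)).returncode
```

Calling `launch()` starts the terminal session, and `launch(use_streamlit=True)` starts the local UI instead.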
Choose Quantizations for your device
Run different quantization versions here and check RAM requirements in our list. The default q4KM version requires 4.2GB of RAM.
Use Cases
Voice Chat
- Answer daily questions
- Offer suggestions
- Speaker identification and response
- Speech translation
- Detecting background noise and responding accordingly
Audio Analysis
- Information Extraction
- Audio summary
- Speech Transcription and Expansion
- Mixed audio and noise detection
- Music and sound analysis
Performance Benchmark

Results show that Qwen2-Audio significantly outperforms both the previous SOTA models and Qwen-Audio across all tasks.

Blog
Learn more in our blogs
Join Community
Discord | X(Twitter)
GGUF File List
| Filename | Quantization | Size |
|---|---|---|
| Qwen2-7B-LLM-F16.gguf | FP16 | 14.45 GB |
| Qwen2-7B-LLM-Q2_K.gguf | Q2 | 2.91 GB |
| Qwen2-7B-LLM-Q3_K_L.gguf | Q3 | 3.95 GB |
| Qwen2-7B-LLM-Q3_K_M.gguf | Q3 | 3.67 GB |
| Qwen2-7B-LLM-Q3_K_S.gguf | Q3 | 3.34 GB |
| Qwen2-7B-LLM-Q4_0.gguf (Recommended) | Q4 | 4.22 GB |
| Qwen2-7B-LLM-Q4_1.gguf | Q4 | 4.64 GB |
| Qwen2-7B-LLM-Q4_K_M.gguf | Q4 | 4.46 GB |
| Qwen2-7B-LLM-Q4_K_S.gguf | Q4 | 4.25 GB |
| Qwen2-7B-LLM-Q5_0.gguf | Q5 | 5.05 GB |
| Qwen2-7B-LLM-Q5_1.gguf | Q5 | 5.47 GB |
| Qwen2-7B-LLM-Q5_K_M.gguf | Q5 | 5.17 GB |
| Qwen2-7B-LLM-Q5_K_S.gguf | Q5 | 5.05 GB |
| Qwen2-7B-LLM-Q6_K.gguf | Q6 | 5.93 GB |
| Qwen2-7B-LLM-Q8_0.gguf | Q8 | 7.68 GB |
| qwen2-audio-projector.gguf | | 1.21 GB |