---
license: other
license_name: tencent-hunyuan-community-license
license_link: https://huggingface.co/tencent/Hunyuan-0.5B-Instruct/blob/main/LICENSE
language:
- en
- zh
base_model:
- tencent/Hunyuan-0.5B-Instruct
tags:
- GGUF
- quantization
- hunyuan
- instruct
- text-generation-inference
pipeline_tag: text-generation
---

# Hunyuan-0.5B-Instruct-GGUF

## Model Description
This repository contains GGUF quants for tencent/Hunyuan-0.5B-Instruct.
Hunyuan-0.5B is part of Tencent's efficient LLM series, featuring Hybrid Reasoning (fast and slow thinking modes) and a native 256K context window. Even at 0.5B parameters, it inherits robust performance from larger Hunyuan models, making it ideal for edge devices and resource-constrained environments.
## Usage

### llama.cpp

You can run these quants with the llama.cpp CLI:

```shell
./llama-cli -m Hunyuan-0.5B-Instruct_Q4_0.gguf -p "Your prompt here" -n 128
```
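For an OpenAI-compatible HTTP endpoint, llama.cpp's `llama-server` can serve the same quant. A minimal sketch, assuming a recent llama.cpp build and the Q4_0 file in the working directory:

```shell
# Serve the model on localhost:8080; -c sets the context length for the session
# (the model natively supports up to 256K tokens, but larger contexts need more RAM).
./llama-server -m Hunyuan-0.5B-Instruct_Q4_0.gguf -c 4096 --port 8080
```

Once running, any OpenAI-compatible client can point at `http://localhost:8080/v1`.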
## Special Features

- **Thinking Mode**: This model supports "slow-thinking" reasoning. To disable CoT (Chain of Thought), add `/nothink` before your prompt or set `enable_thinking=False` in your chat template.
- **Long Context**: Natively supports a 256K-token context window.
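As a sketch, disabling chain-of-thought from the llama.cpp CLI looks like this (assuming the Q4_0 quant is in the working directory):

```shell
# Prefixing the prompt with /nothink switches off slow-thinking mode,
# so the model answers directly without emitting a reasoning trace.
./llama-cli -m Hunyuan-0.5B-Instruct_Q4_0.gguf -p "/nothink Summarize GGUF in one sentence." -n 64
```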
## GGUF File List

| Filename | Quant | Size |
|---|---|---|
| Hunyuan-0.5B-Instruct.fp16.gguf | FP16 | 1.01 GB |
| Hunyuan-0.5B-Instruct_Q2_K.gguf | Q2 | 247.64 MB |
| Hunyuan-0.5B-Instruct_Q3_K_L.gguf | Q3 | 312.23 MB |
| Hunyuan-0.5B-Instruct_Q3_K_M.gguf | Q3 | 293.42 MB |
| Hunyuan-0.5B-Instruct_Q3_K_S.gguf | Q3 | 272.01 MB |
| Hunyuan-0.5B-Instruct_Q4_0.gguf (Recommended) | Q4 | 324.6 MB |
| Hunyuan-0.5B-Instruct_Q4_K_M.gguf | Q4 | 338.53 MB |
| Hunyuan-0.5B-Instruct_Q4_K_S.gguf | Q4 | 326.42 MB |
| Hunyuan-0.5B-Instruct_Q5_0.gguf | Q5 | 374.1 MB |
| Hunyuan-0.5B-Instruct_Q5_K_M.gguf | Q5 | 381.28 MB |
| Hunyuan-0.5B-Instruct_Q5_K_S.gguf | Q5 | 374.1 MB |
| Hunyuan-0.5B-Instruct_Q6_K.gguf | Q6 | 426.7 MB |
| Hunyuan-0.5B-Instruct_Q8_0.gguf | Q8 | 551.18 MB |
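Individual quants can be fetched with `huggingface-cli` rather than cloning the whole repository. A minimal sketch; `<repo-id>` is a placeholder for this repository's actual id on the Hub:

```shell
# Download only the recommended Q4_0 quant into the current directory.
# Replace <repo-id> with this repository's id (e.g. <user>/Hunyuan-0.5B-Instruct-GGUF).
huggingface-cli download <repo-id> Hunyuan-0.5B-Instruct_Q4_0.gguf --local-dir .
```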