---
license: other
license_name: tencent-hunyuan-community-license
license_link: https://huggingface.co/tencent/Hunyuan-0.5B-Instruct/blob/main/LICENSE
language:
  - en
  - zh
base_model:
  - tencent/Hunyuan-0.5B-Instruct
pipeline_tag: text-generation
tags:
  - GGUF
  - quantization
  - hunyuan
  - instruct
  - text-generation-inference
  - text-generation
library_name: gguf
---

# Hunyuan-0.5B-Instruct-GGUF

This repository contains GGUF quants for tencent/Hunyuan-0.5B-Instruct.

Hunyuan-0.5B is part of Tencent's efficient LLM series, featuring Hybrid Reasoning (fast and slow thinking modes) and a native 256K context window. Even at 0.5B parameters, it inherits robust performance from larger Hunyuan models, making it ideal for edge devices and resource-constrained environments.

## Usage

### llama.cpp

You can run these quants with the llama.cpp CLI. Pass the specific file you downloaded to `-m` (here the recommended Q4_0 quant):

```shell
./llama-cli -m Hunyuan-0.5B-Instruct_Q4_0.gguf -p "Your prompt here" -n 128
```

## Special Features

- **Thinking Mode**: This model supports "slow-thinking" reasoning. To disable CoT (Chain of Thought), add `/nothink` before your prompt or set `enable_thinking=False` when applying the chat template.
- **Long Context**: Natively supports a 256K-token context window.
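As an illustration, the `/nothink` toggle described above can be applied with a small prompt helper. This is a hypothetical sketch, not part of the model's tooling; in practice the chat template's thinking flag handles this for you.

```python
def build_prompt(user_msg: str, thinking: bool = True) -> str:
    """Return the prompt, prefixing /nothink when CoT should be disabled.

    Hypothetical helper illustrating the model card's instruction to
    place /nothink before the prompt.
    """
    return user_msg if thinking else f"/nothink {user_msg}"

# Fast (no-CoT) mode:
print(build_prompt("What is the capital of France?", thinking=False))
```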

## GGUF File List

| 📁 Filename | Quant | 📦 Size | Notes |
|---|---|---|---|
| Hunyuan-0.5B-Instruct.fp16.gguf | FP16 | 1.01 GB | |
| Hunyuan-0.5B-Instruct_Q2_K.gguf | Q2_K | 247.64 MB | |
| Hunyuan-0.5B-Instruct_Q3_K_L.gguf | Q3_K_L | 312.23 MB | |
| Hunyuan-0.5B-Instruct_Q3_K_M.gguf | Q3_K_M | 293.42 MB | |
| Hunyuan-0.5B-Instruct_Q3_K_S.gguf | Q3_K_S | 272.01 MB | |
| Hunyuan-0.5B-Instruct_Q4_0.gguf | Q4_0 | 324.6 MB | Recommended |
| Hunyuan-0.5B-Instruct_Q4_K_M.gguf | Q4_K_M | 338.53 MB | |
| Hunyuan-0.5B-Instruct_Q4_K_S.gguf | Q4_K_S | 326.42 MB | |
| Hunyuan-0.5B-Instruct_Q5_0.gguf | Q5_0 | 374.1 MB | |
| Hunyuan-0.5B-Instruct_Q5_K_M.gguf | Q5_K_M | 381.28 MB | |
| Hunyuan-0.5B-Instruct_Q5_K_S.gguf | Q5_K_S | 374.1 MB | |
| Hunyuan-0.5B-Instruct_Q6_K.gguf | Q6_K | 426.7 MB | |
| Hunyuan-0.5B-Instruct_Q8_0.gguf | Q8_0 | 551.18 MB | |
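When choosing a quant, the file sizes above give a rough sense of the disk savings relative to FP16. A quick sketch (sizes copied from the list, approximating 1 GB as 1000 MB):

```python
# Approximate on-disk size ratio of a few quants vs. the FP16 file
# (1.01 GB ~ 1010 MB), using sizes from the file list above.
FP16_MB = 1010.0
sizes_mb = {
    "Q2_K": 247.64,
    "Q4_0": 324.6,
    "Q8_0": 551.18,
}
ratios = {name: round(mb / FP16_MB, 2) for name, mb in sizes_mb.items()}
print(ratios)  # Q4_0 is roughly a third of the FP16 size
```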