---
license: other
license_name: tencent-hunyuan
license_link: https://github.com/Tencent-Hunyuan/Hunyuan-7B/blob/main/LICENSE.txt
base_model: tencent/Hunyuan-7B-Instruct
base_model_relation: quantized
tags:
  - imatrix
  - hunyuanv1dense
language:
  - zh
  - en
  - ja
  - es
  - pt
  - fr
  - ko
pipeline_tag: text-generation
---

## πŸ“‹ Model Description

Imatrix GGUF quantizations of [tencent/Hunyuan-7B-Instruct](https://huggingface.co/tencent/Hunyuan-7B-Instruct), provided in several bit widths and with different precisions for the output and embedding tensors (see the suffix legend below).

### Output + Embedding

The three-letter suffix on each variant encodes the precision used for the output and embedding tensors:

| Suffix | Output + embedding precision |
|---|---|
| AXL | 2-bit |
| BXL | 3-bit |
| CXL | 4-bit |
| DXL | 5-bit |
| EXL | 6-bit |
| FXL | 8-bit |
| GXL | 16-bit |
| HXL | 32-bit |
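
For scripting against the file list below, the naming convention in this legend can be parsed mechanically. The sketch below covers only that convention; the helper and its name are illustrative, not part of any published tooling.

```python
# Map the variant suffix to the bit width of the output and embedding tensors
# (per the legend above).
SUFFIX_BITS = {
    "AXL": 2, "BXL": 3, "CXL": 4, "DXL": 5,
    "EXL": 6, "FXL": 8, "GXL": 16, "HXL": 32,
}

def parse_variant(filename: str) -> tuple[str, int]:
    """Split e.g. 'Hunyuan-7B-Instruct-Q4_K_M_FXL.gguf' into the base quant
    ('Q4_K_M') and the output/embedding bit width (8)."""
    variant = filename.removesuffix(".gguf").split("Instruct-")[-1]
    quant, _, suffix = variant.rpartition("_")
    return quant, SUFFIX_BITS[suffix]

print(parse_variant("Hunyuan-7B-Instruct-Q4_K_M_FXL.gguf"))  # ('Q4_K_M', 8)
```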

### Master table

| Bits | Variant | Size (GB) | BPW | PPL | PPL error |
|---:|---|---:|---:|---:|---:|
| 1 | IQ1_M_FXL | 2.19 | 2.32 | 1.8967 | 0.01174 |
| 1 | IQ1_M_GXL | 2.68 | 2.85 | 1.8969 | 0.01175 |
| 1 | IQ1_M_HXL | 3.73 | 3.97 | 1.8965 | 0.01174 |
| 2 | Q2_K_FXL | 3.13 | 3.33 | 1.6234 | 0.00922 |
| 2 | Q2_K_GXL | 3.63 | 3.86 | 1.6234 | 0.00922 |
| 2 | Q2_K_HXL | 4.68 | 4.98 | 1.6234 | 0.00922 |
| 3 | Q3_K_M_FXL | 3.92 | 4.17 | 1.5674 | 0.00864 |
| 3 | Q3_K_M_GXL | 4.41 | 4.70 | 1.5674 | 0.00864 |
| 3 | Q3_K_M_HXL | 5.46 | 5.81 | 1.5672 | 0.00864 |
| 4 | Q4_K_M_FXL | 4.75 | 5.06 | 1.5567 | 0.00852 |
| 4 | Q4_K_M_GXL | 5.24 | 5.58 | 1.5570 | 0.00853 |
| 4 | Q4_K_M_HXL | 6.29 | 6.70 | 1.5566 | 0.00852 |
| 5 | Q5_K_M_FXL | 5.50 | 5.85 | 1.5572 | 0.00855 |
| 5 | Q5_K_M_GXL | 5.99 | 6.38 | 1.5574 | 0.00856 |
| 5 | Q5_K_M_HXL | 7.04 | 7.50 | 1.5570 | 0.00855 |
| 6 | Q6_K_FXL | 6.29 | 6.70 | 1.5525 | 0.00848 |
| 6 | Q6_K_GXL | 6.78 | 7.22 | 1.5524 | 0.00848 |
| 6 | Q6_K_HXL | 7.83 | 8.34 | 1.5523 | 0.00848 |
| 8 | Q8_0_GXL | 8.47 | 9.03 | 1.5515 | 0.00847 |
| 8 | Q8_0_HXL | 9.52 | 10.14 | 1.5514 | 0.00847 |
| 16 | BF16 | 15.00 | 16.00 | 1.5523 | 0.00848 |
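
As a sanity check, the BPW column can be reproduced from the size column alone. The sketch below assumes that "GB" in this table means 10^9 bytes and that the BF16 row (15.00 GB at 16.00 BPW) implies roughly 7.5e9 quantized weights; both are inferences from the table, not stated facts.

```python
# Rough bits-per-weight check against the master table.
# ASSUMPTIONS: "GB" = 1e9 bytes, and the BF16 row implies ~7.5e9 weights.
N_WEIGHTS = 15.00e9 * 8 / 16.00  # ~7.5e9, derived from the BF16 row

def bpw(size_gb: float, n_weights: float = N_WEIGHTS) -> float:
    """Bits per weight implied by a file size in decimal gigabytes."""
    return size_gb * 1e9 * 8 / n_weights

for name, size_gb in [("IQ1_M_FXL", 2.19), ("Q4_K_M_FXL", 4.75), ("Q8_0_HXL", 9.52)]:
    print(f"{name}: ~{bpw(size_gb):.2f} BPW")
# Should land close to the table's BPW column (2.32, 5.06, 10.14).
```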

### Variant chooser (prefer FXL first)

| Variant (preferred) | Size (GB) | Quality vs BF16 | Inference speed | Long-context headroom |
|---|---:|---|---|---|
| IQ1_M_FXL | 2.19 | Low | Fastest | Excellent |
| Q2_K_FXL | 3.13 | Fair | Very fast | Excellent |
| Q3_K_M_FXL | 3.92 | Good | Fast | Very good |
| Q4_K_M_FXL | 4.75 | Excellent | Fast | Good |
| Q5_K_M_FXL | 5.50 | Excellent | Medium | Good |
| Q6_K_FXL | 6.29 | Excellent | Medium | OK |
| Q8_0_GXL | 8.47 | Excellent | Slower | Tight |
| BF16 | 15.00 | Reference | Slowest | Very tight |
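
The "long context headroom" column comes down to how much VRAM is left for the KV cache once the weights are resident. A rough way to check a specific combination is sketched below; the layer, head, and head-dimension values are placeholders to be replaced with the real Hunyuan-7B config (from the GGUF metadata or config.json), since they are not stated on this card.

```python
# Rough VRAM budget: model file + KV cache + a little working space.
# PLACEHOLDER architecture values - read the real ones from the model config;
# they are not taken from this card.
N_LAYERS   = 32    # placeholder
N_KV_HEADS = 8     # placeholder (GQA)
HEAD_DIM   = 128   # placeholder
KV_BYTES   = 2     # f16 K/V cache entries

def kv_cache_gib(n_ctx: int) -> float:
    """Approximate K+V cache size for a given context length, in GiB."""
    per_token = 2 * N_LAYERS * N_KV_HEADS * HEAD_DIM * KV_BYTES  # K and V
    return n_ctx * per_token / 1024**3

def fits(model_file_gb: float, n_ctx: int, vram_gb: float, overhead_gb: float = 1.0) -> bool:
    """Very rough go/no-go check for a quant + context length on a given GPU."""
    return model_file_gb + kv_cache_gib(n_ctx) + overhead_gb <= vram_gb

# e.g. Q4_K_M_FXL (~4.43 GB file) with a 32k context on a 12 GB card:
print(round(kv_cache_gib(32_768), 2), fits(4.43, 32_768, 12.0))
```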

### Quick picks by GPU VRAM

| GPU VRAM | Pick | Why |
|---|---|---|
| 16 GB | Q6_K_FXL or Q4_K_M_FXL | Matches BF16 quality within the measured PPL error, with plenty of room for long context or batching. |
| 12 GB | Q4_K_M_FXL | Best balance on 12 GB: strong quality with good headroom. |
| 8 GB | Q4_K_M_FXL (default), or Q3_K_M_FXL for longer context | Q4 runs well; drop to Q3 if you need more room for the KV cache. |
| 6 GB | Q3_K_M_FXL | Usually fits with comfortable headroom. Use Q4_K_M_FXL only for short contexts. |
| 4 GB | Q2_K_FXL first, IQ1_M_FXL as a last resort | Fits strict limits; accept the quality hit as needed. |

Notes:

- Preference order for size at equal quality: FXL first, then GXL, then HXL.
- If you need more context headroom, drop one quant level rather than pushing up to a heavier file.
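
A minimal way to act on these picks, assuming llama-cpp-python as the runtime (any llama.cpp-based server takes the same file and similar settings): point it at the downloaded GGUF and size `n_ctx` to the headroom you actually have. The path and settings below are examples, not fixed requirements.

```python
# Minimal llama-cpp-python sketch; assumes `pip install llama-cpp-python`
# and a locally downloaded GGUF (where supported, the chat template is read
# from the GGUF metadata).
from llama_cpp import Llama

llm = Llama(
    model_path="Hunyuan-7B-Instruct-Q4_K_M_FXL.gguf",  # pick from the tables above
    n_ctx=8192,        # raise this if your VRAM headroom allows
    n_gpu_layers=-1,   # offload all layers to the GPU if they fit
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give a one-line summary of GGUF."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```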

## πŸ“‚ GGUF File List

| πŸ“ Filename | Quant | πŸ“¦ Size |
|---|---|---:|
| Hunyuan-7B-Instruct-IQ1_M_FXL.gguf | IQ1_M | 2.04 GB |
| Hunyuan-7B-Instruct-IQ1_M_GXL.gguf | IQ1_M | 2.49 GB |
| Hunyuan-7B-Instruct-IQ1_M_HXL.gguf | IQ1_M | 3.47 GB |
| Hunyuan-7B-Instruct-Q2_K_FXL.gguf | Q2_K | 2.92 GB |
| Hunyuan-7B-Instruct-Q2_K_GXL.gguf | Q2_K | 3.38 GB |
| Hunyuan-7B-Instruct-Q2_K_HXL.gguf | Q2_K | 4.35 GB |
| Hunyuan-7B-Instruct-Q3_K_M_FXL.gguf | Q3_K_M | 3.65 GB |
| Hunyuan-7B-Instruct-Q3_K_M_GXL.gguf | Q3_K_M | 4.11 GB |
| Hunyuan-7B-Instruct-Q3_K_M_HXL.gguf | Q3_K_M | 5.09 GB |
| Hunyuan-7B-Instruct-Q4_K_M_FXL.gguf (recommended) | Q4_K_M | 4.43 GB |
| Hunyuan-7B-Instruct-Q4_K_M_GXL.gguf | Q4_K_M | 4.88 GB |
| Hunyuan-7B-Instruct-Q4_K_M_HXL.gguf | Q4_K_M | 5.86 GB |
| Hunyuan-7B-Instruct-Q5_K_M_FXL.gguf | Q5_K_M | 5.12 GB |
| Hunyuan-7B-Instruct-Q5_K_M_GXL.gguf | Q5_K_M | 5.58 GB |
| Hunyuan-7B-Instruct-Q5_K_M_HXL.gguf | Q5_K_M | 6.56 GB |
| Hunyuan-7B-Instruct-Q6_K_FXL.gguf | Q6_K | 5.86 GB |
| Hunyuan-7B-Instruct-Q6_K_GXL.gguf | Q6_K | 6.32 GB |
| Hunyuan-7B-Instruct-Q6_K_HXL.gguf | Q6_K | 7.30 GB |
| Hunyuan-7B-Instruct-Q8_0_GXL.gguf | Q8_0 | 7.89 GB |
| Hunyuan-7B-Instruct-Q8_0_HXL.gguf | Q8_0 | 8.87 GB |
| Hunyuan-7B-Instruct-bf16.gguf | BF16 | 13.99 GB |
| imatrix.gguf | n/a | 4.78 MB |

All files are stored with Git LFS.
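
To fetch a single variant rather than cloning the whole repository, `hf_hub_download` from `huggingface_hub` is enough. The repo id below is a placeholder; substitute this repository's actual id.

```python
# Download one quant from the Hub; assumes `pip install huggingface_hub`.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="<this-repo-owner>/Hunyuan-7B-Instruct-GGUF",  # placeholder id
    filename="Hunyuan-7B-Instruct-Q4_K_M_FXL.gguf",        # recommended default
)
print(path)  # local path to pass as model_path
```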