h2oai/h2o-danube3-4b-chat-GGUF

Name: h2oai/h2o-danube3-4b-chat-GGUF
Author: h2oai

High-quality GGUF model

3.6K 📥 Downloads

22 ❤️ Likes

8 📁 GGUF Files

29.78 GB 💾 Total Size

2 years ago 🔄 Last Updated

📋 Model Description

language: en
library_name: transformers
license: apache-2.0
tags: gpt, llm, large language model, h2o-llmstudio
thumbnail: https://h2o.ai/etc.clientlibs/h2o/clientlibs/clientlib-site/resources/images/favicon.ico
pipeline_tag: text-generation
quantized_by: h2oai

h2o-danube3-4b-chat-GGUF

Model creator: H2O.ai
Original model: h2oai/h2o-danube3-4b-chat

Description

This repo contains GGUF-format model files for h2o-danube3-4b-chat, quantized using the llama.cpp framework.

The table below summarizes the quantized versions of h2o-danube3-4b-chat and shows the trade-off between model size, speed, and quality.

Name	Quant method	Model size	MT-Bench AVG	Perplexity	Tokens per second
h2o-danube3-4b-chat-F16.gguf	F16	7.92 GB	6.43	6.17	479
h2o-danube3-4b-chat-Q8_0.gguf	Q8_0	4.21 GB	6.49	6.17	725
h2o-danube3-4b-chat-Q6_K.gguf	Q6_K	3.25 GB	6.37	6.20	791
h2o-danube3-4b-chat-Q5_K_M.gguf	Q5_K_M	2.81 GB	6.25	6.24	927
h2o-danube3-4b-chat-Q4_K_M.gguf	Q4_K_M	2.39 GB	6.31	6.37	967
h2o-danube3-4b-chat-Q3_K_M.gguf	Q3_K_M	1.94 GB	5.87	6.99	1099
h2o-danube3-4b-chat-Q2_K.gguf	Q2_K	1.51 GB	3.71	9.42	1299

Columns in the table are:

Name -- model name and link
Quant method -- quantization method
Model size -- size of the model in gigabytes
MT-Bench AVG -- average MT-Bench benchmark score, on a scale from 1 to 10; higher is better
Perplexity -- perplexity on the WikiText-2 dataset, as reported by the llama.cpp perplexity test; lower is better
Tokens per second -- generation speed in tokens per second, as reported by the llama.cpp perplexity test on a single H100 GPU; higher is better
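One way to read the trade-off in the table is to pick the quantization with the lowest perplexity that still fits a given size budget. The sketch below copies the table's numbers into a list; the `pick_quant` helper and its tie-breaking rule (prefer the smaller file) are illustrative assumptions, not part of the original model card, and quant names are spelled as in the GGUF file names.

```python
# Rows copied from the table above:
# (quant method, model size in GB, MT-Bench AVG, perplexity, tokens per second)
QUANTS = [
    ("F16", 7.92, 6.43, 6.17, 479),
    ("Q8_0", 4.21, 6.49, 6.17, 725),
    ("Q6_K", 3.25, 6.37, 6.20, 791),
    ("Q5_K_M", 2.81, 6.25, 6.24, 927),
    ("Q4_K_M", 2.39, 6.31, 6.37, 967),
    ("Q3_K_M", 1.94, 5.87, 6.99, 1099),
    ("Q2_K", 1.51, 3.71, 9.42, 1299),
]


def pick_quant(max_size_gb):
    """Return the lowest-perplexity quant that fits in max_size_gb, or None.

    Hypothetical helper: ties on perplexity are broken by smaller model size.
    """
    fitting = [q for q in QUANTS if q[1] <= max_size_gb]
    if not fitting:
        return None
    best = min(fitting, key=lambda q: (q[3], q[1]))
    return best[0]


if __name__ == "__main__":
    for budget in (1.0, 3.0, 8.0):
        print(budget, "GB ->", pick_quant(budget))
```

With an 8 GB budget this selects Q8_0 rather than F16, because both report a perplexity of 6.17 and Q8_0 is the smaller file.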

Prompt template

<|prompt|>Why is drinking water so healthy?</s><|answer|>
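As a minimal sketch, the single-turn template above can be produced by wrapping the user message in the `<|prompt|>` and `<|answer|>` markers. The function name `build_prompt` is hypothetical; only the single-turn layout shown above is taken from the model card, so multi-turn formatting is intentionally not covered.

```python
def build_prompt(user_message: str) -> str:
    """Wrap a user message in the single-turn prompt template shown above.

    Hypothetical helper; the model's reply is expected to follow <|answer|>.
    """
    return f"<|prompt|>{user_message}</s><|answer|>"


if __name__ == "__main__":
    print(build_prompt("Why is drinking water so healthy?"))
```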

📂 GGUF File List

📁 Filename	⚙️ Quantization	📦 Size
h2o-danube3-4b-chat-BF16.gguf	FP16	7.38 GB
h2o-danube3-4b-chat-F16.gguf	FP16	7.38 GB
h2o-danube3-4b-chat-Q2_K.gguf	Q2	1.41 GB
h2o-danube3-4b-chat-Q3_K_M.gguf	Q3	1.81 GB
h2o-danube3-4b-chat-Q4_K_M.gguf (recommended)	Q4	2.23 GB
h2o-danube3-4b-chat-Q5_K_M.gguf	Q5	2.62 GB
h2o-danube3-4b-chat-Q6_K.gguf	Q6	3.03 GB
h2o-danube3-4b-chat-Q8_0.gguf	Q8	3.92 GB
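A single file from the list can be fetched programmatically. The sketch below assumes the third-party `huggingface_hub` package is installed and uses its `hf_hub_download` function; the helper names `gguf_filename` and `download_gguf` are hypothetical, and the default quant simply mirrors the file marked as recommended above.

```python
REPO_ID = "h2oai/h2o-danube3-4b-chat-GGUF"


def gguf_filename(quant: str) -> str:
    """Build a file name following the pattern used in the file list."""
    return f"h2o-danube3-4b-chat-{quant}.gguf"


def download_gguf(quant: str = "Q4_K_M") -> str:
    """Download one GGUF file and return its local path.

    Assumes `pip install huggingface_hub`; imported lazily so the
    name helper above works without the package installed.
    """
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))


if __name__ == "__main__":
    # Requires network access; downloads roughly 2.23 GB for the default quant.
    print(download_gguf())
```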

📊 Model Information

🆔 Model ID: h2oai/h2o-danube3-4b-chat-GGUF

📅 Created: 2 years ago

🎯 Difficulty: Intermediate

⚙️ Quantization: FP16, Q2, Q3, Q4, Q5, Q6, Q8

🏷️ Tags

transformers, gguf, gpt, llm, large language model, h2o-llmstudio, text-generation, en, arxiv:2306.05685, license:apache-2.0, endpoints_compatible, region:us, conversational
