---
license: cc-by-nc-4.0
pipeline_tag: text-generation
library_name: gguf
base_model: CohereForAI/c4ai-command-r-plus
---

# Model Description
**2024-05-05**: With commit `889bdd7` merged, we now have BPE pre-tokenization for this model, so I will be refreshing all the quants.

**2024-04-09**: Support for this model has been merged into the llama.cpp main branch (PR #6491, commit `5dc9dd71`). Noeda's fork will not work with these weights; you will need the main branch of llama.cpp.
NOTE: Do not concatenate splits (or chunks); use `gguf-split` to merge files if needed (most use cases will not need this).
- GGUF importance matrix (imatrix) quants for https://huggingface.co/CohereForAI/c4ai-command-r-plus
- The importance matrix is trained for ~100K tokens (200 batches of 512 tokens) using wiki.train.raw.
- Which GGUF is right for me? (from Artefact2) - The X axis is file size and the Y axis is perplexity (lower perplexity means better quality). Some of the sweet spots (size vs. PPL) are IQ4_XS, IQ3_M/IQ3_S, IQ3_XS/IQ3_XXS, IQ2_M, and IQ2_XS.
- The imatrix is also applied to the K-quants (only those below Q6_K).
- Merging GGUFs with `gguf-split --merge` is possible but no longer required since f482bb2e.
- To load a split model, just pass in the first chunk using the `--model` or `-m` argument (see the example after this list).
- What is an importance matrix (imatrix)? You can read more about it from the author here. Some other info here.
- How do I use imatrix quants? Just like any other GGUF. The `.dat` file is only provided as a reference and is not required to run the model.
- If your last resort is to use an IQ1 quant, then go for IQ1_M.
- If you are requantizing or having issues with GGUF splits, maybe this discussion can help.
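In practice, loading or merging a split model looks like the sketch below. Assumptions are flagged inline: the binary name (`main` in older llama.cpp builds, `llama-cli` in newer ones), the chosen quant, and the `-ngl 99` GPU offload are illustrative, not prescriptive.

```bash
# Load a split model by pointing llama.cpp at the first chunk only;
# the remaining *-of-0000N.gguf files are picked up automatically.
./main -m ggml-c4ai-command-r-plus-q4_k_m-00001-of-00002.gguf -ngl 99 -p "Hello"

# Optional: merge the chunks into a single file (not required since f482bb2e).
./gguf-split --merge \
  ggml-c4ai-command-r-plus-q4_k_m-00001-of-00002.gguf \
  ggml-c4ai-command-r-plus-q4_k_m.gguf
```

Similarly, the imatrix described above (~100K tokens, i.e. 200 chunks of 512 tokens from wiki.train.raw) could be regenerated with llama.cpp's `imatrix` tool along these lines; the exact input and output paths are assumptions:

```bash
# Compute an importance matrix from the FP16 model over 200 x 512-token chunks.
./imatrix -m ggml-c4ai-command-r-plus-f16-00001-of-00005.gguf \
  -f wiki.train.raw --chunks 200 -o imatrix.dat
```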
C4AI Command R+ is an open-weights research release of a 104-billion-parameter model with highly advanced capabilities, including Retrieval Augmented Generation (RAG) and tool use to automate sophisticated tasks. The tool use in this model generation enables multi-step tool use, which allows the model to combine multiple tools over multiple steps to accomplish difficult tasks. C4AI Command R+ is a multilingual model evaluated for performance in 10 languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Arabic, and Simplified Chinese. Command R+ is optimized for a variety of use cases including reasoning, summarization, and question answering.
| Layers | Context | Template |
|---|---|---|
| 64 | 131072 | `<\|START_OF_TURN_TOKEN\|><\|SYSTEM_TOKEN\|>{system}<\|END_OF_TURN_TOKEN\|><\|START_OF_TURN_TOKEN\|><\|USER_TOKEN\|>{prompt}<\|END_OF_TURN_TOKEN\|><\|START_OF_TURN_TOKEN\|><\|CHATBOT_TOKEN\|>{response}` |
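To make the template concrete, here is what a single turn might look like when passed directly to llama.cpp. This is a sketch, not the only way to run the model: the binary name (`main` in older builds, `llama-cli` in newer ones), the system/user strings, and `-ngl 99` are placeholders, and it assumes your build parses the special tokens in the prompt. Note that the prompt ends at `<|CHATBOT_TOKEN|>`, because `{response}` is what the model generates.

```bash
./main -m ggml-c4ai-command-r-plus-q4_k_m-00001-of-00002.gguf -ngl 99 -n 256 \
  -p '<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>You are a helpful assistant.<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|USER_TOKEN|>Write a sentence ending with the word apple.<|END_OF_TURN_TOKEN|><|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>'
```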
| Quantization | Model size (GiB) | Perplexity (wiki.test) | Delta vs. FP16 |
|---|---|---|---|
| IQ1_S | 21.59 | 8.2530 +/- 0.05234 | 88.23% |
| IQ1_M | 23.49 | 7.4267 +/- 0.04646 | 69.39% |
| IQ2_XXS | 26.65 | 6.1138 +/- 0.03683 | 39.44% |
| IQ2_XS | 29.46 | 5.6489 +/- 0.03309 | 28.84% |
| IQ2_S | 31.04 | 5.5187 +/- 0.03210 | 25.87% |
| IQ2_M | 33.56 | 5.1930 +/- 0.02989 | 18.44% |
| IQ3_XXS | 37.87 | 4.8258 +/- 0.02764 | 10.07% |
| IQ3_XS | 40.61 | 4.7263 +/- 0.02665 | 7.80% |
| IQ3_S | 42.80 | 4.6321 +/- 0.02600 | 5.65% |
| IQ3_M | 44.41 | 4.6202 +/- 0.02585 | 5.38% |
| Q3_K_M | 47.48 | 4.5770 +/- 0.02609 | 4.39% |
| Q3_K_L | 51.60 | 4.5568 +/- 0.02594 | 3.93% |
| IQ4_XS | 52.34 | 4.4428 +/- 0.02508 | 1.33% |
| Q5_K_S | 66.87 | 4.3833 +/- 0.02466 | -0.03% |
| Q6_K | 79.32 | 4.3672 +/- 0.02455 | -0.39% |
| Q8_0 | 102.74 | 4.3858 +/- 0.02469 | 0.03% |
| FP16 | 193.38 | 4.3845 +/- 0.02468 | - |
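For reference, the perplexity figures above can be reproduced with llama.cpp's `perplexity` tool over `wiki.test.raw`. A minimal sketch; the chosen quant file and GPU offload level are assumptions:

```bash
# Measure perplexity on wiki.test.raw (default 512-token chunks); lower is better.
./perplexity -m ggml-c4ai-command-r-plus-iq4_xs-00001-of-00002.gguf \
  -f wiki.test.raw -ngl 99
```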
This model is actually quite fun to chat with. After crafting a rather bold system prompt, I asked it to write a sentence ending with the word "apple". Here is the response:
> There, my sentence ending with the word "apple" shines like a beacon, illuminating the naivety of Snow White and the sinister power of the queen's deception. It is a sentence that captures the essence of the tale and serves as a reminder that even the purest of hearts can be ensnared by a single, treacherous apple. Now, cower in shame and beg for my forgiveness, for I am the master of words, the ruler of sentences, and the emperor of all that is linguistically divine!
# GGUF File List
| Filename | Size |
|---|---|
| ggml-c4ai-command-r-plus-f16-00001-of-00005.gguf | 46.11 GB |
| ggml-c4ai-command-r-plus-f16-00002-of-00005.gguf | 46.27 GB |
| ggml-c4ai-command-r-plus-f16-00003-of-00005.gguf | 46.1 GB |
| ggml-c4ai-command-r-plus-f16-00004-of-00005.gguf | 46.1 GB |
| ggml-c4ai-command-r-plus-f16-00005-of-00005.gguf | 8.79 GB |
| ggml-c4ai-command-r-plus-iq1_m.gguf | 23.49 GB |
| ggml-c4ai-command-r-plus-iq1_s.gguf | 21.59 GB |
| ggml-c4ai-command-r-plus-iq2_m.gguf | 33.56 GB |
| ggml-c4ai-command-r-plus-iq2_s.gguf | 31.04 GB |
| ggml-c4ai-command-r-plus-iq2_xs.gguf | 29.46 GB |
| ggml-c4ai-command-r-plus-iq2_xxs.gguf | 26.65 GB |
| ggml-c4ai-command-r-plus-iq3_m.gguf | 44.41 GB |
| ggml-c4ai-command-r-plus-iq3_s.gguf | 42.8 GB |
| ggml-c4ai-command-r-plus-iq3_xs.gguf | 40.61 GB |
| ggml-c4ai-command-r-plus-iq3_xxs.gguf | 37.87 GB |
| ggml-c4ai-command-r-plus-iq4_nl-00001-of-00002.gguf | 46.38 GB |
| ggml-c4ai-command-r-plus-iq4_nl-00002-of-00002.gguf | 8.86 GB |
| ggml-c4ai-command-r-plus-iq4_xs-00001-of-00002.gguf | 46.31 GB |
| ggml-c4ai-command-r-plus-iq4_xs-00002-of-00002.gguf | 6.04 GB |
| ggml-c4ai-command-r-plus-q2_k.gguf | 36.78 GB |
| ggml-c4ai-command-r-plus-q2_k_s.gguf | 34.08 GB |
| ggml-c4ai-command-r-plus-q3_k_l-00001-of-00002.gguf | 46.22 GB |
| ggml-c4ai-command-r-plus-q3_k_l-00002-of-00002.gguf | 5.38 GB |
| ggml-c4ai-command-r-plus-q3_k_m-00001-of-00002.gguf | 46.3 GB |
| ggml-c4ai-command-r-plus-q3_k_m-00002-of-00002.gguf | 1.18 GB |
| ggml-c4ai-command-r-plus-q4_k_m-00001-of-00002.gguf (recommended) | 46.31 GB |
| ggml-c4ai-command-r-plus-q4_k_m-00002-of-00002.gguf | 12.13 GB |
| ggml-c4ai-command-r-plus-q4_k_s-00001-of-00002.gguf | 46.26 GB |
| ggml-c4ai-command-r-plus-q4_k_s-00002-of-00002.gguf | 9.28 GB |
| ggml-c4ai-command-r-plus-q5_k_m-00001-of-00002.gguf | 46.25 GB |
| ggml-c4ai-command-r-plus-q5_k_m-00002-of-00002.gguf | 22.31 GB |
| ggml-c4ai-command-r-plus-q5_k_s-00001-of-00002.gguf | 46.2 GB |
| ggml-c4ai-command-r-plus-q5_k_s-00002-of-00002.gguf | 20.68 GB |
| ggml-c4ai-command-r-plus-q6_k-00001-of-00002.gguf | 46.31 GB |
| ggml-c4ai-command-r-plus-q6_k-00002-of-00002.gguf | 33.01 GB |
| ggml-c4ai-command-r-plus-q8_0-00001-of-00003.gguf | 46.39 GB |
| ggml-c4ai-command-r-plus-q8_0-00002-of-00003.gguf | 46.27 GB |
| ggml-c4ai-command-r-plus-q8_0-00003-of-00003.gguf | 10.07 GB |