πŸ“‹ Model Description


license: cc-by-nc-4.0
language:
  - ar
  - en

C4AI Command R7B Arabic - Quantized Versions in GGUF Format

This repository contains quantized versions of the C4AI Command R7B Arabic model, provided in GGUF format. These quantized versions are designed to reduce model size and improve inference speed while maintaining reasonable performance.
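As a rough illustration of why quantization shrinks the files: each weight is stored in fewer bits. The Python sketch below estimates file sizes from bits-per-weight; the parameter count is inferred from the ~14.96 GB F16 file, and the bits-per-weight figures are nominal approximations (real GGUF files also carry metadata and keep some tensors at higher precision), so these are ballpark numbers only.

```python
GIB = 1024 ** 3

# Infer the parameter count from the F16 file (~14.96 GiB, 2 bytes per weight).
f16_bytes = 14.96 * GIB
n_params = f16_bytes / 2  # roughly 8 billion parameters

# Nominal bits per weight for a few schemes (approximate figures, for
# illustration only; actual GGUF sizes differ somewhat).
bits_per_weight = {"Q2_K": 2.6, "Q4_0": 4.5, "Q8_0": 8.5, "F16": 16.0}

for name, bpw in bits_per_weight.items():
    est_gib = n_params * bpw / 8 / GIB
    print(f"{name}: ~{est_gib:.1f} GiB")
```

This is why the Q4 files below weigh in at roughly a third of the F16 file.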

Available Quantized Versions

The following GGUF quantized versions are available:

  • Q2_K
  • Q3_K_M
  • Q4_0
  • Q4_K_M
  • Q5_K_S
  • Q5_K_M
  • Q6_K
  • Q8_0
  • F16 (half precision, unquantized)

Original Repository

The original model was developed by Cohere and Cohere For AI. You can find it here:

https://huggingface.co/CohereForAI/c4ai-command-r7b-arabic-02-2025

License

These quantized versions follow the same licensing terms as the original model: CC-BY-NC-4.0, with an additional requirement to comply with C4AI’s Acceptable Use Policy. By using these models, you agree to abide by these terms.

Available Models

The GGUF files available in this repository are listed below:

| Quantization | File Name |
| --- | --- |
| Q2_K | c4ai-command-r7b-arabic-02-2025-Q2_K.gguf |
| Q3_K_M | c4ai-command-r7b-arabic-02-2025-Q3_K_M.gguf |
| Q4_0 | c4ai-command-r7b-arabic-02-2025-Q4_0.gguf |
| Q4_K_M | c4ai-command-r7b-arabic-02-2025-Q4_K_M.gguf |
| Q5_K_S | c4ai-command-r7b-arabic-02-2025-Q5_K_S.gguf |
| Q5_K_M | c4ai-command-r7b-arabic-02-2025-Q5_K_M.gguf |
| Q6_K | c4ai-command-r7b-arabic-02-2025-Q6_K.gguf |
| Q8_0 | c4ai-command-r7b-arabic-02-2025-q8_0.gguf |
| F16 | c4ai-command-r7b-arabic-02-2025-f16.gguf |

Installation

You can run these GGUF models with any of the following tools:

1. llama-cpp-python (Python Library)

Install with:

pip install llama-cpp-python

2. llama.cpp (C++ Library)

If you prefer a non-Python workflow, you can use the llama.cpp C++ implementation.

3. LM Studio (GUI Interface)

LM Studio provides an easy-to-use graphical interface for running GGUF models locally. You can download it from:

https://lmstudio.ai

4. GPT4All (Cross-Platform GUI & CLI)

GPT4All supports running GGUF models across various operating systems. You can install it from:

https://gpt4all.io

5. Ollama (Local Model Runner)

Ollama is a lightweight tool for running LLMs locally. Download it from:

https://ollama.com

Downloading the Models

You can download the GGUF files from this repository using the huggingface_hub library:

from huggingface_hub import hf_hub_download

hf_hub_download(
    repo_id="eltay89/c4ai-command-r7b-arabic-02-2025-gguf",
    filename="c4ai-command-r7b-arabic-02-2025-Q4_K_M.gguf",
    local_dir="."
)

Alternatively, download the files directly from the repository’s page on Hugging Face.

Usage

Using llama-cpp-python in Python

from llama_cpp import Llama

# Load the model (replace with the path to your downloaded GGUF file)
llm = Llama(model_path="path/to/c4ai-command-r7b-arabic-02-2025-Q4_K_M.gguf")

# Generate text
output = llm("Ω…Ψ±Ψ­Ψ¨Ψ§ΨŒ ΩƒΩŠΩ Ψ­Ψ§Ω„ΩƒΨŸ", max_tokens=100, temperature=0.3)
print(output['choices'][0]['text'])

Replace "path/to/c4ai-command-r7b-arabic-02-2025-Q4_K_M.gguf" with the actual path to your downloaded GGUF file.

The prompt "Ω…Ψ±Ψ­Ψ¨Ψ§ΨŒ ΩƒΩŠΩ Ψ­Ψ§Ω„ΩƒΨŸ" translates to "Hello, how are you?" in Arabic.

Chat Templates

LM Studio Chat Template

To get clean, conversational outputs in LM Studio, use this chat template:

Before System: <|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>  
After System: <|END_OF_TURN_TOKEN|>  
Before User: <|START_OF_TURN_TOKEN|><|USER_TOKEN|>  
After User: <|END_OF_TURN_TOKEN|>  
Before Assistant: <|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_RESPONSE|>  
After Assistant: <|END_OF_TURN_TOKEN|>  
Additional Stop Strings: <|END_RESPONSE|>, <|END_OF_TURN_TOKEN|>, <|START_THINKING|>, <|END_THINKING|>, <|START_ACTION|>, <|END_ACTION|>, <|START_TOOL_RESULT|>, <|END_TOOL_RESULT|>
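The same turn structure can be assembled by hand when driving the model through a raw-completion API. A minimal Python sketch follows; the `format_prompt` helper is illustrative, not part of any library:

```python
def format_prompt(system: str, user: str) -> str:
    """Wrap one system turn and one user turn in the Command R7B special
    tokens, leaving the prompt open at the start of the assistant's response."""
    return (
        f"<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>{system}<|END_OF_TURN_TOKEN|>"
        f"<|START_OF_TURN_TOKEN|><|USER_TOKEN|>{user}<|END_OF_TURN_TOKEN|>"
        f"<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_RESPONSE|>"
    )

prompt = format_prompt("You are a helpful assistant.", "Ω…Ψ±Ψ­Ψ¨Ψ§ΨŒ ΩƒΩŠΩ Ψ­Ψ§Ω„ΩƒΨŸ")
print(prompt)
```

Generation should then be stopped on <|END_RESPONSE|> or <|END_OF_TURN_TOKEN|>, matching the stop strings used by the chat templates in this section.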

Ollama Chat Template

For Ollama, create a Modelfile with this content:

FROM ./c4ai-command-r7b-arabic-02-2025-Q4_K_M.gguf

TEMPLATE """
<|START_OF_TURN_TOKEN|><|SYSTEM_TOKEN|>{{ .System }}<|END_OF_TURN_TOKEN|>
<|START_OF_TURN_TOKEN|><|USER_TOKEN|>{{ .Prompt }}<|END_OF_TURN_TOKEN|>
<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|><|START_RESPONSE|>{{ .Response }}<|END_OF_TURN_TOKEN|>
"""

PARAMETER stop "<|END_RESPONSE|>"
PARAMETER stop "<|ENDOFTURN_TOKEN|>"
PARAMETER stop "<|START_THINKING|>"
PARAMETER stop "<|END_THINKING|>"
PARAMETER stop "<|START_ACTION|>"
PARAMETER stop "<|END_ACTION|>"
PARAMETER stop "<|START_TOOL_RESULT|>"
PARAMETER stop "<|END_TOOL_RESULT|>"

Run:

ollama create c4ai-command-r7b-arabic -f Modelfile
ollama run c4ai-command-r7b-arabic

Contact

For questions or issues, please refer to the original repository or contact Cohere For AI at [email protected].

πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
c4ai-command-r7b-arabic-02-2025-Q2_K.gguf
LFS Q2
3.2 GB Download
c4ai-command-r7b-arabic-02-2025-Q3_K_M.gguf
LFS Q3
3.93 GB Download
c4ai-command-r7b-arabic-02-2025-Q4_0.gguf
Recommended LFS Q4
4.47 GB Download
c4ai-command-r7b-arabic-02-2025-Q4_K_M.gguf
LFS Q4
4.71 GB Download
c4ai-command-r7b-arabic-02-2025-Q5_K_M.gguf
LFS Q5
5.41 GB Download
c4ai-command-r7b-arabic-02-2025-Q5_K_S.gguf
LFS Q5
5.28 GB Download
c4ai-command-r7b-arabic-02-2025-Q6_K.gguf
LFS Q6
6.14 GB Download
c4ai-command-r7b-arabic-02-2025-f16.gguf
LFS FP16
14.96 GB Download
c4ai-command-r7b-arabic-02-2025-q8_0.gguf
LFS Q8
7.95 GB Download