πŸ“‹ Model Description


datasets:
  • IlyaGusev/saiga_scored
  • IlyaGusev/saiga_preferences
language:
  • ru
inference: false
license: apache-2.0

Llama.cpp-compatible GGUF versions of the original 12B model, saiga_nemo_12b.

Download one of the versions, for example saiga_nemo_12b.Q4_K_M.gguf.

wget https://huggingface.co/IlyaGusev/saiga_nemo_12b_gguf/resolve/main/saiga_nemo_12b.Q4_K_M.gguf
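
Alternatively, if you prefer the Hugging Face Hub client to wget, a minimal Python sketch is below; the repo id and filename are taken from the URL above, and the huggingface-hub package must be installed (pip install huggingface-hub):

# Minimal sketch: fetch the chosen quantization via huggingface_hub instead of wget.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="IlyaGusev/saiga_nemo_12b_gguf",   # repo id from the wget URL above
    filename="saiga_nemo_12b.Q4_K_M.gguf",     # pick any file from the list below
)
print(model_path)  # path of the downloaded file in the local Hugging Face cache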

Download interact_llama3_llamacpp.py

wget https://raw.githubusercontent.com/IlyaGusev/rulm/master/self_instruct/src/interact_llama3_llamacpp.py

How to run:

pip install llama-cpp-python fire

python3 interact_llama3_llamacpp.py saiga_nemo_12b.Q4_K_M.gguf

System requirements:

  • 15 GB RAM for Q8_0; smaller quantizations need less
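
If you would rather call llama-cpp-python directly instead of the interactive script, a minimal sketch follows; the context size, sampling parameters, and prompts are illustrative choices, not the script's defaults:

# Minimal sketch: chat with the GGUF model through llama-cpp-python directly.
from llama_cpp import Llama

llm = Llama(
    model_path="saiga_nemo_12b.Q4_K_M.gguf",  # any quantization from the list below
    n_ctx=8192,                               # illustrative context window
    verbose=False,
)

response = llm.create_chat_completion(
    messages=[
        # Illustrative system prompt; replace with your own instructions.
        {"role": "system", "content": "You are Saiga, a helpful Russian-speaking assistant."},
        # "Tell me briefly about yourself."
        {"role": "user", "content": "РасскаТи ΠΊΠΎΡ€ΠΎΡ‚ΠΊΠΎ ΠΎ сСбС."},
    ],
    temperature=0.3,
    max_tokens=512,
)
print(response["choices"][0]["message"]["content"])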

πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
saiga_nemo_12b.BF16.gguf
LFS FP16
22.82 GB Download
saiga_nemo_12b.Q2_K.gguf
LFS Q2
4.46 GB Download
saiga_nemo_12b.Q3_K_M.gguf
LFS Q3
5.67 GB Download
saiga_nemo_12b.Q3_K_S.gguf
LFS Q3
5.15 GB Download
saiga_nemo_12b.Q4_0.gguf
Recommended LFS Q4
6.59 GB Download
saiga_nemo_12b.Q4_K_M.gguf
LFS Q4
6.96 GB Download
saiga_nemo_12b.Q4_K_S.gguf
LFS Q4
6.63 GB Download
saiga_nemo_12b.Q5_K_M.gguf
LFS Q5
8.13 GB Download
saiga_nemo_12b.Q5_K_S.gguf
LFS Q5
7.93 GB Download
saiga_nemo_12b.Q6_K.gguf
LFS Q6
9.37 GB Download
saiga_nemo_12b.Q8_0.gguf
LFS Q8
12.13 GB Download