πŸ“‹ Model Description


language:
  • multilingual
  • pl
  • en
  • sq
  • bel
  • bs
  • bg
  • hr
  • cs
  • da
  • et
  • fi
  • fr
  • el
  • es
  • is
  • lt
  • nl
  • de
  • no
  • pt
  • ru
  • ro
  • sr
  • hbs
  • sv
  • sk
  • sl
  • tr
  • uk
  • hu
  • it
  • lv
license: apache-2.0
library_name: transformers
tags:
  • finetuned
  • gguf
inference: false
pipeline_tag: text-generation
base_model: speakleash/Bielik-11B-v3.0-Instruct



Bielik-11B-v3.0-Instruct-GGUF

This repo contains GGUF format model files for SpeakLeash's Bielik-11B-v3.0-Instruct.

DISCLAIMER: Be aware that quantized models may show reduced response quality and possible hallucinations!

Available quantization formats:

  • q4_k_m: Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q4_K
  • q5_k_m: Uses Q6_K for half of the attention.wv and feed_forward.w2 tensors, else Q5_K
  • q6_k: Uses Q8_K for all tensors
  • q8_0: Almost indistinguishable from float16. High resource use and slow. Not recommended for most users.
  • 16bit: Converted to FP16 and BF16 GGUF formats.
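As a rough rule of thumb, a GGUF file's size is the parameter count times the average bits per weight. The sketch below estimates this for an ~11B-parameter model; the parameter count and bits-per-weight figures are approximate assumptions (typical averages for llama.cpp k-quants), not exact on-disk values.

```python
# Rough GGUF size estimate: parameters * average bits per weight / 8 bytes.
# Assumption: ~11.2B parameters; bits-per-weight values are approximate
# averages for llama.cpp k-quants, not exact on-disk sizes.
PARAMS = 11.2e9

BITS_PER_WEIGHT = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.69,
    "Q6_K": 6.59,
    "Q8_0": 8.5,
    "F16": 16.0,
}

def estimated_gib(quant: str, params: float = PARAMS) -> float:
    """Approximate file size in GiB for a given quantization."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 2**30

for quant in BITS_PER_WEIGHT:
    print(f"{quant:>6}: ~{estimated_gib(quant):.1f} GiB")
```

The estimates line up roughly with the file sizes listed at the bottom of this card (e.g. Q4_K_M ends up a little over 6 GiB), which is a quick way to check whether a given quantization will fit in your RAM/VRAM.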

Bielik 11B v3.0 is on Ollama!
https://ollama.com/SpeakLeash/bielik-11b-v3.0-instruct

Ollama Modelfile

The GGUF files can be used with Ollama. To do this, import the model using the configuration defined in a Modelfile. For example, for the model Bielik-11B-v3.0-Instruct.Q4_K_M.gguf (use the full path to the model file), the Modelfile looks like:
FROM ./Bielik-11B-v3.0-Instruct.Q4_K_M.gguf

TEMPLATE """<s>{{ if .System }}<|start_header_id|>system<|end_header_id|>

{{ .System }}<|eot_id|>{{ end }}{{ if .Prompt }}<|start_header_id|>user<|end_header_id|>

{{ .Prompt }}<|eot_id|>{{ end }}<|start_header_id|>assistant<|end_header_id|>

{{ .Response }}<|eot_id|>"""

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"

Remember to set a low temperature for experimental quantizations (1-3 bits):

PARAMETER temperature 0.1
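If you drive the model through llama.cpp directly rather than Ollama, the same chat format can be rendered by hand. A minimal single-turn sketch, assuming Llama-3-style header tokens (`<|start_header_id|>`, `<|end_header_id|>`, `<|eot_id|>`); in practice, prefer the chat template shipped with the model:

```python
def build_prompt(user: str, system: str = "") -> str:
    """Render a single-turn Bielik chat prompt by hand (sketch only).

    Assumes Llama-3-style header tokens; prefer the model's own
    chat template when your client supports it.
    """
    parts = ["<s>"]
    if system:
        parts.append(f"<|start_header_id|>system<|end_header_id|>\n\n{system}<|eot_id|>")
    parts.append(f"<|start_header_id|>user<|end_header_id|>\n\n{user}<|eot_id|>")
    # Open the assistant turn; generation continues from here.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

print(build_prompt("Kim byΕ‚ MikoΕ‚aj Kopernik?", system="Odpowiadaj po polsku."))
```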

Ollama Modelfile with tools (already on Ollama):

FROM ./Bielik-11B-v3.0-Instruct.Q8_0.gguf

TEMPLATE """{{- /* SYSTEM + TOOLS INJECTION */ -}}
{{- if or .System .Tools -}}
<|im_start|>system
{{- if .System }}
{{ .System }}
{{- end }}

{{- if .Tools }}
You are provided with tool signatures that you can use to assist with the user's query.
You do not have to use a tool if you can respond adequately without it.
Do not make assumptions about tool arguments. If required parameters are missing, ask a clarification question.

If you decide to invoke a tool, you MUST respond with ONLY valid JSON in the following format:
{"name":"<tool-name>","arguments":{...}}

Below is a list of tools you can invoke (JSON):
{{ .Tools }}
{{- end }}
<|im_end|>
{{- end }}

{{- /* MESSAGES */ -}}
{{- range $i, $_ := .Messages }}
<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{- end }}

{{- /* GENERATION PROMPT */ -}}
<|im_start|>assistant"""

PARAMETER stop "<|start_header_id|>"
PARAMETER stop "<|end_header_id|>"
PARAMETER stop "<|eot_id|>"

PARAMETER temperature 0.1
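On the client side, a reply produced under this template is either plain text or a single JSON object in the `{"name":...,"arguments":{...}}` format described above. A minimal detection sketch (the tool name and arguments below are hypothetical examples, not tools shipped with the model):

```python
import json

def parse_tool_call(reply: str):
    """Return (name, arguments) if the reply is a tool call, else None."""
    try:
        obj = json.loads(reply.strip())
    except json.JSONDecodeError:
        return None  # ordinary text reply
    if isinstance(obj, dict) and "name" in obj and "arguments" in obj:
        return obj["name"], obj["arguments"]
    return None

# Hypothetical tool call emitted by the model:
print(parse_tool_call('{"name":"get_weather","arguments":{"city":"Warsaw"}}'))
# An ordinary text reply parses to None:
print(parse_tool_call("Ordinary text reply."))
```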

Model description:

About GGUF

GGUF is a format introduced by the llama.cpp team on August 21st, 2023.

Here is an incomplete list of clients and libraries that are known to support GGUF:

  • llama.cpp. The source project for GGUF. Offers a CLI and a server option.
  • text-generation-webui, the most widely used web UI, with many features and powerful extensions. Supports GPU acceleration.
  • KoboldCpp, a fully featured web UI, with GPU accel across all platforms and GPU architectures. Especially good for story telling.
  • GPT4All, a free and open source local running GUI, supporting Windows, Linux and macOS with full GPU accel.
  • LM Studio, an easy-to-use and powerful local GUI for Windows, macOS (Silicon) and Linux, with GPU acceleration
  • LoLLMS Web UI, a great web UI with many interesting and unique features, including a full model library for easy model selection.
  • Faraday.dev, an attractive and easy to use character-based chat GUI for Windows and macOS (both Silicon and Intel), with GPU acceleration.
  • llama-cpp-python, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server.
  • candle, a Rust ML framework with a focus on performance, including GPU support, and ease of use.
  • ctransformers, a Python library with GPU accel, LangChain support, and OpenAI-compatible API server. Note ctransformers has not been updated in a long time and does not support many recent models.

Responsible for model quantization

  • Remigiusz Kinas (SpeakLeash) - team leadership, conceptualization, calibration data preparation, process creation, and quantized model delivery.
  • Kuba SoΕ‚tys (SpeakLeash) - prepared the Ollama template with tools.
  • Szymon BaczyΕ„ski (SpeakLeash) - team assistant.

Contact Us

If you have any questions or suggestions, please use the discussion tab. If you want to contact us directly, join our Discord SpeakLeash.

πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
Bielik-11B-v3.0-Instruct.Q4_K_M.gguf
Recommended LFS Q4
6.26 GB Download
Bielik-11B-v3.0-Instruct.Q5_K_M.gguf
LFS Q5
7.36 GB Download
Bielik-11B-v3.0-Instruct.Q6_K.gguf
LFS Q6
8.53 GB Download
Bielik-11B-v3.0-Instruct.Q8_0.gguf
LFS Q8
11.05 GB Download
Bielik-11B-v3.0-Instruct.bf16.gguf
LFS FP16
20.8 GB Download
Bielik-11B-v3.0-Instruct.f16.gguf
LFS FP16
20.8 GB Download