πŸ“‹ Model Description


---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: >-
  https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md
language:
- en
base_model:
- black-forest-labs/FLUX.1-Kontext-dev
pipeline_tag: image-to-image
tags:
- gguf-node
- gguf-connector
widget:
- text: >-
    the anime girl with massive fennec ears is wearing cargo pants while
    sitting on a log in the woods biting into a sandwich beside a beautiful
    alpine lake
  output:
    url: samples\ComfyUI_00001_.png
- src: samples\fennecgirlsing.png
  prompt: >-
    the anime girl with massive fennec ears is wearing cargo pants while
    sitting on a log in the woods biting into a sandwich beside a beautiful
    alpine lake
  output:
    url: samples\ComfyUI_00001_.png
- text: >-
    the anime girl with massive fennec ears is wearing a maid outfit with a
    long black gold leaf pattern dress and a white apron mouth open holding a
    fancy black forest cake with candles on top in the kitchen of an old dark
    Victorian mansion lit by candlelight with a bright window to the foggy
    forest and very expensive stuff everywhere
  output:
    url: samples\ComfyUI_00002_.png
- src: samples\fennecgirlsing.png
  prompt: >-
    the anime girl with massive fennec ears is wearing a maid outfit with a
    long black gold leaf pattern dress and a white apron mouth open holding a
    fancy black forest cake with candles on top in the kitchen of an old dark
    Victorian mansion lit by candlelight with a bright window to the foggy
    forest and very expensive stuff everywhere
  output:
    url: samples\ComfyUI_00002_.png
- text: add a hat to the pig
  output:
    url: samples\hat.webp
- src: samples\pig.png
  prompt: add a hat to the pig
  output:
    url: samples\hat.webp
---

gguf quantized version of kontext

  • run it straight with gguf-connector
  • opt a gguf file in the current directory to interact with by running:

```
ggc k0
```

>GGUF file(s) available. Select which one to use:
>
>1. flux-kontext-lite-q2_k.gguf
>2. flux-kontext-lite-q4_0.gguf
>3. flux-kontext-lite-q8_0.gguf
>
>Enter your choice (1 to 3): _

note: try the experimental lite model with 8-step operation; it saves up to 70% loading time
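if the `ggc` command isn't available yet, gguf-connector is a pypi package, so a plain pip install should be enough (a minimal setup note, assuming a working python environment):

```
pip install gguf-connector
```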

run it with gguf-node via comfyui

  • drag kontext to > ./ComfyUI/models/diffusion_models
  • drag clip-l, t5xxl to > ./ComfyUI/models/text_encoders
  • drag pig to > ./ComfyUI/models/vae
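the same files can also be fetched programmatically; below is a minimal sketch using `hf_hub_download` from huggingface_hub (the q4_0 model and fp32-q4_0 t5xxl are picked here for illustration; any quant from the list at the end of this card works the same way):

```py
from huggingface_hub import hf_hub_download

# pull one file per slot from this repo into the ComfyUI folders listed above
for name, folder in [
    ("flux1-kontext-dev-f32-q4_0.gguf", "diffusion_models"),  # kontext model
    ("clip_l_fp32-f16.gguf", "text_encoders"),                # clip-l
    ("t5xxl_fp32-q4_0.gguf", "text_encoders"),                # t5xxl
    ("pig_flux_vae_fp32-f16.gguf", "vae"),                    # pig (vae)
]:
    hf_hub_download("calcuis/kontext-gguf", name, local_dir=f"./ComfyUI/models/{folder}")
```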


  • no need for safetensors anymore; everything is gguf (model + encoder + vae)
  • the full gguf set works on gguf-node (see the last item in the reference at the very end)
  • get more t5xxl gguf encoders either here or here


extra: scaled safetensors (alternative 1)

  • get all-in-one checkpoint here (model, clips and vae embedded)
  • another option: get the multi-matrix scaled fp8 from comfyui here, or the e4m3fn fp8 here, with separate scaled versions of clip-l, t5xxl and vae

run it with diffusers🧨 (alternative 2)

  • you might need the most up-to-date diffusers (git version) for FluxKontextPipeline to work; upgrade your diffusers with:

```
pip install git+https://github.com/huggingface/diffusers.git
```
  • see example inference below:
```py
import torch
from transformers import T5EncoderModel
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

# quantized t5xxl text encoder pulled straight from this repo
text_encoder = T5EncoderModel.from_pretrained(
    "calcuis/kontext-gguf",
    gguf_file="t5xxl_fp16-q4_0.gguf",
    torch_dtype=torch.bfloat16,
)

# kontext pipeline; the gguf t5 replaces the default text_encoder_2
pipe = FluxKontextPipeline.from_pretrained(
    "calcuis/kontext-gguf",
    text_encoder_2=text_encoder,
    torch_dtype=torch.bfloat16,
).to("cuda")

input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")

image = pipe(
    image=input_image,
    prompt="Add a hat to the cat",
    guidance_scale=2.5,
).images[0]
image.save("output.png")
```

  • tip: if your machine doesn't have enough vram, we'd suggest running it with gguf-node via comfyui (plan a above); otherwise you might be waiting very long while it falls back to a slow mode; this is always a winner-takes-all game
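to squeeze vram further on the diffusers route, the transformer itself can also be loaded from one of the gguf files in this repo; a hedged sketch using diffusers' gguf loader (whether a given quant loads this way depends on your diffusers version; the q4_0 file name comes from the list at the end of this card):

```py
import torch
from diffusers import FluxKontextPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# quantized transformer streamed straight from the gguf file in this repo
transformer = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/calcuis/kontext-gguf/blob/main/flux1-kontext-dev-f32-q4_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

pipe = FluxKontextPipeline.from_pretrained(
    "calcuis/kontext-gguf",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keep weights on cpu until each block is needed
```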

run it with gguf-connector (other alternatives)

  • simply execute the command below in console/terminal

```
ggc k2
```


  • note: during the first launch, it will pull the required model file(s) from this repo to the local cache automatically; after that you can opt to run it entirely offline, i.e., from the local URL: http://127.0.0.1:7860 with the lazy webui


  • with the bot lora embedded version

```
ggc k1
```


  • new plushie style


additional chapter for lora conversion via gguf-connector

  • convert a lora from base to unet format, i.e., plushie, then it can be used in comfyui as well

```
ggc la
```


  • the lora can be swapped back as well (from unet to base; auto-detection logic applied), so it can be used for inference again

```
ggc la
```

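once a lora is back in base format, it should also be loadable on the diffusers pipeline from alternative 2 above; a minimal sketch (the file name `plushie_base.safetensors` is hypothetical, standing in for whatever `ggc la` produced):

```py
# continuing from the diffusers example above (pipe and input_image already defined);
# `plushie_base.safetensors` is a hypothetical base-format lora file name
pipe.load_lora_weights("plushie_base.safetensors")
image = pipe(
    image=input_image,
    prompt="turn the cat into a plushie",
    guidance_scale=2.5,
).images[0]
image.save("plushie.png")
```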

update

  • clip-l-v2: missing tensor text_projection.weight added
  • kontext-v2: s-quant and k-quant; except the single and double blocks, everything is kept in f32
    - pros: loads faster (no dequant needed for those tensors); also 1) avoids the key-breaking issue, since some inference engines only dequant blocks; 2) compatible with non-cuda machines, as most of them cannot run bf16 tensors
    - cons: a little bit larger in file size
  • kontext-v3: i-quant attempt (upgrade your node to the latest version for full quant support)
  • kontext-v4: t-quant; runnable (extremely fast); for speed-test/experimental purposes
| rank | quant | s/it | loading speed |
|------|-------|------|---------------|
| 1 | q2_k | 6.40Β±.7 | πŸ–πŸ’¨πŸ’¨πŸ’¨πŸ’¨πŸ’¨πŸ’¨ |
| 2 | q4_0 | 8.58Β±.5 | πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨πŸ’¨πŸ’¨ |
| 3 | q4_1 | 9.12Β±.5 | πŸ–πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨πŸ’¨ |
| 4 | q8_0 | 9.45Β±.3 | πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨ |
| 5 | q3_k | 9.50Β±.3 | πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨ |
| 6 | q5_0 | 10.48Β±.5 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨ |
| 7 | iq4_nl | 10.55Β±.5 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨ |
| 8 | q5_1 | 10.65Β±.5 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨ |
| 9 | iq4_xs | 11.45Β±.7 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨ |
| 10 | iq3_s | 11.62Β±.9 | πŸ’πŸ’πŸ’πŸ’πŸ’πŸ’πŸ’¨ |
| 11 | iq3_xxs | 12.08Β±.9 | 🐒🐒🐒🐒🐒🐒🐒 |

not all quants were included in the initial test (*tested with a beginner laptop gpu only; if you have a high-end card, you might find q8_0 running surprisingly faster than the others); test the rest yourself. btw, the interesting thing is: the loading time required did not align with file size, due to the complexity of each dequant calculation, and it might vary from model to model
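to check for yourself which tensors a given file keeps in f32 and which are quantized (the v2 point above), here is a minimal sketch with the `gguf` python package (the file name is one from the list at the end of this card; any local gguf works):

```py
from gguf import GGUFReader

# dump every tensor's name and quantization type from a local gguf file
reader = GGUFReader("flux1-v2-kontext-dev-f32-q4_0.gguf")
for tensor in reader.tensors:
    print(f"{tensor.name}: {tensor.tensor_type.name}")
```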

new memory economy mode

  • this option works for machines with low/no vram, or even without a gpu

```
ggc k3
```

>🐷 Kontext Image Editor (connector mode) 🐷

  • opt a gguf file straight from the current directory to interact with

```
ggc k6
```

  • semi-full quant is supported in the k8 connector (it uses dequantor instead of diffusers)

```
ggc k8
```

reference

πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
clip_l_fp32-f16.gguf
LFS FP16
234.97 MB Download
clip_l_v2_fp32-f16.gguf
LFS FP16
236.09 MB Download
flux-kontext-lite-iq2_s.gguf
LFS Q2
3.92 GB Download
flux-kontext-lite-iq3_s.gguf
LFS Q3
5.29 GB Download
flux-kontext-lite-iq3_xxs.gguf
LFS Q3
4.99 GB Download
flux-kontext-lite-iq4_nl.gguf
LFS Q4
6.45 GB Download
flux-kontext-lite-iq4_xs.gguf
LFS Q4
6.11 GB Download
flux-kontext-lite-q2_k.gguf
LFS Q2
3.8 GB Download
flux-kontext-lite-q4_0.gguf
Recommended LFS Q4
6.37 GB Download
flux-kontext-lite-q8_0.gguf
LFS Q8
11.84 GB Download
flux1-kontext-dev-bf16.gguf
LFS FP16
22.17 GB Download
flux1-kontext-dev-f16.gguf
LFS FP16
22.17 GB Download
flux1-kontext-dev-f32-q2_k.gguf
LFS Q2
3.83 GB Download
flux1-kontext-dev-f32-q3_k_m.gguf
LFS Q3
4.91 GB Download
flux1-kontext-dev-f32-q3_k_s.gguf
LFS Q3
4.77 GB Download
flux1-kontext-dev-f32-q4_0.gguf
LFS Q4
6.24 GB Download
flux1-kontext-dev-f32-q4_1.gguf
LFS Q4
6.94 GB Download
flux1-kontext-dev-f32-q4_k_m.gguf
LFS Q4
6.37 GB Download
flux1-kontext-dev-f32-q4_k_s.gguf
LFS Q4
6.24 GB Download
flux1-kontext-dev-f32-q5_0.gguf
LFS Q5
7.63 GB Download
flux1-kontext-dev-f32-q5_1.gguf
LFS Q5
8.32 GB Download
flux1-kontext-dev-f32-q5_k_m.gguf
LFS Q5
7.76 GB Download
flux1-kontext-dev-f32-q5_k_s.gguf
LFS Q5
7.63 GB Download
flux1-kontext-dev-f32-q6_k.gguf
LFS Q6
9.1 GB Download
flux1-kontext-dev-f32-q8_0.gguf
LFS Q8
11.79 GB Download
flux1-kontext-dev-f32.gguf
LFS
44.34 GB Download
flux1-kontext-dev-iq4_nl.gguf
LFS Q4
6.32 GB Download
flux1-kontext-dev-mxfp4_moe.gguf
LFS
11.84 GB Download
flux1-kontext-dev-q2_k_s.gguf
LFS Q2
3.74 GB Download
flux1-v2-kontext-dev-f32-q2_k.gguf
LFS Q2
3.87 GB Download
flux1-v2-kontext-dev-f32-q4_0.gguf
LFS Q4
6.45 GB Download
flux1-v2-kontext-dev-f32-q5_0.gguf
LFS Q5
7.83 GB Download
flux1-v2-kontext-dev-f32-q6_k.gguf
LFS Q6
9.29 GB Download
flux1-v2-kontext-dev-f32-q8_0.gguf
LFS Q8
11.96 GB Download
flux1-v3-kontext-dev-f32-iq4_nl.gguf
LFS Q4
6.45 GB Download
flux1-v3-kontext-dev-f32-iq4_xs.gguf
LFS Q4
6.11 GB Download
flux1-v3-kontext-dev-mix-iq1_m.gguf
LFS
3.68 GB Download
flux1-v3-kontext-dev-mix-iq1_s.gguf
LFS
3.66 GB Download
flux1-v3-kontext-dev-mix-iq2_s.gguf
LFS Q2
3.83 GB Download
flux1-v3-kontext-dev-mix-iq2_xs.gguf
LFS Q2
3.73 GB Download
flux1-v3-kontext-dev-mix-iq2_xxs.gguf
LFS Q2
3.7 GB Download
flux1-v3-kontext-dev-mix-iq3_s.gguf
LFS Q3
4.77 GB Download
flux1-v3-kontext-dev-mix-iq3_xxs.gguf
LFS Q3
4.77 GB Download
flux1-v3-kontext-dev-mix-iq4_nl.gguf
LFS Q4
6.31 GB Download
flux1-v4-kontext-dev-mix-tq2_0.gguf
LFS Q2
3.65 GB Download
flux1-v4-kontext-dev-tq1_0.gguf
LFS
2.58 GB Download
flux1-v4-kontext-dev-tq2_0.gguf
LFS Q2
3.09 GB Download
pig_flux_vae_fp32-f16.gguf
LFS FP16
160.02 MB Download
t5xxl_fp16-q4_0.gguf
LFS Q4
2.7 GB Download
t5xxl_fp32-iq4_nl.gguf
LFS Q4
2.54 GB Download
t5xxl_fp32-q4_0.gguf
LFS Q4
2.56 GB Download