πŸ“‹ Model Description


---
license: other
license_name: flux-1-dev-non-commercial-license
license_link: >-
  https://huggingface.co/black-forest-labs/FLUX.1-Kontext-dev/blob/main/LICENSE.md
language:
- en
base_model:
- black-forest-labs/FLUX.1-Kontext-dev
pipeline_tag: image-to-image
tags:
- gguf-node
- gguf-connector
widget:
- text: >-
    the anime girl with massive fennec ears is wearing cargo pants while
    sitting on a log in the woods biting into a sandwich beside a beautiful
    alpine lake
  output:
    url: samples\ComfyUI_00001_.png
- src: samples\fennecgirlsing.png
  prompt: >-
    the anime girl with massive fennec ears is wearing cargo pants while
    sitting on a log in the woods biting into a sandwich beside a beautiful
    alpine lake
  output:
    url: samples\ComfyUI_00001_.png
- text: >-
    the anime girl with massive fennec ears is wearing a maid outfit with a
    long black gold leaf pattern dress and a white apron mouth open holding a
    fancy black forest cake with candles on top in the kitchen of an old dark
    Victorian mansion lit by candlelight with a bright window to the foggy
    forest and very expensive stuff everywhere
  output:
    url: samples\ComfyUI_00002_.png
- src: samples\fennecgirlsing.png
  prompt: >-
    the anime girl with massive fennec ears is wearing a maid outfit with a
    long black gold leaf pattern dress and a white apron mouth open holding a
    fancy black forest cake with candles on top in the kitchen of an old dark
    Victorian mansion lit by candlelight with a bright window to the foggy
    forest and very expensive stuff everywhere
  output:
    url: samples\ComfyUI_00002_.png
- text: add a hat to the pig
  output:
    url: samples\hat.webp
- src: samples\pig.png
  prompt: add a hat to the pig
  output:
    url: samples\hat.webp
---

gguf quantized version of kontext

  • run it straight with gguf-connector
  • opt a gguf file in the current directory to interact with by running:

```
ggc k0
```

>GGUF file(s) available. Select which one to use:
>
>1. flux-kontext-lite-q2_k.gguf
>2. flux-kontext-lite-q4_0.gguf
>3. flux-kontext-lite-q8_0.gguf
>
>Enter your choice (1 to 3): _

note: try the experimental lite model with 8-step operation; it saves up to 70% loading time
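if the `ggc` command isn't available yet, gguf-connector is a pypi package, so a plain pip install should be enough (a minimal setup note, assuming a working python environment):

```
pip install gguf-connector
```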

run it with gguf-node via comfyui

  • drag kontext to > ./ComfyUI/models/diffusion_models
  • drag clip-l, t5xxl to > ./ComfyUI/models/text_encoders
  • drag pig to > ./ComfyUI/models/vae
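the same files can also be fetched programmatically; below is a minimal sketch using `hf_hub_download` from huggingface_hub (the q4_0 model and fp32-q4_0 t5xxl are picked here for illustration; any quant from the list at the end of this card works the same way):

```py
from huggingface_hub import hf_hub_download

# pull one file per slot from this repo into the ComfyUI folders listed above
for name, folder in [
    ("flux1-kontext-dev-f32-q4_0.gguf", "diffusion_models"),  # kontext model
    ("clip_l_fp32-f16.gguf", "text_encoders"),                # clip-l
    ("t5xxl_fp32-q4_0.gguf", "text_encoders"),                # t5xxl
    ("pig_flux_vae_fp32-f16.gguf", "vae"),                    # pig (vae)
]:
    hf_hub_download("calcuis/kontext-gguf", name, local_dir=f"./ComfyUI/models/{folder}")
```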


  • no need for safetensors anymore; everything is gguf (model + encoder + vae)
  • the full gguf set works on gguf-node (see the last item in the reference at the very end)
  • get more t5xxl gguf encoders either here or here


extra: scaled safetensors (alternative 1)

  • get all-in-one checkpoint here (model, clips and vae embedded)
  • another option: get the multi-matrix scaled fp8 from comfyui here, or the e4m3fn fp8 here, with separate scaled versions of clip-l, t5xxl and vae

run it with diffusers🧨 (alternative 2)

  • you might need the most up-to-date diffusers (git version) for FluxKontextPipeline to work; upgrade your diffusers with:

```
pip install git+https://github.com/huggingface/diffusers.git
```
  • see example inference below:
```py
import torch
from transformers import T5EncoderModel
from diffusers import FluxKontextPipeline
from diffusers.utils import load_image

# quantized t5xxl text encoder pulled straight from this repo
text_encoder = T5EncoderModel.from_pretrained(
    "calcuis/kontext-gguf",
    gguf_file="t5xxl_fp16-q4_0.gguf",
    torch_dtype=torch.bfloat16,
)

# kontext pipeline; the gguf t5 replaces the default text_encoder_2
pipe = FluxKontextPipeline.from_pretrained(
    "calcuis/kontext-gguf",
    text_encoder_2=text_encoder,
    torch_dtype=torch.bfloat16,
).to("cuda")

input_image = load_image("https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png")

image = pipe(
    image=input_image,
    prompt="Add a hat to the cat",
    guidance_scale=2.5,
).images[0]
image.save("output.png")
```

  • tip: if your machine doesn't have enough vram, we'd suggest running it with gguf-node via comfyui (plan a above); otherwise you might be waiting very long while it falls back to a slow mode; this is always a winner-takes-all game
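to squeeze vram further on the diffusers route, the transformer itself can also be loaded from one of the gguf files in this repo; a hedged sketch using diffusers' gguf loader (whether a given quant loads this way depends on your diffusers version; the q4_0 file name comes from the list at the end of this card):

```py
import torch
from diffusers import FluxKontextPipeline, FluxTransformer2DModel, GGUFQuantizationConfig

# quantized transformer streamed straight from the gguf file in this repo
transformer = FluxTransformer2DModel.from_single_file(
    "https://huggingface.co/calcuis/kontext-gguf/blob/main/flux1-kontext-dev-f32-q4_0.gguf",
    quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
    torch_dtype=torch.bfloat16,
)

pipe = FluxKontextPipeline.from_pretrained(
    "calcuis/kontext-gguf",
    transformer=transformer,
    torch_dtype=torch.bfloat16,
)
pipe.enable_model_cpu_offload()  # keep weights on cpu until each block is needed
```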

run it with gguf-connector (other alternatives)

  • simply execute the command below in console/terminal

```
ggc k2
```


  • note: during the first launch, it will pull the required model file(s) from this repo to the local cache automatically; after that you can opt to run it entirely offline, i.e., from the local URL: http://127.0.0.1:7860 with the lazy webui


  • with the bot lora embedded version

```
ggc k1
```


  • new plushie style


additional chapter for lora conversion via gguf-connector

  • convert a lora from base to unet format, i.e., plushie, then it can be used in comfyui as well

```
ggc la
```


  • the lora can be swapped back as well (from unet to base; auto-detection logic applied), so it can be used for inference again

```
ggc la
```

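once a lora is back in base format, it should also be loadable on the diffusers pipeline from alternative 2 above; a minimal sketch (the file name `plushie_base.safetensors` is hypothetical, standing in for whatever `ggc la` produced):

```py
# continuing from the diffusers example above (pipe and input_image already defined);
# `plushie_base.safetensors` is a hypothetical base-format lora file name
pipe.load_lora_weights("plushie_base.safetensors")
image = pipe(
    image=input_image,
    prompt="turn the cat into a plushie",
    guidance_scale=2.5,
).images[0]
image.save("plushie.png")
```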

update

  • clip-l-v2: missing tensor text_projection.weight added
  • kontext-v2: s-quant and k-quant; except the single and double blocks, everything is kept in f32
    - pros: loads faster (no dequant needed for those tensors); also 1) avoids the key-breaking issue, since some inference engines only dequant blocks; 2) compatible with non-cuda machines, as most of them cannot run bf16 tensors
    - cons: a little bit larger in file size
  • kontext-v3: i-quant attempt (upgrade your node to the latest version for full quant support)
  • kontext-v4: t-quant; runnable (extremely fast); for speed-test/experimental purposes
| rank | quant | s/it | loading speed |
|------|-------|------|---------------|
| 1 | q2_k | 6.40Β±.7 | πŸ–πŸ’¨πŸ’¨πŸ’¨πŸ’¨πŸ’¨πŸ’¨ |
| 2 | q4_0 | 8.58Β±.5 | πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨πŸ’¨πŸ’¨ |
| 3 | q4_1 | 9.12Β±.5 | πŸ–πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨πŸ’¨ |
| 4 | q8_0 | 9.45Β±.3 | πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨ |
| 5 | q3_k | 9.50Β±.3 | πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨πŸ’¨ |
| 6 | q5_0 | 10.48Β±.5 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨ |
| 7 | iq4_nl | 10.55Β±.5 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨ |
| 8 | q5_1 | 10.65Β±.5 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨πŸ’¨ |
| 9 | iq4_xs | 11.45Β±.7 | πŸ–πŸ–πŸ–πŸ–πŸ–πŸ–πŸ’¨ |
| 10 | iq3_s | 11.62Β±.9 | πŸ’πŸ’πŸ’πŸ’πŸ’πŸ’πŸ’¨ |
| 11 | iq3_xxs | 12.08Β±.9 | 🐒🐒🐒🐒🐒🐒🐒 |

not all quants were included in the initial test (*tested with a beginner laptop gpu only; if you have a high-end card, you might find q8_0 running surprisingly faster than the others); test the rest yourself. btw, the interesting thing is: the loading time required did not align with file size, due to the complexity of each dequant calculation, and it might vary from model to model
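to check for yourself which tensors a given file keeps in f32 and which are quantized (the v2 point above), here is a minimal sketch with the `gguf` python package (the file name is one from the list at the end of this card; any local gguf works):

```py
from gguf import GGUFReader

# dump every tensor's name and quantization type from a local gguf file
reader = GGUFReader("flux1-v2-kontext-dev-f32-q4_0.gguf")
for tensor in reader.tensors:
    print(f"{tensor.name}: {tensor.tensor_type.name}")
```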

new memory economy mode

  • this option works for machines with low/no vram, or even without a gpu

```
ggc k3
```

>🐷 Kontext Image Editor (connector mode) 🐷

  • opt a gguf file straight from the current directory to interact with

```
ggc k6
```

  • semi-full quant is supported in the k8 connector (it uses dequantor instead of diffusers)

```
ggc k8
```

reference

πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
clip_l_fp32-f16.gguf
LFS FP16
234.97 MB Download
clip_l_v2_fp32-f16.gguf
LFS FP16
236.09 MB Download
flux-kontext-lite-iq2_s.gguf
LFS Q2
3.92 GB Download
flux-kontext-lite-iq3_s.gguf
LFS Q3
5.29 GB Download
flux-kontext-lite-iq3_xxs.gguf
LFS Q3
4.99 GB Download
flux-kontext-lite-iq4_nl.gguf
LFS Q4
6.45 GB Download
flux-kontext-lite-iq4_xs.gguf
LFS Q4
6.11 GB Download
flux-kontext-lite-q2_k.gguf
LFS Q2
3.8 GB Download
flux-kontext-lite-q4_0.gguf
Recommended LFS Q4
6.37 GB Download
flux-kontext-lite-q8_0.gguf
LFS Q8
11.84 GB Download
flux1-kontext-dev-bf16.gguf
LFS FP16
22.17 GB Download
flux1-kontext-dev-f16.gguf
LFS FP16
22.17 GB Download
flux1-kontext-dev-f32-q2_k.gguf
LFS Q2
3.83 GB Download
flux1-kontext-dev-f32-q3_k_m.gguf
LFS Q3
4.91 GB Download
flux1-kontext-dev-f32-q3_k_s.gguf
LFS Q3
4.77 GB Download
flux1-kontext-dev-f32-q4_0.gguf
LFS Q4
6.24 GB Download
flux1-kontext-dev-f32-q4_1.gguf
LFS Q4
6.94 GB Download
flux1-kontext-dev-f32-q4_k_m.gguf
LFS Q4
6.37 GB Download
flux1-kontext-dev-f32-q4_k_s.gguf
LFS Q4
6.24 GB Download
flux1-kontext-dev-f32-q5_0.gguf
LFS Q5
7.63 GB Download
flux1-kontext-dev-f32-q5_1.gguf
LFS Q5
8.32 GB Download
flux1-kontext-dev-f32-q5_k_m.gguf
LFS Q5
7.76 GB Download
flux1-kontext-dev-f32-q5_k_s.gguf
LFS Q5
7.63 GB Download
flux1-kontext-dev-f32-q6_k.gguf
LFS Q6
9.1 GB Download
flux1-kontext-dev-f32-q8_0.gguf
LFS Q8
11.79 GB Download
flux1-kontext-dev-f32.gguf
LFS
44.34 GB Download
flux1-kontext-dev-iq4_nl.gguf
LFS Q4
6.32 GB Download
flux1-kontext-dev-mxfp4_moe.gguf
LFS
11.84 GB Download
flux1-kontext-dev-q2_k_s.gguf
LFS Q2
3.74 GB Download
flux1-v2-kontext-dev-f32-q2_k.gguf
LFS Q2
3.87 GB Download
flux1-v2-kontext-dev-f32-q4_0.gguf
LFS Q4
6.45 GB Download
flux1-v2-kontext-dev-f32-q5_0.gguf
LFS Q5
7.83 GB Download
flux1-v2-kontext-dev-f32-q6_k.gguf
LFS Q6
9.29 GB Download
flux1-v2-kontext-dev-f32-q8_0.gguf
LFS Q8
11.96 GB Download
flux1-v3-kontext-dev-f32-iq4_nl.gguf
LFS Q4
6.45 GB Download
flux1-v3-kontext-dev-f32-iq4_xs.gguf
LFS Q4
6.11 GB Download
flux1-v3-kontext-dev-mix-iq1_m.gguf
LFS
3.68 GB Download
flux1-v3-kontext-dev-mix-iq1_s.gguf
LFS
3.66 GB Download
flux1-v3-kontext-dev-mix-iq2_s.gguf
LFS Q2
3.83 GB Download
flux1-v3-kontext-dev-mix-iq2_xs.gguf
LFS Q2
3.73 GB Download
flux1-v3-kontext-dev-mix-iq2_xxs.gguf
LFS Q2
3.7 GB Download
flux1-v3-kontext-dev-mix-iq3_s.gguf
LFS Q3
4.77 GB Download
flux1-v3-kontext-dev-mix-iq3_xxs.gguf
LFS Q3
4.77 GB Download
flux1-v3-kontext-dev-mix-iq4_nl.gguf
LFS Q4
6.31 GB Download
flux1-v4-kontext-dev-mix-tq2_0.gguf
LFS Q2
3.65 GB Download
flux1-v4-kontext-dev-tq1_0.gguf
LFS
2.58 GB Download
flux1-v4-kontext-dev-tq2_0.gguf
LFS Q2
3.09 GB Download
pig_flux_vae_fp32-f16.gguf
LFS FP16
160.02 MB Download
t5xxl_fp16-q4_0.gguf
LFS Q4
2.7 GB Download
t5xxl_fp32-iq4_nl.gguf
LFS Q4
2.54 GB Download
t5xxl_fp32-q4_0.gguf
LFS Q4
2.56 GB Download