---
base_model:
- deepseek-ai/DeepSeek-OCR
---

# DeepSeek-OCR GGUF (for llama.cpp PR #17400)

## Model Description
This repository provides the GGUF model files required to run the DeepSeek-OCR / MTMD support introduced in the following llama.cpp pull request:

[llama.cpp PR #17400](https://github.com/ggml-org/llama.cpp/pull/17400)

> **Note:** These models are only compatible with the PR branch and will not run on upstream llama.cpp `main`.
## Download

You can download the model files directly using:

```bash
huggingface-cli download <this-repo> --include "deepseek-ocr-f16.gguf" --local-dir gguf_models/deepseek-ai
huggingface-cli download <this-repo> --include "mmproj-deepseek-ocr-f16.gguf" --local-dir gguf_models/deepseek-ai
```
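As a quick sanity check after downloading, note that GGUF files begin with the 4-byte ASCII magic `GGUF`. A minimal sketch of such a check (the `check_gguf` helper is hypothetical, not part of llama.cpp or this repo):

```shell
# Check that a file starts with the GGUF magic bytes.
check_gguf() {
  magic=$(head -c 4 "$1")
  if [ "$magic" = "GGUF" ]; then
    echo "$1: looks like a GGUF file"
  else
    echo "$1: unexpected magic '$magic'" >&2
    return 1
  fi
}
```

For example, `check_gguf gguf_models/deepseek-ai/deepseek-ocr-f16.gguf` should report a valid file once the download completes.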
## Build the llama.cpp PR Branch

Clone llama.cpp and check out the PR branch:

```bash
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp

# Check out the PR branch (recommended: GitHub CLI)
gh pr checkout 17400

# or manually:
git fetch origin pull/17400/head:pr17400
git checkout pr17400
```

Build:

```bash
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build -j
```
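A successful build should leave the MTMD CLI at `build/bin/llama-mtmd-cli`. A quick check, as a sketch assuming the default build directory from the commands above:

```shell
# Verify the PR build produced the MTMD CLI binary
# (path assumed from the cmake commands above).
CLI=build/bin/llama-mtmd-cli
if [ -x "$CLI" ]; then
  echo "found: $CLI"
else
  echo "missing: $CLI (did the build succeed?)" >&2
fi
```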
## Run Example

Use the `llama-mtmd-cli` executable from the PR build:

```bash
build/bin/llama-mtmd-cli \
  -m gguf_models/deepseek-ai/deepseek-ocr-f16.gguf \
  --mmproj gguf_models/deepseek-ai/mmproj-deepseek-ocr-f16.gguf \
  --image tmp/mtmdtestdata/Deepseek-OCR-2510.18234v1_page1.png \
  -p "<|grounding|>Convert the document to markdown." \
  --chat-template deepseek-ocr --temp 0
```

```bash
build/bin/llama-mtmd-cli \
  -m gguf_models/deepseek-ai/deepseek-ocr-f16new.gguf \
  --mmproj gguf_models/deepseek-ai/mmproj-deepseek-ocr-f16new.gguf \
  --image tools/mtmd/test-1.jpeg \
  -p "Free OCR." \
  --chat-template deepseek-ocr --temp 0
```
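For repeated runs it can be convenient to wrap the invocation above in a small helper. A sketch (the `ocr_image` function name, its default prompt, and the f16 model paths are assumptions, not part of the PR):

```shell
# Hypothetical convenience wrapper around the PR's llama-mtmd-cli.
# Usage: ocr_image <image> [prompt]
ocr_image() {
  image=$1
  prompt=${2:-"<|grounding|>Convert the document to markdown."}
  build/bin/llama-mtmd-cli \
    -m gguf_models/deepseek-ai/deepseek-ocr-f16.gguf \
    --mmproj gguf_models/deepseek-ai/mmproj-deepseek-ocr-f16.gguf \
    --image "$image" \
    -p "$prompt" \
    --chat-template deepseek-ocr --temp 0
}
```

For example, `ocr_image scan.png "Free OCR."` runs the second example's prompt against a different image.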
## GGUF File List

| Filename | Quantization | Size |
|---|---|---|
| deepseek-ocr-Q4_K_M.gguf (recommended) | Q4_K_M | 1.82 GB |
| deepseek-ocr-f16.gguf | FP16 | 5.47 GB |
| deepseek-ocr-q8_0.gguf | Q8_0 | 2.91 GB |
| mmproj-deepseek-ocr-f16.gguf | FP16 | 774.27 MB |
| mmproj-deepseek-ocr-q8_0.gguf | Q8_0 | 774.27 MB |