Model Description


pipeline_tag: sentence-similarity
tags:
  - gguf
  - embedding
  - eurobert
  - llama-cpp
  - jina-embeddings-v5
language:
  - multilingual
base_model: jinaai/jina-embeddings-v5-text-nano
base_model_relation: quantized
inference: false
license: cc-by-nc-4.0
library_name: llama.cpp

jina-embeddings-v5-text-nano-retrieval-GGUF

GGUF quantizations of jina-embeddings-v5-text-nano-retrieval, a 239M-parameter multilingual embedding model, produced with llama.cpp for efficient inference.

Elastic Inference Service | ArXiv | Blog

> [!IMPORTANT]
> We highly recommend first reading this blog post for more technical details and the customized llama.cpp build.

Overview


jina-embeddings-v5-text Architecture

jina-embeddings-v5-text-nano-retrieval is a task-specific embedding model for retrieval, part of the jina-embeddings-v5-text model family.
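To make the retrieval task concrete, here is a minimal sketch of how embeddings are typically used for ranking: documents are ordered by cosine similarity to the query embedding. The toy 3-dimensional vectors and the helper names `cosine_similarity` and `rank_documents` are illustrative assumptions, not part of this model's tooling; real vectors would come from the model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec, doc_vecs):
    """Return document indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for real model embeddings (assumption for illustration).
query = [1.0, 0.0, 0.0]
docs = [
    [0.0, 1.0, 0.0],  # orthogonal to the query
    [0.9, 0.1, 0.0],  # nearly parallel to the query
    [0.5, 0.5, 0.0],  # in between
]
ranking = rank_documents(query, docs)
print(ranking)
```

If the embeddings are already unit-normalized, the cosine similarity reduces to a plain dot product.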




| Feature | Value |
|---|---|
| Parameters | 239M |
| Task | retrieval |
| Embedding Dimension | 768 |
| Matryoshka Dimensions | 32, 64, 128, 256, 512, 768 |
| Pooling Strategy | Last-token pooling |
| Base Model | jina-embeddings-v5-text-nano |
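The Matryoshka dimensions listed above mean a 768-dimensional embedding can be shortened to one of the smaller sizes. A common way to do this, sketched below under the assumption that the usual Matryoshka recipe applies here, is to keep the leading components and rescale to unit length. The function name `truncate_embedding` and the placeholder vector are hypothetical.

```python
import math

# Dimensions taken from the feature table above.
MATRYOSHKA_DIMS = (32, 64, 128, 256, 512, 768)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    if dim not in MATRYOSHKA_DIMS:
        raise ValueError(f"dim must be one of {MATRYOSHKA_DIMS}")
    if len(vec) < dim:
        raise ValueError("embedding is shorter than the requested dimension")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # nothing to rescale for an all-zero vector
    return [x / norm for x in head]


# Placeholder 768-dimensional vector standing in for a real embedding.
full = [0.01 * (i % 7 + 1) for i in range(768)]
small = truncate_embedding(full, 128)
print(len(small))
```

Query and document vectors need to be truncated to the same dimension before they are compared.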


MMTEB Multilingual Benchmark


MTEB English Benchmark


Retrieval Benchmark Results

Usage with llama.cpp


via Elastic Inference Service

Elastic Inference Service (EIS) is the fastest way to use v5-text in production. It provides managed embedding inference with built-in scaling, so you can generate embeddings directly within your Elastic deployment.

PUT _inference/text_embedding/jina-v5
{
  "service": "elastic",
  "service_settings": {
    "model_id": "jina-embeddings-v5-text-nano"
  }
}

See the Elastic Inference Service documentation for setup details.
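Once an inference endpoint such as `jina-v5` exists, embeddings can be requested over HTTP. The sketch below only builds the request and parses a response; it assumes the endpoint is reachable at `_inference/text_embedding/<id>`, that authentication uses an `ApiKey` header, and that the response carries a `text_embedding` list of `{"embedding": [...]}` items. The base URL and key are placeholders, so check them against your own deployment.

```python
import json
import urllib.request


def build_embedding_request(base_url, api_key, inference_id, texts):
    """Build (but do not send) a POST request for text embeddings."""
    url = f"{base_url.rstrip('/')}/_inference/text_embedding/{inference_id}"
    body = json.dumps({"input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_key}",  # assumed auth scheme
        },
    )


def extract_embeddings(response_json):
    """Pull the vectors out of the assumed response shape."""
    return [item["embedding"] for item in response_json["text_embedding"]]


# Placeholder URL and key; replace them with values from your deployment.
request = build_embedding_request(
    "https://localhost:9200", "HYPOTHETICAL_KEY", "jina-v5", ["Your text here"]
)
print(request.full_url)

# Sending is left to the caller, for example:
# with urllib.request.urlopen(request) as resp:
#     vectors = extract_embeddings(json.load(resp))
```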

# Build llama.cpp (upstream)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp && cmake -B build && cmake --build build --config Release

Run embedding

./build/bin/llama-embedding -m jina-embeddings-v5-text-nano-retrieval-Q8_0.gguf \
  --pooling last -p "Your text here"
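The `--pooling last` flag in the command above selects last-token pooling, which llama.cpp performs internally. As an illustration of the idea only, not of llama.cpp's implementation, the sketch below picks the hidden state of the last non-padding token. The function name `last_token_pool` and the toy data are assumptions.

```python
def last_token_pool(token_vectors, attention_mask):
    """Return the vector of the last token whose mask entry is non-zero."""
    if len(token_vectors) != len(attention_mask):
        raise ValueError("token_vectors and attention_mask must have equal length")
    last_index = None
    for i, keep in enumerate(attention_mask):
        if keep:
            last_index = i  # remember the most recent real (non-padding) token
    if last_index is None:
        raise ValueError("attention_mask marks no real tokens")
    return token_vectors[last_index]


# Toy per-token hidden states: three real tokens followed by one padding token.
tokens = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
mask = [1, 1, 1, 0]
pooled = last_token_pool(tokens, mask)
print(pooled)
```

The pooled vector is what then serves as the sentence embedding, usually after L2 normalization.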

License

CC-BY-NC-4.0. For commercial use, please contact us.

GGUF File List

| Filename | Precision | Size |
|---|---|---|
| v5-nano-retrieval-F16.gguf | FP16 | 411.41 MB |
| v5-nano-retrieval-IQ1_M.gguf | | 96.99 MB |
| v5-nano-retrieval-IQ1_S.gguf | | 94.83 MB |
| v5-nano-retrieval-IQ2_M.gguf | Q2 | 108.43 MB |
| v5-nano-retrieval-IQ2_XXS.gguf | Q2 | 100.6 MB |
| v5-nano-retrieval-IQ4_NL.gguf | Q4 | 145.34 MB |
| v5-nano-retrieval-IQ4_XS.gguf | Q4 | 141.97 MB |
| v5-nano-retrieval-Q2_K.gguf | Q2 | 124.15 MB |
| v5-nano-retrieval-Q3_K_M.gguf | Q3 | 136.52 MB |
| v5-nano-retrieval-Q4_K_M.gguf | Q4 (recommended) | 149.7 MB |
| v5-nano-retrieval-Q5_K_M.gguf | Q5 | 161.09 MB |
| v5-nano-retrieval-Q5_K_S.gguf | Q5 | 158.84 MB |
| v5-nano-retrieval-Q6_K.gguf | Q6 | 173.19 MB |
| v5-nano-retrieval-Q8_0.gguf | Q8 | 222.1 MB |
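The file sizes listed above can be compared against the F16 file to see how much each quantization saves, or filtered by a size budget. The sketch below does only that arithmetic on the listed sizes; it says nothing about embedding quality, and the helper names `relative_size` and `largest_under` are hypothetical.

```python
# Sizes in MB, copied from the file list above.
SIZES_MB = {
    "F16": 411.41, "IQ1_M": 96.99, "IQ1_S": 94.83, "IQ2_M": 108.43,
    "IQ2_XXS": 100.6, "IQ4_NL": 145.34, "IQ4_XS": 141.97, "Q2_K": 124.15,
    "Q3_K_M": 136.52, "Q4_K_M": 149.7, "Q5_K_M": 161.09, "Q5_K_S": 158.84,
    "Q6_K": 173.19, "Q8_0": 222.1,
}


def relative_size(quant, reference="F16"):
    """File size of `quant` as a fraction of the reference file size."""
    return SIZES_MB[quant] / SIZES_MB[reference]


def largest_under(budget_mb):
    """Name of the largest listed file that fits the budget, or None."""
    fitting = {q: s for q, s in SIZES_MB.items() if s <= budget_mb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


for name in ("Q8_0", "Q4_K_M", "IQ1_S"):
    print(f"{name}: {relative_size(name):.1%} of the F16 file size")
print(largest_under(150))
```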