# embeddinggemma-300M GGUF

Base model: google/embeddinggemma-300M

Recommended way to run this model:

```sh
llama-server -hf ggml-org/embeddinggemma-300M-GGUF --embeddings
```

Then the endpoint can be accessed at http://localhost:8080/embedding, for
example using curl:

```sh
curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello embeddings"}' \
    --silent
```
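
The same request can be made from code. A minimal Python sketch, assuming the server started above is listening on localhost:8080 (the exact response layout can vary between llama.cpp versions, so it is printed as-is):

```python
import json
import urllib.request

# Build the same request as the curl example above.
req = urllib.request.Request(
    "http://localhost:8080/embedding",
    data=json.dumps({"input": "Hello embeddings"}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# Send the request and print the parsed JSON response.
with urllib.request.urlopen(req) as resp:
    print(json.load(resp))
```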

Alternatively, the llama-embedding command line tool can be used:

```sh
llama-embedding -hf ggml-org/embeddinggemma-300M-GGUF --verbose-prompt -p "Hello embeddings"
```

#### embd_normalize
When the model uses pooling, or a pooling method is specified with --pooling,
normalization of the resulting embeddings can be controlled with the
embd_normalize parameter.

The default value is 2, which normalizes the embeddings using the Euclidean
(L2) norm. The other options are (see the sketch after this list):

- -1: no normalization
- 0: max absolute
- 1: taxicab (L1)
- 2: Euclidean (L2)
- \>2: p-norm
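
To make these concrete, here is an illustrative sketch of what each option computes, using NumPy as a stand-in. This is not llama.cpp's actual implementation (in particular, the real max-absolute mode additionally rescales the result, which is omitted here):

```python
import numpy as np

def embd_normalize(v: np.ndarray, norm: int) -> np.ndarray:
    """Simplified illustration of the embd_normalize options."""
    if norm == -1:                 # -1: no normalization
        return v
    if norm == 0:                  # 0: max absolute (largest |component| becomes 1)
        return v / np.abs(v).max()
    # 1: taxicab (L1), 2: Euclidean (L2), >2: general p-norm
    return v / np.linalg.norm(v, ord=norm)

v = np.array([3.0, -4.0])
print(embd_normalize(v, 2))        # [ 0.6 -0.8] -- unit Euclidean length
```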

This can be passed in the request body to llama-server, for example:

```sh
curl --request POST \
    --url http://localhost:8080/embedding \
    --header "Content-Type: application/json" \
    --data '{"input": "Hello embeddings", "embd_normalize": -1}' \
    --silent
```

And for llama-embedding, by passing the --embd-normalize flag, for example:

```sh
llama-embedding -hf ggml-org/embeddinggemma-300M-GGUF --embd-normalize -1 -p "Hello embeddings"
```
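
One practical note on the default: because embd_normalize 2 yields unit-length vectors, the dot product of two embeddings is already their cosine similarity. A minimal sketch in plain NumPy, independent of how the vectors were retrieved:

```python
import numpy as np

def cosine_similarity(a, b) -> float:
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # General form; for L2-normalized embeddings the denominator is 1,
    # so this reduces to a plain dot product.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```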

## GGUF File List

| Filename | Quantization | Size |
| --- | --- | --- |
| embeddinggemma-300M-Q8_0.gguf (recommended) | Q8_0 | 313.36 MB |