RichardErkhov/Replete-AI_-_Phi-Delthanar-gguf

Name: RichardErkhov/Replete-AI_-_Phi-Delthanar-gguf
Author: RichardErkhov

High-quality GGUF model

3.2K 📥 Downloads

0 ❤️ Likes

22 📁 GGUF Files

96.82 GB 💾 Total Size

2 years ago 🔄 Last Updated

📋 Model Description

Quantization made by Richard Erkhov.

Phi-Delthanar - GGUF

Model creator: https://huggingface.co/Replete-AI/
Original model: https://huggingface.co/Replete-AI/Phi-Delthanar/

Name	Quant method	Size
Phi-Delthanar.Q2_K.gguf	Q2_K	2.82GB
Phi-Delthanar.IQ3_XS.gguf	IQ3_XS	3.05GB
Phi-Delthanar.IQ3_S.gguf	IQ3_S	3.19GB
Phi-Delthanar.Q3_K_S.gguf	Q3_K_S	3.19GB
Phi-Delthanar.IQ3_M.gguf	IQ3_M	3.38GB
Phi-Delthanar.Q3_K.gguf	Q3_K	3.67GB
Phi-Delthanar.Q3_K_M.gguf	Q3_K_M	3.67GB
Phi-Delthanar.Q3_K_L.gguf	Q3_K_L	4.09GB
Phi-Delthanar.IQ4_XS.gguf	IQ4_XS	3.96GB
Phi-Delthanar.Q4_0.gguf	Q4_0	4.14GB
Phi-Delthanar.IQ4_NL.gguf	IQ4_NL	4.17GB
Phi-Delthanar.Q4_K_S.gguf	Q4_K_S	4.18GB
Phi-Delthanar.Q4_K.gguf	Q4_K	4.51GB
Phi-Delthanar.Q4_K_M.gguf	Q4_K_M	4.51GB
Phi-Delthanar.Q4_1.gguf	Q4_1	4.58GB
Phi-Delthanar.Q5_0.gguf	Q5_0	5.03GB
Phi-Delthanar.Q5_K_S.gguf	Q5_K_S	5.03GB
Phi-Delthanar.Q5_K.gguf	Q5_K	5.22GB
Phi-Delthanar.Q5_K_M.gguf	Q5_K_M	5.22GB
Phi-Delthanar.Q5_1.gguf	Q5_1	5.48GB
Phi-Delthanar.Q6_K.gguf	Q6_K	5.98GB
Phi-Delthanar.Q8_0.gguf	Q8_0	7.74GB
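
With 22 files to choose from, one simple rule of thumb is to take the largest quant whose file fits a given size budget. The sketch below uses the sizes from the table above, with quant names written as they appear in the repository's file names. The helper name `largest_quant_within` is made up for this example, and the file size is only a lower bound on memory use, since the context (KV cache) needs additional room at run time.

```python
# File sizes in GB, copied from the table above.
QUANT_SIZES_GB = {
    "Q2_K": 2.82, "IQ3_XS": 3.05, "IQ3_S": 3.19, "Q3_K_S": 3.19,
    "IQ3_M": 3.38, "Q3_K": 3.67, "Q3_K_M": 3.67, "Q3_K_L": 4.09,
    "IQ4_XS": 3.96, "Q4_0": 4.14, "IQ4_NL": 4.17, "Q4_K_S": 4.18,
    "Q4_K": 4.51, "Q4_K_M": 4.51, "Q4_1": 4.58, "Q5_0": 5.03,
    "Q5_K_S": 5.03, "Q5_K": 5.22, "Q5_K_M": 5.22, "Q5_1": 5.48,
    "Q6_K": 5.98, "Q8_0": 7.74,
}


def largest_quant_within(budget_gb, sizes=QUANT_SIZES_GB):
    """Return the name of the largest quant whose file fits in budget_gb, or None."""
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    if not fitting:
        return None
    # On equal sizes, max() keeps the first entry in table order.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(largest_quant_within(6.0))
```

Larger files generally preserve more of the original model's quality, so a size budget is a reasonable first filter, but it does not replace testing the chosen quant on your own prompts.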

Original model description:

license: mit
language:

thumbnail: "https://cdn-uploads.huggingface.co/production/uploads/6589d7e6586088fd2784a12c/iYImJKf2HZZZJ9IwDSN00.png"

The forest is with you.

Phi-Delthanar is named after the method used to create it: the layers of its predecessor were interleaved to make a far larger model with much more potential.

Del'thanar is a supposed ancient treant, and I couldn't think of a better name for a model that was created using the passthrough method.

By concatenating layers from different LLMs, the passthrough method can produce models with an exotic number of parameters (e.g., 9B from two 7B-parameter models). The community often calls these models "frankenmerges" or "Frankenstein models".
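
To make the idea concrete, here is a minimal sketch of how a passthrough merge stacks layer slices into a deeper model. It only tracks layer indices and touches no weights. The function name `passthrough_stack`, the 32-layer source, and the slice ranges are all invented for illustration; they are not the recipe used for Phi-Delthanar.

```python
def passthrough_stack(slices):
    """Expand (source, start, end) slices into the merged model's ordered layer list.

    Each slice contributes layers start..end-1 of its source model, and the
    slices are concatenated in the order given.
    """
    stacked = []
    for source, start, end in slices:
        stacked.extend((source, index) for index in range(start, end))
    return stacked


# Hypothetical recipe: two overlapping slices taken from one 32-layer model.
slices = [("base", 0, 24), ("base", 8, 32)]
merged = passthrough_stack(slices)

if __name__ == "__main__":
    # Layers 8..23 of the source appear twice, so the result is deeper than the source.
    print(len(merged))  # 48
```

Because overlapping slices repeat some layers, the merged model ends up with more layers, and therefore more parameters, than any single source model. That is where the unusual parameter counts come from.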

Many thanks to Abacaj for providing the fine-tuned weights that were used in the creation of this base model. You can find the full script for how the model was merged here. Thanks also to KatyTheCutie for inspiring me to test out this script.

This idea was brought to me by The Face of Goonery, also known as Caleb Morgan. I have him to thank if fine-tuning this model turns out to be a success. He also helped me make this model even larger than the prior one.

How to run inference:

import torch
import transformers

if __name__ == "__main__":
    model_name = "Replete-AI/Phi-Delthanar"
    tokenizer = transformers.AutoTokenizer.from_pretrained(model_name)

    # Load the model onto the first GPU and switch it to inference mode.
    model = (
        transformers.AutoModelForCausalLM.from_pretrained(
            model_name,
        )
        .to("cuda:0")
        .eval()
    )

    messages = [
        {"role": "user", "content": "Hello, who are you?"}
    ]
    # Tokenize the conversation using the tokenizer's chat template.
    inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
    # Prompt length in tokens, used below to strip the prompt from the output.
    input_ids_cutoff = inputs.size(dim=1)

    with torch.no_grad():
        generated_ids = model.generate(
            input_ids=inputs,
            use_cache=True,
            max_new_tokens=512,
            temperature=0.2,
            top_p=0.95,
            do_sample=True,
            eos_token_id=tokenizer.eos_token_id,
            pad_token_id=tokenizer.pad_token_id,
        )

    # Decode only the newly generated tokens.
    completion = tokenizer.decode(
        generated_ids[0][input_ids_cutoff:],
        skip_special_tokens=True,
    )

    print(completion)

Chat template

The model uses the same chat template as Mistral instruct models.
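
The template text itself is not reproduced on this page. As a rough guide, the sketch below builds a prompt in the commonly documented Mistral instruct layout, with user turns wrapped in `[INST] ... [/INST]`. The function name `format_mistral_prompt` is made up for this example, and the exact whitespace and special tokens are assumptions; the authoritative version is whatever the tokenizer's `apply_chat_template` produces.

```python
def format_mistral_prompt(messages, bos="<s>", eos="</s>"):
    """Render chat messages in a Mistral-instruct-style layout (assumed, not verified).

    User turns become "[INST] text [/INST]"; assistant turns follow the
    closing tag and are terminated with the end-of-sequence token.
    """
    prompt = bos
    for message in messages:
        role, content = message["role"], message["content"]
        if role == "user":
            prompt += f"[INST] {content} [/INST]"
        elif role == "assistant":
            prompt += f" {content}{eos}"
        else:
            raise ValueError(f"unsupported role: {role!r}")
    return prompt


if __name__ == "__main__":
    print(format_mistral_prompt([{"role": "user", "content": "Hello, who are you?"}]))
```

A hand-built prompt like this is mainly useful with runtimes that take raw text. When the tokenizer is available, letting it apply its own template avoids mismatches.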

📂 GGUF File List

📁 Filename	⚙️ Quantization	📦 Size
Phi-Delthanar.IQ3_M.gguf	Q3	3.38 GB
Phi-Delthanar.IQ3_S.gguf	Q3	3.19 GB
Phi-Delthanar.IQ3_XS.gguf	Q3	3.05 GB
Phi-Delthanar.IQ4_NL.gguf	Q4	4.17 GB
Phi-Delthanar.IQ4_XS.gguf	Q4	3.96 GB
Phi-Delthanar.Q2_K.gguf	Q2	2.82 GB
Phi-Delthanar.Q3_K.gguf	Q3	3.67 GB
Phi-Delthanar.Q3_K_L.gguf	Q3	4.09 GB
Phi-Delthanar.Q3_K_M.gguf	Q3	3.67 GB
Phi-Delthanar.Q3_K_S.gguf	Q3	3.19 GB
Phi-Delthanar.Q4_0.gguf (Recommended)	Q4	4.14 GB
Phi-Delthanar.Q4_1.gguf	Q4	4.58 GB
Phi-Delthanar.Q4_K.gguf	Q4	4.51 GB
Phi-Delthanar.Q4_K_M.gguf	Q4	4.51 GB
Phi-Delthanar.Q4_K_S.gguf	Q4	4.18 GB
Phi-Delthanar.Q5_0.gguf	Q5	5.03 GB
Phi-Delthanar.Q5_1.gguf	Q5	5.48 GB
Phi-Delthanar.Q5_K.gguf	Q5	5.22 GB
Phi-Delthanar.Q5_K_M.gguf	Q5	5.22 GB
Phi-Delthanar.Q5_K_S.gguf	Q5	5.03 GB
Phi-Delthanar.Q6_K.gguf	Q6	5.98 GB
Phi-Delthanar.Q8_0.gguf	Q8	7.74 GB
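
The files listed above can be fetched programmatically and loaded with a GGUF runtime. The sketch below assumes the `huggingface_hub` and `llama-cpp-python` packages are installed and that your llama.cpp build supports this model's architecture. The helper names (`gguf_filename`, `load_gguf`, `chat_once`) and the `n_ctx=2048` context size are illustrative choices, not values taken from the model card.

```python
REPO_ID = "RichardErkhov/Replete-AI_-_Phi-Delthanar-gguf"


def gguf_filename(quant):
    """Build the repository file name for a quant such as "Q4_0" or "Q5_K_M"."""
    return f"Phi-Delthanar.{quant}.gguf"


def load_gguf(quant="Q4_0", n_ctx=2048):
    """Download one GGUF file (cached after the first call) and load it."""
    # Imported lazily so gguf_filename() works without these packages installed.
    from huggingface_hub import hf_hub_download
    from llama_cpp import Llama

    model_path = hf_hub_download(repo_id=REPO_ID, filename=gguf_filename(quant))
    # n_ctx is an assumed context size; adjust it to your memory budget.
    return Llama(model_path=model_path, n_ctx=n_ctx)


def chat_once(llm, prompt, max_tokens=128):
    """Send a single user message and return the reply text."""
    reply = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=max_tokens,
    )
    return reply["choices"][0]["message"]["content"]
```

Calling `chat_once(load_gguf(), "Hello, who are you?")` downloads roughly 4 GB on first use for the default `Q4_0` file. Which chat format `create_chat_completion` applies depends on the metadata embedded in the GGUF file, so it is worth checking the output against the Mistral-style template the model expects.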

📊 Model Information

🆔 Model ID: RichardErkhov/Replete-AI_-_Phi-Delthanar-gguf

📅 Created: 2 years ago

🔄 Last Updated: 2 years ago

📥 Downloads: 3.2K

❤️ Likes: 0

🎯 Difficulty: Advanced

⚙️ Quantization: Q3, Q4, Q2, Q5, Q6, Q8

🏷️ Tags

gguf, endpoints_compatible, region:us, conversational
