πŸ“‹ Model Description


```yaml
language:
- en
license: apache-2.0
tags:
- text-generation-inference
- transformers
- gemma
- gguf
- llama.cpp
base_model: pmking27/PrathameshLLM-2B
```

#### Uploaded model

- Developed by: pmking27
- License: apache-2.0
- Finetuned from model: pmking27/PrathameshLLM-2B

#### Provided Quant Files

| Name | Quant method | Bits | Size |
| ---- | ------------ | ---- | ---- |
| PrathameshLLM-2B.IQ3_M.gguf | IQ3_M | 3 | 1.31 GB |
| PrathameshLLM-2B.IQ3_S.gguf | IQ3_S | 3 | 1.29 GB |
| PrathameshLLM-2B.IQ3_XS.gguf | IQ3_XS | 3 | 1.24 GB |
| PrathameshLLM-2B.IQ4_NL.gguf | IQ4_NL | 4 | 1.56 GB |
| PrathameshLLM-2B.IQ4_XS.gguf | IQ4_XS | 4 | 1.5 GB |
| PrathameshLLM-2B.Q2_K.gguf | Q2_K | 2 | 1.16 GB |
| PrathameshLLM-2B.Q3_K_L.gguf | Q3_K_L | 3 | 1.47 GB |
| PrathameshLLM-2B.Q3_K_M.gguf | Q3_K_M | 3 | 1.38 GB |
| PrathameshLLM-2B.Q3_K_S.gguf | Q3_K_S | 3 | 1.29 GB |
| PrathameshLLM-2B.Q4_0.gguf (recommended) | Q4_0 | 4 | 1.55 GB |
| PrathameshLLM-2B.Q4_K_M.gguf | Q4_K_M | 4 | 1.63 GB |
| PrathameshLLM-2B.Q4_K_S.gguf | Q4_K_S | 4 | 1.56 GB |
| PrathameshLLM-2B.Q5_0.gguf | Q5_0 | 5 | 1.8 GB |
| PrathameshLLM-2B.Q5_K_M.gguf | Q5_K_M | 5 | 1.84 GB |
| PrathameshLLM-2B.Q5_K_S.gguf | Q5_K_S | 5 | 1.8 GB |
| PrathameshLLM-2B.Q6_K.gguf | Q6_K | 6 | 2.06 GB |
| PrathameshLLM-2B.Q8_0.gguf | Q8_0 | 8 | 2.67 GB |
#### First install the package

Run one of the following commands, according to your system:

Base llama-cpp-python with no GPU acceleration:

```shell
pip install llama-cpp-python
```

With NVidia CUDA acceleration:

```shell
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
```

Or with OpenBLAS acceleration:

```shell
CMAKE_ARGS="-DLLAMA_BLAS=ON -DLLAMA_BLAS_VENDOR=OpenBLAS" pip install llama-cpp-python
```

Or with CLBlast acceleration:

```shell
CMAKE_ARGS="-DLLAMA_CLBLAST=on" pip install llama-cpp-python
```

Or with AMD ROCm GPU acceleration (Linux only):

```shell
CMAKE_ARGS="-DLLAMA_HIPBLAS=on" pip install llama-cpp-python
```

Or with Metal GPU acceleration (macOS only):

```shell
CMAKE_ARGS="-DLLAMA_METAL=on" pip install llama-cpp-python
```

On Windows, set the `CMAKE_ARGS` variable in PowerShell before running pip; e.g. for NVidia CUDA:

```shell
$env:CMAKE_ARGS = "-DLLAMA_CUBLAS=on"
pip install llama-cpp-python
```
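After installing, you can confirm the package imports correctly. This is a minimal sanity check; it only inspects whether the `llama_cpp` module is importable, and assumes the package exposes a `__version__` attribute:

```python
import importlib.util

# Check whether llama-cpp-python was installed successfully
if importlib.util.find_spec("llama_cpp") is None:
    print("llama-cpp-python is not installed")
else:
    import llama_cpp
    print(f"llama-cpp-python version: {llama_cpp.__version__}")
```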

#### Model Download Script

```python
import os
from huggingface_hub import hf_hub_download

# Specify model details
model_repo_id = "pmking27/PrathameshLLM-2B-GGUF"  # Replace with the desired model repo
filename = "PrathameshLLM-2B.Q4_K_M.gguf"         # Replace with the specific GGUF filename
local_folder = "."                                # Replace with your desired local storage path

# Create the local directory if it doesn't exist
os.makedirs(local_folder, exist_ok=True)

# Download the model file to the specified local folder
filepath = hf_hub_download(repo_id=model_repo_id, filename=filename, cache_dir=local_folder)

print(f"GGUF model downloaded and saved to: {filepath}")
```

Replace `model_repo_id` and `filename` with the desired model repository ID and specific GGUF filename respectively, and set `local_folder` to where you want the downloaded model file stored.
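As a quick sanity check after downloading, you can verify the file really is a GGUF file: every GGUF file begins with the 4-byte magic `GGUF`. A minimal sketch (the `is_gguf` helper is illustrative, not part of any library):

```python
def is_gguf(path: str) -> bool:
    """Return True if the file starts with the GGUF magic bytes."""
    with open(path, "rb") as f:
        return f.read(4) == b"GGUF"

# Example: is_gguf(filepath) should return True for a valid download
```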

#### Simple llama-cpp-python inference example

```python
from llama_cpp import Llama

llm = Llama(
    model_path=filepath,  # Download the model file first
    n_ctx=32768,          # The max sequence length to use - longer sequences require much more resources
    n_threads=8,          # The number of CPU threads to use, tailor to your system
    n_gpu_layers=35       # The number of layers to offload to GPU, if GPU acceleration is available
)

# Alpaca prompt template
alpaca_prompt = """### Instruction:
{}

### Input:
{}

### Response:
{}"""

output = llm(
    alpaca_prompt.format(
        '''You're an assistant trained to answer questions using the given context.

context:

General elections will be held in India from 19 April 2024 to 1 June 2024 to elect the 543 members of the 18th Lok Sabha. The elections will be held in seven phases and the results will be announced on 4 June 2024. This will be the largest-ever election in the world, surpassing the 2019 Indian general election, and will be the longest-held general elections in India with a total span of 44 days (excluding the first 1951–52 Indian general election). The incumbent prime minister Narendra Modi, who completed a second term, will be contesting elections for a third consecutive term.

Approximately 960 million individuals out of a population of 1.4 billion are eligible to participate in the elections, which are expected to span a month for completion. The Legislative assembly elections in the states of Andhra Pradesh, Arunachal Pradesh, Odisha, and Sikkim will be held simultaneously with the general election, along with the by-elections for 35 seats among 16 states.
''',  # instruction
        "In how many phases will the general elections in India be held?",  # input
        "",  # output - leave this blank for generation!
    ),  # Alpaca prompt
    max_tokens=512,   # Generate up to 512 tokens
    stop=["<eos>"],   # Stop token
    echo=True         # Whether to echo the prompt
)

# Extracting the response text
output_text = output['choices'][0]['text']
start_marker = "### Response:"
end_marker = "<eos>"
start_pos = output_text.find(start_marker) + len(start_marker)
end_pos = output_text.find(end_marker, start_pos)
response_text = output_text[start_pos:end_pos].strip()

print(response_text)
```
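The marker-based extraction above breaks if the model omits the `<eos>` token or never emits the response marker. A slightly more defensive version (the `extract_response` helper is an illustrative sketch, not a llama-cpp-python API):

```python
def extract_response(output_text: str,
                     start_marker: str = "### Response:",
                     end_marker: str = "<eos>") -> str:
    """Pull the text between the response marker and the stop token."""
    start = output_text.find(start_marker)
    start = 0 if start == -1 else start + len(start_marker)
    end = output_text.find(end_marker, start)
    if end == -1:
        end = len(output_text)
    return output_text[start:end].strip()

# Mock model output used only to demonstrate the helper
mock = "### Instruction:\n...\n### Response:\nSeven phases.<eos>"
print(extract_response(mock))  # prints: Seven phases.
```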

#### Simple llama-cpp-python Chat Completion API example

```python
from llama_cpp import Llama

llm = Llama(model_path=filepath, chat_format="gemma")  # Set chat_format according to the model you are using
message = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a story writing assistant."},
        {"role": "user", "content": "Write a story about llamas."}
    ]
)
print(message['choices'][0]["message"]["content"])
```
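Under the hood, `chat_format="gemma"` renders the messages into Gemma's turn-based prompt template. Roughly, it behaves like the sketch below (a simplification: the real formatter also handles system messages and other edge cases, and `build_gemma_prompt` is an illustrative name, not a library function):

```python
def build_gemma_prompt(messages: list) -> str:
    """Render chat messages into Gemma's <start_of_turn> template (simplified)."""
    parts = []
    for m in messages:
        # Gemma uses the role name "model" for assistant turns
        role = "model" if m["role"] == "assistant" else "user"
        parts.append(f"<start_of_turn>{role}\n{m['content']}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")  # cue the model to respond
    return "".join(parts)

print(build_gemma_prompt([{"role": "user", "content": "Write a story about llamas."}]))
```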
