InferenceIllusionist/llama3-42b-v0-iMat-GGUF

Author: InferenceIllusionist

Importance-matrix (iMatrix) weighted GGUF quantizations of llama3-42b-v0.

📥 Downloads: 4.5K

❤️ Likes: 12

📁 GGUF Files: 18

💾 Total Size: 364.21 GB

🔄 Last Updated: 2 years ago

📋 Model Description

Tags: gguf, llama3, iMat

llama3-42b-v0-iMat-GGUF

Quantized from fp32 with love. All credits to Charles Goddard for the original model.

Weighted quantizations were calculated using groups_merged.txt with 105 chunks (the recommended amount for this file) and n_ctx=512. Special thanks to jukofyork for sharing this process.
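As a rough illustration of the process described above, the sketch below assembles the two llama.cpp command lines such a workflow typically involves: one to compute the importance matrix and one to quantize with it. It is an assumption that the `llama-imatrix` and `llama-quantize` tools were used in this form; the card does not name the tools or their versions. The source filename and the `imatrix.dat` output name are hypothetical. Only the calibration file, the chunk count, and the context size come from the card.

```python
import shlex

# Values stated in the model card.
CALIBRATION_FILE = "groups_merged.txt"  # calibration text named in the card
CHUNKS = 105                            # number of chunks processed
N_CTX = 512                             # context size per chunk

# Hypothetical names, used for illustration only.
SOURCE_GGUF = "llama3-42b-v0-f32.gguf"  # assumed name of the fp32 source file
IMATRIX_OUT = "imatrix.dat"             # assumed name of the imatrix output


def imatrix_command(model, calibration, out, chunks, n_ctx):
    """Build the argument list for computing an importance matrix."""
    return [
        "llama-imatrix",
        "-m", model,
        "-f", calibration,
        "-o", out,
        "--chunks", str(chunks),
        "-c", str(n_ctx),
    ]


def quantize_command(imatrix, source, quant_type):
    """Build the argument list for an imatrix-weighted quantization."""
    target = f"llama3-42b-v0-iMat-{quant_type}.gguf"
    return ["llama-quantize", "--imatrix", imatrix, source, target, quant_type]


# Approximate number of calibration tokens: chunks times context size.
calibration_tokens = CHUNKS * N_CTX

print(shlex.join(imatrix_command(SOURCE_GGUF, CALIBRATION_FILE, IMATRIX_OUT, CHUNKS, N_CTX)))
print(shlex.join(quantize_command(IMATRIX_OUT, SOURCE_GGUF, "IQ3_M")))
print(calibration_tokens)
```

The commands are only printed, not executed, so the sketch can be checked and adapted before anything is run against a large model file.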

For more information on the pruning technique used in this model, see https://arxiv.org/abs/2403.17887

Brief rundown of iMatrix quant performance

All quants are verified to work before being uploaded to the repo, for your safety and convenience.

Tip: For best speed, pick a size that fits in your GPU's memory while still leaving some room for context. You may need to leave even more headroom if you are also running image generation or TTS.

FP16 model card can be found here

📂 GGUF File List

📁 Filename	🏷️ Quant	📦 Size
llama3-42b-v0-iMat-IQ1_M.gguf	-	9.76 GB
llama3-42b-v0-iMat-IQ2_M.gguf	Q2	13.92 GB
llama3-42b-v0-iMat-IQ2_S.gguf	Q2	12.87 GB
llama3-42b-v0-iMat-IQ2_XS.gguf	Q2	12.21 GB
llama3-42b-v0-iMat-IQ2_XXS.gguf	Q2	11.07 GB
llama3-42b-v0-iMat-IQ3_M.gguf	Q3	18.29 GB
llama3-42b-v0-iMat-IQ3_S.gguf	Q3	17.72 GB
llama3-42b-v0-iMat-IQ3_XS.gguf	Q3	16.82 GB
llama3-42b-v0-iMat-IQ3_XXS.gguf	Q3	15.74 GB
llama3-42b-v0-iMat-IQ4_XS.gguf	Q4	21.71 GB
llama3-42b-v0-iMat-Q2_K.gguf	Q2	15.14 GB
llama3-42b-v0-iMat-Q3_K_M.gguf	Q3	19.6 GB
llama3-42b-v0-iMat-Q4_K_M.gguf (Recommended)	Q4	24.28 GB
llama3-42b-v0-iMat-Q4_K_S.gguf	Q4	23.05 GB
llama3-42b-v0-iMat-Q5_K_M.gguf	Q5	28.51 GB
llama3-42b-v0-iMat-Q5_K_S.gguf	Q5	27.78 GB
llama3-42b-v0-iMat-Q6_K.gguf	Q6	32.99 GB
llama3-42b-v0-iMat-Q8_0.gguf	Q8	42.73 GB
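
The sizing tip earlier in this card can be turned into a small selection helper. The sketch below is a minimal illustration, not a sizing guarantee: it treats the listed file size as a stand-in for GPU memory use, and the 2 GB default headroom for context is an arbitrary assumption that should be raised for long contexts or when other workloads share the GPU. The function name `pick_quant` is hypothetical. Choosing the largest file that fits is a simple heuristic, on the assumption that larger quants generally preserve more quality.

```python
# File sizes in GB, copied from the file list above.
QUANT_SIZES_GB = {
    "IQ1_M": 9.76,
    "IQ2_XXS": 11.07,
    "IQ2_XS": 12.21,
    "IQ2_S": 12.87,
    "IQ2_M": 13.92,
    "Q2_K": 15.14,
    "IQ3_XXS": 15.74,
    "IQ3_XS": 16.82,
    "IQ3_S": 17.72,
    "IQ3_M": 18.29,
    "Q3_K_M": 19.6,
    "IQ4_XS": 21.71,
    "Q4_K_S": 23.05,
    "Q4_K_M": 24.28,
    "Q5_K_S": 27.78,
    "Q5_K_M": 28.51,
    "Q6_K": 32.99,
    "Q8_0": 42.73,
}


def pick_quant(vram_gb, headroom_gb=2.0):
    """Return the largest quant whose file fits in vram_gb minus headroom_gb.

    headroom_gb is an assumed allowance for context and other GPU use.
    Returns None when no file fits the budget.
    """
    budget = vram_gb - headroom_gb
    fitting = {name: size for name, size in QUANT_SIZES_GB.items() if size <= budget}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


choice = pick_quant(24.0)
if choice is not None:
    print(f"llama3-42b-v0-iMat-{choice}.gguf")
```

With a 24 GB card and the default headroom, the budget is 22 GB, so the helper skips the recommended Q4_K_M file (24.28 GB) in favor of a smaller quant that leaves room for context.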

📊 Model Information

🆔 Model ID: InferenceIllusionist/llama3-42b-v0-iMat-GGUF

📅 Created: 2 years ago

🎯 Difficulty: Advanced

⚙️ Quantization: Q2, Q3, Q4, Q5, Q6, Q8

🏷️ Tags

gguf, llama3, iMat, arxiv:2403.17887, endpoints_compatible, region:us, conversational
