---
library_name: transformers
base_model: Qwen/Qwen2.5-14B
tags:
- axolotl
- generated_from_trainer
model-index:
- name: medius-erebus-magnum-14b
  results: []
---

## 📋 Model Description

# QuantFactory/medius-erebus-magnum-14b-GGUF

This is a quantized version of [underwoods/medius-erebus-magnum-14b](https://huggingface.co/underwoods/medius-erebus-magnum-14b), created using llama.cpp.
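Since these are standard GGUF files, any llama.cpp-based runtime can load them. Below is a minimal sketch using llama-cpp-python; the local path assumes the Q4_K_M file from the list further down has already been downloaded, and the context size and generation settings are illustrative rather than recommendations.

```python
# Minimal sketch: load a downloaded GGUF quant with llama-cpp-python.
# The model path is illustrative; any file from the list below works.
from llama_cpp import Llama

llm = Llama(
    model_path="./medius-erebus-magnum-14b.Q4_K_M.gguf",
    n_ctx=8192,       # the model was trained at 32k; smaller saves memory
    n_gpu_layers=-1,  # offload all layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant that responds to the user."},
        {"role": "user", "content": "Introduce yourself in one sentence."},
    ],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```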

## Original Model Card

Built with Axolotl

See axolotl config

axolotl version: `0.4.1`

```yaml
base_model: /workspace/medius-erebus
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

hub_model_id: magnum-erebus-14b-v1
hub_strategy: "all_checkpoints"
push_dataset_to_hub:
hf_use_auth_token: true

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_swiglu: true
liger_fused_linear_cross_entropy: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: anthracite-core/c2_logs_32k_llama3_qwen2_v1.2
    type: sharegpt
  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
    type: sharegpt
  - path: lodrick-the-lafted/kalo-opus-instruct-3k-filtered
    type: sharegpt
  - path: anthracite-org/nopm_claude_writing_fixed
    type: sharegpt
  - path: anthracite-org/kalo_opus_misc_240827
    type: sharegpt
  - path: anthracite-org/kalo_misc_part2
    type: sharegpt
chat_template: chatml
shuffle_merged_datasets: true
default_system_message: "You are an assistant that responds to the user."
dataset_prepared_path: /workspace/data/magnum-14b-data
val_set_size: 0.0
output_dir: /workspace/data/magnum-erebus-14b-fft

sequence_len: 32768
sample_packing: true
pad_to_sequence_len: true

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear:
lora_fan_in_fan_out:

wandb_project: 14b-magnum-fft
wandb_entity:
wandb_watch:
wandb_name: v4-r2-erebus-attempt-1
wandb_log_model:

gradient_accumulation_steps: 1
micro_batch_size: 2
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000008

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: unsloth
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 40
evals_per_epoch:
eval_table_size:
eval_max_new_tokens:
saves_per_epoch: 2
debug:
deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.1
fsdp:
fsdp_config:
special_tokens:
```
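The config above sets `chat_template: chatml`, so inference prompts should follow the ChatML turn format. A minimal sketch of assembling such a prompt by hand, reusing the `default_system_message` from the config (the user message is made up for illustration):

```python
# Build a ChatML-formatted prompt (chat_template: chatml in the config above).
def build_chatml_prompt(system: str, user: str) -> str:
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # generation continues from here
    )

prompt = build_chatml_prompt(
    "You are an assistant that responds to the user.",
    "Summarize the plot of Dracula in two sentences.",
)
print(prompt)
```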


# medius-erebus-magnum

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 8e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 16 (derived; see the sanity check below)
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 40
- num_epochs: 2
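The total batch size is not set directly in the config; it follows from the per-device micro batch size, the device count, and gradient accumulation. A quick check of the numbers above:

```python
# total_train_batch_size = micro_batch_size x num_devices x grad accumulation
micro_batch_size = 2              # micro_batch_size in the axolotl config
num_devices = 8                   # multi-GPU run
gradient_accumulation_steps = 1   # from the axolotl config

total_train_batch_size = micro_batch_size * num_devices * gradient_accumulation_steps
assert total_train_batch_size == 16
```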

### Training results

### Framework versions

- Transformers 4.45.1
- Pytorch 2.3.1+cu121
- Datasets 2.21.0
- Tokenizers 0.20.0

## 📂 GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
medius-erebus-magnum-14b.Q2_K.gguf
LFS Q2
5.37 GB Download
medius-erebus-magnum-14b.Q3_K_L.gguf
LFS Q3
7.38 GB Download
medius-erebus-magnum-14b.Q3_K_M.gguf
LFS Q3
6.83 GB Download
medius-erebus-magnum-14b.Q3_K_S.gguf
LFS Q3
6.2 GB Download
medius-erebus-magnum-14b.Q4_0.gguf
Recommended LFS Q4
7.93 GB Download
medius-erebus-magnum-14b.Q4_0_4_4.gguf
LFS Q4
7.93 GB Download
medius-erebus-magnum-14b.Q4_0_4_8.gguf
LFS Q4
7.93 GB Download
medius-erebus-magnum-14b.Q4_0_8_8.gguf
LFS Q4
7.93 GB Download
medius-erebus-magnum-14b.Q4_1.gguf
LFS Q4
8.74 GB Download
medius-erebus-magnum-14b.Q4_K_M.gguf
LFS Q4
8.37 GB Download
medius-erebus-magnum-14b.Q4_K_S.gguf
LFS Q4
7.98 GB Download
medius-erebus-magnum-14b.Q5_0.gguf
LFS Q5
9.56 GB Download
medius-erebus-magnum-14b.Q5_1.gguf
LFS Q5
10.37 GB Download
medius-erebus-magnum-14b.Q5_K_M.gguf
LFS Q5
9.78 GB Download
medius-erebus-magnum-14b.Q5_K_S.gguf
LFS Q5
9.56 GB Download
medius-erebus-magnum-14b.Q6_K.gguf
LFS Q6
11.29 GB Download
medius-erebus-magnum-14b.Q8_0.gguf
LFS Q8
14.62 GB Download
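To fetch a single quant instead of cloning the whole repo, `huggingface_hub` can download by filename. A short example picking the recommended Q4_0 file:

```python
# Download one GGUF file from the Hub into the local cache.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="QuantFactory/medius-erebus-magnum-14b-GGUF",
    filename="medius-erebus-magnum-14b.Q4_0.gguf",
)
print(path)  # local path to the downloaded file
```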