# Model Description
```yaml
library_name: transformers
base_model: Qwen/Qwen2.5-14B
tags:
- axolotl
- generated_from_trainer
model-index:
- name: medius-erebus-magnum-14b
  results: []
```
## QuantFactory/medius-erebus-magnum-14b-GGUF
This is a quantized version of [underwoods/medius-erebus-magnum-14b](https://huggingface.co/underwoods/medius-erebus-magnum-14b), created using llama.cpp.
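As a usage sketch (not part of the original card), the snippet below pulls the recommended Q4_0 file from this repo and runs a single chat turn through the `llama-cpp-python` bindings. The context size, prompt, and token limit are illustrative assumptions; the filename comes from the file list at the end of this card.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Fetch the recommended Q4_0 quant from this repo (see the GGUF file list below).
model_path = hf_hub_download(
    repo_id="QuantFactory/medius-erebus-magnum-14b-GGUF",
    filename="medius-erebus-magnum-14b.Q4_0.gguf",
)

# The model was trained with the ChatML template, so let llama-cpp-python
# format chat messages accordingly. n_ctx=8192 is an illustrative choice;
# the model itself was trained at a 32768-token sequence length.
llm = Llama(model_path=model_path, n_ctx=8192, chat_format="chatml")

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are an assistant that responds to the user."},
        {"role": "user", "content": "Write a short scene set in a rain-soaked city."},
    ],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```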
## Original Model Card

axolotl version: `0.4.1`

```yaml
base_model: /workspace/medius-erebus
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

hub_model_id: magnum-erebus-14b-v1
hub_strategy: "all_checkpoints"
push_dataset_to_hub:
hf_use_auth_token: true

plugins:
  - axolotl.integrations.liger.LigerPlugin
liger_rope: true
liger_rms_norm: true
liger_swiglu: true
liger_fused_linear_cross_entropy: true

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: anthracite-core/c2_logs_32k_llama3_qwen2_v1.2
    type: sharegpt
  - path: anthracite-org/kalo-opus-instruct-22k-no-refusal
    type: sharegpt
  - path: lodrick-the-lafted/kalo-opus-instruct-3k-filtered
    type: sharegpt
  - path: anthracite-org/nopm_claude_writing_fixed
    type: sharegpt
  - path: anthracite-org/kalo_opus_misc_240827
    type: sharegpt
  - path: anthracite-org/kalo_misc_part2
    type: sharegpt
chat_template: chatml
shuffle_merged_datasets: true
default_system_message: "You are an assistant that responds to the user."
dataset_prepared_path: /workspace/data/magnum-14b-data
val_set_size: 0.0
output_dir: /workspace/data/magnum-erebus-14b-fft

sequence_len: 32768
sample_packing: true
pad_to_sequence_len: true

adapter:
lora_model_dir:
lora_r:
lora_alpha:
lora_dropout:
lora_target_linear:
lora_fan_in_fan_out:

wandb_project: 14b-magnum-fft
wandb_entity:
wandb_watch:
wandb_name: v4-r2-erebus-attempt-1
wandb_log_model:

gradient_accumulation_steps: 1
micro_batch_size: 2
num_epochs: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.000008

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: false

gradient_checkpointing: unsloth
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 40
evals_per_epoch:
eval_table_size:
eval_max_new_tokens:
saves_per_epoch: 2
debug:
deepspeed: deepspeed_configs/zero3_bf16.json
weight_decay: 0.1
fsdp:
fsdp_config:
special_tokens:
```
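The config above sets `chat_template: chatml` together with a default system message. As an illustration (not from the original card), the small sketch below builds the string a single-turn prompt renders to under that template; the user text is made up.

```python
# Minimal sketch of the ChatML layout named in the config above.
# `system` is the config's default_system_message; `user` is a placeholder.
system = "You are an assistant that responds to the user."
user = "Hello!"
prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{user}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)
print(prompt)
```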
### medius-erebus-magnum

### Training procedure

#### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 8e-06
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 40
- num_epochs: 2
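For reference, the effective batch size follows directly from the per-device settings in the config above: total_train_batch_size = train_batch_size (2) × num_devices (8) × gradient_accumulation_steps (1) = 16.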
#### Training results

#### Framework versions
- Transformers 4.45.1
- Pytorch 2.3.1+cu121
- Datasets 2.21.0
- Tokenizers 0.20.0
# GGUF File List
| Filename | Quant | Size |
|---|---|---|
| medius-erebus-magnum-14b.Q2_K.gguf | Q2 | 5.37 GB |
| medius-erebus-magnum-14b.Q3_K_L.gguf | Q3 | 7.38 GB |
| medius-erebus-magnum-14b.Q3_K_M.gguf | Q3 | 6.83 GB |
| medius-erebus-magnum-14b.Q3_K_S.gguf | Q3 | 6.2 GB |
| medius-erebus-magnum-14b.Q4_0.gguf **(Recommended)** | Q4 | 7.93 GB |
| medius-erebus-magnum-14b.Q4_0_4_4.gguf | Q4 | 7.93 GB |
| medius-erebus-magnum-14b.Q4_0_4_8.gguf | Q4 | 7.93 GB |
| medius-erebus-magnum-14b.Q4_0_8_8.gguf | Q4 | 7.93 GB |
| medius-erebus-magnum-14b.Q4_1.gguf | Q4 | 8.74 GB |
| medius-erebus-magnum-14b.Q4_K_M.gguf | Q4 | 8.37 GB |
| medius-erebus-magnum-14b.Q4_K_S.gguf | Q4 | 7.98 GB |
| medius-erebus-magnum-14b.Q5_0.gguf | Q5 | 9.56 GB |
| medius-erebus-magnum-14b.Q5_1.gguf | Q5 | 10.37 GB |
| medius-erebus-magnum-14b.Q5_K_M.gguf | Q5 | 9.78 GB |
| medius-erebus-magnum-14b.Q5_K_S.gguf | Q5 | 9.56 GB |
| medius-erebus-magnum-14b.Q6_K.gguf | Q6 | 11.29 GB |
| medius-erebus-magnum-14b.Q8_0.gguf | Q8 | 14.62 GB |