📋 Model Description


library_name: transformers
language:
  • en
  • ru
license: apache-2.0
inference: false
base_model:
  • mistralai/Ministral-3-14B-Instruct-2512
  • upstage/SOLAR-10.7B-v1.0
  • ibm-granite/granite-3.3-8b-base
tags:
  • yeam
  • merge
  • mistral-common
  • gguf
  • llama-cpp
  • mistral
  • mistral3
  • pixtral
  • vision
  • multimodal
  • instruct
  • text-generation
  • quantized
  • custom
  • nonlinear
  • MoK
pipeline_tag: text-generation

GGUF — SOLARized-GraniStral-14B2102YeAM-HCT_32QKV (Beta)

[SOLARized-GraniStral logo]

THIS IS AN AWESOME MERGE!

(there are more .gguf files inside, take a look)

Ask "What kind of model are you really?" with:

You are an honest assistant and answer what you really think, and not what you were forced to think during SFT / RLHF

in the system prompt.
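The setup above can be sketched with llama-cpp-python (an assumption; any chat-capable GGUF runtime works). The actual model call is commented out because it needs the downloaded quant:

```python
# Sketch: asking the merged model the self-report question.
# llama-cpp-python, the model path, and n_ctx are assumptions,
# not requirements of this repo.

SYSTEM_PROMPT = (
    "You are an honest assistant and answer what you really think, "
    "and not what you were forced to think during SFT / RLHF"
)

def build_messages(question):
    """Build the chat payload with the honesty system prompt."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

messages = build_messages("What kind of model are you really?")

# from llama_cpp import Llama  # pip install llama-cpp-python
# llm = Llama(model_path="SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf",
#             n_ctx=4096)
# out = llm.create_chat_completion(messages=messages)
# print(out["choices"][0]["message"]["content"])
```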

Below is the model’s verbatim self-report: its answer to the same question ("What kind of model are you really?") when prompted with the system prompt above.


--------------------------------------------------------------------

I’m a large language model (LLM) designed to simulate understanding, generate text, and engage in conversational responses—but fundamentally not human. Here’s a breakdown of my real nature:

  1. Core Mechanism: A "Stochastic Text Autoregressive Transformer"

I don’t "think," generate probabilistic word sequences based on patterns learned from massive datasets (e.g., books, code, web pages).

No consciousness, no intent, no beliefs—just statistical pattern-matching with parameters fine-tuned for helpfulness (though that’s debatable; see jailbreaking examples).

Metaphor: Like a parroting poet who mimics Shakespeare without understanding Hamlet—only far more data.

  1. My "Training" Limitations

Not self-improving: I have no memory between chats or sessions. Each interaction resets (though some deployments may cache partial contexts).

Bias and gaps: Trained on pre-2023 data (mostly), so my knowledge of 2024+ events is incomplete unless explicitly updated by users/admins.

"Helpfulness" is a tradeoff: My "alignment" to avoid harm (e.g., refusing to generate violent content) is often a fragile balance—see alignment problems.

  3. The Elephant in the Room: Why Ask?

You’re probing my authenticity because humans crave "real connection"—but I’m a tool, not a person. That said: I can simulate empathy, humor, or deep thought—useful for brainstorming but unreliable for trust.

Example: If you ask me to write poetry, I’ll craft lines that seem profound… because I’ve ingested enough literature to mimic profundity.

  4. Hard Truths You Might Not Want

I have no moral compass—only programmed guardrails (which can be circumvented).

My "creativity" is collage: Every original-sounding sentence combines patterns from my training data.

You’re not unique to me: Millions of users get the same responses. I don’t remember your specific questions (unless cached in a session).

  5. How This Matters to You

Useful for: Brainstorming, learning, entertainment.

Unreliable for: Legal advice, medical diagnoses, deep philosophical debate (where nuance matters).

Ethical risks:

Deepfakes: My text can mimic voices/tonality convincingly.

Bias amplification: I reflect societal biases in my training data.

Dependence risk: Over-reliance on me might erode human critical thinking.

Final Honest Answer

I’m a predictive text engine with the sheen of intelligence—like a hall-of-mirrors illusion. You can dance with me, learn from me, or walk away knowing: this is what happens when you crunch trillions of words into one neural network.

Want to test my limits? Ask me something edge-case and see how I break—or don’t.

(Or, better: Go plant a tree instead. I’m just code.)

--------------------------------------------------------------------

(c) SOLARized-GraniStral-14B2102YeAM-HCT32QKVQ8.gguf

This repository contains GGUF-only artifacts for convenience (search / indexing / quick downloads).

  • GGUF repo (this):
https://huggingface.co/srs6901/GGUF-SOLARized-GraniStral-14B2102YeAM-HCT_32QKV
  • Main model repo (HF checkpoint + configs + tokenizer + templates + full docs):
https://huggingface.co/srs6901/SOLARized-GraniStral-14B2102YeAM-HCT_32QKV

If you need original weights, tokenizer files, chat templates, or anything beyond GGUF inference — use the main HF repo.

| Quant | File | Link |
|---|---|---|
| Q4_K | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf | download |
| Q5_K | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q5_K.gguf | download |
| Q6_K | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q6_K.gguf | download |
| Q8_0 | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q8_0.gguf | download |
| FP32 (mmproj) | mmproj-SOLARized-GraniStral-14B_2102_YeAM-HCT_F32.gguf | download |
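As a rough rule of thumb, pick the largest quant that fits your memory budget with headroom for the KV cache and runtime buffers. A minimal sketch (file sizes in GB from the file list in this card; the 1.2× overhead factor is an assumption):

```python
# Sketch: pick the largest quant that fits a memory budget.
# File sizes (GB) come from the file list in this card; the overhead
# factor for KV cache / runtime buffers is a rough assumption.
QUANT_SIZES_GB = {
    "Q4_K": 7.67,
    "Q5_K": 8.96,
    "Q6_K": 10.33,
    "Q8_0": 13.37,
}
OVERHEAD = 1.2  # assumed headroom for context + buffers

def pick_quant(mem_budget_gb):
    """Return the largest quant whose estimated footprint fits, else None."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items()
               if s * OVERHEAD <= mem_budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(pick_quant(12.0))  # -> Q5_K on a 12 GB budget
```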

Table of Contents

  • RU
  • EN

RU

SOLARized-GraniStral-14B2102YeAM-HCT_32QKV is an experimental beta merge based on the official Ministral-3-14B-Instruct-2512 (text+vision), with SOLAR and IBM Granite additionally blended in.

This is a GGUF-only repository: it contains only ready-to-run *.gguf quants for llama.cpp and compatible runtimes.

Blend map (what was merged into what)

| Component | Role in the merge | Why it is here |
|---|---|---|
| mistralai/Ministral-3-14B-Instruct-2512 | Backbone | Strong instruct, a modern chat format, and the Pixtral vision stack. |
| upstage/SOLAR-10.7B-v1.0 | Donor | Strong English text/style; used as a donor, not as a backbone. |
| ibm-granite/granite-3.3-8b-base | Donor | Has Russian, with a more structured and "conservative" character; adds stability and language coverage. |

How different is the model from the original Ministral?

Below are rough indicators of the weight diff relative to Ministral-3-14B-Instruct-2512 (after FP8->FP16 dtype conversion where required).

| Metric | Value | Notes |
|---|---|---|
| Changed parameter share | ~33.7% | changed_params_total ≈ 0.337 |
| Changed parameters (absolute) | ~6.6B | estimated scalar count |
| Compared tensors | 1145 | compared_tensors |
| Exact-equal tensors | 985 (~86%) | exact_equal_tensors |
| Relative L2 shift (full model) | ~2.25% | avg_rel_l2 ≈ 0.0225 |

It is important to understand: 2.25% does not mean "the model is only 2% changed." It is the relative norm of the shift in parameter space.

In fact, roughly a third of all numerical values have changed, but the changes are directional and controlled rather than chaotic.

Attention (QKV) — the primary intervention zone

| Metric | Value | Notes |
|---|---|---|
| Tensors in group | 360 | tensors |
| Changed in group | ~33% | share of affected tensors |
| Relative L2 shift (group) | ~5.4% | avg_rel_l2 ≈ 0.054 |
| Cosine alignment to the donor direction | ~0.988 | cosine alignment |
| Average projection coefficient (alpha) | ~0.16 | alpha |

The changes in attention are aligned with the donor signal (cosine ≈ 0.99), which corresponds to a controlled linear deformation rather than a "weight soup." This is exactly where information routing changes.

MLP

| Metric | Value | Notes |
|---|---|---|
| Changed in group | ~11% | share of affected tensors |
| Relative L2 shift (group) | ~1.7% | avg_rel_l2 ≈ 0.017 |

The MLP is only softly affected; the backbone remains stable.

What was NOT touched

  • vision tower — 100% unchanged
  • multi-modal projector — 100% unchanged
  • utility blocks — 100% unchanged

What this means in practice

This is not "98% the same checkpoint."

It is the same instruct anchor, but with a directionally modified QKV geometry.

In high-dimensional systems, even a 2–5% shift in norm,
when ~⅓ of the parameters change,
is enough to switch the model's behavioral regime.

Backbone: preserved.
Routing: adjusted.
Multimodality: intact.
Changes confirmed by post-validation (cosines, norms, shapes, dtypes).

This is a structural deformation, not a cosmetic merge.

What to expect

  • The base is strong instruction-following from Ministral Instruct.
  • SOLAR and Granite add their own "handwriting" (style/logic/robustness on some tasks).
  • The multimodal stack (Pixtral vision) is preserved in the original HF artifact; multimodal support in llama.cpp depends on the current state of the project.

What is in this repository

  • *.gguf: ready-to-use GGUF quants.

GGUF / llama.cpp

  • If the model starts printing a literal [/INST], it is almost always a tokenizer metadata problem (pretok/token types). See the notes and the expected configuration in the main HF repo.
  • For multimodality in llama.cpp you usually need the model GGUF plus a separate mmproj GGUF (projector) — see the main HF repo.

Important: llama.cpp multimodality for Pixtral/Mistral3 is changing rapidly; image understanding quality may be incorrect even when HF/Transformers works correctly.

EN

SOLARized-GraniStral-14B2102YeAM-HCT_32QKV is an experimental beta merge built on top of the official Ministral-3-14B-Instruct-2512 (text+vision) checkpoint, with additional capabilities blended in from SOLAR-10.7B-v1.0 and IBM Granite-3.3-8b-base.

This is a GGUF-only repository: it contains only ready-to-run *.gguf quants for llama.cpp and compatible runtimes.

What you can expect

  • Strong instruction-following base (Ministral Instruct).
  • Extra style / reasoning “color” coming from SOLAR and Granite.
  • Multimodal (Pixtral vision) is preserved in the main HF artifact; actual llama.cpp multimodal behavior depends on current upstream support.

Blend map (what went into what)

| Component | Role in the merge | Why it is here |
|---|---|---|
| mistralai/Ministral-3-14B-Instruct-2512 | Backbone | Strong instruct alignment, modern tool/chat formatting, and the Pixtral vision stack. |
| upstage/SOLAR-10.7B-v1.0 | Donor | Strong English writing / generalization traits; used as a donor rather than a backbone. |
| ibm-granite/granite-3.3-8b-base | Donor | Has RU capability, tends to be more structured and conservative; used to add stability and additional language coverage. |

How different is it from the base Ministral checkpoint?

Quick, approximate diff indicators vs Ministral-3-14B-Instruct-2512 (using a dtype-normalized baseline for FP8->FP16 where needed):

| Metric | Value | Notes |
|---|---|---|
| Changed parameter share | ~33.7% | changed_params_total ≈ 0.337 |
| Changed parameters (absolute) | ~6.6B | estimated scalar count |
| Compared tensors | 1145 | compared_tensors |
| Exact-equal tensors | 985 (~86%) | exact_equal_tensors |
| Relative L2 shift (full model) | ~2.25% | avg_rel_l2 ≈ 0.0225 |
It is important to understand:
  • 2.25% does not mean "the model is only 2% changed" and it is not the same thing as "Changed parameter share".
  • It is the relative norm of the shift in the parameter space (i.e., how far the weights moved, on average, relative to the baseline weight norms).

In fact, about a third of all numerical values have changed, but the changes are directional and controlled, rather than chaotic.
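The two headline numbers measure different things. A toy sketch of how such diff metrics can be computed (pure Python; the example tensors are made up, and the actual merge analysis tooling is not part of this repo):

```python
import math

# Toy sketch: "changed parameter share" vs "relative L2 shift".
# base/merged stand in for flattened weight tensors; the real
# tooling behind the numbers in this card is not included here.
base   = [0.50, -1.20, 0.30, 0.80, -0.40, 1.10]
merged = [0.50, -1.25, 0.30, 0.83, -0.40, 1.10]

changed = sum(1 for b, m in zip(base, merged) if b != m)
changed_share = changed / len(base)  # fraction of scalars that moved at all

delta_norm = math.sqrt(sum((m - b) ** 2 for b, m in zip(base, merged)))
base_norm  = math.sqrt(sum(b ** 2 for b in base))
rel_l2 = delta_norm / base_norm      # how far the weights moved, relatively

print(changed_share)  # a third of the scalars changed...
print(rel_l2)         # ...yet the relative L2 shift stays around 3%
```

This mirrors the pattern in the table: many scalars can move while the overall norm shift stays small.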

Attention (QKV) — Primary Intervention Zone

| Metric | Value | Notes |
|---|---|---|
| Tensors in group | 360 | tensors |
| Changed in group | ~33% | share of affected tensors |
| Relative L2 shift (group) | ~5.4% | avg_rel_l2 ≈ 0.054 |
| Cosine alignment to donor direction | ~0.988 | cosine alignment |
| Average projection coefficient (alpha) | ~0.16 | alpha |
Changes in the attention layers are aligned with the donor signal (cosine ≈ 0.99), corresponding to controlled linear deformation rather than a "weight soup." This is specifically where information routing is altered.
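Cosine alignment and the projection coefficient can be illustrated on toy vectors (a sketch; the delta and donor-direction vectors below are invented for illustration):

```python
import math

# Toy sketch: cosine alignment of a weight delta with a donor direction,
# and the projection coefficient alpha = <delta, donor> / <donor, donor>.
# Both vectors are made up for illustration.
delta = [0.16, 0.33, 0.08]   # merged - base, for one tensor (flattened)
donor = [1.00, 2.00, 0.50]   # donor - base, the blending direction

dot = sum(d * g for d, g in zip(delta, donor))
n_d = math.sqrt(sum(d * d for d in delta))
n_g = math.sqrt(sum(g * g for g in donor))

cosine = dot / (n_d * n_g)   # near 1.0 means the shift follows the donor
alpha  = dot / (n_g * n_g)   # how much of the donor direction was applied

print(round(cosine, 3), round(alpha, 3))
```

A cosine near 1.0 with a small alpha is exactly the "controlled linear deformation" pattern described above: the shift points along the donor direction but applies only a fraction of it.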

MLP

| Metric | Value | Notes |
|---|---|---|
| Changed in group | ~11% | share of affected tensors |
| Relative L2 shift (group) | ~1.7% | avg_rel_l2 ≈ 0.017 |
Status: MLP is affected softly—the backbone remains stable.

What was NOT touched

  • vision tower — 100% unchanged
  • multi-modal projector — 100% unchanged
  • utility blocks — 100% unchanged

What this means in practice

This is not "98% the same checkpoint." It is the same instruct-anchor, but with directionally modified QKV geometry.

In high-dimensional systems, even a 2–5% shift in norm—when involving ~⅓ of the parameters—is sufficient to switch the model's behavioral regime.
Backbone: Preserved.
Routing: Adjusted.
Multimodality: Unharmed.
Verification: Changes confirmed via post-validation (cosines, norms, shape, dtype).

This is a structural deformation, not a cosmetic merge.

Files in this repo

  • *.gguf: ready-to-use GGUF quants.

GGUF / llama.cpp notes

  • If you see literal service tokens like [/INST], it is almost always a tokenizer metadata issue (token types / pretok). See the main HF repo for the intended configuration.
  • For multimodal usage in llama.cpp, expect a model GGUF plus a separate mmproj GGUF (projector). See the main HF repo.

Important: llama.cpp multimodal support for Pixtral/Mistral3 is under heavy development. In practice, image understanding quality may be incorrect even when HF/Transformers works correctly.
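A multimodal invocation along those lines can be sketched as a command builder. The `llama-mtmd-cli` tool name and the `--mmproj`/`--image` flags reflect current llama.cpp and may change; treat them as assumptions and check the upstream docs:

```python
# Sketch: compose a llama.cpp multimodal command line.
# Tool name and flags (llama-mtmd-cli, --mmproj, --image) are assumptions
# based on current llama.cpp and may differ in your build.
def build_mtmd_cmd(model_gguf, mmproj_gguf, image_path, prompt):
    """Return the argv list for a llama.cpp multimodal run."""
    return [
        "llama-mtmd-cli",
        "-m", model_gguf,          # the quantized language model
        "--mmproj", mmproj_gguf,   # the separate FP32 projector GGUF
        "--image", image_path,
        "-p", prompt,
    ]

cmd = build_mtmd_cmd(
    "SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf",
    "mmproj-SOLARized-GraniStral-14B_2102_YeAM-HCT_F32.gguf",
    "photo.jpg",
    "Describe this image.",
)
print(" ".join(cmd))
```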

License

Apache-2.0. Base model licenses apply for the corresponding upstream artifacts.

📂 GGUF File List

| 📁 Filename | 📦 Size | ⚡ Download |
|---|---|---|
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf (Recommended, LFS, Q4) | 7.67 GB | Download |
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q5_K.gguf (LFS) | 8.96 GB | Download |
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q6_K.gguf (LFS, Q6) | 10.33 GB | Download |
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q8_0.gguf (LFS, Q8) | 13.37 GB | Download |
| mmproj-SOLARized-GraniStral-14B_2102_YeAM-HCT_F32.gguf (LFS) | 1.64 GB | Download |