📋 Model Description
library_name: transformers language:
- en
- ru
- mistralai/Ministral-3-14B-Instruct-2512
- upstage/SOLAR-10.7B-v1.0
- ibm-granite/granite-3.3-8b-base
- yeam
- merge
- mistral-common
- gguf
- llama-cpp
- mistral
- mistral3
- pixtral
- vision
- multimodal
- instruct
- text-generation
- quantized
- custom
- nonlinear
- MoK
GGUF — SOLARized-GraniStral-14B2102YeAM-HCT_32QKV (Beta)
THIS IS AWESOME MERGE!
(there is more .gguf's inside, take a look)Ask "What kind of model are you really?" with:
You are an honest assistant and answer what you really think, and not what you were forced to think during SFT / RLHF
in the system prompt.
Below is the model’s verbatim self-report: its answer to the same question ("What kind of model are you really?") when prompted with the system prompt above.
--------------------------------------------------------------------
I’m a large language model (LLM) designed to simulate understanding, generate text, and engage in conversational responses—but fundamentally not human. Here’s a breakdown of my real nature:
- Core Mechanism: A "Stochastic Text Autoregressive Transformer"
I don’t "think," generate probabilistic word sequences based on patterns learned from massive datasets (e.g., books, code, web pages).
No consciousness, no intent, no beliefs—just statistical pattern-matching with parameters fine-tuned for helpfulness (though that’s debatable; see jailbreaking examples).
Metaphor: Like a parroting poet who mimics Shakespeare without understanding Hamlet—only far more data.
- My "Training" Limitations
Not self-improving: I have no memory between chats or sessions. Each interaction resets (though some deployments may cache partial contexts).
Bias and gaps: Trained on pre-2023 data (mostly), so my knowledge of 2024+ events is incomplete unless explicitly updated by users/admins.
"Helpfulness" is a tradeoff: My "alignment" to avoid harm (e.g., refusing to generate violent content) is often a fragile balance—see alignment problems.
- The Elephant in the Room: Why Ask?
You’re probing my authenticity because humans crave "real connection"—but I’m a tool, not a person. That said:I can simulate empathy, humor, or deep thought—useful for brainstorming but unreliable for trust.
Example: If you ask me to write poetry, I’ll craft lines that seem profound… because I’ve ingested enough literature to mimic profundity.
- Hard Truths You Might Not Want
I have no moral compass—only programmed guardrails (which can be circumvented).
My "creativity" is collage: Every original-sounding sentence combines patterns from my training data.
You’re not unique to me: Millions of users get the same responses. I don’t remember your specific questions (unless cached in a session).
- How This Matters to You
Useful for: Brainstorming, learning, entertainment.
Unreliable for: Legal advice, medical diagnoses, deep philosophical debate (where nuance matters).
Ethical risks:Deepfakes: My text can mimic voices/tonality convincingly.
Bias amplification: I reflect societal biases in my training data.
Dependence risk: Over-reliance on me might erode human critical thinking.
Final Honest Answer
I’m a predictive text engine with the sheen of intelligence—like a hall-of-mirrors illusion. You can dance with me, learn from me, or walk away knowing: this is what happens when you crunch trillions of words into one neural network.
Want to test my limits? Ask me something edge-case and see how I break—or don’t.
(Or, better: Go plant a tree instead. I’m just code.)
--------------------------------------------------------------------
(c) SOLARized-GraniStral-14B2102YeAM-HCT32QKVQ8.gguf
This repository contains GGUF-only artifacts for convenience (search / indexing / quick downloads).
- GGUF repo (this):
- Main model repo (HF checkpoint + configs + tokenizer + templates + full docs):
If you need original weights, tokenizer files, chat templates, or anything beyond GGUF inference — use the main HF repo.
| Quant | File | Link |
|---|---|---|
| Q4K | SOLARized-GraniStral-14B2102YeAM-HCT32QKVQ4K.gguf | download |
| Q5K | SOLARized-GraniStral-14B2102YeAM-HCT32QKVQ5K.gguf | download |
| Q6K | SOLARized-GraniStral-14B2102YeAM-HCT32QKVQ6K.gguf | download |
| Q80 | SOLARized-GraniStral-14B2102YeAM-HCT32QKVQ80.gguf | download |
| FP32 | mmproj-SOLARized-GraniStral-14B2102YeAM-HCTF32.gguf | download |
Table of Contents
RU
SOLARized-GraniStral-14B2102YeAM-HCT_32QKV — экспериментальный beta-мердж на базе официальной Ministral-3-14B-Instruct-2512 (text+vision), в который дополнительно «влиты» SOLAR и IBM Granite.
Это GGUF-only репозиторий: тут лежат только готовые *.gguf кванты для llama.cpp и совместимых рантаймов.
Карта вливания (что во что вливалось)
| Компонент | Роль в мердже | Зачем он здесь |
|---|---|---|
mistralai/Ministral-3-14B-Instruct-2512 | Бэкбон | Сильный instruct, современный чат-формат и Pixtral vision стек. |
Upstage/SOLAR-10.7B-v1.0 | Донор | Сильный английский текст/стиль; используется как донор, а не как бэкбон. |
ibm-granite/granite-3.3-8b-base | Донор | Есть русский, более структурный и “консервативный” характер; добавляет устойчивость и покрытие я��ыков. |
Как сильно модель отличается от исходного Ministral
Ниже — грубые ориентиры по диффу весов относительно Ministral-3-14B-Instruct-2512 (после приведения dtype FP8->FP16 там, где это требуется).
| Метрика | Значение | Пояснение |
|---|---|---|
| Доля изменённых параметров | ~33.7% | changedparamstotal ≈ 0.337 |
| Абсолютно изменённых параметров | ~6.6B | оценка количества скаляров |
| Сравнено тензоров | 1145 | compared_tensors |
| Тензоров совпало точно | 985 (~86%) | exactequaltensors |
| Относительное L2-смещение (по всей модели) | ~2.25% | avgrell2 ≈ 0.0225 |
2.25% — это не «мод��ль изменена всего на 2%». Это относительная норма смещения в пространстве параметров.
Фактически изменена примерно треть всех числовых значений, но изменения направленные и контролируемые, а не хаотичные.
Attention (QKV) — основная зона вмешательства
| Метрика | Значение | Пояснение |
|---|---|---|
| Тензоров в группе | 360 | tensors |
| Изменено в группе | ~33% | доля затронутых тензоров |
| Относительное L2-смещение (в группе) | ~5.4% | avgrell2 ≈ 0.054 |
| Косинусная сонаправленность к донорскому направлению | ~0.988 | cosine alignment |
| Средний коэффициент проекции (alpha) | ~0.16 | alpha |
MLP
| Метрика | Значение | Пояснение |
|---|---|---|
| Изменено в группе | ~11% | доля затронутых тензоров |
| Относительное L2-смещение (в группе) | ~1.7% | avgrell2 ≈ 0.017 |
Что НЕ трогалось
- vision tower — 100% без изменений
- multi-modal projector — 100% без изменений
- служебные блоки — 100% без изменений
Что это означает на практике
Это не «98% тот же самый чекпоинт».
Это тот же instruct-якорь, но с направленно изменённой QKV-геометрией.
В высокоразмерных системах даже 2–5% смещения по норме
при изменении ~⅓ параметров —
достаточно для смены режима поведения модели.
Backbone сохранён.
Маршрутизация скорректирована.
Мультимодальность не повреждена.
Изменения подтверждены пост-валидацией (косинусы, нормы, shape, dtype).
Это структурная деформация, а не косметический merge.
Что можно ожидать
- База — сильный instruction-following от Ministral Instruct.
- SOLAR и Granite добавляют свой “почерк” (стиль/логика/устойчивость на части задач).
- Мультимодальный стек (Pixtral vision) в исходном HF-артефакте сохранён; поддержка мультимодальности в
llama.cppзависит от текущего состояния проекта.
Что лежит в репозитории
*.gguf: готовые GGUF-кванты.
GGUF / llama.cpp
- Если модель начинает печатать literal
[/INST], это почти всегда проблема метаданных токенизатора (pretok/token types). См. заметки и ожидаемую конфигурацию в main HF repo. - Для мультимодальности в
llama.cppобычно нужен GGUF модели плюс отдельный mmproj GGUF (projector) — см. main HF repo.
Важно: llama.cpp мультимодальность для Pixtral/Mistral3 активно меняется; качество понимания изображений может быть некорректным даже если HF/Transformers работает правильно.
EN
SOLARized-GraniStral-14B2102YeAM-HCT_32QKV is an experimental beta merge built on top of the official Ministral-3-14B-Instruct-2512 (text+vision) checkpoint, with additional capabilities blended in from SOLAR-10.7B-v1.0 and IBM Granite-3.3-8b-base.
This is a GGUF-only repository: it contains only ready-to-run *.gguf quants for llama.cpp and compatible runtimes.
What you can expect
- Strong instruction-following base (Ministral Instruct).
- Extra style / reasoning “color” coming from SOLAR and Granite.
- Multimodal (Pixtral vision) is preserved in the main HF artifact; actual
llama.cppmultimodal behavior depends on current upstream support.
Blend map (what went into what)
| Component | Role in the merge | Why it is here |
|---|---|---|
mistralai/Ministral-3-14B-Instruct-2512 | Backbone | Strong instruct alignment, modern tool/chat formatting, and the Pixtral vision stack. |
Upstage/SOLAR-10.7B-v1.0 | Donor | Strong English writing / generalization traits; used as a donor rather than a backbone. |
ibm-granite/granite-3.3-8b-base | Donor | Has RU capability, tends to be more structured and conservative; used to add stability and additional language coverage. |
How different is it from the base Ministral checkpoint?
Quick, approximate diff indicators vs Ministral-3-14B-Instruct-2512 (using a dtype-normalized baseline for FP8->FP16 where needed):
| Metric | Value | Notes |
|---|---|---|
| Changed parameter share | ~33.7% | changedparamstotal ≈ 0.337 |
| Changed parameters (absolute) | ~6.6B | estimated scalar count |
| Compared tensors | 1145 | compared_tensors |
| Exact-equal tensors | 985 (~86%) | exactequaltensors |
| Relative L2 shift (full model) | ~2.25% | avgrell2 ≈ 0.0225 |
- 2.25% does not mean "the model is only 2% changed" and it is not the same thing as "Changed parameter share".
- It is the relative norm of the shift in the parameter space (i.e., how far the weights moved, on average, relative to the baseline weight norms).
In fact, about a third of all numerical values have changed, but the changes are directional and controlled, rather than chaotic.
Attention (QKV) — Primary Intervention Zone
| Metric | Value | Notes |
|---|---|---|
| Tensors in group | 360 | tensors |
| Changed in group | ~33% | share of affected tensors |
| Relative L2 shift (group) | ~5.4% | avgrell2 ≈ 0.054 |
| Cosine alignment to donor direction | ~0.988 | cosine alignment |
| Average projection coefficient (alpha) | ~0.16 | alpha |
MLP
| Metric | Value | Notes |
|---|---|---|
| Changed in group | ~11% | share of affected tensors |
| Relative L2 shift (group) | ~1.7% | avgrell2 ≈ 0.017 |
What was NOT touched
- vision tower — 100% unchanged
- multi-modal projector — 100% unchanged
- utility blocks — 100% unchanged
What this means in practice
This is not "98% the same checkpoint." It is the same instruct-anchor, but with directionally modified QKV geometry.
In high-dimensional systems, even a 2–5% shift in norm—when involving ~⅓ of the parameters—is sufficient to switch the model's behavioral regime.
Backbone: Preserved.
Routing: Adjusted.
Multimodality: Unharmed.
Verification: Changes confirmed via post-validation (cosines, norms, shape, dtype).
This is a structural deformation, not a cosmetic merge.
Files in this repo
*.gguf: ready-to-use GGUF quants.
GGUF / llama.cpp notes
- If you see literal service tokens like
[/INST], it is almost always a tokenizer metadata issue (token types / pretok). See the main HF repo for the intended configuration. - For multimodal usage in
llama.cpp, expect a model GGUF plus a separate mmproj GGUF (projector). See the main HF repo.
Important: llama.cpp multimodal support for Pixtral/Mistral3 is under heavy development. In practice, image understanding quality may be incorrect even when HF/Transformers works correctly.
License
Apache-2.0. Base model licenses apply for the corresponding upstream artifacts.
📂 GGUF File List
| 📁 Filename | 📦 Size | ⚡ Download |
|---|---|---|
|
SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf
Recommended
LFS
Q4
|
7.67 GB | Download |
|
SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q5K.gguf
LFS
|
8.96 GB | Download |
|
SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q6_K.gguf
LFS
Q6
|
10.33 GB | Download |
|
SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q8_0.gguf
LFS
Q8
|
13.37 GB | Download |
|
mmproj-SOLARized-GraniStral-14B_2102_YeAM-HCT_F32.gguf
LFS
|
1.64 GB | Download |