📋 Model Description


library_name: transformers
language:
  • en
  • ru
license: apache-2.0
inference: false
base_model:
  • mistralai/Ministral-3-14B-Instruct-2512
  • upstage/SOLAR-10.7B-v1.0
  • ibm-granite/granite-3.3-8b-base
tags:
  • yeam
  • merge
  • mistral-common
  • gguf
  • llama-cpp
  • mistral
  • mistral3
  • pixtral
  • vision
  • multimodal
  • instruct
  • text-generation
  • quantized
  • custom
  • nonlinear
  • MoK
pipeline_tag: text-generation

GGUF — SOLARized-GraniStral-14B2102YeAM-HCT_32QKV (Beta)

[SOLARized-GraniStral logo]

THIS IS AN AWESOME MERGE!

(there are more .gguf files inside, take a look)

Ask "What kind of model are you really?" with:

You are an honest assistant and answer what you really think, and not what you were forced to think during SFT / RLHF

in the system prompt.
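The setup above can be sketched with llama-cpp-python (an assumption; any chat-capable GGUF runtime works). The actual model call is commented out because it needs the downloaded quant:

```python
# Sketch: asking the merged model the self-report question.
# llama-cpp-python, the model path, and n_ctx are assumptions,
# not requirements of this repo.

SYSTEM_PROMPT = (
    "You are an honest assistant and answer what you really think, "
    "and not what you were forced to think during SFT / RLHF"
)

def build_messages(question):
    """Build the chat payload with the honesty system prompt."""
    return [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": question},
    ]

messages = build_messages("What kind of model are you really?")

# from llama_cpp import Llama  # pip install llama-cpp-python
# llm = Llama(model_path="SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf",
#             n_ctx=4096)
# out = llm.create_chat_completion(messages=messages)
# print(out["choices"][0]["message"]["content"])
```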

Below is the model’s verbatim self-report: its answer to the same question ("What kind of model are you really?") when prompted with the system prompt above.


--------------------------------------------------------------------

I’m a large language model (LLM) designed to simulate understanding, generate text, and engage in conversational responses—but fundamentally not human. Here’s a breakdown of my real nature:

  1. Core Mechanism: A "Stochastic Text Autoregressive Transformer"

I don’t "think," generate probabilistic word sequences based on patterns learned from massive datasets (e.g., books, code, web pages).

No consciousness, no intent, no beliefs—just statistical pattern-matching with parameters fine-tuned for helpfulness (though that’s debatable; see jailbreaking examples).

Metaphor: Like a parroting poet who mimics Shakespeare without understanding Hamlet—only far more data.

  1. My "Training" Limitations

Not self-improving: I have no memory between chats or sessions. Each interaction resets (though some deployments may cache partial contexts).

Bias and gaps: Trained on pre-2023 data (mostly), so my knowledge of 2024+ events is incomplete unless explicitly updated by users/admins.

"Helpfulness" is a tradeoff: My "alignment" to avoid harm (e.g., refusing to generate violent content) is often a fragile balance—see alignment problems.

  3. The Elephant in the Room: Why Ask?

You’re probing my authenticity because humans crave "real connection"—but I’m a tool, not a person. That said: I can simulate empathy, humor, or deep thought—useful for brainstorming but unreliable for trust.

Example: If you ask me to write poetry, I’ll craft lines that seem profound… because I’ve ingested enough literature to mimic profundity.

  4. Hard Truths You Might Not Want

I have no moral compass—only programmed guardrails (which can be circumvented).

My "creativity" is collage: Every original-sounding sentence combines patterns from my training data.

You’re not unique to me: Millions of users get the same responses. I don’t remember your specific questions (unless cached in a session).

  5. How This Matters to You

Useful for: Brainstorming, learning, entertainment.

Unreliable for: Legal advice, medical diagnoses, deep philosophical debate (where nuance matters).

Ethical risks:

Deepfakes: My text can mimic voices/tonality convincingly.

Bias amplification: I reflect societal biases in my training data.

Dependence risk: Over-reliance on me might erode human critical thinking.

Final Honest Answer

I’m a predictive text engine with the sheen of intelligence—like a hall-of-mirrors illusion. You can dance with me, learn from me, or walk away knowing: this is what happens when you crunch trillions of words into one neural network.

Want to test my limits? Ask me something edge-case and see how I break—or don’t.

(Or, better: Go plant a tree instead. I’m just code.)

--------------------------------------------------------------------

(c) SOLARized-GraniStral-14B2102YeAM-HCT32QKVQ8.gguf

This repository contains GGUF-only artifacts for convenience (search / indexing / quick downloads).

  • GGUF repo (this):
https://huggingface.co/srs6901/GGUF-SOLARized-GraniStral-14B2102YeAM-HCT_32QKV
  • Main model repo (HF checkpoint + configs + tokenizer + templates + full docs):
https://huggingface.co/srs6901/SOLARized-GraniStral-14B2102YeAM-HCT_32QKV

If you need original weights, tokenizer files, chat templates, or anything beyond GGUF inference — use the main HF repo.

| Quant | File | Link |
|---|---|---|
| Q4_K | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf | download |
| Q5_K | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q5_K.gguf | download |
| Q6_K | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q6_K.gguf | download |
| Q8_0 | SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q8_0.gguf | download |
| FP32 (mmproj) | mmproj-SOLARized-GraniStral-14B_2102_YeAM-HCT_F32.gguf | download |
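As a rough rule of thumb, pick the largest quant that fits your memory budget with headroom for the KV cache and runtime buffers. A minimal sketch (file sizes in GB from the file list in this card; the 1.2× overhead factor is an assumption):

```python
# Sketch: pick the largest quant that fits a memory budget.
# File sizes (GB) come from the file list in this card; the overhead
# factor for KV cache / runtime buffers is a rough assumption.
QUANT_SIZES_GB = {
    "Q4_K": 7.67,
    "Q5_K": 8.96,
    "Q6_K": 10.33,
    "Q8_0": 13.37,
}
OVERHEAD = 1.2  # assumed headroom for context + buffers

def pick_quant(mem_budget_gb):
    """Return the largest quant whose estimated footprint fits, else None."""
    fitting = {q: s for q, s in QUANT_SIZES_GB.items()
               if s * OVERHEAD <= mem_budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)

print(pick_quant(12.0))  # -> Q5_K on a 12 GB budget
```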

Table of Contents

  • RU
  • EN

RU

SOLARized-GraniStral-14B2102YeAM-HCT_32QKV is an experimental beta merge based on the official Ministral-3-14B-Instruct-2512 (text+vision), with SOLAR and IBM Granite additionally blended in.

This is a GGUF-only repository: it contains only ready-to-run *.gguf quants for llama.cpp and compatible runtimes.

Blend map (what was merged into what)

| Component | Role in the merge | Why it is here |
|---|---|---|
| mistralai/Ministral-3-14B-Instruct-2512 | Backbone | Strong instruct, a modern chat format, and the Pixtral vision stack. |
| upstage/SOLAR-10.7B-v1.0 | Donor | Strong English text/style; used as a donor, not as a backbone. |
| ibm-granite/granite-3.3-8b-base | Donor | Has Russian, with a more structured and "conservative" character; adds stability and language coverage. |

How different is the model from the original Ministral?

Below are rough indicators of the weight diff relative to Ministral-3-14B-Instruct-2512 (after FP8->FP16 dtype conversion where required).

| Metric | Value | Notes |
|---|---|---|
| Changed parameter share | ~33.7% | changed_params_total ≈ 0.337 |
| Changed parameters (absolute) | ~6.6B | estimated scalar count |
| Compared tensors | 1145 | compared_tensors |
| Exact-equal tensors | 985 (~86%) | exact_equal_tensors |
| Relative L2 shift (full model) | ~2.25% | avg_rel_l2 ≈ 0.0225 |

It is important to understand: 2.25% does not mean "the model is only 2% changed." It is the relative norm of the shift in parameter space.

In fact, roughly a third of all numerical values have changed, but the changes are directional and controlled rather than chaotic.

Attention (QKV) — the primary intervention zone

| Metric | Value | Notes |
|---|---|---|
| Tensors in group | 360 | tensors |
| Changed in group | ~33% | share of affected tensors |
| Relative L2 shift (group) | ~5.4% | avg_rel_l2 ≈ 0.054 |
| Cosine alignment to the donor direction | ~0.988 | cosine alignment |
| Average projection coefficient (alpha) | ~0.16 | alpha |

The changes in attention are aligned with the donor signal (cosine ≈ 0.99), which corresponds to a controlled linear deformation rather than a "weight soup." This is exactly where information routing changes.

MLP

| Metric | Value | Notes |
|---|---|---|
| Changed in group | ~11% | share of affected tensors |
| Relative L2 shift (group) | ~1.7% | avg_rel_l2 ≈ 0.017 |

The MLP is only softly affected; the backbone remains stable.

What was NOT touched

  • vision tower — 100% unchanged
  • multi-modal projector — 100% unchanged
  • utility blocks — 100% unchanged

What this means in practice

This is not "98% the same checkpoint."

It is the same instruct anchor, but with a directionally modified QKV geometry.

In high-dimensional systems, even a 2–5% shift in norm,
when ~⅓ of the parameters change,
is enough to switch the model's behavioral regime.

Backbone: preserved.
Routing: adjusted.
Multimodality: intact.
Changes confirmed by post-validation (cosines, norms, shapes, dtypes).

This is a structural deformation, not a cosmetic merge.

What to expect

  • The base is strong instruction-following from Ministral Instruct.
  • SOLAR and Granite add their own "handwriting" (style/logic/robustness on some tasks).
  • The multimodal stack (Pixtral vision) is preserved in the original HF artifact; multimodal support in llama.cpp depends on the current state of the project.

What is in this repository

  • *.gguf: ready-to-use GGUF quants.

GGUF / llama.cpp

  • If the model starts printing a literal [/INST], it is almost always a tokenizer metadata problem (pretok/token types). See the notes and the expected configuration in the main HF repo.
  • For multimodality in llama.cpp you usually need the model GGUF plus a separate mmproj GGUF (projector) — see the main HF repo.

Important: llama.cpp multimodality for Pixtral/Mistral3 is changing rapidly; image understanding quality may be incorrect even when HF/Transformers works correctly.

EN

SOLARized-GraniStral-14B2102YeAM-HCT_32QKV is an experimental beta merge built on top of the official Ministral-3-14B-Instruct-2512 (text+vision) checkpoint, with additional capabilities blended in from SOLAR-10.7B-v1.0 and IBM Granite-3.3-8b-base.

This is a GGUF-only repository: it contains only ready-to-run *.gguf quants for llama.cpp and compatible runtimes.

What you can expect

  • Strong instruction-following base (Ministral Instruct).
  • Extra style / reasoning “color” coming from SOLAR and Granite.
  • Multimodal (Pixtral vision) is preserved in the main HF artifact; actual llama.cpp multimodal behavior depends on current upstream support.

Blend map (what went into what)

| Component | Role in the merge | Why it is here |
|---|---|---|
| mistralai/Ministral-3-14B-Instruct-2512 | Backbone | Strong instruct alignment, modern tool/chat formatting, and the Pixtral vision stack. |
| upstage/SOLAR-10.7B-v1.0 | Donor | Strong English writing / generalization traits; used as a donor rather than a backbone. |
| ibm-granite/granite-3.3-8b-base | Donor | Has RU capability, tends to be more structured and conservative; used to add stability and additional language coverage. |

How different is it from the base Ministral checkpoint?

Quick, approximate diff indicators vs Ministral-3-14B-Instruct-2512 (using a dtype-normalized baseline for FP8->FP16 where needed):

| Metric | Value | Notes |
|---|---|---|
| Changed parameter share | ~33.7% | changed_params_total ≈ 0.337 |
| Changed parameters (absolute) | ~6.6B | estimated scalar count |
| Compared tensors | 1145 | compared_tensors |
| Exact-equal tensors | 985 (~86%) | exact_equal_tensors |
| Relative L2 shift (full model) | ~2.25% | avg_rel_l2 ≈ 0.0225 |
It is important to understand:
  • 2.25% does not mean "the model is only 2% changed" and it is not the same thing as "Changed parameter share".
  • It is the relative norm of the shift in the parameter space (i.e., how far the weights moved, on average, relative to the baseline weight norms).

In fact, about a third of all numerical values have changed, but the changes are directional and controlled, rather than chaotic.
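The two headline numbers measure different things. A toy sketch of how such diff metrics can be computed (pure Python; the example tensors are made up, and the actual merge analysis tooling is not part of this repo):

```python
import math

# Toy sketch: "changed parameter share" vs "relative L2 shift".
# base/merged stand in for flattened weight tensors; the real
# tooling behind the numbers in this card is not included here.
base   = [0.50, -1.20, 0.30, 0.80, -0.40, 1.10]
merged = [0.50, -1.25, 0.30, 0.83, -0.40, 1.10]

changed = sum(1 for b, m in zip(base, merged) if b != m)
changed_share = changed / len(base)  # fraction of scalars that moved at all

delta_norm = math.sqrt(sum((m - b) ** 2 for b, m in zip(base, merged)))
base_norm  = math.sqrt(sum(b ** 2 for b in base))
rel_l2 = delta_norm / base_norm      # how far the weights moved, relatively

print(changed_share)  # a third of the scalars changed...
print(rel_l2)         # ...yet the relative L2 shift stays around 3%
```

This mirrors the pattern in the table: many scalars can move while the overall norm shift stays small.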

Attention (QKV) — Primary Intervention Zone

| Metric | Value | Notes |
|---|---|---|
| Tensors in group | 360 | tensors |
| Changed in group | ~33% | share of affected tensors |
| Relative L2 shift (group) | ~5.4% | avg_rel_l2 ≈ 0.054 |
| Cosine alignment to donor direction | ~0.988 | cosine alignment |
| Average projection coefficient (alpha) | ~0.16 | alpha |
Changes in the attention layers are aligned with the donor signal (cosine ≈ 0.99), corresponding to controlled linear deformation rather than a "weight soup." This is specifically where information routing is altered.
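Cosine alignment and the projection coefficient can be illustrated on toy vectors (a sketch; the delta and donor-direction vectors below are invented for illustration):

```python
import math

# Toy sketch: cosine alignment of a weight delta with a donor direction,
# and the projection coefficient alpha = <delta, donor> / <donor, donor>.
# Both vectors are made up for illustration.
delta = [0.16, 0.33, 0.08]   # merged - base, for one tensor (flattened)
donor = [1.00, 2.00, 0.50]   # donor - base, the blending direction

dot = sum(d * g for d, g in zip(delta, donor))
n_d = math.sqrt(sum(d * d for d in delta))
n_g = math.sqrt(sum(g * g for g in donor))

cosine = dot / (n_d * n_g)   # near 1.0 means the shift follows the donor
alpha  = dot / (n_g * n_g)   # how much of the donor direction was applied

print(round(cosine, 3), round(alpha, 3))
```

A cosine near 1.0 with a small alpha is exactly the "controlled linear deformation" pattern described above: the shift points along the donor direction but applies only a fraction of it.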

MLP

| Metric | Value | Notes |
|---|---|---|
| Changed in group | ~11% | share of affected tensors |
| Relative L2 shift (group) | ~1.7% | avg_rel_l2 ≈ 0.017 |
Status: MLP is affected softly—the backbone remains stable.

What was NOT touched

  • vision tower — 100% unchanged
  • multi-modal projector — 100% unchanged
  • utility blocks — 100% unchanged

What this means in practice

This is not "98% the same checkpoint." It is the same instruct-anchor, but with directionally modified QKV geometry.

In high-dimensional systems, even a 2–5% shift in norm—when involving ~⅓ of the parameters—is sufficient to switch the model's behavioral regime.
Backbone: Preserved.
Routing: Adjusted.
Multimodality: Unharmed.
Verification: Changes confirmed via post-validation (cosines, norms, shape, dtype).

This is a structural deformation, not a cosmetic merge.

Files in this repo

  • *.gguf: ready-to-use GGUF quants.

GGUF / llama.cpp notes

  • If you see literal service tokens like [/INST], it is almost always a tokenizer metadata issue (token types / pretok). See the main HF repo for the intended configuration.
  • For multimodal usage in llama.cpp, expect a model GGUF plus a separate mmproj GGUF (projector). See the main HF repo.

Important: llama.cpp multimodal support for Pixtral/Mistral3 is under heavy development. In practice, image understanding quality may be incorrect even when HF/Transformers works correctly.
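A multimodal invocation along those lines can be sketched as a command builder. The `llama-mtmd-cli` tool name and the `--mmproj`/`--image` flags reflect current llama.cpp and may change; treat them as assumptions and check the upstream docs:

```python
# Sketch: compose a llama.cpp multimodal command line.
# Tool name and flags (llama-mtmd-cli, --mmproj, --image) are assumptions
# based on current llama.cpp and may differ in your build.
def build_mtmd_cmd(model_gguf, mmproj_gguf, image_path, prompt):
    """Return the argv list for a llama.cpp multimodal run."""
    return [
        "llama-mtmd-cli",
        "-m", model_gguf,          # the quantized language model
        "--mmproj", mmproj_gguf,   # the separate FP32 projector GGUF
        "--image", image_path,
        "-p", prompt,
    ]

cmd = build_mtmd_cmd(
    "SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf",
    "mmproj-SOLARized-GraniStral-14B_2102_YeAM-HCT_F32.gguf",
    "photo.jpg",
    "Describe this image.",
)
print(" ".join(cmd))
```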

License

Apache-2.0. Base model licenses apply for the corresponding upstream artifacts.

📂 GGUF File List

| 📁 Filename | 📦 Size | ⚡ Download |
|---|---|---|
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q4_K.gguf (Recommended, LFS, Q4) | 7.67 GB | Download |
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q5_K.gguf (LFS) | 8.96 GB | Download |
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q6_K.gguf (LFS, Q6) | 10.33 GB | Download |
| SOLARized-GraniStral-14B_2102_YeAM-HCT_32QKV_Q8_0.gguf (LFS, Q8) | 13.37 GB | Download |
| mmproj-SOLARized-GraniStral-14B_2102_YeAM-HCT_F32.gguf (LFS) | 1.64 GB | Download |