π Model Description
tags:
- gguf
- llama.cpp
- Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507
- en
- zh
- Qwen/Qwen3-4B-Instruct-2507
GPT-5-Distill-Qwen3-4B-Instruct-2507
!Base Model
!Distillation
!Language
!Context
!Format
!License



Model Type: Instruction-tuned conversational LLM
Supports LoRA adapters and full-finetuned models for inference
- Base Model:
Qwen/Qwen3-4B-Instruct-2507 - Parameters: 4B
- Training Method:
- Supervised Fine-Tuning (SFT) on ShareGPT data
- Knowledge distillation from LMSYS GPT-5 responses
- Supported Languages: Chinese, English, mixed inputs/outputs
- Max Context Length: Up to 32K tokens (
maxseqlength = 32768)
This model is trained on ShareGPT-Qwen3 instruction datasets and distilled toward the conversational style and quality of GPT-5. It aims to achieve high-quality, natural-sounding dialogues with low computational overheadβperfect for lightweight applications without sacrificing responsiveness.
2. Intended Use Cases
β Recommended:
- Casual chat in Chinese/English
- General knowledge explanations & reasoning guidance
- Code suggestions and simple debugging tips
- Writing assistance: editing, summarizing, rewriting
- Role-playing conversations (with well-designed prompts)
β οΈ Not Suitable For:
- High-risk decision-making:
- Real-time factual tasks (e.g., news, stock updates)
- Authoritative judgment on sensitive topics
Note: Outputs are for reference only and not intended as the sole basis for critical decisions.
3. Training Data & Distillation Process
Key Datasets:
#### (1) ds1: ShareGPT-Qwen3 Instruction Dataset
- Source:
Jackrong/ShareGPT-Qwen3-235B-A22B-Instuct-2507 - Purpose:
- Provides diverse instruction-response pairs
- Supports multi-turn dialogues and context awareness
- Processing:
- Cleaned for quality and relevance
- Standardized into
instruction, input, output format
#### (2) ds2: LMSYS GPT-5 Teacher Response Data
- Source:
ytz20/LMSYS-Chat-GPT-5-Chat-Response - Filtering:
- Only kept samples with
flaw == "normal"- Removed hallucinations and inconsistent responses
- Purpose:
- Distillation target for conversational quality
- Enhances clarity, coherence, and fluency
Training Flow:
- Prepare unified Chat-formatted dataset
- Fine-tune base Qwen3-4B-Instruct-2507 via SFT
- Conduct knowledge distillation using GPT-5's normal responses as teacher outputs
- Balance style imitation with semantic fidelity to ensure robustness
βοΈ Note: This work is based on publicly available, non-sensitive datasets and uses them responsibly under fair use principles.
4. Key Features Summary
| Feature | Description |
|---|---|
| Lightweight | ~4B parameter model β fast inference, low resource usage |
| Distillation-Style Responses | Mimics GPT-5βs conversational fluency and helpfulness |
| Highly Conversational | Excellent for chatbot-style interactions with rich dialogue flow |
| Multilingual Ready | Seamless support for Chinese and English |
5. Acknowledgements
We thank:
- LMSYS team for sharing GPT-5 response data
- Jackrong for the ShareGPT-Qwen3 dataset
- Qwen team for releasing
Qwen3-4B-Instruct
This project is an open research effort aimed at making high-quality conversational AI accessible with smaller models.
π GGUF File List
| π Filename | π¦ Size | β‘ Download |
|---|---|---|
|
GPT-5-Distill-Qwen3-4B-Instruct-IQ4_XS.gguf
LFS
Q4
|
2.13 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q2_K.gguf
LFS
Q2
|
1.55 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q3_K_L.gguf
LFS
Q3
|
2.09 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q3_K_M.gguf
LFS
Q3
|
1.93 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q3_K_S.gguf
LFS
Q3
|
1.76 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q4_K_S.gguf
LFS
Q4
|
2.22 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q5_K_M.gguf
LFS
Q5
|
2.69 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q5_K_S.gguf
LFS
Q5
|
2.63 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-Q6_K.gguf
LFS
Q6
|
3.08 GB | Download |
|
GPT-5-Distill-Qwen3-4B-Instruct-f16.gguf
LFS
FP16
|
7.5 GB | Download |
|
qwen3-4b-instruct-2507.Q4_K_M.gguf
Recommended
LFS
Q4
|
2.33 GB | Download |
|
qwen3-4b-instruct-2507.Q8_0.gguf
LFS
Q8
|
3.99 GB | Download |