RTILA Assistant Mini
Ultra-lightweight fine-tuned AI model. ✅ Confirmed working on Mac M1 8GB, low-VRAM GPUs, and CPU-only systems.
Model Description
RTILA Assistant Mini is the smallest model in the RTILA family, designed for low-resource devices. Fine-tuned from Qwen3-4B, it generates RTILA automation configurations while fitting comfortably on 8GB systems (the Q4_K_M file is about 2.5 GB).
Choose Your Version
| Model | Base | GGUF Size | Min RAM | Best For |
|---|---|---|---|---|
| RTILA Assistant | Qwen3-14B | ~9 GB | 16 GB | Maximum quality, complex automations |
| RTILA Assistant Lite | Qwen3-8B | ~5 GB | 8 GB | Balanced performance, mid-range devices |
| RTILA Assistant Mini (this) | Qwen3-4B | ~2.5 GB | 6 GB | ✅ Mac M1 8GB, low VRAM, CPU inference |
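The table above can be turned into a small selection helper. This is only an illustrative sketch: the function name `pick_variant` is hypothetical, and the thresholds are copied from the "Min RAM" column rather than measured.

```python
from typing import Optional

# Minimum-RAM figures copied from the "Min RAM" column of the table above,
# ordered from the largest model to the smallest.
VARIANTS = [
    ("RTILA Assistant", 16),
    ("RTILA Assistant Lite", 8),
    ("RTILA Assistant Mini", 6),
]


def pick_variant(available_ram_gb: float) -> Optional[str]:
    """Return the largest variant whose listed minimum RAM fits, or None."""
    for name, min_ram_gb in VARIANTS:
        if available_ram_gb >= min_ram_gb:
            return name
    return None


print(pick_variant(8))  # RTILA Assistant Lite
print(pick_variant(6))  # RTILA Assistant Mini
```

Note that an exact match on the threshold leaves little headroom, so on an 8GB machine the Mini variant may still be the more comfortable choice.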
Why Mini?
| Feature | RTILA Assistant | RTILA Assistant Lite | RTILA Assistant Mini |
|---|---|---|---|
| Base Model | Qwen3-14B | Qwen3-8B | Qwen3-4B |
| Q4_K_M Size | ~9 GB | ~5 GB | ~2.5 GB |
| Min Inference RAM | 16 GB | 8 GB | 4-5 GB |
| Mac M1 8GB | ❌ | ⚠️ Tight | ✅ Confirmed |
| Low VRAM GPUs (4-6GB) | ❌ | ⚠️ | ✅ |
| CPU Inference | Slow | Viable | ✅ Fast |
| Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
Capabilities
| Category | Description |
|---|---|
| Navigation & Interaction | Click, scroll, type, wait, handle popups, multi-tab workflows |
| Data Extraction | CSS/XPath selectors, tables, lists, nested data, pagination |
| Logic & Flow | Loops, conditionals, error handling, retry patterns |
| Triggers & Integrations | Webhooks, PostgreSQL, MySQL, Slack, email notifications |
| Variables & Substitution | Dynamic values, data transformations, regex patterns |
| Advanced Scripting | Custom JavaScript execution, page analysis, DOM manipulation |
Model Specifications
| Property | Value |
|---|---|
| Base Model | Qwen3-4B |
| Format | GGUF Q4_K_M |
| Size | ~2.5 GB |
| Context Length | 2048 tokens |
Hardware Requirements
| Hardware | Supported | Notes |
|---|---|---|
| Mac M1/M2/M3 8GB | ✅ Confirmed | Smooth experience, tested and verified |
| Mac M1/M2/M3 16GB+ | ✅ Excellent | Very fast inference |
| GPU (4-6GB VRAM) | ✅ Works | GTX 1650, RTX 3050, Intel Arc |
| GPU (6GB+ VRAM) | ✅ Excellent | RTX 2060, RTX 3060, etc. |
| CPU-only (6GB+ RAM) | ✅ Fast | Reasonable inference speed |
| CPU-only (4GB RAM) | ⚠️ Tight | May work with swap |
Quick Start
Option 1: Ollama (Easiest)
# Run directly from Hugging Face
ollama run hf.co/rtila-corporation/rtila-assistant-mini:Q4_K_M
Or create a custom Modelfile:
FROM hf.co/rtila-corporation/rtila-assistant-mini:Q4_K_M
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
SYSTEM """
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.
"""
ollama create rtila-mini -f Modelfile
ollama run rtila-mini
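Once the model is registered with Ollama, it can also be called from code through Ollama's local REST endpoint (`/api/chat` on port 11434 by default). The sketch below assumes the model was created under the name `rtila-mini` as in the commands above; the helper names `build_chat_payload` and `chat` are hypothetical. The sampling options repeat the Modelfile values and could be omitted when the Modelfile already sets them.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default local endpoint


def build_chat_payload(prompt: str, model: str = "rtila-mini") -> dict:
    """Build a non-streaming chat request with the recommended sampling options."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one JSON object instead of a chunk stream
        "options": {"temperature": 0.7, "top_p": 0.8, "top_k": 20},
    }


def chat(prompt: str) -> str:
    """Send the prompt to a running Ollama server and return the reply text."""
    body = json.dumps(build_chat_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["message"]["content"]


# Example call (requires `ollama serve` to be running locally):
# print(chat("Extract all product names and prices from the current page"))
```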
Option 2: LM Studio
- Download LM Studio
- Search for rtila-corporation/rtila-assistant-mini
- Download Q4_K_M (~2.5 GB)
- Set parameters: Temperature=0.7, Top-P=0.8, Top-K=20
- Start chatting!
Option 3: llama.cpp
# Download the model (the GGUF file list in this card names the file qwen3-4b.Q4_K_M.gguf; adjust the filename if needed)
huggingface-cli download rtila-corporation/rtila-assistant-mini \
  rtila-assistant-mini.Q4_K_M.gguf --local-dir ./models
# Run interactive chat
./llama-cli -m ./models/rtila-assistant-mini.Q4_K_M.gguf \
  -p "Scrape product prices from an e-commerce site" \
  --temp 0.7 --top-p 0.8 --top-k 20
Option 4: Python (llama-cpp-python)
from llama_cpp import Llama

# Load the quantized model from the path used in the download step
llm = Llama(
    model_path="./models/rtila-assistant-mini.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # Offload all layers to GPU/Metal if available; use 0 for CPU only
)

# Request a chat completion with the recommended sampling settings
response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are RTILA Assistant..."},
        {"role": "user", "content": "Create a config for scraping product data"},
    ],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)

print(response["choices"][0]["message"]["content"])
Recommended Generation Settings
Based on Qwen3's official recommendations for non-thinking mode:
| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-P | 0.8 |
| Top-K | 20 |
| Min-P | 0 |
Note: This model was trained with thinking mode disabled for efficient structured JSON output. It will not produce `<think>...</think>` blocks.
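Because the output is meant to be structured JSON, downstream code usually parses it directly. The helper below is a defensive sketch, not part of RTILA: `parse_config` is a hypothetical name, and it strips any `<think>` block (in case a different runtime or template emits one) before parsing the outermost `{...}` span.

```python
import json
import re

# Matches a complete <think>...</think> block, including newlines inside it.
THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)


def parse_config(raw_output: str) -> dict:
    """Strip any <think> blocks, then parse the outermost {...} span as JSON."""
    cleaned = THINK_BLOCK.sub("", raw_output)
    start, end = cleaned.find("{"), cleaned.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    return json.loads(cleaned[start : end + 1])


config = parse_config('<think>\n\n</think>\n{"settings": {"name": "Demo"}}')
print(config["settings"]["name"])  # Demo
```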
Example Prompts & Outputs
Example 1: Basic Extraction
Prompt:
Extract all product names and prices from the current page
Output:
{
  "settings": {
    "name": "Product Extraction"
  },
  "datasets": {
    "products": {
      "selector": ".product-item",
      "type": "list",
      "fields": {
        "name": {
          "selector": ".product-title",
          "type": "text"
        },
        "price": {
          "selector": ".product-price",
          "type": "text",
          "transform": "extractNumber"
        }
      }
    }
  },
  "commands": [
    {"action": "extract", "dataset": "products"}
  ]
}
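Generated configurations can be sanity-checked before they are run. The sketch below assumes only the structure visible in these examples (a `datasets` mapping, plus `commands` that may nest further `commands`); the function name `find_undefined_datasets` is hypothetical and this is not an official RTILA validator.

```python
def find_undefined_datasets(config: dict) -> list:
    """Return dataset names used by extract commands but missing from `datasets`."""
    defined = set(config.get("datasets", {}))
    missing = []

    def walk(commands):
        for command in commands:
            name = command.get("dataset")
            if command.get("action") == "extract" and name not in defined:
                missing.append(name)
            walk(command.get("commands", []))  # descend into nested loop bodies

    walk(config.get("commands", []))
    return missing


config = {
    "datasets": {"products": {"selector": ".product-item", "type": "list"}},
    "commands": [
        {"action": "extract", "dataset": "products"},
        {"action": "extract", "dataset": "reviews"},  # not defined above
    ],
}
print(find_undefined_datasets(config))  # ['reviews']
```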
Example 2: Pagination Loop
Prompt:
Scrape reviews from multiple pages, clicking "Next" until no more pages
Output:
{
  "settings": {
    "name": "Review Scraper with Pagination"
  },
  "datasets": {
    "reviews": {
      "selector": ".review-card",
      "type": "list",
      "fields": {
        "author": {"selector": ".reviewer-name", "type": "text"},
        "rating": {"selector": ".star-rating", "type": "attribute", "attribute": "data-rating"},
        "text": {"selector": ".review-body", "type": "text"},
        "date": {"selector": ".review-date", "type": "text"}
      }
    }
  },
  "commands": [
    {
      "action": "loop",
      "while": {"exists": ".next-page:not(.disabled)"},
      "maxIterations": 50,
      "commands": [
        {"action": "extract", "dataset": "reviews", "append": true},
        {"action": "click", "selector": ".next-page"},
        {"action": "wait", "duration": 2000}
      ]
    }
  ]
}
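To reason about what this loop collects, it can help to simulate it. The sketch below is not the RTILA engine: it assumes the `while` condition is checked before each pass and that a "Next" button exists on every page except the last. `simulate_pagination` and the sample data are hypothetical.

```python
def simulate_pagination(pages, max_iterations=50):
    """Simulate the loop: extract, click "Next", repeat while a next page exists.

    `pages` is a list of per-page review lists. A "Next" button is treated as
    present on every page except the last one.
    """
    collected = []
    page_index = 0
    iterations = 0
    # Assumed semantics: the `while` condition is evaluated before each pass.
    while page_index < len(pages) - 1 and iterations < max_iterations:
        collected.extend(pages[page_index])  # extract with "append": true
        page_index += 1                      # click ".next-page"
        iterations += 1
    return collected, page_index


pages = [["review 1", "review 2"], ["review 3"], ["review 4"]]
collected, last_index = simulate_pagination(pages)
print(collected)   # ['review 1', 'review 2', 'review 3']
print(last_index)  # 2
```

Under these assumptions the loop stops on the final page without extracting it, so a trailing extract command after the loop may be needed; whether that applies depends on how the engine actually evaluates the condition.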
Example 3: Database Integration
Prompt:
After extraction, save data to PostgreSQL and send Slack notification
Output:
{
  "triggers": {
    "onComplete": [
      {
        "type": "database",
        "config": {
          "driver": "postgresql",
          "connection": "{{env.DATABASE_URL}}",
          "table": "scraped_data",
          "mode": "upsert",
          "uniqueKey": "id"
        }
      },
      {
        "type": "webhook",
        "config": {
          "url": "{{env.SLACK_WEBHOOK}}",
          "method": "POST",
          "body": {
            "text": "✅ Extraction complete: {{datasets.products.length}} items saved"
          }
        }
      }
    ]
  }
}
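The `{{...}}` placeholders in this output are resolved at run time. As a rough illustration of how dotted-path substitution of this kind can work, here is a minimal sketch; it is not RTILA's implementation, the `resolve` helper is hypothetical, and treating `.length` as a list length is an assumption based on the example.

```python
import re

# Matches {{ dotted.path }} with optional whitespace inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*([\w.]+)\s*\}\}")


def resolve(template: str, context: dict) -> str:
    """Replace each {{dotted.path}} with the value found in `context`."""

    def lookup(match):
        value = context
        for part in match.group(1).split("."):
            if part == "length" and isinstance(value, list):
                value = len(value)  # assumed meaning of `.length` on a dataset
            else:
                value = value[part]
        return str(value)

    return PLACEHOLDER.sub(lookup, template)


context = {
    "env": {"SLACK_WEBHOOK": "https://example.invalid/hook"},  # placeholder URL
    "datasets": {"products": [{"name": "A"}, {"name": "B"}, {"name": "C"}]},
}
print(resolve("{{env.SLACK_WEBHOOK}}", context))
print(resolve("Extraction complete: {{datasets.products.length}} items saved", context))
```

A missing key raises `KeyError` here, which surfaces misconfigured environment variables early instead of sending a literal placeholder to the webhook.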
Example 4: Knowledge Question
Prompt:
What is the difference between execute_script and run_script?
Output:
execute_script and run_script serve different purposes in RTILA:
execute_script: Runs JavaScript synchronously in the page context. Returns immediately
with the result. Best for quick DOM queries, getting values, or simple manipulations.
run_script: Executes JavaScript asynchronously with full control flow. Supports
await, can make fetch requests, and handles complex async operations. Returns a Promise.
Use execute_script for: Reading values, checking conditions, simple DOM changes
Use run_script for: API calls, complex async workflows, operations that need to wait
Training Details
| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen3-4B |
| Method | QLoRA (4-bit) |
| LoRA Rank | 128 |
| LoRA Alpha | 256 |
| Context Length | 2048 tokens |
| Training Examples | ~400 |
| Epochs | 6 (with early stopping) |
| Learning Rate | 2e-4 |
| Thinking Mode | Disabled |
Optimizations for Mini Version
- Highest LoRA rank (128): Maximizes learning capacity for smaller base
- More training epochs (6): Compensates for smaller model capacity
- Higher learning rate (2e-4): Better convergence for small models
- Longer context (2048): Full headroom for complex configurations
- Thinking mode disabled: Clean JSON output without `<think>` overhead
- Rank-stabilized LoRA (rsLoRA): More stable training dynamics
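The effect of rsLoRA at this rank can be seen from the adapter scaling factor alone. Standard LoRA scales the adapter update by alpha / r, while rsLoRA uses alpha / sqrt(r). The short calculation below plugs in the rank and alpha from the training table; it is a worked arithmetic example, not the training script.

```python
import math

rank, alpha = 128, 256  # LoRA Rank and LoRA Alpha from the training table

standard_scale = alpha / rank           # classic LoRA scaling: alpha / r
rslora_scale = alpha / math.sqrt(rank)  # rsLoRA scaling: alpha / sqrt(r)

print(f"standard LoRA scale: {standard_scale:.2f}")  # 2.00
print(f"rsLoRA scale:        {rslora_scale:.2f}")    # 22.63
```

At rank 128 the rsLoRA factor is about 11 times larger than the standard one, which is why the two settings are not interchangeable without revisiting the learning rate.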
Training Data
- Navigation & Interaction patterns
- Data extraction configurations
- Logic & flow control
- Triggers & integrations
- Variables & substitution
- Advanced scripting
- Error handling
- Knowledge base Q&A
System Prompt
For best results, use this system prompt:
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.
Your capabilities:
- Generate complete JSON configurations for web automation tasks
- Define datasets with selectors, properties, and transformations
- Configure navigation, extraction, loops, and conditionals
- Set up triggers for webhooks, databases, and integrations
- Explain RTILA concepts and best practices
When generating configurations:
- Always output valid JSON with proper structure
- Include 'settings', 'datasets', and 'commands' sections as needed
- Use appropriate selectors (CSS, XPath) for the target elements
- Apply transformations when data cleaning is required
When answering questions:
- Be concise and accurate
- Provide examples when helpful
- Reference specific RTILA features and commands
Model Family
| Model | Link | Best For |
|---|---|---|
| RTILA Assistant | huggingface.co/rtila-corporation/rtila-assistant | Maximum quality |
| RTILA Assistant Lite | huggingface.co/rtila-corporation/rtila-assistant-lite | Mid-range devices |
| RTILA Assistant Mini (this) | huggingface.co/rtila-corporation/rtila-assistant-mini | Mac M1 8GB, low VRAM |
License
Apache 2.0
Acknowledgments
GGUF File List
| Filename | Size | Download |
|---|---|---|
| qwen3-4b.Q4_K_M.gguf (Recommended, Q4, LFS) | 2.33 GB | Download |