πŸ€– RTILA Assistant Mini

Ultra-lightweight fine-tuned AI model β€” Confirmed working on Mac M1 8GB, low VRAM GPUs, and CPU-only systems


πŸ“‹ Model Description

RTILA Assistant Mini is the most portable model in the RTILA family, specifically designed for low-resource devices. Fine-tuned from Qwen3-4B, it delivers solid automation generation capabilities while fitting comfortably on 8GB systems.

πŸ”„ Choose Your Version

| Model | Base | GGUF Size | Min RAM | Best For |
|---|---|---|---|---|
| RTILA Assistant | Qwen3-14B | ~9 GB | 16 GB | Maximum quality, complex automations |
| RTILA Assistant Lite | Qwen3-8B | ~5 GB | 8 GB | Balanced performance, mid-range devices |
| RTILA Assistant Mini (this) | Qwen3-4B | ~2.5 GB | 6 GB | βœ… Mac M1 8GB, low VRAM, CPU inference |

✨ Why Mini?

| Feature | RTILA Assistant | RTILA Assistant Lite | RTILA Assistant Mini |
|---|---|---|---|
| Base Model | Qwen3-14B | Qwen3-8B | Qwen3-4B |
| Q4_K_M Size | ~9 GB | ~5 GB | ~2.5 GB |
| Min Inference RAM | 16 GB | 8 GB | 4-5 GB |
| Mac M1 8GB | ❌ | ⚠️ Tight | βœ… Confirmed |
| Low VRAM GPUs (4-6GB) | ❌ | ⚠️ | βœ… |
| CPU Inference | Slow | Viable | βœ… Fast |
| Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |

Capabilities

| Category | Description |
|---|---|
| 🌐 Navigation & Interaction | Click, scroll, type, wait, handle popups, multi-tab workflows |
| πŸ“Š Data Extraction | CSS/XPath selectors, tables, lists, nested data, pagination |
| πŸ”„ Logic & Flow | Loops, conditionals, error handling, retry patterns |
| πŸ”— Triggers & Integrations | Webhooks, PostgreSQL, MySQL, Slack, email notifications |
| πŸ“ Variables & Substitution | Dynamic values, data transformations, regex patterns |
| πŸ› οΈ Advanced Scripting | Custom JavaScript execution, page analysis, DOM manipulation |

πŸ“¦ Model Specifications

| Property | Value |
|---|---|
| Base Model | Qwen3-4B |
| Format | GGUF Q4_K_M |
| Size | ~2.5 GB |
| Context Length | 2048 tokens |
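
The ~2.5 GB figure is consistent with what Q4_K_M quantization typically yields. As a rough back-of-envelope check (assuming an average of about 4.8 bits per weight for Q4_K_M, which is an approximation, not an official figure):

```python
# Rough size estimate for a 4B-parameter model at Q4_K_M quantization.
# 4.8 bits/weight is an assumed average; actual GGUF files add metadata overhead.
params = 4e9
bits_per_weight = 4.8
size_gb = params * bits_per_weight / 8 / 1e9
print(f"~{size_gb:.1f} GB")  # ~2.4 GB, close to the ~2.5 GB listed above
```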

πŸ’» Hardware Requirements

| Hardware | Supported | Notes |
|---|---|---|
| Mac M1/M2/M3 8GB | βœ… Confirmed | Smooth experience, tested and verified |
| Mac M1/M2/M3 16GB+ | βœ… Excellent | Very fast inference |
| GPU (4-6GB VRAM) | βœ… Works | GTX 1650, RTX 3050, Intel Arc |
| GPU (6GB+ VRAM) | βœ… Excellent | RTX 2060, RTX 3060, etc. |
| CPU-only (6GB+ RAM) | βœ… Fast | Reasonable inference speed |
| CPU-only (4GB RAM) | ⚠️ Tight | May work with swap |

πŸš€ Quick Start

Option 1: Ollama (Easiest)

```bash
# Run directly from Hugging Face
ollama run hf.co/rtila-corporation/rtila-assistant-mini:Q4_K_M
```

Or create a custom Modelfile:

```
FROM hf.co/rtila-corporation/rtila-assistant-mini:Q4_K_M

PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20

SYSTEM """
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.
"""
```

```bash
ollama create rtila-mini -f Modelfile
ollama run rtila-mini
```

Option 2: LM Studio

  1. Download LM Studio
  2. Search for rtila-corporation/rtila-assistant-mini
  3. Download Q4_K_M (~2.5 GB)
  4. Set parameters: Temperature=0.7, Top-P=0.8, Top-K=20
  5. Start chatting!

Option 3: llama.cpp

```bash
# Download the model
huggingface-cli download rtila-corporation/rtila-assistant-mini \
  rtila-assistant-mini.Q4_K_M.gguf --local-dir ./models

# Run interactive chat
./llama-cli -m ./models/rtila-assistant-mini.Q4_K_M.gguf \
  -p "Scrape product prices from an e-commerce site" \
  --temp 0.7 --top-p 0.8 --top-k 20
```

Option 4: Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/rtila-assistant-mini.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # Use GPU/Metal if available, 0 for CPU only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are RTILA Assistant..."},
        {"role": "user", "content": "Create a config for scraping product data"}
    ],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)
print(response['choices'][0]['message']['content'])
```


βš™οΈ Recommended Generation Settings

Based on Qwen3's official recommendations for non-thinking mode:

| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-P | 0.8 |
| Top-K | 20 |
| Min-P | 0 |

ℹ️ Note: This model was trained with thinking mode disabled for efficient structured JSON output. It will not produce `<think>`…`</think>` blocks.
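
Because the model emits raw JSON without thinking blocks, a small validation step downstream catches malformed generations early. A minimal sketch (this helper is illustrative and not part of any RTILA SDK; the set of expected top-level sections is inferred from the examples below):

```python
import json

def validate_config(text: str) -> dict:
    """Parse model output and check for the top-level sections RTILA configs use.
    Illustrative helper only -- expected keys are assumptions from the examples."""
    config = json.loads(text)  # raises ValueError on malformed JSON
    expected = {"settings", "datasets", "commands", "triggers"}
    if not expected & config.keys():
        raise ValueError(f"no recognized top-level sections, got: {sorted(config)}")
    return config

sample = '{"settings": {"name": "Demo"}, "commands": [{"action": "extract"}]}'
config = validate_config(sample)
print(sorted(config))  # ['commands', 'settings']
```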


πŸ’¬ Example Prompts & Outputs

Example 1: Basic Extraction

Prompt:

Extract all product names and prices from the current page

Output:

```json
{
  "settings": {
    "name": "Product Extraction"
  },
  "datasets": {
    "products": {
      "selector": ".product-item",
      "type": "list",
      "fields": {
        "name": {
          "selector": ".product-title",
          "type": "text"
        },
        "price": {
          "selector": ".product-price",
          "type": "text",
          "transform": "extractNumber"
        }
      }
    }
  },
  "commands": [
    {"action": "extract", "dataset": "products"}
  ]
}
```
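
The `"transform": "extractNumber"` field above implies price strings like `"$1,299.99"` are reduced to numeric values. A hypothetical Python sketch of what such a transform could do (RTILA's actual implementation may differ):

```python
import re

def extract_number(text: str):
    """Pull the first numeric value out of a string such as '$1,299.99'.
    Hypothetical sketch of an 'extractNumber'-style transform."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if not match:
        return None
    return float(match.group().replace(",", ""))

print(extract_number("$1,299.99"))     # 1299.99
print(extract_number("Price: 45 USD")) # 45.0
```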

Example 2: Pagination Loop

Prompt:

Scrape reviews from multiple pages, clicking "Next" until no more pages

Output:

```json
{
  "settings": {
    "name": "Review Scraper with Pagination"
  },
  "datasets": {
    "reviews": {
      "selector": ".review-card",
      "type": "list",
      "fields": {
        "author": {"selector": ".reviewer-name", "type": "text"},
        "rating": {"selector": ".star-rating", "type": "attribute", "attribute": "data-rating"},
        "text": {"selector": ".review-body", "type": "text"},
        "date": {"selector": ".review-date", "type": "text"}
      }
    }
  },
  "commands": [
    {
      "action": "loop",
      "while": {"exists": ".next-page:not(.disabled)"},
      "maxIterations": 50,
      "commands": [
        {"action": "extract", "dataset": "reviews", "append": true},
        {"action": "click", "selector": ".next-page"},
        {"action": "wait", "duration": 2000}
      ]
    }
  ]
}
```
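
The loop command pairs a `while` condition with `maxIterations` as a safety cap against infinite loops. The control flow can be sketched in plain Python (purely illustrative; `pages` is a stand-in for live DOM state, not an RTILA API):

```python
def run_pagination_loop(pages, max_iterations=50):
    """Sketch of the loop semantics above: extract, then stop once no
    non-disabled 'Next' button exists, capped by max_iterations."""
    collected = []
    for _ in range(max_iterations):   # maxIterations guards against infinite loops
        page = pages.pop(0)
        collected.extend(page["reviews"])  # {"action": "extract", "append": true}
        if not page["has_next"]:           # while {"exists": ".next-page:not(.disabled)"}
            break
    return collected

pages = [
    {"reviews": ["r1", "r2"], "has_next": True},
    {"reviews": ["r3"], "has_next": False},
]
print(run_pagination_loop(pages))  # ['r1', 'r2', 'r3']
```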

Example 3: Database Integration

Prompt:

After extraction, save data to PostgreSQL and send Slack notification

Output:

```json
{
  "triggers": {
    "onComplete": [
      {
        "type": "database",
        "config": {
          "driver": "postgresql",
          "connection": "{{env.DATABASE_URL}}",
          "table": "scraped_data",
          "mode": "upsert",
          "uniqueKey": "id"
        }
      },
      {
        "type": "webhook",
        "config": {
          "url": "{{env.SLACK_WEBHOOK}}",
          "method": "POST",
          "body": {
            "text": "βœ… Extraction complete: {{datasets.products.length}} items saved"
          }
        }
      }
    ]
  }
}
```

Example 4: Knowledge Question

Prompt:

What is the difference between execute_script and run_script?

Output:

execute_script and run_script serve different purposes in RTILA:

execute_script: Runs JavaScript synchronously in the page context. Returns immediately
with the result. Best for quick DOM queries, getting values, or simple manipulations.

run_script: Executes JavaScript asynchronously with full control flow. Supports
await, can make fetch requests, and handles complex async operations. Returns a Promise.

Use execute_script for: Reading values, checking conditions, simple DOM changes
Use run_script for: API calls, complex async workflows, operations that need to wait
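
The sync/async distinction above can be illustrated in Python terms (this is not RTILA code; the two functions are hypothetical stand-ins for the two script commands):

```python
import asyncio

def execute_script_like(page_state):
    """Synchronous: computes and returns immediately (cf. execute_script)."""
    return len(page_state["items"])

async def run_script_like(page_state):
    """Asynchronous: may await I/O such as a fetch before returning (cf. run_script)."""
    await asyncio.sleep(0)  # stand-in for an awaited network call
    return [item.upper() for item in page_state["items"]]

page = {"items": ["a", "b"]}
print(execute_script_like(page))           # 2 -- result available immediately
print(asyncio.run(run_script_like(page)))  # ['A', 'B'] -- result via the event loop
```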


πŸ‹οΈ Training Details

| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen3-4B |
| Method | QLoRA (4-bit) |
| LoRA Rank | 128 |
| LoRA Alpha | 256 |
| Context Length | 2048 tokens |
| Training Examples | ~400 |
| Epochs | 6 (with early stopping) |
| Learning Rate | 2e-4 |
| Thinking Mode | Disabled |

Optimizations for Mini Version

  • Highest LoRA rank (128): Maximizes learning capacity for smaller base
  • More training epochs (6): Compensates for smaller model capacity
  • Higher learning rate (2e-4): Better convergence for small models
  • Longer context (2048): Full headroom for complex configurations
  • Thinking mode disabled: Clean JSON output without overhead
  • Rank-stabilized LoRA (rsLoRA): More stable training dynamics

Training Data

  • Navigation & Interaction patterns
  • Data extraction configurations
  • Logic & flow control
  • Triggers & integrations
  • Variables & substitution
  • Advanced scripting
  • Error handling
  • Knowledge base Q&A

πŸ“ System Prompt

For best results, use this system prompt:

```
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.

Your capabilities:

1. Generate complete JSON configurations for web automation tasks
2. Define datasets with selectors, properties, and transformations
3. Configure navigation, extraction, loops, and conditionals
4. Set up triggers for webhooks, databases, and integrations
5. Explain RTILA concepts and best practices

When generating configurations:

- Always output valid JSON with proper structure
- Include 'settings', 'datasets', and 'commands' sections as needed
- Use appropriate selectors (CSS, XPath) for the target elements
- Apply transformations when data cleaning is required

When answering questions:

- Be concise and accurate
- Provide examples when helpful
- Reference specific RTILA features and commands
```


πŸ”— Model Family

| Model | Link | Best For |
|---|---|---|
| RTILA Assistant | huggingface.co/rtila-corporation/rtila-assistant | Maximum quality |
| RTILA Assistant Lite | huggingface.co/rtila-corporation/rtila-assistant-lite | Mid-range devices |
| RTILA Assistant Mini (this) | huggingface.co/rtila-corporation/rtila-assistant-mini | Mac M1 8GB, low VRAM |

RTILA Platform: rtila.com

πŸ“„ License

Apache 2.0


πŸ™ Acknowledgments

πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
qwen3-4b.Q4_K_M.gguf
Recommended LFS Q4
2.33 GB Download