# 🤖 RTILA Assistant Lite

*Balanced fine-tuned AI model for mid-range devices (8 GB+ RAM systems).*

## 📋 Model Description

RTILA Assistant Lite is the balanced option in the RTILA family, fine-tuned from Qwen3-8B. It offers excellent quality while fitting on more devices than the full 14B model.
## 🚀 Choose Your Version

| Model | Base | GGUF Size | Min RAM | Best For |
|---|---|---|---|---|
| RTILA Assistant | Qwen3-14B | ~9 GB | 16 GB | Maximum quality, complex automations |
| **RTILA Assistant Lite** (this model) | Qwen3-8B | ~5 GB | 8 GB | 🎯 Balanced performance, mid-range devices |
| RTILA Assistant Mini | Qwen3-4B | ~2.5 GB | 6 GB | ✅ Mac M1 8GB, low VRAM, CPU inference |
## ✨ Why Lite?

| Feature | RTILA Assistant | RTILA Assistant Lite | RTILA Assistant Mini |
|---|---|---|---|
| Base Model | Qwen3-14B | Qwen3-8B | Qwen3-4B |
| Q4_K_M Size | ~9 GB | ~5 GB | ~2.5 GB |
| Min RAM | 16 GB | 8 GB | 6 GB |
| Mac M1 8GB | ❌ | ⚠️ Tight | ✅ |
| Quality | ★★★★★ | ★★★★ | ★★★ |

> ⚠️ **Mac M1 8GB users:** While this model may run on 8 GB systems, it will be tight on memory. For a smoother experience, we recommend RTILA Assistant Mini, which is confirmed working on Mac M1 8GB.
## Capabilities

| Category | Description |
|---|---|
| 🧭 Navigation & Interaction | Click, scroll, type, wait, handle popups, multi-tab workflows |
| 📊 Data Extraction | CSS/XPath selectors, tables, lists, nested data, pagination |
| 🔁 Logic & Flow | Loops, conditionals, error handling, retry patterns |
| 🔗 Triggers & Integrations | Webhooks, PostgreSQL, MySQL, Slack, email notifications |
| 🔤 Variables & Substitution | Dynamic values, data transformations, regex patterns |
| 🛠️ Advanced Scripting | Custom JavaScript execution, page analysis, DOM manipulation |
## 📦 Model Specifications

| Property | Value |
|---|---|
| Base Model | Qwen3-8B |
| Format | GGUF Q4_K_M |
| Size | ~5 GB |
| Context Length | 2048 tokens |
## 💻 Hardware Requirements

| Hardware | Supported | Notes |
|---|---|---|
| GPU (8GB+ VRAM) | ✅ Recommended | RTX 3060, RTX 4060, RTX 3070 |
| GPU (6GB VRAM) | ⚠️ May work | RTX 2060, GTX 1660; needs CPU offloading |
| Apple Silicon 16GB+ | ✅ Excellent | M1/M2/M3 Pro/Max; fast and smooth |
| Apple Silicon 8GB | ⚠️ Tight | May work but memory-constrained |
| CPU-only | ✅ Viable | 8GB+ RAM, reasonable inference speed |

> 💡 **Memory tight?** Try RTILA Assistant Mini (~2.5 GB GGUF, runs smoothly on 6 GB).
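As a rough way to reason about the "Min RAM" figures above: total memory is approximately the quantized weights plus the KV cache plus runtime overhead. The sketch below is a back-of-the-envelope estimate with illustrative layer and dimension values (not exact Qwen3-8B internals), not official sizing guidance:

```python
# Rough sketch (an assumption, not an official RTILA sizing tool): estimate
# the memory footprint of a GGUF model as weights + KV cache + overhead.

def estimate_ram_gb(gguf_size_gb: float, n_ctx: int = 2048,
                    n_layers: int = 36, kv_dim: int = 4096) -> float:
    """Back-of-the-envelope RAM estimate for running a quantized model.

    KV cache ~= 2 (K and V) * n_layers * n_ctx * kv_dim * 2 bytes (fp16).
    The layer/dimension defaults are illustrative, not exact model values.
    """
    kv_cache_gb = 2 * n_layers * n_ctx * kv_dim * 2 / 1024**3
    overhead_gb = 1.0  # OS buffers, runtime, tokenizer, etc. (rough guess)
    return gguf_size_gb + kv_cache_gb + overhead_gb

print(f"~{estimate_ram_gb(4.68):.1f} GB for the Lite GGUF at 2048 context")
```

Under these assumptions the ~4.68 GB file lands comfortably inside an 8 GB budget, which matches the "tight but possible" note for 8 GB Macs.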
## 🚀 Quick Start

### Option 1: Ollama (Easiest)

```bash
# Run directly from Hugging Face
ollama run hf.co/rtila-corporation/rtila-assistant-lite:Q4_K_M
```

Or create a custom Modelfile:

```
FROM hf.co/rtila-corporation/rtila-assistant-lite:Q4_K_M
PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
SYSTEM """
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.
"""
```

```bash
ollama create rtila-lite -f Modelfile
ollama run rtila-lite
```
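Once the `rtila-lite` model is created, it can also be called programmatically through Ollama's local REST API. This sketch assumes a default Ollama server on `localhost:11434` and uses only the standard library; the payload mirrors the generation settings recommended below:

```python
# Sketch: query the rtila-lite model via Ollama's local chat API.
# Assumes an Ollama server is running and "ollama create rtila-lite" was done.
import json
import urllib.request

def build_chat_payload(user_prompt: str) -> dict:
    """Assemble a non-streaming chat request with the recommended settings."""
    return {
        "model": "rtila-lite",
        "messages": [{"role": "user", "content": user_prompt}],
        "stream": False,
        "options": {"temperature": 0.7, "top_p": 0.8, "top_k": 20},
    }

def chat(user_prompt: str) -> str:
    """Send the request to a local Ollama server and return the reply text."""
    req = urllib.request.Request(
        "http://localhost:11434/api/chat",
        data=json.dumps(build_chat_payload(user_prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

The `options` block applies the card's recommended sampling parameters per request, so the Modelfile defaults are not strictly required.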
### Option 2: LM Studio

1. Download LM Studio
2. Search for `rtila-corporation/rtila-assistant-lite`
3. Download the `Q4_K_M` quant
4. Set parameters: Temperature=0.7, Top-P=0.8, Top-K=20
5. Start chatting!
### Option 3: llama.cpp

```bash
# Download the model
huggingface-cli download rtila-corporation/rtila-assistant-lite \
  qwen3-8b.Q4_K_M.gguf --local-dir ./models

# Run interactive chat
./llama-cli -m ./models/qwen3-8b.Q4_K_M.gguf \
  -p "Scrape product prices from an e-commerce site" \
  --temp 0.7 --top-p 0.8 --top-k 20
```
### Option 4: Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/qwen3-8b.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # use GPU if available; set to 0 for CPU-only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are RTILA Assistant..."},
        {"role": "user", "content": "Create a config for scraping product data"},
    ],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)

print(response["choices"][0]["message"]["content"])
```
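Because the model emits structured JSON, it is worth validating its output before handing it to the automation engine. The check below is an illustrative sketch based on the top-level sections shown in the examples in this card (`settings`, `datasets`, `commands`, `triggers`), not the official RTILA schema:

```python
# Sketch: sanity-check a generated configuration before executing it.
# The allowed top-level keys are inferred from this card's examples and are
# an assumption, not the official RTILA schema.
import json

def parse_config(raw: str) -> dict:
    """Parse model output and reject obviously malformed configurations."""
    config = json.loads(raw)  # raises ValueError on malformed JSON
    if not isinstance(config, dict):
        raise ValueError("config must be a JSON object")
    unknown = set(config) - {"settings", "datasets", "commands", "triggers"}
    if unknown:
        raise ValueError(f"unexpected top-level keys: {sorted(unknown)}")
    return config

example = '{"settings": {"name": "Demo"}, "commands": []}'
print(parse_config(example)["settings"]["name"])  # Demo
```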
## ⚙️ Recommended Generation Settings

Based on Qwen3's official recommendations for non-thinking mode:

| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-P | 0.8 |
| Top-K | 20 |
| Min-P | 0 |

> ℹ️ **Note:** This model was trained with thinking mode disabled for efficient structured JSON output. It will not produce `<think>` blocks.
## 💬 Example Prompts & Outputs

### Example 1: Basic Extraction

**Prompt:**

```
Extract all product names and prices from the current page
```

**Output:**

```json
{
  "settings": {
    "name": "Product Extraction"
  },
  "datasets": {
    "products": {
      "selector": ".product-item",
      "type": "list",
      "fields": {
        "name": {
          "selector": ".product-title",
          "type": "text"
        },
        "price": {
          "selector": ".product-price",
          "type": "text",
          "transform": "extractNumber"
        }
      }
    }
  },
  "commands": [
    {"action": "extract", "dataset": "products"}
  ]
}
```
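The `"transform": "extractNumber"` field above implies a numeric clean-up step applied to raw price text. The actual RTILA implementation is not shown in this card, so the regex version below is only an illustration of the idea:

```python
# Hypothetical sketch of an "extractNumber"-style transform; the real RTILA
# transform may behave differently (locale handling, negative numbers, etc.).
import re

def extract_number(text: str):
    """Pull the first numeric value out of a string like '$1,299.99'."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if not match:
        return None  # no number found in the raw text
    return float(match.group().replace(",", ""))

print(extract_number("$1,299.99"))  # 1299.99
```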
### Example 2: Pagination Loop

**Prompt:**

```
Scrape reviews from multiple pages, clicking "Next" until no more pages
```

**Output:**

```json
{
  "settings": {
    "name": "Review Scraper with Pagination"
  },
  "datasets": {
    "reviews": {
      "selector": ".review-card",
      "type": "list",
      "fields": {
        "author": {"selector": ".reviewer-name", "type": "text"},
        "rating": {"selector": ".star-rating", "type": "attribute", "attribute": "data-rating"},
        "text": {"selector": ".review-body", "type": "text"},
        "date": {"selector": ".review-date", "type": "text"}
      }
    }
  },
  "commands": [
    {
      "action": "loop",
      "while": {"exists": ".next-page:not(.disabled)"},
      "maxIterations": 50,
      "commands": [
        {"action": "extract", "dataset": "reviews", "append": true},
        {"action": "click", "selector": ".next-page"},
        {"action": "wait", "duration": 2000}
      ]
    }
  ]
}
```
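The loop command above repeats its body while the `while` condition holds, capped by `maxIterations` so a broken "Next" button cannot spin forever. A minimal sketch of those semantics (assumed from the config shape, not RTILA's actual engine code):

```python
# Sketch of the loop semantics: run `body` while `condition()` is true,
# bounded by max_iterations. This mirrors the config above, not real RTILA code.

def run_loop(condition, body, max_iterations: int = 50) -> int:
    """Execute body while condition holds; return how many iterations ran."""
    iterations = 0
    while iterations < max_iterations and condition():
        body()
        iterations += 1
    return iterations

# Toy usage: a fake paginator with three pages of reviews.
pages = ["page1", "page2", "page3"]
collected = []
run_loop(
    condition=lambda: len(collected) < len(pages),         # ".next-page" still exists
    body=lambda: collected.append(pages[len(collected)]),  # extract + click + wait
)
print(collected)  # ['page1', 'page2', 'page3']
```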
### Example 3: Database Integration

**Prompt:**

```
After extraction, save data to PostgreSQL and send Slack notification
```

**Output:**

```json
{
  "triggers": {
    "onComplete": [
      {
        "type": "database",
        "config": {
          "driver": "postgresql",
          "connection": "{{env.DATABASE_URL}}",
          "table": "scraped_data",
          "mode": "upsert",
          "uniqueKey": "id"
        }
      },
      {
        "type": "webhook",
        "config": {
          "url": "{{env.SLACK_WEBHOOK}}",
          "method": "POST",
          "body": {
            "text": "✅ Extraction complete: {{datasets.products.length}} items saved"
          }
        }
      }
    ]
  }
}
```
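Placeholders such as `{{env.DATABASE_URL}}` and `{{env.SLACK_WEBHOOK}}` are resolved from the environment at run time, which keeps secrets out of the configuration itself. How RTILA performs the substitution is not documented here; this is a sketch of the pattern using Python's `os.environ`:

```python
# Sketch of "{{env.NAME}}" substitution (an assumption about RTILA's
# behavior, shown with the process environment as the value source).
import os
import re

def substitute_env(template: str, env=None) -> str:
    """Replace {{env.NAME}} placeholders with values from an env mapping."""
    env = dict(os.environ) if env is None else env
    return re.sub(
        r"\{\{env\.(\w+)\}\}",
        lambda m: env.get(m.group(1), ""),  # unknown names become empty strings
        template,
    )

print(substitute_env("{{env.DB}}/scraped_data", {"DB": "postgres://db-host"}))
# postgres://db-host/scraped_data
```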
### Example 4: Knowledge Question

**Prompt:**

```
What is the difference between execute_script and run_script?
```

**Output:**

`execute_script` and `run_script` serve different purposes in RTILA:

- **`execute_script`**: Runs JavaScript synchronously in the page context. Returns immediately with the result. Best for quick DOM queries, getting values, or simple manipulations.
- **`run_script`**: Executes JavaScript asynchronously with full control flow. Supports `await`, can make fetch requests, and handles complex async operations. Returns a Promise.

Use `execute_script` for: reading values, checking conditions, simple DOM changes.
Use `run_script` for: API calls, complex async workflows, operations that need to wait.
## 🏋️ Training Details

| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen3-8B |
| Method | QLoRA (4-bit) |
| LoRA Rank | 128 |
| LoRA Alpha | 256 |
| Context Length | 2048 tokens |
| Training Examples | ~400 |
| Epochs | 5 (with early stopping) |
| Learning Rate | 2e-4 |
| Thinking Mode | Disabled |
### Optimizations for the Lite Version

- **Higher LoRA rank (128 vs 64)**: Compensates for the smaller base model
- **Higher learning rate (2e-4 vs 2e-5)**: Better convergence for smaller models
- **Longer context (2048 vs 1536)**: More headroom for complex configurations
- **Thinking mode disabled**: Eliminates `<think>` overhead for structured output
- **Rank-stabilized LoRA (rsLoRA)**: More stable training
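To see why raising the LoRA rank adds capacity: for a single weight matrix of shape `(d_out, d_in)`, a rank-`r` adapter trains `r * (d_in + d_out)` parameters. The 4096x4096 shape below is illustrative, not an exact Qwen3-8B layer size:

```python
# Illustration of LoRA adapter size: A is (r, d_in), B is (d_out, r), so the
# trainable parameters per matrix are r * (d_in + d_out). Shapes are examples.

def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters a rank-`rank` LoRA adds to one weight matrix."""
    return rank * (d_in + d_out)

base = lora_params(4096, 4096, 64)    # the lower rank mentioned above
lite = lora_params(4096, 4096, 128)   # the rank used for this Lite model
print(lite // base)  # 2 — doubling the rank doubles adapter capacity
```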
### Training Data

- Navigation & interaction patterns
- Data extraction configurations
- Logic & flow control
- Triggers & integrations
- Variables & substitution
- Advanced scripting
- Error handling
- Knowledge base Q&A
## 📝 System Prompt

For best results, use this system prompt:

```
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.

Your capabilities:
- Generate complete JSON configurations for web automation tasks
- Define datasets with selectors, properties, and transformations
- Configure navigation, extraction, loops, and conditionals
- Set up triggers for webhooks, databases, and integrations
- Explain RTILA concepts and best practices

When generating configurations:
- Always output valid JSON with proper structure
- Include 'settings', 'datasets', and 'commands' sections as needed
- Use appropriate selectors (CSS, XPath) for the target elements
- Apply transformations when data cleaning is required

When answering questions:
- Be concise and accurate
- Provide examples when helpful
- Reference specific RTILA features and commands
```
## 🔗 Model Family

| Model | Link | Best For |
|---|---|---|
| RTILA Assistant | [huggingface.co/rtila-corporation/rtila-assistant](https://huggingface.co/rtila-corporation/rtila-assistant) | Maximum quality |
| **RTILA Assistant Lite** (this model) | [huggingface.co/rtila-corporation/rtila-assistant-lite](https://huggingface.co/rtila-corporation/rtila-assistant-lite) | Mid-range devices |
| RTILA Assistant Mini | [huggingface.co/rtila-corporation/rtila-assistant-mini](https://huggingface.co/rtila-corporation/rtila-assistant-mini) | Mac M1 8GB, low VRAM |
## 📄 License

Apache 2.0

## 🙏 Acknowledgments

- Qwen3 by the Qwen team (base model family)
- Unsloth (fine-tuning tooling; base checkpoint: unsloth/Qwen3-8B)
## 📂 GGUF File List

| Filename | Size | Notes |
|---|---|---|
| `qwen3-8b.Q4_K_M.gguf` | 4.68 GB | Q4_K_M quant (recommended) |