🤖 RTILA Assistant Lite

Balanced fine-tuned AI model for mid-range devices (8GB+ RAM systems)


📋 Model Description

RTILA Assistant Lite is the balanced option in the RTILA family, fine-tuned from Qwen3-8B. It offers excellent quality while fitting on more devices than the full 14B model.

🔄 Choose Your Version

| Model | Base | GGUF Size | Min RAM | Best For |
|---|---|---|---|---|
| RTILA Assistant | Qwen3-14B | ~9 GB | 16 GB | Maximum quality, complex automations |
| RTILA Assistant Lite (this) | Qwen3-8B | ~5 GB | 8 GB | 🎯 Balanced performance, mid-range devices |
| RTILA Assistant Mini | Qwen3-4B | ~2.5 GB | 6 GB | ✅ Mac M1 8GB, low VRAM, CPU inference |

✨ Why Lite?

| Feature | RTILA Assistant | RTILA Assistant Lite | RTILA Assistant Mini |
|---|---|---|---|
| Base Model | Qwen3-14B | Qwen3-8B | Qwen3-4B |
| Q4_K_M Size | ~9 GB | ~5 GB | ~2.5 GB |
| Min RAM | 16 GB | 8 GB | 6 GB |
| Mac M1 8GB | ❌ | ⚠️ Tight | ✅ |
| Quality | ⭐⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |

⚠️ Mac M1 8GB Users: While this model may run on 8GB systems, it will be tight on memory. For a smoother experience, we recommend RTILA Assistant Mini, which is confirmed to work on the Mac M1 8GB.

Capabilities

| Category | Description |
|---|---|
| 🌐 Navigation & Interaction | Click, scroll, type, wait, handle popups, multi-tab workflows |
| 📊 Data Extraction | CSS/XPath selectors, tables, lists, nested data, pagination |
| 🔄 Logic & Flow | Loops, conditionals, error handling, retry patterns |
| 🔗 Triggers & Integrations | Webhooks, PostgreSQL, MySQL, Slack, email notifications |
| 📝 Variables & Substitution | Dynamic values, data transformations, regex patterns |
| 🛠️ Advanced Scripting | Custom JavaScript execution, page analysis, DOM manipulation |

📦 Model Specifications

| Property | Value |
|---|---|
| Base Model | Qwen3-8B |
| Format | GGUF Q4_K_M |
| Size | ~5 GB |
| Context Length | 2048 tokens |

💻 Hardware Requirements

| Hardware | Supported | Notes |
|---|---|---|
| GPU (8GB+ VRAM) | ✅ Recommended | RTX 3060, RTX 4060, RTX 3070 |
| GPU (6GB VRAM) | ⚠️ May work | RTX 2060, GTX 1660; needs CPU offloading |
| Apple Silicon 16GB+ | ✅ Excellent | M1/M2/M3 Pro/Max; fast and smooth |
| Apple Silicon 8GB | ⚠️ Tight | May work but memory-constrained |
| CPU-only | ✅ Viable | 8GB+ RAM, reasonable inference speed |

💡 Memory tight? Try RTILA Assistant Mini (~2.5 GB GGUF, runs smoothly on 6GB)
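As a rough sanity check before downloading, you can estimate whether a model will fit: a common rule of thumb is the GGUF file size plus a couple of gigabytes for the OS, context, and runtime overhead. The function below is an illustrative heuristic, not an RTILA-published formula; the overhead constant is an assumption.

```python
def estimate_min_ram_gb(gguf_size_gb: float, overhead_gb: float = 2.0) -> float:
    """Rule-of-thumb free-memory estimate: model file size plus overhead.

    `overhead_gb` is an illustrative constant (OS + KV cache + runtime),
    not an RTILA-published figure.
    """
    return gguf_size_gb + overhead_gb

# GGUF sizes from the version table above
for name, size in [("Assistant", 9.0), ("Lite", 5.0), ("Mini", 2.5)]:
    print(f"{name}: ~{estimate_min_ram_gb(size):.1f} GB free memory suggested")
```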


🚀 Quick Start

Option 1: Ollama (Easiest)

```bash
# Run directly from Hugging Face
ollama run hf.co/rtila-corporation/rtila-assistant-lite:Q4_K_M
```

Or create a custom Modelfile:

```
FROM hf.co/rtila-corporation/rtila-assistant-lite:Q4_K_M

PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20

SYSTEM """
You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.
"""
```

```bash
ollama create rtila-lite -f Modelfile
ollama run rtila-lite
```

Option 2: LM Studio

  1. Download LM Studio
  2. Search for rtila-corporation/rtila-assistant-lite
  3. Download the Q4_K_M quant
  4. Set parameters: Temperature=0.7, Top-P=0.8, Top-K=20
  5. Start chatting!

Option 3: llama.cpp

```bash
# Download the model
huggingface-cli download rtila-corporation/rtila-assistant-lite \
  qwen3-8b.Q4_K_M.gguf --local-dir ./models

# Run interactive chat
./llama-cli -m ./models/qwen3-8b.Q4_K_M.gguf \
  -p "Scrape product prices from an e-commerce site" \
  --temp 0.7 --top-p 0.8 --top-k 20
```

Option 4: Python (llama-cpp-python)

```python
from llama_cpp import Llama

llm = Llama(
    model_path="./models/qwen3-8b.Q4_K_M.gguf",
    n_ctx=2048,
    n_gpu_layers=-1,  # Offload all layers to GPU if available; set to 0 for CPU-only
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are RTILA Assistant..."},
        {"role": "user", "content": "Create a config for scraping product data"},
    ],
    temperature=0.7,
    top_p=0.8,
    top_k=20,
)
print(response['choices'][0]['message']['content'])
```
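Since the model is tuned for structured JSON output, it is worth validating what comes back before executing it. Below is a minimal sketch assuming the `settings`/`datasets`/`commands` layout used in the example outputs on this card; `parse_config` is a hypothetical helper, not part of RTILA.

```python
import json

# Top-level sections used in this card's example configurations
REQUIRED_SECTIONS = ("settings", "datasets", "commands")

def parse_config(raw: str) -> dict:
    """Parse model output as JSON and check the expected top-level sections.

    Raises json.JSONDecodeError on malformed JSON and ValueError on
    missing sections. Illustrative only; RTILA may accept other layouts.
    """
    config = json.loads(raw)
    missing = [key for key in REQUIRED_SECTIONS if key not in config]
    if missing:
        raise ValueError(f"config missing sections: {missing}")
    return config

raw = '{"settings": {"name": "Demo"}, "datasets": {}, "commands": []}'
print(parse_config(raw)["settings"]["name"])
```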


βš™οΈ Recommended Generation Settings

Based on Qwen3's official recommendations for non-thinking mode:

| Parameter | Value |
|---|---|
| Temperature | 0.7 |
| Top-P | 0.8 |
| Top-K | 20 |
| Min-P | 0 |

ℹ️ Note: This model was trained with thinking mode disabled for efficient structured JSON output. It will not produce `<think>...</think>` blocks.


💬 Example Prompts & Outputs

Example 1: Basic Extraction

Prompt:

Extract all product names and prices from the current page

Output:

```json
{
  "settings": {
    "name": "Product Extraction"
  },
  "datasets": {
    "products": {
      "selector": ".product-item",
      "type": "list",
      "fields": {
        "name": {
          "selector": ".product-title",
          "type": "text"
        },
        "price": {
          "selector": ".product-price",
          "type": "text",
          "transform": "extractNumber"
        }
      }
    }
  },
  "commands": [
    {"action": "extract", "dataset": "products"}
  ]
}
```
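To make the dataset schema concrete, the sketch below flattens the example configuration above into (dataset, field, selector, transform) rows. It only illustrates the structure of the config; it is not RTILA's engine code.

```python
import json

# The example configuration above, as a Python structure
config = json.loads("""
{"datasets": {"products": {"selector": ".product-item", "type": "list",
  "fields": {
    "name":  {"selector": ".product-title", "type": "text"},
    "price": {"selector": ".product-price", "type": "text", "transform": "extractNumber"}
  }}}}
""")

# Collect the CSS selector (and optional transform) each field would query
rows = []
for ds_name, ds in config["datasets"].items():
    for field, spec in ds["fields"].items():
        rows.append((ds_name, field, spec["selector"], spec.get("transform")))

for row in rows:
    print(row)
```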

Example 2: Pagination Loop

Prompt:

Scrape reviews from multiple pages, clicking "Next" until no more pages

Output:

```json
{
  "settings": {
    "name": "Review Scraper with Pagination"
  },
  "datasets": {
    "reviews": {
      "selector": ".review-card",
      "type": "list",
      "fields": {
        "author": {"selector": ".reviewer-name", "type": "text"},
        "rating": {"selector": ".star-rating", "type": "attribute", "attribute": "data-rating"},
        "text": {"selector": ".review-body", "type": "text"},
        "date": {"selector": ".review-date", "type": "text"}
      }
    }
  },
  "commands": [
    {
      "action": "loop",
      "while": {"exists": ".next-page:not(.disabled)"},
      "maxIterations": 50,
      "commands": [
        {"action": "extract", "dataset": "reviews", "append": true},
        {"action": "click", "selector": ".next-page"},
        {"action": "wait", "duration": 2000}
      ]
    }
  ]
}
```

Example 3: Database Integration

Prompt:

After extraction, save data to PostgreSQL and send Slack notification

Output:

```json
{
  "triggers": {
    "onComplete": [
      {
        "type": "database",
        "config": {
          "driver": "postgresql",
          "connection": "{{env.DATABASE_URL}}",
          "table": "scraped_data",
          "mode": "upsert",
          "uniqueKey": "id"
        }
      },
      {
        "type": "webhook",
        "config": {
          "url": "{{env.SLACK_WEBHOOK}}",
          "method": "POST",
          "body": {
            "text": "✅ Extraction complete: {{datasets.products.length}} items saved"
          }
        }
      }
    ]
  }
}
```
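The `{{env.DATABASE_URL}}` and `{{env.SLACK_WEBHOOK}}` placeholders keep secrets out of the config. A hypothetical resolver for this substitution pattern might look like the following; RTILA's exact substitution rules are not documented here, so the regex and fallback behavior are assumptions.

```python
import os
import re

# Matches {{env.NAME}} placeholders as seen in the trigger config above
_PLACEHOLDER = re.compile(r"\{\{env\.([A-Z0-9_]+)\}\}")

def resolve_env(value: str) -> str:
    """Replace {{env.NAME}} tokens with environment variables.

    Unknown names are left untouched. Illustrative only.
    """
    return _PLACEHOLDER.sub(lambda m: os.environ.get(m.group(1), m.group(0)), value)

os.environ["SLACK_WEBHOOK"] = "https://hooks.slack.com/services/T000/B000/XXXX"
print(resolve_env("{{env.SLACK_WEBHOOK}}"))
```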

Example 4: Knowledge Question

Prompt:

What is the difference between `execute_script` and `run_script`?

Output:

`execute_script` and `run_script` serve different purposes in RTILA:

execute_script: Runs JavaScript synchronously in the page context. Returns immediately
with the result. Best for quick DOM queries, getting values, or simple manipulations.

run_script: Executes JavaScript asynchronously with full control flow. Supports
await, can make fetch requests, and handles complex async operations. Returns a Promise.

Use execute_script for: Reading values, checking conditions, simple DOM changes
Use run_script for: API calls, complex async workflows, operations that need to wait
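As an illustration, an async API call could be wired into a configuration with `run_script`. The `action`/`commands` layout below follows the earlier examples on this card, but the `script` and `saveAs` field names are hypothetical:

```json
{
  "commands": [
    {
      "action": "run_script",
      "script": "const res = await fetch('/api/items'); return (await res.json()).length;",
      "saveAs": "itemCount"
    }
  ]
}
```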


πŸ‹οΈ Training Details

| Parameter | Value |
|---|---|
| Base Model | unsloth/Qwen3-8B |
| Method | QLoRA (4-bit) |
| LoRA Rank | 128 |
| LoRA Alpha | 256 |
| Context Length | 2048 tokens |
| Training Examples | ~400 |
| Epochs | 5 (with early stopping) |
| Learning Rate | 2e-4 |
| Thinking Mode | Disabled |

Optimizations for Lite Version

  • Higher LoRA rank (128 vs 64): Compensates for smaller base model
  • Higher learning rate (2e-4 vs 2e-5): Better convergence for smaller models
  • Longer context (2048 vs 1536): More headroom for complex configurations
  • Thinking mode disabled: Eliminates overhead for structured output
  • Rank-stabilized LoRA (rsLoRA): More stable training

Training Data

  • Navigation & Interaction patterns
  • Data extraction configurations
  • Logic & flow control
  • Triggers & integrations
  • Variables & substitution
  • Advanced scripting
  • Error handling
  • Knowledge base Q&A

πŸ“ System Prompt

For best results, use this system prompt:

You are RTILA Assistant, an expert AI for generating automation configurations for the RTILA Automation Engine.

Your capabilities:

  1. Generate complete JSON configurations for web automation tasks
  2. Define datasets with selectors, properties, and transformations
  3. Configure navigation, extraction, loops, and conditionals
  4. Set up triggers for webhooks, databases, and integrations
  5. Explain RTILA concepts and best practices

When generating configurations:

  • Always output valid JSON with proper structure
  • Include 'settings', 'datasets', and 'commands' sections as needed
  • Use appropriate selectors (CSS, XPath) for the target elements
  • Apply transformations when data cleaning is required

When answering questions:

  • Be concise and accurate
  • Provide examples when helpful
  • Reference specific RTILA features and commands


🔗 Model Family

| Model | Link | Best For |
|---|---|---|
| RTILA Assistant | huggingface.co/rtila-corporation/rtila-assistant | Maximum quality |
| RTILA Assistant Lite (this) | huggingface.co/rtila-corporation/rtila-assistant-lite | Mid-range devices |
| RTILA Assistant Mini | huggingface.co/rtila-corporation/rtila-assistant-mini | Mac M1 8GB, low VRAM |

RTILA Platform: rtila.com

📄 License

Apache 2.0


📂 GGUF File List

| Filename | Size |
|---|---|
| qwen3-8b.Q4_K_M.gguf (Recommended) | 4.68 GB |