📋 Model Description


library_name: transformers
pipeline_tag: text-generation
license: mit
language:
  • en
base_model:
  • Qwen/Qwen3-30B-A3B-Thinking-2507
tags:
  • agent
  • open-source
  • miromind
  • deep-research

MiroThinker-v1.5-30B GGUF Models

Model Generation Details

This model was generated using llama.cpp at commit 05fa625ea.


Quantization Beyond the IMatrix

I've been experimenting with a new quantization approach that selectively elevates the precision of key layers beyond what the default IMatrix configuration provides.

In my testing, standard IMatrix quantization underperforms at lower bit depths, especially with Mixture of Experts (MoE) models. To address this, I'm using the --tensor-type option in llama.cpp to manually "bump" important layers to higher precision. You can see the implementation here:
👉 Layer bumping with llama.cpp

While this does increase model file size, it significantly improves precision for a given quantization level.
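As a rough illustration of the layer-bumping idea, the sketch below assembles a `llama-quantize` command line that applies per-tensor overrides through `--tensor-type`. The tensor patterns, target types, and file names are illustrative assumptions, not the settings used for these files; check `llama-quantize --help` in your build for the exact syntax it accepts.

```python
from typing import Dict, List


def build_quantize_command(
    src: str,
    dst: str,
    base_type: str,
    bumps: Dict[str, str],
    imatrix: str = "",
) -> List[str]:
    """Assemble an argv list for llama-quantize with per-tensor overrides."""
    cmd = ["llama-quantize"]
    if imatrix:
        cmd += ["--imatrix", imatrix]
    # One --tensor-type flag per override, written as PATTERN=TYPE.
    for pattern, qtype in bumps.items():
        cmd += ["--tensor-type", f"{pattern}={qtype}"]
    # Positional arguments: input file, output file, base quantization type.
    cmd += [src, dst, base_type]
    return cmd


command = build_quantize_command(
    src="model-bf16.gguf",            # hypothetical input file
    dst="model-q4_k_m-bumped.gguf",   # hypothetical output file
    base_type="Q4_K_M",
    bumps={"attn_v": "q6_k", "ffn_down": "q6_k"},  # assumed example overrides
    imatrix="model-imatrix.gguf",     # hypothetical imatrix file
)
print(" ".join(command))
```

The function only builds the argument list, so you can inspect it before passing it to `subprocess.run`.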

I'd love your feedback—have you tried this? How does it perform for you?






MiroThinker

Introduction

MiroThinker v1.5 is the world-leading search agent designed to advance tool-augmented reasoning and information-seeking capabilities.

Unlike previous agents that scale only model size or context length, MiroThinker introduces interactive scaling at the agent level, systematically training the agent to handle deeper and more frequent agent–environment interactions as a third dimension of performance improvement. Interactive scaling leverages environment feedback and external information acquisition to correct errors and refine trajectories.

Empirical results demonstrate the effectiveness of this interactive scaling. Performance across several benchmarks improves predictably as the agent engages in increasingly deep and frequent interactions with its environment.


Key Features

  • MiroThinker v1.5 supports a 256K context window, long-horizon reasoning, and deep multi-step analysis.
  • Handles up to 400 tool calls per task — a substantial improvement over previous open-source research agents.
  • Released in 30B and 235B parameter scales, accompanied by a comprehensive suite of tools and workflows to flexibly support diverse research settings and compute budgets.
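To make the tool-call budget concrete, here is a minimal sketch of an agent loop that stops once a fixed number of tool calls has been spent. The `call_model` and `run_tool` callables, the stub implementations, and the message layout are hypothetical stand-ins, not MiroThinker's actual framework.

```python
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]


def run_agent(
    question: str,
    call_model: Callable[[List[Message]], str],
    run_tool: Callable[[str], Optional[str]],
    max_tool_calls: int = 400,
) -> str:
    """Alternate model turns and tool results until the model stops or the budget is spent."""
    messages: List[Message] = [{"role": "user", "content": question}]
    tool_calls = 0
    while True:
        reply = call_model(messages)
        messages.append({"role": "assistant", "content": reply})
        if tool_calls >= max_tool_calls:
            # Budget exhausted: return the last reply without executing another tool.
            return reply
        observation = run_tool(reply)  # None means the reply contained no tool call
        if observation is None:
            return reply
        tool_calls += 1
        # Tool results go back to the model as the next user message.
        messages.append({"role": "user", "content": observation})


def fake_model(messages: List[Message]) -> str:
    """Stub model: asks for a tool twice, then answers."""
    tool_results = sum(1 for m in messages if m["role"] == "user") - 1
    return "FINAL: done" if tool_results >= 2 else "CALL: search"


def fake_tool(reply: str) -> Optional[str]:
    """Stub tool runner: returns a result only when the reply requests a tool."""
    return "search result" if reply.startswith("CALL:") else None


answer = run_agent("example question", fake_model, fake_tool, max_tool_calls=400)
print(answer)
```

A production loop would likely ask the model for a final answer when the budget runs out rather than returning an unexecuted tool request.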

| Agent Name | Base Agent | Max Context | Max Tool Calls | HF Link |
| --- | --- | --- | --- | --- |
| MiroThinker-v1.5-30B | Qwen3-30B-A3B-Thinking-2507 | 256K | 400 | 🤗 link |
| MiroThinker-v1.5-235B | Qwen3-235B-A22B-Thinking-2507 | 256K | 400 | 🤗 link |

MiroThinker v1.5 demonstrates strong general-research performance across a broad range of benchmarks, achieving 39.2%, 69.8%, 71.5%, and 80.8% on HLE-Text, BrowseComp, BrowseComp-ZH, and GAIA-Val-165, respectively. These results surpass previous open-source agents and set a new world-leading result on BrowseComp.


More details can be found in our technical report (coming soon).

Online Demo

You are welcome to try our online demo, DR MiroMind, which offers an agentic general QA experience better than OpenAI DeepResearch.


Note: This demo is not intended for BrowseComp evaluation. Each query is limited to 100 tool calls for latency and stability. BrowseComp involves long-horizon tasks that typically require over 200 tool calls for our agent, which is outside the scope of this demo.

Performance

To prevent potential information leakage (e.g., searching benchmark answers from HuggingFace), access to HuggingFace has been explicitly disabled in these tools.

We further perform canary string testing on the tool outputs of all trajectories and disregard any trajectory found to be contaminated, treating it as an incorrect answer.
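The scoring rule described above can be sketched as follows. The canary value and the trajectory layout (`correct`, `tool_outputs`) are assumptions made for illustration; real benchmarks publish their own canary strings, and the actual evaluation code may differ.

```python
from typing import Any, Dict, List

# Hypothetical marker used only for this example.
CANARY = "EXAMPLE-CANARY-GUID-0000"


def is_contaminated(tool_outputs: List[str], canary: str = CANARY) -> bool:
    """Return True if any tool output contains the canary string."""
    return any(canary in output for output in tool_outputs)


def score(trajectories: List[Dict[str, Any]]) -> float:
    """Accuracy in which contaminated trajectories count as incorrect."""
    if not trajectories:
        return 0.0
    correct = sum(
        1
        for t in trajectories
        if t["correct"] and not is_contaminated(t["tool_outputs"])
    )
    return correct / len(trajectories)


runs = [
    {"correct": True, "tool_outputs": ["clean page text"]},
    {"correct": True, "tool_outputs": [f"leaked answers {CANARY}"]},  # contaminated
    {"correct": False, "tool_outputs": ["clean page text"]},
    {"correct": True, "tool_outputs": []},
]
print(score(runs))  # 0.5
```

The contaminated run stays in the denominator, so it lowers the score instead of being silently dropped.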



Quick Start

Please refer to our GitHub repository for installation instructions, examples, and full documentation:

👉 https://github.com/MiroMindAI/MiroThinker

Local Deployment

It is recommended to use SGLang or vLLM for deploying the agent:

# SGLang
python -m sglang.launch_server --model-path miromind-ai/MiroThinker-v1.5-235B --tp 8 --host 0.0.0.0 --port 1234

# vLLM

vllm serve miromind-ai/MiroThinker-v1.5-235B --tensor-parallel-size 8 --max-model-len 262144 --enable-reasoning

For optimal performance in agentic tasks, we recommend the following inference parameters:

temperature: 1.0
top_p: 0.95
repetition_penalty: 1.05
max_context_length: 262144
max_tokens: 16384
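As a hedged sketch, the sampling values above could be placed in the body of an OpenAI-compatible chat completion request like this. It assumes the server accepts `repetition_penalty` as an extra field (it is not part of the OpenAI schema), and it leaves the context length out because that is normally fixed when the server is launched, as with `--max-model-len` in the vLLM command. The model name and message are placeholders.

```python
import json
from typing import Any, Dict, List

# Sampling values copied from the recommendation above.
SAMPLING: Dict[str, Any] = {
    "temperature": 1.0,
    "top_p": 0.95,
    "repetition_penalty": 1.05,  # assumed to be accepted as a server-side extension
    "max_tokens": 16384,
}


def build_chat_request(model: str, messages: List[Dict[str, str]]) -> Dict[str, Any]:
    """Build a JSON body for POST /v1/chat/completions."""
    body: Dict[str, Any] = {"model": model, "messages": messages}
    body.update(SAMPLING)
    return body


payload = build_chat_request(
    "MiroThinker",  # placeholder model name
    [{"role": "user", "content": "Hello"}],
)
print(json.dumps(payload, indent=2))
```

With the `openai` Python client, the non-standard field would typically be passed through `extra_body` rather than as a named argument.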

Recommended System Prompt

We use this unified XML-wrapped JSON format to describe and organize all tools. If you have additional tools, please document them using the same structure and formatting to ensure consistent parsing, compatibility, and optimal performance across the environment.


System prompt example:

You are MiroThinker, an advanced AI assistant developed by MiroMind.

In this environment you have access to a set of tools you can use to answer the user's question.

You only have access to the tools provided below. You can only use one tool per message, and will receive the result of that tool in the user's next response. You use tools step-by-step to accomplish a given task, with each tool-use informed by the result of the previous tool-use. Today is: {today_date}

Tool-Use Formatting Instructions

Tool-use is formatted using XML-style tags. The tool-use is enclosed in <use_mcp_tool></use_mcp_tool> and each parameter is similarly enclosed within its own set of tags.

The Model Context Protocol (MCP) connects to servers that provide additional tools and resources to extend your capabilities. You can use the server's tools via the use_mcp_tool.

Description:
Request to use a tool provided by a MCP server. Each MCP server can provide multiple tools with different capabilities. Tools have defined input schemas that specify required and optional parameters.

Parameters:

  • server_name: (required) The name of the MCP server providing the tool
  • tool_name: (required) The name of the tool to execute
  • arguments: (required) A JSON object containing the tool's input parameters, following the tool's input schema, quotes within string must be properly escaped, ensure it's valid JSON

Usage:
<use_mcp_tool>
<server_name>server name here</server_name>
<tool_name>tool name here</tool_name>
<arguments>
{
"param1": "value1",
"param2": "value2 \"escaped string\""
}
</arguments>
</use_mcp_tool>

Important Notes:

  • Tool-use must be placed at the end of your response, top-level, and not nested within other tags.
  • Always adhere to this format for the tool use to ensure proper parsing and execution.

String and scalar parameters should be specified as is, while lists and objects should use JSON format. Note that spaces for string values are not stripped. The output is not expected to be valid XML and is parsed with regular expressions.
Here are the functions available in JSONSchema format:

Server name: tool-python

Tool name: create_sandbox

Description: Create a linux sandbox.

Args:
timeout: Time in seconds before the sandbox is automatically shutdown. The default is 600 seconds.

Returns:
The id of the newly created sandbox. You should use this sandbox_id to run other tools in the sandbox.

Input JSON schema: {'properties': {'timeout': {'default': 600, 'title': 'Timeout', 'type': 'integer'}}, 'title': 'create_sandboxArguments', 'type': 'object'}

Tool name: run_python_code

Description: Run python code in an interpreter and return the execution result.

Args:
code_block: The python code to run.
sandbox_id: The id of the sandbox to run the code in. Reuse existing sandboxes whenever possible. To create a new sandbox, use tool create_sandbox.

Returns:
A result of the command execution, format like (stderr=..., stdout=..., exit_code=..., error=...)

Input JSON schema: {'properties': {'code_block': {'title': 'code_block', 'type': 'string'}, 'sandbox_id': {'title': 'Sandbox Id', 'type': 'string'}}, 'required': ['code_block', 'sandbox_id'], 'title': 'run_python_codeArguments', 'type': 'object'}

Server name: search_and_scrape_webpage

Tool name: google_search

Description: Tool to perform web searches via Serper API and retrieve rich results.

It is able to retrieve organic search results, people also ask,
related searches, and knowledge graph.

Args:
q: Search query string
gl: Optional region code for search results in ISO 3166-1 alpha-2 format (e.g., 'us')
hl: Optional language code for search results in ISO 639-1 format (e.g., 'en')
location: Optional location for search results (e.g., 'SoHo, New York, United States', 'California, United States')
num: Number of results to return (default: 10)
tbs: Time-based search filter ('qdr:h' for past hour, 'qdr:d' for past day, 'qdr:w' for past week, 'qdr:m' for past month, 'qdr:y' for past year)
page: Page number of results to return (default: 1)
autocorrect: Whether to autocorrect spelling in query

Returns:
Dictionary containing search results and metadata.

Input JSON schema: {'properties': {'q': {'title': 'Q', 'type': 'string'}, 'gl': {'default': 'us', 'title': 'Gl', 'type': 'string'}, 'hl': {'default': 'en', 'title': 'Hl', 'type': 'string'}, 'location': {'default': None, 'title': 'Location', 'type': 'string'}, 'num': {'default': None, 'title': 'Num', 'type': 'integer'}, 'tbs': {'default': None, 'title': 'Tbs', 'type': 'string'}, 'page': {'default': None, 'title': 'Page', 'type': 'integer'}, 'autocorrect': {'default': None, 'title': 'Autocorrect', 'type': 'boolean'}}, 'required': ['q'], 'title': 'google_searchArguments', 'type': 'object'}

Server name: jina_scrape_llm_summary

Tool name: scrape_and_extract_info

Description: Scrape content from a URL and extract specific types of information using LLM.

Args:
url (str): The URL to scrape content from
info_to_extract (str): The specific types of information to extract (usually a question)
custom_headers (Dict[str, str]): Additional headers to include in the scraping request

Returns:
Dict[str, Any]: A dictionary containing:
- success (bool): Whether the operation was successful
- url (str): The original URL
- extracted_info (str): The extracted information
- error (str): Error message if the operation failed
- scrape_stats (Dict): Statistics about the scraped content
- model_used (str): The model used for summarization
- tokens_used (int): Number of tokens used (if available)

Input JSON schema: {'properties': {'url': {'title': 'Url', 'type': 'string'}, 'info_to_extract': {'title': 'Info To Extract', 'type': 'string'}, 'custom_headers': {'additionalProperties': {'type': 'string'}, 'default': None, 'title': 'Custom Headers', 'type': 'object'}}, 'required': ['url', 'info_to_extract'], 'title': 'scrape_and_extract_infoArguments', 'type': 'object'}

General Objective

You accomplish a given task iteratively, breaking it down into clear steps and working through them methodically.
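For reference, a tool call in the format this prompt describes can be rendered and checked with a few lines of Python. This is only a sketch: the helper name `format_tool_call` is hypothetical, the tag names are assumed to be the underscore forms `use_mcp_tool`, `server_name`, and `tool_name`, and the search arguments are made up for the example.

```python
import json
import re
from typing import Any, Dict


def format_tool_call(server_name: str, tool_name: str, arguments: Dict[str, Any]) -> str:
    """Render one tool call as XML-style tags around a JSON arguments object."""
    # json.dumps escapes embedded double quotes, which the prompt requires.
    body = json.dumps(arguments, ensure_ascii=False, indent=2)
    return (
        "<use_mcp_tool>\n"
        f"<server_name>{server_name}</server_name>\n"
        f"<tool_name>{tool_name}</tool_name>\n"
        f"<arguments>\n{body}\n</arguments>\n"
        "</use_mcp_tool>"
    )


arguments = {"q": 'MiroThinker "interactive scaling"', "num": 5}
call = format_tool_call("search_and_scrape_webpage", "google_search", arguments)
print(call)

# Round-trip check: the arguments block must parse back to the same object.
match = re.search(r"<arguments>(.*?)</arguments>", call, re.DOTALL)
parsed = json.loads(match.group(1))
print(parsed == arguments)
```

Because the output is parsed with regular expressions rather than an XML parser, only the JSON inside `<arguments>` needs strict escaping.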

Minimal Runnable Example

The following example shows how to run an MCP-style tool-calling workflow, including system prompt generation, agent invocation, tool execution, and final response generation.

Before running the script, make sure to set the required environment variables:

export OPENAI_API_KEY="your-api-key-here"
export BASE_URL="https://your-agent-endpoint.example.com/v1"


Python code example:

import json
import os
import inspect
import re
from openai import OpenAI
from json_repair import repair_json


def get_weather(location: str, unit: str = "celsius") -> str:
    """
    Get weather information for a specified location (simulated)

    Args:
        location: Location name
        unit: Temperature unit, either celsius or fahrenheit

    Returns:
        JSON string with weather information
    """
    weather_data = {
        "London": {"temperature": 15, "condition": "sunny", "humidity": 45},
        "New York": {"temperature": 20, "condition": "cloudy", "humidity": 60},
        "Tokyo": {"temperature": 25, "condition": "rainy", "humidity": 75},
    }
    weather = weather_data.get(location, {"temperature": 18, "condition": "unknown", "humidity": 50})
    if unit == "fahrenheit":
        weather["temperature"] = weather["temperature"] * 9/5 + 32
        weather["unit"] = "°F"
    else:
        weather["unit"] = "°C"
    return json.dumps(weather, ensure_ascii=False)


def calculate(expression: str) -> str:
    """
    Calculate a mathematical expression

    Args:
        expression: Mathematical expression, e.g., "2 + 3 * 4"

    Returns:
        Calculation result
    """
    try:
        # NOTE: eval executes arbitrary Python; acceptable for a local demo only.
        result = eval(expression)
        return json.dumps({"result": result, "expression": expression}, ensure_ascii=False)
    except Exception as e:
        return json.dumps({"error": str(e)}, ensure_ascii=False)


# Tool definitions in OpenAI function-calling format.
tools = [
    {"type": "function", "function": {"name": "get_weather", "parameters": {"type": "object", "properties": {"location": {"type": "string", "description": "Location name"}, "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit, default is celsius"}}, "required": ["location"]}}},
    {"type": "function", "function": {"name": "calculate", "parameters": {"type": "object", "properties": {"expression": {"type": "string", "description": "Mathematical expression to calculate, e.g., '2 + 3 * 4'"}}, "required": ["expression"]}}}
]

# Maps tool names to the Python callables that implement them.
available_functions = {"get_weather": get_weather, "calculate": calculate}


def parse_mcp_tool_call(response_text: str):
    """Parse MCP-style tool call from model response. Returns first tool call or None."""
    match = re.search(r'<use_mcp_tool>(.*?)</use_mcp_tool>', response_text, re.DOTALL)
    if not match:
        return None
    content = match.group(1)
    server_match = re.search(r'<server_name>(.*?)</server_name>', content, re.DOTALL)
    tool_match = re.search(r'<tool_name>(.*?)</tool_name>', content, re.DOTALL)
    args_match = re.search(r'<arguments>(.*?)</arguments>', content, re.DOTALL)
    server_name = server_match.group(1).strip() if server_match else None
    tool_name = tool_match.group(1).strip() if tool_match else None
    if args_match:
        try:
            arguments = json.loads(args_match.group(1).strip())
        except json.JSONDecodeError as e:
            print(f"⚠️ Warning: Failed to parse arguments JSON: {e}, attempting to repair...")
            try:
                repaired = repair_json(args_match.group(1).strip())
                arguments = json.loads(repaired)
                print("✅ Successfully repaired JSON")
            except Exception as repair_error:
                print(f"❌ Failed to repair JSON: {repair_error}")
                arguments = {}
    else:
        arguments = {}
    if server_name and tool_name:
        return {"server_name": server_name, "tool_name": tool_name, "arguments": arguments}
    return None


def generate_mcp_system_prompt(openai_tools: list, available_functions: dict = None, server_name: str = "default", date: str = "2025-11-27") -> str:
    """Generate MCP-style system prompt from OpenAI tools format."""
    prefix = f"""You are MiroThinker, an advanced AI assistant developed by MiroMind.

In this environment you have access to a set of tools you can use to answer the user's question.

You only have access to the tools provided below. You can only use one tool per message, and will receive the result of that tool in the user's next response. You use tools step-by-step to accomplish a given task, with each tool-use informed by the result of the previous tool-use. Today is: {date}

# Tool-Use Formatting Instructions

Tool-use is formatted using XML-style tags. The tool-use is enclosed in <use_mcp_tool></use_mcp_tool> and each parameter is similarly enclosed within its own set of tags.

The Model Context Protocol (MCP) connects to servers that provide additional tools and resources to extend your capabilities. You can use the server's tools via the use_mcp_tool.

Description:
Request to use a tool provided by a MCP server. Each MCP server can provide multiple tools with different capabilities. Tools have defined input schemas that specify required and optional parameters.

Parameters:

- server_name: (required) The name of the MCP server providing the tool
- tool_name: (required) The name of the tool to execute
- arguments: (required) A JSON object containing the tool's input parameters, following the tool's input schema, quotes within string must be properly escaped, ensure it's valid JSON

Usage:
<use_mcp_tool>
<server_name>server name here</server_name>
<tool_name>tool name here</tool_name>
<arguments>
{{
"param1": "value1",
"param2": "value2 \\"escaped string\\""
}}
</arguments>
</use_mcp_tool>

Important Notes:

- Tool-use must be placed at the end of your response, top-level, and not nested within other tags.
- Always adhere to this format for the tool use to ensure proper parsing and execution.

String and scalar parameters should be specified as is, while lists and objects should use JSON format. Note that spaces for string values are not stripped. The output is not expected to be valid XML and is parsed with regular expressions.
Here are the functions available in JSONSchema format:

## Server name: {server_name}

"""
    tools_section = []
    for i, tool in enumerate(openai_tools):
        if tool.get("type") == "function":
            func = tool["function"]
            tool_name = func["name"]
            # Prefer the function's docstring; fall back to the schema description.
            func_obj = (available_functions or {}).get(tool_name)
            docstring = inspect.getdoc(func_obj) if func_obj else None
            full_description = docstring or func.get("description", "")
            if i > 0:
                tools_section.append("\n")
            tools_section.append(f"### Tool name: {tool_name}\nDescription: {full_description}\n\nInput JSON schema: {json.dumps(func['parameters'], ensure_ascii=False)}\n")
    suffix = "\n# General Objective\n\nYou accomplish a given task iteratively, breaking it down into clear steps and working through them methodically."
    return prefix + ''.join(tools_section) + suffix


def run_conversation(user_query: str, model: str = "MiroThinker"):
    """Run a complete conversation with tool calling"""
    system_prompt = generate_mcp_system_prompt(openai_tools=tools, available_functions=available_functions, server_name="My-Tools", date="2025-12-01")
    client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY", "your-api-key-here"), base_url=os.environ.get("BASE_URL", "your-base-url-here"))
    print(f"\n{'='*60}\nUser Query: {user_query}\n{'='*60}\n")
    messages = [{'role': 'system', 'content': system_prompt}, {"role": "user", "content": user_query}]
    print("📤 Sending request to model...")
    response = client.chat.completions.create(model=model, messages=messages)
    response_message = response.choices[0].message
    response_content = response_message.content
    tool_call = parse_mcp_tool_call(response_content)
    print(f"📝 Model response:\n{response_content}\n")
    messages.append(response_message)
    if tool_call:
        server_name = tool_call["server_name"]
        tool_name = tool_call["tool_name"]
        function_args = tool_call["arguments"]
        print(f"\n🔧 Model decided to call tool:\n - Server: {server_name}\n Tool: {tool_name}\n Args: {json.dumps(function_args, ensure_ascii=False)}")
        # Look up the tool by name and call it with the parsed arguments.
        function_response = available_functions[tool_name](**function_args)
        print(f" Result: {function_response}\n")
        # The tool result is returned to the model as the next user message.
        messages.append({"role": "user", "content": function_response})
        print("📤 Requesting model to generate final response based on tool results...\n")
        second_response = client.chat.completions.create(model=model, messages=messages)
        final_message = second_response.choices[0].message.content
        print(f"💬 Final Response:\n{final_message}\n")
        return final_message
    else:
        print(f"💬 Model Response (no tool calls):\n{response_message.content}\n")
        return response_message.content


def main():
    """Run multiple examples"""
    run_conversation("What's the weather like in London?")
    # run_conversation("Calculate (25 + 15) * 3 - 10")


if __name__ == "__main__":
    main()
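The `calculate` tool in the example above relies on `eval`, which will execute any Python the model sends. If you expose such a tool beyond a local demo, a restricted evaluator is one option. The sketch below is an assumption-laden alternative, not part of the original example: `safe_calculate` is a hypothetical helper that accepts only numeric literals and basic arithmetic operators.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelisted operators; exponentiation is left out to avoid huge computations.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def safe_calculate(expression: str) -> Number:
    """Evaluate +, -, *, /, % on numeric literals; reject everything else."""

    def evaluate(node: ast.AST) -> Number:
        if isinstance(node, ast.Expression):
            return evaluate(node.body)
        if isinstance(node, ast.Constant):
            value = node.value
            # bool is a subclass of int, so exclude it explicitly.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                return value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](evaluate(node.left), evaluate(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](evaluate(node.operand))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval"))


print(safe_calculate("(25 + 15) * 3 - 10"))  # 110
```

Names, function calls, and attribute access all raise `ValueError`, so an input such as `__import__('os')` is rejected instead of executed.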

License

MiroThinker v1.5 is released under the MIT License.

Citation

If you find this project useful in your research, please consider citing:

@article{miromind2025mirothinker,
  title={MiroThinker: Pushing the Performance Boundaries of Open-Source Research Agents via Model, Context, and Interactive Scaling},
  author={MiroMind Team and Bai, Song and Bing, Lidong and Chen, Carson and Chen, Guanzheng and Chen, Yuntao and Chen, Zhe and Chen, Ziyi and Dong, Xuan and others},
  journal={arXiv preprint arXiv:2511.11793},
  year={2025}
}

Contact Us

MiroThinker is developed by the MiroMind AI Team.
If you would like to leave us a message, feel free to get in touch.
In addition to GitHub, Discord, WeChat, and RedNote, you can also reach us via email at [email protected].


🚀 If you find these models useful

Help me test my AI-Powered Quantum Network Monitor Assistant with quantum-ready security checks:

👉 Quantum Network Monitor

The full open-source code for the Quantum Network Monitor service is available in my GitHub repos (those with NetworkMonitor in the name): Source Code Quantum Network Monitor. You will also find the code I use to quantize the models, GGUFModelBuilder, if you want to do it yourself.

💬 How to test:
Choose an AI assistant type:
- TurboLLM (GPT-4.1-mini)
- HugLLM (Hugging Face open-source models)
- TestLLM (Experimental CPU-only)

What I’m Testing

I’m pushing the limits of small open-source models for AI network monitoring, specifically:
  • Function calling against live network services
  • How small can a model go while still handling:
    - Automated Nmap security scans
    - Quantum-readiness checks
    - Network monitoring tasks

🟡 TestLLM – Current experimental model (llama.cpp on 2 CPU threads in a Hugging Face Docker space):

  • Zero-configuration setup
  • ⏳ 30s load time (slow inference but no API costs). No token limit, as the cost is low.
  • 🔧 Help wanted! If you’re into edge-device AI, let’s collaborate!

Other Assistants

🟢 TurboLLM – Uses gpt-4.1-mini :
  • It performs very well, but unfortunately OpenAI charges per token, so token usage is limited.
  • Create custom cmd processors to run .NET code on Quantum Network Monitor Agents
  • Real-time network diagnostics and monitoring
  • Security Audits
  • Penetration testing (Nmap/Metasploit)

🔵 HugLLM – Latest Open-source models:

  • 🌐 Runs on the Hugging Face Inference API. Performs pretty well using the latest models hosted on Novita.

💡 Example commands you could test:

  1. "Give me info on my website's SSL certificate"
  2. "Check if my server is using quantum-safe encryption for communication"
  3. "Run a comprehensive security audit on my server"
  4. "Create a cmd processor to .. (whatever you want)" Note: you need to install a Quantum Network Monitor Agent to run the .NET code on. This is a very flexible and powerful feature. Use with caution!

Final Word

I fund the servers used to create these model files, run the Quantum Network Monitor service, and pay for inference from Novita and OpenAI—all out of my own pocket. All the code behind the model creation and the Quantum Network Monitor project is open source. Feel free to use whatever you find helpful.

If you appreciate the work, please consider buying me a coffee ☕. Your support helps cover service costs and allows me to raise token limits for everyone.

I'm also open to job opportunities or sponsorship.

Thank you! 😊

📂 GGUF File List

| 📁 Filename | Tags | 📦 Size | ⚡ Download |
| --- | --- | --- | --- |
| MiroThinker-v1.5-30B-imatrix.gguf | LFS | 116.38 MB | Download |
| MiroThinker-v1.5-30B-iq1_m.gguf | LFS | 10.2 GB | Download |
| MiroThinker-v1.5-30B-iq1_s.gguf | LFS | 8.9 GB | Download |
| MiroThinker-v1.5-30B-iq2_m.gguf | LFS Q2 | 12.01 GB | Download |
| MiroThinker-v1.5-30B-iq2_s.gguf | LFS Q2 | 11.52 GB | Download |
| MiroThinker-v1.5-30B-iq2_xs.gguf | LFS Q2 | 11.8 GB | Download |
| MiroThinker-v1.5-30B-iq2_xxs.gguf | LFS Q2 | 9.67 GB | Download |
| MiroThinker-v1.5-30B-iq3_m.gguf | LFS Q3 | 14.58 GB | Download |
| MiroThinker-v1.5-30B-iq3_s.gguf | LFS Q3 | 14.58 GB | Download |
| MiroThinker-v1.5-30B-iq3_xs.gguf | LFS Q3 | 13.5 GB | Download |
| MiroThinker-v1.5-30B-iq3_xxs.gguf | LFS Q3 | 13.4 GB | Download |
| MiroThinker-v1.5-30B-iq4_nl.gguf | LFS Q4 | 16.12 GB | Download |
| MiroThinker-v1.5-30B-iq4_xs.gguf | LFS Q4 | 15.24 GB | Download |
| MiroThinker-v1.5-30B-q2_k_l.gguf | LFS Q2 | 14.73 GB | Download |
| MiroThinker-v1.5-30B-q2_k_m.gguf | LFS Q2 | 14.58 GB | Download |
| MiroThinker-v1.5-30B-q2_k_s.gguf | LFS Q2 | 14.51 GB | Download |
| MiroThinker-v1.5-30B-q3_k_l.gguf | LFS Q3 | 17.7 GB | Download |
| MiroThinker-v1.5-30B-q3_k_m.gguf | LFS Q3 | 17.56 GB | Download |
| MiroThinker-v1.5-30B-q3_k_s.gguf | LFS Q3 | 17.48 GB | Download |
| MiroThinker-v1.5-30B-q4_0.gguf | Recommended LFS Q4 | 16.04 GB | Download |
| MiroThinker-v1.5-30B-q4_0_l.gguf | LFS Q4 | 16.33 GB | Download |
| MiroThinker-v1.5-30B-q4_1.gguf | LFS Q4 | 17.82 GB | Download |
| MiroThinker-v1.5-30B-q4_1_l.gguf | LFS Q4 | 18.07 GB | Download |
| MiroThinker-v1.5-30B-q4_k_l.gguf | LFS Q4 | 17.73 GB | Download |
| MiroThinker-v1.5-30B-q4_k_m.gguf | LFS Q4 | 17.59 GB | Download |
| MiroThinker-v1.5-30B-q4_k_s.gguf | LFS Q4 | 16.93 GB | Download |
| MiroThinker-v1.5-30B-q5_0.gguf | LFS Q5 | 19.59 GB | Download |
| MiroThinker-v1.5-30B-q5_0_l.gguf | LFS Q5 | 19.81 GB | Download |
| MiroThinker-v1.5-30B-q5_1.gguf | LFS Q5 | 21.37 GB | Download |
| MiroThinker-v1.5-30B-q5_1_l.gguf | LFS Q5 | 21.55 GB | Download |
| MiroThinker-v1.5-30B-q5_k_l.gguf | LFS Q5 | 20.92 GB | Download |
| MiroThinker-v1.5-30B-q5_k_m.gguf | LFS Q5 | 20.78 GB | Download |
| MiroThinker-v1.5-30B-q5_k_s.gguf | LFS Q5 | 20.43 GB | Download |
| MiroThinker-v1.5-30B-q6_k_l.gguf | LFS Q6 | 23.51 GB | Download |
| MiroThinker-v1.5-30B-q6_k_m.gguf | LFS Q6 | 23.37 GB | Download |
| MiroThinker-v1.5-30B-q8_0.gguf | LFS Q8 | 30.25 GB | Download |