πŸ“‹ Model Description


license: apache-2.0 language:
  • en
base_model:
  • Menlo/Lucy-128k
pipeline_tag: text-generation library_name: transformers

Lucy: Edgerunning Agentic Web Search on Mobile with a 1.7B model.

GitHub</a>
License</a>


Lucy-128k

Authors: Alan Dao, Bach Vu Dinh, Alex Nguyen, Norapat Buppodom

!image/gif

Overview

Lucy is a compact but capable 1.7B model focused on agentic web search and lightweight browsing. Built on Qwen3-1.7B, Lucy inherits deep research capabilities from larger models while being optimized to run efficiently on mobile devices, even with CPU-only configurations.

We achieved this through machine-generated task vectors that optimize thinking processes, smooth reward functions across multiple categories, and pure reinforcement learning without any supervised fine-tuning.

What Lucy Excels At

  • πŸ” Strong Agentic Search: Powered by MCP-enabled tools (e.g., Serper with Google Search)
  • 🌐 Basic Browsing Capabilities: Through Crawl4AI (MCP server to be released), Serper,...
  • πŸ“± Mobile-Optimized: Lightweight enough to run on CPU or mobile devices with decent speed
  • 🎯 Focused Reasoning: Machine-generated task vectors optimize thinking processes for search tasks

Evaluation

Following the same MCP benchmark methodology used for Jan-Nano and Jan-Nano-128k, Lucy demonstrates impressive performance despite being only a 1.7B model, achieving higher accuracy than DeepSeek-v3 on SimpleQA.

!image/png

πŸ–₯️ How to Run Locally

Lucy can be deployed using various methods including vLLM, llama.cpp, or through local applications like Jan, LMStudio, and other compatible inference engines. The model supports integration with search APIs and web browsing tools through the MCP.

Deployment

Deploy using VLLM:

vllm serve Menlo/Lucy-128k \
--host 0.0.0.0 \
--port 1234 \
--enable-auto-tool-choice \
--tool-call-parser hermes \
--rope-scaling '{"ropetype":"yarn","factor":3.2,"originalmaxpositionembeddings":40960}' --max-model-len 131072

Or llama-server from llama.cpp:

llama-server ... --rope-scaling yarn --rope-scale 3.2 --yarn-orig-ctx 40960

Recommended Sampling Parameters

Temperature: 0.7
Top-p: 0.9
Top-k: 20
Min-p: 0.0

🀝 Community & Support

πŸ“„ Citation

@misc{dao2025lucyedgerunningagenticweb,
      title={Lucy: edgerunning agentic web search on mobile with machine generated task vectors}, 
      author={Alan Dao and Dinh Bach Vu and Alex Nguyen and Norapat Buppodom},
      year={2025},
      eprint={2508.00360},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2508.00360}, 
}
Paper : Lucy: edgerunning agentic web search on mobile with machine generated task vectors.

πŸ“‚ GGUF File List

πŸ“ Filename πŸ“¦ Size ⚑ Download
lucy_128k-Q3_K_L.gguf
LFS Q3
957.01 MB Download
lucy_128k-Q3_K_M.gguf
LFS Q3
896.01 MB Download
lucy_128k-Q3_K_S.gguf
LFS Q3
827.08 MB Download
lucy_128k-Q4_0.gguf
Recommended LFS Q4
1005.58 MB Download
lucy_128k-Q4_1.gguf
LFS Q4
1.06 GB Download
lucy_128k-Q4_K_M.gguf
LFS Q4
1.03 GB Download
lucy_128k-Q4_K_S.gguf
LFS Q4
1011.08 MB Download
lucy_128k-Q5_0.gguf
LFS Q5
1.15 GB Download
lucy_128k-Q5_1.gguf
LFS Q5
1.23 GB Download
lucy_128k-Q5_K_M.gguf
LFS Q5
1.17 GB Download
lucy_128k-Q5_K_S.gguf
LFS Q5
1.15 GB Download
lucy_128k-Q6_K.gguf
LFS Q6
1.32 GB Download
lucy_128k-Q8_0.gguf
LFS Q8
1.71 GB Download