Jarvis

Local-first agentic AI platform

Python · MLX · Ollama · MCP · SQLite · FastAPI

─── 01

Four-tier model router

A deterministic sub-millisecond classifier picks the tier per query based on cost vs. capability. Most queries resolve at flash or standard. The agent tier is reserved for multi-step plans where tools must execute.
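A minimal sketch of what a deterministic, rule-based router could look like. The regex signals, word-count thresholds, and the `route` function are illustrative assumptions, not the project's actual rules. The text names only the flash, standard, and agent tiers, so `HEAVY` is a placeholder name for the remaining one.

```python
import re
from enum import Enum


class Tier(Enum):
    FLASH = "flash"
    STANDARD = "standard"
    HEAVY = "heavy"  # placeholder: the source does not name this tier
    AGENT = "agent"


# Hypothetical signals; a real router would tune these against real traffic.
TOOL_VERBS = re.compile(r"\b(run|commit|write|fetch|schedule|send|delete)\b", re.I)
MULTI_STEP = re.compile(r"\b(then|after that|and then|step \d+)\b", re.I)
ANALYSIS = re.compile(r"\b(why|compare|explain|design|prove|trade-?offs?)\b", re.I)


def route(query: str) -> Tier:
    """Pick the cheapest tier whose rules match. No model call, no randomness."""
    text = query.strip()
    words = len(text.split())

    # Agent tier only when the query asks for tool actions across several steps.
    if TOOL_VERBS.search(text) and MULTI_STEP.search(text):
        return Tier.AGENT
    # Long analytical prompts go to the most capable non-agent tier.
    if ANALYSIS.search(text) and words > 40:
        return Tier.HEAVY
    # Short analytical prompts and medium-length queries resolve at standard.
    if ANALYSIS.search(text) or words > 12:
        return Tier.STANDARD
    return Tier.FLASH
```

Because the rules are plain regex and length checks, the same query always maps to the same tier, which keeps routing decisions easy to audit and to unit test.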

─── 02

31 MCP tools across 6 servers

Filesystem — read, write, glob, search.
Git — status, log, diff, commit.
Shell — governed; only a whitelisted command set.
HTTP fetch — outbound requests with explicit allowlist.
Calendar / mail.
Workspace state — persistence, memory snapshots, run history.
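The governed shell and the HTTP allowlist both reduce to a check that runs before any tool executes. The following is a sketch under assumptions: the command and host sets, and the `shell_allowed` and `url_allowed` helpers, are hypothetical and not taken from the project.

```python
import shlex
from urllib.parse import urlsplit

# Hypothetical allowlists, for illustration only.
ALLOWED_COMMANDS = {"ls", "cat", "git", "rg"}
ALLOWED_HOSTS = {"api.github.com", "example.com"}

# Characters that would let one command line smuggle in a second command.
SHELL_METACHARS = set(";|&<>`$\n")


def shell_allowed(command_line: str) -> bool:
    """Allow a command only if it is free of shell metacharacters
    and its executable is in the allowlist."""
    if SHELL_METACHARS & set(command_line):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:  # e.g. unbalanced quotes
        return False
    return bool(argv) and argv[0] in ALLOWED_COMMANDS


def url_allowed(url: str) -> bool:
    """Allow only https requests whose exact hostname is in the allowlist."""
    try:
        parts = urlsplit(url)
    except ValueError:  # malformed URL
        return False
    return parts.scheme == "https" and parts.hostname in ALLOWED_HOSTS
```

Matching the exact hostname rather than a substring means `api.github.com.evil.test` is rejected. A command that passes `shell_allowed` should still be executed as an argument list (for example via `subprocess.run(argv)`) rather than through a shell.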

─── 03

Why local-first

The constraint was no cloud round-trips for anything that touches personal context. Models had to fit on Apple Silicon (Ollama, qwen3 family). Persistence had to be embedded (SQLite). Tool execution had to be operator-approved, not autonomous: in the Telegram approval flow, the agent proposes, the operator signs off, and only then does the tool run.
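The propose, sign off, run sequence can be sketched as a gate that sits between the agent and its tools. This is an illustration under assumptions: `Proposal`, `ApprovalGate`, and the `ask_operator` callback are hypothetical names, and the callback stands in for whatever transport delivers the approval request (Telegram in this project); no Telegram API is shown.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Proposal:
    """A tool call the agent wants to make, not yet executed."""
    tool: str
    args: Dict[str, Any] = field(default_factory=dict)


class ApprovalGate:
    def __init__(
        self,
        tools: Dict[str, Callable[..., Any]],
        ask_operator: Callable[[Proposal], bool],
    ) -> None:
        self._tools = tools
        self._ask = ask_operator  # hypothetical stand-in for the approval channel

    def execute(self, proposal: Proposal) -> Dict[str, Any]:
        if proposal.tool not in self._tools:
            raise KeyError(f"unknown tool: {proposal.tool}")
        # The tool runs only after the operator explicitly approves.
        if not self._ask(proposal):
            return {"status": "rejected", "result": None}
        result = self._tools[proposal.tool](**proposal.args)
        return {"status": "approved", "result": result}
```

Keeping the approval step in one place means a rejected proposal never reaches the tool function at all, so no tool has to implement its own safety check.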

─── 04

What I learned

Deterministic router beats LLM-as-router for cost, latency, and predictability.
Tool-calling latency dominates once the agent tier engages. Chunk what you can.
Operator-in-the-loop is a feature, not a constraint. Catches more bad calls than test coverage ever will.