Build, deploy, and operate your own AI agent — on hardware you control. A hands-on playbook for operators who want to own the full stack.
This book exists because I got tired of asking for permission. Permission to use a language model on my own data. Permission to run an agent that touches my filesystem.
So I built the other thing. A local inference stack, running on hardware I own, with a system prompt I wrote, calling tools I compiled, on a schedule I defined.
This book is the field manual for building that system.
[Unit]
Description=Familiar Heartbeat
After=network-online.target ollama.service
Requires=ollama.service
[Service]
Type=oneshot
User=op
ExecStart=/home/op/familiar/bin/orchestrator --heartbeat
MemoryMax=4G
CPUQuota=80%Context assembly → inference → parsing → observation → reflection. The cognitive cycle that makes an agent an agent.
Three tiers — sliding conversation window, working digests, vector-indexed long-term recall with sqlite-vss.
JSON-over-STDIO tool contracts with explicit permissioning. Janitor, Vault, Ghost-Write.
Process isolation, filesystem restrictions, network boundaries. Autonomy through explicit constraints.
“An agent is not a smarter chatbot — it is a loop. The orchestrator assembles context, calls the model, dispatches tools, and feeds results back in.”
The memory system gives your agent the ability to recall context across sessions. Conversation state handles the immediate window. Working digests compress recent history. Vector-indexed long-term memory enables semantic recall over your entire knowledge base.
def query(conn, question, top_k=3):
q_embedding = get_embedding(question)
results = conn.execute("""
SELECT chunks.content,
chunks.source,
chunks.date
FROM vss_chunks
JOIN chunks ON chunks.id = vss_chunks.rowid
WHERE vss_search(embedding, ?)
LIMIT ?
""", (serialize(q_embedding), top_k))
return results“Identity is not a prompt — it is a configuration architecture. Three files, three responsibilities, injected in a fixed order before every inference call.”
You are comfortable in a terminal. You can read a JSON response, edit a systemd unit file, and compile a Rust binary. Works on Linux, macOS, and Windows via WSL2.
You want to understand the system you are running — not just use it.
Operator's Playbook · v1.0
Secure checkout via Stripe · Instant delivery
Build the system. Trust the system. Own the system.