v1.0 published · Preface + 8 chapters

The Local Familiar

Build, deploy, and operate your own AI agent — on hardware you control. A hands-on playbook for operators who want to own the full stack.

local-inference · agent-loop · memory-system · tool-contracts · systemd · containment

Get the Playbook · $29

permissions.json

{
  "version": 1,
  "default_policy": "deny",
  "tools": {
    "janitor": {
      "enabled": true,
      "commands": {
        "disk_usage": { "allowed": true },
        "cleanup": { "allowed": false }
      }
    }
  }
}
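A minimal sketch of how an orchestrator might consult a file like this before running a tool command. The function name `is_allowed`, and the rule that anything not listed falls back to `default_policy`, are assumptions made for illustration and not the playbook's own implementation.

```python
import json

# Same structure as the permissions.json shown above.
PERMISSIONS_JSON = """
{
  "version": 1,
  "default_policy": "deny",
  "tools": {
    "janitor": {
      "enabled": true,
      "commands": {
        "disk_usage": { "allowed": true },
        "cleanup": { "allowed": false }
      }
    }
  }
}
"""


def is_allowed(permissions: dict, tool: str, command: str) -> bool:
    """Hypothetical gate: allow only enabled tools and explicitly allowed commands."""
    # Assumption: any tool or command that is not listed inherits default_policy.
    default = permissions.get("default_policy", "deny") == "allow"
    tool_entry = permissions.get("tools", {}).get(tool)
    if tool_entry is None:
        return default
    if not tool_entry.get("enabled", False):
        return False
    command_entry = tool_entry.get("commands", {}).get(command)
    if command_entry is None:
        return default
    return bool(command_entry.get("allowed", False))


permissions = json.loads(PERMISSIONS_JSON)
print(is_allowed(permissions, "janitor", "disk_usage"))  # True
print(is_allowed(permissions, "janitor", "cleanup"))     # False
print(is_allowed(permissions, "vault", "read"))          # False (default deny)
```

With `default_policy` set to `deny`, a tool that is missing from the file is refused without needing its own entry.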

What this is

This book exists because I got tired of asking for permission. Permission to use a language model on my own data. Permission to run an agent that touches my filesystem.

So I built the other thing. A local inference stack, running on hardware I own, with a system prompt I wrote, calling tools I compiled, on a schedule I defined.

This book is the field manual for building that system.

familiar.service

[Unit]
Description=Familiar Heartbeat
Wants=network-online.target
After=network-online.target ollama.service
Requires=ollama.service

[Service]
Type=oneshot
User=op
ExecStart=/home/op/familiar/bin/orchestrator --heartbeat
MemoryMax=4G
CPUQuota=80%

What you'll build

agent-loop

The Loop

Context assembly → inference → parsing → observation → reflection. The cognitive cycle that makes an agent an agent.

memory

The Memory

Three tiers — sliding conversation window, working digests, vector-indexed long-term recall with sqlite-vss.

tools

The Tools

JSON-over-STDIO tool contracts with explicit permissioning. Janitor, Vault, Ghost-Write.

containment

The Containment

Process isolation, filesystem restrictions, network boundaries. Autonomy through explicit constraints.
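The JSON-over-STDIO tool contract named under The Tools can be pictured roughly as follows: the orchestrator writes one JSON request to a tool's stdin and reads one JSON response from its stdout. The field names (`command`, `args`, `ok`, `result`, `error`), the `call_tool` helper, and the inline stand-in tool are all assumptions for illustration. The playbook's actual contract may differ.

```python
import json
import subprocess
import sys

# Stand-in tool process: reads one JSON request line, writes one JSON response line.
TOOL_SOURCE = r'''
import json, sys
request = json.loads(sys.stdin.readline())
if request.get("command") == "echo":
    response = {"ok": True, "result": request.get("args", {})}
else:
    response = {"ok": False, "error": "unknown command: %s" % request.get("command")}
sys.stdout.write(json.dumps(response) + "\n")
'''


def call_tool(argv: list[str], command: str, args: dict, timeout: float = 10.0) -> dict:
    """Send one request over stdin and parse one response from stdout."""
    request = json.dumps({"command": command, "args": args}) + "\n"
    completed = subprocess.run(
        argv, input=request, capture_output=True, text=True, timeout=timeout
    )
    if completed.returncode != 0:
        return {"ok": False, "error": completed.stderr.strip() or "tool failed"}
    try:
        return json.loads(completed.stdout)
    except json.JSONDecodeError:
        return {"ok": False, "error": "tool returned invalid JSON"}


tool_argv = [sys.executable, "-c", TOOL_SOURCE]
echo_reply = call_tool(tool_argv, "echo", {"path": "/tmp"})
bad_reply = call_tool(tool_argv, "format_disk", {})
print(echo_reply)
print(bad_reply)
```

Because the tool is a separate process that only speaks JSON over its standard streams, it can be written in any language and restricted independently of the orchestrator.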

Key takeaway · Chapter 01

“An agent is not a smarter chatbot — it is a loop. The orchestrator assembles context, calls the model, dispatches tools, and feeds results back in.”
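As a rough sketch of that loop, the code below assembles context, calls a model, parses its output, dispatches a tool, and feeds the observation back in. Everything here is illustrative: `run_loop`, the JSON action format, the scripted stand-in model, and the fake `disk_usage` tool are assumptions. There is no explicit reflection step either. In this sketch the next inference over the updated history plays that role.

```python
import json


def run_loop(task, model, tools, max_steps=5):
    """Run the cycle until the model returns a final answer or steps run out."""
    history = [f"TASK: {task}"]
    for _ in range(max_steps):
        context = "\n".join(history)      # context assembly
        raw = model(context)              # inference
        try:
            action = json.loads(raw)      # parsing
        except json.JSONDecodeError:
            action = None
        if not isinstance(action, dict):
            history.append("OBSERVATION: output was not a JSON object")
            continue
        if "final" in action:
            return action["final"]
        tool = tools.get(action.get("tool"))
        if tool is None:
            observation = f"unknown tool {action.get('tool')!r}"
        else:
            observation = tool(**action.get("args", {}))  # tool dispatch
        # Feed the result back so the next inference can see it.
        history.append(f"ACTION: {raw}")
        history.append(f"OBSERVATION: {observation}")
    return "stopped: step limit reached"


def scripted_model(context):
    """Stand-in for a local model: one tool call, then a final answer."""
    if "OBSERVATION:" not in context:
        return json.dumps({"tool": "disk_usage", "args": {"path": "/"}})
    return json.dumps({"final": "disk check complete"})


tools = {"disk_usage": lambda path: f"{path} is 42% full"}
answer = run_loop("check disk", scripted_model, tools)
print(answer)
```

The step limit is the one safeguard worth keeping even in a toy version: a model that never emits a final answer cannot spin forever.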

Chapter 03 · Memory

Three-tier recall — from context window to vector search

The memory system gives your agent the ability to recall context across sessions. Conversation state handles the immediate window. Working digests compress recent history. Vector-indexed long-term memory enables semantic recall over your entire knowledge base.

memory.py

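def query(conn, question, top_k=3):
    """Return the top_k stored chunks most similar to the question."""
    # get_embedding() and serialize() are helpers assumed to be defined
    # elsewhere in this module.
    q_embedding = get_embedding(question)
    results = conn.execute("""
        SELECT chunks.content,
               chunks.source,
               chunks.date
        FROM vss_chunks
        JOIN chunks ON chunks.id = vss_chunks.rowid
        WHERE vss_search(embedding, ?)
        LIMIT ?
    """, (serialize(q_embedding), top_k))
    # Materialize the rows so callers get a list rather than a live cursor.
    return results.fetchall()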
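The query above covers the third tier. For the first two, here is a minimal sketch of a sliding conversation window that compresses evicted turns into working digests. The `ConversationMemory` class, the eviction rule (oldest half of the window), and the placeholder summarizer are assumptions. A real system would more plausibly ask the model to write the digest.

```python
from collections import deque


class ConversationMemory:
    """Hypothetical tiers one and two: recent turns plus compressed digests."""

    def __init__(self, window_size=4, summarize=None):
        self.window_size = window_size
        self.window = deque()
        self.digests = []
        # Placeholder summarizer: joins truncated turns instead of calling a model.
        self.summarize = summarize or (lambda turns: " | ".join(t[:40] for t in turns))

    def add(self, turn):
        self.window.append(turn)
        if len(self.window) > self.window_size:
            # Evict the oldest half of the window into a single digest.
            count = max(1, self.window_size // 2)
            evicted = [self.window.popleft() for _ in range(count)]
            self.digests.append(self.summarize(evicted))

    def context(self):
        """Digests first (older, compressed), then the verbatim recent turns."""
        return [f"DIGEST: {d}" for d in self.digests] + list(self.window)


memory = ConversationMemory(window_size=4)
for i in range(5):
    memory.add(f"turn {i}")
print(memory.context())
```

Evicting in batches rather than one turn at a time keeps the number of digest calls low, at the cost of a window that briefly shrinks after each compression.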

Contents

Eight chapters. One weekend.

~12 hrs total hands-on

Your First Agent in 30 Minutes · 30 min
Install the inference stack. Configure the system prompt. Validate end-to-end with a raw API call.

The Agent Loop · 90 min
Build the core loop that separates an agent from a one-shot response generator.

The Identity Layer · 60 min
Split behavioral directives, runtime context, and operator context into stable config files.

The Three-Tier Memory System · 2 hrs
Conversation state, working memory, and vector-based long-term recall.

The Tool Suite · 2 hrs
Janitor, Vault, Ghost-Write, and the permission gate that controls tool execution.

Autonomous Operations · 90 min
systemd timers, standing orders, exit semantics, notification paths.

Security & Isolation · 90 min
Process isolation, filesystem restrictions, network boundaries, audit logging.

Your First Real Familiar · 2 hrs
Deploy the whole system in stages. Operate with explicit trust boundaries.

Key takeaway · Chapter 02

“Identity is not a prompt — it is a configuration architecture. Three files, three responsibilities, injected in a fixed order before every inference call.”
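One way to picture the fixed injection order is the sketch below, which reads three files and concatenates them in the same sequence every time. The file names (`directives.md`, `runtime.md`, `operator.md`), the section headers, and `build_system_prompt` are hypothetical. Only the idea of three files in a fixed order comes from the text above.

```python
import tempfile
from pathlib import Path

# Hypothetical file names; the fixed order is the point, not the names.
IDENTITY_FILES = ("directives.md", "runtime.md", "operator.md")


def build_system_prompt(config_dir: Path) -> str:
    """Concatenate the identity files in a fixed order, one labeled section each."""
    sections = []
    for name in IDENTITY_FILES:
        text = (config_dir / name).read_text(encoding="utf-8").strip()
        sections.append(f"## {name}\n{text}")
    return "\n\n".join(sections)


# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    config_dir = Path(tmp)
    (config_dir / "directives.md").write_text("Be terse. Ask before deleting.\n", encoding="utf-8")
    (config_dir / "runtime.md").write_text("Host: workstation. Model: 8B local.\n", encoding="utf-8")
    (config_dir / "operator.md").write_text("Operator prefers plain text.\n", encoding="utf-8")
    prompt = build_system_prompt(config_dir)

print(prompt)
```

Keeping each responsibility in its own file means a change to operator preferences never touches the behavioral directives, and a diff of any one file shows exactly what changed in the agent's identity.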

Who this is for

You are comfortable in a terminal. You can read a JSON response, edit a systemd unit file, and compile a Rust binary. The stack runs on Linux, macOS, and Windows via WSL2.

You want to understand the system you are running — not just use it.

$ hardware: 16 GB RAM minimum

$ os: Linux (Ubuntu 22.04+) / macOS / Windows via WSL2

$ privileges: sudo access required

$ model: 8B params, ~8 GB at runtime

$ time: ~1 weekend of focused work

One-time purchase

The Local Familiar

Operator's Playbook · v1.0

$29

● Preface + 8 chapters: complete technical walkthrough
● Companion repository with all source code
● Janitor reference implementation (working)
● Vault & Ghost-Write design specifications
● systemd units, permission files, config templates
● All future updates included

Get the Playbook

Secure checkout via Stripe · Instant delivery

Build the system. Trust the system. Own the system.