
The Fellowship of the Smart Things

Taming the dragon (house) with orchestrated agents.

[Image: A dragon bursting through the kitchen window, breathing fire at the refrigerator. Caption: Actual footage of Smaug.]

When I moved into my house, things started breaking. Constantly. The brand-new top-tier Miele dishwasher bricked itself after two weeks. I ate off paper plates for the next month-plus while a tech tried and failed to fix it multiple times, each attempt waiting on a part shipped from Germany. A contractor sloppily installed a bathroom fan vent that would’ve dumped moist air into the attic and rotted out the framing if I hadn’t accidentally bumped into it while installing flooring. The crawlspace had a $20,000 flooding problem that got missed at inspection. And on and on ad absurdum. The house had a lot of pipes, and I’d heard so many horror stories about six-figure insurance claims from leaks.

I told a friend I’d bought a dragon — Smaug — who was grumpy, hoarded my treasure, and randomly spit fire at my shit at least three times a day 🤦‍♂️.

I managed Smaug the best way I knew how: as a distributed system. I added canaries. 16 humidity sensors. One for every room, with multiple in the attic and crawlspace. 14 leak sensors under every sink, every toilet, fridge, laundry, crawlspace. 49 motion sensors to manage 124 lights. Smart smoke, CO, and air-quality alarms that would alert my phone. North of 200 sensors all in, all relayed to a Home Assistant instance running on a Talos Kubernetes cluster in the basement gym (a spot chosen because it’s nice and cool down there for the servers).

[Image: A CO₂ air quality monitor reading 517 ppm, 19.4°C, 58% humidity, in the COMFORT band. Caption: The gym’s CO₂ + temp + humidity monitor. $60 on Amazon.]

It worked, and…I very quickly found myself in a maintenance overhead nightmare. 50,000 lines of Home Assistant code to manage all the automations and config. I refactored it the best I could. I built a custom Jinja template engine that was DRY and rendered out the full YAML config. I had a staging environment. Backups. Automated deployments and rollbacks with Ansible.

But it was still just hours of extra work every week. I joked to my uncle (a contractor) that I thought I would save time by managing the house with code, but I in fact spent more time managing all the bugs in the code.

Enter The Fellowship.

I was already solving AI orchestration at work. So I did the obvious thing: I assembled a dragon-taming team 🔥.

The Fellowship has eighteen members and counting. It runs every part of my Home Assistant deployment, and can reach every smart device in the house directly when it needs to.

The Shire

A quick frame, because the rest of the post depends on it.

The Shire is my Home Assistant config repo. HA itself runs on a Talos Linux Kubernetes cluster and serves the dashboards I actually use day to day. The config is generated from Jinja2 templates by Ansible, then deployed into the cluster. Nothing handwritten lives in config/generated/ — that directory is output, not source.

Over 350 tests cover the templates — 336 unit and integration tests across ~31 pytest files, plus a 21-test Playwright e2e suite that drives the live dashboard. A predeploy gate refuses to ship if any of it is unhappy. The repo is encrypted at rest with git-crypt, and a 1Password-backed unlock script wires the key in for new clones.
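For a rough picture of what a "refuse to ship" gate looks like, here is a minimal sketch. It is not the repo's actual gate: `Check` and `run_gate` are hypothetical names, and the two checks are stand-in lambdas where a real gate would presumably shell out to pytest and Playwright.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Check:
    """One named predeploy check (hypothetical shape, for illustration)."""
    name: str
    run: Callable[[], bool]  # returns True when the check passes


def run_gate(checks: List[Check]) -> Tuple[bool, List[str]]:
    """Run every check and report whether deployment may proceed.

    All checks run even after a failure, so one pass lists every problem.
    A check that raises is treated as a failure.
    """
    failures = []
    for check in checks:
        try:
            ok = check.run()
        except Exception:
            ok = False
        if not ok:
            failures.append(check.name)
    return (not failures, failures)


# Stand-in checks; the names and results are made up for this example.
checks = [
    Check("unit-and-integration", lambda: True),
    Check("playwright-e2e", lambda: False),
]
may_deploy, failed = run_gate(checks)
print(may_deploy, failed)
```

The point of the shape is that the gate answers one question (ship or don't) and names what blocked it, so nothing ships on a partial pass.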

Gandalf — the orchestrator

[Image: Gandalf at a wooden desk in Rivendell, typing on a MacBook, staff propped against the chair, mug of Rivendell Brew within reach. Caption: Gandalf at work. Rivendell Brew within reach.]

Gandalf (shire-dev-agent) is the orchestrator. The One Agent to Coordinate Them All. Every request enters through him. He doesn’t usually do the work himself — he dispatches the rest of the Fellowship, relays evidence between them, resolves disagreements, and synthesizes a single coherent answer for me at the end.

He’s pinned to Claude Opus for the deeper reasoning work, has a higher turn budget than a normal skill (80 turns), and carries persistent project memory across sessions. He remembers the zone inventory, the macro patterns, the entity registry, the bug log, and the feature flag state — so the second session in a given area is always faster than the first.

The handoff protocol is boring on purpose:

Context: <what David asked and the relevant repo/runtime facts>
Question/Task: <bounded specialist task>
Constraints: <files owned, guardrails, invariants, tests required>
Return: findings, changed files, verification run, remaining risks
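The four-line template above is easy to keep consistent if it is rendered from a small structure instead of typed by hand. A minimal sketch, assuming a dataclass called `Handoff` (a hypothetical name; the post doesn't say how the real handoffs are built) and made-up example values:

```python
from dataclasses import dataclass
from typing import Tuple

# The Return line is fixed in the template above, so it is a constant here.
RETURN_LINE = "Return: findings, changed files, verification run, remaining risks"


@dataclass(frozen=True)
class Handoff:
    """One orchestrator-to-specialist handoff (illustrative shape)."""
    context: str
    task: str
    constraints: Tuple[str, ...]

    def render(self) -> str:
        # Refuse to render an unbounded request: the task must say something.
        if not self.task.strip():
            raise ValueError("a handoff needs a bounded task")
        return "\n".join([
            f"Context: {self.context}",
            f"Question/Task: {self.task}",
            f"Constraints: {'; '.join(self.constraints)}",
            RETURN_LINE,
        ])


# Example values are invented for illustration.
message = Handoff(
    context="Porch light shows 'on' in the dashboard but is physically off",
    task="Report the porch light's current state and last state change",
    constraints=("read-only", "no config edits"),
).render()
print(message)
```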

When two specialists disagree, Gandalf doesn’t average them. He asks the agent who owns the disputed layer to re-check using the other agent’s evidence, then he calls it.
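That rule can be sketched in a few lines. This is an assumption-heavy illustration, not the real orchestrator: `Finding`, `resolve`, the ownership table, and the `recheck` callback are all hypothetical, and the re-check is stubbed out where a real one would dispatch an agent. The function returns the owner's re-checked answer, which the orchestrator then rules on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    agent: str     # who reported it
    claim: str     # what they concluded
    evidence: str  # what they based it on


def resolve(layer: str, findings: List[Finding], owners: Dict[str, str],
            recheck: Callable[[str, List[str]], str]) -> str:
    """Settle a dispute by sending it back to the layer's owner.

    No averaging or voting: if the claims differ, the owner of the disputed
    layer re-checks using everyone else's evidence.
    """
    claims = {f.claim for f in findings}
    if len(claims) == 1:
        return claims.pop()  # no disagreement, nothing to resolve
    owner = owners[layer]
    outside_evidence = [f.evidence for f in findings if f.agent != owner]
    return recheck(owner, outside_evidence)


# Hypothetical ownership table and findings, for illustration only.
owners = {"dashboard": "Legolas", "runtime": "Aragorn"}
findings = [
    Finding("Legolas", "card config is fine", "rendered YAML matches the template"),
    Finding("Aragorn", "card entity is missing", "entity absent from the running registry"),
]


def stub_recheck(owner: str, evidence: List[str]) -> str:
    # Stand-in for dispatching the owning agent with the other side's evidence.
    return f"{owner} re-checked against {len(evidence)} outside finding(s)"


print(resolve("dashboard", findings, owners, stub_recheck))
```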

The Core Four

These four touch nearly every change.

Aragorn (The Inspector) is the ranger. He talks to a running Home Assistant over REST and WebSocket — entity state, recent logs, automation traces, the integrations panel — and produces evidence. He doesn’t write code. He answers “what is actually true in the running system right now?” Every other agent gets quieter and more correct when Aragorn goes first.

Sam (The Builder) is the steadfast builder. He writes the Jinja2, YAML, packages, blueprints, and macros. His brief is clean, DRY, reusable — he factors macros, names invariants, and refuses to copy-paste a fourth time. The actual implementation work in the repo is mostly Sam’s.

Legolas (The Designer) is the keen-eyed one. He owns the dashboards — Bubble Card configs, Lovelace layouts, section/view structure, tablet sizing, all the visual polish. The reason the wall tablet doesn’t look like a Home Assistant tutorial screenshot is Legolas.

Galadriel (The Ideator) is the Lady of Lothlórien, keeper of the Mirror. She doesn’t write automations. She mines HA’s logbook and recorder history for patterns I’m doing manually — “you turned off these three lights in this order four nights in a row” — and writes ranked Mirror Readings of new-automation candidates and missing-device gaps. Her output is a report. Implementation gets routed to Sam or one of the specialists.
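The "same three lights, same order, four nights in a row" kind of pattern is simple to detect once the history is flattened into events. A minimal sketch under stated assumptions: the `(night, timestamp, entity_id)` tuple shape, the function name, the entity IDs, and the dates are all invented here, and pulling the events out of HA's logbook or recorder is left out entirely.

```python
from collections import Counter, defaultdict
from typing import Iterable, List, Tuple

# (night, ISO timestamp, entity_id): a made-up shape for manual actions.
Event = Tuple[str, str, str]


def rank_repeated_sequences(
    events: Iterable[Event], min_nights: int = 3
) -> List[Tuple[Tuple[str, ...], int]]:
    """Count how many nights each ordered action sequence occurred.

    Returns (sequence, night_count) pairs, most frequent first, keeping only
    sequences seen on at least `min_nights` nights.
    """
    by_night = defaultdict(list)
    for night, timestamp, entity_id in events:
        by_night[night].append((timestamp, entity_id))

    counts = Counter()
    for actions in by_night.values():
        # ISO timestamps in one format sort chronologically as strings.
        ordered = tuple(entity for _, entity in sorted(actions))
        counts[ordered] += 1

    return [(seq, n) for seq, n in counts.most_common() if n >= min_nights]


# Fabricated sample data: the same routine on four nights, plus one odd night.
nights = ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"]
routine = ["light.kitchen", "light.hall", "light.porch"]
events = [
    (night, f"{night}T22:0{i}:00", entity)
    for night in nights
    for i, entity in enumerate(routine)
]
events.append(("2024-01-05", "2024-01-05T22:00:00", "light.porch"))

print(rank_repeated_sequences(events))
```

Exact-sequence matching is the crudest possible version. It is enough to surface a candidate for a ranked report, which is all the Ideator role needs before handing off to an implementer.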

The Specialists

The rest of the Fellowship is the ecosystem layer — one specialist per vendor cloud or local API. The shape is the same in each case: they own the bridge or cloud directly, separate from the HA integration, so when the HA side is out of sync with reality, somebody can go check reality.
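"Go check reality" boils down to comparing what HA believes with what the vendor side reports. A minimal sketch, assuming both sides have already been reduced to plain `entity_id -> state` dictionaries; the function name and entity IDs are hypothetical, and the actual calls to HA and to any vendor bridge or cloud are deliberately omitted.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    ha_states: Dict[str, str], bridge_states: Dict[str, str]
) -> Dict[str, Tuple[Optional[str], Optional[str]]]:
    """Return entities where HA and the bridge disagree.

    Each value is (ha_state, bridge_state); None means that side does not
    know the entity at all.
    """
    drift = {}
    for entity in sorted(set(ha_states) | set(bridge_states)):
        ha = ha_states.get(entity)
        bridge = bridge_states.get(entity)
        if ha != bridge:
            drift[entity] = (ha, bridge)
    return drift


# Invented snapshots for illustration.
ha_view = {"light.porch": "on", "light.hall": "off"}
bridge_view = {"light.porch": "off", "light.hall": "off", "light.attic": "on"}

for entity, (ha, bridge) in find_drift(ha_view, bridge_view).items():
    print(f"{entity}: HA says {ha}, bridge says {bridge}")
```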

Guardrails

Eighteen agents with write access to a config repo is a lot of trust to hand out, and I have not earned it. So the guardrails are aggressive.

Invariants

A subset of the guardrails got promoted into a separate file — .grace/invariants.md — because they came from real bugs and I never want to debug them twice. Each entry is a one-line rule, plus the context that earned it. A taste:

Every one of those rules represents an evening I’m not getting back. They live next to the code so the next agent — human or otherwise — doesn’t repeat the trip.

Tests

336 unit and integration tests across ~31 pytest files, plus 21 Playwright e2e tests against the live dashboard. Highlights:

I lean hard on Given / When / Then comments in the integration tests. The setup, the action, and the assertion get their own labeled blocks. If a test doesn’t fit that shape, it should be a unit test or it should be rewritten until it does.
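For anyone who hasn't seen the layout, here is what the labeled blocks look like. This is a sketch of the comment convention only: `should_turn_on` and its lux threshold are a hypothetical stand-in for whatever the test exercises, not code from the repo.

```python
def should_turn_on(motion_detected: bool, lux: float, lux_threshold: float = 30.0) -> bool:
    """Hypothetical rule under test: light a room only on motion in the dark."""
    return motion_detected and lux < lux_threshold


def test_motion_in_dark_room_turns_light_on():
    # Given: a room that is darker than the threshold
    lux = 5.0

    # When: motion is detected
    result = should_turn_on(motion_detected=True, lux=lux)

    # Then: the light should turn on
    assert result is True


def test_motion_in_bright_room_leaves_light_off():
    # Given: a room that is already bright
    lux = 200.0

    # When: motion is detected
    result = should_turn_on(motion_detected=True, lux=lux)

    # Then: the light should stay off
    assert result is False
```

pytest collects both functions by their `test_` prefix. Each one has exactly one setup, one action, and one assertion block, which is the shape the paragraph above is asking for.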

Why Lord of the Rings

Back to Smaug.

The Fellowship name started as a one-line joke. I was telling someone about the agent I’d just built to check on the Hue bridge directly when our connection to it was flaking, and I called it “basically Gandalf for the appliance dragon.” The next agent — the one that reads HA’s runtime state and reports back evidence — became Aragorn the same week, because that’s what a ranger does. By the time I had three of them, the bit had legs.

It kept going for one real reason: the names are functional. When Gandalf says “Aragorn confirms the Hue bridge is unreachable on IPv4 but answering on IPv6; Eärendil, can you re-check the scene state from the bridge side,” I know exactly who’s doing what without parsing six shire-foo-agent strings. The roles compress into single-token identities. Aragorn means runtime evidence. Sam means careful implementation. Legolas means the dashboard. Galadriel means what should we automate next. Fëanor means the energy stack, and the name does some of the dramatic lifting his job genuinely deserves.

It’s also a small act of resistance against the corporate-AI naming convention, which seems to want every agent to be CompanionAssistantPro™. The Fellowship doesn’t sound like a product. It sounds like a team. That’s the right frame.

And, to bring it back, the Fellowship exists to fight Smaug. Smaug is the entropy in the house. He’s Mayhem from Allstate: the surge that fried the dishwasher control board, the Hue bridge that’s somehow on the wrong IP again, the Zigbee channel that won’t stay quiet, the Sonos group that splits at random. Shit breaks. Second law of thermodynamics. Specialists run diagnostics. A wizard coordinates the response. Most nights the lights come on warm and the bed is the right temperature and the cats get fed, and that’s the win condition.

What’s next

A few things in the queue:

Eighteen agents is a lot. It’s also the right number, because the house genuinely has that many surfaces and I want each one inspected by something that knows that one surface deeply. The orchestrator keeps it coherent. The guardrails keep it safe. The invariants keep the bugs from coming back. And the names, bless them, keep it personable and fun.

Fly, you fools! 🧙
