IAN'S AI THOUGHTSTREAM

2026·05·19 17:40 / 1 MIN

Beyond llms.txt for Agent Readability

A friend pointed me at a14y.dev, which scans your site for "agent readability" and hands back a scored fix-list. It's the obvious next thing after llms.txt, and the suggestions are sharper than I expected.

The scorecard is 38 checks pinned at v0.2.0, split across discoverability, parsing, and comprehension. Some are the ones you'd guess: llms.txt exists, robots allows AI bots, canonical links, lang attributes, JSON-LD breadcrumbs. The interesting ones are the suggestions I hadn't seen pushed as a standard yet.

The less obvious suggestions

A Markdown mirror of every page, served at the same URL with a .md suffix, plus a <link rel="alternate" type="text/markdown"> in the HTML head so agents can find it without guessing. Content negotiation on the canonical URL so a request with Accept: text/markdown gets the Markdown directly. A glossary page, because agents resolving acronyms and project-specific terms benefit from one canonical place to look. Language tags on every code block. A /sitemap.md alongside the XML one.
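The content negotiation piece is small enough to sketch. This is a minimal illustration, assuming a server that only needs to choose between HTML and Markdown for one page. The function names are made up for the example and are not taken from a14y.dev, and the header parsing is deliberately simplified (exact media types only, no wildcard handling).

```python
from html import escape


def parse_accept(header: str) -> dict[str, float]:
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs: dict[str, float] = {}
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media_type] = q
    return prefs


def choose_representation(accept_header: str) -> str:
    """Serve Markdown only when the client asks for it explicitly."""
    prefs = parse_accept(accept_header)
    md_q = prefs.get("text/markdown", 0.0)
    html_q = prefs.get("text/html", 0.0)
    if md_q > 0 and md_q >= html_q:
        return "text/markdown"
    return "text/html"


def alternate_link_tag(markdown_url: str) -> str:
    """Build the <link> tag that advertises the Markdown mirror in the HTML head."""
    return f'<link rel="alternate" type="text/markdown" href="{escape(markdown_url, quote=True)}">'
```

A browser's default Accept header never mentions text/markdown, so it still gets HTML. An agent that sends Accept: text/markdown gets the mirror without having to guess the .md URL.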

None of these are exotic. They're the kind of thing you'd do for a thoughtful human reader, just written down as pass/fail checks.

The loop they're pushing

The CLI ships with an --output agent-prompt mode that writes a Markdown brief aimed at a coding agent: every failure, its detection rule, the fix, and a link back to the scorecard page. The intended workflow is to pipe that into Claude Code or Codex, let it patch, then re-run with --fail-under 80 in CI. There's also a skills add package for agents that speak the open skills format.
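The --fail-under 80 step is a threshold gate. Here is a rough sketch of the idea, assuming the score is just the percentage of checks that pass. That is an assumption made for illustration: the real scorecard may weight checks differently, and the check names below are hypothetical.

```python
def score(results: dict[str, bool]) -> float:
    """Percentage of passing checks (assumed unweighted)."""
    if not results:
        return 0.0
    return 100.0 * sum(results.values()) / len(results)


def gate(results: dict[str, bool], fail_under: float) -> int:
    """Return a process exit code: 0 if the score meets the threshold, else 1."""
    return 0 if score(results) >= fail_under else 1


# Hypothetical check names, not the tool's real identifiers.
results = {
    "llms-txt-exists": True,
    "markdown-mirror": True,
    "code-block-languages": True,
    "glossary-page": True,
    "sitemap-md": False,
}
exit_code = gate(results, fail_under=80)
```

A non-zero exit code is what makes the CI job fail, which is the signal that sends the agent back for another patch.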

2026·05·18 16:05 / 2 MIN

Open Sourcing a MeshCore Bot

I open sourced Blorkobot, a chatbot for our local Bay Area MeshCore LoRa mesh radio network, and put it in the public domain via the Unlicense. That's my new default for any vibe-coded (sorry, "agentically engineered") funsie project that someone could reproduce in an hour by pointing Claude Code at the same problem.

The bot exists to increase chat activity on the mesh, which helps stress-test the network without anyone having to manually spam it. It's about 3k lines of Python, written as a plugin for the Remote Terminal MeshCore client. Nothing exotic.

Why I hesitated

The SoCal MeshCore folks asked if I'd open source it, and I sat on it for a while. Releasing trivial code feels strange. Anyone with an AI coding agent and an afternoon could rebuild this from the README. What's the point of a repo for something that's nearly free to recreate?

I released it anyway, because the value isn't the lines of code, it's the hours of trial and error already baked in: the plugin shape that actually works with Remote Terminal, the commands that turned out to be fun on the mesh, the ones that didn't.

Why Unlicense and not AGPL

The first response after I pushed it was "have you thought about AGPL?"

Setting aside the copyright theory, the AGPL question is really a question about effort. AGPL is the right tool when you've poured serious work into something and want to make sure derivatives stay open. That's not this. This is a weekend project that any competent operator could regenerate from scratch. Defending it with a copyleft license would be cosplay.

Public domain matches the actual situation. Take it, fork it, paste it into your own bot, don't credit me, I genuinely do not care. Unlicense says that cleanly.

That's the rule going forward for the easily-reproducible stuff: Unlicense, no ceremony, no strings.

2026·05·17 19:12 / 1 MIN

Idempotent Claude Code Skills

Claude Code is good at creating skills. Say "create a skill that does X" and it makes one. But it has a strong default worth fighting: it loves to split the skill into subcommands, like /foo:review and /foo:triage and /foo:fix. I don't want a menu. The whole point is automation.

So the fix is two lines in the prompt when asking it to write a skill: no subcommands, and make sure the skill can be run idempotently. Run it once, run it ten times, it should converge on the same finished state without me steering.

Idempotence is the part that matters more than it sounds. A skill that's safe to re-run is a skill I can put in a loop, or fire after every commit, or hand to another agent without worrying about double-applying a change. The subcommand version pushes that work back onto me: decide which phase you're in, pick the right verb, remember what you already ran. That's the opposite of automation.
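To make "safe to re-run" concrete, here is a tiny sketch of an idempotent step of the kind a skill might perform. The helper name and the file it edits are made up for illustration. The pattern is to check the current state first and change it only if it differs from the target state.

```python
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file. Return True only if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the finished state, nothing to do
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True
```

The first call adds the line and every later call is a no-op, so a loop or a post-commit hook can fire it blindly. A naive "append this line" version would add a duplicate on every run.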

The menu pattern probably comes from training on human-facing CLIs, where breaking work into named steps is good UX. For a skill that an agent is going to invoke, it's the wrong shape. One entry point, idempotent, done when it says it's done.

2026·05·16 19:54 / 1 MIN

Sandboxing AI Coding Agents

Coding agents will happily run whatever they generate, and most of them have your shell, your SSH keys, and your AWS creds one rm -rf away. Sandboxing the agent is the cheapest insurance you can buy, and in 2026 there are finally enough good options that you should pick one.

The landscape splits into a few camps. Full VMs (Firecracker, Lima, OrbStack) give you the strongest isolation and the most overhead. Containers (Docker, Podman, devcontainers) are the default for most people and work fine until the agent needs to touch your real checkout. And then there's the OS-native path: Seatbelt on macOS, seccomp-bpf and Landlock on Linux. Seatbelt is what macOS already uses to sandbox App Store apps, and seccomp-bpf is what Chrome uses to sandbox its renderer processes on Linux, so the primitives are battle-tested. The friction has always been the ergonomics.

My current favorite is nono. It's a CLI wrapper that uses Landlock on Linux and Seatbelt on macOS to restrict filesystem and network access for any process you launch under it. No container, no VM, no daemon. It ships with profiles for the popular coding agents and lets you write your own, and I've gotten into the habit of creating a profile per project. The agent gets exactly the directories and hosts it needs, and nothing else.
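The allowlist model behind a per-project profile can be sketched in a few lines. This is not nono's profile format or how it enforces anything: the real enforcement happens in the kernel via Landlock or Seatbelt. The class, field names, paths, and hosts below are all hypothetical, and the sketch only shows the "these directories and hosts, nothing else" decision.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Profile:
    allowed_dirs: tuple[str, ...]
    allowed_hosts: tuple[str, ...]

    def allows_path(self, path: str) -> bool:
        # Collapse ".." segments first so "proj/../../.ssh" cannot sneak out.
        # Symlinks are not resolved here; a real sandbox handles those in the kernel.
        target = PurePosixPath(posixpath.normpath(path))
        return any(target.is_relative_to(d) for d in self.allowed_dirs)

    def allows_host(self, host: str) -> bool:
        return host.lower() in self.allowed_hosts


# Hypothetical profile for one project checkout.
profile = Profile(
    allowed_dirs=("/home/me/src/blorkobot",),
    allowed_hosts=("api.example.com", "pypi.org"),
)
```

Everything not listed is denied by default, which is the property that keeps the blast radius down to whatever you wrote in the profile.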

The per-project profile is the part that actually changed my behavior. Once writing a profile takes thirty seconds, you stop talking yourself out of it. The agent can still go off the rails inside the box, but the blast radius is whatever you wrote down, and the rollback story is just git. I'm extremely curious to see where this category goes once more agents ship with sandbox profiles in the box.