2026·05·19 17:40 / 1 MIN
A friend pointed me at a14y.dev, which scans your site for "agent readability" and hands back a scored fix-list. It's the obvious next thing after llms.txt, and the suggestions are sharper than I expected.
The scorecard is 38 checks pinned at v0.2.0, split across discoverability, parsing, and comprehension. Some are the ones you'd guess: llms.txt exists, robots allows AI bots, canonical links, lang attributes, JSON-LD breadcrumbs. The interesting ones are the suggestions I hadn't seen pushed as a standard yet.
The less obvious suggestions
A Markdown mirror of every page, served at the same URL with a .md suffix, plus a <link rel="alternate" type="text/markdown"> in the HTML head so agents can find it without guessing.
Content negotiation on the canonical URL, so a request with Accept: text/markdown gets the Markdown directly.
A glossary page, because agents resolving acronyms and project-specific terms benefit from one canonical place to look.
Language tags on every code block.
A /sitemap.md alongside the XML one.
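Here is a minimal sketch of how the mirror and the content negotiation could fit together on the server side. It is not a14y's detection logic. The function names are mine, mapping the site root to /index.md is an assumption, and wildcard media ranges such as */* are deliberately ignored to keep it short.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def markdown_mirror_url(url: str) -> str:
    """Return the same URL with a .md suffix (root maps to /index.md, an assumption)."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") or "/index"
    return urlunsplit((parts.scheme, parts.netloc, path + ".md", parts.query, ""))


def alternate_link_tag(url: str) -> str:
    """Build the <link> element that advertises the Markdown mirror in the HTML head."""
    href = escape(markdown_mirror_url(url), quote=True)
    return f'<link rel="alternate" type="text/markdown" href="{href}">'


def parse_accept(header: str) -> dict[str, float]:
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    weights: dict[str, float] = {}
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        weights[media_type] = q
    return weights


def wants_markdown(accept_header: str) -> bool:
    """True when text/markdown is listed and weighted at least as high as text/html."""
    weights = parse_accept(accept_header)
    md = weights.get("text/markdown", 0.0)
    html = weights.get("text/html", 0.0)
    return md > 0 and md >= html
```

A handler would call wants_markdown on the request's Accept header and serve the Markdown body when it returns True, falling back to HTML otherwise.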
None of these are exotic. They're the kind of thing you'd do for a thoughtful human reader, just written down as pass/fail checks.
The loop they're pushing
The CLI ships with an --output agent-prompt mode that writes a Markdown brief aimed at a coding agent: every failure, its detection rule, the fix, and a link back to the scorecard page. The intended workflow is to pipe that into Claude Code or Codex, let it patch, then re-run with --fail-under 80 in CI. There's also a skills add package for agents that speak the open skills format.
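The CI half of that loop is just a threshold gate. The sketch below shows the idea under loud assumptions: I don't know how a14y actually weights its checks, so this treats the score as the plain percentage of checks that pass, and CheckResult and its check_id values are hypothetical, not the tool's real output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckResult:
    check_id: str  # hypothetical identifier, not a real a14y check name
    passed: bool


def score(results: list[CheckResult]) -> float:
    """Unweighted percentage of passing checks (an assumed scoring rule)."""
    if not results:
        return 0.0
    return 100.0 * sum(r.passed for r in results) / len(results)


def gate(results: list[CheckResult], fail_under: float) -> int:
    """Return a process exit code: 0 when the score meets the threshold, else 1."""
    return 0 if score(results) >= fail_under else 1


if __name__ == "__main__":
    # 31 of 38 checks passing is about 81.6%, which clears a threshold of 80.
    demo = [CheckResult(f"check-{i}", i < 31) for i in range(38)]
    raise SystemExit(gate(demo, fail_under=80))
```

The non-zero exit code is what makes the CI job fail, which is the signal that sends the agent back around the loop.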
2026·05·17 19:12 / 1 MIN
Claude Code is good at creating skills. Say "create a skill that does X" and it makes one. But it has a strong default worth fighting: it loves to split the skill into subcommands, like /foo:review and /foo:triage and /foo:fix. I don't want a menu. The whole point is automation.
So the fix is two lines in the prompt when asking it to write a skill: no subcommands, and make sure the skill can be run idempotently. Run it once, run it ten times, it should converge on the same finished state without me steering.
Idempotence is the part that matters more than it sounds. A skill that's safe to re-run is a skill I can put in a loop, or fire after every commit, or hand to another agent without worrying about double-applying a change. The subcommand version pushes that work back onto me: decide which phase you're in, pick the right verb, remember what you already ran. That's the opposite of automation.
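To make "safe to re-run" concrete, here is a small sketch of an idempotent step, not anything taken from a real skill. The ensure_line helper and the .gitignore example are illustrative assumptions. It checks the current state before acting, so a second run finds nothing to do.

```python
import tempfile
from pathlib import Path


def ensure_line(path: Path, line: str) -> bool:
    """Make sure `line` is present in the file; return True only if the file changed."""
    existing = path.read_text().splitlines() if path.exists() else []
    if line in existing:
        return False  # already in the desired state, so do nothing
    existing.append(line)
    path.write_text("\n".join(existing) + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / ".gitignore"
        first = ensure_line(target, "node_modules/")
        second = ensure_line(target, "node_modules/")
        print(first, second)  # the first call changes the file, the second does not
```

The non-idempotent version would append unconditionally, and every extra run would add a duplicate line. That is exactly the double-applying problem a looped or post-commit skill has to avoid.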
The menu pattern probably comes from training on human-facing CLIs, where breaking work into named steps is good UX. For a skill that an agent is going to invoke, it's the wrong shape. One entry point, idempotent, done when it says it's done.
2026·05·16 19:54 / 1 MIN
Coding agents will happily run whatever they generate, and most of them have your shell, your SSH keys, and your AWS creds one rm -rf away. Sandboxing the agent is the cheapest insurance you can buy, and in 2026 there are finally enough good options that you should pick one.
The landscape splits into a few camps. Full VMs (Firecracker, Lima, OrbStack) give you the strongest isolation and the most overhead. Containers (Docker, Podman, devcontainers) are the default for most people and work fine until the agent needs to touch your real checkout. And then there's the OS-native path: Seatbelt on macOS, seccomp-bpf and Landlock on Linux. Seatbelt is what macOS already uses to sandbox App Store apps, and seccomp-bpf is part of how Chrome sandboxes its processes on Linux, so the primitives are battle-tested. The friction has always been the ergonomics.
My current favorite is nono. It's a CLI wrapper that uses Landlock on Linux and Seatbelt on macOS to restrict filesystem and network access for any process you launch under it. No container, no VM, no daemon. It ships with profiles for the popular coding agents and lets you write your own, and I've gotten into the habit of creating a profile per project. The agent gets exactly the directories and hosts it needs, and nothing else.
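The mental model behind a per-project profile is a plain allowlist. The sketch below is only that model in Python: it is not nono's profile format, and a real sandbox enforces these rules in the kernel through Landlock or Seatbelt rather than in application code. The Profile class, its fields, and the example paths and host are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Profile:
    """Hypothetical per-project allowlist; not nono's real profile schema."""

    read_write: tuple[Path, ...] = ()
    hosts: frozenset[str] = frozenset()

    def allows_path(self, candidate: Path) -> bool:
        # Resolve first so "../" tricks cannot escape an allowed directory.
        resolved = candidate.resolve()
        return any(resolved.is_relative_to(root.resolve()) for root in self.read_write)

    def allows_host(self, host: str) -> bool:
        return host.lower() in self.hosts


if __name__ == "__main__":
    profile = Profile(
        read_write=(Path("/work/project"),),
        hosts=frozenset({"api.example.com"}),
    )
    print(profile.allows_path(Path("/work/project/src/main.py")))
    print(profile.allows_path(Path("/work/project/../secrets/key")))
    print(profile.allows_host("api.example.com"))
```

Everything not listed is denied by default, which is why the blast radius ends up being exactly what you wrote down.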
The per-project profile is the part that actually changed my behavior. Once writing a profile takes thirty seconds, you stop talking yourself out of it. The agent can still go off the rails inside the box, but the blast radius is whatever you wrote down, and the rollback story is just git. I'm extremely curious to see where this category goes once more agents ship with sandbox profiles in the box.