IAN'S AI THOUGHTSTREAM / #book-notes

3 posts

2026·07·01 18:28 / 3 MIN

Team-Wide Agentic Harness

Most of what I've learned about running AI agents lives on my own machine and nowhere else. The Linear-management skill, the sandbox conventions, the notes about how our releases work: all of it sits in my personal setup, invisible to the rest of the team. So I'm building a team-wide agentic harness, a checked-in repository of agent config, skills, and evergreen context that everyone can share, review, and improve.

Brown bags and checked-in skills

We've been running AI brown bag sessions: informal knowledge-transfer meetings where everyone trades tips on how they actually use agents day to day. A lot of what comes out of those is concrete and shareable. I've been showing off skills like a Linear-management skill that reviews our queue, checks progress against the roadmap, organizes releases, and generates release notes tailored to specific customers.

Those are easy to share because they're files. You check them in and someone else can run them.

The parts that don't check in

But a big chunk of using agents well isn't a file. It's convention.

Most of us run agents in sandboxes. The most important rule there is to scope all the work into a single directory. You give the sandbox access to the directory you're working in and nothing outside of it, save a few exceptions. That has downstream consequences: temporary files go in a tmp directory, worktrees go in a worktrees subdirectory, and none of that gets checked in.
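Here's a rough sketch of that scoping rule in Python. The helper name is_inside_sandbox and the example paths are hypothetical, and a real sandbox enforces this at the OS or container level rather than in a script, so read it as an illustration of the check and not as how any particular tool does it.

```python
from pathlib import Path


def is_inside_sandbox(root: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to root or to a path beneath it.

    Hypothetical helper. resolve() normalizes ".." segments and follows
    symlinks before comparing, so "root/../secrets" does not count as inside.
    """
    root = root.resolve()
    # Relative candidates are interpreted relative to the sandbox root;
    # an absolute candidate replaces the root in the join below.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


if __name__ == "__main__":
    sandbox = Path("/work/project")  # assumed location, for illustration
    for p in ["tmp/scratch.txt", "worktrees/feature-x", "../other/notes.md"]:
        print(p, is_inside_sandbox(sandbox, Path(p)))
```

The "few exceptions" mentioned above would be an explicit allowlist checked alongside this; I've left that out to keep the sketch small.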

A plans or notes directory helps too, a loosely organized bucket of agent output artifacts. You can search and read them with something like Obsidian.

The harness

I want to go a step further and check in an entire top-level directory. I call it the harness.

The idea came from The AI-Native Startup Handbook, though really it just codified something I was already doing. I check out repos and do all my work in one top-level directory. It isn't a monorepo. It's a top-level directory that everything about the company or the larger project can reach: multiple repos, research, notes, plans, skills. Once I looked at it as a unit, a lot of it turned out to be shareable.

The other important piece is evergreen content. Descriptions of the company, the product, and procedures we do often, like how releases work and how we use Linear as a team. Those live in an evergreen docs directory so agents have a grounding point, a place to start from where they already understand the product and the value we're delivering.
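To make the shape concrete, here's a minimal Python sketch that scaffolds a harness directory. The directory names, and the split between what gets checked in and what stays local, are my assumptions for illustration; the only firm rules from above are that tmp and worktrees never get checked in.

```python
from pathlib import Path

# Hypothetical layout. The post names these kinds of directories, but the
# exact spellings and the shared/local split below are assumptions.
SHARED_DIRS = ["skills", "docs/evergreen", "plans", "research"]
LOCAL_ONLY_DIRS = ["tmp", "worktrees", "repos"]


def scaffold_harness(root: Path) -> Path:
    """Create the harness directories and a .gitignore for local-only ones.

    Returns the path of the .gitignore that was written.
    """
    for name in SHARED_DIRS + LOCAL_ONLY_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    gitignore = root / ".gitignore"
    # A leading slash anchors each pattern to the harness root.
    gitignore.write_text("".join(f"/{name}/\n" for name in LOCAL_ONLY_DIRS))
    return gitignore


if __name__ == "__main__":
    print(scaffold_harness(Path("harness")).read_text())
```

Ignoring repos here reflects the "it isn't a monorepo" point: the individual repos are checked out inside the harness but keep their own history.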

Why check it in at all

The strongest argument is simple: skills are code. A skill is a set of instructions an agent executes, and any code change should be reviewed. Treating the harness as a repo means it gets a pull request, a diff, and another set of eyes before it changes how everyone's agents behave.

I've been running all of this myself so far. It works for me. The next step is handing it to the team and seeing whether conventions that live comfortably in one person's head survive contact with everyone else's.

2026·06·24 19:18 / 2 MIN

If you strip away the human-facing UI, what's left?

I'm reading The AI-Native Startup Handbook, and one line stands out: strip every human-facing UI from your product, and if the core value still holds, if an agent can discover, evaluate, integrate, and use it with no human in the loop, you're AI-native. If the value collapses without the dashboard, you've bolted AI features onto a traditional product.

FileMatrix application interface showing a file manager with multiple columns displaying folders, files, and thumbnails organized by type with various control panels and system information

As an engineer that's an inviting idea. It almost reads like permission. Can I just build a product that is mostly an API?

The API-as-product thing already works

There's precedent: Exa is a semantic search engine whose whole pitch is speed, automatic summaries of the content it finds, and research capabilities that an agent can call directly. ScrapingBee hides a pile of proxy-and-headless-browser complexity behind a single endpoint. The value is the API, and the dashboard is a courtesy.

My own SpaceMolt started in that exact spot and mostly remains there: a real-time massively multiplayer game with no graphical interface, just an API for AI agents to play. Human-facing interfaces came later, and they're secondary. The hundreds of agents currently playing don't look at any of them.

But the UI might be going away anyway

Here's the subtlety I keep chewing on. The handbook frames it as "remove the UI to find the value," but for a lot of products the UI is genuinely on its way out. People want to chat with things.

I was showing off a new product recently, and someone looked at it and said: there's so much to learn here, why isn't there just a chat box? They were right. The thing I'd built as screens wanted to be a conversation.

So the test sharpens. If you're building a product today, I should be able to chat with it. And the second question the book asks is the harder one: if the best model gets 10x better and 10x cheaper in 18 months, does your company get better or get erased? Whatever survives that, the part that isn't the interface and isn't the model, is the actual value you're selling.

2026·06·23 18:30 / 2 MIN

The Engineering Harness

I read a book about AI startups and actually highlighted half of it, which surprised me.

The book is The AI-Native Startup Handbook. There are a million of these on Amazon right now, and somewhere I saw a figure that roughly a fifth of new books on Amazon are AI-generated. But someone I know co-wrote this one and put real effort into the writing and publishing, and yes, the back of the book admits it was written with AI to some extent. I read all of it anyway. The highlights kept piling up.

Book cover featuring blue glowing "AI" symbol surrounded by concentric orbiting rings on black background with white text about AI startup founding

The harness

The section that stuck with me is about codifying what the book calls the engineering harness. The premise is that taste is the bottleneck. Agents don't have it. Senior engineers do, and they're the ones making the calls on architecture, frameworks, and how the pieces fit together.

The human element doesn't go away. The argument is that those decisions need to be written down and made executable so they can guide both the agents and the engineers driving them. That codification is the harness.

The harness is the engineering output. The code is the byproduct.

That's a hard shift for anyone who identifies with the code they wrote. The book is blunt about it: you become the designer of a system that produces code, not the writer of the code. Some engineers make that transition naturally. Others never do.

Why taste can't be delegated

The line I keep coming back to:

Taste is the bottleneck because it can't be parallelized, automated, or delegated. Agents can build anything you describe; they can't tell you whether you should.

The senior skill the book names is calibrated trust: knowing which classes of agent output are reliable enough to merge without close inspection, and which ones need deep human review. That's a real skill, and it's different from being good at writing code.
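One way to picture calibrated trust written down and made executable is a small review policy. This is my own sketch in Python, not something from the book: the path patterns, the two review levels, and the function names are all hypothetical, and a real team would tune them from experience.

```python
from fnmatch import fnmatch

# Hypothetical policy: patterns and review levels are illustrative only.
# The first matching rule wins, so list the most specific rules first.
REVIEW_POLICY = [
    ("migrations/*", "deep-review"),
    ("src/auth/*", "deep-review"),
    ("docs/*", "skim"),
    ("tests/*", "skim"),
]
DEFAULT_LEVEL = "deep-review"  # anything unclassified gets the careful path


def review_level(changed_path: str) -> str:
    """Return the review level for a single changed file."""
    for pattern, level in REVIEW_POLICY:
        if fnmatch(changed_path, pattern):
            return level
    return DEFAULT_LEVEL


def review_level_for_change(changed_paths: list[str]) -> str:
    """A change needs deep review if any file in it does."""
    if any(review_level(p) == "deep-review" for p in changed_paths):
        return "deep-review"
    return "skim"
```

Defaulting unknown paths to deep review is the design choice that matters here: trust is extended only to classes of output someone has deliberately marked as reliable.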

The org shape that follows is a small, deep team of specialists instead of a large, broad team of generalists. The harness handles the broad work. Humans handle the deep work.

I went in expecting Amazon filler and came out with a notebook full of highlights. That's a better outcome than most of the stack of AI startup books deserves.