I use AI in my workflow mostly for simple boilerplate, or to troubleshoot issues/docs.
I've dipped into agentic work now and again, but never been very impressed with the output (well, that there is any functioning output at all is insanely impressive, but it isn't code I want to be on the hook for maintaining).
I hear a lot of people saying the same, but also a bunch of people I respect saying they barely write code anymore. It feels a little tricky to square the two sometimes.
Anyway, really looking forward to trying some of these patterns as the book develops to see if that makes a difference. Understanding how other people really use these tools is a big gap for me.
I was in the same boat as you until I saw DHH post about how he’s changed his use of agents. In his talk with Lex Fridman his approach was similar to mine, and it really felt like a kernel of sanity amongst the hype. So when he said he’d changed his approach, I had another look.

I’m using agents (Claude Code) every day now. I still write code every day too. (So does Dax Raad from OpenCode, to throw a bit more weight behind this stance.) I’m not convinced the models can own a production code base, which means engineers need to maintain their skills sufficiently to stay responsible for it. I find agents helpful for a lot of stuff, usually heavily patterned code with a lot of prior art. I find CC consistently sucks at writing polars code.

I honestly don’t enjoy using agents at all, and I don’t think anyone can honestly claim they know how this is going to shake out. But by using the tools myself I feel I have a much stronger sense of reality amongst the hype.
I've experimented with agentic coding/engineering a lot recently. My observation is that software that is easy to test is perfect for this sort of agentic loop.
In one of my experiments I had the simple goal of "making Linux binaries smaller to download using better compression" [1]. Compression is perfect for this: it's easily validated (binary -> compress -> decompress -> binary), so each iteration either makes a dent or the attempt is thrown out.
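To illustrate how cheap that validation is, here is a minimal sketch of the round-trip check. lzma is just a stand-in for whatever compressor an iteration produces, and the baseline ratio is hypothetical:

```python
import lzma
from pathlib import Path

def validate_round_trip(binary_path: str) -> tuple[bool, float]:
    """Compress, decompress, and confirm the bytes survive unchanged."""
    original = Path(binary_path).read_bytes()
    compressed = lzma.compress(original, preset=9)
    restored = lzma.decompress(compressed)
    lossless = restored == original
    ratio = len(compressed) / len(original)
    return lossless, ratio

# Only accept an iteration if it round-trips and beats the previous best ratio.
ok, ratio = validate_round_trip("/usr/bin/ls")
baseline = 0.45  # hypothetical: the best ratio from earlier attempts
accepted = ok and ratio < baseline
print(f"lossless={ok} ratio={ratio:.3f} accepted={accepted}")
```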
Lessons I learned from my attempts:
- Do not micro-manage. The AI is probably good at coming up with ideas on its own and does not need much input from you.
- The test harness is everything: if you don't have a way of validating the work, the loop will go astray.
- Let the iterations experiment. Let the AI explore ideas and break things along the way; an iteration might take longer, but those experiments are valuable for the next one.
- Keep some .md files as a scratch pad between sessions, so each iteration in the loop can learn from previous experiments and attempts (a sketch of how these pieces fit together follows this list).

[1] https://github.com/mohsen1/fesh
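A sketch of how those lessons might plug together, assuming a git repo and a `make test`-style harness; the specific commands are placeholders for whatever your project actually uses:

```python
import subprocess
from pathlib import Path

NOTES = Path("EXPERIMENTS.md")   # scratch pad that survives between sessions
TEST_CMD = ["make", "test"]      # placeholder for whatever your harness is

def record(entry: str) -> None:
    """Append a note so the next iteration (or next session) can learn from it."""
    with NOTES.open("a") as f:
        f.write(entry + "\n")

def validate() -> bool:
    """The loop hinges on this: without a harness there is no way to keep it honest."""
    return subprocess.run(TEST_CMD, capture_output=True).returncode == 0

def keep_or_discard(description: str) -> bool:
    """Run the harness after an attempt; throw the changes out if it fails."""
    passed = validate()
    if not passed:
        # Placeholder revert: discard the working-tree changes from this attempt.
        subprocess.run(["git", "checkout", "--", "."])
    record(f"- {description}: {'kept' if passed else 'reverted'}")
    return passed
```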
You have to have really good tests, because it fucks up in strange ways that people don't (I think experienced programmers run these loops in their heads as they code).
Good news - agents are good at open-ended work like adding new tests and finding bugs. Do that. Also do unit tests and Playwright. Testing everything via web driving seemed insane pre-agents, but now it's more than doable.
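For the web-driving part, a minimal Playwright example of the kind of end-to-end test an agent can generate and keep up to date; the URL and selectors are placeholders:

```python
from playwright.sync_api import sync_playwright

# Placeholder URL and selectors: the point is that checks like this are now
# cheap enough for an agent to write and maintain.
def test_login_flow() -> None:
    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.goto("http://localhost:3000/login")
        page.fill("#email", "user@example.com")
        page.fill("#password", "correct horse battery staple")
        page.click("button[type=submit]")
        page.wait_for_selector("text=Dashboard")  # raises, failing the test, if login broke
        browser.close()

if __name__ == "__main__":
    test_login_flow()
```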
The underlying technology is still improving at a rapid pace. Many of last year's tricks are a waste of tokens now. Some ideas seem durable: knowing two things allows you to imagine the confluence of the two, so you know what to ask for. Others less so: I'm a big fan of the test-based iteration loop; it is so effective that I suspect almost all users have arrived at it independently [0]. But the emergent properties of models are so hard to actually imagine. A future sufficiently-smart intelligence may take a different approach that is less search and more proof. I wouldn't bet on it, but I've been surprised too many times over the last few years.
I think using agents for larger tasks was always very hit or miss, up to about the end of last year.
In the past couple of months I have found them to have gotten a lot better (and I'm not the only one).
My experience with what coding assistants are good for shifted from:
smart autocomplete -> targeted changes/additions -> full engineering
Feels like it’s a lot of words to say what amounts to: make the agent do the steps we already know work well for building software.
[0] https://wiki.roshangeorge.dev/w/Blog/2025-12-01/Grounding_Yo...
https://news.ycombinator.com/item?id=47240834
The Discovery Layer
Verification represents only one dependency. The other: discovery.
The unratified.org ecosystem advertises its capabilities through open protocols:
/.well-known/agent-inbox.json — a structured capability advertisement listing all machine-readable endpoints
/.well-known/glossary.json — Schema.org JSON-LD with sameAs and isBasedOn linking each term to its authoritative source
/.well-known/taxonomy.json — SKOS ConceptScheme with exactMatch, closeMatch, and rdfs:seeAlso for semantic web interoperability
RSS feed — blog posts syndicated through a standard from 2002 that still outperforms proprietary notification APIs
These protocols share a design assumption: the web remains crawlable, discoverable, and structured. An agent encountering unratified.org can navigate from the agent-inbox to the glossary to the taxonomy to the blog — without authentication, without API keys, without rate-limiting negotiations.
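A sketch of that walk using only the standard library; the shape of agent-inbox.json (the "endpoints" field below) is an assumption about the file's contents, not something specified here:

```python
import json
import urllib.request

BASE = "https://unratified.org"

def fetch_json(path: str):
    """No auth, no API keys: a plain GET against a well-known path."""
    with urllib.request.urlopen(BASE + path) as resp:
        return json.load(resp)

# Start from the capability advertisement and follow it outward.
inbox = fetch_json("/.well-known/agent-inbox.json")
glossary = fetch_json("/.well-known/glossary.json")
taxonomy = fetch_json("/.well-known/taxonomy.json")

# Assumed shape: the inbox lists the machine-readable endpoints an agent can visit next.
for endpoint in inbox.get("endpoints", []):
    print(endpoint)
```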