Really nice visualisation of this, makes understanding the flow at a high levle pretty clear. Also the tool system and command catalog, particularly the gated ones are super interesting.
Kairos and auto-dream are more interesting than anything in the agent loop section. Memory consolidation between sessions is the actual unsolved problem. The rest is just plumbing tbh
I guess they really do eat their own dogfood and vibe code their way through it without care for technical debt? In a way, it’s a good challenge, but it’s fairly painful to watch the current state of the project (which is about a year old now, so it should be in prime shape).
There's this weird thing about AI generated content where it has the perfect presentation but conveys very little.
For example the whole animation on this website, what does it say beyond that you make a request to backend and get a response that may have some tool call?
That's fair. The site isn't meant to be a deep technical dive, it's more of a visual high-level guide of what I've curated while exploring the codebase while assisted by AI, 500k loc codebase is just too much to sift through in a short amount of time.
I doubt there is anything special about the transformer code the frontier labs use. The only thing proprietary in it are probably the infrastructure-specific optimizations for very large scale distributed training and some GPU kernel tricks. The real moat is the training data, especially the RLHF/finetuning data and verifiable reward environments, and the GPU clusters of course.
The open source models are quite close, and they'd probably be just as good with the equivalent amount of compute/data the frontier labs have access to.
I mean, I get it: vibe-coded software deserves vibe-coded coverage. But I would at least appreciate it if the main part of it, the animation, went at a speed that at least makes it possible to follow along and didn't glitch out with elements randomly disappearing in Firefox...
I feel the same way. Given it's AI-written, looking at the code isn't even worth it to me. I would rather read a blog post about how they develop it day to day.
Thanks, I'll use this for teaching next week (on what not to do). BashTool.ts :D But, in general, I guess it just shows yet again that the emperor has no clothes.
Also I definitely want a Claude Code spirit animal
(Yes, I know I can turn it off. I have.)
“Complete thyself.”
And I want an octopus. Who orchestrates octopuses.
For example the whole animation on this website, what does it say beyond that you make a request to backend and get a response that may have some tool call?
The open source models are quite close, and they'd probably be just as good with the equivalent amount of compute/data the frontier labs have access to.
How is this on the front page?
I use it all day and love it. Don't get me wrong. But it's a terminal-based app that talks to an LLM and calls local functions. Ooookay…
- find nothing - still manage to fill entire lages - somehow have a similar structure - are boring as fuck
At least this one is 3/4, the previous one had BINGO.