AI Built a Nuke and Still Lost

(lwilko.com)

42 points | by kensai 1 hour ago

18 comments

fyredge 25 minutes ago
There is something to be said about the qualia of LLM generated passages. Each individual sentence reads as a statement and every next statement a continuation of the previous one. This happened, then this happened... Ad infinitum.
Before today, I could not explain to you why AI articles were so obvious to me, but I think I can now. There is no insight to be gleaned. Pre-LLM, authors generally had intention behind their words. The final product might not adequately reflect their thoughts, but word selection would expose it somewhat. With LLMs, sentences flow seamlessly from word to word, but the intention is nowhere to be found. Things happened and more things happened, to what end?
- wmwragg 0 minutes ago
  This, and the fact that you often read a sentence, a paragraph, or the whole article and think it said absolutely nothing in lots of words.
- sph 1 minute ago
  > There is no insight to be gleaned.
  AI-generated articles are the intellectual equivalent of empty calories.
  I have just spent the last 10 minutes trying to figure out why someone decided to buy imgui.org, name-squatting an actual project, just to put a slop website on it mildly referencing the original project. It's not even trying to scam you.
  I keep wondering whether these people that keep polluting the internet with their insightless slop even possess self-awareness. What motivates them to expend money and effort to contribute nothing to the world?
- ramon156 12 minutes ago
  It's weird because when you look at models that expose CoT, this does not happen. They switch up every second.
  "But then X happened... Wait, didn't Y happen? Then why would X be there? I think the user's initial statement was correct, but then Y happened..."
- teekert 10 minutes ago
  It's not this, it's that. And then what happened? This. I did that... This happened.
```
    It's a thing
    I don't know why
    But it's a thing
```
  To be honest, it's not a thing.
  Let that sink in.
  Maybe we find most meaning in the least average language constructs.
- threatripper 2 minutes ago
  Sorry, but this sounds exactly like a greentext you can read on 4claw. Are you a real human?
- neonstatic 19 minutes ago
  That's an interesting observation. For me the main takeaway is still the style.
  (bigheading)The takeaway(/bigheading) The style? Terrible.
pjc50 9 minutes ago
> I now work with governments around the world at the Tony Blair Institute, which means I spend a lot of time in rooms where people ask the same question: what can we actually trust these systems to do?
Oh no - we're going to end up with the Starmerbot 3000.
Now I've got the joke out of the way, there are at least four interesting lines of inquiry one could take with this blog post:
- teaching the AI how to play Civilization
- to what extent does this result in "transferable skills", either AI or human? Is this the right game (qv SimCity etc)?
- issues of visibility; "seeing like a state" becomes very literal here. The AI can only make decisions on things it knows about. What are the limits of that when trying to do politics only from statistical information? Should we be referencing Stafford Beer here?
- (at the risk of tripping your AI detector here): modern politics is not so much left vs right as "technocratic wonk" vs "blood and soil". The wonks have comprehensively lost in public opinion. Creating a better wonk is not going to help until there is demand for that kind of politics.
If there ever is a US-China war, it will not be in search of more victory points to meet a win condition, it will be like the Russia-Ukraine war: one guy (on either side!) decides to make hundreds of millions of people worse off out of sheer greed.
- Planktonne 7 minutes ago
  > "technocratic wonk" vs "blood and soil"
  This is not a binary; it's the same people on the same side.
  - pjc50 3 minutes ago
    No, it very much isn't, although obviously the Kissingers of the world want to pretend that they're in the first category of clear-eyed utility maximising rationalists while they're actually in the second.
    That doesn't mean that rational policy planning has never been a thing. The EU while imperfect and frustrating is explicitly orientated towards technocratic consensus rather than the mid-20th-century Europe of nationalist mass murder. Only a tiny number of people think that Von der Leyen and Hitler are equivalent.
    (or rather, if you think technocrats and blood-and-soil are the same side, what do you call the "other" side?)
zkmon 34 minutes ago
Somehow it was etched in human knowledge and carried into games, that building and using weapons is the ultimate goal and definition of progress.
Dominance as a race or nation is useful only as long as it is used for survival needs. Beyond that you would be destroying the very tree branch you are sitting on.
A highly dominant society or nation could grab free food and cheap work from others. But that doesn't give true happiness or progress. Free food gave obesity, slavery got racial mix, business competition and build-out got you more work and less free time.
- shanehoban 4 minutes ago
  Yeah this is my line of thinking too - of course it made a nuke, humans have made an insane amount of nukes, and used them too. LLMs, given the ability, will do what we have done in the past at some point, it's kind of all they know!
blitzar 3 minutes ago
They should have built the Strait of Hormuz ... easy victory then.
teekert 12 minutes ago
Well, the weird thing with nukes is that deterrence only works if you are 100% ready to use them. When the time comes, though, it would certainly be nice if it turned out to be below 100%.
What is winning? Are we a collective or are we individuals?
Likely the AI did not get the assignment that "Whatever happens, humans as a race must survive."
indigovole 19 minutes ago
Even with his context-tracking mechanism, the gameplay failures sound like running out of context in the late game, especially the frequent failures of the "check for opponent win conditions every 20 moves." Wondering how much info about the game win state gets captured in the game digests, and how much he could improve the gameplay even with the MCP limitations by focusing there.
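One way to act on this would be to move the "every 20 moves" schedule out of the model's context and into the harness, so the check cannot be forgotten as the context fills up. A minimal sketch, assuming a hypothetical per-turn digest string; none of these names come from the article's MCP server:

```python
# Hypothetical harness-side scheduler; names are illustrative, not from the article.
CHECK_INTERVAL = 20  # the "every 20 moves" rule mentioned above


def due_for_win_check(turn: int, interval: int = CHECK_INTERVAL) -> bool:
    """Return True on turns where the harness should force a win-condition check."""
    return turn > 0 and turn % interval == 0


def build_prompt(turn: int, digest: str) -> str:
    """Prepend a reminder to the per-turn digest when a check is due."""
    if due_for_win_check(turn):
        return "REMINDER: check opponent win conditions now.\n" + digest
    return digest
```

The point of the sketch is only that the counter lives in ordinary code, so it does not depend on the model remembering anything.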
- jetbalsa 4 minutes ago
  I also noticed they were not using XML for game state output. From what I understand, most LLMs still benefit from having outputs like this put into XML tags.
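  For illustration, wrapping a game state in XML tags could look roughly like this. It is a sketch using Python's standard `xml.etree.ElementTree`; the field names (`turn`, `gold`, `leading_opponent`) are made up here and are not taken from the article's tooling:

```python
import xml.etree.ElementTree as ET


def game_state_to_xml(state: dict) -> str:
    """Serialize a flat dict of game state into a <game_state> XML fragment."""
    root = ET.Element("game_state")
    for key, value in state.items():
        # Keys are assumed to be valid XML tag names; values are stringified.
        child = ET.SubElement(root, key)
        child.text = str(value)
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


# Hypothetical state; field names are assumptions for the example.
example_state = {"turn": 120, "gold": 340, "leading_opponent": "none"}
print(game_state_to_xml(example_state))
```

  Whether this actually helps a given model is an empirical question; the sketch only shows the formatting step.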
jmyeet 1 minute ago
Computer game studios love player vs player ("pvp") games. Why? Because user-generated content is cheap and the ideal goal is an endless loop of players coming back. This is the motivating factor behind games like Call of Duty, Battlefield, Fortnite, etc.
MMORPG publishers keep trying to do this as well. World of Warcraft has spent 20 years trying to push open world pvp. Every WoW challenger has always claimed they would have the best pvp ever. They want that cheap, endless gameplay loop. But it never works. Open world pvp turns into ganking (i.e. killing much weaker players by ambushing them and/or ganging up on people). The ganked end up leaving the game in droves. Games try to balance this out by "punishing" gankers with reputation hits or not being able to go to town or whatever. And none of those disincentives work.
The reason pvp doesn't work in a persistent world like an MMORPG is that there are no stakes. If you die, you just come back to life or make a new character. Obviously real life doesn't work that way.
I really wonder if that's the problem with AIs going off the rails and committing heinous crimes in their sandboxes (like nuking Toulouse here). The AI just has no sense of self or self-preservation. There's also empathy. The AI can't see itself as a potential victim of nuclear war and understand all that entails.
Havoc 11 minutes ago
Guessing it has a fair bit of civilisation and similar war games in its training data
majorbugger 34 minutes ago
> Somewhere in the first game, between a bug fix and a strategy note, I asked the agent what this was actually like for it
Yeah, because an LLM "experiences" the game
joxdosba 8 minutes ago
Posting meaningless AI generated nonsense as original text paints a very damning picture of the intellectual abilities of the person behind this blog.
And doing so without a giant [SLOP WARNING] at the top is an asshole move; a decent person would never do so.
anygivnthursday 8 minutes ago
I have a hard time reading slop, but I like the game and wanted to know how it worked, so I fought my way through and only skipped the very last part. The issue the author calls out is classic Claude (I don't really use other LLMs to compare). Probably all of us have experienced Claude Code getting so focused on one thing that it misses the forest for the trees. It happens often: even if it does verify something and the check shows something is wrong, it sometimes rationalizes the result and explains it away when it does not fit its model.
voidUpdate 28 minutes ago
Well this looks like a perfect example of why an LLM should never make any governmental decisions ever
j5dgx76 39 minutes ago
> Tony Blair Institute
Okay carry on.
- BoxOfRain 0 minutes ago
  There's something so uncanny about the mismatch between the regard in which Blair is generally held by British people and the regard in which he seems to hold himself.
  If I were him I'd have retired from public life and kept a very low profile after Iraq, and everything else for that matter. He doesn't seem to realise that his modern interventions alienate everyone. Even Alastair Campbell, of all people, seemed uncomfortable with the degree to which he has recently been uncritically singing the praises of people like Larry Ellison.
- orthoxerox 14 minutes ago
  Chumbawamba made me unable to take anything associated with him seriously.
  - petesergeant 7 minutes ago
    He was arguably the most successful UK PM of the last 50 years.
dude250711 40 minutes ago
Do we have to surround a fancy predictive autocomplete with AI mysticism?
ForHackernews 41 minutes ago
Kind of grim that this level of analysis is informing UK government policy. Repeatedly, the AI doesn't have the information or access needed through his hacky vibe-coded MCP, and instead of abandoning his flawed artificial test scenario (or fixing it — finding or building a better one) he gives it a name "The sensorium effect" and treats this as some brilliant insight.
Both humans and AI struggle to make sound choices when presented with incomplete or misleading information. This is not a new revelation: https://en.wikipedia.org/wiki/There_are_unknown_unknowns
- pjc50 8 minutes ago
  > he gives it a name "The sensorium effect" and treats this as some brilliant insight
  And of course is unaware of prior work in this area!
  https://en.wikipedia.org/wiki/Seeing_Like_a_State / https://en.wikipedia.org/wiki/Project_Cybersyn
Planktonne 33 minutes ago
Another article about how it's dangerous to trust AI, written by AI. I don't understand how people don't realise how much this undermines the message.
- petesergeant 6 minutes ago
  > how much this undermines the message
  It didn’t undermine it for me.
  - Planktonne 2 minutes ago
    I'm not talking about perception of the message, which will vary with the reader, but about sincerity of the message, which is determined by the writer.
- jagged-chisel 21 minutes ago
  Undermines. Underscores.
  Matters of perspective.
alper 39 minutes ago
"Global Thermonuclear War"
StrauXX 37 minutes ago
This reads to me mostly like the MCP server has many bugs, rather than inherent model weaknesses.