The sigmoids won't save you

(astralcodexten.com)

41 points | by Tomte 6 hours ago

16 comments

zkmon 1 minute ago
The curve is a smoothed step curve (y=1 if x>1, otherwise 0). Nature doesn't allow any change to happen instantly, at any order of rate of change. The curve is just a manifestation of a change with exponential smoothing of the sharp corners.
For example, when a car starts, its speed and acceleration become more than zero. But what about the higher-order rates of change? The car doesn't suddenly jump from zero acceleration to non-zero. That means the car has a non-zero derivative at all orders. In other words, the movement is exponential. The same thing happens in reverse when the car reaches a constant speed.
LarsDu88 11 minutes ago
I think an interesting thing about recent AI developments is that it's all happening right as we hit the diminishing returns side of another "exponential that's actually a sigmoid", which is Moore's law.
The naive expectation is that AI will slow down b/c Moore's law is coming to an end, but if you really think about the models and how they are currently implemented in silicon, they are still inefficient as hell.
At some point someone will build a tensor processing chip that replaces all the digital matmuls with analogue logamp matmuls, or some breakthrough in memristors will start breaking down the barrier between memory and compute.
With the right level of research funding in hardware, the ceiling for AI can be very high.
- throwaway27448 2 minutes ago
  Even at orders of magnitude greater speed, we've still hit diminishing returns for quality of output. We simply haven't found anything like superhuman reasoning ability, just superhuman (potentially) reasoning speed.
- cyanydeez 8 minutes ago
  they already did put a model into the silicon and it's crazy fast. https://chatjimmy.ai/
  I'm pretty sure there's a 3-year design goal starting this year that'll do that to any of the qwen, deepseek, etc. models. There's a lot you could do with sped-up models of this quality.
  It might even be bad enough that the real bubble is how much we don't need giant data centers when 80-90% of use cases could just be a silicon chip with a model rather than as you say, bloated SOTA
  - clickety_clack 0 minutes ago
    It would be pretty cool to have interchangeable usb keys with models on them.
Brendinooo 8 minutes ago
> then what is their model?
My mental model has been 3D computer graphics: doubling the polygon count had huge returns early on but delivered diminishing returns over time.
Ultimately, you can't make something look more realistic than real.
I don't know what the future holds, but the answer to the question "can LLMs be more realistic than real" will determine much about whether or not you think the curve will level off soon.
gm678 45 minutes ago
I don't know what the Y-axis is supposed to be on that Wharton AI capabilities graph, but I am not really convinced that Opus 4.6 has more than double the intelligence/capability/whatever of GPT 5.1 Max.
- throwaway27448 0 minutes ago
  > more than double the intelligence/capability/whatever
  I'm curious what people really mean when they say this. Intelligence is famously hard to define, let alone measure; it certainly doesn't scale linearly; it only loosely correlates to real-world qualities that are easy to measure; etc. Are you referring to coding ability or...?
- strken 8 minutes ago
  Check out Re-Bench and HCAST.
  The tasks are obviously all of the form "Go do this, and if you get the following output you passed". Setting up a web server apparently takes 15 minutes for a human, which is news to me since I'm able to search for https://gist.github.com/willurd/5720255, find the python one-liner, and copy it within about ten seconds.
  Anyway, this is cool but it does not mean Claude can perform any human tasks that take less than 8 hours and are within its physical capabilities.
- NitpickLawyer 43 minutes ago
  IIRC that graph tracks capabilities as time_to_solve a task for humans (i.e. the model can now handle tasks that usually take a human ~8h). Which, depending on what tasks you look at, could be a reasonable finding. I could see Opus 4.6 handling tasks that take ~8h for humans, and that 5.1 couldn't previously handle (with 5.1 being "limited" at 4h tasks let's say). It is a bit arbitrary, but I think this is what they're tracking.
  - lukan 15 minutes ago
    "It is a bit arbitrary, but I think this is what they're tracking."
    I don't know if they can get their numbers right this way, but this seems a way more useful metric than theoretical capabilities.
- myhf 13 minutes ago
  According to this article: whenever someone games a benchmark to make an upward chart on some y-axis, it's YOUR responsibility to prove how and why that trend can't continue indefinitely.
  🙄
  - AnimalMuppet 10 minutes ago
    I'm pretty sure that gaming benchmarks can continue indefinitely.
- BoredPositron 44 minutes ago
  https://metr.org/time-horizons/ on linear scale. Clickbait garbage article as most of his in the last year.
  - afthonos 38 minutes ago
    …yeah, that’s where you see the exponential?
btilly 7 minutes ago
Lindy’s Law is an absolute gem that I'm keeping.
If we don't understand the fundamental limits to any particular kind of trend, our default assumption should be that it will continue for about as long as it has gone on already.
We can, in fact, easily put a confidence interval on this. With 90% odds we're not in the first 5% of the trend or the last 5% of the trend. Therefore it will probably go on for between 1/19th as long and 19 times as long as it has so far, with a median of as long as it has gone on so far.
This is deeply counterintuitive. When we expect something to last a finite time, every year it goes on brings us a year closer to when it stops. But every year that it goes on also brings the expectation that it will go on for a year longer still.
We're looking at a trend. We believe that it will be finite. Our intuition is that every year spent is a year closer to the end. But our expectation becomes that every year spent means it will last yet another year more!
How can we apply that? A simple way is stocks. How long should we expect a rapidly growing company to continue growing rapidly?
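The 90% interval described above can be sketched in a few lines of Python. This is only an illustration under one stated assumption: that we are observing the trend at a uniformly random fraction f of its total lifetime, so remaining / elapsed = (1 - f) / f. The function name `lindy_interval` and its parameters are made up for this sketch.

```python
def lindy_interval(elapsed, confidence=0.90):
    """Return (low, median, high) estimates of the remaining duration.

    Assumption (not a law of nature): the observation happens at a
    uniformly random fraction f of the trend's total lifetime, so
    remaining / elapsed = (1 - f) / f.
    """
    if elapsed <= 0:
        raise ValueError("elapsed must be positive")
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")

    tail = (1 - confidence) / 2            # 0.05 for a 90% interval
    low = elapsed * tail / (1 - tail)      # f = 0.95: almost finished
    high = elapsed * (1 - tail) / tail     # f = 0.05: barely started
    median = elapsed                       # f = 0.50: halfway through
    return low, median, high


if __name__ == "__main__":
    # A trend that has run for 19 years so far.
    low, median, high = lindy_interval(19.0)
    print(f"remaining: {low:.1f} to {high:.1f} years, median {median:.1f}")
```

With 19 years elapsed, the bounds work out to roughly 1 more year and 361 more years, which is the 1/19 and 19x range, with the median equal to the time already elapsed.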
philipallstar 50 minutes ago
But they do explain the improvement of AI driving 2017-2021 vs 2022-2026.
kubb 14 minutes ago
If the scary AI is so inevitable, why do you feel such an overwhelming need to convince people about that? Surely you can just wait a bit, and they'll see for themselves.
- adleyjulian 8 minutes ago
  1. It's not inevitable. 2. Those that see AI as an existential risk don't generally think it's a guarantee, but if it's say a 5% chance then that's worth addressing/mitigating. 3. That's not what this article was even about.
itkovian_ 4 minutes ago
The other thing people don’t understand is that exponential curves are self-similar. The start of an exponential looks like an exponential. People always look at it and think ‘well, that’s it, it’s exponential now, I’ve missed it, it can’t be sustained’. Nope.
Good example of this is number of submissions to neurips/icml/iclr. In 2017 that curve was exponential.
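The self-similarity claim can be checked numerically. A minimal sketch, assuming a pure exponential exp(rate * t): any window of the curve, divided by its own starting value, has exactly the same shape as any other window of the same length, because exp(rate * (s + t)) / exp(rate * s) = exp(rate * t). The helper name `normalized_window` and the sample values are hypothetical.

```python
import math


def normalized_window(rate, start, length, steps=5):
    """Sample exp(rate * t) on [start, start + length], scaled by its first value."""
    base = math.exp(rate * start)
    return [
        math.exp(rate * (start + length * i / steps)) / base
        for i in range(steps + 1)
    ]


if __name__ == "__main__":
    early = normalized_window(rate=0.5, start=0.0, length=4.0)
    late = normalized_window(rate=0.5, start=20.0, length=4.0)
    # Same shape regardless of where on the curve the window starts.
    for a, b in zip(early, late):
        print(f"{a:.4f}  {b:.4f}")
```

So looking at the rescaled curve alone tells you nothing about how far along it you are, which is the point being made about feeling like you've "missed it".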
krupan 26 minutes ago
News flash: predicting the future is hard
- energy123 19 minutes ago
  The individual who is the best at predicting the future is predicting ASI and full labor automation by 2040:
  https://xcancel.com/peterwildeford/status/202963666232244661...
  - Aurornis 15 minutes ago
    > The individual who is the best at predicting the future
    Going to need a big citation for that claim
    - margalabargala 4 minutes ago
      Source: trust me bro
  - gerikson 17 minutes ago
    Past results are no guarantee of future performance.
  - margalabargala 4 minutes ago
    > The individual who is the best at predicting the future
    Lol
andai 46 minutes ago
Well, curve shape aside, the high watermark might be lower than where it tapers off.
https://news.ycombinator.com/item?id=46199723
inglor_cz 43 minutes ago
Hmmm, this is quite an interesting take by Scott.
Lindy's Law is not actually a law and many exact minds will be provoked by the very name; it also fails spectacularly in certain contexts (e.g. lifetime of a single organism, though not necessarily existence of entire species).
But at the same time, I am willing to take its invocation in the context of AI somewhat seriously. There is an international arms race with China, which has less compute, but more engineers and scientists. This sort of intellectual arms race does not exhaust itself easily.
A similar space race in the 1950s and 1960s progressed from the first unmanned spaceflight to a moonwalk in a mere 12 years, which is probably less than what it takes to approve a bicycle lane in Chicago now.
- krupan 26 minutes ago
  "There is an international arms race with China"
  I keep seeing this. Where did it come from? Has China said that they intend to attack other countries using AI? Have other countries declared that they intend to attack China with AI?
  Also, why does anyone believe that AI could actually be that dangerous, given its inherently unpredictable and unreliable performance? I would be terrified to rely on AI in a life-or-death situation.
  - aspenmartin 15 minutes ago
    AI in war is like Palantir's whole business model. You have a system that can effectively deal with ambiguity and has superhuman performance on reasoning plus superhuman physical abilities via embodiment…
    Inherently unpredictable and unreliable performance is quite the feature of human beings as well.
  - dmbche 21 minutes ago
    https://www.forbes.com/sites/greatspeculations/2025/11/25/wh...
  - inglor_cz 23 minutes ago
    It was a metaphor. I meant, and later clarified, an intellectual arms race.
    BTW your handle is an actual Czech word, minus a diacritic sign ("křupan"), and a somewhat amusing one. It basically means hillbilly. Not that it matters, just FYI.
devmor 45 minutes ago
"Exponentials all tend to become sigmoids but you can't predict exactly when" is a true statement, but I'm not sure it needed an article.
This doesn't say much, and the author fights their own points a couple times, suggesting that they maybe didn't think through what they wanted to write until they were in the middle of writing it and started realizing their assumptions didn't match what they expected the data to say.
I really don't get the point of what I just read.
- aspenmartin 13 minutes ago
  The point is the tiring arguments from AI skeptics saying “things are flattening, they have to”, which, while technically correct, says nothing, because no one knows when that will happen and we see no mechanism for it yet. Lindy’s law as a reasonable prediction under total uncertainty is interesting and insightful, and a lot of people don’t know about it or why it holds. I did enjoy the reference to this!
nathan_compton 47 minutes ago
A lot of words to say "The initial part of a sigmoidal curve is not very informative about the parameters of the sigmoid function in question."
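To make that one-liner concrete, here is a rough sketch assuming the standard logistic form ceiling / (1 + exp(-rate * (t - midpoint))). Far below the midpoint this is approximately ceiling * exp(-rate * midpoint) * exp(rate * t), so early data pins down only the growth rate and that combined prefactor, not the ceiling itself. The function names and the numbers (ceilings of 1e3 and 1e6) are arbitrary choices for illustration.

```python
import math


def logistic(t, ceiling, rate, midpoint):
    """Standard logistic (sigmoid) curve."""
    return ceiling / (1.0 + math.exp(-rate * (t - midpoint)))


def matched_midpoint(ceiling, rate, prefactor):
    """Midpoint chosen so that ceiling * exp(-rate * midpoint) == prefactor."""
    return math.log(ceiling / prefactor) / rate


RATE = 1.0
PREFACTOR = 1.0  # early on, both curves look like PREFACTOR * exp(RATE * t)

# Two sigmoids whose ceilings differ by a factor of 1000.
SMALL = dict(ceiling=1e3, rate=RATE, midpoint=matched_midpoint(1e3, RATE, PREFACTOR))
BIG = dict(ceiling=1e6, rate=RATE, midpoint=matched_midpoint(1e6, RATE, PREFACTOR))


def relative_gap(t):
    """Relative difference between the two curves at time t."""
    small, big = logistic(t, **SMALL), logistic(t, **BIG)
    return (big - small) / big


if __name__ == "__main__":
    for t in (0.0, 2.0, 5.0, 10.0):
        print(f"t={t:>4}: small={logistic(t, **SMALL):.4g} "
              f"big={logistic(t, **BIG):.4g} gap={relative_gap(t):.2%}")
```

Up to about t = 2 the two curves differ by less than 1%, even though one tops out a thousand times higher than the other; by t = 10 they are more than an order of magnitude apart.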
- inglor_cz 38 minutes ago
  That is true, but I generally enjoy reading a lot of words from Scott, who has a talent for writing.
  The entire plot of the Lord of the Rings could probably be compressed into less than 10 kB of text too.
  Edit: this seems to be a controversial comment, but IMHO a blog of Scott Alexander's type is an art form, not just a communication channel.
  - jeffreyrogers 14 minutes ago
    I find him more interesting when he talks about non-AI topics. Lots of other interesting people are like this too. I'd rather get my knowledge on AI from people who have unique insights into it. Scott has a lot of unique perspectives of his own, but his views on AI are bog-standard for his social group.
addaon 44 minutes ago
https://xkcd.com/605/
BoredPositron 46 minutes ago
If you use the log scale you'll see that the time horizon of opus 4.6 was as expected...
- afthonos 36 minutes ago
  As expected by the exponential. The Wharton study was predicting when the exponential would turn into a sigmoid.
- ReptileMan 15 minutes ago
  Everything is linear on a log log scale with a fat marker.
theturtle 39 minutes ago
[dead]