RTX 5090 and M4 MacBook Air: Can It Game?

(scottjg.com)

156 points | by allenleee 1 hour ago

12 comments

matthewfcarlson 42 minutes ago
I have been bothering the VM team for years for VM GPU pass through. I worked on the Apple Silicon Mac Pro and it would have made way more sense if you could run a linux VM and pass through the GPU that goes inside the case!
Sadly, as you can tell, they have not taken me up on my requests. Awesome that other people got it working!
[-]
- crdrost 24 minutes ago
  It feels like half the problem in this blog post is dealing with memory access issues induced by QEMU and the VM boundary... it's probably something dumb I'm missing, but if you boot up Ubuntu in Docker, wouldn't the NVIDIA drivers still load? And then you wouldn't have to fight Apple about the memory management because OSX would still own the memory?
  [-]
  - swiftcoder 19 minutes ago
    > but if you boot up Ubuntu in Docker, wouldn't the NVIDIA drivers still load?
    Even if the drivers loaded, they can't talk to the GPU from within docker (unless one implements PCI passthrough). MacOS owns the PCI bus in this scenario.
  - jmalicki 21 minutes ago
    The driver wants to own the memory is the problem.
- brcmthrowaway 21 minutes ago
  I still believe the lack of NVIDIA GPU support in the Mac Pro will go down as one of the greatest missed opportunities in tech.
  Anyway, the Mac Pro is dead now. There's only so much sales audio and video professionals can provide.
  [-]
  - Aurornis 1 minute ago
    > I still believe the lack of NVIDIA GPU support in the Mac Pro will go down as one of the greatest missed opportunities in tech.
    I don’t know about that. Apple supported some full size GPUs in past product lines and the number of users was very small. Granted, LLMs change that demand but the audience for Mac Pro buyers who would use a full-size GPU that is impossible to obtain is almost nothing compared to their laptop sales.
Aurornis 6 minutes ago
Excellent article.
The game benchmarks are fun but the LLM improvements are where this gets really interesting for practical use. I love Apple platforms as an approachable way to run local models with a lot of RAM, but their relatively slow prompt processing speed is often overlooked.
> Here you can see the big issue with Macs: the prompt processing (aka “prefill”) speed. It just gets worse and worse, the longer the prompt gets. At a 4K-token prompt, which doesn’t seem very long, it takes 17 seconds for the M4 MacBook Air to parse before we even start generating a response. Meanwhile, if you strap the eGPU to it, it’ll only take 150ms. It’s 120x faster.
The prefill problem goes unnoticed when you’re playing around with the LLM with small chats. When you start trying to use it for bigger work pieces the compute limit becomes a bottleneck.
The time to first token (TTFT) charts don’t look bad until you notice that they had to be shown on a logarithmic scale because the Mac platforms were so much slower than full GPU compute.
mywittyname 44 minutes ago
> As much as I hate to admit it, step one in most of my projects now is to ask AI about it. Maybe it’ll tell me something I don’t know.
Or, more likely, it will tell you something it doesn't know.
Reminds me of yesterday, when I was arguing with ChatGPT that the 5070TI was an actual video card. It kept trying to correct me by saying I must have meant a 4070ti, since no such 5070ti card exists.
[-]
- collabs 31 minutes ago
  Or, it will acknowledge that it made a mistake and continue to make the same mistake again.
  I asked Claude to generate an HTML page about PowerShell 7. It gave me a page saying 7.4 was the latest LTS release. I corrected it with links showing 7.6 was released in March and asked it to regenerate with the latest information.
  It generated basically the same page with the same claim that 7.4 was the latest release.
  [-]
  - ericmay 17 minutes ago
    > Or, it will acknowledge that it made a mistake and continue to make the same mistake again.
    People do this too though. At least the AI generally tries to follow instructions that you give it even when you are lacking clarity in the details.
    I feel like it's similar to the self-driving car problem. The car could have 99.9999% reliability, drive much better and safer than a human, yet folks will still freak out about a single mistake that's made even though you have actual humans today driving the wrong way down the highway, crashing in to buildings, drunk driving, stealing cars, and all sorts of other just absolutely stupid things.
    We need to move away from this idea that because it's an AI system it should give you perfect responses. It's not a deterministic system and it can be wrong, though it should get better over time. Your Google search results are wrong all the time too. The NYT writes things that are factually incorrect. Why do we have such a high standard for these models when we don't apply them elsewhere?
    [-]
    - bryceacc 6 minutes ago
      >I corrected it with links
      it should be reasonably expected that you can give a source and fix an error in the AI output.
      I would even go as far as to say if a human directly told the AI "no, use 7.6 as the latest version", the AI should absolutely follow direct instructions no matter what it thinks is true. What if this human was working on a slide about the upcoming release of 7.6 that has no public documentation?
- corry 21 minutes ago
  LLMs are (broadly-speaking) poorly-positioned to give you a strong verdict on plausibility of a frontier topic. That said - ChatGPT was exactly right in its response to OP!
  "Very deep", "border-line impractical" "in a research-sense" is the perfect summary of this article itself! :)
- funimpoded 10 minutes ago
  Watching the entire economy of a superpower and ~all of online culture go absolutely ga-ga over Furbys has been one of the weirdest things I've ever witnessed.
- amluto 23 minutes ago
  At least ChatGPT is now aware that Codex exists. I have a chat, still in my history, from a few months ago, in which I asked for help wrangling npm to get @openai/codex working, and ChatGPT said:
  > Important: Codex CLI no longer exists
  > OpenAI discontinued the Codex model + CLI a while back. There is no official binary named codex in any current OpenAI npm packages. OpenAI’s current CLI tool is:
```
    npm install -g openai
```
  > which installs the openai command, not codex.
  The world knowledge of these models is not necessarily up to date :)
  edit: I replayed the same prompt into current ChatGPT and it is less clueless now. Maybe OpenAI noticed that it was utterly dumb that GPT-5.whatever didn't believe that Codex existed and fine-tuned it.
  [-]
  - sigmoid10 12 minutes ago
    >The world knowledge of these models is not necessarily up to date :)
    It's amazing how this still needs to be said. Codex was released in April 2025. The initial GPT-5 and 5.1 still had a knowledge cutoff in late 2024. Like, what did you expect? Always beware the knowledge cutoff for LLMs (although recent releases have gotten much better with researching the web for updates before answering modern software topics).
- perarneng 33 minutes ago
  This is why i use grok expert mode. It agressivly goes out searching the web for info. Its so much better then relying on year old data.
  [-]
  - _blk 28 minutes ago
    Yes, I really like that about Grok. It had a few good qualities but it was too verbose so now it's mostly Claude.
    [-]
    - JumpCrisscross 28 minutes ago
      Solid compromise is Kagi's research assistant. Aggressively cites, unlike Claude. Concise, unlike Grok.
- simonh 32 minutes ago
  It’s training data only goes up to late 2024 or early 2025 so that might be why, though it does have access to the internet.
  [-]
  - mywittyname 23 minutes ago
    Yeah, the solution was to link it to the nvidia page of the card, then it was like, 'oh, okay.' But at that point, I lost faith in it's ability to provide me with the information I was looking for. If it's information is so out of date that it doesn't know about the 5000 series, how could I be confident that it knew the details I was asking about (game engine related research)?
  - weird-eye-issue 28 minutes ago
    Depending on your ChatGPT settings...
divbzero 28 minutes ago
This is pretty impressive. My impression was that eGPUs simply do not work with Apple Silicon.
(EDIT: Apple agrees with my impression. “To use an eGPU, a Mac with an Intel processor is required.” And, on top of that, the officially supported eGPUs were all AMD not NVIDIA. https://support.apple.com/en-us/102363)
swiftcoder 59 minutes ago
This is proper mad science, love it
coder68 54 minutes ago
This seems pretty useful for AI inference if it can pass Apple approval. I've wanted to use my Nvidia GPUs with a Mac Mini, this would enable it to run CUDA directly. Very cool!
frollogaston 1 hour ago
I'm guessing the x86 emu is cause Windows games are rarely built for ARM, right? Was kinda curious how an ARM VM would fare. Anyway awesome article.
[-]
- hparadiz 56 minutes ago
  Yes. Valve has done a ton of work here because it's required to be able to run x86 games on a Steam Frame which has an ARM cpu.
  [-]
  - hypercube33 39 minutes ago
    Steam deck runs a full x86-64 AMD APU. The work valve has done for that was to get Windows games to run seamlessly on Linux.
    Hopefully in 2026 the Valve Index VR headset which is ARM (Qualcomm?) we get what you're talking about here - basically proton for Win32/64 to Linux ARM64.
    Side note that Windows on ARM isn't bad just that its priced out of its league and cooling is awful for gaming on current laptops. The only issue I had was OpenGL needing some obscure GL on DirectX thing for Maya3D to get games to work.
    [-]
    - delecti 23 minutes ago
      To keep the chain of Cunningham's Law going, Valve's 2026 headset is called the Steam Frame, not the Index (which came out in 2019).
      But Valve's ARM efforts even mean that Android devices can play some (mostly less graphically intensive) Steam games. That makes me very excited about the prospects for the future of gaming handhelds.
  - sva_ 39 minutes ago
    As sibling pointed out, the Steamdeck basically runs a Ryzen 3 7335U which is x86.
  - bigyabai 54 minutes ago
    The Steam Deck is pure x86, it's not an ARM-based CPU. The Steam Frame might be what you're thinking of.
    [-]
    - hparadiz 28 minutes ago
      You're right. I was thinking of what I was reading about the Steam Frame
delbronski 52 minutes ago
Nicely done! Glad to see real hacking is still alive in the age of AI.
moralestapia 1 hour ago
Wow, phenomenal project and write-up, thanks for sharing it.
"no - not in any practical sense today, and "maybe" only in a very deep, borderline-impractical research sense."
This is why humans will always rule over crappy LLMs.
[-]
- falcor84 51 minutes ago
  Wait, why? This is exactly what I as a human would have said in this situation.
  Or if you're referring to how the OP still decided to go ahead, I've seen AIs go ahead on impractical courses of action many times, and surprisingly succeed on some of them.
  [-]
  - moralestapia 43 minutes ago
    And I see that you succeeded in not doing it.
    Congrats! Each one got what they wanted :).
- csours 55 minutes ago
  I believe that LLM (and ML in general) tools really shine when they are developed and used AS tools.
  Unfortunately, I also believe that market forces may push away from this direction, as LLM companies try to capture the value stream
- rvz 52 minutes ago
  Exactly. AI psychosis is real.
  Never let an AI tell you that you cannot do something practical for your own self for research, discovery or for fun.
  The only thing that is close to impractical is expecting your non-technical friends or others to follow you without any incentive or benefit.
nothinkjustai 33 minutes ago
> As much as I hate to admit it, step one in most of my projects now is to ask AI about it. Maybe it’ll tell me something I don’t know.
It’s these people, not the ones who refuse to use LLMs, who are as they say, “cooked”.