On the surface, ChatGPT, Gemini, and Grok look like the only hot AI models around. But there's a hidden AI power play underway: a nerdy hardware battle between Intel and AMD. We've talked about the CPU and GPU, but this next upgrade cycle features a new acronym: the NPU.
We have written about local AI before, specifically the “open source” farce. With the NPU, the appeal of local AI, farce and all, could go mainstream, beyond the current realm of nerds.
A PC running a larger model on a GPU could cost $4,000, but a PC with an NPU could achieve similar capacity at around $700, with no graphics card at all. This is serious: the democratization of local AI.
Intel goes Ultra
Intel's first NPU-equipped chips carry the Ultra name. The i3, i5, and i7 labels have been retired in favor of Core Ultra branding. Intel is starting on the mobile side with its star player, the Core Ultra 7 (specifically the 155H). The chip includes three tiers of cores, 6 P (Performance) + 8 E (Efficiency) + 2 LPE (Low Power Efficiency), to help prioritize tasks. Desktop NPUs come later. A life after LGA 1700!
Ryzen gets another G
AMD is taking a different approach, starting in the desktop segment. The star player is the successor to the 5700G from the AM4 generation. Meet the 8700G, said to offer graphics that compete with a low-end dGPU. It pairs 8 cores of equal strength with the NPU. The 5700G was wildly popular in pre-built PCs; we'll see whether the 8700G proves similarly popular. AM5 boards are just getting started.
But why NPU?
Integrating the NPU could mean local AI without a dGPU, plus the ability to increase AI capacity at minimal cost: DDR5 sticks are far cheaper than the VRAM soldered into a new graphics card. The AI-capable computer could go mainstream within 2-5 years. Intel and AMD are also playing catch-up: Apple introduced its NPU competitor, the Neural Engine, with the M1 in 2020. But in the PC world, RAM is cheaper.
Many aspects of PC hardware have stagnated for years; some of the most critical machine specifications have barely changed in nearly a decade. Here we'll explore how and why.
RAM Offerings
- 15″ MacBook Pro, 2014: 16 GB
- 16″ MacBook Pro, 2022: 16 GB
Despite the 8-year gap, both of Apple's machines offered the same amount of RAM in the base configuration. For most users, there was no compelling technical reason to offer more in the base model. There's nothing shameful about what the base models can do.
VRAM Offerings
- AMD RX 480, 2016: 8 GB
- AMD RX 5700, 2019: 8 GB
- Nvidia GTX 1070, 2016: 8 GB
- Nvidia RTX 3070, 2020: 8 GB
In the mid-range, graphics cards offered 8 GB for years, roughly in line with the memory demands of gaming at 1080p and 1440p. 12 GB and 16 GB cards will soon become the norm.
Core Counts
- i7-2700k, 2011: 4 cores
- i7-7700k, 2017: 4 cores
- Ryzen 7 1700x, 2017: 8 cores
- Ryzen 7 7700x, 2022: 8 cores
AMD's 8-core Ryzen (and its FX predecessor) pushed Intel to double its core counts. That was probably driven more by competition than by technical need, but we're glad it happened either way. Kudos, AMD.
How much RAM will we actually need? A goldfish grows to the size of its bowl; might local LLMs do the same? It's certainly possible that 64 GB or 128 GB could become the preferred configuration. Most current models aim to fit within the 8 GB VRAM limit of a typical graphics card, and 12 GB is rapidly becoming common on the desktop, but the NPU could let models grow well beyond that. With an NPU, main memory can hold the model, no dGPU needed.
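As a rough sketch of why those numbers matter: a model's weight footprint is approximately its parameter count times bytes per weight, which depends on quantization. The model sizes and bit widths below are illustrative assumptions, and real runtimes add overhead (KV cache, activations) on top of the weights:

```python
# Rough estimate of the memory an LLM's weights occupy at
# different quantization levels. Approximate figures only:
# actual runtimes need extra memory for KV cache and activations.
def model_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    bytes_total = params_billions * 1e9 * bits_per_weight / 8
    return bytes_total / 1024**3  # bytes -> GiB

for params in (7, 13, 70):        # common open-model sizes, in billions
    for bits in (16, 8, 4):       # fp16, int8, int4 quantization
        gb = model_memory_gb(params, bits)
        print(f"{params}B model @ {bits}-bit: ~{gb:.1f} GB")
```

By this estimate, a 7B model at 4-bit fits comfortably in 8 GB, while a 70B model at 4-bit wants over 32 GB, out of reach for nearly all consumer graphics cards but trivial for a desktop with 64 GB of DDR5.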
What about Nvidia? Once the NPU goes mainstream, an expensive Nvidia graphics card is no longer strictly necessary for local inference. But we're not too worried: the cloud will likely remain lucrative for Nvidia.
What about Apple? The 16 GB M1, M2, and M3 models are already well suited to local inference, but Apple doesn't allow memory upgrades after purchase. As PCs push toward 64 or 128 GB, Apple's hardware offers fewer, pricier options: the M3 Max costs $4,999, quite steep, and Apple's 64 GB upgrade adds $1,000. On the PC side, Crucial offers a 64 GB DDR5 kit (2×32 GB) in SO-DIMM form for under $200 (at current prices). 16 GB is enough for now, but for how long?
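For a back-of-envelope sense of that gap, divide each price by its capacity. The figures come from the prices cited above and will drift over time:

```python
# Approximate cost per GB of memory, using the prices cited above.
# Illustrative only; street prices fluctuate.
apple_upgrade_per_gb = 1000 / 64  # Apple's +64 GB build-to-order upgrade
crucial_kit_per_gb = 200 / 64     # Crucial 2x32 GB DDR5 SO-DIMM kit

print(f"Apple upgrade: ~${apple_upgrade_per_gb:.2f} per GB")
print(f"Crucial kit:   ~${crucial_kit_per_gb:.2f} per GB")
print(f"Ratio:         ~{apple_upgrade_per_gb / crucial_kit_per_gb:.0f}x")
```

Even at these rough numbers, commodity DDR5 runs about a fifth of Apple's per-gigabyte upgrade price.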
Could the NPU justify increased base memory capacity? Can users and PC manufacturers push the norm to 32 or 64 GB using the NPU and local inference to justify the hardware? Are we on the verge of breaking the stagnation? One thing is certain: Intel and AMD are happy to sell us new machines just in case the future is local.
Disclaimer: We own NVDA stock.