The $100B memory war: Inside the battle for AI’s future

Summary

The generative-AI boom has exposed a new bottleneck: memory bandwidth. Datacentres full of GPUs can no longer move data to the compute fast enough, and without that bandwidth raw FLOPS go to waste. The article explains why High Bandwidth Memory 4 (HBM4) is pivotal, offering roughly double the bandwidth of HBM3 and much larger capacity per stack, and how that will reshape training times, inference costs and vendor fortunes.

JEDEC’s HBM4 standard targets ~8 Gbps per pin across a 2,048-bit interface (about 2 TB/s per stack) and supports taller 16-high stacks delivering up to 64 GB per stack. These advances improve performance and energy efficiency but are extremely hard to manufacture at scale. Only SK hynix, Micron and Samsung currently have the 3D-stacking expertise to pull it off.
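
The arithmetic behind those headline numbers is straightforward: per-stack bandwidth is pin rate times interface width, and per-stack capacity is die density times stack height. The short Python sketch below reproduces the figures quoted above; the 32 Gbit per-die density is an assumption implied by 64 GB across a 16-high stack, not a number taken from the article.

```python
# Quick sanity check on the HBM4 headline figures quoted above.
# Assumption: 32 Gbit per DRAM die, implied by 64 GB / 16-high (not from the article).

def stack_bandwidth_tb_per_s(pin_rate_gbps: float, interface_width_bits: int) -> float:
    """Peak bandwidth of one HBM stack: pin rate x interface width, in TB/s."""
    bits_per_second = pin_rate_gbps * 1e9 * interface_width_bits
    return bits_per_second / 8 / 1e12  # bits -> bytes -> terabytes

def stack_capacity_gb(die_density_gbit: int, stack_height: int) -> float:
    """Capacity of one stack: die density x number of stacked DRAM dies, in GB."""
    return die_density_gbit * stack_height / 8  # Gbit -> GB

# HBM4 targets quoted in the summary: ~8 Gbps per pin, 2,048-bit interface, 16-high stacks
print(f"Bandwidth per stack: {stack_bandwidth_tb_per_s(8.0, 2048):.2f} TB/s")  # ~2.05 TB/s
print(f"Capacity per stack:  {stack_capacity_gb(32, 16):.0f} GB")              # 64 GB
```

Most of HBM4’s gain over its predecessor comes from doubling the interface width rather than from pin speed alone, which is why the packaging and 3D-stacking steps, not the DRAM cells themselves, are the hard part to manufacture at scale.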

SK hynix is the front-runner with early samples, JEDEC-beating speeds and strong ties to Nvidia. Micron has surged from near-zero to a meaningful second source with HBM3E and HBM4 samples, while Samsung has lagged but is catching up and is strategically important to AMD. 2026 looks set to be the decisive year for volume supply and industry alignment.

Key Points

  • Memory bandwidth, not processor flops, is the primary bottleneck for large-scale generative AI workloads.
  • HBM4 promises roughly double HBM3 bandwidth and up to 64 GB per stack via 16-high stacking.
  • HBM4 improves energy efficiency by allowing lower I/O and core voltages, directly benefiting AI training and inference costs.
  • Only three vendors — SK hynix, Micron and Samsung — currently have credible paths to HBM4 volume production.
  • SK hynix is the market leader and early HBM4 front-runner, already sampling high-speed parts and aligned closely with Nvidia.
  • Micron moved rapidly into the HBM market, shipping HBM4 samples and securing major pre-booked orders for 2026 supply.
  • Samsung struggled with yields on cutting-edge DRAM nodes but is accelerating qualification and is strategically tied to AMD (and OpenAI) supply deals.
  • 2026 will reveal who can actually deliver HBM4 at scale — winners will shape GPU and accelerator roadmaps, while losers may force customers to revise plans.

Context and relevance

This is crucial reading for anyone tracking AI infrastructure, semiconductor supply chains or cloud economics. The article ties memory technology directly to model training speed, inference cost per query and datacentre power consumption — all core determinants of who can afford to run the next generation of large language models and recommendation systems.

For hardware vendors (Nvidia, AMD, Broadcom), cloud providers and hyperscalers, HBM4 supply is a strategic constraint: shortages or a single dominant supplier could skew pricing, availability and product roadmaps. For researchers and enterprises, faster and denser memory shortens time-to-model and reduces operational expense. Geopolitics and industrial policy (sourcing rules, sanctions) also colour the race, amplifying supply risk.

Why should I read this?

Short answer: because memory — not raw chip speed — will decide who wins the next wave of AI. If you care about which vendors will dominate GPUs and AI accelerators, or how quickly big models will be trained (and at what cost), this is the supply-chain battlefield that matters. We’ve done the digging: this piece explains the technical differences, who’s ahead, and why 2026 is make-or-break. Read it if you want to understand where the money and risk are actually heading.

Source

Source: https://go.theregister.com/feed/www.theregister.com/2025/10/16/race_to_supply_advanced_memory/