How can NVIDIA halve a rack's memory after the chip is designed?
The Vera CPU swaps soldered LPDDR for socketed SOCAMM modules, turning system memory into a configuration lever. That is how NVIDIA could halve a Vera Rubin rack's DRAM after the design was frozen — and why the June 2026 memory selloff misread the move.
Soldered memory locks capacity in at manufacture; SOCAMM makes it a build-time choice and a field-serviceable part. Right-sizing CPU-side DRAM strips cost from a rack that approaches $8M and loosens a DRAM-allocation bottleneck, letting NVIDIA build more racks. That is the opposite of the demand-cooling story the selloff priced in.
What a SOCAMM actually is
SOCAMM — Small Outline Compression Attached Memory Module — is a way to package system DRAM as a small, removable card instead of soldering it to the board. It descends from the LPCAMM2 modules used in thin laptops, and it carries LPDDR5X: the same low-power DRAM that runs in phones and ultrabooks, prized for capacity and energy efficiency rather than raw bandwidth.
What makes it unusual is the connector. Instead of solder or the gold edge fingers of a DIMM, a SOCAMM is pressed down onto a grid of contacts, known as a compression connector. That keeps the electrical path short enough for LPDDR5X to run at near-soldered speeds, around 9,600 MT/s, while leaving the memory socketed, removable, and serviceable. You get the density and efficiency of mobile DRAM without giving up the ability to pull a module and replace it.
Soldered memory, then sockets
In Grace Hopper (GH200) and Grace Blackwell (GB200), the host CPU's system memory was LPDDR5X soldered down around the chip. That was fast and compact, but the configuration was fixed at manufacture: one failed memory die meant replacing the whole superchip board, and a customer could never buy a lean tray now and add capacity later.
The Vera CPU, NVIDIA's Arm-based host for the Rubin GPU, drops the solder for SOCAMM. System memory becomes a slotted, field-replaceable module for the first time in NVIDIA's custom-CPU line. Each Vera CPU carries eight SOCAMM slots; fully populated with 192 GB modules, that is 1,536 GB, or 1.5 TB, of system memory per CPU. The module vendors (SK Hynix, Samsung, Micron) build them at whatever density is available and economic.
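The per-CPU figure is simple multiplication, and a minimal sketch makes the arithmetic explicit. The slot count and module size come from the text; the function name `cpu_memory_gb` is illustrative, and the conversion assumes 1 TB = 1,024 GB, which is how 1,536 GB rounds to the quoted 1.5 TB.

```python
SLOTS_PER_CPU = 8  # from the text: eight SOCAMM slots per Vera CPU


def cpu_memory_gb(module_gb: int, slots: int = SLOTS_PER_CPU) -> int:
    """System memory for one CPU with every slot populated (hypothetical helper)."""
    return module_gb * slots


full_gb = cpu_memory_gb(192)   # 8 x 192 GB modules
full_tb = full_gb / 1024       # assumption: 1 TB = 1,024 GB
print(f"{full_gb} GB = {full_tb:.1f} TB per CPU")
```

Because module density is just a parameter here, the same slot count supports any capacity a vendor's modules allow.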
Source: Vera Rubin SOCAMM memory report, June 2026
Why the density is a choice, not a redesign
The chip and the rack are designed around the slot interface and its electrical, power, and thermal envelope — not around a single module density. Any module that meets those signaling and thermal specs drops in. Populating eight 96 GB modules instead of eight 192 GB modules is an assembly and procurement decision made at build time, not a silicon or rack respin.
This is ordinary server practice — the same reason a DDR5 server can ship with different DIMM sizes. What is new is that NVIDIA's custom CPU now plays by those modular rules, so a line that read "1.5 TB" on the roadmap can ship as 768 GB without anyone touching the design.
The cut: 1.5 TB to 768 GB per CPU
On 4 June 2026, a report that most initial Vera Rubin systems would ship with 96 GB SOCAMMs instead of 192 GB modules — 768 GB per Vera CPU rather than 1.5 TB — sent Micron down roughly 10 percent and dragged Samsung and SK Hynix with it. At the rack level, projected system memory fell from about 55 TB to about 28 TB.
To a market priced for unconstrained demand, a halving of expected DRAM per rack looked like the AI capex cycle cracking. The cut is real. The interpretation was not.
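The rack-level figures can be roughly reproduced from the per-CPU numbers. The sketch below is an approximation under one loud assumption: the text does not state how many Vera CPUs a rack holds, so `CPUS_PER_RACK = 36` is inferred from dividing about 55 TB by 1.5 TB. It also assumes decimal terabytes (1 TB = 1,000 GB), which is what makes the totals land on the quoted 55 TB and 28 TB.

```python
SLOTS_PER_CPU = 8    # from the text
CPUS_PER_RACK = 36   # ASSUMPTION: not stated in the text; inferred from ~55 TB / 1.5 TB


def rack_memory_tb(module_gb: int,
                   cpus: int = CPUS_PER_RACK,
                   slots: int = SLOTS_PER_CPU) -> float:
    """CPU-side system memory per rack in decimal TB (hypothetical helper)."""
    return module_gb * slots * cpus / 1000


planned = rack_memory_tb(192)  # 192 GB modules
shipped = rack_memory_tb(96)   # 96 GB modules
print(f"planned: {planned:.1f} TB, shipped: {shipped:.1f} TB, "
      f"ratio: {shipped / planned:.2f}")
```

Whatever the true CPU count is, the ratio stays at one half: halving module density halves CPU-side DRAM per rack and leaves GPU memory untouched.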
Two memory domains people keep conflating
A Vera Rubin rack has two completely separate memory systems. The Vera CPU's LPDDR5X over SOCAMM, the part that got cut, is high-volume mobile DRAM: commodity, lower-margin, and now socketed. The Rubin GPU's HBM4, at 288 GB per GPU, is stacked DRAM integrated on-package using TSMC's advanced packaging. It is the custom, high-margin part, and it was untouched.
The margin story for memory makers is written by HBM yields and allocations, not by the density of commodity LPDDR in the CPU tray. On the same day the SOCAMM cut spooked the market, Micron, SK Hynix, and Samsung all passed HBM4 certification for Rubin — the signal that actually moves their AI revenue.
Strategic read
With a single NVL72-class rack approaching $8 million, system memory is one of the few line items a buyer can right-size. Most inference — chatbots, light agents — is bound by GPU HBM bandwidth, not CPU system memory; 768 GB per CPU is plenty, and operators can swap in denser modules later for the workloads that need it (large agentic loops, vector search). The socket turns a capex commitment into a pay-as-you-grow option.
Right-sizing also loosens a supply constraint: DRAM production capacity cannot be expanded within a single cycle, so the bits available in the near term are effectively fixed. Using less LPDDR per rack lets NVIDIA build and sell more racks under the same tight memory supply. That is volume expansion, not demand cooling.
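A toy calculation shows why a smaller per-rack footprint reads as volume expansion. The per-rack figures are the rounded 55 TB and 28 TB quoted earlier; the supply figure `LPDDR_SUPPLY_TB` is purely hypothetical, chosen so the division comes out even, and the sketch considers LPDDR alone, ignoring HBM and every other constraint on rack output.

```python
def racks_buildable(lpddr_supply_tb: int, per_rack_tb: int) -> int:
    """Whole racks a fixed LPDDR allocation can populate (hypothetical helper)."""
    return lpddr_supply_tb // per_rack_tb


LPDDR_SUPPLY_TB = 1540  # HYPOTHETICAL allocation, for illustration only

full_density = racks_buildable(LPDDR_SUPPLY_TB, 55)  # ~55 TB per rack
half_density = racks_buildable(LPDDR_SUPPLY_TB, 28)  # ~28 TB per rack
print(f"192 GB modules: {full_density} racks, 96 GB modules: {half_density} racks")
```

Under a fixed allocation, roughly halving the LPDDR per rack roughly doubles the number of racks that allocation can fill, which is the mechanism the paragraph above describes.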
For Pere, the read is that the selloff confused a commodity-DRAM configuration choice with the secular HBM story. The binding memory constraint for the AI build-out remains HBM4, not the socketed LPDDR that happened to move the tape that morning.