From the H100 to Blackwell.
The silicon that eats power and spits out intelligence.
- **NVIDIA H100**: The current gold standard. The chip that started the generative AI boom. Built on TSMC's 4N process.
- **NVIDIA B200 (Blackwell)**: The new king. Two reticle-sized dies connected by a 10 TB/s link. Designed to run trillion-parameter models.
- **AMD MI300X**: The contender. Massive memory capacity and bandwidth advantages. Ideal for inference on large models.
- **Google TPU v5p**: The cloud native. Built for massive pod-scale training over Google's proprietary optical Inter-Chip Interconnect (ICI).
A CPU (the Ferrari) is designed to do one complex thing very quickly. It's great for running your operating system or opening an app (sequential logic).
A GPU (the Bus) is slower at any single task, but it can move 1,000 passengers (pixels or numbers) at the exact same time.
Neural networks are just massive matrices of numbers. Training or running them means doing billions of tiny, independent multiply-adds simultaneously, and that is why the "Bus" won (see the sketch below).
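To make the difference concrete, here is a minimal sketch in plain Python with NumPy, run on a CPU: a one-number-at-a-time scalar loop versus a single vectorized matrix multiply. The matrix size and timings are illustrative only; a GPU extends the same parallel idea to tens of thousands of concurrent lanes, so the gap grows by further orders of magnitude.

```python
import time
import numpy as np

n = 128  # kept small on purpose: the scalar loop below is slow
a = np.random.rand(n, n).astype(np.float32)
b = np.random.rand(n, n).astype(np.float32)

def matmul_scalar(a, b):
    """The 'Ferrari' way: one multiply-add at a time, in strict order."""
    out = np.zeros((n, n), dtype=np.float32)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                out[i, j] += a[i, k] * b[k, j]
    return out

t0 = time.perf_counter()
matmul_scalar(a, b)
t_scalar = time.perf_counter() - t0

# The 'Bus' way: one call that pushes all n*n*n multiply-adds
# through wide parallel hardware at the same time.
t0 = time.perf_counter()
a @ b
t_parallel = time.perf_counter() - t0

print(f"scalar loop:     {t_scalar:.3f} s")
print(f"parallel matmul: {t_parallel:.6f} s")
```

Even on a CPU, the vectorized call typically wins by a factor of thousands; the GPU simply takes that bet much further.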
| Model | Vendor | Memory | Bandwidth | Peak TFLOPS | TDP |
|---|---|---|---|---|---|
| H100 SXM | NVIDIA | 80 GB (HBM3) | 3.35 TB/s | 3,958 (FP8) | 700W |
| H200 SXM | NVIDIA | 141 GB (HBM3e) | 4.8 TB/s | 3,958 (FP8) | 700W |
| B200 | NVIDIA | 192 GB (HBM3e) | 8.0 TB/s | 20,000 (FP4) | 1000W |
| MI300X | AMD | 192 GB (HBM3) | 5.3 TB/s | 5,229 (FP8) | 750W |
| TPU v5p | Google | 95 GB (HBM2e) | 2.77 TB/s | 459 (BF16) | - |
| Trainium 2 | AWS | 96 GB (HBM3e) | 2.9 TB/s | 1,299 (FP8) | - |
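One way to read the Bandwidth column: for single-stream inference, generating each token requires streaming roughly every model weight through the chip once, so memory bandwidth, not TFLOPS, sets the ceiling on tokens per second. Here is a rough sketch of that ceiling, assuming a hypothetical 70B-parameter model quantized to FP8 (1 byte per parameter), batch size 1, and ignoring KV-cache and activation traffic:

```python
# Bandwidth-bound ceiling on single-stream decode throughput.
# Assumption (hypothetical): a 70B-parameter model in FP8 = 1 byte/param,
# batch size 1, KV-cache and activation traffic ignored.

MEM_BANDWIDTH_TBPS = {   # memory bandwidth per chip, from the table above
    "H100 SXM":   3.35,
    "H200 SXM":   4.8,
    "B200":       8.0,
    "MI300X":     5.3,
    "TPU v5p":    2.77,
    "Trainium 2": 2.9,
}

model_bytes = 70e9 * 1  # 70B params x 1 byte each (FP8)

for chip, tbps in MEM_BANDWIDTH_TBPS.items():
    ceiling = (tbps * 1e12) / model_bytes  # tokens/s upper bound
    print(f"{chip:>10}: <= {ceiling:5.1f} tokens/s")
```

This is the arithmetic behind the MI300X's "ideal for inference" pitch: memory capacity decides whether a large model fits on one chip at all, and bandwidth decides how fast its weights can be read back out.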