AI hardware · Layer
Networking and interconnect
Training a frontier model uses thousands to tens of thousands of accelerators that must act as one machine. The links between them, inside a server and across the data center, increasingly determine real-world throughput.
Key facts
- Inside a node, proprietary links like NVIDIA's NVLink connect GPUs at very high bandwidth.
- Across racks, InfiniBand and high-speed Ethernet carry traffic between nodes; the choice affects scaling efficiency.
- Optical transceivers and switches are a fast-growing and supply-constrained part of the build.
- At frontier scale, communication overhead between chips, not the chips themselves, often caps how fast a model trains.
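To make the last point concrete, here is a rough back-of-the-envelope sketch rather than a measurement of any real cluster. It uses the standard bandwidth-only cost of a ring all-reduce, in which each device sends about 2(N-1)/N times the payload. The payload size, link bandwidths, compute time, and function names are all hypothetical, and the model pessimistically assumes communication does not overlap with compute.

```python
def ring_allreduce_seconds(payload_bytes: float, n_devices: int,
                           link_bytes_per_s: float) -> float:
    """Bandwidth-only cost of a ring all-reduce (latency terms ignored).

    Each device sends roughly 2 * (N - 1) / N times the payload.
    """
    if n_devices < 2:
        return 0.0  # a single device has nothing to exchange
    return 2 * (n_devices - 1) / n_devices * payload_bytes / link_bytes_per_s


def scaling_efficiency(compute_s: float, comm_s: float) -> float:
    """Fraction of a step spent on useful compute.

    Assumption: communication is not overlapped with compute.
    """
    return compute_s / (compute_s + comm_s)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only for illustration.
    grad_bytes = 20e9   # e.g. 10B parameters at 2 bytes each
    compute_s = 1.0     # assumed compute time per training step
    n_devices = 1024
    for label, bandwidth in [("faster link", 400e9), ("slower link", 50e9)]:
        comm_s = ring_allreduce_seconds(grad_bytes, n_devices, bandwidth)
        eff = scaling_efficiency(compute_s, comm_s)
        print(f"{label}: comm {comm_s:.2f} s/step, efficiency {eff:.0%}")
```

With the chips held fixed, only the link bandwidth changes between the two cases, yet the fraction of each step spent computing drops sharply on the slower link. Real systems overlap communication with compute and use hierarchical collectives, so actual efficiency is typically better than this simple model suggests.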
Where it bottlenecks
Switch silicon and optical transceivers are supply-tight, and network topology can bound cluster scaling more than accelerator count.
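One way to see how topology bounds cluster size is the textbook limit for a non-blocking fat-tree (folded Clos) built from identical switches: with radix-k switches and t tiers, at most 2·(k/2)^t endpoints can be attached. The sketch below only evaluates that formula. The function name and the example radix are illustrative assumptions, and real deployments often oversubscribe or use other topologies.

```python
def fat_tree_max_hosts(radix: int, tiers: int) -> int:
    """Maximum endpoints in a non-blocking fat-tree of identical switches.

    Uses the standard bound 2 * (radix / 2) ** tiers, which gives
    radix**2 / 2 for two tiers and radix**3 / 4 for three tiers.
    """
    if radix < 2 or radix % 2 != 0:
        raise ValueError("radix must be an even number >= 2")
    if tiers < 1:
        raise ValueError("tiers must be >= 1")
    return 2 * (radix // 2) ** tiers


if __name__ == "__main__":
    # Hypothetical 64-port switch; the port count is an assumption.
    for tiers in (2, 3):
        print(f"{tiers} tiers: up to {fat_tree_max_hosts(64, tiers)} endpoints")
```

Growing past the two-tier limit requires a third switching tier, which means more switch silicon and more optical links per accelerator. That is how the network, rather than the accelerator count, can become the constraint.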
Who dominates it
NVIDIA, Broadcom, Marvell, Arista
Companies in this layer
Broadcom
United States
Supplier of networking switch silicon and a partner for custom AI accelerators (ASICs).