Breaking Down the Five Key Components of an AI Server

Much like a typical home computer, the DGX A100 can be divided into five main hardware modules:

  1. Fan Module: Located at the front, the fan module consists of eight fans, which align with the standard 8U configuration found in traditional servers.
  2. Hard Drives: Positioned below the front fan module, the DGX A100 houses eight 3.84TB hard drives, providing a total internal storage capacity of 30TB.
  3. GPU Board Tray: The rear section of the AI server is where the critical components come together. The GPU board tray is the heart of the system and differentiates AI servers from regular ones. In the DGX A100 architecture, the GPU board tray includes GPU components, module boards, and NVSwitch components—all of which involve various types of PCBs.
  4. CPU Motherboard Tray: This tray is the core of any server, regular or AI. It contains the CPU motherboard, system memory, network cards, and PCIe switches. The CPU motherboard and system memory account for a significant share of the overall PCB usage.
  5. Power Module: The DGX A100’s rear section also features six power modules, internally utilizing thick copper PCBs.

From a functional perspective, the PCB value calculation for an AI server splits into three parts: the GPU board components, the CPU motherboard components, and the remaining components.


GPU Board Components

Total value of about 12,000 CNY: carrier boards (52%) and PCBs (48%)

The GPU board consists of four main components: GPU carriers, NVSwitch, OCP Accelerator Modules (OAMs), and Unit Baseboards (UBBs).


GPU Carriers: The NVIDIA A100 GPUs and their DRAM use advanced 2.5D/3D packaging technology. One carrier board is needed per GPU; each measures 70 x 70 mm to 100 x 100 mm and has 14 to 16 layers. With 8 GPUs in the DGX A100, each AI server requires 8 GPU carrier boards. Industry research puts the value of a single GPU carrier board at approximately $100 (about 650 CNY), for a total of about 5,200 CNY per server (8 x 650 CNY).


NVSwitch: NVSwitch, based on the NVLink standard, handles communication between GPUs. Its carriers are similar to GPU carriers but simpler to manufacture; their key role is high-speed data transfer. Research suggests that the value of a single NVSwitch carrier is around $30 (about 195 CNY). For a DGX A100 with 6 NVSwitches, the total is about 1,170 CNY per server (6 x 195 CNY).
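
The carrier arithmetic in the two paragraphs above only works out if the per-server totals are read as CNY rather than USD (8 x 650 = 5,200 and 6 x 195 = 1,170). The following is a minimal Python sketch under that reading. The 6.5 CNY/USD rate is inferred from the "$100 = ¥650" pairing in the text, not quoted as an exchange rate, and all names are illustrative.

```python
# Carrier-level value on the DGX A100 GPU board, using the figures quoted above.
# Assumption: per-server totals are in CNY, at the 6.5 CNY/USD rate implied
# by the "$100 = 650 RMB" pairing in the text.
CNY_PER_USD = 6.5

# name -> (count per server, unit value in USD)
carriers = {
    "GPU carrier": (8, 100),
    "NVSwitch carrier": (6, 30),
}


def carrier_value_cny(count, unit_usd, rate=CNY_PER_USD):
    """Total value in CNY for `count` carriers priced at `unit_usd` each."""
    return count * unit_usd * rate


values = {name: carrier_value_cny(*spec) for name, spec in carriers.items()}
carrier_total = sum(values.values())

for name, value in values.items():
    print(f"{name}: {value:,.0f} CNY")
print(f"Carrier total: {carrier_total:,.0f} CNY")
```

The 6,370 CNY carrier total is the figure that reappears in the GPU board summary.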


OAM (OCP Accelerator Module): OAMs, also known as GPU accelerator modules, carry the GPU chips. The number of OAMs matches the number of GPUs (8 in the DGX A100). Assuming dimensions similar to the PCIe version (267.7 mm x 111.15 mm), the area of one OAM is approximately 0.03 square meters. OAMs require specific PCB types because of their high-speed signal transmission. The DGX A100 SXM version uses 20 layers, Ultra Low Loss CCL material, and 4-layer HDI technology, giving a unit price of about 12,000 CNY per square meter. The PCIe version uses 14 layers with Ultra Low Loss and high-Tg FR4 CCL material, at a unit price of about 7,000 CNY per square meter. Overall, the OAMs in a high-end AI server configured like the DGX A100 are worth about 2,880 CNY (8 x 0.03 square meters x 12,000 CNY per square meter).
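
The OAM estimate chains three steps: derive a board area from the PCIe card dimensions, round it, and multiply by board count and price per square meter. A short Python sketch of that chain follows. It assumes the per-square-meter price is in CNY (which is what makes the result line up with the later CNY totals); the function and constant names are illustrative.

```python
# OAM area and value estimate, following the steps in the paragraph above.
PCIE_CARD_MM = (267.7, 111.15)  # PCIe card dimensions quoted in the text
OAM_COUNT = 8                   # one OAM per GPU
PRICE_SXM_PER_M2 = 12_000       # assumed to be CNY per square meter


def board_area_m2(length_mm, width_mm):
    """Rectangular board area in square meters from millimeter dimensions."""
    return length_mm * width_mm / 1_000_000


exact_area = board_area_m2(*PCIE_CARD_MM)
area = round(exact_area, 2)  # the text rounds this to 0.03 m^2
oam_value = OAM_COUNT * area * PRICE_SXM_PER_M2

print(f"Area per OAM: {exact_area:.4f} m^2, rounded to {area} m^2")
print(f"OAM value per server: {oam_value:,.0f} CNY")
```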


UBB (Unit Baseboard): The UBB is the PCB on which the entire GPU platform is mounted. Each AI server requires one UBB. Based on DGX A100 specifications and industry research, we estimate the UBB area at approximately 0.30 square meters. The board is a 26-layer through-hole PCB made with Ultra Low Loss CCL material, at a unit price of about 10,000 CNY per square meter. The UBB value per server is therefore about 3,000 CNY (0.30 square meters x 10,000 CNY per square meter).

Taken together, the four parts of the NVIDIA DGX A100 GPU board (GPU carriers, NVSwitch, OAMs, and the UBB) occupy a total PCB area of 0.624 square meters, corresponding to a per-server value of 12,250 CNY. Specifically:

Carrier-level components (GPU and NVSwitch carriers) contribute 6,370 CNY (52% of the total value).

PCB-level components (OAMs and the UBB) contribute 5,880 CNY (48% of the total value).
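
As a cross-check on the 52% / 48% split, the four component values can be grouped by board level and summed. This is a sketch using only the figures quoted in this section, read as CNY; the "carrier" and "pcb" labels are illustrative groupings.

```python
from collections import defaultdict

# (component, board level, value in CNY) as quoted in this section
gpu_board = [
    ("GPU carriers", "carrier", 5200),
    ("NVSwitch carriers", "carrier", 1170),
    ("OAMs", "pcb", 2880),
    ("UBB", "pcb", 3000),
]

# Sum component values per board level.
by_level = defaultdict(int)
for _name, level, value in gpu_board:
    by_level[level] += value

board_total = sum(by_level.values())
shares = {level: value / board_total for level, value in by_level.items()}

for level, value in by_level.items():
    print(f"{level}: {value:,} CNY ({shares[level]:.0%})")
print(f"GPU board total: {board_total:,} CNY")
```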

CPU Motherboard Components

The CPU motherboard assembly includes CPU carriers, CPU mainboards, and functional accessory boards. These accessories encompass system memory cards, network cards, expansion cards, and storage OS driver boards. Here’s the breakdown:


CPU Carriers: Similar in specification to GPU carriers, each CPU carrier has an estimated value of $100 (about 650 CNY). With the DGX A100 configured with 2 CPUs, the total value per server is approximately 1,300 CNY.

CPU Mainboard: The CPU mainboard houses the CPU chip, PCIe switch chip, TPM module, and various functional accessory cards. It is designed around the 64-core AMD Rome CPU and the PCIe 4.0 bus standard, and uses 10 to 12 layers of low-loss CCL material with a through-hole design. The estimated area of the CPU mainboard is 0.38 square meters, giving a per-server value of about 1,140 CNY (an implied unit price of roughly 3,000 CNY per square meter).

Functional Accessory Boards: These boards serve various purposes:

CPU memory cards (32 units, totaling 2TB RAM) have a standard size of approximately 0.004 square meters per card.

Network cards (Mellanox ConnectX series): 10 cards in total (8 single-port 200Gb/s InfiniBand and 2 dual-port 200Gb/s Ethernet), each with an area of about 0.012 square meters.

Riser cards (for expanding PCIe interfaces) cover an area of approximately 0.01 square meters.

Storage OS drive boards (housing two 1.92TB M.2 NVMe drives) occupy a similar area of about 0.01 square meters.

The total area for functional accessory boards is about 0.27 square meters, corresponding to a value of approximately 405 CNY per server (an implied unit price of roughly 1,500 CNY per square meter).
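
The 0.27 square meter figure can be reproduced by summing count times area over the accessory boards listed above. In this sketch the riser and storage board counts are assumed to be one each, since the text gives only their areas, and the unit price is back-calculated from the quoted value rather than taken from the source.

```python
# (board type, count, area per board in m^2)
# Assumption: one riser board and one storage board; the text states only areas.
accessory_boards = [
    ("memory card", 32, 0.004),
    ("network card", 10, 0.012),
    ("riser card", 1, 0.010),
    ("storage OS board", 1, 0.010),
]
QUOTED_VALUE = 405  # per-server value quoted in the text, read as CNY

total_area = sum(count * area for _name, count, area in accessory_boards)
rounded_area = round(total_area, 2)

# Back out the unit price implied by the quoted value and the rounded area.
implied_price = QUOTED_VALUE / rounded_area

print(f"Accessory board area: {total_area:.3f} m^2 (about {rounded_area} m^2)")
print(f"Implied unit price: {implied_price:,.0f} per m^2")
```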

Summary for CPU Motherboard: The total PCB area for the NVIDIA DGX A100 CPU motherboard assembly is 0.662 square meters, with a per-server value of approximately 2,845 CNY (1,300 + 1,140 + 405). The breakdown is as follows:

Carrier-level components contribute 46% of the total value.

PCB-level mainboard components contribute 40% of the total value.

PCB-level functional accessory components contribute 14% of the total value.

Other Components

Total value per unit: 226 CNY

Apart from the GPU board assembly and CPU module assembly, other components include the power supply, hard drives, and front control console board. According to industry research, these components primarily use 6 to 10 layers of FR4/Mid Loss grade CCL (copper-clad laminate), with a unit price ranging from 1,000 to 1,500 CNY per square meter. Referring to the DGX A100 specifications, we calculate the usage and area as follows:

  • Power Supply: Considering that the DGX A100 is equipped with 6 power supplies, we estimate that the individual PCB area for each power supply is 0.019 square meters, based on the specifications of the Delta Electronics 2200W server power supply (model DPS-2200-AB-2) measuring 73.5 x 265.0 mm.
  • Hard Drives: With 8 hard drives in the DGX A100, we estimate that the PCB area for each drive is 0.008 square meters, following industry-standard 3.5-inch drives.
  • Front Control Console Board: This board is primarily used for controlling external devices and is placed between the 8 hard drives. Based on industry research, we estimate its area to be approximately 0.010 square meters.
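
The area estimates in the list above can be combined and checked against the quoted 226 CNY and the 1,000 to 1,500 CNY per square meter price range. A minimal sketch: the power supply area is derived from the quoted 73.5 x 265.0 mm footprint, and the implied unit price is back-calculated rather than stated in the source.

```python
# Power supply PCB area from the quoted Delta DPS-2200-AB-2 footprint.
PSU_MM = (73.5, 265.0)
psu_area = round(PSU_MM[0] * PSU_MM[1] / 1_000_000, 3)  # text uses 0.019 m^2

# (component, count, area per board in m^2)
other_parts = [
    ("power supply", 6, psu_area),
    ("hard drive", 8, 0.008),
    ("front console board", 1, 0.010),
]
PRICE_RANGE = (1000, 1500)  # CNY per m^2, from the text
QUOTED_VALUE = 226          # CNY per unit, from the text

total_area = sum(count * area for _name, count, area in other_parts)
low, high = (total_area * price for price in PRICE_RANGE)
implied_price = QUOTED_VALUE / total_area

print(f"Other-component PCB area: {total_area:.3f} m^2")
print(f"Value range: {low:.0f} to {high:.0f} CNY")
print(f"Unit price implied by 226 CNY: {implied_price:.0f} CNY per m^2")
```

Under these inputs the quoted 226 CNY falls inside the computed range, which suggests a working unit price near the middle of the 1,000 to 1,500 band.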

Combining the GPU board assembly, CPU module assembly, and other components, we estimate that the total PCB area for the DGX A100 is 1.474 square meters, with a unit value of 15,321 CNY. Specifically:

GPU board assembly contributes about 12,250 CNY per unit (roughly 12,000 CNY), accounting for 80% of the total value.

CPU module assembly has a unit value of 2,845 CNY, representing 19% of the total.

Other components contribute 226 CNY per unit, making up 1% of the total value.

In terms of board classification, the carrier board level has a unit value of 7,670 CNY (50.1%), while the PCB board level contributes 7,651 CNY (49.9%).
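
The overall figures can be reproduced by rolling up the three assemblies. The sketch below uses the per-assembly values and areas quoted earlier, with the GPU board taken at 12,250 CNY (6,370 + 5,880), which is the figure that makes the 15,321 CNY total add up. The 0.188 square meter area for other components is derived from the list above (6 x 0.019 + 8 x 0.008 + 0.010), and the variable names are illustrative.

```python
# (assembly, carrier-level CNY, PCB-level CNY, PCB area in m^2)
assemblies = [
    ("GPU board", 6370, 5880, 0.624),
    ("CPU module", 1300, 1140 + 405, 0.662),
    ("Other", 0, 226, 0.188),
]

carrier_total = sum(carrier for _name, carrier, _pcb, _area in assemblies)
pcb_total = sum(pcb for _name, _carrier, pcb, _area in assemblies)
total_value = carrier_total + pcb_total
total_area = sum(area for _name, _carrier, _pcb, area in assemblies)

print(f"Total PCB area: {total_area:.3f} m^2")
print(f"Total value: {total_value:,} CNY")
for name, carrier, pcb, _area in assemblies:
    share = (carrier + pcb) / total_value
    print(f"  {name}: {carrier + pcb:,} CNY ({share:.0%})")
print(f"Carrier level: {carrier_total:,} CNY ({carrier_total / total_value:.1%})")
print(f"PCB level: {pcb_total:,} CNY ({pcb_total / total_value:.1%})")
```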
