AI Computing Hardware: ConnectX-8 SuperNIC

Product Overview

The ConnectX-8 SuperNIC is the eighth generation of NVIDIA's ConnectX smart network interface cards, designed for next-generation AI computing clusters, large-scale data centers, and high-performance computing (HPC). It deeply integrates network acceleration with computational offload and provides 400GbE/800GbE connectivity. Through hardware-level protocol offload and GPU-NIC co-optimization, it significantly reduces network latency and raises throughput, delivering low-latency, lossless network transport for AI training, inference, and distributed storage workloads.


Software Protocols and Acceleration Functions

ConnectX-8 SuperNIC optimizes full-stack network performance through the deep collaboration of the software protocol stack and hardware acceleration engine:

Protocol Support

  • RDMA/RoCEv2: Remote Direct Memory Access over Converged Ethernet (v2), enabling zero-copy data transfer with latency as low as sub-microsecond levels (a minimal device-enumeration sketch follows this list).
  • GPUDirect Technology: Supports GPUDirect RDMA and GPUDirect Storage, giving the NIC and storage a direct path to GPU memory and bypassing CPU memory copies.
  • NVIDIA SHARPv3: In-network hardware acceleration of collective operations such as AllReduce and Broadcast, improving AI training efficiency.
  • TLS/IPsec Hardware Offload: Encrypts and decrypts full traffic in hardware without performance loss.
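
Applications typically consume the RDMA/RoCE capabilities above through the standard libibverbs API. The sketch below only enumerates RDMA-capable devices on a Linux host with rdma-core installed and prints a few capability limits; it is a minimal illustration of device discovery, not a full RoCE data path (build with `gcc example.c -libverbs`).

```c
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void) {
    int num = 0;
    /* List every RDMA-capable device (ConnectX NICs appear here when the
       mlx5 driver is loaded). */
    struct ibv_device **list = ibv_get_device_list(&num);
    if (!list) { perror("ibv_get_device_list"); return 1; }

    for (int i = 0; i < num; i++) {
        struct ibv_context *ctx = ibv_open_device(list[i]);
        if (!ctx) continue;

        struct ibv_device_attr attr;
        if (!ibv_query_device(ctx, &attr))
            printf("%s: max_qp=%d max_cq=%d\n",
                   ibv_get_device_name(list[i]), attr.max_qp, attr.max_cq);

        ibv_close_device(ctx);
    }
    ibv_free_device_list(list);
    return 0;
}
```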

Software Ecosystem

  1. DOCA 2.0 (Data Center Infrastructure-on-a-Chip Architecture): Provides an API-driven development framework for user-defined data-plane acceleration functions (e.g., collaborative orchestration with DPUs).
  2. Deep Integration with the CUDA Ecosystem: Multi-GPU, cross-node communication is optimized through the NCCL library (a minimal AllReduce sketch follows this list).
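
For orientation, here is a minimal sketch of how NCCL is commonly driven from C/CUDA: one process creates a communicator per local GPU and issues a sum-AllReduce. The device count and buffer size are illustrative assumptions, buffers are left uninitialized and error checks are omitted for brevity; in a real multi-node job the same calls run under MPI or a framework such as PyTorch, and NCCL selects the transport (NVLink, PCIe, or RDMA/RoCE through the NIC).

```c
#include <cuda_runtime.h>
#include <nccl.h>

#define NGPUS 2          /* assumption: two local GPUs */
#define COUNT (1 << 20)  /* 1M floats per GPU */

int main(void) {
    int devs[NGPUS] = {0, 1};
    ncclComm_t comms[NGPUS];
    float *sendbuf[NGPUS], *recvbuf[NGPUS];
    cudaStream_t streams[NGPUS];

    /* Allocate buffers and a stream on each GPU. */
    for (int i = 0; i < NGPUS; i++) {
        cudaSetDevice(devs[i]);
        cudaMalloc((void **)&sendbuf[i], COUNT * sizeof(float));
        cudaMalloc((void **)&recvbuf[i], COUNT * sizeof(float));
        cudaStreamCreate(&streams[i]);
    }

    /* One NCCL communicator per GPU, all owned by this process. */
    ncclCommInitAll(comms, NGPUS, devs);

    /* Group the per-GPU AllReduce calls so NCCL launches them together. */
    ncclGroupStart();
    for (int i = 0; i < NGPUS; i++)
        ncclAllReduce(sendbuf[i], recvbuf[i], COUNT, ncclFloat, ncclSum,
                      comms[i], streams[i]);
    ncclGroupEnd();

    for (int i = 0; i < NGPUS; i++) {
        cudaSetDevice(devs[i]);
        cudaStreamSynchronize(streams[i]);
    }

    for (int i = 0; i < NGPUS; i++) {
        ncclCommDestroy(comms[i]);
        cudaFree(sendbuf[i]);
        cudaFree(recvbuf[i]);
        cudaStreamDestroy(streams[i]);
    }
    return 0;
}
```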

Hardware Architecture and Connectivity Design

Host Interface

PCIe 5.0 x16 host interface with a theoretical bidirectional bandwidth of roughly 128 GB/s (about 64 GB/s per direction), providing the host bandwidth to drive 400G/800G network ports.
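
One way to confirm what the slot actually negotiated is to read the standard Linux PCI sysfs attributes for the NIC. The sketch below uses a hypothetical PCI address (0000:81:00.0); substitute the address reported by `lspci` on your host. A Gen5 x16 link should report "32.0 GT/s PCIe" and a width of 16.

```c
#include <stdio.h>

/* Hypothetical PCI address of the NIC; find yours with lspci. */
#define NIC_SYSFS "/sys/bus/pci/devices/0000:81:00.0"

static void print_attr(const char *name) {
    char path[256], buf[64];
    snprintf(path, sizeof(path), "%s/%s", NIC_SYSFS, name);
    FILE *f = fopen(path, "r");
    if (f && fgets(buf, sizeof(buf), f))
        printf("%-20s %s", name, buf);   /* sysfs values end with '\n' */
    if (f) fclose(f);
}

int main(void) {
    print_attr("current_link_speed");
    print_attr("current_link_width");
    print_attr("max_link_speed");
    print_attr("max_link_width");
    return 0;
}
```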

Network Interface

Supports single-port 800GbE OSFP112 or dual-port 400GbE QSFP112 flexible configurations.

Backward compatible with 200GbE/100GbE speeds, so it can be deployed alongside existing infrastructure.

On-Chip Acceleration Engine

Integrates dedicated ASIC engines that offload flow-table management, congestion control (DCQCN), packet validation, and related functions entirely in hardware.
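
For intuition about what the congestion-control offload does, the sketch below models the DCQCN reaction-point behavior in plain C: the sending rate is cut when a congestion notification packet (CNP) arrives and recovers toward a target rate otherwise. The real algorithm runs in NIC hardware with vendor-tuned parameters and additional fast-recovery/hyper-increase phases; the constants here are illustrative assumptions only.

```c
#include <stdio.h>

/* Simplified DCQCN reaction-point state (illustrative values). */
typedef struct {
    double rc;     /* current sending rate (Gb/s) */
    double rt;     /* target rate (Gb/s)          */
    double alpha;  /* congestion-severity estimate */
    double g;      /* alpha update gain            */
    double rai;    /* additive-increase step (Gb/s) */
} dcqcn_rp;

/* CNP received: remember the current rate as the target and cut the
   current rate in proportion to alpha. */
static void on_cnp(dcqcn_rp *s) {
    s->rt = s->rc;
    s->rc = s->rc * (1.0 - s->alpha / 2.0);
    s->alpha = (1.0 - s->g) * s->alpha + s->g;
}

/* No CNP seen in the last window: decay alpha and recover toward the
   target rate with an additive-increase step. */
static void on_quiet_period(dcqcn_rp *s) {
    s->alpha = (1.0 - s->g) * s->alpha;
    s->rt += s->rai;
    s->rc = (s->rc + s->rt) / 2.0;
}

int main(void) {
    dcqcn_rp s = { .rc = 400.0, .rt = 400.0, .alpha = 1.0,
                   .g = 1.0 / 16.0, .rai = 5.0 };
    on_cnp(&s);
    printf("after CNP:      rc = %.1f Gb/s\n", s.rc);
    for (int i = 0; i < 5; i++) on_quiet_period(&s);
    printf("after recovery: rc = %.1f Gb/s\n", s.rc);
    return 0;
}
```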


Networking Architecture and Connectivity

ConnectX-8 SuperNIC supports multi-tier Clos network topologies for building high-bandwidth, non-blocking AI computing clusters.

Single Node Connection

Each server deploys one or two ConnectX-8 NICs, connected to the host over PCIe 5.0.

Each port connects directly to a leaf switch over optical fiber (OSFP/QSFP112 transceivers), providing redundant dual uplinks.

Cluster Networking

  1. Leaf Switches: NVIDIA Quantum-3 series (800G InfiniBand) or Spectrum-4 series (400G Ethernet); the Ethernet path supports RoCEv2 and adaptive routing.
  2. Spine Switches: Fully interconnected with the leaf switches through 800G high-speed ports in a spine-leaf architecture, providing non-blocking bandwidth.
  3. GPU-Direct Networking: GPUs in different nodes access each other's memory directly via RDMA, forming a distributed training cluster (see the fabric-sizing sketch after this list).
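
As a rough illustration of non-blocking spine-leaf sizing, the sketch below splits each switch's radix evenly between server-facing and spine-facing ports and derives leaf and spine counts for a given number of NIC ports. The port counts and switch radix are illustrative assumptions, not a validated design.

```c
#include <stdio.h>

/* Rough sizing of a two-tier non-blocking spine-leaf fabric.
   Assumption: every switch has `radix` ports of equal speed, and a
   non-blocking leaf splits its ports 50/50 between NICs and uplinks. */
static void size_fabric(int nic_ports, int radix) {
    int down_per_leaf = radix / 2;              /* ports facing servers */
    int up_per_leaf   = radix - down_per_leaf;  /* ports facing spines  */
    int leaves = (nic_ports + down_per_leaf - 1) / down_per_leaf;
    int spines = (leaves * up_per_leaf + radix - 1) / radix;
    printf("%d NIC ports -> %d leaves, %d spines (radix %d)\n",
           nic_ports, leaves, spines, radix);
}

int main(void) {
    /* Example: 512 ConnectX-8 ports on hypothetical 64-port switches. */
    size_fabric(512, 64);
    return 0;
}
```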

Optical Modules and Fiber Selection

Optical Modules

800G Scenarios: OSFP112 800G-SR8/VR8 (multi-mode, up to 100m) / 800G-DR8 (single-mode, up to 500m).


400G Scenarios: QSFP112 400G-VR4/SR4/DR4.

Fiber Types:


Multi-Mode (MMF): OM5/OM4 (850nm), supporting 400G-SR4/800G-SR8 links up to 100m.

Single-Mode (SMF): OS2 (1310nm/1550nm, supporting long-distance transmission over 10km).
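
The reach figures above can be captured in a small selection helper; the sketch below mirrors only the speeds, module types, and distances quoted in this section and is not a complete optics compatibility matrix (real selection also depends on connector type, FEC, and cost).

```c
#include <stdio.h>

/* Toy module picker based on the nominal reaches listed above. */
static const char *pick_module(int gbps, double meters, int single_mode) {
    if (gbps == 800) {
        if (!single_mode && meters <= 100) return "OSFP112 800G-SR8 (MMF, OM4/OM5)";
        if (single_mode && meters <= 500)  return "OSFP112 800G-DR8 (SMF, OS2)";
    } else if (gbps == 400) {
        if (!single_mode && meters <= 100) return "QSFP112 400G-SR4 (MMF, OM4/OM5)";
        if (single_mode && meters <= 500)  return "QSFP112 400G-DR4 (SMF, OS2)";
    }
    return "longer-reach single-mode optics over OS2";
}

int main(void) {
    printf("%s\n", pick_module(800, 80, 0));   /* in-row multimode run   */
    printf("%s\n", pick_module(400, 450, 1));  /* cross-hall single-mode */
    return 0;
}
```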


Compatible Switches and Ecosystem Collaboration

NVIDIA Switches:

Quantum-3: 800G InfiniBand switch supporting SHARPv3 in-network collective communication acceleration.

Spectrum-4: 400G Ethernet switch supporting RoCEv2 and intelligent traffic scheduling.

Third-Party Switches:

Arista 7800R3 (800G), Cisco Nexus 92300YC: confirm RoCEv2 and ECMP load-balancing support before deployment.

