The Ultra Ethernet Consortium (UEC) was officially established on July 19, 2023. It is a new organization sponsored by the Linux Foundation and its Joint Development Foundation. UEC aims to go beyond what Ethernet offers today, including Remote Direct Memory Access (RDMA) in the form of RDMA over Converged Ethernet (RoCE), and to provide a high-performance, distributed and lossless transport layer optimized for high-performance computing and artificial intelligence. In doing so, it takes direct aim at the rival interconnect technology, InfiniBand.
UEC’s founding members include AMD, Arista, Broadcom, Cisco, Eviden, HPE, Intel, Meta and Microsoft, all with decades of experience in large-scale deployment of networking, artificial intelligence, cloud and high-performance computing.
Why does Ethernet need UEC?
How is UEC different from the current Ethernet?
Artificial intelligence and high-performance computing place new demands on networks: greater scale, higher bandwidth density, multipathing, rapid response to congestion, and interdependence among individual data flows, where a job advances only as fast as its slowest flow (which makes tail latency a key consideration). The UEC specification is designed to close these gaps and provide the larger-scale networking these workloads require. UEC targets a complete communications stack that solves technical problems across multiple protocol layers and provides functionality that is easy to configure and manage.
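Why tail latency matters so much when flows depend on each other can be shown with a little arithmetic. The sketch below is only an illustration, not part of any UEC specification: it assumes a simplified model in which each flow is independently "slow" with probability `p_slow`, and a synchronized step has to wait for every flow. The function names and the numbers are hypothetical.

```python
def prob_step_delayed(num_flows, p_slow):
    """Probability that at least one of num_flows independent flows is slow."""
    return 1.0 - (1.0 - p_slow) ** num_flows


def expected_step_time(num_flows, p_slow, fast, slow):
    """Expected duration of a step that waits for every flow to finish.

    Assumption: a flow takes either `fast` or `slow` time units, so the
    step takes `slow` as soon as a single flow is slow.
    """
    p = prob_step_delayed(num_flows, p_slow)
    return p * slow + (1.0 - p) * fast


if __name__ == "__main__":
    # Hypothetical numbers: 1% of flows are 10x slower than normal.
    for n in (1, 10, 100, 1000):
        print(n,
              round(prob_step_delayed(n, 0.01), 3),
              round(expected_step_time(n, 0.01, fast=1.0, slow=10.0), 2))
```

Under these assumptions, a 1% chance of a slow flow delays a 100-flow step about 63% of the time, which is why a rare straggler that is harmless in a general-purpose network becomes the dominant cost at AI and HPC scale.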
How is UEC different from other current protocols?
Existing protocols may address some aspects of the above problems (such as legacy congestion management), but because they were designed for general-purpose networks, they lack features critical to artificial intelligence and high-performance computing, such as multipath and easy configuration. Existing protocols may also prove fragile in these scenarios. Drawing on its members’ extensive experience deploying artificial intelligence and high-performance computing workloads, UEC intends to deliver a comprehensive solution that enables new hardware and software products not available today with Ethernet or any other networking technology.
What does UEC plan to do?
UEC will provide an open, interoperable, high-performance, full communications stack architecture based on Ethernet to meet the growing network needs of large-scale artificial intelligence and high-performance computing. UEC plans to make changes to multiple layers of the Ethernet stack, from the physical layer to the software layer.
“This is not about revolutionizing Ethernet,” said UEC President Dr. J Metz. “It’s about tuning Ethernet to make it more efficient for workloads with specific performance requirements. We’re looking at every layer from physical to software to find the best ways to improve efficiency and performance at scale.”
Metz noted that there is no shortage of network standards and organizations for Ethernet today, and that while the IEEE has taken a major role, the UEC's scope is broader than the physical layer that the IEEE typically focuses on. The UEC’s goal is to study all the elements needed to improve Ethernet and then work with the relevant standardization organizations and technical groups to implement these improvements. The Consortium will work to maintain and promote Ethernet interoperability while minimizing changes to the communications stack.
The technical goal of UEC is to develop specifications, APIs, and source code to define:
- Protocols, electrical and optical signal characteristics, and application programming interfaces and data structures for Ethernet communications.
- Link-level and end-to-end network transport protocols that extend or replace existing link and transport protocols.
- Link-level and end-to-end congestion, telemetry, and signaling mechanisms, all suitable for artificial intelligence, machine learning, and high-performance computing environments.
- Software, storage, management, and security architecture to support a variety of workloads and operating environments.
RDMA vs. UEC Transport
To improve Ethernet, the UEC has proposed the UEC transport protocol. Metz said UEC transport is being developed to provide a better Ethernet transport than current RDMA while still supporting RDMA, retaining the advantages of Ethernet/IP and delivering the performance required for AI and HPC applications. UEC transport is a new transport-layer design with adjusted semantics, a congestion notification protocol, and enhanced security features. It will be a more flexible transport that does not require a lossless network, which allows features such as multipath and out-of-order packet delivery that many-to-many AI workloads need. The UEC transport protocol offers:
- An open protocol specification designed from the ground up to run on IP and Ethernet
- Multipath, packet-spraying delivery that fully utilizes the AI network without causing congestion or head-of-line blocking, and without the need for centralized load-balancing algorithms and route controllers
- Incast management mechanisms that control fan-in on the final link to the destination host with minimal packet loss
- Efficient rate control algorithms that let a transfer ramp quickly to line rate without degrading the performance of competing flows
- API for out-of-order packet delivery with the option to complete messages in order, maximizing network and application concurrency and minimizing message latency
- Scalability for future networks, supporting up to 1,000,000 endpoints
- Performance and optimal network utilization without the need for network- and workload-specific tuning of congestion algorithm parameters
- Designed to enable wire-rate performance for 800G, 1.6T and future faster Ethernet on commodity hardware
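Two of the ideas in the list above, packet spraying across multiple paths and out-of-order packet delivery with whole-message completion, can be illustrated with a toy simulation. The following is a minimal sketch under stated assumptions and is not the UEC wire format or API: the `Packet` fields, the round-robin `spray` policy, and the `Receiver` class are hypothetical names chosen for illustration, and random interleaving of paths stands in for real differences in path latency.

```python
import random
from dataclasses import dataclass


@dataclass
class Packet:
    msg_id: int     # which message this packet belongs to
    seq: int        # position of this chunk within the message
    total: int      # number of chunks in the whole message
    payload: bytes


def packetize(msg_id, data, chunk_size):
    """Split a message into sequence-numbered packets."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    return [Packet(msg_id, seq, len(chunks), c) for seq, c in enumerate(chunks)]


def spray(packets, num_paths):
    """Spread the packets of one flow over all paths (round-robin here)."""
    paths = [[] for _ in range(num_paths)]
    for i, pkt in enumerate(packets):
        paths[i % num_paths].append(pkt)
    return paths


def deliver(paths, rng):
    """Interleave the paths randomly; order is kept only within a path."""
    queues = [list(p) for p in paths if p]
    arrived = []
    while queues:
        q = rng.choice(queues)
        arrived.append(q.pop(0))
        queues = [x for x in queues if x]  # drop paths that have drained
    return arrived


class Receiver:
    """Accepts packets in any order and completes a message once whole."""

    def __init__(self):
        self.partial = {}    # msg_id -> {seq: payload}
        self.completed = {}  # msg_id -> reassembled bytes

    def on_packet(self, pkt):
        chunks = self.partial.setdefault(pkt.msg_id, {})
        chunks[pkt.seq] = pkt.payload  # placed by sequence number, no waiting
        if len(chunks) == pkt.total:
            self.completed[pkt.msg_id] = b"".join(
                chunks[i] for i in range(pkt.total))
            del self.partial[pkt.msg_id]


def transfer(data, chunk_size=4, num_paths=4, seed=0):
    """Send one message over sprayed paths and return what the receiver got."""
    rng = random.Random(seed)
    receiver = Receiver()
    packets = packetize(1, data, chunk_size)
    for pkt in deliver(spray(packets, num_paths), rng):
        receiver.on_packet(pkt)
    return receiver.completed.get(1)


if __name__ == "__main__":
    print(transfer(b"ultra ethernet transport sketch"))
```

Because the receiver places each packet by its sequence number instead of waiting for the next expected one, a delayed packet on one path does not stall packets arriving on the others, which is the property that lets a transport use every path without head-of-line blocking.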
The Road Ahead for UEC
Looking ahead, the workloads and networking needs of AI and high-performance computing are expected to increasingly overlap. Taking into account the different sensitivities to bandwidth and latency, the UEC specification will provide two profiles – one optimized for AI and the other optimized for HPC.
Ensuring interoperability through plugfest-style interoperability testing and compliance testing will be a future focus, said Uri Elzur, Chairman of the UEC Technical Advisory Committee. UEC’s goal is to keep the specification open and interoperable. The UEC draft specification will be released soon and be open for use.