Technology comparison of HBM, HBM2, HBM3 and HBM3e

HBM, or high-bandwidth memory, consists of multiple DRAM dies stacked vertically. Each DRAM die is connected to the base logic die through TSV (through-silicon via) technology, allowing 8-high and 12-high die stacks to be packaged in a small space. This combines small size with high bandwidth and high transmission speed, making HBM the mainstream memory solution for GPUs in high-performance AI servers.
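As a rough illustration of how stacking drives capacity, the per-stack capacity is simply the die count times the per-die density. The 16Gbit die density in the sketch below is an assumption chosen for illustration, not a figure from this article:

```python
def stack_capacity_gb(die_density_gbit: float, num_dies: int) -> float:
    """Per-stack capacity in GB: number of DRAM dies x die density in Gbit, divided by 8 bits/byte."""
    return die_density_gbit * num_dies / 8

# Assuming 16 Gbit DRAM dies (an illustrative density, not a figure from this article):
print(stack_capacity_gb(16, 8))   # 8-high stack  -> 16.0 GB
print(stack_capacity_gb(16, 12))  # 12-high stack -> 24.0 GB
```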

The current extended version of HBM3, HBM3E, provides a per-pin transmission speed of up to 8Gbps and 24GB of memory per stack. It was first released by SK Hynix and is planned for mass production in 2024.

The main application scenario of HBM is AI servers. The latest generation, HBM3E, is installed on the H200 released by NVIDIA in 2023. According to TrendForce data, AI server shipments reached 860,000 units in 2022, and shipments are expected to exceed 2 million units in 2026, a compound annual growth rate of 29%.

The growth in AI server shipments has catalyzed an explosion in HBM demand, and with the increase in average HBM capacity per server, the HBM market size is estimated to reach approximately US$15 billion in 2025, a growth rate of more than 50%.
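As a quick sanity check on the shipment figures above, the compound-growth arithmetic works out as follows. The 2026 value here is back-computed from the stated 29% CAGR, not a number taken from TrendForce:

```python
# Implied 2026 AI server shipments from the figures above:
# 0.86 million units in 2022 compounding at 29% per year over four years (2022 -> 2026).
shipments_2022 = 0.86   # million units (TrendForce figure cited above)
cagr = 0.29             # compound annual growth rate cited above
shipments_2026 = shipments_2022 * (1 + cagr) ** 4
print(f"{shipments_2026:.2f} million units")  # ~2.38 million, consistent with "exceed 2 million units"
```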

HBM suppliers are mainly concentrated in the three major memory manufacturers: SK Hynix, Samsung, and Micron. According to TrendForce data, SK Hynix’s market share is expected to be 53% in 2023, Samsung’s 38%, and Micron’s 9%. The main process changes for HBM are in CoWoS packaging and TSV.

HBM principle diagram

HBM1 was first launched by AMD and SK Hynix in 2014 as a competitor to GDDR. It uses a 4-high die stack providing 128GB/s of bandwidth and 4GB of memory, significantly better than the GDDR5 of the same period.

HBM2 was announced in 2016 and officially launched in 2018. It started as a 4-high DRAM die stack but is now mostly 8-high, providing 256GB/s of bandwidth, a 2.4Gbps per-pin transmission speed, and 8GB of memory. HBM2E was proposed in 2018 and officially launched in 2020, with significant improvements in transmission speed and capacity, providing 3.6Gbps and 16GB of memory. HBM3 was announced in 2020 and officially launched in 2022; the number of stacked layers and memory channels increased, providing a 6.4Gbps per-pin transmission speed, bandwidth of up to 819GB/s, and 16GB of memory. HBM3E is an enhanced version of HBM3 released by SK Hynix, providing a per-pin transmission speed of up to 8Gbps and a capacity of 24GB; it is planned for mass production in 2024.
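The bandwidth figures above follow from the per-pin data rate and the stack’s interface width. Below is a minimal sketch of that arithmetic, assuming HBM’s standard 1024-bit interface per stack (a JEDEC parameter not stated explicitly in this article):

```python
def stack_bandwidth_gb_s(pin_rate_gbps: float, bus_width_bits: int = 1024) -> float:
    """Per-stack bandwidth in GB/s: bus width in bits x per-pin rate in Gbit/s, divided by 8 bits/byte."""
    return pin_rate_gbps * bus_width_bits / 8

print(stack_bandwidth_gb_s(6.4))  # HBM3 at 6.4 Gbps  -> 819.2 GB/s (the ~819GB/s cited above)
print(stack_bandwidth_gb_s(8.0))  # HBM3E at 8.0 Gbps -> 1024.0 GB/s per stack
```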

HBM Evolution Paths of the Three Major Storage Manufacturers

HBM is widely used in AI servers due to its high bandwidth, low power consumption, and small size. Its application is mainly concentrated in high-performance servers: it was first implemented in the NVIDIA P100 GPU (HBM2) in 2016, then applied to the V100 (HBM2) in 2017, the A100 (HBM2) in 2020, and the H100 (HBM2e/HBM3) in 2022. The latest generation, HBM3E, is installed on the H200 released by NVIDIA in 2023, providing faster speed and higher capacity for servers.

HBM suppliers are mainly concentrated in three major manufacturers: SK Hynix, Samsung, and Micron, with SK Hynix in the lead. The three memory makers are mainly responsible for producing and stacking the DRAM dies and are competing through technology upgrades. SK Hynix, which released the world’s first HBM together with AMD, was also the first to supply the new-generation HBM3E in 2023, establishing its market position early; it mainly supplies NVIDIA, while Samsung supplies other cloud vendors. According to TrendForce data, in 2022 SK Hynix’s market share was 50%, Samsung’s was 40%, and Micron’s was about 10%.

HBM’s changes in packaging technology are mainly in CoWoS and TSV.

1) CoWoS: The DRAM die stacks are placed together with the logic chip on a silicon interposer and attached through a Chip on Wafer (CoW) packaging process; that is, the chips are first bonded to the interposer wafer (CoW), and the CoW assembly is then connected to the substrate, integrating into CoWoS (Chip on Wafer on Substrate). Currently, the mainstream solution for integrating HBM with GPUs is TSMC’s CoWoS, which achieves faster data transmission by shortening interconnect length and has been widely used in computing chips such as the A100 and GH200.

2) TSV: TSVs are the core technology for expanding capacity and bandwidth, forming thousands of vertical interconnections between the front and back of the chip by drilling holes through the entire thickness of the silicon wafer. In HBM, the stacked DRAM dies are connected by TSVs and solder bumps; only the bottom die connects directly to the memory controller, while the remaining dies are interconnected through internal TSVs.
