**HBM3 Gen 2: Improving Memory Capacity for Generative AI**
Micron has introduced HBM3 Gen 2 to address the growing memory-capacity demands of generative AI applications. The new part offers significant improvements over previous generations of HBM: 24GB of capacity in an 8-high package and over 1.2 TB/s of bandwidth. This revision of HBM3 is poised to help Micron gain share in a market it has so far trailed.
**Micron’s Market Share and Competitors**
Currently, Micron holds roughly 10% of the HBM market. SK hynix leads with about 50%, followed by Samsung at 40%, according to TrendForce. Both rivals supply HBM2e and HBM3 memory for high-end GPUs, which carry up to 80GB of capacity and deliver roughly 2 TB/s and 3 TB/s of bandwidth, respectively.
**Micron’s Advantage**
Micron’s implementation of HBM3 Gen 2 offers several advantages over its competitors. The new 8-high cube provides 50% more capacity than standard HBM3 in a smaller footprint, measuring 11mm x 11mm. Furthermore, Micron plans to release a 12-high package in 2024, which will increase the capacity to 36GB per cube.
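As a sanity check, the die math behind those figures works out cleanly: a 24GB 8-high cube and a 36GB 12-high cube both imply 3GB (24Gb) per DRAM die. The short Python sketch below just walks through that arithmetic; the per-die density is inferred from the quoted capacities, not a spec stated in this article.

```python
# Back-of-the-envelope check of the cube capacities quoted above.
# Assumes each DRAM die in the stack is 24Gb (3GB); that density is
# inferred from the 24GB 8-high and 36GB 12-high figures.

GB_PER_DIE = 3  # 24Gb die = 3GB

for stack_height in (8, 12):
    capacity_gb = stack_height * GB_PER_DIE
    print(f"{stack_height}-high stack: {capacity_gb} GB per cube")

# 8-high stack: 24 GB per cube
# 12-high stack: 36 GB per cube
```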
Micron also claims that HBM3 Gen 2 is the fastest and highest-capacity HBM on the market, citing over 1.2 TB/s of bandwidth per cube versus the roughly 1 TB/s offered by SK hynix’s HBM3E. That combination of higher speed and greater capacity makes Micron’s HBM3 Gen 2 a compelling choice for companies building out generative AI infrastructure.
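The bandwidth claim can likewise be sanity-checked from pin speeds. HBM3-class stacks expose a 1024-bit data interface, so per-stack bandwidth is simply the pin rate times 1024 divided by 8. The sketch below uses illustrative pin rates of 9.2 and 9.6 Gb/s, which are assumptions for the calculation rather than vendor specifications, to show roughly what it takes to clear 1.2 TB/s.

```python
# Rough per-stack bandwidth from pin rate: an HBM3-class stack has a
# 1024-bit data interface, so bandwidth = pin_rate (Gb/s) * 1024 / 8.
# The pin rates below are illustrative assumptions, not vendor specs.

BUS_WIDTH_BITS = 1024

def stack_bandwidth_gbs(pin_rate_gbps: float) -> float:
    """Peak bandwidth of one HBM stack in GB/s."""
    return pin_rate_gbps * BUS_WIDTH_BITS / 8

for label, pin_rate in [("~9.2 Gb/s pin rate", 9.2), ("~9.6 Gb/s pin rate", 9.6)]:
    print(f"{label}: {stack_bandwidth_gbs(pin_rate):.0f} GB/s per stack")

# ~9.2 Gb/s pin rate: 1178 GB/s per stack
# ~9.6 Gb/s pin rate: 1229 GB/s per stack (i.e., over 1.2 TB/s)
```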
**The Importance of Memory Capacity for Generative AI**
Generative AI, particularly applications like ChatGPT, relies heavily on memory to hold large language models. GPT-3- and GPT-4-class models typically run on servers with 8 or 16 GPUs, each equipped with 80GB of HBM. Yet these GPUs are underutilized during inference, with only about 20% of their compute capacity in use; the primary reason for deploying so many of them is to aggregate enough HBM to hold the massive models.
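To see why memory rather than compute sizes these deployments, consider a weights-only back-of-the-envelope calculation. At FP16 precision (2 bytes per parameter), GPT-3’s 175 billion parameters alone occupy roughly 350GB of HBM, before any KV cache or activations; GPT-4’s parameter count is not public, so it is omitted here. A minimal Python sketch:

```python
# Weights-only memory footprint at FP16 (2 bytes per parameter).
# KV cache and activations add more on top; GPT-4's parameter count
# is not public, so only GPT-3's 175B figure is used.
import math

BYTES_PER_PARAM = 2   # FP16/BF16 weights
HBM_PER_GPU_GB = 80   # e.g., an 80GB HBM GPU

def gpus_for_weights(params_billions: float) -> tuple[float, int]:
    """Return (weights in GB, minimum 80GB GPUs to hold them)."""
    weights_gb = params_billions * 1e9 * BYTES_PER_PARAM / 1e9
    return weights_gb, math.ceil(weights_gb / HBM_PER_GPU_GB)

weights_gb, n_gpus = gpus_for_weights(175)  # GPT-3
print(f"GPT-3 weights: {weights_gb:.0f} GB -> at least {n_gpus} x 80GB GPUs")

# GPT-3 weights: 350 GB -> at least 5 x 80GB GPUs, before KV cache,
# which is why 8-GPU nodes are deployed even at low compute utilization.
```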
**Addressing the Memory Problem**
The industry recognizes that solving the memory problem is key to making generative AI more cost-effective. While alternatives are being explored, such as attaching additional HBM stacks to GPUs, an approach pioneered by Eliyan, faster and larger HBM chips remain the easiest upgrade for companies like NVIDIA to adopt.
**Conclusions**
Micron’s new HBM3 Gen 2 offers significant improvements in memory capacity for generative AI applications. With 50% more capacity than HBM3 and a smaller footprint, Micron’s HBM3 Gen 2 is a promising option for companies looking to enhance the performance and reduce the cost of generative AI training and inference processing.
**Disclosures**
This article presents the opinions of the authors and does not provide advice to purchase or invest in the mentioned companies. Cambrian AI Research works with semiconductor firms, including Blaize, Cadence Design, Cerebras, D-Matrix, Eliyan, Esperanto, FuriosaAI, Graphcore, GML, IBM, Intel, Mythic, NVIDIA, Qualcomm Technologies, SiFive, SiMa.ai, Synopsys, and Tenstorrent. The authors have no investment positions in the companies discussed in this article and do not intend to initiate any in the near future. For more information, please visit Cambrian AI Research’s website at [https://cambrian-ai.com/](https://cambrian-ai.com/).