**AMD’s CEO Highlights AI Opportunities at the Data Center & AI Technology Premiere Event**
AMD is making a push into the AI market, leveraging its success in high-performance computing and its collaborations with hyperscalers. CEO Dr. Lisa Su believes that AI will become pervasive in computing, and she sees it as AMD's largest long-term opportunity. With its acquisition of Xilinx and the development of a broad portfolio spanning CPUs, GPUs, and adaptive compute elements, AMD is ready to make its mark in AI.
**Introducing the Instinct MI300X: AMD’s New AI Supercomputing Hybrid Processor**
At the recent Data Center & AI Technology Premiere Event, AMD announced the MI300X, a new variant of its Instinct MI300 AI supercomputing hybrid processor. Unlike the original MI300, which combines CDNA3 GPU chiplets with Zen 4 "Genoa" Core Complex Die (CCD) chiplets, the MI300X is built with GPU chiplets only. It packs 153B transistors and up to 192GB of HBM3 memory, enough to run large generative AI models with up to 80 billion parameters on a single accelerator.
The MI300X also comes in an Open Compute Project (OCP)-compliant platform design that supports eight GPUs, offering a total of 1.5TB of HBM3 memory in a single platform. Each accelerator is expected to draw up to 750W of power. Its main competitor is Nvidia's H100 GPU, which currently offers 80GB of memory. However, AMD's CDNA3-based MI300 GPU lacks the dedicated acceleration for transformer models that the Hopper GPU architecture provides.
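The 1.5TB platform figure follows directly from the per-accelerator memory; a quick sanity check of the arithmetic:

```python
# Sanity check on the announced eight-GPU platform memory:
# eight MI300X accelerators, each with 192 GB of HBM3.
gpus_per_platform = 8
hbm3_per_gpu_gb = 192

total_gb = gpus_per_platform * hbm3_per_gpu_gb
total_tb = total_gb / 1024  # AMD's marketing rounds this to 1.5 TB

print(f"{total_gb} GB ~= {total_tb:.1f} TB")  # 1536 GB ~= 1.5 TB
```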
The MI300X is set to be sampled in Q3 of 2023 and will go into production in Q4 of 2023. The original MI300, now called MI300A, is already sampling. Pricing details have not been announced.
**Building a Strong Software Ecosystem**
To succeed in the AI market, AMD recognizes the importance of a robust software ecosystem and strong partnerships. The company has been developing ROCm, an open-source alternative to Nvidia's CUDA for GPU compute programming. ROCm supports AMD GPUs and is now in its fifth generation. While ROCm does not directly support other GPU programming stacks such as Intel's oneAPI or the Khronos Group's SYCL, AMD's Versal ACAP AI processors use the separate Vitis software stack. However, there is currently no ROCm support for the Ryzen 7000 series processors.
At the event, AMD highlighted its collaboration with Meta's PyTorch lead and discussed how PyTorch 2.0 supports ROCm on Linux. PyTorch is a popular machine-learning framework and an alternative to Google's TensorFlow for neural-network programming. AMD's ROCm AI software stack has already seen significant use in supercomputer systems, such as Europe's Lumi supercomputer. AMD is also partnering with Hugging Face to optimize and run regression tests on its models for AMD's Instinct GPUs, Ryzen CPUs, EPYC CPUs, and Alveo ACAP products.
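A minimal sketch of what that PyTorch support looks like in practice: ROCm builds of PyTorch expose AMD GPUs through the same `torch.cuda` interface used for Nvidia hardware (with `torch.version.hip` set instead of `torch.version.cuda`), so most CUDA-style code runs unchanged. The snippet below falls back to the CPU when no GPU is present.

```python
import torch

# On a ROCm build of PyTorch, AMD GPUs are reported through the
# familiar torch.cuda interface, so existing CUDA-targeted code
# runs largely unchanged.
if torch.cuda.is_available():
    # torch.version.hip is a version string on ROCm builds, None on CUDA builds
    backend = "ROCm/HIP" if torch.version.hip else "CUDA"
    device = torch.device("cuda")
else:
    backend = "CPU"
    device = torch.device("cpu")

x = torch.randn(1024, 1024, device=device)
y = x @ x.T  # matmul dispatches to rocBLAS on ROCm, cuBLAS on CUDA
print(backend, tuple(y.shape))
```

This device-portability is the crux of AMD's ecosystem pitch: existing PyTorch workloads should move to Instinct GPUs without source changes.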
**Summary: AMD’s Path Towards GPU-based AI Acceleration**
AMD has made progress in running AI software stacks on its GPUs, thanks to partnerships with Microsoft, Meta, and Hugging Face. However, it still has a way to go to match the pervasiveness of Nvidia’s CUDA. In the short term, AMD is hyper-focused on hyperscalers for its Instinct GPUs and large PC application providers for its Ryzen AI client processors. While there was no mention of a significant hyperscaler design win for the new MI300X at the event, there is evidence that hyperscalers are considering AMD’s GPUs as an alternative or supplement to Nvidia’s GPUs.
With a growing software stack and strategic partnerships, AMD hopes to break into GPU-based AI acceleration beyond its current HPC and supercomputer customer base, providing strong competition for Nvidia. As AMD continues to make advancements in AI and high-performance computing, it may not be too late for the company to establish itself as a leader in the AI market.