Why it matters: Microsoft had been rumored to be working on custom silicon for its data center needs for years. As it turns out, the rumors were true, and this week the company unveiled not one but two custom-designed processors. The new chips will be integrated into Azure server farms starting in early 2024, where they will serve as the workhorses of AI services like Microsoft Copilot.
This week, Microsoft announced it has built two "homegrown" chips that can handle AI and general computing workloads in the Azure cloud. The announcement was made at the Ignite 2023 conference and confirms earlier rumors about the existence of "Project Athena" – a custom-designed AI chip that could reduce Microsoft's reliance on off-the-shelf hardware from vendors like Nvidia, particularly for artificial intelligence training and inference.
The first chip is called the Microsoft Azure Maia 100 AI Accelerator and is the direct result of Project Athena. As its extended name suggests, the Redmond giant designed the chip specifically for running large language models such as GPT-3.5 Turbo and GPT-4. Built on TSMC's 5nm process and featuring no fewer than 105 billion transistors, the new chip supports various MX data types, including sub-8-bit formats for faster model training and inference.
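To make the MX reference concrete: the Open Compute Project's Microscaling (MX) formats store narrow elements (FP8, FP6, FP4, or INT8) in blocks of 32 values that share a single power-of-two scale. The NumPy sketch below fake-quantizes a tensor in that block-scaled style; the uniform element grid is a simplification for illustration and says nothing about how Maia actually implements MX in hardware.

```python
# A minimal sketch of MX-style block quantization: each block of 32 values
# shares one power-of-two scale, and elements land on a narrow grid standing
# in for a sub-8-bit format. Illustration only, not Maia's implementation.
import numpy as np

BLOCK = 32       # the MX spec groups elements into blocks of 32
HALF = 2 ** 3    # pretend 4-bit elements: 16 levels, i.e. values in [-8, 7]

def mx_quantize(x: np.ndarray) -> np.ndarray:
    """Fake-quantize a 1-D tensor using per-block shared power-of-two scales."""
    out = np.empty_like(x, dtype=np.float32)
    for start in range(0, x.size, BLOCK):
        block = x[start:start + BLOCK].astype(np.float32)
        max_abs = float(np.max(np.abs(block))) or 1.0
        # Shared scale: the power of two that brings the block's largest
        # magnitude just inside the element grid.
        scale = 2.0 ** np.ceil(np.log2(max_abs))
        q = np.clip(np.round(block / scale * HALF), -HALF, HALF - 1)
        out[start:start + BLOCK] = q / HALF * scale
    return out

weights = np.random.randn(128).astype(np.float32)
approx = mx_quantize(weights)
print("mean abs error:", float(np.mean(np.abs(weights - approx))))
```

The appeal of the scheme is that each value needs only a few bits plus a small amortized share of the block's scale, which is what makes sub-8-bit training and inference faster and cheaper in memory traffic.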
For reference, Nvidia's H100 GPU has 80 billion transistors, and AMD's Instinct MI300X has 153 billion. That said, we have yet to see any direct performance comparisons between the Maia 100 AI Accelerator and the chips currently used by most companies building AI services. What we do know is that each Maia 100 compute unit has an aggregate bandwidth of 4.8 terabits per second, thanks to a custom Ethernet-based network protocol that allows for better scaling and end-to-end performance.
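For a sense of scale, a quick unit conversion (plain arithmetic, not a measured figure) shows what 4.8 terabits per second works out to:

```python
# Back-of-the-envelope conversion of the 4.8 Tb/s figure; these are
# arithmetic identities, not benchmark results.
tbits_per_s = 4.8                            # aggregate bandwidth per Maia 100 unit
gbytes_per_s = tbits_per_s * 1e12 / 8 / 1e9  # terabits/s -> gigabytes/s
links_400gbe = tbits_per_s * 1000 / 400      # equivalent number of 400 GbE links

print(f"{tbits_per_s} Tb/s = {gbytes_per_s:.0f} GB/s "
      f"(~{links_400gbe:.0f} x 400 GbE links)")   # 600 GB/s, ~12 links
```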
Also read: Goodbye to Graphics: How GPUs Came to Dominate AI and Compute
It is also worth noting that Microsoft developed the Maia 100 using extensive feedback from OpenAI, with the two companies working together to refine the architecture and test GPT models. For Microsoft, the collaboration helps optimize the efficiency of Azure's end-to-end AI architecture, while OpenAI gets to train new AI models that are better and cheaper than what is available today.
The second chip introduced by Microsoft at Ignite is the Cobalt 100 CPU, a 64-bit, 128-core Arm processor built on the Arm Neoverse Compute Subsystems. It brings performance improvements of up to 40 percent over the current-generation hardware found in commercial Arm-based servers for general Azure computing workloads. Cobalt 100-based servers will be used to power services like Microsoft Teams and Windows 365, among other things.
Rani Borkar, the head of Azure infrastructure systems at Microsoft, says the company's homegrown chip efforts build on two decades of experience co-engineering silicon for Xbox and Surface. The new Cobalt 100 CPU lets the company tune performance and power consumption on a per-core basis, making it possible to build a cheaper cloud hardware stack.
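Microsoft has not detailed how Cobalt 100 exposes those per-core controls. As a rough analogy only, the sketch below uses the generic Linux cpufreq sysfs interface, which provides the same style of per-core performance/power knob on commodity servers; nothing here is specific to Cobalt.

```python
# Reading (and, commented out, capping) per-core frequency via the standard
# Linux cpufreq sysfs files - an analogy for per-core tuning, not Cobalt's
# actual control interface.
from pathlib import Path

CPUFREQ = Path("/sys/devices/system/cpu")

def core_khz(cpu: int, attr: str) -> int:
    """Read one cpufreq attribute (in kHz) for a given core."""
    return int((CPUFREQ / f"cpu{cpu}/cpufreq/{attr}").read_text())

for cpu in range(4):  # inspect the first few cores
    cur = core_khz(cpu, "scaling_cur_freq")
    cap = core_khz(cpu, "scaling_max_freq")
    print(f"cpu{cpu}: {cur / 1e6:.2f} GHz (cap {cap / 1e6:.2f} GHz)")

# Capping a single core to 1.5 GHz (root required) would look like:
# (CPUFREQ / "cpu3/cpufreq/scaling_max_freq").write_text("1500000")
```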
The cost side of the equation is particularly important. In the case of the Maia 100 AI Accelerator, Microsoft had to come up with a new liquid cooling solution and a new rack design that provides more room for power and networking cables. Even so, the company says the cost of using the new chip is still significantly lower than relying on specialized hardware from Nvidia or AMD.
Microsoft seems determined to make a Copilot "for everyone and everything you do," and that's reflected in the launch of Copilot for Windows, GitHub, Dynamics 365, Microsoft Security, and Microsoft 365. The company just rebranded Bing Chat as "Microsoft Copilot," so it's clear it wants to bolt ever more advanced AI models onto every service it offers moving forward.
AI training and inference get expensive fast – running an AI service is estimated to cost up to ten times more than something like a traditional search engine. Making custom silicon could also alleviate supply issues and give Microsoft a competitive edge in a crowded landscape of AI cloud providers. Amazon, Meta, and Google have their own homegrown silicon efforts for the same reasons, and companies like Ampere, which once dreamed of becoming the go-to supplier of Arm-based data center chips, will no doubt be forced to adapt to these developments if they want to survive.
That said, the Redmond company says it will keep using off-the-shelf hardware for the near future, including Nvidia's recently announced H200 Tensor Core GPU. Scott Guthrie, executive vice president of Microsoft's Cloud + AI Group, says this will help diversify the company's supply chain and give customers more infrastructure choices.