Tech giants form an industry group to help develop next-gen AI chip components

May 30, 2024 ndowd

Intel, Google, Microsoft, Meta and other tech heavyweights are establishing a new industry group, the Ultra Accelerator Link (UALink) Promoter Group, to guide the development of the components that link together AI accelerator chips in data centers.

Announced Thursday, the UALink Promoter Group — which also counts AMD (but not Arm just yet), Hewlett Packard Enterprise, Broadcom and Cisco among its members — is proposing a new industry standard to connect the AI accelerator chips found within a growing number of servers. Broadly defined, AI accelerators are chips ranging from GPUs to custom-designed solutions to speed up the training, fine-tuning and running of AI models.

“The industry needs an open standard that can be moved forward very quickly, in an open [format] that allows multiple companies to add value to the overall ecosystem,” Forrest Norrod, AMD’s GM of data center solutions, told reporters in a briefing Wednesday. “The industry needs a standard that allows innovation to proceed at a rapid clip unfettered by any single company.”

Version one of the proposed standard, UALink 1.0, will connect up to 1,024 AI accelerators (GPUs only) across a single computing “pod.” The group defines a pod as one or several racks of servers. UALink 1.0, based on “open standards” including AMD’s Infinity Fabric, will allow for direct loads and stores between the memory attached to AI accelerators, and generally boost speed while lowering data transfer latency compared to existing interconnect specs, according to the UALink Promoter Group.

The group says it’ll create a consortium, the UALink Consortium, in Q3 to oversee development of the UALink spec going forward. UALink 1.0 will be made available around the same time to companies that join the consortium, with a higher-bandwidth updated spec, UALink 1.1, set to arrive in Q4 2024.

The first UALink products will launch “in the next couple of years,” Norrod said.

Glaringly absent from the list of the group’s members is Nvidia, which is by far the largest producer of AI accelerators with an estimated 80% to 95% of the market. Nvidia declined to comment for this story. But it’s not tough to see why the chipmaker isn’t enthusiastically throwing its weight behind UALink.

For one, Nvidia offers its own proprietary interconnect tech for linking GPUs within a data center server. The company is probably none too keen to support a spec based on rival technologies.

Then there’s the fact that Nvidia’s operating from a position of enormous strength and influence.

In Nvidia’s most recent fiscal quarter (Q1 2025), the company’s data center sales, which include sales of its AI chips, rose more than 400% from the year-ago quarter. If Nvidia continues on its current trajectory, it’s set to surpass Apple as the world’s second-most valuable firm sometime this year.

So, simply put, Nvidia doesn’t have to play ball if it doesn’t want to.

As for Amazon Web Services (AWS), the lone public cloud giant not contributing to UALink, it might be in a “wait and see” mode as it chips (no pun intended) away at its various in-house accelerator hardware efforts. It could also be that AWS, with a stranglehold on the cloud services market, doesn’t see much of a strategic point in opposing Nvidia, which supplies many of the GPUs it serves to customers.

AWS didn’t respond to TechCrunch’s request for comment.

Indeed, the biggest beneficiaries of UALink, besides AMD and Intel, seem to be Microsoft, Meta and Google, which combined have spent billions of dollars on Nvidia GPUs to power their clouds and train their ever-growing AI models. All are looking to wean themselves off a vendor they see as worrisomely dominant in the AI hardware ecosystem.

In a recent report, Gartner estimates that the value of AI accelerators used in servers will total $21 billion this year, increasing to $33 billion by 2028. Gartner also projects that revenue from AI chips will hit $33.4 billion by 2025.
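For a sense of scale, the two endpoint figures in the Gartner estimate imply a growth rate that can be worked out with a few lines of arithmetic. The sketch below is illustrative only: the helper name `implied_cagr` is hypothetical, and it assumes "this year" means 2024 (the article's publication year) and that growth compounds evenly across the four years to 2028, which Gartner's report may not assume.

```python
def implied_cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two endpoint values."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the Gartner estimate cited above, in billions of dollars.
# Assumption: "this year" is 2024, so the span to 2028 is four years.
rate = implied_cagr(21.0, 33.0, 2028 - 2024)
print(f"Implied annual growth: {rate:.1%}")
```

Under those assumptions the implied rate comes out to roughly 12% a year, a steady climb rather than a doubling.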

Google has custom chips for training and running AI models: TPUs and Axion. Amazon has several AI chip families under its belt. Microsoft last year jumped into the fray with Maia and Cobalt. And Meta is refining its own lineup of accelerators.

Elsewhere, Microsoft and its close collaborator, OpenAI, reportedly plan to spend at least $100 billion on a supercomputer for training AI models that’ll be outfitted with future versions of Cobalt and Maia chips. Those chips will need something to link them — and perhaps it’ll be UALink.
