
Not everyone is impressed with Elon Musk’s new AI supercomputer

Elon Musk’s AI company xAI recently unveiled a new supercomputer called Colossus. And as the name suggests, it’s big.

The computer is an artificial intelligence training system that Musk says runs on 100,000 Nvidia H100 chips, the powerful graphics processing units that have become critical to the AI race.

To put this into perspective, Meta’s Llama 3 large language model was trained on 16,000 H100 chips. Meta said in March that it would continue to invest in its AI infrastructure by adding two new clusters of 24,000 chips.

In other words, Musk’s Colossus is powerful. And it might help him catch up with the AI industry leaders.

But some prominent tech leaders aren’t so sure.

LinkedIn co-founder Reid Hoffman told The Information, a tech publication, that the xAI supercomputer is just “table stakes” in the competitive field of generative AI.

According to The Information, Hoffman meant that Colossus only allowed xAI to catch up to other more advanced AI companies such as OpenAI and Anthropic.

Modular AI CEO Chris Lattner said during a talk at The Information’s AI Summit last week that Musk’s heavy reliance on expensive, finite Nvidia chips is also at odds with the billionaire’s effort to build his own GPU, called Dojo, The Information reported.

Meta, Microsoft, Alphabet and Amazon are developing their own AI chips, even as they continue to stock up on Nvidia GPUs.

“The difference is that Elon has been working on Dojo for many years,” Lattner told Business Insider in an email.

Musk expressed concern about the challenges of acquiring more of Nvidia’s highly sought-after chips and said his Dojo project would help reduce his company’s reliance on the chipmaker.

“We see a way to be competitive with Nvidia with Dojo,” Musk said during a Tesla earnings call in July. “We kind of have no choice.”

When he spoke about Colossus on X in early September, Musk said he aimed to double the size of the supercomputer to 200,000 chips within months.

He said the cluster was built in just 122 days — an impressive feat that no other company has matched, according to The Information.

It’s unclear whether Colossus runs 100,000 GPUs at the same time, which would require sophisticated networking technology and a lot of power.

“Musk previously said the 100,000-chip cluster was up and running at the end of June,” The Information reported. “But at the time, a local electric utility said publicly that xAI only had access to a few megawatts of power from the local grid.”

Last month, CNBC reported that an environmental advocacy group complained that xAI was running gas turbines to produce more power for the data center without permission.

The outlet reported that the Southern Environmental Law Center wrote in a letter to the local health department that xAI has installed and operates at least 18 unpermitted turbines “with more potentially on the way” to supplement its massive energy needs.

The local utility, Memphis Light, Gas and Water, told CNBC that it has provided 50 megawatts of power to xAI since early August, but that the facility needs another 100 megawatts to operate.

Data center developers told The Information that this could power only a few thousand GPUs. Musk’s company would need another electrical substation to get enough power to run 100,000 chips.

Hoffman and Musk did not immediately respond to requests for comment from Business Insider.
