How SemiAnalysis Uses the Supply Chain to Get Inside Nvidia

Nvidia CEO Jensen Huang isn’t the only one who has gone from uber-nerd to tech luminary in the space of a few years. Dylan Patel may be less of a household name, but for those watching Huang’s moves closely, Patel’s firm SemiAnalysis has become a central source of intel.

OpenAI cofounder Sam Altman even called him “that semianalysis guy” in a post on X, formerly Twitter, last year. Patel responded with an edited image of Google CEO Sundar Pichai pouring milk into Altman’s mouth while holding him by the hair. Is there any greater sign of tech prowess than a high-profile online feud?

SemiAnalysis gets deep into the weeds on graphics processing units, from the base of the supply chain to the end-users building the AI models that require so many specialized chips. That exchange with Altman was just a year ago, but it might as well have been in a different lifetime.

In the last two years, SemiAnalysis has grown from a team of three analysts in two countries to a team of 12 subject matter experts in the US, Japan, Singapore, Taiwan, and France.

“We had some major companies that were customers, but now almost every major company in the AI infrastructure, server, and model space, from sub-components all the way through to clouds and hyperscalers, has become a customer, which is a big shift,” Patel told Business Insider.

The firm’s newsletter, launched in 2020, and its blog have become mandatory reading since they connect the semiconductor industry to the tech it enables in a way few do.

News that Nvidia’s next chip launch, Blackwell, would be delayed stirred up many questions about how AI hardware from the largest player in the game is rolling out. Patel spoke to BI about the impact of the delay, and how to observe Nvidia like a pro.

This Q&A has been edited for clarity and length.

BI: Where do you get your information?

The team consists of people from across the entire supply chain, geographically and technically. We have someone who previously worked as far down the chain as ASML (Advanced Semiconductor Materials Lithography), the largest equipment company, and then all the way upstream to former employees of companies such as Microsoft and Nvidia. So we have the entire view of the supply chain, from manufacturing up to models. Also, several people on the team come from hedge fund backgrounds.

I think the team goes to more than 50 technical conferences across the entire stack — not just the trade shows, which are not really as valuable, but more the technical conferences where people are presenting papers and advancements in the field.

When you establish all these research and consulting relationships, you end up with a large well of information. That is how we’re able to get disparate pieces of information that may be known and even common in one part of the supply chain but don’t spread far outside it. Engineering, especially, is very horizontal. Trying to draw the lines between all the different parts of the stack is very important.

BI: There was a recent report about Nvidia’s next-generation Blackwell chip being delayed. How common is a delay like this in the semi-world?

In semiconductor manufacturing, there is this loop of the design being sent to TSMC, which manufactures it. Once you have the stencils, it’s a multi-month process to produce chips. Then you send those wafers out to customers.

Customers test them to make sure everything works. More often than not, it doesn’t. Why? Take, for example, a modern iPhone chip. It has 25 billion transistors and 15 layers of metal. Twenty-five billion transistors means 25 billion on-off switches, and they’re perfectly timed and organized so that everything you do on your iPhone happens.

AI chips are even more complicated than that. It’s impossible to perfectly simulate everything, so you are always going to have issues. Then they’ll need to make a few modifications to the design and send it back. That has always happened. And the range of modifications can be major to minor.

Nvidia has always had an incredibly aggressive timeline. They’ve always been really good about shipping either the first or second revision of the chip, whereas other companies can go through 12 revisions. Even big companies like AMD (Advanced Micro Devices) tend to go through two, three, or four revisions.

The fact that Nvidia wanted to ramp so fast — Blackwell was supposed to be the first chip with $100 billion in sales — meant there were some minor delays. It’s just a few months at most.

But because so many people are affected by that delay, it shakes the entire world. In the past, the customers affected by this would know, and it wouldn’t shake up all the way to the top levels of every company.

BI: What are your favorite non-revenue indicators to watch at Nvidia?

One is the size and use of models.

Meta releases a new model, and it’s now 405 billion parameters, right? These new models continue to increase the demand for compute voraciously. Now that Meta has released this new model, everyone who uses these open-source models rather than developing their own switches to the bigger, newer model, and that creates a significant increase in demand.

On the flip side, all the way downstream, companies in Taiwan have to report their revenue monthly. You can see the exact ramp of everything as soon as you want to see it.

So there’s a lot of information all the way upstream in terms of the models and deployments, and all the way downstream to the various sub-component suppliers.

The other place I like to look is at random suppliers that are very small, in niche parts of the supply chain, or at suppliers of suppliers.

Maybe Nvidia’s suppliers are very hush-hush, but TSMC’s suppliers are very open about what TSMC is ordering, and you can estimate what’s going to happen to Nvidia’s production capacity or ability to sell servers from there.

BI: Assuming the initial Blackwell rollout will indeed be smaller than promised, what do you know about who gets to be first in line for the new chips? And why do they get to be first in line?

How Nvidia decides who gets to be first in line is a very complicated decision matrix — especially now with the investigation by the FTC.

There are all sorts of reports alleging that if you buy their networking equipment, they will push you to the front of the line.

So the question is: who are they going to prioritize? My sense is that customers who have no choice, or who are smaller and therefore need more support from Nvidia, would probably be de-prioritized relative to customers who are much larger and have options.

For example, Microsoft and Meta are their two largest customers. They also have internal chip programs, but those are not as successful as, say, Amazon’s and especially Google’s. But they also went out and bought AMD GPUs. If you’re Nvidia, it makes sense to prioritize your biggest customers because they’re the ones who are going to buy the most products. They have a backup plan of buying from your competitor, and they require the least support.

The calculus around allocations is difficult. But it would make sense that they prioritize their biggest customers, who also purchase from competitors and may purchase Nvidia’s networking equipment, whereas companies like Amazon don’t purchase that networking equipment.

You can go down the list and think about why you would provide hardware to one customer over another.

Got a tip or an insight to share? Contact Senior Reporter Emma Cosgrove at [email protected] or use the secure messaging app Signal: 443-333-9088
