Is the AI Boom Real?
For the past three weeks I have been traveling through Japan and the United States.
This trip has been part sightseeing and part learning tour, with the goal of better understanding the AI and AI-chip boom landscape outside of Taiwan.
Now that I am finished with the big conversations, I wanted to sit down and share a few thoughts.
The Trillion
There’s been a lot of news about a chip venture thing that Sam Altman of OpenAI is working on for the AI industry.
The news came slowly. First there was the report - released at around the time he was temporarily ousted from OpenAI - that he was raising money from Middle East investors for some chip venture. Back then it was just "billions".
Since then, there have been reports that Sam Altman and his team have talked to TSMC, investors in Abu Dhabi, and the US government.
Just as I was leaving Tokyo for San Francisco, the WSJ published the report with that eye-catching $1-7 trillion figure that has everyone buzzing. It was one of the first things I asked people about when I arrived in Silicon Valley.
Many people agreed with me that the number is a bit much. Perhaps it is a negotiating tactic to set expectations for future talks: once you start talking trillions, hundreds of billions no longer feel as substantial.
The Information then added a bit of context to Altman's remarks:
> But in reality, Altman privately has told people that figure represents the sum total of investments that participants ... would need to make, in everything from real estate and power for data centers to the manufacturing of the chips, over some period of years.
The ecosystem concept makes a lot more sense. Total semiconductor sales in 2023 were about $520 billion. Total capital expenditures - according to the trade group SEMI - were about $140 billion.
But even if we assume that this trillion is spread out over five years and diluted with real estate and power costs, it still implies another COVID-like step function upwards in capital expenditure.
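To put a rough number on that, here is a back-of-the-envelope sketch. The even five-year spread and the use of the $1 trillion low end of the reported range are my own assumptions, not anything Altman has confirmed, and the comparison ignores that some of the money would go to real estate and power rather than chipmaking:

```python
# Back-of-the-envelope: an evenly spread $1T over five years versus
# the ~$140B the industry spent on capex in 2023 (the SEMI figure above).
TOTAL_COMMITMENT_B = 1_000   # assumed low end of the reported range, in $B
YEARS = 5                    # assumed spread; no official timeline exists
CAPEX_2023_B = 140           # SEMI's 2023 industry capex estimate, in $B

implied_annual_b = TOTAL_COMMITMENT_B / YEARS
uplift = implied_annual_b / CAPEX_2023_B

print(f"Implied annual spend: ${implied_annual_b:.0f}B")  # ~$200B per year
print(f"Versus 2023 capex:    {uplift:.1f}x")             # ~1.4x, before any dilution
```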
The semiconductor industry is a conservative one. Many of the people there are old-heads who have seen many a cycle of booms and busts downstream. Sam isn’t really the first guy to knock on a foundry’s door saying that his use case is going to change the world.
Scaling Laws
The concept driving this investment is something called the "scaling laws".
The name dates to a 2020 paper from OpenAI titled "Scaling Laws for Neural Language Models". I am not going to go over the fine details, but the gist is that as we scale up model size, data, and compute, we get better results (i.e. lower loss).
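For the curious, the headline result of that paper is a set of empirical power laws: test loss falls predictably as you scale up parameters, data, or compute. In the paper's notation (the constants and exponents are fitted values from their experiments, not universal truths), the fits look roughly like this:

```latex
% Empirical power-law fits from "Scaling Laws for Neural Language Models" (Kaplan et al., 2020)
% L = test loss, N = model parameters, D = dataset size in tokens, C = training compute
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
```

In other words, each order-of-magnitude increase in scale buys a predictable drop in loss, which is exactly the kind of simple, quantifiable promise that can justify ever-larger capital outlays.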
This was one of the core ideas that helped OpenAI make the GPT-series a reality. Ilya Sutskever, one of the company's cofounders, mentioned something like this in an appearance on the No Priors podcast back in November:
> I was very fortunate in that I was able to realize that the reason neural networks of the time weren't good is because they are too small. So like if you tried to solve a vision task with a neural network with a thousand neurons, what can it do? It can't do anything.
> It doesn't matter how good your learning is and anything else. But if you have a much larger network then it can do something unprecedented.
GPT-3 was big. GPT-4 is even bigger and performs far better. And GPT-5, whenever it comes out, will be bigger still. So far there are no indications that the scaling laws have broken down.
I find the parallels with the semiconductor industry's Moore's Law very interesting. The two "laws" do not say the same thing, but they might have similar effects on their respective industries.
In the 1980s and 1990s, Moore's Law became the rallying cry of the entire semiconductor industry - the metronome keeping time for every company from Santa Clara to Tokyo to Hsinchu.
There is a chance that the Scaling Laws can have a similar impact on the AI industry: a simple, easily understood rallying cry that drives R&D roadmaps for years to come.
There are a variety of arguments against the Scaling Laws. For instance, people have commented that we have basically exhausted the existing data across the entirety of the Internet.
But there are ways around such things, if the money is there to make it happen. Physics problems have repeatedly cropped up in the path of Moore's Law, and the industry developed new engineering solutions around them: the High-K Metal Gate, FinFET, DUV Lithography, Dry Etch, Ion Implantation, SEM, and so on.
The bigger question is whether the money is there to continue driving this investment. More on that later.
For a much deeper review of the technical issues behind scaling, I recommend Dwarkesh Patel's post "Will scaling work?" His podcast is quite excellent too.
AI Chips & Nvidia
Where there are profits, there are competitors coming out with ideas to take them.
There has been a lot of ink spilled on Nvidia and their competition - particularly the multi-pronged attack on Nvidia's AI accelerator profits. I don't want to add too much to this.
I do question whether the Nvidia fortress is as assailable as it might at first seem. Jensen is responding to the threat by aggressively moving their accelerator lineup to an annual update cadence.
For more information on this, I refer you to the report from SemiAnalysis. It goes into much more detail. Dylan's great.
But this relentless pace reminds me of old Nvidia during the Graphics Card Wars, when the company released a new product to the market every six months. It worked for them then. Why not try it again now for the AI Chip Wars?
Nvidia can maintain this rollout speed because - as Jensen implied during his appearance on the Acquired podcast - they ship before they test.
They use the latest chip design and emulation software to model new GPU designs and ship them to the market without first fabbing a physical chip:
> The reason why we needed that emulator is because if you figure out how much money that we have, if we taped out a chip and we got it back from the fab and we started working on our software, by the time that we found all the bugs because we did the software, then we taped out the chip again. We would’ve been out of business already.
The Nvidia teams will try to tape out "perfect chips", as Huang said, but this will inevitably cause problems for customers - for instance, early deployments where drivers and other affiliated software do not work all that well at first.
But the Nvidia brand can take a hit like that, whereas it is unlikely that the AI chip startups will be able to. It takes several years to build a team and then get that team good enough to ship a working product.
This high speed of iteration will be rough on customers. Many have no choice but to buy what is available, but it stings to spend tens of thousands of dollars on a machine only to have a vastly better one come out a short time later.
Giants and Verticalization
Nvidia is less concerned about the startups and even the established silicon players than they are about the tech giants.
The tech giants - Microsoft, Google, and the like - are the ones driving the current AI investment spend. They are also the ones with the most incentive to cut Nvidia, and its margins, out of their stack.
They would do this using custom-designed chips, or ASICs. Large companies have sought these since the early days of computing - commissioning an ASIC to replace an entire board of discrete electronic components to save money.
An example of this would be the Apple IIe, a third-generation version of the Apple II, which used custom ASICs to consolidate much of the discrete logic found on earlier boards.
A more modern example would be the Google TPU, which right now is in its fifth iteration.
It makes up a big part of Google's compute advantage over OpenAI and the other AI labs.
Microsoft seems to be the giant pushing hardest on vertical integration. CEO Satya Nadella said on the Q2 2024 earnings call that most of the usage of Azure AI services comes from inference rather than training. Inference workloads are steadier and more predictable than training, which makes them easier to push onto a custom-designed ASIC.
In addition, Microsoft has been working on vertically integrating other parts of the data center stack. For instance, the Information recently reported that they are working on a network card to shuttle data between servers.
This type of vertical integration implies either that the scale of the AI data center stack is so large and costly that every penny counts (or will count), or that growth in the industry has petered out, leaving players to compete on price. Both scenarios feel strange to consider.
What does this vertical integration trend mean for Nvidia? It will take some time for the other tech giants to ramp up their AI chip designs - apparently the first Google TPU was a very bare-bones product. So in the short term, things will stay as they are.
But as those chips get better over the medium to long term, it makes sense for Nvidia to push as hard as it can to keep winning the performance-per-cost crown. Who knows.
Financially Sustainable?
I want to move away from costs. The cost benefits of vertical integration are significant enough to matter, but now I want to turn to the other side of the financial equation: revenue.
One of the really big questions I wanted to answer coming to the United States was whether or not this AI boom is Real. Real, with a capital R.
There has been so much progress in the AI industry in the past year - for instance, OpenAI's text-to-video model Sora. The improvement in the models' output quality has been impressive.
But even if the technology is amazing, it is not a hit product by itself. These models need to be embedded into compelling products. Are there indications of such products breaking through to the public? Who is making money from actual consumers with them?
The massive investments required by Moore's Law and the semiconductor industry were driven by real demand from various downstream industries. First the military, then consumer electronics like radios and calculators, then the PC, smartphones, and then cloud computing.
These were all things that ordinary people wanted. To me, it feels unlikely that the truly large investments in AI will happen unless those ordinary consumers start buying these services in a major way.
The Financial Times did report that ChatGPT hit a $2 billion annualized revenue run-rate in December 2023. That is basically the only solid piece of news indicating that people are paying real money for an LLM service.
Now, we should note that it is still early in the generative AI boom. ChatGPT is a little over a year old.
But I also want to note that the iPhone was first released in June 2007.
About two years later, in 2009, Apple sold 20 million iPhones for roughly $13 billion in revenue. Growth continued, with 40 million iPhones sold the year after that, then 72 million, then 125 million.
Okay fine, the iPhone is the greatest consumer technology product that history has ever seen. It is a difficult example to live up to.
ChatGPT is still one of the fastest growing consumer tech products in history, depending on what you think about Threads. And people are still using it a lot - Altman recently said that ChatGPT generates 100 billion words a day.
And Facebook took a while to produce revenue, at first investing to build up its audience. Perhaps ChatGPT is taking the same Aggregator approach, gathering users to monetize later. But few products cost as much to operate. AI probably has a shorter leash than we think.
Other than ChatGPT, the one product that everyone in the industry seems to have their eye on is the Microsoft Copilot subscription service. We want to see if this $20 a month offering catches on with people and enterprises.
If it does, then we are really off to the races ... the sky really is the limit ... we are so back ... and so on, insert bombastic metaphor here. If not, then we have to adjust some mental models.
Conclusion
There is one last thing that I should mention. Perhaps the LLM revolution will be most Real where it is embedded into the products we already know today.
For instance, a few people I spoke to insisted that I am overlooking the impact that AI automation will have in supercharging advertising sales.
Take Google's Performance Max ads, which use AI to automate the creation, deployment, and targeting of new ads. Products like Performance Max apparently generate tens of billions of dollars.
Or consider when Apple restricted app tracking in 2021 with its App Tracking Transparency feature. Meta/Facebook initially said that this would cost the company over $10 billion in sales.
But over time, Meta somehow managed to claw back its targeting accuracy using AI, alongside data collected through its Conversions API.
So maybe the killer app for AI is just more ads.
But Ben Thompson and others have been saying for a while that AI is an enabling technology - one that makes the rich richer, rather than minting a whole new class of rich people the way the PC or electrification did.
So for me, if this is all that it is, then it is a bit disappointing. But it would nevertheless make the AI boom real.
Anyway, these are my reflections from my trip to the United States. I hope to make more frequent trips in the future and have more interesting conversations with people in the space.
If you are interested in having a chat, shoot me an email. Maybe I will be coming to town again and can have that chat in person.