With backers like Microsoft, d-Matrix is competing against Nvidia to become the next big thing in AI chips.
October 24, 2023
The hottest commodity in tech has been GPUs; Nvidia's (NVDA) dominance in the market has fetched it a trillion-dollar valuation. However, as generative AI matures, chip startup d-Matrix is inching closer to the spotlight.
While Nvidia’s powerful GPUs have been essential in training large AI models like ChatGPT or Meta's (META) Llama-2, they aren't designed for the next step of the revolution — inference. This stage is where trained AI models are optimized and deployed into applications — say, a chatbot — where the efficiency of the chips becomes increasingly important.
While AMD (AMD) and Intel (INTC) are racing to create their own GPUs, d-Matrix has been working on chips that support the deployment of generative pre-trained transformers (GPTs). In the near future, when the likes of Google (GOOG) and Amazon (AMZN) are done training their large language models and ready to integrate them into products like voice assistants, the demand for inference-focused chips may skyrocket.
In September, d-Matrix raised $110 million in a Series B round led by Temasek, Playground Global, and Microsoft (MSFT), which also invested in d-Matrix in an earlier round.
Yahoo Finance sat down with d-Matrix CEO and founder Sid Sheth to talk about how the company is planning to compete with Nvidia, the next chapter of the AI race, and when the chip shortage may finally come to an end.
The interview below has been edited for brevity and clarity.
I want to start really big picture. Why chips? Why AI chips? Are you crazy?
Well, chips definitely aren't for the faint-hearted.
Precisely why I asked.
Chips have become cool in the last, I'd say, five to seven years. Prior to that, it was super uncool to even think about starting a chip startup — and if you thought of starting one, you typically got laughed out of investors' offices, but that's changed.
There was this 15-year period between 2000 and 2015 where chips weren't very cool, and it was very challenging. The returns were not great. The promise of returns only came back in 2015 when AI emerged as a major application.
What makes an AI chip unique?
There are several factors that must align for a successful AI chip. I know you put a significant focus on the chip itself, but in reality, AI — let's refer to it as AI compute — plays a crucial role. What Nvidia and other companies are addressing is an AI computing problem. While chips are a substantial component of this AI computing challenge, we must not underestimate the importance of software.
In fact, this is the aspect that people are starting to truly appreciate — the AI computing problem is 50% chip and 50% software, making it a far more intricate problem. It's not just a matter of creating a chip and launching it in the market for people to use. You need to build a chip and develop the accompanying software that can effectively utilize that chip for success.
In the case of Nvidia, over 50% of the company's workforce focuses on software. The same goes for AMD, which invests significantly in software. When d-Matrix started in 2019, we recognized this early on and established a software team in parallel with the chip team.
The narrative on Wall Street is that Microsoft has been ahead of the curve on AI. Does that track with your experience of the company?
They had been using the term "generative transformers" almost a year before generative AI became a popular buzzword. When they entered the investment scene at the end of 2021, they emphasized the significance of not just regular transformers but generative transformers [GPTs].
As a result, we redoubled our efforts, specifically focusing on the generative aspect of transformers and building a platform for generative AI. A year later, this approach became a buzzword in the industry.
Microsoft's partnership with OpenAI and the nature of the workloads and applications they are striving to accelerate have indeed positioned them at the forefront of anticipating and addressing industry needs.
On the subject of these giant companies — Nvidia. What makes d-Matrix chips different from Nvidia's? Can you compete?
I've heard that question 100 times, but there's a reason why we pursued the path we did. At our founding in 2019, Nvidia was already a major player, the gorilla in the space. We knew Nvidia was already quite substantial and heavily focused on training.
When it comes to AI computing, there are two aspects: training, where models are trained, and inference, where models are effectively deployed. In the past year, a lot of the narrative has shifted toward inference. Everyone knows how to train models, and the models have become highly intelligent, but the challenge is cost-effectively deploying those models, and that's where inference comes into play. d-Matrix focuses solely on inference for large language models.
Nvidia designs high-performance chips that excel in training, but these same high-performance GPUs may not be the best fit for inference. Inference is all about efficiency, cost, latency, and the economic aspects of deployment.
We are entering a field that will be much larger than the original application Nvidia targeted, and this market will be shared among multiple players. It won't be a one-company-dominates-all dynamic.
Imagine you're at a cocktail party. How do you explain what inference means in the context of AI?
I love that question. So, let's consider ourselves as humans, right? What do we do when we are born, and how does our learning process evolve? The first 20 years of our lives are spent in school, attending college — learning, learning, and more learning. We absorb a vast amount of knowledge that's already available, and that's the training phase.
Then, what do [people] do for the next 40 years? They take all that knowledge and put it into practice. They monetize it by either taking on a job or running a business, effectively applying the knowledge acquired in the first 20 years. This phase is called "inference." The inference phase is much larger than the initial 20 years for us humans.
We focus on this "inference" part, and AI models operate in a similar way. You invest a certain amount of time training a model with existing data. Once the model is trained, it's deployed for many months or even years, and during this time, people use the model to make decisions and monetize their data. That's the area we concentrate on.
When does the AI chip shortage end, and how?
In my opinion, this trend is here to stay for at least the next 12 months.
We will have to wait and see how the end market unfolds. There's a significant emphasis on monetizing AI. We've discussed how people know how to train AI models, but the focus is now shifting toward generating revenue from AI.
The scenario I anticipate is that people will reach a point where they'll say, "I've trained enough models, but I need to figure out how to profit from what I already have." When we reach that stage, which I estimate to be around 12 months away, maybe a bit longer, people will pause and reconsider buying more AI chips. Instead, they will focus on finding ways to monetize their existing AI resources.
That's the moment we're patiently waiting for. When it arrives, that's when d-Matrix will likely see increased demand, as people will be eager to explore how to use these solutions to make the most of what they already possess and generate profit.