In the escalating race for ‘AI reasoning’ supremacy, NVIDIA has taken another stride on the supply side with the general availability of its GB200 NVL72-powered instances in the cloud via CoreWeave.
These high-octane cloud instances aim to redefine AI reasoning by leveraging massive computational muscle, vital for processing the intricate, multi-step “reasoning” workflows that such models require.
The emphasis here? Speed and cost-efficiency: CoreWeave says the instances achieve up to 30 times faster processing for large AI language models than previous-generation hardware.
Brian Venturo, CoreWeave’s Chief Strategy Officer, hails the partnership as a “force multiplier for businesses,” one that spearheads innovation while maintaining operational efficiency.
This comes amid a contest between AI giants OpenAI and DeepSeek, both deeply engrossed in advancing reasoning models.
OpenAI made waves with its o1 reasoning model, only to be outdone by DeepSeek’s R1 on both cost and benchmarks.
As giants like Nvidia and popular AI startups like Perplexity rushed to integrate the Chinese open-source model into their services, OpenAI shot back last week with its o3-mini line of models, which raised the bar for reasoning models on benchmarks and, this time, on speed as well.
Reasoning models are power-hungry because a great deal happens in the background: producing a single answer typically involves multiple inference passes, and the number of tokens consumed is significantly higher than with regular LLMs.
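To make that token gap concrete, here is a minimal back-of-envelope sketch in Python. The token counts and the per-token price are purely hypothetical placeholders chosen for illustration, not published figures for any particular model or provider.

```python
# Back-of-envelope comparison of token usage: a regular LLM reply
# versus a reasoning model that also emits hidden "thinking" tokens.
# All numbers below are hypothetical placeholders, not vendor figures.

PRICE_PER_1K_OUTPUT_TOKENS = 0.01  # hypothetical price in USD

def completion_cost(visible_tokens: int, hidden_reasoning_tokens: int = 0) -> float:
    """Cost of one response; reasoning tokens are billed like output tokens."""
    total_tokens = visible_tokens + hidden_reasoning_tokens
    return total_tokens / 1000 * PRICE_PER_1K_OUTPUT_TOKENS

# A regular LLM answers directly with a few hundred visible tokens.
regular = completion_cost(visible_tokens=300)

# A reasoning model may generate thousands of intermediate tokens first.
reasoning = completion_cost(visible_tokens=300, hidden_reasoning_tokens=4000)

print(f"regular:   ${regular:.4f}")
print(f"reasoning: ${reasoning:.4f}  ({reasoning / regular:.1f}x the cost)")
```

Under these assumed numbers, the reasoning response costs over 14 times as much as the regular one for the same visible answer, which is why serving such models efficiently depends so heavily on the underlying hardware.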
And with the rush to still keep responses quick (OpenAI’s o3-mini is usually remarkably fast), there comes massive demand for a combination of high-speed communication, memory, and compute, Nvidia notes; serving exactly that combination is what the GB200 NVL72 is designed to do.
Now, with CoreWeave, it comes to the cloud for the first time, a particular boon for independent developers working with open-source reasoning models.