DeepSeek development cost probably 100 times the sticker price: fundie

Published February 5, 2025

By Joanne Tran

DeepSeek, the Chinese artificial intelligence model that stunned Silicon Valley and Wall Street, may have cost about $1 billion to develop rather than the $US5.6 million ($9 million) price tag it has popularised, says a leading Australian technology investor.

DeepSeek has had such a profound impact since it launched its reasoning model R1 because it claims the bot was produced on a $US5.6 million training budget. OpenAI’s ChatGPT cost at least 100 times that.

The cost of training – which refers to feeding the model with huge data sets so it can learn and hone its accuracy – has taken on outsized importance as the world comes to grips with DeepSeek, says Armina Rosenberg, co-founder of AI-enabled hedge fund Minotaur Capital Management, and a former stockpicker at Mike Cannon-Brookes’ Grok Ventures.

“This focus on raw training costs is misleading. While DeepSeek’s technical innovations in training efficiency are impressive, the total investment required to develop a competitive model goes far beyond just the direct compute costs,” she told AFR Weekend.

“DeepSeek allegedly had access to 50,000 high-end [graphics processing units] (worth around $1 billion) and significant engineering resources to achieve their results. Their efficiency gains are noteworthy but don’t fundamentally change the substantial investment needed to compete at the frontier of AI development.”
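The arithmetic behind the comparison can be sketched from the figures quoted above. This is a rough illustration only: it assumes the $1 billion hardware estimate is in US dollars, which the quote does not specify, and the function and variable names are hypothetical. Under those assumptions the hardware estimate is well above 100 times the claimed training budget, so the headline multiple reads as a round figure.

```python
# Figures quoted in the article. Treating the $1 billion estimate as
# US dollars is an assumption; the quote does not state the currency.
CLAIMED_TRAINING_COST_USD = 5.6e6   # DeepSeek's stated training budget
ESTIMATED_GPU_FLEET_USD = 1.0e9     # Ms Rosenberg's estimate of the GPU fleet's value
ESTIMATED_GPU_COUNT = 50_000        # GPUs DeepSeek allegedly had access to


def cost_multiple(estimate: float, claimed: float) -> float:
    """Return how many times larger the estimate is than the claimed figure."""
    return estimate / claimed


def implied_price_per_gpu(fleet_value: float, gpu_count: int) -> float:
    """Return the average price per GPU implied by a fleet valuation."""
    return fleet_value / gpu_count


multiple = cost_multiple(ESTIMATED_GPU_FLEET_USD, CLAIMED_TRAINING_COST_USD)
per_gpu = implied_price_per_gpu(ESTIMATED_GPU_FLEET_USD, ESTIMATED_GPU_COUNT)

print(f"Hardware estimate is roughly {multiple:.0f}x the claimed training budget")
print(f"Implied average price per GPU: ${per_gpu:,.0f}")
```

The two numbers measure different things: the $US5.6 million figure is a claimed training budget, while the $1 billion figure values the hardware DeepSeek allegedly had access to.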

Ms Rosenberg referred to a blog post by the chief executive of Anthropic, Dario Amodei, who said this week that even the most sophisticated labs were incurring “a few $10M’s to train” their models, including Anthropic’s own Claude 3.5 Sonnet.

DeepSeek claimed it used 2048 Nvidia H800s, a type of chip that has since been superseded by newer-generation variants. But some tech analysts posting on social media platform X say it had access to a far greater war chest, an assertion also repeated by Mr Amodei.

The distillation process

Plato Investment Management portfolio manager David Allen noted that new bots were capitalising on the advances of their predecessors through a process known as distillation. This inevitably makes them leaner, but it has also given rise to accusations that DeepSeek unlawfully used OpenAI data.

“DeepSeek was not developed in isolation from other [large language models],” Dr Allen said. “I guess it’s becoming increasingly clear that open-source models are likely to win the arms race in the long run, though. There doesn’t appear to be much franchise value in building generalised LLMs.”

Verifying information has been one of the challenges for investors in coming to grips with DeepSeek. For clues, Ms Rosenberg’s firm canvasses its own testing and benchmarking, technical papers, discussions with contacts in the start-up world, and resources in the developer community such as GitHub and Discord.

“It’s crucial to look beyond headline claims and examine underlying capabilities, resource requirements, and market impacts,” she said. “It is essential to continuously validate information against observed results and industry developments given AI’s rapid pace of progress.”

DeepSeek self-published the details of its V3 model in December 2024, which WAM Global portfolio manager Nick Healy said discussed many innovations pertinent to R1.

He said the inference cost was a more important metric in the AI realm, referring to how much it costs to prompt a model to produce an answer.

“On this front, we have verifiable data that the cost to inference [for R1] is at least an order of magnitude lower than other leading models – this is important given [Nvidia boss] Jensen Huang has been so clear that the future growth of AI demand lies predominantly in inference, not training,” Mr Healy said.

James Rodda at Antipodes Partners was galvanised by how quickly DeepSeek was able to shake up the global AI pecking order, which suggests today’s winners don’t have a lock on the profits of the future.

“In the public cloud sector, if OpenAI succeeds, it benefits Microsoft, which has exclusive distribution rights of OpenAI models until at least 2030,” Mr Rodda said. “Conversely, if an open-source model like Meta’s Llama or DeepSeek wins, it could neutralise the market for Amazon, which lacks a similarly performant proprietary AI model.”

Ms Rosenberg said the cost-reduction curve was an accepted fact and this element of DeepSeek’s origin story checked out. She concurred that “AI progress isn’t monopolised by just a handful of frontier labs”.

“We are seeing a pattern in AI where each new generation becomes both more capable and more cost-efficient,” she said. “DeepSeek’s claim lines up with how quickly costs have been declining, thanks to better architectures, more efficient training algorithms, and strategic hardware use, but I would say it’s an incremental improvement.”

Licensed by Copyright Agency. You must not copy this work without permission.
