by Ashutosh Jogalekar
In October last year, Charles Oppenheimer and I wrote a piece for Fast Company arguing that the only way to prevent an AI arms race is to open up the system. Drawing on a revolutionary early Cold War proposal for containing the spread of nuclear weapons, the Acheson-Lilienthal report, we argued that the foundational reason security cannot be obtained through secrecy is that science and technology hold no real “secrets” that cannot be discovered if smart scientists and technologists are given enough time to find them. That was certainly the case with the atomic bomb. Even as American politicians and generals boasted that the United States would maintain nuclear supremacy for decades, perhaps forever, the Soviet Union responded with its first nuclear weapon merely four years after the end of World War II. Other countries like the United Kingdom, France and China soon followed. The myth of secrecy was shattered.
As if on cue after our article was written, in December 2024, a new large language model (LLM) named DeepSeek v3 came out of China. DeepSeek v3 is a completely homegrown model built by a Chinese entrepreneur who was educated in China (that last point, while minor, is not unimportant: China’s best increasingly no longer need to leave their homeland to excel). The model turned heads immediately because it was competitive with GPT-4 from OpenAI, which many consider the state of the art among LLMs. In fact, DeepSeek v3 is more than competitive on several critical measures: GPT-4 is reported to have about 1 trillion parameters, DeepSeek v3 has 671 billion; GPT-4 is said to have been trained on about 1 trillion tokens, DeepSeek v3 on almost 15 trillion. Most impressively, DeepSeek v3 cost only $5.58 million to train, while GPT-4 cost about $100 million. That’s a qualitatively significant difference: only the best-funded startups or large tech companies have $100 million to spend on training an AI model, but $5.58 million is well within the reach of many small startups.
Perhaps the biggest difference is that DeepSeek v3 is open-source while GPT-4 is not. The only major open-source model from the United States is Llama, developed by Meta. If this feature of DeepSeek v3 is not ringing massive alarm bells in the heads of American technologists and political leaders, it should be. It’s a reaffirmation of the central point that there are very few secrets in science and technology that cannot be discovered sooner or later by a technologically advanced country.
One might argue that DeepSeek v3 cost a fraction of what the best LLMs cost to train because it stood on the shoulders of these giants, but that’s precisely the point: like other software, LLMs follow the standard rule of precipitously diminishing marginal cost. More importantly, the open-source, low-cost nature of DeepSeek v3 means that China now has the capability of capturing the world LLM market before the United States, as millions of organizations and users make DeepSeek v3 the foundation on which to build their AI. Once again, the quest for security and technological primacy through secrecy will have proved ephemeral, just as it did for nuclear weapons.
What does the entry of DeepSeek v3 indicate in the grand scheme of things? It is important to dispel three myths and answer some key questions.