The tech world and observers of U.S.–China competition were shocked when a relatively unknown Chinese tech company recently released an open-source model that quickly surpassed larger U.S. competitors in both quality and resource efficiency.
The result brought about what some refer to as the United States’ Sputnik moment: the realization that the Chinese regime presents a formidable competitor. However, before giving in to the hype, let us break down and understand what has actually taken place.
At this point in history, most people are aware of artificial intelligence (AI) products such as OpenAI’s ChatGPT, Microsoft’s Copilot, or Google’s Gemini. Even if you do not use those specific products, you may use them unknowingly through other companies that build on their foundation models. Creating and “training” these models takes enormous intellectual energy and, more fundamentally, computational energy.
At their heart, these AI models are nothing more than enormous statistical models. They are not actually intelligent; rather, they calculate which letter or word is most likely to come next in a sequence and, by extension, what the answer to your question should be.
Let me give you a simple example. If I give you the sentence, “My dog is sick, and I am going to take him to get medicine at the ... office,” you could probably infer, based on context, the missing word in the sequence. These models are trained on massive amounts of text, using many stacked layers of statistical calculation, to infer that the missing word is “veterinarian.”
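To make the idea of predicting the next word concrete, here is a minimal sketch in Python. It is a deliberately tiny word-pair counter, not how ChatGPT or DeepSeek actually work: real models use neural networks and far more context. The four-sentence corpus and the function names are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram_counts(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_word):
    """Return the most frequent follower of prev_word and its estimated probability."""
    followers = counts.get(prev_word.lower())
    if not followers:
        return None, 0.0
    word, n = followers.most_common(1)[0]
    return word, n / sum(followers.values())

# Toy corpus, invented for illustration only.
corpus = [
    "i took my dog to the veterinarian office",
    "the veterinarian office was busy",
    "she drove to the veterinarian office",
    "he walked to the post office",
]
counts = train_bigram_counts(corpus)
word, prob = predict_next(counts, "the")
print(word, prob)
```

In this toy corpus, “veterinarian” follows “the” three times out of four, so it is the prediction. Scaling the same statistical idea up to billions of sentences and much longer contexts is what makes training so expensive.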
The key breakthrough by Chinese company DeepSeek was one of output: a cutting-edge model that outperforms models such as ChatGPT at a fraction of the cost. According to DeepSeek, it trained this model for $5 million to $6 million rather than the $100 million required to train ChatGPT. This emphasis on increasing efficiency was an industry-wide focus and not unique to DeepSeek or Chinese firms. However, how DeepSeek accomplished this, and even where DeepSeek came from, is where things start to get interesting.
Technically, DeepSeek’s approach to increasing efficiency is not entirely new. It is novel for combining multiple techniques that other firms and researchers had used to improve efficiency into one model. Each of the separate techniques increased efficiency, but DeepSeek employed all of them.
The individual techniques, however, are well-known. For instance, one technique called quantization merely takes a number in the statistical model that may be stored with 32 bits of precision and reduces that to, say, four to eight bits.
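A rough sketch of quantization in Python follows. It shows a generic scheme that rounds each number to a small signed integer with one shared scale factor; the article does not describe DeepSeek’s exact scheme, so this should not be read as the company’s method. The weight values are made up, and because Python stores floats with 64 bits, the example illustrates only the principle of trading precision for size.

```python
def quantize(weights, bits=8):
    """Map floats to signed integers of the given bit width using one shared scale."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 7 for 4 bits
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in q]

def max_error(weights, bits):
    """Largest gap between an original weight and its quantized approximation."""
    q, scale = quantize(weights, bits)
    return max(abs(a - w) for a, w in zip(dequantize(q, scale), weights))

# Hypothetical model weights, invented for illustration.
weights = [0.8231, -0.4117, 0.0562, -0.9984, 0.3309]
q8, s8 = quantize(weights, bits=8)
q4, s4 = quantize(weights, bits=4)
err8 = max_error(weights, 8)
err4 = max_error(weights, 4)
print(q8, err8)
print(q4, err4)
```

The four-bit version takes half the space of the eight-bit version but loses more accuracy, which is the trade-off a model builder has to manage.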
Another technique is called the mixture of experts model. Think of this like sections of a library or gears on a car. When a user asks a question, the AI model does not use the entire model. Instead, it routes the question to a few specialized “experts,” smaller models built to answer specific kinds of questions, which still work as part of the larger whole when needed.
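The routing idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not DeepSeek’s design: the four “experts” are plain functions standing in for sub-networks, and the router scores are fixed, made-up numbers, whereas a real model learns a router that computes them from each input.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(gate_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route(gate_scores, k))

# Hypothetical experts: simple functions standing in for smaller models.
experts = [lambda x: 2 * x, lambda x: x + 10, lambda x: -x, lambda x: x * x]
gate_scores = [0.1, 2.0, -1.0, 1.5]  # made-up router scores for one input

chosen = route(gate_scores, k=2)
output = moe_forward(3.0, experts, gate_scores, k=2)
print(chosen, output)
```

Only two of the four experts run for this input, so roughly half the computation is skipped. That saving is the efficiency gain the technique is known for.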
All of these techniques and others that DeepSeek used were well-established ways to increase efficiency in modeling, but they had never been combined before in one model.
Where the story about DeepSeek goes sideways is the pitched public relations campaign it waged. The company stated that it trained its foundational model on limited resources but simultaneously stated it had access to 50,000 of the most advanced processors in the world. It claimed it spent only $5 million on training but had access to nearly $2 billion worth of processors, which would cost $4 billion in total once other related costs are included. The company’s own story did not add up.
DeepSeek’s story became even more confusing when it described the project as just a side hustle. Given the political nature of access to the limited number of state-of-the-art processors in China, which are officially blocked from export to the country, it seems unlikely that an unknown startup would get priority over established tech stalwarts such as Alibaba, Tencent, and Baidu without significant influence. This was clearly not a side hustle. Coming from one of the larger hedge funds in China, with 200 employees, DeepSeek seems likely to have significant backing and influence, although the exact source has yet to be determined.
There are multiple lessons to be drawn from what is known right now.
First, the technological breakthrough is an important breakthrough in output because it brings all the other techniques together, but fundamentally, there is no novel breakthrough in technique at work.
Second, the entire AI industry is rapidly evolving, and what is cutting-edge today can easily be outdated in a month or two. Only within the past year did Chinese leader Xi Jinping openly worry about Chinese firms lagging behind U.S. AI companies. However, Chinese prowess in AI and related sectors, which are linked so closely to and so well funded by the Chinese Communist Party (CCP) security apparatus, should pose a long-term worry.
Third, while President Donald Trump issued an executive order to try to find a buyer for TikTok, DeepSeek would seem to fall under the same law. Given that DeepSeek says it sends data to the Chinese regime and conducts a wide variety of activities, such as logging keystrokes, the data collection activities of all Chinese electronics firms should be of increasing concern to the Trump administration.
Fourth, despite all these caveats, this has still served as a type of Sputnik moment for many U.S. citizens as they begin to realize that the authoritarian state of the CCP is a formidable opponent. The Trump administration would be well served not only by criticizing the CCP but also by establishing policies that incentivize the most competitive and innovative firms in the world to confront the Chinese regime’s challenge and develop U.S.-led and controlled technology.
Hopefully, this will be the Sputnik moment in which we chart a new direction, take seriously the challenge posed by a serious competitor, and rise to that challenge to protect U.S. liberty and leadership.