ChatGPT Maker Suspects China’s Dirt Cheap DeepSeek AI Models Were Built Using OpenAI Data — and the Irony Is Not Lost on the Internet

Author: Kristen | Updated: Mar 05, 2025

OpenAI suspects that China's DeepSeek AI models, which are significantly cheaper than their Western counterparts, may have been developed using OpenAI's data. The suspicion follows DeepSeek's rapid rise in popularity, which triggered a sharp decline in the stock prices of major AI companies, most notably Nvidia, whose shares suffered the largest single-day loss of market value for any company in U.S. stock market history.

The DeepSeek R1 model, built on the open-source DeepSeek-V3, reportedly cost far less to train (an estimated $6 million) than comparable Western models. Some dispute that figure, but it has fueled investor concerns about the massive sums American tech companies are investing in AI. DeepSeek's success has also raised ethical questions about how AI models are developed.

OpenAI and Microsoft are investigating whether DeepSeek violated OpenAI's terms of service by using a technique called "distillation," in which a smaller model is trained on the outputs of a larger one. OpenAI said it is aware of attempts by Chinese and other companies to replicate leading US AI models, and that it is committed to protecting its intellectual property, including by working with the US government to safeguard its technology.
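For readers unfamiliar with the term, the core idea of distillation can be sketched in a few lines of Python. This is a generic textbook-style illustration, not a description of what DeepSeek or any other company actually did: the function names, the example scores, and the temperature value are all assumptions made for the example. A small "student" model is scored on how closely its predicted probabilities match those of a larger "teacher" model, and training would push that score toward zero. With only API access to a teacher, a student would more likely be trained on the teacher's generated text than on its raw scores.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Zero means the student reproduces the teacher exactly; larger values
    mean the student's predictions are further from the teacher's.
    """
    p = softmax(teacher_logits, temperature)  # teacher probabilities
    q = softmax(student_logits, temperature)  # student probabilities
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical scores over three candidate next words.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different word entirely

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that roughly agrees with the teacher gets a much smaller loss than the one that disagrees, which is the signal a training loop would use to nudge the student toward the teacher's behavior.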

David Sacks, President Trump's AI czar, echoed OpenAI's suspicions, saying there is evidence that DeepSeek used OpenAI models for training. He expects leading AI companies to take further steps to prevent distillation of their models.

The irony of OpenAI's accusation has not gone unnoticed, given earlier allegations that OpenAI itself used copyrighted material without permission to train ChatGPT. Critics on social media have widely pointed to OpenAI's own previous statement that it would be impossible to train AI models without copyrighted material.

OpenAI's position on copyrighted material has also been challenged in court, with lawsuits from the New York Times and 17 authors alleging copyright infringement. The legal landscape surrounding AI training data remains complex and contentious.