Stop That Thief! Says the Data Bandit: OpenAI vs DeepSeek

OpenAI’s recent allegations against DeepSeek are already all over the news. (It took a while to find a news link that wasn’t hidden behind a paywall.)

Quick Brief

In short, DeepSeek unexpectedly released an open-source Mixture-of-Experts language model a few days ago. The established AI companies, long used to their monopolies, are feeling it: Nvidia is suffering a bit, and OpenAI is taking the biggest loss.

Suddenly AI is not that proprietary, and also not that hard to train, since DeepSeek reportedly used older GPUs to train their model (R1), not Nvidia’s latest proprietary hardware designed specifically for LLMs and AI training.

The company claims R1 matches or exceeds leading models in areas like reasoning, math, and general knowledge, while consuming considerably fewer resources.

Following DeepSeek’s announcement, Alphabet, Microsoft, Nvidia, and Oracle lost nearly $1 trillion in combined market value.

The interesting open source aspect of the R1 model

While OpenAI is investigating DeepSeek for potentially using their data, something even more interesting is happening:

Two brothers (the Unsloth team) managed to compress DeepSeek’s R1 model (671B parameters) from 720GB to just 131GB. This is the beauty and real power of open source, as opposed to closed systems.
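For a rough sense of what that compression means, here is a back-of-the-envelope sketch in Python. It only uses the three figures quoted above, and it assumes decimal gigabytes and that the files consist almost entirely of model weights. Both are my assumptions, not something the Unsloth team stated, and the function name is purely illustrative.

```python
# Back-of-the-envelope check using only the figures quoted in this post.
PARAMS = 671e9        # parameter count (671B)
ORIGINAL_GB = 720     # size before compression
COMPRESSED_GB = 131   # size after compression


def bits_per_param(size_gb: float, params: float, bytes_per_gb: float = 1e9) -> float:
    """Average bits stored per parameter.

    Assumes decimal gigabytes and that the file is essentially all weights.
    """
    return size_gb * bytes_per_gb * 8 / params


original_bits = bits_per_param(ORIGINAL_GB, PARAMS)
compressed_bits = bits_per_param(COMPRESSED_GB, PARAMS)
reduction = 1 - COMPRESSED_GB / ORIGINAL_GB

print(f"original:   {original_bits:.2f} bits per parameter")
print(f"compressed: {compressed_bits:.2f} bits per parameter")
print(f"size reduction: {reduction:.0%}")
```

Under those assumptions, the model goes from roughly 8.6 bits per parameter to about 1.6, which is an 82% reduction in size.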

Now you can run the model on consumer hardware with at least 20GB of RAM, though responses will be slow. More RAM and VRAM improve performance, but 20GB is now the minimum, whereas before the model demanded high-end hardware.

This represents a massive democratization of AI technology, and this is the real story.

Double standard at its best

I’m sorry, but even if the allegations are true, and DeepSeek did reverse engineer ChatGPT’s reasoning or downright stole from it, I cannot help but laugh a bit.

For years, OpenAI has been battling various court cases, and they have a history of controversial data collection:

  • Accused of scraping the entire internet, including websites that explicitly disallowed robots or scraping
  • Trained their models on many books without the authors’ consent
  • Stole “massive amounts of personal data” to train ChatGPT, as one lawsuit alleges
  • And the list really goes on, you just have to google for a minute or so.

Now, I cannot give a standing ovation and applaud theft. But again, it’s funny to see this reversal of roles, with an opponent mirroring OpenAI’s own practices.

The open source community now has a real open source AI model to experiment with. Let’s see how this develops.

AI losing its job to AI

Photo by Solen Feyissa on Unsplash.