DeepSeek and the Creative Power of Constraints: Doing More with Less Compute


DeepSeek’s success challenges the belief that bigger compute means better AI. Facing chip shortages and export restrictions, it built a powerful reasoning model using reinforcement learning over brute-force scale. In our main article today, Betty Li Hou, Computer Science PhD Student and Machine Learning Researcher at NYU, explores how constraints drove innovation—and what it means for AI’s future.

But first:

Five Things to Know About the Paris AI Action Summit

The Paris AI Action Summit next week (Feb 10-11) marks the next stage in international AI coordination, following the UK’s 2023 Bletchley Park AI Safety Summit and the Seoul AI Summit in May 2024. Here are five things to know about the upcoming event:

  1. A Shift from Risk to Governance – The UK summit centered on long-term AI risks and voluntary safety commitments, while the Seoul summit focused on near-term AI risks, regulatory approaches, and international cooperation. France is broadening the scope to include AI’s impact on democracy, defense, and the economy. Official website.
  2. India’s Role as Co-Chair – At France’s request, Indian Prime Minister Narendra Modi will co-chair the event and is expected to advocate for global AI governance frameworks. Read about India’s role and Modi’s AI agenda.
  3. A High-Profile Guest List – Attendees include top executives from Alphabet and Microsoft, alongside almost 1,000 participants spanning heads of state and government, business leaders, think tanks, campaign groups, research institutes, and artists. Learn more.
  4. Emphasis on Open-Source AI and Clean Energy – France is positioning itself as a leader in sustainable AI, advocating for open-source models and energy-efficient data centers. More on this.
  5. Fringe Events in London and Paris – Alongside the main summit, independent organizations are hosting discussions on AI policy, ethics, and open-source development. Find details.
💡
Sign up for Internet Exchange!
  • Help shape the Internet Exchange community—answer our poll to tell us where you'd like to connect! https://tally.so/r/mDAOqE

Access, Digital Rights & Tech Governance

Decentralization

More on DeepSeek and AI Developments

Upcoming Events

Careers and Funding Opportunities

What did we miss? Please send us a reply or write to editor@exchangepoint.tech.

Creativity Loves Constraints: How DeepSeek’s Limitations Led to Innovation

By Betty Li Hou and Audrey Hingle

The conventional wisdom in artificial intelligence suggests that the bigger the compute, the better the model. Cutting-edge hardware and seemingly infinite processing power have been seen as the ultimate keys to progress. That’s why the Stargate Project, OpenAI’s latest $500 billion initiative, is focused on building massive-scale AI research infrastructure in the United States, and why Meta is spending a further $60 billion on developing AI in the U.S., including constructing a 2 gigawatt+ data center. 

Facing export controls, limited access to NVIDIA’s most advanced chips, and restricted compute resources, Chinese companies had to take a different approach. The result? DeepSeek: A model that reportedly achieves comparable performance to OpenAI’s o1, but with significantly lower computational requirements.

The Evolution of DeepSeek

DeepSeek-R1 is the latest evolution in a lineage that includes DeepSeek-V3 and its earlier predecessor, DeepSeek-V2, which first introduced multi-head latent attention (MLA) in May 2024. At its core, it’s another large language model (LLM) like GPT, Llama, Claude, and Gemini, but DeepSeek-R1 is notably different from its predecessors in that it is a reasoning model.

Reasoning models are models that have been specifically trained to perform complex, multi-step reasoning. Compared to general-purpose models such as ChatGPT, they are far better suited to tasks like problem solving, coding, math, and scientific reasoning. OpenAI’s o1 and o3-mini are other examples of reasoning models.

Building a Reasoning Model

Most LLMs don’t just soak up massive amounts of text—they go through a second phase of training designed to sharpen their skills, known as post-training. This is where reasoning capabilities typically arise. The dominant post-training approach in widely used models like ChatGPT, Claude, and Gemini is supervised fine-tuning (SFT)—where models are fine-tuned with carefully curated datasets to boost their reasoning, conversational abilities, and technical precision. DeepSeek, however, took a different approach.
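Before turning to that approach, here is a minimal sketch of what SFT looks like in practice. It is illustrative only: the tiny placeholder model, the two toy examples, and the training settings are assumptions for the sake of the example, not anything DeepSeek or its competitors actually use.

```python
# A minimal sketch of supervised fine-tuning (SFT) with Hugging Face transformers.
# "sshleifer/tiny-gpt2" is a tiny placeholder model used purely for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "sshleifer/tiny-gpt2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Curated (prompt, response) pairs stand in for a carefully built SFT dataset.
examples = [
    ("Q: What is 2 + 2?\nA:", " 4"),
    ("Q: Name a prime number greater than 10.\nA:", " 11"),
]

model.train()
for prompt, response in examples:
    # The model is trained to reproduce the curated response token by token
    # (next-token cross-entropy), which is what labels=input_ids gives us here.
    batch = tokenizer(prompt + response, return_tensors="pt")
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Real SFT pipelines typically mask the prompt tokens so the loss only covers the response, but the core idea is the same: the model learns by imitating curated answers.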

Reinforcement Learning: Doing More with Less

Rather than relying on SFT, DeepSeek-R1 employs reinforcement learning (RL), a methodology that iteratively refines the model by rewarding better responses and penalizing weaker ones. RL has been around for a while; demonstrating its effectiveness for large language model reasoning, however, was a surprise to many researchers, though perhaps not a total surprise to those at OpenAI.
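DeepSeek’s papers describe a specific RL algorithm, Group Relative Policy Optimization (GRPO), in which several answers are sampled for the same prompt and each is scored relative to the group. The toy function below sketches only that group-relative scoring idea, with made-up reward values; it is not DeepSeek’s training code.

```python
# A toy sketch of "reward better responses, penalize weaker ones" via
# group-relative scoring (the idea behind GRPO), not DeepSeek's implementation.
import numpy as np

def group_relative_advantages(rewards):
    """Score each sampled response relative to the group's mean and spread."""
    rewards = np.asarray(rewards, dtype=float)
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Suppose we sample four answers to one prompt and score them (made-up values):
rewards = [1.0, 0.0, 0.0, 1.0]
print(group_relative_advantages(rewards))  # roughly [ 1, -1, -1, 1 ]
# Above-average answers get positive advantages, so the model is nudged to make
# their tokens more likely; below-average answers are nudged the other way.
```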

An RL-based reasoning model also suggests a way around a challenge in conventional post-training known as reward hacking. In the standard pipeline, a separate learned reward model is used to evaluate the accuracy and quality of the model’s responses, and reward hacking occurs when the model achieves high scores from that reward model through incorrect means, making its outputs unreliable. With DeepSeek’s method of RL, outputs were instead evaluated with rule-based rewards: concrete checks of accuracy and quality rather than a learned reward model. This effectively circumvented reward hacking, and the model gave consistent, high-quality results.
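What might such a rule-based reward look like? The sketch below checks two concrete things, whether an answer follows an expected format and whether it matches a known reference answer. The specific format convention and reward values here are illustrative assumptions, not DeepSeek’s actual rules.

```python
# A toy rule-based reward: concrete, verifiable checks instead of a learned
# reward model. The \boxed{...} convention and the scores are illustrative only.
import re

def rule_based_reward(output: str, reference_answer: str) -> float:
    reward = 0.0
    match = re.search(r"\\boxed\{(.+?)\}", output)
    if match:
        reward += 0.1  # small reward for presenting the answer in the expected format
        if match.group(1).strip() == reference_answer.strip():
            reward += 1.0  # large reward for a verifiably correct answer
    return reward

# These scores would then feed the RL update that reinforces high-reward answers.
print(rule_based_reward(r"The sum is \boxed{4}", "4"))    # 1.1
print(rule_based_reward(r"The sum is \boxed{5}", "4"))    # 0.1
print(rule_based_reward("I think the answer is 4", "4"))  # 0.0
```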

Notably, they found that this method could scale. These high-quality results didn’t require anywhere near the eye-watering compute bills that leading AI companies claimed were necessary to build state-of-the-art models. DeepSeek is reported to have cost $5.5 million to train, a minuscule fraction of what some of the biggest AI companies are said to be spending: tens to hundreds of billions of dollars, with future projects projected to reach the trillions.

This efficiency wasn’t just a happy accident—it was a necessity. Export restrictions limiting access to advanced AI chips meant that DeepSeek had to operate with fewer high-performance GPUs. In other words, they had to get creative. Instead of squeezing out performance improvements with more compute power, they had to be strategic: optimizing model architecture, improving algorithmic efficiency, and making better use of available hardware. The result? A model that holds its own against OpenAI’s latest offerings while using a fraction of the compute.

The NVIDIA Stock Shock

DeepSeek-R1’s efficiency didn’t just shake up AI research—it rippled through the tech market. When investors realized that a high-performing model had been built without heavy reliance on NVIDIA’s best chips, they started questioning the long-held assumption that compute scale is everything, resulting in a sharp single-day drop in NVIDIA’s stock.

Constraints as Catalysts

DeepSeek’s success is a testament to the adage that creativity loves constraints. While other AI developers may have relied on sheer computational brute force, DeepSeek was forced to think differently. Their work suggests that algorithmic ingenuity can rival, if not surpass, the advantages of high-end hardware. As AI development moves forward, DeepSeek’s approach could serve as a playbook for the next wave of efficient, cost-effective machine learning.

In an industry that often equates size with success, DeepSeek-R1 shows that bigger isn’t always better—sometimes, it’s just more expensive.

Looking Ahead

DeepSeek challenges the assumption that OpenAI’s infrastructure and algorithms were years ahead of any competitor’s. Their open-weight model means anyone with sufficient GPUs can download and run it for free, lowering barriers to AI access. While the ability to run the model locally does not reveal the specifics of how the model was trained, it allows researchers and developers to experiment, fine-tune, and build upon the existing model architecture. In contrast, OpenAI’s closed, API-based approach restricts direct access, making their models proprietary and monetized.
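For readers who want to try it, the sketch below shows one common way to run an open-weight checkpoint locally with the Hugging Face transformers library. The distilled model name is taken from DeepSeek’s public releases; the prompt, generation settings, and hardware assumptions (a GPU with enough memory) are placeholders.

```python
# A minimal sketch of running an open-weight DeepSeek-R1 distillation locally
# with Hugging Face transformers; settings here are illustrative, not tuned.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-7B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [{"role": "user", "content": "What is the sum of the first 10 primes?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# The model emits its chain of thought followed by a final answer.
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```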

Meta, which has already embraced open-weight AI with its Llama models, stands to gain as more developers adopt this approach. The increasing appeal of more open AI has even led OpenAI CEO Sam Altman to express regret about abandoning open-source AI, acknowledging that they have been "on the wrong side of history."

DeepSeek’s reliance on reinforcement learning over supervised fine-tuning also raises key research questions: Can pure RL generalize to non-reasoning tasks, such as general interactions and subjective decision-making? If so, it could transform how AI models are trained. Additionally, while DeepSeek’s training process appears more cost-effective, some early research suggests its inference phase—the process of generating responses—may actually be more energy-intensive than comparable models from Meta. If reasoning models like DeepSeek become widespread, will they redefine the industry? Will they ultimately lower energy and computational costs, or will the demand for more complex reasoning capabilities drive up long-term resource consumption? Will they usher in an AI future that is more accessible and more innovative? Time will tell.

💡
Please forward and share!

Subscribe to Internet Exchange

Don’t miss out on the latest issues. Sign up now to get access to the library of members-only issues.