DeepSeek for People in a Hurry
The AI startup that shook NVIDIA and is cooking up a storm in the AI landscape
Dear Duniyawaalo (People of the World),
You would have to be living under a rock to miss the waves DeepSeek has made in the AI industry.
I was amazed to witness how DeepSeek — a relatively unknown AI startup — wiped nearly $600 billion off NVIDIA’s market cap in a single day.
We are talking about more money than the GDP of some countries!
Did they launch a mighty AI that renders GPUs obsolete? Or did they secretly train an AI on all of OpenAI’s secrets while Elon Musk was distracted on Twitter? (Oops, I mean X)
Not quite.
Nonetheless, what they did was nothing short of genius.
DeepSeek figured out how to build a ChatGPT-level AI for a fraction of the cost. And just like that, the big AI players and the market started sweating.
I have been reading many articles and watching YouTube videos about DeepSeek, and most of them cater to technical people with working AI knowledge.
I wanted to write about DeepSeek in a way that non-techies can follow what’s going on while still keeping tech-savvy readers like me interested.
A Cooking Analogy Right from the Horse’s Mouth
Being intellectually curious, I wanted to understand the technology and implementation details behind DeepSeek’s rapid rise. At the same time, I wanted to learn it well enough to explain it to people who don’t speak AI jargon.
I plan to take you on a journey through how DeepSeek pulled off this AI magic trick, the challenges they still face, and what comes next.
To do that, I will lean on an activity we are all familiar with and hopefully enjoy: COOKING!
I started by going straight to the horse’s mouth: I downloaded the research paper for DeepSeek’s R1 model and, along with other research, used it as the primary reference for this post.
What Did DeepSeek Accomplish? (Building a Michelin-Star AI with a Budget Kitchen)
Imagine all of AI is like cooking.
OpenAI and Google are like the Michelin-star restaurants. They have spent hundreds of millions of dollars sourcing the best ingredients, hiring world-class chefs, and experimenting for years to perfect their dishes.
They have built AI models (their signature Michelin-star dishes) that are huge, expensive, and require an ungodly amount of computing power to run.
In comes DeepSeek, a scrappy chef who looked at the fancy restaurants and asked:
“What if we could make a dish that is just as tasty, without spending a fortune?”
So, how did they do it?
Borrowing Great Recipes
In the culinary world, chefs don't always start from scratch when they want to create a new dish. Instead, they look at tried-and-tested recipes from master chefs, tweak them, and add their own unique twist.
DeepSeek took a similar shortcut. Their flagship R1 model builds on their own DeepSeek-V3 base, and the smaller “distilled” versions they released use open-source models like Llama (Meta) and Qwen (Alibaba) as foundations. Instead of training every model from scratch, they fine-tuned existing models with their own data and optimizations.
This significantly reduced development time and cost while maintaining high performance.
Research Paper Reference:
Page 1: "We open-source DeepSeek-R1-Zero, DeepSeek-R1, and six dense models (1.5B, 7B, 8B, 14B, 32B, 70B) distilled from DeepSeek-R1 based on Qwen and Llama."
Using a Smarter Cooking Process
Imagine running a restaurant where every dish is prepared by one superstar chef who does everything — slicing, dicing, baking, plating, and garnishing.
That chef would burn out quickly!
A better approach? Have specialized chefs — one for sauces, one for grilling, one for desserts — so that each expert handles only their part.
Similarly, DeepSeek’s underlying model uses a Mixture-of-Experts (MoE) design: instead of running the entire network for every request, a small routing step activates only the “expert” sub-networks relevant to the task at hand.
For example, if the task is answering a science question, only the modules best suited to science-style reasoning do the work, while the other experts remain idle. On top of this efficient architecture, DeepSeek trained the model’s reasoning skills with pure reinforcement learning (RL), letting it improve through trial and feedback instead of expensive human-labeled examples.
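To make the “specialized chefs” idea concrete, here is a toy Mixture-of-Experts layer in PyTorch. The sizes, expert count, and routing rule are illustrative assumptions, not DeepSeek’s actual architecture, but the principle is the same: a small gate decides which experts cook each order, and the rest stay idle.

```python
# A toy Mixture-of-Experts layer: a gate scores the experts for each input,
# and only the top-k experts actually run. Dimensions are illustrative.
import torch
import torch.nn as nn

class TinyMoE(nn.Module):
    def __init__(self, dim=64, n_experts=8, top_k=2):
        super().__init__()
        self.gate = nn.Linear(dim, n_experts)          # the "head waiter"
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))
            for _ in range(n_experts)                  # specialized "chefs"
        )
        self.top_k = top_k

    def forward(self, x):
        scores = self.gate(x).softmax(dim=-1)          # who should cook this?
        weights, idx = scores.topk(self.top_k, dim=-1) # pick the top chefs
        out = torch.zeros_like(x)
        for k in range(self.top_k):                    # only chosen experts run
            for i, expert in enumerate(self.experts):
                mask = idx[:, k] == i
                if mask.any():
                    out[mask] += weights[mask, k:k+1] * expert(x[mask])
        return out

x = torch.randn(4, 64)     # a batch of 4 "orders"
print(TinyMoE()(x).shape)  # torch.Size([4, 64])
```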
Research Paper Reference:
Page 3: “Our goal is to explore the potential of LLMs to develop reasoning capabilities without any supervised data, focusing on their self-evolution through a pure RL process.”
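The “pure RL” training the paper describes leans on simple rule-based rewards rather than human graders: the model earns credit when its final answer is correct and when it shows its reasoning in the expected format. Here is a toy version of such a reward function; the tags, weights, and answer format are my illustrative assumptions, not the paper’s exact rules.

```python
# A toy rule-based reward in the spirit of the paper: score a response on
# answer accuracy plus reasoning format. Tags and weights are illustrative.
import re

def reward(response: str, gold_answer: str) -> float:
    # Format reward: did the model wrap its reasoning in <think> tags?
    format_ok = re.search(r"<think>.*?</think>", response, re.DOTALL) is not None
    # Accuracy reward: does the final boxed answer match the reference?
    match = re.search(r"\\boxed\{(.+?)\}", response)
    accuracy_ok = match is not None and match.group(1).strip() == gold_answer
    return 0.1 * format_ok + 1.0 * accuracy_ok

print(reward("<think>17 * 24 = 408</think> \\boxed{408}", "408"))  # 1.1
```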
Learning from Master Chefs
Imagine you are an aspiring chef learning from a world-famous master chef. Instead of spending years experimenting on your own, you watch the master chef closely, learn their techniques, and replicate their dishes.
Over time, you become almost as good as the master chef, but without going through all the trial and error.
DeepSeek used a similar technique called AI distillation, where they learned from larger, more powerful AI models (the “master chefs”) to improve their own smaller models. This allowed them to achieve high performance without massive computational resources.
Research Paper Reference:
Page 4: “We further explore distillation from DeepSeek-R1 to smaller dense models… The open source DeepSeek-R1, as well as its API, will benefit the research community to distill better smaller models in the future.”
Page 14: “We use DeepSeek-R1 as the teacher model to generate 800K training samples, and fine-tune several small dense models.”
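In code, the distillation recipe the paper describes is surprisingly simple in outline: the big teacher model writes out worked solutions, and a small student model is fine-tuned on them. The sketch below assumes hosted model weights and trims everything to one prompt; the real pipeline used 800K samples.

```python
# A hedged sketch of the teacher-student distillation described in the paper:
# the teacher generates reasoning traces, which become supervised fine-tuning
# data for a much smaller student. Model names are illustrative; the real R1
# teacher is far too large to run casually.
from transformers import pipeline

teacher = pipeline("text-generation", model="deepseek-ai/DeepSeek-R1")

# Step 1: the "master chef" writes out worked solutions.
sft_data = []
for prompt in ["Solve step by step: what is 17 * 24?"]:
    traced = teacher(prompt, max_new_tokens=512)[0]["generated_text"]
    sft_data.append({"text": traced})

# Step 2: the "apprentice" (e.g., a 7B Qwen model) is fine-tuned on sft_data
# using the same supervised fine-tuning loop sketched earlier in this post.
```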
Making the Dish Just as Tasty but Cheaper
Imagine you are running a restaurant and want to serve a gourmet dish that rivals a Michelin-starred restaurant. However, you find a way to source high-quality ingredients at a lower cost and streamline your cooking process without compromising on taste.
Your customers get the same delicious meal, but you save money on ingredients and labor.
DeepSeek achieved something similar. They matched the performance of top-tier AI models like OpenAI’s, but at a fraction of the cost, making advanced AI more accessible and affordable.
Research Page Reference:
Page 1: “DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on reasoning tasks.”
Page 4: “DeepSeek-R1-Distill-Qwen-7B achieves 55.5% on AIME 2024, surpassing QwQ-32B-Preview.”
And Voila!
NVIDIA’s stock tanked because DeepSeek showed the world that AI models can be trained with fewer GPUs (NVIDIA’s golden goose).
What Are DeepSeek’s Limitations? (Why Their Dish Isn’t Perfect Yet)
Imagine a new restaurant that has just opened. The chef has created a fantastic dish everyone loves. But it is not perfect yet. The seasoning may be slightly off, or the presentation could be better.
Similarly, DeepSeek has created an impressive AI model, but it still has limitations that must be addressed before it can truly compete with OpenAI or Google.
Limited Ingredients (Data Restrictions)
A chef can create a delicious dish without access to premium ingredients, but the dish might lack the richness and depth of flavor offered by a Michelin-starred meal.
Similarly, DeepSeek’s AI model is trained on high-quality data but likely lacks the sheer volume and diversity of data that OpenAI or Google command. This means it might struggle with rare topics or highly complex reasoning tasks.
For example, the paper notes that DeepSeek-R1 currently falls short on tasks like function calling, multi-turn conversation, and complex role-playing, areas where models like GPT-4 excel.
Like most AI startups, DeepSeek does not have the data firepower of OpenAI or Google. However, their ability to fine-tune open-source models efficiently helps bridge the gap.
Research Paper Reference:
Page 16, Section 5: “As a result, DeepSeek-R1 has not demonstrated a huge improvement over DeepSeek-V3 on software engineering benchmarks.”
Small Kitchen (Scaling Challenges)
A small restaurant can serve amazing food but can’t handle thousands of customers at once.
Similarly, DeepSeek’s AI model is efficient and performs well, but scaling it to the size of OpenAI’s GPT series or Google’s Gemini is still challenging.
The infrastructure and resources required to scale up are immense, and DeepSeek is still working on this.
Research Paper Reference:
Page 15, Section 4.1: “First, distilling more powerful models into smaller ones yields excellent results, whereas smaller models relying on the large-scale RL mentioned in this paper require enormous computational power and may not even achieve the performance of distillation.”
Page 15, Section 4.1: “Second, while distillation strategies are both economical and effective, advancing beyond the boundaries of intelligence may still require more powerful base models and larger-scale reinforcement learning.”
Not Their Own Recipe Yet (Innovation vs Borrowing)
A chef may have perfected a famous recipe but has yet to create their own signature dish.
Similarly, DeepSeek’s real innovation isn’t a new AI breakthrough — it’s showing that state-of-the-art AI can be built and run for far less.
But will they stay a “smart budget chef,” or will they create their own Michelin-star recipe?
Research Paper Reference:
Page 3, Section 1: “In this paper, we take the first step toward improving language model reasoning capabilities using pure reinforcement learning (RL).”
Page 3, Section 1: “However, none of these methods has achieved general reasoning performance comparable to OpenAI’s o1 series models.”
Page 16, Section 5: “DeepSeek-R1 achieves performance comparable to OpenAI-o1-1217 on a range of tasks.”
What’s Next for DeepSeek? (Opening a Chain of Restaurants)
DeepSeek has proven that they can cook a fantastic dish efficiently.
What comes next?
Expanding the Menu (Building even more intelligent AI models)
A restaurant can add new dishes and improve existing ones to attract more customers.
Similarly, DeepSeek can fine-tune and improve its models, making them smarter, faster, and more accurate over time.
Creating Their Own Signature Dish (Developing a Unique AI Model)
A chef who started out improving existing popular recipes can eventually move beyond following recipes to inventing their own Michelin-star dish.
Similarly, to truly compete with the likes of OpenAI and Google, DeepSeek might need to develop its own architecture instead of relying on existing models.
Scaling Up the Kitchen (Becoming a Big Player in AI)
A small restaurant can aspire to expand into a chain of high-end locations worldwide.
Similarly, DeepSeek could secure more funding to access better computing infrastructure and high-quality data.
Research Paper Reference:
“Future work will explore improving the model’s ability to handle more languages, better data processing, and expanding RL training.”
“We plan to develop new architectures to push the boundaries beyond distillation.”
(Section: Future Work & Next Steps)
A Smart Challenger in the AI Kitchen Has Arrived
DeepSeek has shown that you don’t need a billion-dollar budget to make world-class AI — you just need to be smarter about how you cook.
They were able to accomplish all that by:
Not reinventing the wheel and leveraging existing AI models.
Using efficient techniques like Reinforcement Learning and AI Distillation.
Proving that AI can be powerful without breaking the bank.
They still have hurdles to overcome — like limited data, reliance on existing models, and scaling up.
There is no doubt that DeepSeek is a serious competitor in AI, and I am very excited to see if they can take the next step towards becoming a true AI leader.
I would love to see them stop borrowing recipes and start creating their signature dish from scratch.
Research Paper Reference:
“DeepSeek-R1 is more powerful, achieving strong performance across various tasks while maintaining efficiency.”
“Future work includes exploring new architectures and expanding scaling capabilities.”
(Section: Conclusion & Future Work)
What do you think?
Do you think DeepSeek can truly challenge OpenAI and Google? Or is this just a clever AI cost-cutting hack?
Let me know your take in the comments!
❤️Rajneesh