Feb 12, 2025

No, DeepSeek Isn’t the End for GPUs. It’s Just the Beginning.


DeepSeek and the GPU Market

DeepSeek has recently made waves in the AI community, not just for developing a large language model (LLM) that rivals OpenAI’s, but for how it achieved this feat. Many have framed DeepSeek as a disruptor, proving that high-performance AI models, on par with ChatGPT and Llama, can be trained at a fraction of what industry leaders like OpenAI and Meta spend.

🗝️
AI model training is more accessible than ever. If you're a startup looking to train your own LLM, our DGX solutions provide the infrastructure to get there.

Reports suggest that DeepSeek trained its model for as little as $5.6 million in GPU compute costs, a figure that starkly contrasts with the hundreds of millions spent by OpenAI. This has led to speculation that DeepSeek could spell trouble for the GPU market, particularly for companies like NVIDIA, whose hardware powers most AI training clusters. The market reacted strongly, with NVIDIA’s stock experiencing a record one-day loss following the news.

However, the notion that DeepSeek is a “GPU killer” is misplaced. What it has truly demonstrated is a lower barrier to entry for AI startups, making it possible for smaller companies to train competitive AI models without the need for billion-dollar infrastructure. The demand for GPUs remains strong, and if anything, increased accessibility to AI model training could drive broader adoption and increase overall demand for AI computing power.

DeepSeek’s Achievement in AI Model Training

DeepSeek’s recent success has sparked discussions about AI model training costs and the feasibility of building competitive LLMs without billion-dollar infrastructure. Unlike OpenAI and Google, which have historically relied on extensive high-end GPU clusters costing hundreds of millions, DeepSeek trained its model on approximately 2,048 NVIDIA H800 GPUs, keeping training compute costs around $5.6 million.
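
The arithmetic behind that figure is straightforward. As a back-of-the-envelope sketch, the snippet below uses the GPU-hour total and the $2-per-GPU-hour rental rate reported in DeepSeek’s own technical paper for DeepSeek-V3; the wall-clock estimate at the end is our extrapolation from those numbers, not a reported figure.

```python
# Reconstructing DeepSeek's reported training cost from its published figures.
# The GPU-hour total comes from the DeepSeek-V3 technical report; the $2/GPU-hour
# rate is the report's own rental-price assumption, not a market quote.

H800_GPU_HOURS = 2_788_000  # total reported H800 GPU hours across all training stages
RENTAL_RATE_USD = 2.00      # assumed cost per H800 GPU hour

total_cost = H800_GPU_HOURS * RENTAL_RATE_USD
print(f"Estimated training cost: ${total_cost / 1e6:.2f}M")  # ~$5.58M

# Spread across 2,048 GPUs running in parallel, those GPU hours imply
# roughly two months of wall-clock time (our extrapolation).
NUM_GPUS = 2_048
wall_clock_days = H800_GPU_HOURS / NUM_GPUS / 24
print(f"Approximate wall-clock time: {wall_clock_days:.0f} days")  # ~57 days
```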

For comparison, OpenAI reportedly spent over $100 million to train GPT-4, meaning DeepSeek’s training run cost a small fraction of that figure. This vast difference highlights how improved efficiency and hardware optimization can significantly lower the financial barrier to training competitive AI models.

The H800, a variant of the H100 designed to comply with export restrictions, delivers strong AI performance while being more widely available to the Chinese market. By leveraging efficient training techniques and optimized hardware utilization, DeepSeek was able to cut costs while still producing a model that competes with the likes of ChatGPT and Llama.

Rather than signaling the end of GPU demand, DeepSeek’s success highlights a shift toward more efficient AI training. Lowering costs doesn’t eliminate the need for GPUs; it simply makes training competitive AI models feasible for more companies.

Reevaluating the ‘GPU Killer’ Narrative

Following DeepSeek’s announcement, speculation spread that its lower-cost AI model training could signal the end of the booming GPU market. The idea was simple: if state-of-the-art models can be trained for a fraction of the cost, then demand for expensive GPU clusters might shrink. This narrative led to immediate market reactions, with NVIDIA’s stock experiencing a temporary drop as investors questioned whether AI companies would continue purchasing GPUs at the same rate.

However, this reaction overlooks key factors that still drive GPU demand:

  • DeepSeek still used over 2,000 high-end GPUs; it didn’t eliminate the need for GPUs, it optimized how they were used.
  • AI model training remains computationally intensive, and GPUs are still the most effective hardware for this task.
  • More accessible AI training means more AI models being developed, which could ultimately increase overall demand for GPUs.

Even industry leaders pushed back on the idea that DeepSeek’s achievement threatens the GPU market. Former Intel CEO Pat Gelsinger noted that improved AI efficiency often leads to broader adoption, which in turn sustains or even increases demand for AI infrastructure. The key takeaway isn’t that GPUs are becoming obsolete; it’s that AI development is becoming more accessible.

Instead of consolidating AI power in the hands of a few massive players, DeepSeek’s approach opens the door for a wave of new AI startups, all of which will still require GPUs to train and deploy their models. The shift isn’t away from GPUs but toward more distributed ownership of GPU infrastructure.

Beyond training, the demand for GPUs is also driven by inference: running AI models in real time for applications like chatbots, image generation, and enterprise automation. Even if training becomes more efficient, serving models at scale still requires significant GPU resources, and as AI adoption expands, so does the need for powerful hardware.
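
To see why, consider a rough capacity calculation. Every number in the sketch below is a hypothetical assumption chosen for illustration, not a measured benchmark, but the shape of the math holds: serving demand scales linearly with tokens generated, and tokens generated are bounded by per-GPU throughput.

```python
# Purely illustrative inference capacity math: every number below is a
# hypothetical assumption chosen for round arithmetic, not a benchmark.

requests_per_second = 500    # hypothetical steady-state chatbot traffic
tokens_per_response = 400    # hypothetical average response length
gpu_tokens_per_sec = 2_500   # hypothetical sustained throughput of one GPU

total_tokens_per_sec = requests_per_second * tokens_per_response
gpus_needed = total_tokens_per_sec / gpu_tokens_per_sec
print(f"GPUs needed for steady-state serving: ~{gpus_needed:.0f}")  # ~80
```

Even with generous per-GPU throughput, a single popular application keeps dozens of GPUs busy around the clock, and that demand grows with every new user.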

NVIDIA and other hardware manufacturers aren’t at risk of losing relevance. If anything, AI hardware demand is broadening, moving beyond cloud giants like OpenAI and Google to a much wider range of companies building their own AI capabilities.

DeepSeek’s success isn’t a death knell for GPUs; it’s proof that AI development is entering a more accessible, decentralized phase. The result? More companies investing in GPU infrastructure, not fewer.

Lowering the Barrier for AI Startups

This shift toward accessibility is nothing new. History has shown that when a once-exclusive technology becomes more efficient and affordable, it doesn’t shrink the market. It expands it.

Another model reshaped an industry in a similar way. No, not an LLM, but the Model T, Ford’s first assembly-line-produced vehicle. Before Ford revolutionized car manufacturing, automobiles were expensive luxury items, hand-built and only available to the wealthy. By optimizing production with assembly line techniques, Ford dramatically lowered costs, making cars affordable to a much larger audience. This approach did not reduce demand for cars, but expanded the market, inadvertently fueling the growth of entirely new industries, including highways, gas stations, motels, and suburban expansion.

DeepSeek is following a similar path by optimizing the AI training process, proving that high-performance models can be developed at a fraction of the cost previously thought necessary. Just as Ford’s innovations led to a wave of new car manufacturers and industries, DeepSeek’s approach is paving the way for more companies to enter AI model development, making the AI market more diverse and competitive.

As more companies can afford to train their own models, the AI market will evolve beyond just a few dominant players. This could lead to new applications and AI specializations, much like how the affordability of automobiles spurred advancements in logistics, transportation, and urban development. We could see breakthroughs in AI-driven robotics, domain-specific LLMs, and industry-tailored AI solutions, each requiring its own compute infrastructure.

Democratizing AI

The democratization of AI has been a stated goal for many leading organizations. Microsoft and OpenAI have made significant strides by offering access to their advanced models through APIs and free online platforms, enabling developers to integrate AI capabilities into their applications.

Meta’s release of its Llama models, including 70B-parameter variants, has provided the research community with valuable resources to advance AI development, giving researchers greater flexibility in testing and refining AI models.

However, access to these models is often limited, as many remain closed systems or require partnerships to leverage their full capabilities.

While these initiatives have expanded access to AI technologies, they often come without full transparency into the underlying methodologies. Many proprietary AI models are available for use but withhold the essential components that power their performance. Meta’s Llama models stand out as more open than most, but true open access remains rare.

DeepSeek’s decision to open-source its AI models, such as DeepSeek-R1, takes a notable step further. By releasing model weights under the permissive MIT license alongside detailed technical reports, DeepSeek has given developers substantial visibility into its architecture and training methodology, enabling AI practitioners to study, adapt, and build upon its work. This level of openness encourages experimentation and allows a broader range of organizations to refine and optimize AI models independently.
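
In practical terms, anyone can pull the published weights and run them locally. The sketch below assumes the Hugging Face transformers library (with torch and accelerate installed) and uses one of the smaller R1 distillations listed on the Hugging Face Hub; treat the exact model ID as an assumption and swap in whichever variant your hardware supports.

```python
# Minimal sketch: loading one of DeepSeek's MIT-licensed open-weight checkpoints
# from the Hugging Face Hub. The model ID below is one of the smaller published
# R1 distillations; substitute any variant your hardware can accommodate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # load in the checkpoint's native precision
    device_map="auto",   # place weights on available GPU(s), or CPU as a fallback
)

prompt = "Explain mixture-of-experts in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```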

That said, open-source AI does not always mean unrestricted AI. Reports indicate that DeepSeek's model includes censorship measures, filtering certain prompts related to politically sensitive topics in China. Queries on topics such as government policies and historical events have been found to trigger refusals or vague responses.

By making its models openly available, DeepSeek has contributed to democratizing AI, lowering financial barriers while offering a practical framework for AI research and innovation. However, as AI accessibility grows, the discussion around how models are developed, deployed, and regulated will continue to shape the future of AI adoption.

DeepSeek is a Step Forward, But Not the Endgame

DeepSeek has shown what’s possible with better optimization and fewer resources, but it also raises the question: what happens when the largest AI companies, with access to millions of GPUs, refine their own training efficiencies?

If a startup can train a state-of-the-art model for a fraction of traditional costs, the potential for massive AI players like OpenAI, Google, and Meta to push boundaries even further is enormous. With more efficient training techniques, these companies could scale models far beyond what was previously possible.

Rather than making GPUs obsolete, DeepSeek has shown that the next phase of AI isn’t about who can spend the most, but who can train the smartest.

And if there’s one thing AI developers have proven, it’s that being “good enough” isn’t the goal. Top AI developers are pushing to build the most advanced models possible. With each breakthrough, the compute demands of AI continue to rise, driving even greater need for high-performance hardware.

Optimizing AI Infrastructure with AMAX

DeepSeek’s success highlights how AI model training is shifting. Efficiency and smart resource allocation now matter just as much as raw compute power. As more companies look to train their own models, having the right infrastructure is key to maximizing performance while keeping costs in check.

AMAX designs and builds high-performance GPU clusters, delivering optimized AI solutions that meet the demands of modern AI workloads. Whether you’re a startup training a competitive LLM or an enterprise scaling AI infrastructure, AMAX provides custom, fully integrated systems that balance efficiency, power, and cost-effectiveness.

💡
For companies looking to train the next breakthrough AI model, our DGX solutions offer scalable, high-performance infrastructure to support demanding AI workloads.

From design to deployment, AMAX ensures your AI infrastructure is built to deliver results efficiently and at any scale.