
DeepSeek – Chinese Startup AI – Rival of OpenAI

by TheBlogBoy

On January 20, DeepSeek, a lesser-known AI research lab from China, unveiled an open-source model that’s quickly gained attention in Silicon Valley. According to a paper from the company, DeepSeek-R1 outperforms leading models like OpenAI’s o1 on various math and reasoning benchmarks. In fact, on several key metrics—capability, cost, and openness—DeepSeek is challenging Western AI giants.

DeepSeek’s breakthrough highlights an unexpected consequence of the ongoing tech rivalry between the US and China. US export controls have heavily restricted Chinese tech firms’ ability to compete in the traditional Western style, which relies on scaling up by purchasing more chips and training for longer periods. As a result, most Chinese companies have shifted focus to downstream applications instead of developing their own models. However, with this latest release, DeepSeek demonstrates an alternative path to success: rethinking the fundamental structure of AI models and utilizing limited resources more effectively.

By adopting open-source strategies, DeepSeek has harnessed collective expertise and encouraged collaborative innovation. This approach not only helps overcome resource limitations but also speeds up the development of advanced technologies, distinguishing DeepSeek from its more isolated competitors.

A Rising Star in China’s Hedge Fund Scene

DeepSeek stands out even in China’s rapidly growing AI sector. It originally began as Fire-Flyer, a deep-learning research division of High-Flyer, one of China’s top-performing quantitative hedge funds. Founded in 2015, High-Flyer quickly made a name for itself in China, becoming the first quant hedge fund to raise over 100 billion RMB (around $15 billion). While the fund’s value has since dropped to about $8 billion, it remains one of the country’s most influential quantitative hedge funds.


For years, High-Flyer accumulated GPUs and built Fire-Flyer supercomputers to analyze financial data. But in 2023, High-Flyer's founder, Liang Wenfeng, who holds a master's degree in computer science, made the bold decision to shift the fund's focus toward creating DeepSeek: an AI company dedicated to developing cutting-edge models and, ultimately, artificial general intelligence. It was as if a financial giant like Jane Street decided to reinvent itself as an AI startup, investing heavily in scientific research.

Liang told the Chinese tech outlet 36Kr that his decision was fueled more by scientific curiosity than financial gain. “I wouldn’t be able to find a commercial reason even if you asked me,” he said. “Because commercially, it’s not worth it. Basic scientific research has a very low return-on-investment. When OpenAI’s early investors funded them, they weren’t focused on the return—they were passionate about the mission.” Today, DeepSeek stands as one of the few major AI companies in China that doesn’t rely on backing from tech giants like Baidu, Alibaba, or ByteDance.

Innovation Born from Adversity

In October 2022, the US government introduced export controls that severely limited Chinese AI companies’ access to cutting-edge chips like Nvidia’s H100. This created a significant challenge for DeepSeek. Although the company had initially stockpiled 10,000 H100s, it needed more to compete with industry leaders like OpenAI and Meta. “Our challenge was never funding; it was the export restrictions on advanced chips,” Liang explained in a second interview with 36Kr in 2024.

To overcome this hurdle, DeepSeek had to develop more efficient ways to train its models. The company refined its model architecture through a variety of engineering techniques: custom communication schemes between chips, shrinking the size of data fields to conserve memory, and an innovative use of the mix-of-models approach. While many of these ideas aren't new, successfully combining them to produce a top-tier model is a remarkable achievement.
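To give a concrete sense of the memory-saving idea, here is a minimal sketch (assuming PyTorch, with illustrative tensor sizes; this is a generic example, not DeepSeek's actual training code) showing how storing values in a 16-bit format halves their footprint relative to 32-bit floats:

```python
import torch

# Illustrative weight matrix sizes; real models use many such tensors.
weights_fp32 = torch.randn(4096, 4096, dtype=torch.float32)  # 4 bytes per value
weights_bf16 = weights_fp32.to(torch.bfloat16)                # 2 bytes per value

mib = lambda t: t.element_size() * t.nelement() / 2**20
print(f"float32:  {mib(weights_fp32):.0f} MiB")   # ~64 MiB
print(f"bfloat16: {mib(weights_bf16):.0f} MiB")   # ~32 MiB
```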

DeepSeek has also made significant strides with Multi-head Latent Attention (MLA) and Mixture-of-Experts—two technical strategies that make its models more cost-effective by requiring fewer computing resources for training. In fact, DeepSeek’s latest model is so efficient that it only needed one-tenth the computing power of Meta’s Llama 3.1 model.
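As a rough illustration of why a mixture-of-experts design cuts the compute needed per token, here is a minimal sketch of an MoE feed-forward layer (assuming PyTorch; the class and parameter names are invented for this example and do not reflect DeepSeek's architecture). Only the top-k experts chosen by the router run for each token, so most of the layer's parameters sit idle on any single forward pass:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    def __init__(self, d_model=64, d_hidden=256, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        # One small feed-forward "expert" per slot.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(),
                          nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )
        # Router scores each token against every expert.
        self.router = nn.Linear(d_model, n_experts)

    def forward(self, x):                      # x: (tokens, d_model)
        scores = self.router(x)                # (tokens, n_experts)
        weights, idx = scores.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)   # normalize over chosen experts
        out = torch.zeros_like(x)
        # Only the selected experts process each token; the rest are skipped,
        # which is where the compute savings come from.
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(TinyMoE()(tokens).shape)  # torch.Size([16, 64])
```

Multi-head Latent Attention, which compresses the attention key-value cache, is a separate optimization and is not sketched here.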

By sharing these innovations openly, DeepSeek has garnered strong support from the global AI research community. For many Chinese AI firms, open-source models are the best way to catch up to their Western competitors, as they draw more users and contributors, which helps the models evolve. DeepSeek has shown that cutting-edge models can be built with less (though still significant) investment, and that there is plenty of room for optimization within the current norms of model-building.

This development could pose a challenge to current US export controls, which aim to create bottlenecks in computing resources.
