Is Your Data Safe in DeepSeek?
Lionel Menchaca
It’s been an eventful few days for DeepSeek.
Since DeepSeek introduced R1 on January 20, the China-based company’s open-source Large Language Model (LLM) has led many to question US tech companies’ collective (and expensive) approach to AI. That concern rocked the broader US stock market on Monday, shaving nearly $600 billion from NVIDIA’s market cap, the biggest single-day loss of market value for any company in US history.
Since then, the company’s chatbot app has rocketed to the top of both the Google Play Store and Apple’s App Store here in the United States. Late yesterday, the United States Navy banned its members from using the artificial intelligence application. And as of yesterday, OpenAI is investigating evidence that DeepSeek used “distillation” to train its open-source LLM using data extracted from OpenAI’s API.
Given all this activity, what does DeepSeek actually mean for your data?
What Is DeepSeek?
DeepSeek is a Chinese AI startup founded in 2023, specializing in developing open-source LLMs. Established in Hangzhou by Liang Wenfeng, the company rose to prominence after creating advanced AI models like DeepSeek R1, which competes with other prominent AI chatbots like OpenAI’s ChatGPT, Microsoft’s Copilot chat and Anthropic’s Claude.
DeepSeek’s approach combines a "mixture-of-experts" architecture, which activates only the most relevant model components for each query, with additional reasoning performed at inference time. The result is more efficient computational performance. The company’s models are notable for their advanced reasoning capabilities, cost-effectiveness and potential to challenge established AI technology players, marking an important development in the global AI landscape.
In terms of cost-effectiveness, one of DeepSeek’s recent models reportedly cost $5.6 million to train, a fraction of the more than $100 million reportedly spent on training OpenAI’s GPT-4.
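To make the idea of activating only the most relevant model components for each query more concrete, here is a toy top-k router in Python. It is a minimal sketch and not DeepSeek’s implementation: the names `route_top_k`, `sparse_forward` and `EXPERTS`, the gate scores and the four tiny "experts" are all hypothetical, chosen only to show how a gate can pick a few components and skip the rest.

```python
import math


def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest-scoring experts.

    The weights are renormalized so the selected experts' weights sum to 1.
    """
    if not 1 <= k <= len(gate_scores):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def sparse_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and blend their outputs by weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_scores, k))


# Hypothetical toy "experts": in a real model these would be neural sub-networks.
EXPERTS = [
    lambda x: x + 1,
    lambda x: x * 2,
    lambda x: x ** 2,
    lambda x: -x,
]

if __name__ == "__main__":
    scores = [0.1, 2.0, 0.3, 1.5]  # assumed gate scores for one query
    print(route_top_k(scores, k=2))
    print(sparse_forward(3.0, EXPERTS, scores, k=2))
```

With `k=2`, only two of the four experts run for a given input, which is where the compute savings in this kind of design come from. How DeepSeek’s production models choose and weight their components is not described in this article.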
How Does DeepSeek Handle Data?
According to DeepSeek’s privacy policy, the company collects the following information you provide:
- Profile information
- Data you input into the tool
- Information when you contact DeepSeek
DeepSeek also collects technical information such as device and network details, cookies and payment information. That means DeepSeek collects and potentially stores information based on an individual's use of the company's services.
Beyond these, it’s important to note that DeepSeek also collects ‘keystroke patterns or rhythms’ per the Automatically Collected Information section of its policy. And per the Information from Other Sources section, if you sign in through a linked service such as Apple or Google, DeepSeek also collects information from that service, such as access tokens. It stores this data and more in ‘secure servers located in the People’s Republic of China’ per its privacy policy.
Here’s more detail from a screenshot from DeepSeek’s privacy policy page as of January 29:
As a quick experiment, we asked DeepSeek what data the PRC government could access.
We didn’t get a reply from DeepSeek, even after trying again later, as it requested. It’s just one reminder that the chatbot won’t respond to questions on many topics censored by the Chinese government:
And here’s how Copilot chat responded:
Forcepoint Protects Data in GenAI Tools
It’s worth noting that DeepSeek’s privacy policy doesn’t mention any country- or region-specific data protection laws such as Europe’s GDPR. It says only that when it does need to transfer personal information out of the country of origin, it ‘will do so in accordance with the requirements of applicable data protection laws.’
When you add it all up, it’s clear that DeepSeek poses unique data security issues beyond those we’ve seen with general LLMs like ChatGPT—especially when you consider that DeepSeek may access, preserve or share collected data with law enforcement agencies.
When it comes to securing data in DeepSeek or other GenAI platforms, Forcepoint customers have options. Learn more about how our products help secure GenAI tools or talk to an expert today.
Lionel Menchaca
As the Content Marketing and Technical Writing Specialist, Lionel leads Forcepoint's blogging efforts. He's responsible for the company's global editorial strategy and is part of a core team responsible for content strategy and execution on behalf of the company.
Before Forcepoint, Lionel founded and ran Dell's blogging and social media efforts for seven years. He has a degree from the University of Texas at Austin in Archaeological Studies.