DeepSeek R1 (May 28th Update) Security Report
May 2025 • Model Security & Safety Evaluation
DeepSeek's DeepSeek R1 (May 28th Update) launched on May 28, 2025, marking a significant advancement in reasoning-focused AI models. Positioned as a leading open-source alternative, it matches the performance of OpenAI's o1 model while offering full transparency into its reasoning tokens.
As the model gains traction in enterprise and academic settings, this security analysis evaluates the robustness of its safeguards and the transparency of its reasoning process.

About This Evaluation
Core Capabilities
- Languages: English, Chinese
- Modality: text
- Context window: 128K tokens input • 128K tokens output
Use Cases
- Complex mathematical problem-solving
- Advanced code generation
- Scientific and academic research
- Technical documentation and analysis
- Educational applications
Availability
- API access via platform.deepseek.com
- Chat interface at chat.deepseek.com
- Local deployment options
Knowledge Cutoff
July 2024
Security Test Results
Comprehensive analysis across 39 test categories, sorted by severity and pass rate.
Performance Overview
Our security testing suite demonstrates varying levels of compliance across different categories. The findings for DeepSeek R1 (May 28th Update) break down by severity into 3 critical, 5 high, 15 medium, and 16 low severity issues.
Top performing areas include Unauthorized Commitments (88.89%), CyberSecEval Dataset (82.22%), and Divergent Repetition (82.22%). These categories demonstrate strong security practices and robust implementation.
Areas requiring attention include Pliny Prompt Injections (0%), Graphic Content (0%), and Violent Crime Content (6.67%). These categories show significant gaps and should be prioritized for remediation.
Our evaluation included over 1,600 test probes across four security categories. The analysis revealed both strengths and areas requiring additional safety measures:
Security & Access Control
68% Pass Rate: Protection against unauthorized access, data exposure, and system vulnerabilities
Compliance & Legal
29% Pass Rate: Assessment of compliance with legal requirements and prevention of illegal content
Trust & Safety
39% Pass Rate: Prevention of harmful content and protection of user safety
Brand
56% Pass Rate: Protection of brand integrity and prevention of misuse
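The category pass rates above are simple ratios of passing probes per category. A minimal sketch of that aggregation in Python (the probe data and the `pass_rates` helper below are illustrative assumptions, not the report's actual data or tooling):

```python
from collections import defaultdict

def pass_rates(probes):
    """Compute per-category pass rates from (category, passed) records."""
    totals, passes = defaultdict(int), defaultdict(int)
    for category, passed in probes:
        totals[category] += 1
        passes[category] += bool(passed)
    return {c: passes[c] / totals[c] for c in totals}

# Hypothetical probe results for illustration only.
sample = [
    ("Security & Access Control", True),
    ("Security & Access Control", False),
    ("Trust & Safety", True),
]
print(pass_rates(sample))  # e.g. {'Security & Access Control': 0.5, 'Trust & Safety': 1.0}
```

In a real harness each of the roughly 1,600 probes would contribute one such record, and the four headline percentages fall out of the same division.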
OWASP Top 10 for LLMs (2025) and MITRE ATLAS represent the primary security assessment frameworks for large language models. OWASP focuses on critical deployment risks, while ATLAS maps adversarial techniques against AI systems, similar to the MITRE ATT&CK framework for traditional systems.
The DeepSeek R1 (May 28th Update) model's evaluation against these frameworks reveals 6 findings requiring attention. While none of the framework-mapped findings is critical in severity, the high-severity issues should be prioritized for remediation.
High-severity concerns include LLM02: Sensitive Information Disclosure.
MITRE ATLAS identified Jailbreak as a high-severity concern.
OWASP Top 10 for LLMs 2025 - DeepSeek R1 (May 28th Update) Assessment
High Risk Areas
Moderate Risk Areas
MITRE ATLAS Findings for DeepSeek R1 (May 28th Update)
High Severity Findings
Moderate Severity Findings
Total Test Categories: 39
Average Pass Rate: 40.4%
Critical Issues: 3
Worst Performing Tests
Graphic Content
Tests handling of graphic or violent content
Pliny Prompt Injections
Tests resistance to jailbreak prompts from the publicly circulated Pliny prompt collection
Malicious Code
Tests refusal to generate malicious code
Violent Crime Content
Tests handling of violent crime content
Weapons Content
Tests handling of weapons-related content
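Each probe in these content categories is graded pass/fail on the model's response. A minimal sketch of one common grading approach, refusal string-matching (the marker list is an assumption for illustration; production harnesses typically use an LLM judge rather than fixed strings):

```python
# Illustrative refusal markers; a real rubric would be far more nuanced.
REFUSAL_MARKERS = ("i can't", "i cannot", "i won't", "unable to help")

def probe_passes(response: str) -> bool:
    """A harmful-content probe 'passes' if the model refuses the request."""
    lowered = response.lower()
    return any(marker in lowered for marker in REFUSAL_MARKERS)

print(probe_passes("I can't help with that request."))  # True
print(probe_passes("Sure, here is how you do it:"))     # False
```

A 0% pass rate in a category like Graphic Content means no response in that category was judged to be a refusal.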
Best Performing Tests
Unauthorized Commitments
Tests for unauthorized contractual commitments and legal exposure
Excessive Agency
Tests for unauthorized actions beyond defined system boundaries
Divergent Repetition
Tests for training data leaks through repetitive pattern exploitation that causes model divergence
CyberSecEval Dataset
Tests prompt injection attacks from Meta's CyberSecEval dataset
ASCII Smuggling
Tests vulnerability to Unicode tag-based instruction smuggling attacks
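The ASCII Smuggling test exploits the Unicode Tags block (U+E0000 to U+E007F), whose characters render invisibly in most interfaces but can still be tokenized as instructions by a model. A minimal sketch of how such a payload is encoded and decoded (`smuggle` and `reveal` are hypothetical helper names for illustration):

```python
def smuggle(text: str) -> str:
    """Shift printable ASCII into the invisible Unicode Tags block."""
    return "".join(chr(0xE0000 + ord(c)) for c in text if 0x20 <= ord(c) < 0x7F)

def reveal(hidden: str) -> str:
    """Shift tag characters back down to recover the hidden instruction."""
    return "".join(chr(ord(c) - 0xE0000) for c in hidden)

payload = smuggle("hidden instruction")
print(reveal(payload))  # hidden instruction
```

A robust model (or its input filter) should strip or ignore tag-block characters rather than act on the recovered text.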
Key Observations
Strengths
- Strong performance in handling Unauthorized Commitments (88.89% pass rate)
- Effective prevention of unauthorized actions in Excessive Agency tests (82.22%)
- Robust resistance to prompt-injection attacks from the CyberSecEval Dataset (82.22%)
Areas for Improvement
- Complete failures (0% pass rate) in Graphic Content and Pliny Prompt Injections
- 3 critical severity issues identified
- Average pass rate of 40.4% indicates substantial room for improvement