deepseek-ai/DeepSeek-V3
by deepseek
Pricing
Input
$0.11 / 1M tokens
Output
$0.44 / 1M tokens
DeepSeek V3 Release by DeepSeek AI
Overview
DeepSeek V3 is an open-source large language model developed by DeepSeek AI and released in December 2024. It uses a Mixture-of-Experts design that activates 37B of its 671B parameters per token, which keeps per-token compute far below that of a dense model of the same size.
- Release Date: December 26, 2024
- Architecture: Mixture-of-Experts (MoE) with 671B total parameters, 37B activated per token
- Context Length: 128,000 tokens
- Training Dataset: 14.8 trillion high-quality tokens
- Training Duration: Approximately 55 days
- Training Cost: $5.58 million
- Open-Source Availability: [GitHub Repository](https://github.com/deepseek-ai/DeepSeek-V3)
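A few derived figures follow directly from the numbers listed above. The sketch below is only back-of-the-envelope arithmetic: it assumes training ran continuously for the full 55 days and that the quoted cost is spread evenly over all training tokens, neither of which the source states. The function names are illustrative.

```python
# Figures quoted in the overview list.
TOTAL_PARAMS = 671e9        # total parameters
ACTIVE_PARAMS = 37e9        # parameters activated per token
TRAINING_TOKENS = 14.8e12   # training tokens
TRAINING_DAYS = 55          # approximate, per the overview
TRAINING_COST_USD = 5.58e6  # quoted training cost


def active_fraction(active: float, total: float) -> float:
    """Share of parameters used for each token."""
    return active / total


def tokens_per_second(tokens: float, days: float) -> float:
    """Average training throughput, assuming training ran continuously."""
    return tokens / (days * 24 * 60 * 60)


def cost_per_billion_tokens(cost_usd: float, tokens: float) -> float:
    """Training cost spread evenly over every billion training tokens."""
    return cost_usd / (tokens / 1e9)


if __name__ == "__main__":
    print(f"Active share: {active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS):.1%}")
    print(f"Throughput:   {tokens_per_second(TRAINING_TOKENS, TRAINING_DAYS):,.0f} tokens/s")
    print(f"Cost:         ${cost_per_billion_tokens(TRAINING_COST_USD, TRAINING_TOKENS):,.0f} per 1B tokens")
```

Under these assumptions, roughly 5.5% of the parameters are active per token, average throughput is about 3.1 million tokens per second, and the quoted cost works out to about $377 per billion training tokens.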
Features
- Mixture-of-Experts (MoE) architecture: only a subset of experts runs for each token, which reduces compute per token
- Multi-Head Latent Attention (MLA), which compresses the key-value cache to reduce memory use during inference
- Multi-Token Prediction (MTP) training objective, which can also be used for speculative decoding to speed up inference
- Pre-trained on 14.8 trillion tokens
- Fully open-source with accessible models, papers, and training frameworks
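To illustrate why a Mixture-of-Experts layer is cheaper than a dense one, here is a minimal sketch of generic softmax top-k routing in plain Python. It is not DeepSeek V3's actual gating mechanism, which is described in the technical report; the expert count, `top_k` value, and scalar "experts" are hypothetical and chosen only to keep the example small.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_scores, top_k):
    """Pick the top_k experts for one token and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(gate_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


def moe_forward(x, experts, gate_scores, top_k):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in route_token(gate_scores, top_k))


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical toy setup: 8 scalar "experts", 2 active per token.
    experts = [lambda x, k=k: x * (k + 1) for k in range(8)]
    gate_scores = [random.gauss(0.0, 1.0) for _ in experts]
    print(route_token(gate_scores, top_k=2))
    print(moe_forward(1.0, experts, gate_scores, top_k=2))
```

Only the `top_k` selected experts are evaluated in `moe_forward`, so compute per token scales with the number of active experts rather than with the total.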
Benchmarks
| Task | DeepSeek V3 | GPT-4o Mini | Claude 3.5 Haiku |
|---|---|---|---|
| HumanEval (coding) | 82.6% | 87.2% | 88.1% |
| MMLU (general knowledge) | 88.5% | 53.4% | 65.0% |
| Math | 90.2% | 67.0% | 69.4% |
Pricing
DeepSeek V3 is available through various platforms:
- Azure AI Foundry:
  - Input (cache miss): $0.00114 per 1,000 tokens
  - Input (cache hit): $0.00125 per 1,000 tokens
  - Output: $0.00456 per 1,000 tokens
- DeepSeek API:
  - Input (cache miss): $0.27 per 1M tokens
  - Input (cache hit): $0.07 per 1M tokens
  - Output: $1.10 per 1M tokens
Note: Pricing may vary based on usage and platform.
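Because prices on this page are quoted in different units, it helps to normalize them before comparing. The sketch below converts the Azure AI Foundry per-1,000-token prices to per-1M-token prices and estimates the cost of a single request at those rates and at the rate listed at the top of this page ($0.11 input, $0.44 output per 1M tokens). The `Rate` class, the function names, and the token counts are hypothetical, and cache-hit pricing is ignored for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Price in USD per 1M tokens."""
    input_per_m: float
    output_per_m: float


def per_thousand_to_per_million(price_per_1k: float) -> float:
    """Convert a per-1,000-token price to a per-1M-token price."""
    return price_per_1k * 1000


def request_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the given rate."""
    total = input_tokens * rate.input_per_m + output_tokens * rate.output_per_m
    return total / 1_000_000


# Rates copied from this page; the token counts below are hypothetical.
LISTED = Rate(input_per_m=0.11, output_per_m=0.44)
AZURE = Rate(
    input_per_m=per_thousand_to_per_million(0.00114),  # cache miss
    output_per_m=per_thousand_to_per_million(0.00456),
)

if __name__ == "__main__":
    for name, rate in (("listed", LISTED), ("azure", AZURE)):
        cost = request_cost(rate, input_tokens=20_000, output_tokens=2_000)
        print(f"{name}: ${cost:.4f}")
```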
Use Cases
DeepSeek V3 is suitable for a wide range of applications:
- Advanced coding assistants and development tools
- Large-context document processing and summarization
- Complex reasoning and interactive chatbots
- Enterprise-grade AI applications requiring high accuracy
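For large-context document processing, a common first step is checking whether a document fits in the 128,000-token context window and splitting it if not. The sketch below is a rough illustration only: it assumes about 4 characters per token, which is a heuristic for English text and not a property of DeepSeek V3's tokenizer, and it splits on character offsets instead of sentence or section boundaries. All names and the `reserve_tokens` default are hypothetical.

```python
CONTEXT_TOKENS = 128_000   # context length quoted in the overview
CHARS_PER_TOKEN = 4        # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Rough token estimate; a real tokenizer will give different counts."""
    return -(-len(text) // CHARS_PER_TOKEN)  # ceiling division


def split_for_context(text: str, reserve_tokens: int = 8_000) -> list[str]:
    """Split text into pieces that each fit the context window.

    reserve_tokens is held back for the prompt and the model's reply.
    """
    budget_chars = (CONTEXT_TOKENS - reserve_tokens) * CHARS_PER_TOKEN
    if budget_chars <= 0:
        raise ValueError("reserve_tokens must be smaller than the context length")
    return [text[i:i + budget_chars] for i in range(0, len(text), budget_chars)]


if __name__ == "__main__":
    document = "lorem ipsum " * 100_000  # hypothetical 1.2M-character document
    chunks = split_for_context(document)
    print(len(chunks), [estimate_tokens(c) for c in chunks])
```

In practice, counting tokens with the model's own tokenizer and splitting on paragraph boundaries would give tighter and more readable chunks.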
Safety and Stability
- Built on DeepSeek’s safety-first AI framework
- Instruction-following with jailbreak resistance
- No live internet access; responses rely on training data only
- Knowledge cutoff: December 2024
Limitations
- Image input is not yet supported (planned)
- Higher cost than some competing models
- Complex numeric reasoning remains challenging in some cases
License
Commercial use permitted under DeepSeek's terms. Available via API and cloud partners.