deepseek-ai/DeepSeek-V3

by deepseek

Pricing

Input $0.11 / 1M tokens
Output $0.44 / 1M tokens

DeepSeek V3 Release by DeepSeek AI


Overview

DeepSeek V3 is an open-source large language model developed by DeepSeek AI and released in December 2024. It uses a sparse Mixture-of-Experts design in which only a fraction of the parameters is active for each token, which keeps compute per token low relative to the model's total size.

  • Release Date: December 26, 2024
  • Architecture: Mixture-of-Experts (MoE) with 671B total parameters, 37B activated per token
  • Context Length: 128,000 tokens
  • Training Dataset: 14.8 trillion high-quality tokens
  • Training Duration: Approximately 55 days
  • Training Cost: Approximately $5.58 million (reported estimate)
  • Open-Source Availability: [GitHub Repository](https://github.com/deepseek-ai/DeepSeek-V3)

Features

  • Advanced Mixture-of-Experts (MoE) architecture for efficient computation
  • Multi-Head Latent Attention (MLA) for enhanced context understanding
  • Multi-Token Prediction (MTP) for improved inference speed
  • Pre-trained on 14.8 trillion tokens for broad knowledge coverage
  • Fully open-source with accessible models, papers, and training frameworks
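
To illustrate how a Mixture-of-Experts layer can activate only part of its parameters per token (37B of 671B in the figures above), here is a minimal sketch of generic top-k softmax routing. It is not DeepSeek V3's actual router: the names `route_top_k` and `moe_forward`, the toy experts, and the softmax gating are all assumptions chosen for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k):
    """Select the k highest-scoring experts and renormalise their gate weights.

    Hypothetical helper: generic top-k routing, not DeepSeek's implementation.
    Returns a dict mapping expert index -> gate weight (weights sum to 1).
    """
    if not 0 < k <= len(router_logits):
        raise ValueError("k must be between 1 and the number of experts")
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return {i: probs[i] / norm for i in top}


def moe_forward(x, experts, router_logits, k):
    """Run only the selected experts and mix their outputs by gate weight."""
    gates = route_top_k(router_logits, k)
    return sum(weight * experts[i](x) for i, weight in gates.items())


if __name__ == "__main__":
    # Four toy "experts" that simply scale their input.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for one token
    print(route_top_k(logits, k=2))
    print(moe_forward(10.0, experts, logits, k=2))
```

Only the `k` selected experts are evaluated for a given token, so compute scales with the active experts rather than with the total expert count.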

Benchmarks

| Task | DeepSeek V3 | GPT-4o Mini | Claude 3.5 Haiku |
| --- | --- | --- | --- |
| HumanEval (coding) | 82.6% | 87.2% | 88.1% |
| MMLU (general knowledge) | 88.5% | 53.4% | 65.0% |
| Math | 90.2% | 67.0% | 69.4% |

Pricing

DeepSeek V3 is available through various platforms:

  • Azure AI Foundry:

    • Input (cache miss): $0.00114 per 1,000 tokens
    • Input (cache hit): $0.00125 per 1,000 tokens
    • Output: $0.00456 per 1,000 tokens
  • DeepSeek API:

    • Input (cache miss): $0.27 per 1M tokens
    • Input (cache hit): $0.07 per 1M tokens
    • Output: $1.10 per 1M tokens

Note: Pricing may vary based on usage and platform.
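
As a rough aid for comparing rates, the sketch below estimates the cost of a single request from token counts. It uses the headline rates listed at the top of this page ($0.11 input and $0.44 output per 1M tokens) as defaults; the function names `estimate_cost` and `per_thousand_to_per_million` are hypothetical helpers, and real invoices may differ because of caching, rounding, or platform-specific billing.

```python
from decimal import Decimal

# Headline rates from the top of this page, in USD per 1M tokens.
INPUT_USD_PER_M = Decimal("0.11")
OUTPUT_USD_PER_M = Decimal("0.44")
TOKENS_PER_M = Decimal(1_000_000)


def estimate_cost(input_tokens, output_tokens,
                  input_rate=INPUT_USD_PER_M,
                  output_rate=OUTPUT_USD_PER_M):
    """Return the estimated USD cost of one request (hypothetical helper).

    Rates are expressed in USD per 1M tokens.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = Decimal(input_tokens) * input_rate + Decimal(output_tokens) * output_rate
    return total / TOKENS_PER_M


def per_thousand_to_per_million(rate_per_1k):
    """Convert a per-1,000-token rate to a per-1M-token rate for comparison."""
    return rate_per_1k * 1000


if __name__ == "__main__":
    # 200k input tokens and 50k output tokens at the headline rates.
    print(estimate_cost(200_000, 50_000))
    # Normalise a per-1,000-token quote so it can be compared with per-1M rates.
    print(per_thousand_to_per_million(Decimal("0.00114")))
```

Converting every quote to the same unit before comparing platforms avoids mixing per-1,000 and per-1M figures.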


Use Cases

DeepSeek V3 is suitable for a wide range of applications:

  • Advanced coding assistants and development tools
  • Large-context document processing and summarization
  • Complex reasoning and interactive chatbots
  • Enterprise-grade AI applications requiring high accuracy

Safety and Stability

  • Built on DeepSeek’s safety-first AI framework
  • Instruction-following with jailbreak resistance
  • No live internet access; responses rely on training data only
  • Knowledge cutoff: December 2024

Limitations

  • Image input is not yet supported (announced as coming soon)
  • Higher cost compared to some competitors
  • Some complex numeric reasoning remains challenging

License

Commercial use permitted under DeepSeek's terms. Available via API and cloud partners.