meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8

by meta

Pricing

Input $0.11 / 1M tokens
Output $0.34 / 1M tokens
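
As a rough illustration of how these per-token rates translate into the cost of a single request, the sketch below multiplies token counts by the listed prices. The helper name `estimate_cost_usd` and the example token counts are hypothetical; only the two rates are taken from the pricing above.

```python
# Listed rates, in US dollars per one million tokens.
INPUT_USD_PER_M = 0.11
OUTPUT_USD_PER_M = 0.34


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 2,000-token answer.
    print(f"${estimate_cost_usd(200_000, 2_000):.4f}")
```

The estimate covers token charges only; any provider-specific fees or minimums are outside what the listing states.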

LLaMA 4 Maverick Release by Meta AI


Overview

LLaMA 4 Maverick is a powerful multimodal model from Meta's LLaMA 4 family, released in April 2025. It offers strong text and image processing, a very long context window, and a highly scalable Mixture-of-Experts architecture.

  • Release Date: April 5, 2025
  • Multimodal Support: Text + Images (early fusion)
  • Context Length: 1 million tokens
  • Model Size: 400B total / 17B active (Mixture-of-Experts with 128 experts)
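
The size figures above imply that only a small share of the parameters is used for any one token. The following back-of-the-envelope sketch makes that concrete. It assumes one byte per parameter for FP8 weights (suggested by the FP8 suffix in the model name, but an assumption here) and ignores activations, KV cache, and runtime overhead; the function names are illustrative.

```python
TOTAL_PARAMS = 400e9     # total parameters, from the overview
ACTIVE_PARAMS = 17e9     # parameters active per token, from the overview
BYTES_PER_PARAM_FP8 = 1  # assumption: FP8 stores one byte per weight


def active_fraction(active: float, total: float) -> float:
    """Share of parameters that participate in processing one token."""
    return active / total


def weight_memory_gb(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return params * bytes_per_param / 1e9


if __name__ == "__main__":
    frac = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
    mem = weight_memory_gb(TOTAL_PARAMS, BYTES_PER_PARAM_FP8)
    print(f"active fraction: {frac:.2%}")
    print(f"weights only:    {mem:.0f} GB")
```

Under these assumptions about 4.25% of the parameters are active per token, while the full set of weights still has to be stored, which is why memory requirements follow the total size rather than the active size.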

Features

  • Sparse Mixture-of-Experts architecture with 128 experts
  • Multilingual: trained on 12+ languages
  • Strong performance in reasoning, math, and coding
  • Native image understanding with early fusion
  • Optimized for scalable inference on advanced hardware
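
To show what sparse Mixture-of-Experts routing means in practice, here is a toy sketch of generic top-k gating: a gate scores every expert for a token, and only the highest-scoring experts are run. This is not Meta's router. The random gate scores, the `top_k` value, and the function names are assumptions made for illustration; only the expert count of 128 comes from the list above.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route_token(gate_scores, top_k=1):
    """Pick the top_k experts for one token and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(gate_scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:top_k]
    mass = sum(probs[i] for i in chosen)
    return [(i, probs[i] / mass) for i in chosen]


if __name__ == "__main__":
    rng = random.Random(0)
    num_experts = 128  # matches the expert count listed above
    # Stand-in for the output of a learned gating layer.
    scores = [rng.gauss(0.0, 1.0) for _ in range(num_experts)]
    for expert, weight in route_token(scores, top_k=2):
        print(f"expert {expert:3d} weight {weight:.3f}")
```

Because only the selected experts run for each token, compute per token scales with the active parameters rather than with the full model.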

Benchmarks

Task LLaMA 4 Maverick GPT-4o Mini Gemini Flash
MMLU (reasoning) ~73.4% 82.0% 77.9%
MGSM (math) 87.0% 79.7%
Context Handling 1M tokens 128K tokens 128K tokens

Note: Some benchmarks are community-based; Meta has not yet published complete official evaluations.


Access & Pricing

Type             Availability
Model Weights    Free (LLaMA 4 Community License)
Inference        Optimized for scalable hardware with multiple GPUs

Use Cases

  • Multimodal assistant agents with text and image understanding
  • Large-scale document analysis with million-token context windows
  • Fast coding and reasoning tasks
  • Edge and cloud inference optimized for efficiency
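
For the document-analysis use case, it helps to check whether a set of documents plausibly fits in the context window before sending it. The sketch below uses a crude heuristic of about four characters per token, which is an assumption and not the model's tokenizer. The function names and the reserved output budget are likewise hypothetical.

```python
import math

CONTEXT_WINDOW_TOKENS = 1_000_000  # context length listed in the overview
CHARS_PER_TOKEN = 4.0              # assumption: rough average for English text


def estimate_tokens(text: str, chars_per_token: float = CHARS_PER_TOKEN) -> int:
    """Crude token estimate from character count (not a real tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(texts, reserved_for_output: int = 4_096) -> bool:
    """True if the combined estimate leaves room for the reserved output budget."""
    used = sum(estimate_tokens(t) for t in texts)
    return used + reserved_for_output <= CONTEXT_WINDOW_TOKENS


if __name__ == "__main__":
    documents = ["lorem ipsum " * 50_000, "dolor sit amet " * 20_000]
    print("estimated tokens:", sum(estimate_tokens(d) for d in documents))
    print("fits in context: ", fits_in_context(documents))
```

For real workloads, counting with the model's actual tokenizer gives a dependable answer; the heuristic is only a quick pre-check.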

Safety and Alignment

  • Fine-tuned with Meta’s alignment pipeline
  • Moderation filters and instruction following
  • Knowledge cutoff: August 2024

Limitations

  • No audio or video input/output
  • Limited official benchmark disclosure
  • No native web browsing or plugin support

License

Released under the LLaMA 4 Community License. Commercial use is permitted; organizations whose products exceed roughly 700 million monthly active users (MAUs) must request a separate license from Meta.