meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8

Overview

LLaMA 4 Maverick is a multimodal model from Meta's LLaMA 4 family, released in April 2025. It handles text and image input, supports a context window of up to 1 million tokens, and uses a highly scalable Mixture-of-Experts architecture.

Release Date: April 5, 2025
Multimodal Support: Text + Images (early fusion)
Context Length: 1 million tokens
Model Size: 400B total / 17B active (Mixture-of-Experts with 128 experts)
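
The "400B total / 17B active" figure means only a small share of the weights is used for any single token. A minimal sketch of that arithmetic, using only the two numbers listed above (the function name is hypothetical, chosen for illustration):

```python
# Figures taken from the spec lines above; everything else is plain arithmetic.
TOTAL_PARAMS_B = 400   # total parameters, in billions
ACTIVE_PARAMS_B = 17   # parameters active per token, in billions


def active_fraction(active_b: float, total_b: float) -> float:
    """Return the share of parameters used to process a single token."""
    if total_b <= 0:
        raise ValueError("total_b must be positive")
    return active_b / total_b


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active share per token: {share:.2%}")  # prints 4.25%
```

Per-token compute therefore scales with the 17B active parameters, while storage still scales with the full 400B.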

Features

Sparse Mixture-of-Experts architecture with 128 experts
Multilingual: trained on 12+ languages
Strong performance in reasoning, math, and coding
Native image understanding with early fusion
Optimized for scalable inference on advanced hardware
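
To illustrate what "sparse" means in the first feature above, here is a generic top-k routing sketch in plain Python. It is not Meta's implementation: the router design, the choice of k=1, and the toy scalar "experts" are all assumptions made for illustration, and only the expert count (128) comes from this card.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_logits, k=1):
    """Pick the k highest-scoring experts and renormalize their weights."""
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, router_logits, k=1):
    """Weighted sum of the outputs of only the selected experts."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))


if __name__ == "__main__":
    random.seed(0)
    num_experts = 128  # expert count from the feature list
    # Hypothetical toy experts: each one scales its input by a different factor.
    experts = [lambda x, s=i: x * (s + 1) for i in range(num_experts)]
    # Stand-in for router scores; a real router computes these from the token.
    logits = [random.gauss(0.0, 1.0) for _ in range(num_experts)]
    print("selected expert:", route_top_k(logits, k=1)[0][0])
    print("output:", moe_forward(2.0, experts, logits, k=1))
```

Only the selected experts are evaluated for a given token, which is why the active parameter count stays far below the total.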

Benchmarks

Task	LLaMA 4 Maverick	GPT-4o Mini	Gemini Flash
MMLU (reasoning)	~73.4%	82.0%	77.9%
MGSM (math)	—	87.0%	79.7%
Context Handling	1M tokens	128K tokens	128K tokens

Note: Some benchmark figures are community-reported; Meta has not yet published complete official evaluations.

Access & Pricing

Type	Availability
Model Weights	Free (LLaMA 4 Community License)
Inference	Optimized for scalable hardware with multiple GPUs
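
A rough sizing sketch shows why inference needs multiple GPUs. It assumes 1 byte per parameter for FP8 (as in the checkpoint name) and 2 bytes for BF16, counts weights only (no KV cache or activations), and treats the per-GPU memory and the usable fraction as hypothetical inputs rather than vendor figures.

```python
import math

# Storage size per weight, in bytes.
BYTES_PER_PARAM = {"fp8": 1, "bf16": 2}


def weight_memory_gb(total_params_b: float, dtype: str) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    return total_params_b * BYTES_PER_PARAM[dtype]


def min_gpus(weight_gb: float, gpu_memory_gb: float,
             usable_fraction: float = 0.8) -> int:
    """Smallest GPU count whose usable memory covers the weights alone.

    usable_fraction is an ASSUMED headroom factor, not a measured value.
    """
    return math.ceil(weight_gb / (gpu_memory_gb * usable_fraction))


if __name__ == "__main__":
    fp8_gb = weight_memory_gb(400, "fp8")  # 400B total parameters
    print(f"FP8 weights: ~{fp8_gb:.0f} GB")
    # Hypothetical 80 GB accelerators with 80% of memory left for weights.
    print("GPUs needed (weights only):", min_gpus(fp8_gb, 80))
```

All experts have to be resident even though few are active per token, so the estimate uses the total parameter count. Real deployments need additional memory for the KV cache, which grows with context length.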

Use Cases

Multimodal assistant agents with text and image understanding
Large-scale document analysis with million-token context windows
Fast coding and reasoning tasks
Edge and cloud inference optimized for efficiency
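
For the document-analysis use case, a quick pre-flight check can estimate whether a set of documents fits in the 1M-token window. This is a sketch under a loud assumption: it approximates token counts as 4 characters per token, a rough rule of thumb for English text, not the model's real tokenizer. The function names and the output reserve are hypothetical.

```python
CONTEXT_LIMIT_TOKENS = 1_000_000  # context length listed in the overview
CHARS_PER_TOKEN = 4               # ASSUMPTION: heuristic, not the real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate the token count of text using ceiling division."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(documents, reserved_for_output: int = 4_096):
    """Return (fits, estimated_tokens) for a list of document strings.

    reserved_for_output is a hypothetical budget kept free for the reply.
    """
    budget = CONTEXT_LIMIT_TOKENS - reserved_for_output
    used = sum(estimate_tokens(d) for d in documents)
    return used <= budget, used


if __name__ == "__main__":
    docs = ["lorem ipsum " * 1_000, "dolor sit amet " * 2_000]
    ok, used = fits_in_context(docs)
    print(f"estimated tokens: {used}, fits: {ok}")
```

For a real pipeline, the estimate should be replaced with counts from the model's own tokenizer, since the ratio varies by language and content type.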

Safety and Alignment

Fine-tuned with Meta’s alignment pipeline
Moderation filters and instruction following
Knowledge cutoff: August 2024

Limitations

No audio or video input/output
Limited official benchmark disclosure
No native web browsing or plugin support

License

Released under the LLaMA 4 Community License. Commercial use is permitted, but licensees whose products or services exceed 700 million monthly active users (MAUs) must request a separate license from Meta.