gemini-2.0-flash
by google
Pricing
Input
$0.04 / 1M tokens
Output
$0.16 / 1M tokens
Gemini 2.0 Flash Release by Google DeepMind
Overview
Gemini 2.0 Flash is a fast, cost-optimized, multimodal LLM released by Google in early 2025. It's built for speed, tool use, and large-context reasoning across modalities like text, images, and audio.
- Release Date: February 5, 2025
- Multimodal Support: Text, Code, Images, Audio, Video (input); Text + Tool use (output)
- Context Length: Up to 1,048,576 tokens (1M+ input)
- Output Limit: Up to 8,192 tokens
Features
- Ultra low latency and high throughput
- Native tool use (search, function calling, code execution)
- Multimodal input (text, image, audio, video)
- Enhanced reasoning and retrieval abilities
- Flash-Lite variant available for extreme cost-efficiency
Benchmarks
Task | Gemini 2.0 Flash | GPT-4o | Claude 3.5 Haiku |
---|---|---|---|
Visual QA (Ophthalmology) | 54.8% | 52.2% | 49.4% |
HumanEval (coding) | 87.0% | 87.2% | 83.5% |
Tool Use & Function Calling | Integrated Natively | Function-calling support | Limited support |
Pricing
Model Tier | Price per 1M tokens |
---|---|
Flash 2.0 | $0.35 input / $1.05 output |
Flash-Lite | $0.019 input / $0.06 output |
Flash-Lite is among the cheapest high-context models on the market in 2025.
Use Cases
- Realtime chatbots and customer service agents
- Multimodal assistants with audio/video processing
- Lightweight search-augmented agents and retrieval apps
- Function-calling and automation pipelines
- Image-based tutoring, feedback, or Q&A tools
Safety and Stability
- Knowledge cutoff: August 2024
- Bias mitigation ongoing (reduced gender bias vs GPT-4o)
- Tool use and safety guardrails built-in via Vertex AI
- Supports structured outputs, safe parsing, and filter layers
Limitations
- Still susceptible to hallucinations in long-chain reasoning
- Output modalities (video, audio) still in preview
- Multimodal APIs not universally available on all platforms
- Works best within Google’s ecosystem (Vertex AI, AI Studio)
License
Gemini 2.0 Flash is commercially usable via Google Cloud’s Vertex AI and Gemini API. Usage is governed by Google Cloud Platform terms.