gemini-2.0-flash

by google

Pricing

Input $0.04 / 1M tokens
Output $0.16 / 1M tokens

Gemini 2.0 Flash Release by Google DeepMind


Overview

Gemini 2.0 Flash is a fast, cost-optimized, multimodal LLM released by Google in early 2025. It's built for speed, tool use, and large-context reasoning across modalities like text, images, and audio.

  • Release Date: February 5, 2025
  • Multimodal Support: Text, Code, Images, Audio, Video (input); Text + Tool use (output)
  • Context Length: Up to 1,048,576 tokens (1M+ input)
  • Output Limit: Up to 8,192 tokens

Features

  • Ultra low latency and high throughput
  • Native tool use (search, function calling, code execution)
  • Multimodal input (text, image, audio, video)
  • Enhanced reasoning and retrieval abilities
  • Flash-Lite variant available for extreme cost-efficiency

Benchmarks

Task Gemini 2.0 Flash GPT-4o Claude 3.5 Haiku
Visual QA (Ophthalmology) 54.8% 52.2% 49.4%
HumanEval (coding) 87.0% 87.2% 83.5%
Tool Use & Function Calling Integrated Natively Function-calling support Limited support

Pricing

Model Tier Price per 1M tokens
Flash 2.0 $0.35 input / $1.05 output
Flash-Lite $0.019 input / $0.06 output

Flash-Lite is among the cheapest high-context models on the market in 2025.


Use Cases

  • Realtime chatbots and customer service agents
  • Multimodal assistants with audio/video processing
  • Lightweight search-augmented agents and retrieval apps
  • Function-calling and automation pipelines
  • Image-based tutoring, feedback, or Q&A tools

Safety and Stability

  • Knowledge cutoff: August 2024
  • Bias mitigation ongoing (reduced gender bias vs GPT-4o)
  • Tool use and safety guardrails built-in via Vertex AI
  • Supports structured outputs, safe parsing, and filter layers

Limitations

  • Still susceptible to hallucinations in long-chain reasoning
  • Output modalities (video, audio) still in preview
  • Multimodal APIs not universally available on all platforms
  • Works best within Google’s ecosystem (Vertex AI, AI Studio)

License

Gemini 2.0 Flash is commercially usable via Google Cloud’s Vertex AI and Gemini API. Usage is governed by Google Cloud Platform terms.