Glossary

Gemini 2.x

Gemini 2.x is Google DeepMind's second major generation of the Gemini model family, succeeding the December 2023 launch of Gemini 1.0 and the February 2024 long-context Gemini 1.5. The 2.x line spans Gemini 2.0 Flash (December 2024), Gemini 2.0 Flash Thinking, Gemini 2.0 Pro, and the Gemini 2.5 Pro and Flash variants released through 2025.

Native multimodality. Gemini 2.0 Flash generates images and audio natively within a single model, rather than routing to separate diffusion or text-to-speech systems. This lets the model interleave text, images and speech in one response, generate sequences of consistent images, and edit images via natural-language instructions while preserving subject identity. By 2025 the same approach extended to short video.
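In the API, interleaved output is requested by listing the response modalities. The sketch below builds such a request body; the responseModalities field follows the public Gemini API, while the prompt and variable names are illustrative, not an official example.

```python
# Sketch of a generateContent request body asking for interleaved
# text + image output from a single natively multimodal model.
# "responseModalities" follows the public Gemini API; the prompt
# itself is made up for illustration.
image_request = {
    "contents": [{
        "parts": [{"text": "Illustrate each step of folding a paper crane."}]
    }],
    "generationConfig": {
        "responseModalities": ["TEXT", "IMAGE"],
    },
}
```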

Reasoning. Gemini 2.0 Flash Thinking is the explicit-reasoning sibling, exposing the model's intermediate chain of thought. Like OpenAI's o-series and Claude's extended thinking, it trades latency for accuracy on mathematics, coding and multi-step problems. Gemini 2.5 unified the two paths: a single model toggles between fast and deliberate modes per request.
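The per-request toggle can be sketched as a thinking budget attached to each call, with 0 meaning no deliberate reasoning. The thinkingConfig/thinkingBudget field names below follow the Gemini API; make_request and the prompts are illustrative assumptions.

```python
# Sketch: one model serving both fast and deliberate requests, toggled
# per call via a thinking budget (0 = respond immediately, larger
# values allow more intermediate reasoning). Field names follow the
# Gemini API's thinkingConfig; the helper and prompts are illustrative.
def make_request(prompt: str, thinking_budget: int) -> dict:
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }

fast = make_request("What is 7 * 8?", thinking_budget=0)
deliberate = make_request("Prove the sum of two odd numbers is even.",
                          thinking_budget=2048)
```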

Context length. Gemini 1.5 introduced 1-million-token context with strong needle-in-haystack retrieval, enabled by improvements in attention and retrieval routing. Gemini 2.x preserved 1M tokens as the default and offered 2M-token windows on Pro tiers. This made Gemini the de facto choice for whole-codebase reasoning, long-video understanding, and multi-document research, and it shaped how teams structured prompts: full repositories or hour-long transcripts could be passed inline rather than chunked through retrieval.
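The shift away from chunked retrieval can be sketched as simple prompt packing: concatenate whole files until the context budget is reached. pack_files and the 4-characters-per-token heuristic below are illustrative assumptions, not an official utility.

```python
# Sketch: packing an entire codebase into one long-context prompt
# instead of chunking it through a retrieval pipeline. Assumes a
# rough ~4 characters per token, so 4M chars ≈ a 1M-token window.
def pack_files(files: dict[str, str], max_chars: int = 4_000_000) -> str:
    """Concatenate named source files into a single prompt string,
    stopping once the character budget would be exceeded."""
    parts, total = [], 0
    for name in sorted(files):
        block = f"## {name}\n{files[name]}\n"
        if total + len(block) > max_chars:
            break
        parts.append(block)
        total += len(block)
    return "".join(parts)

prompt = pack_files({"app.py": "print('hi')", "util.py": "x = 1"})
```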

Agentic features. Gemini 2.0 launched alongside Project Mariner (a browser-driving agent) and Project Astra (a real-time multimodal assistant). The Gemini API added native tool use, code execution, Google Search grounding, and structured output. Gemini also became the model behind NotebookLM, including its widely shared audio-overview feature.
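Structured output, for instance, is requested by attaching a response schema to the call. The responseMimeType and responseSchema fields below follow the public Gemini API; this particular extraction schema is a made-up example.

```python
# Sketch: constraining model output to JSON matching a schema. The
# responseMimeType / responseSchema fields follow the Gemini API's
# structured-output feature; the schema itself is illustrative.
extract_request = {
    "contents": [{
        "parts": [{"text": "Extract the meeting date and attendees."}]
    }],
    "generationConfig": {
        "responseMimeType": "application/json",
        "responseSchema": {
            "type": "OBJECT",
            "properties": {
                "date": {"type": "STRING"},
                "attendees": {"type": "ARRAY", "items": {"type": "STRING"}},
            },
        },
    },
}
```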

Architecture. Google has not disclosed parameter counts. Public information indicates a mixture-of-experts transformer with sparse routing, multimodal early-fusion encoders, and infrastructure tuned for Google's TPU v5p and Trillium accelerators. Training data spans Google's web index, code corpora, and licensed multimodal sources.

Distribution. Available through the Gemini app (gemini.google.com), Google AI Studio, Vertex AI, Workspace integrations (Docs, Gmail, Sheets), and Android / Pixel system features. Gemini Nano runs on-device on flagship Pixel and Samsung phones.

Position. By early 2026 Gemini 2.5 Pro is widely regarded as one of the top three frontier systems alongside Claude 4 Opus and GPT-5, with particular strengths in long-context retrieval, multimodal generation and integration with Google's productivity surface.

Related terms: Transformer, Mixture of Experts, Reasoning Model Training, Thinking Tokens

This site is currently in Beta. Contact: Chris Paton
