OpenAI launched ChatGPT Images 2.0 (gpt-image-2), its first image model with native reasoning capabilities. The model operates in two modes: Instant for fast output and Thinking for deliberate, multi-step generation. Key improvements include multi-image consistency from a single prompt, up to 2K resolution, better instruction-following, dense text rendering in non-Latin scripts, and flexible aspect ratios. It integrates with Codex for in-workflow visual creation. Known limitations include difficulty with physical-world spatial tasks and diminishing returns on iterative edits. DALL-E 2 and 3 are being retired on May 12, which makes gpt-image-2 their strategic replacement at a time when Google's Gemini leads the LM Arena image leaderboard.
Table of contents
What Images 2.0 does
What Images 2.0 does better
The competition heats up
Codex integration
Developer access
What’s next