OpenAI launched ChatGPT Images 2.0 (gpt-image-2), its first image model with native reasoning capabilities. The model operates in two modes: Instant for fast output and Thinking for deliberate, multi-step generation. Key improvements include multi-image consistency from a single prompt, up to 2K resolution, better instruction-following, dense text rendering in non-Latin scripts, and flexible aspect ratios. It integrates with Codex for in-workflow visual creation. The model still struggles with physical-world spatial tasks and shows diminishing returns on iterative edits. DALL-E 2 and 3 are being retired on May 12, which makes gpt-image-2 their strategic replacement at a time when Google's Gemini leads the LM Arena image leaderboard.

5m read time, from thenewstack.io
Table of contents
What Images 2.0 does
What Images 2.0 does better
The competition heats up
Codex integration
Developer access
What’s next
