A developer revisits OpenAI's Codex after months of dismissing it and finds it has significantly improved. Key highlights include better value for money compared to Claude Max ($20 vs $100 with more usable daily runway), stronger autonomous reasoning that makes reasonable assumptions without constant hand-holding, a new /goal slash command enabling long-horizon autonomous agent behavior, and GPT-5.5 scoring 82.7% on Terminal-Bench 2.0 versus Claude Opus 4.7's 69.4%. The author concludes that Codex has matured into a compelling alternative to Claude Code, shifting their tool selection from brand loyalty to actual performance.

4m read time. From xda-developers.com
Table of contents
The limits alone make Codex worth a second look
Codex is superior at reasoning and more autonomous
Codex feels like it finally knows what it wants to be
