From x.com

Peter Steinberger 🦞 @steipete
RT @OpenAIDevs: The standard for frontier coding evals is changing with model maturity. We now recommend reporting SWE-bench Pro and are s…