Is GPT-5 a “phenomenal” success or an “underwhelming” failure?

It was inevitable that people would be disappointed with last week’s release of GPT-5. That’s not because OpenAI did a poor job, nor because it did anything in particular to hype up the new version. The problem is simply that OpenAI’s previous “major” model releases—GPT-2, GPT-3, and GPT-4—were so consequential.

… So of course people had high expectations for GPT-5. And OpenAI seems to have worked hard to meet those expectations.

… OpenAI probably should have given the GPT-5 name to o1, the reasoning model OpenAI announced last September. That model really did deliver a dramatic performance improvement over previous models. It was followed by o3, which pushed this paradigm—based on reinforcement learning and long chains of thought—to new heights. But we haven’t seen another big jump in performance over the last six months, suggesting that the reasoning paradigm may also be reaching a point of diminishing returns (though it’s hard to know for certain).

Regardless, OpenAI found itself in a tough spot in early 2025. It needed to release something it could call GPT-5, but it didn’t have anything that could meet the sky-high expectations that had developed around that name. So rather than using the GPT-5 name for a dramatically better model, it decided to use it to signal a reboot of ChatGPT as a product.

… The reality is that GPT-5 is a solid model (or, technically, a suite of models—we’ll get to that) that performs as well as or better than anything else on the market today. In my own testing over the last week, I found GPT-5 to be the most capable model I’ve ever used. But it’s not the kind of dramatic breakthrough people expected from the GPT-5 name. And it has some rough edges that OpenAI is still working to sand down. — Read More

#nlp