The new wave of frontier AI models in May 2026: Opus 4.7, GPT-5.5, and Gemini 3.1 Ultra
James
Co-founder of Smash Your AI - 18 years in education, now helping businesses and individuals get real results from AI.
Three of the four biggest AI labs in the world shipped a new flagship model in the last three weeks. If you blinked, you missed it. If you tried to keep up, you probably ended up reading 30 launch posts and feeling more confused than when you started.
This is the plain English version. What each model actually does, what changed compared with the version it replaced, and why a small business owner or busy professional should care. No benchmark charts, no acronym soup.
Why three launches in three weeks matters
For most of 2025, the top AI tools felt fairly stable. ChatGPT, Claude, and Gemini were all roughly comparable on the things normal people use them for. Pick one, get on with your work.
That has changed in the last month. Anthropic, OpenAI, and Google all pushed out a major new flagship within a fortnight of each other. Each of them claims to be best at something different. And the prices, capabilities, and ideal use cases are now genuinely different in ways that matter.
If you are paying for an AI subscription, this is the first time in a while where the choice you made six months ago might not be the right one today.
Claude Opus 4.7 (Anthropic, 16 April 2026)
What actually shipped
Anthropic launched Claude Opus 4.7 in mid-April. It is the most capable model in the Claude family, slotting in above Claude Sonnet 4.6 and Haiku 4.5. The headline change is that Opus 4.7 is built for agentic work. That is the industry word for AI that can do a long sequence of tasks on your behalf, not just answer one question at a time.
In practice that means three things. First, Opus 4.7 can use a computer like a person, opening browser tabs, clicking buttons, filling in forms. Second, it can hold a much longer plan in its head before it loses the thread, so you can give it a multi-hour task and walk away. Third, it follows precise instructions much more closely than the version before it, which makes a real difference if you have a strict process you want it to follow.
Why it matters for your business
If you have ever wished you could hand AI a task list and just come back to a finished result, Opus 4.7 is the first model that gets close. Picture giving it your inbox plus a brief on how you respond to common enquiries, and letting it draft replies for half an hour while you do something else.
Anthropic also pushed Claude into Microsoft Word as a beta in April. So if your team lives in Word documents, you can now have Opus working alongside you without leaving the file. We covered the broader Claude family in our plain English tour of Claude if you want to see what else sits in that ecosystem.
GPT-5.5 (OpenAI, 23 April 2026)
What actually shipped
OpenAI followed a week later with GPT-5.5. This is an iterative release rather than a brand new generation. Think of it as GPT-5 with a tune-up. The big areas of improvement are coding and what OpenAI calls agentic tasks, which is the same idea as Anthropic's pitch for Opus 4.7.
The other change worth knowing about is speed. GPT-5.5 is noticeably faster on the kind of medium length tasks that ChatGPT users do every day. Drafting an email, summarising a meeting, rewriting a paragraph. The wait between hitting send and getting a useful answer is shorter than it was on GPT-5.
Why it matters for your business
If you already pay for ChatGPT Plus or Team, GPT-5.5 is included with no price change. You get a quietly better tool the next time you log in. That is a much easier upgrade story than choosing a whole new model.
For people doing real work in ChatGPT, the speed improvement is the bit you will actually notice. The "best at coding" claim matters mostly to developers. The "better at agentic tasks" claim sounds impressive but is harder to feel in day to day use unless you are deliberately building automated workflows.
Gemini 3.1 Ultra (Google, late April 2026)
What actually shipped
Google's flagship for late April is Gemini 3.1 Ultra. The headline is the context window. Gemini 3.1 Ultra can hold up to two million tokens of context in a single conversation. In normal language, that is around 1.5 million words, or about 15 average books, all available at once.
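If you want to sanity-check whether your own documents would fit, the conversion above rests on a common rule of thumb of roughly 0.75 words per token. That ratio is an assumption (the exact figure varies by model and by text), but it is close enough for back-of-envelope planning:

```python
# Rough estimate of whether text fits in a model's context window.
# Assumes ~0.75 words per token, a common rule of thumb, not an exact figure.
WORDS_PER_TOKEN = 0.75

def estimated_tokens(word_count: int) -> int:
    """Convert a word count to an approximate token count."""
    return round(word_count / WORDS_PER_TOKEN)

def fits_in_window(word_count: int, window_tokens: int = 2_000_000) -> bool:
    """True if text of this length should fit in the given context window."""
    return estimated_tokens(word_count) <= window_tokens

# A 1.5 million word archive comes out at roughly 2 million tokens
print(estimated_tokens(1_500_000))  # → 2000000
print(fits_in_window(1_500_000))    # → True
```

In other words, a full quarter's worth of meeting transcripts or a long customer service archive is exactly the scale this window was built for.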
It is also natively multimodal. That means it handles text, images, audio, and video without you having to switch tools or convert anything. You can drop a one hour video meeting recording into Gemini 3.1 Ultra and ask questions about what was said, and it answers from the audio directly.
Why it matters for your business
If your work involves long documents, large transcripts, or hours of recorded calls, this changes the maths. With a two million token window you can drop in your entire customer service history, your full product manual, or every meeting from the last quarter, and ask questions across all of it without having to chunk the data up first.
For people who do not work with that much information at once, the context window is overkill. The native video and audio handling is more interesting for everyday use. If you record meetings, run webinars, or produce podcasts, Gemini 3.1 Ultra can do useful things with those recordings that the other models still need plug-ins for.
NOT SURE WHICH ONE IS RIGHT FOR YOU?
Get a free AI audit and we will tell you which model fits your business
Book a free discovery call and we will look at your workflows and recommend the model and setup that will actually save you time. No sales pitch.
The bigger picture
If you stand back from the individual launches, three things stand out.
First, the labs are no longer chasing each other on the same metric. Anthropic is leaning into agents and computer use. OpenAI is leaning into speed and developer tooling. Google is leaning into context length and multimodality. They are not converging, they are diverging.
Second, the gap between the top three and everyone else is now huge. If you are still using a free or older AI tool that has not been updated in six months, the difference in what you can do has become significant. We wrote up the case for paid plans in is paying for AI worth it and the picture has only got clearer since.
Third, "which AI is best" is now an unhelpful question. The right question is "which AI is best for the kind of work I actually do". The answer for a freelance designer is different from the answer for a five person accounting firm, which is different again for a school or a sole trader. We dug into that question for small businesses specifically in Opus 4.7 vs GPT-5.5 vs Gemini 3.1 Ultra: which one should a small business actually use.
What you should actually do this month
If you only have ten minutes, do this:
- Check what you are paying for. If you have a ChatGPT, Claude, or Gemini subscription, log in and look at which model your account currently defaults to. Some people are still on year-old versions without realising.
- Try the same prompt in a different tool. Take a real task you did this week. Run it through whichever tool you do not currently use. The difference in the output is the most useful information you will get this month.
- Pick one new feature to test. Long context, computer use, or agentic workflow. Try the one that maps onto a problem you actually have.
The pace of releases is not slowing down. The labs have already hinted at more updates this summer. The right approach is not to chase every announcement, but to revisit your tooling once a quarter and ask whether what you set up six months ago is still the best fit.
If you want a steer on which model to use for what, our AI model recommendations page is updated for these new releases. And if you would rather have us look at your specific business and tell you what to use, that is exactly what our free AI audit is for.