OpenAI’s new agent can browse the web and complete tasks for you. We tested it on the UK’s biggest pensions websites.
AI agents are set to change the way we use the internet.
Think of an AI agent like a digital assistant. Unlike a basic chatbot that can just answer questions, AI agents can use a computer or a web browser to get things done.
This technology is in its early stages. After two years of speculation, we’re finally starting to see AI agents become available to users, including OpenAI’s first agent, Operator.
Just as with ChatGPT, you set Operator a task by typing an instruction in a chat window. But that’s where the similarities end. Operator completes tasks by using a web browser. Operator can click, type and scroll – which means it can navigate and use websites just like a human would. And when it hits an inevitable curveball – like a pop-up, a login screen, or a paywall – it will either try to fix the problem or hand back to the user for help.
Particularly interesting are the use cases that OpenAI is touting, like using Operator to book a taxi, reserve a table at a restaurant, or buy things online.

We’ve been testing Operator to figure out how it works, how it handles money questions, and whether it can successfully complete tasks on pensions websites. Here’s what we learned.
Operator searches using Bing, and only looks at the top few results
For every task we set Operator, it began by using Bing to search the internet.
Operator showed very consistent browsing behaviour. It always ignored AI-generated search results. Most of the time, it only clicked the first search result that appeared. We never saw it open more than 4 results.
For example, when we asked for the best personal pension scheme in the UK, Operator only opened the top link – an article from Forbes. Then, Operator wrote us an answer that was only based on this article.
Google accounts for around 90% of search traffic in the UK, so it’s the search engine that companies optimise for the most. Companies care about where they appear in Google search results, and they’re more relaxed about Bing.
But if you are not the top search result in Bing, then Operator is unlikely to read your website. And if you place outside the first few results then, on the evidence of our tests, it will not read your website at all.
That means you, and your content, will not be served up to a user.
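The behaviour we observed can be written down as a simple rule of thumb. The sketch below is illustrative only: the function name `agent_visibility`, the labels it returns, and the example URLs are hypothetical, and the cut-off of 4 reflects what we saw in our own tests, not a documented limit. It takes a list of result URLs that you have collected yourself, in the order the search engine showed them.

```python
from urllib.parse import urlparse


def agent_visibility(result_urls, your_domain, max_opened=4):
    """Estimate whether an agent that opens only the top results would read your site.

    result_urls: search result URLs in ranked order (collected by hand or by
                 any tool you already use).
    your_domain: e.g. "example-pensions.co.uk" (hypothetical).
    max_opened:  assumption based on our tests, where the agent never opened
                 more than 4 results.
    """
    your_domain = your_domain.lower()
    for position, url in enumerate(result_urls, start=1):
        host = (urlparse(url).hostname or "").lower()
        # Match the bare domain and any subdomain such as "www."
        if host == your_domain or host.endswith("." + your_domain):
            if position == 1:
                return "likely read"
            if position <= max_opened:
                return "possibly read"
            return "unlikely to be read"
    return "not in results"


# Hypothetical ranked results for a single search query
results = [
    "https://www.example-news.com/best-pensions",
    "https://www.example-pensions.co.uk/guide",
    "https://blog.example-advice.org/pensions",
]
print(agent_visibility(results, "example-pensions.co.uk"))
```

Running the same check across the questions your customers are most likely to ask gives a rough picture of where an agent would, and would not, reach your content.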
If your content is not human friendly, then it is not agent friendly
We asked Operator to visit a leading UK money website and find the answer to a simple question: ‘What is a mortgage?’
We watched Operator do all the things a human might do – starting with navigating to the correct section of the website and clicking the most logical pages. The answer, however, was nowhere to be found. Operator then turned to the website search box to search for terms like ‘mortgage explained’, ‘mortgage definition’, and ‘what is a mortgage’. None of these brought up the correct answer either.
After 3 minutes, Operator eventually stumbled upon the answer by accident.
This was not a failure of the agent.
When we asked our resident humans to perform the same test, they made all the same mistakes that Operator did, because the information was not where they expected to find it.
The fastest human attempt took 45 seconds. Research shows that most humans would have given up after no more than 20 seconds.
Accordions cause Operator problems
We set Operator loose on a pensions website that uses a variety of design elements, like tables, scrollable boxes, and accordions (boxes that expand when you click on them). We asked it to figure out when a deferred member can retire from the scheme.
As we watched Operator work, we found it struggled with accordions. It tried to interact with them, but failed to properly read the information contained within them. That meant we got the wrong answer to a question about when a member can retire.
The accordions were intended to make the information on the page digestible to humans, but they stumped Operator. We spotted this error because we were supervising what was happening. However, as agents become more commonplace, users are unlikely to give them this level of attention.
We also found that Operator does not open every accordion on the page. It only opens the accordions it thinks it needs to complete a task. That means if you hide content with lots of accordions, then Operator might get lost or confused.
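A content team could get a rough idea of how much of a page sits behind accordions with a short script. The sketch below is a minimal example, assuming the accordions are built with the standard HTML `<details>` and `<summary>` elements. Many sites use JavaScript widgets instead, which this would not catch. The names `CollapsedTextFinder` and `find_collapsed_text`, and the sample markup, are hypothetical.

```python
from html.parser import HTMLParser


class CollapsedTextFinder(HTMLParser):
    """Collects text that sits inside closed <details> elements."""

    def __init__(self):
        super().__init__()
        self._stack = []        # one bool per enclosing <details>: True if closed
        self._in_summary = 0    # depth of <summary> nesting
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag == "details":
            # A <details> element is collapsed unless it has the "open" attribute
            self._stack.append("open" not in dict(attrs))
        elif tag == "summary":
            self._in_summary += 1

    def handle_endtag(self, tag):
        if tag == "details" and self._stack:
            self._stack.pop()
        elif tag == "summary" and self._in_summary:
            self._in_summary -= 1

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # A <summary> label stays visible when its own <details> is closed,
        # so only the outer <details> elements can hide it.
        enclosing = self._stack[:-1] if self._in_summary else self._stack
        if any(enclosing):
            self.hidden_text.append(text)


def find_collapsed_text(html):
    """Return the text fragments that are hidden until a user expands them."""
    finder = CollapsedTextFinder()
    finder.feed(html)
    finder.close()
    return finder.hidden_text


# Hypothetical page fragment
sample = """
<h2>Retirement ages</h2>
<details><summary>Deferred members</summary><p>Hidden until expanded.</p></details>
<details open><summary>Active members</summary><p>Visible by default.</p></details>
"""
print(find_collapsed_text(sample))
```

If the answers to your users’ most common questions show up in that list, it may be worth making them visible by default.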
Operator does a good job of handling forms
We wanted to see how well Operator could handle filling out online forms, so we put it to work on Unbiased – a website that matches people with financial advisors.
As we worked through the forms together, Operator asked us what information it should input at each step. When we reached the age selection part, with its various ranges like 20 to 30 and 30 to 40, Operator paused to check which range to pick. We told it we were 38 years old, and it selected the correct 30 to 40 bracket. Later, when we came to a question about income, we simply told Operator to pick the biggest option available, which it did without any trouble.

We set Operator a different challenge by sending it to a defined benefit pension scheme website. We asked it to find out what would happen to the benefits of a deferred member who died. We thought it would fail this task, because the correct page was hidden in a drop-down navigation menu under the subsection ‘If you’re not taking your benefits yet’. Operator handled this easily. It navigated the menu and picked the correct subsection and page on the first try.
Operator is appropriately cautious, but it will use pension member portals
We gave Operator tasks like sending complaint emails and logging in to bank accounts. We found it will not do these without supervision, and sometimes will refuse outright.

Next, we asked Operator to log in to a pension account that belonged to one of our team. It was willing to do this as long as we typed in the personal details. After that, Operator took the wheel again. We asked it to navigate to the member portal and figure out how much our pension had grown in the last year.
Operator could not find the information. It stumbled blindly through the pages, and started trawling through investment reports to find details of fund performance. We took control, expecting to find an investment graph or a simple figure explaining how much our investments had changed. There was nothing to be found.
Eventually, we found the answer by downloading an annual statement and picking through the figures ourselves.
Out of curiosity, we wanted to see if a human would experience the same problems as Operator. We asked a family member to have a go at the task. They also struggled. After 3 minutes of exploring, they found the information in an annual statement – in the 8th place that they looked.
The verdict? It’s early days for AI agents
Operator is as basic as an agent will ever be. It’s slow, can make mistakes, and is only available as a research preview to a limited audience of users who are paying a fair chunk of money to be beta testers.
And yet, Operator is a glimpse into the future. It’s a sign of things to come – of a world where users do not visit your website, but instead simply delegate AI agents to find information and complete tasks on their behalf.
Of everything we’ve found, we’re most struck by how difficult financial services websites continue to be to use.
If your website does not work for real people then it will not work for AI agents that are trained to navigate websites like people. This is within your control. Test your content. Don’t test it with yourself, or your boss. This content is not for you – it’s for your users, the people our industry was set up to serve. It’s their lives. And it’s their money. And so your content has to work for them, or it will not work at all.
We were also surprised by how few search results Operator considered. We’ve spoken a lot recently about the importance of appearing in the AI search results generated by Google and ChatGPT. The emergence of Operator now makes the case for working hard to appear at the top of Bing’s search results too.
We don’t yet know what makes for a high-performing website in a world of AI agents, but it’s clear that websites will need to meet a minimum standard. If they don’t, then AI agents will not be able to find you and interact with you, and that will significantly affect your ability to attract, win, and retain customers.
For so many years we’ve all focused on keeping bots off our sites with tricks like CAPTCHA codes and ‘tick if you’re a human’ boxes. But in the future, businesses will need to openly welcome agents to their sites and optimise content for their needs. Doing so will let agents better meet the needs of their users by providing accurate and timely information, and by successfully completing actions on their behalf.
The needs of AI agents. Isn’t that a strange thought?