OpenAI launches new tools to help businesses build AI agents

The tools are part of OpenAI’s new Responses API, which lets businesses develop custom AI agents that can perform web searches, scan through company files, and navigate websites, much like OpenAI’s Operator product. The Responses API effectively replaces OpenAI’s Assistants API, which the company plans to sunset in the first half of 2026.

The hype around AI agents has grown dramatically in recent years despite the fact that the tech industry has struggled to show people, or even define, what “AI agents” really are. In the most recent example of agent hype running ahead of utility, Chinese startup Butterfly Effect earlier this week went viral for a new AI agent platform called Manus that users quickly discovered didn’t deliver on many of the company’s promises.

In other words, the stakes are high for OpenAI to get agents right.

“It’s pretty easy to demo your agent,” Olivier Godement, OpenAI’s API product head, told TechCrunch in an interview. “To scale an agent is pretty hard, and to get people to use it often is very hard.”

Earlier this year, OpenAI introduced two AI agents in ChatGPT: Operator, which navigates websites on your behalf, and deep research, which compiles research reports for you. Both tools offered a glimpse at what agentic technology can achieve, but left quite a bit to be desired in the “autonomy” department.

Now with the Responses API, OpenAI wants to sell access to the components that power AI agents, allowing developers to build their own Operator- and deep research-style agentic applications. OpenAI hopes that developers can create some applications with its agent technology that feel more autonomous than what’s available today.

Using the Responses API, developers can tap the same AI models (in preview) under the hood of OpenAI’s ChatGPT Search web search tool: GPT-4o search and GPT-4o mini search. The models can browse the web for answers to questions, citing sources as they generate replies.
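For developers, a request that turns on web search might look roughly like the sketch below. It only builds the request body and makes no network call. The `web_search_preview` tool type and the `gpt-4o` model name reflect OpenAI's documentation at launch as understood here and should be checked against the current API reference; the helper name `build_web_search_request` is hypothetical.

```python
import json


def build_web_search_request(question: str, model: str = "gpt-4o") -> dict:
    """Build a Responses API-style request body with web search enabled.

    Assumption: the tool type string "web_search_preview" and the field
    names below mirror OpenAI's documentation at launch; verify before use.
    """
    if not question.strip():
        raise ValueError("question must not be empty")
    return {
        "model": model,
        "input": question,
        "tools": [{"type": "web_search_preview"}],
    }


if __name__ == "__main__":
    # Print the JSON body. Sending it requires an API key and an HTTP client
    # or the official SDK, which this sketch deliberately leaves out.
    body = build_web_search_request("What did OpenAI announce this week?")
    print(json.dumps(body, indent=2))
```

Keeping the payload separate from the network call makes it easy to inspect or log what would be sent before wiring in credentials.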

OpenAI claims that GPT-4o search and GPT-4o mini search are highly factually accurate. On the company’s SimpleQA benchmark, which measures the ability of models to answer short, fact-seeking questions, GPT-4o search scores 90% while GPT-4o mini search scores 88% (higher is better). For comparison, GPT-4.5 – OpenAI’s much larger, recently released model – scores just 63%.
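Flipping those scores around shows how much room for error remains. The short calculation below uses only the three figures quoted above and treats every answer not scored as correct as an error, which is a simplification.

```python
# SimpleQA accuracy figures as reported in the article (percent correct).
scores = {"GPT-4o search": 90, "GPT-4o mini search": 88, "GPT-4.5": 63}


def error_rate(accuracy_pct: int) -> int:
    """Percentage of questions not answered correctly."""
    return 100 - accuracy_pct


errors = {name: error_rate(acc) for name, acc in scores.items()}

for name, err in errors.items():
    print(f"{name}: {err}% not correct")

# How many times more often GPT-4.5 misses than GPT-4o search.
ratio = errors["GPT-4.5"] / errors["GPT-4o search"]
print(f"GPT-4.5 misses {ratio:.1f}x as often as GPT-4o search")
```

By this measure, GPT-4o search misses 10% of questions, GPT-4o mini search 12%, and GPT-4.5 37%, or 3.7 times as many as GPT-4o search.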

The fact that AI-powered search tools are more accurate than traditional AI models is not necessarily surprising – in theory, GPT-4o search can just look up the right answer. However, web search does not render hallucinations a solved problem. Factual accuracy aside, AI search tools also tend to struggle with short, navigational queries (such as “Lakers score today”), and recent reports suggest that ChatGPT’s citations aren’t always reliable.

The Responses API also includes a file search utility that can quickly scan across files in a company’s databases to retrieve information. (OpenAI claims that it won’t train models on these files.) In addition, developers using the Responses API can tap OpenAI’s Computer-Using Agent (CUA) model, which powers Operator. The model generates mouse and keyboard actions, allowing developers to automate computer use tasks like data entry and app workflows.

Enterprises can optionally run the CUA model, which is releasing in research preview, locally on their own systems, OpenAI said. The consumer version of the CUA available in Operator can only take actions on the web.
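To make the idea of a model that "generates mouse and keyboard actions" concrete, here is a minimal sketch of the loop a developer might write around such a model. Everything in it is hypothetical: the action schema, the `FakeDesktop` class, and the `apply_action` helper are illustrative stand-ins, not OpenAI's actual CUA interface, and the actions are hard-coded rather than produced by a model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDesktop:
    """Stand-in for a real screen; it only records what was done to it."""
    log: list = field(default_factory=list)

    def click(self, x: int, y: int) -> None:
        self.log.append(f"click({x}, {y})")

    def type_text(self, text: str) -> None:
        self.log.append(f"type({text!r})")


def apply_action(desktop: FakeDesktop, action: dict) -> None:
    """Dispatch one model-suggested action (hypothetical schema)."""
    kind = action.get("type")
    if kind == "click":
        desktop.click(action["x"], action["y"])
    elif kind == "type":
        desktop.type_text(action["text"])
    else:
        # Refuse anything unrecognized rather than guessing.
        raise ValueError(f"unsupported action: {kind!r}")


# Hard-coded actions standing in for model output in a data-entry task.
suggested = [
    {"type": "click", "x": 120, "y": 340},
    {"type": "type", "text": "Invoice 1042"},
]

desktop = FakeDesktop()
for action in suggested:
    apply_action(desktop, action)

print(desktop.log)
```

Rejecting unknown action types outright, instead of ignoring them, is one simple safeguard for a model that may make inadvertent mistakes.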

To be clear, the Responses API won’t solve all the technical problems plaguing AI agents today.

Web search, for one, does not render AI hallucinations a solved problem: GPT-4o search still gets 10% of factual questions wrong.

In a blog post provided to TechCrunch, OpenAI said that the CUA model is “not yet highly reliable for automating tasks on operating systems,” and that it’s susceptible to making “inadvertent” mistakes.

However, OpenAI said these are early iterations of its agent tools, and that it is constantly working to improve them.

Alongside the Responses API, OpenAI is releasing an open-source toolkit called the Agents SDK, which offers developers free tools to integrate models with their internal systems, put in place safeguards, and monitor AI agent activities for debugging and optimization purposes. The Agents SDK is a follow-up of sorts to OpenAI’s Swarm, a framework for multi-agent orchestration that the company released late last year.

Godement said he hopes OpenAI can bridge the gap between AI agent demos and products this year, and that, in his opinion, “agents are the most impactful application of AI that will happen.” That echoes a proclamation OpenAI CEO Sam Altman made in January: that 2025 is the year AI agents enter the workforce.

Whether or not 2025 truly becomes the “year of the AI agent,” OpenAI’s latest releases show the company wants to shift from flashy agent demos to impactful tools.

© 2025 Yahoo.
