Featured Article

What exactly is an AI agent?

The answer depends on who you ask

Illustration of a robotic agent helping workers do their jobs.
Image Credits: girafchik123 / Getty Images

AI agents are supposed to be the next big thing in AI, but there isn’t an exact definition of what they are. So far, people can’t agree on what exactly constitutes one.

At its simplest, an AI agent is best described as AI-fueled software that does a series of jobs for you that a human customer service agent, HR person or IT help desk employee might have done in the past, although it could ultimately involve any task. You ask it to do things, and it does them for you, sometimes crossing multiple systems and going well beyond simply answering questions.

Seems simple enough, right? Yet it is complicated by a lack of clarity. Even among the tech giants, there isn’t a consensus. Google sees them as task-based assistants that vary with the job: helping developers with code, helping marketers create a color scheme, or assisting an IT pro in tracking down an issue by querying log data.

For Asana, an agent may act like an extra employee, taking care of assigned tasks like any good co-worker. Sierra, a startup founded by former Salesforce co-CEO Bret Taylor and Google vet Clay Bavor, sees agents as customer experience tools that go well beyond the chatbots of yesteryear, helping people take actions and solve more complex sets of problems.

This lack of a cohesive definition leaves room for confusion over exactly what these things are going to do, but however they’re defined, agents are meant to help complete tasks in an automated way with as little human interaction as possible.

Rudina Seseri, founder and managing partner at Glasswing Ventures, says it’s early days and that could account for the lack of agreement. “There is no single definition of what an ‘AI agent’ is. However, the most frequent view is that an agent is an intelligent software system designed to perceive its environment, reason about it, make decisions, and take actions to achieve specific objectives autonomously,” Seseri told TechCrunch.

She says they use a number of AI technologies to make that happen. “These systems incorporate various AI/ML techniques such as natural language processing, machine learning, and computer vision to operate in dynamic domains, autonomously or alongside other agents and human users.”
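Seseri’s description (perceive, reason, decide, act) can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions: the `ThermostatAgent` class, its method names and the thermostat scenario are hypothetical and do not come from Seseri or any real product, and a real agent would put a model where the simple rules are.

```python
from dataclasses import dataclass, field


@dataclass
class ThermostatAgent:
    """Hypothetical toy agent that keeps a room near a target temperature."""
    target: float
    log: list = field(default_factory=list)

    def perceive(self, environment: dict) -> float:
        # Read the part of the environment the agent cares about.
        return environment["temperature"]

    def decide(self, temperature: float) -> str:
        # Reason about the observation and pick an action.
        # Simple rules stand in for whatever model a real agent would use.
        if temperature < self.target - 1:
            return "heat"
        if temperature > self.target + 1:
            return "cool"
        return "idle"

    def act(self, action: str, environment: dict) -> None:
        # Apply the chosen action back to the environment and record it.
        delta = {"heat": 1.0, "cool": -1.0, "idle": 0.0}[action]
        environment["temperature"] += delta
        self.log.append(action)


def run(agent: ThermostatAgent, environment: dict, steps: int) -> dict:
    # The perceive -> decide -> act cycle, repeated without human input.
    for _ in range(steps):
        observation = agent.perceive(environment)
        action = agent.decide(observation)
        agent.act(action, environment)
    return environment
```

The point of the sketch is the shape of the loop rather than the rules inside it: the goal (`target`) is set once by a person, and the agent then cycles through observing, choosing and acting on its own.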

Aaron Levie, co-founder and CEO at Box, says that over time, as AI becomes more capable, AI agents will be able to do much more on behalf of humans, and there are already dynamics at play that will drive that evolution.

“With AI agents, there are multiple components to a self-reinforcing flywheel that will serve to dramatically improve what AI Agents can accomplish in the near and long-term: GPU price/performance, model efficiency, model quality and intelligence, AI frameworks and infrastructure improvements,” Levie wrote on LinkedIn recently.

That’s an optimistic take on the technology that assumes growth will happen in all these areas, when that’s not necessarily a given. MIT robotics pioneer Rodney Brooks pointed out in a recent TechCrunch interview that AI has to deal with much tougher problems than most technology, and it won’t necessarily grow in the same rapid way as, say, chips under Moore’s law have.

“When a human sees an AI system perform a task, they immediately generalize it to things that are similar and make an estimate of the competence of the AI system; not just the performance on that, but the competence around that,” Brooks said during that interview. “And they’re usually very over-optimistic, and that’s because they use a model of a person’s performance on a task.”

The problem is that crossing systems is hard, and this is complicated by the fact that some legacy systems lack basic API access. While we are seeing the steady improvements that Levie alluded to, getting software to access multiple systems while solving problems it may encounter along the way could prove more challenging than many think.

If that’s the case, everyone could be overestimating what AI agents should be able to do. David Cushman, a research leader at HFS Research, sees the current crop of bots more like Asana does: assistants that help humans complete certain tasks in the interest of achieving some sort of user-defined strategic goal. The challenge is helping a machine handle contingencies in a truly automated way, and we are clearly not anywhere close to that yet.

“I think it’s the next step,” he said. “It’s where AI is operating independently and effectively at scale. So this is where humans set the guidelines, the guardrails, and apply multiple technologies to take the human out of the loop, when everything has been about keeping the human in the loop with GenAI.” The key, in his view, is to let the AI agent take over and apply true automation.

Jon Turow, a partner at Madrona Ventures, says this is going to require the creation of an AI agent infrastructure, a tech stack designed specifically for creating the agents (however you define them). In a recent blog post, Turow outlined examples of AI agents currently working in the wild and how they are being built today.

In Turow’s view, the growing proliferation of AI agents — and he admits, too, that the definition is still a bit elusive — requires a tech stack like any other technology. “All of this means that our industry has work to do to build infrastructure that supports AI agents and the applications that rely upon them,” he wrote in the piece.

“Over time, reasoning will gradually improve, frontier models will come to steer more of the workflows, and developers will want to focus on product and data — the things that differentiate them. They want the underlying platform to ‘just work’ with scale, performance, and reliability.”

One other thing to keep in mind here is that it’s probably going to take multiple models, rather than a single LLM, to make agents work, and this makes sense if you think about these agents as a collection of different tasks. “I don’t think right now any single large language model, at least publicly available, monolithic large language model, is able to handle agentic tasks. I don’t think that they can yet do the multi-step reasoning that would really make me excited about an agentic future. I think we’re getting closer, but it’s just not there yet,” said Fred Havemeyer, head of U.S. AI and software research at Macquarie US Equity Research.

“I do think the most effective agents will likely be multiple collections of multiple different models with a routing layer that sends requests or prompts to the most effective agent and model. And I think it would be kind of like an interesting [automated] supervisor, delegating kind of role.”
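The routing layer Havemeyer describes might look something like the sketch below. Everything in it is an illustrative assumption: the `Router` class, the stub models and the keyword-overlap scoring are hypothetical stand-ins, and a production router would more likely use a classifier or another model to choose the destination.

```python
import re
from typing import Callable, Dict, Set

Model = Callable[[str], str]


def make_stub(name: str) -> Model:
    # Hypothetical stand-in for a real model call; it just tags the prompt.
    def stub(prompt: str) -> str:
        return f"[{name}] {prompt}"
    return stub


class Router:
    """Toy supervisor that delegates each prompt to one specialized model."""

    def __init__(self, models: Dict[str, Model],
                 keywords: Dict[str, Set[str]], default: str):
        self.models = models
        self.keywords = keywords
        self.default = default

    def route(self, prompt: str) -> str:
        # Score each specialist by keyword overlap with the prompt.
        words = set(re.findall(r"[a-z]+", prompt.lower()))
        best, best_score = self.default, 0
        for name, vocab in self.keywords.items():
            score = len(words & vocab)
            if score > best_score:
                best, best_score = name, score
        # Falls back to the default model when nothing matches.
        return best

    def handle(self, prompt: str) -> str:
        return self.models[self.route(prompt)](prompt)


router = Router(
    models={n: make_stub(n) for n in ("coding", "it_ops", "general")},
    keywords={
        "coding": {"bug", "code", "function"},
        "it_ops": {"log", "server", "outage"},
    },
    default="general",
)
```

Calling `router.handle("Please fix this bug in my code")` sends the prompt to the `coding` stub, while a prompt with no matching keywords goes to `general`. Keeping the routing decision separate from the models is what allows individual specialists to be swapped out without touching the rest.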

Ultimately for Havemeyer, the industry is working toward this goal of agents operating independently. “As I’m thinking about the future of agents, I want to see and I’m hoping to see agents that are truly autonomous and able to take abstract goals and then reason out all the individual steps in between completely independently,” he told TechCrunch.

But the fact is that we are still in a period of transition where these agents are concerned, and we don’t know when we’ll get to this end state that Havemeyer described. While what we’ve seen so far is clearly a promising step in the right direction, we still need some advances and breakthroughs for AI agents to operate as they are being envisioned today. And it’s important to understand that we aren’t there yet.
