Simple AI Agents with a Narrow Set of Skills Beat Their More Complex Buddies


AI is sexy. But those building AI in production-grade environments know the truth: the real challenge lies in making something that actually works at scale.
During a recent panel discussion about technical debt and AI transformation at the Infobip Shift Miami conference, where engineers and product leaders deep in the AI trenches gathered, one key piece of advice stood out: keep it simple.
Strip the AI agent down to its core and build a solid foundation for a multi-agent architecture.
Why? Today’s agents are buggy, unreliable, and still learning how to work together. The goal is to train them, refine them, and ultimately build a well-orchestrated agent-to-agent system that can deliver real value.
Gigantic Monolithic Agents Are a No-No!
Prabu Ramaraj, Director of Engineering for Generative AI and Automation at AutoZone, is very pragmatic:
“Recently, everybody wants to build agents, but you don’t need to use language models to do all the business for you. If you have to go with agents, I recommend a multi-agent architecture. You don’t want to build a monolithic application, right? The same goes for agents.
Don’t build one gigantic monolithic agent; build multiple agents, each with a specific goal that you can test and refine!”
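That advice can be sketched in a few lines. The agent classes and the keyword router below are purely illustrative stand-ins, not anyone's production design: each agent owns exactly one goal and can be tested on its own.

```python
# Illustrative sketch: two narrow agents with one goal each, plus a simple
# router, instead of one monolithic agent. All names are hypothetical.

class RefundAgent:
    """Handles exactly one goal: refund requests."""
    def handle(self, message: str) -> str:
        return "Refund request received; processing."

class OrderStatusAgent:
    """Handles exactly one goal: order-status lookups."""
    def handle(self, message: str) -> str:
        return "Your order is on its way."

def route(message: str) -> str:
    """Naive keyword router; a real system might use an intent classifier."""
    if "refund" in message.lower():
        return RefundAgent().handle(message)
    return OrderStatusAgent().handle(message)
```

Because each agent's scope is a single goal, its test suite can be equally small and specific, which is exactly what makes the multi-agent split easier to validate than one giant agent.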
“I completely agree!” instantly added Nick Kljajić, Co-Founder and CEO of AskHandle, an AI-driven support platform specializing in generative AI and natural language processing (NLP):
“Start small, see how it performs. Test it, iterate, and once the performance is optimal or reaches a level you’re happy with, build on top of it. And just measure, measure, and measure,” he insisted.
Making something complex just because it sounds sexy in the AI world doesn’t necessarily add value; it often just adds fragility.
Nick points out that developers should go back to basics:
“Before doing anything complex, make sure your AI system can accurately retrieve the right answers from your knowledge base or documents. And then build and connect to other agents because in multi-agent systems, more things can go wrong.”
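A minimal way to act on that advice is a retrieval accuracy check run before any agents are wired together. The tiny keyword retriever and the test cases below are hypothetical stand-ins for a real knowledge base and eval set:

```python
# Hypothetical retrieval sanity check: verify the system pulls the right
# document for known questions before building agents on top of it.

docs = {
    "billing": "How to update your payment method and view invoices.",
    "shipping": "Delivery times, tracking numbers, and carriers.",
}

def retrieve(question: str) -> str:
    """Return the doc id whose text shares the most words with the question."""
    q = set(question.lower().split())
    return max(docs, key=lambda d: len(q & set(docs[d].lower().split())))

cases = [
    ("Where is my tracking number?", "shipping"),
    ("How do I change my payment method?", "billing"),
]
accuracy = sum(retrieve(q) == expected for q, expected in cases) / len(cases)
```

A real system would use embeddings rather than word overlap, but the principle is the same: a small labeled question set gives a hard number for retrieval quality before multi-agent complexity is layered on.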
Human-in-the-Loop
To build efficient agents as parts of a multi-agent system (one that is more effective and scalable than a single super-agent), you must measure and iterate relentlessly. Luckily, agents with a narrow set of skills are easier to test.
Panelists agreed that before scaling up, teams should track “meaningful metrics”: AI performance (relevance, confidence, grounding) and business performance (resolution time, containment rate, conversion, and fallback frequency). The most basic metric, however, is whether customers got the answer they were looking for.
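Two of the business metrics mentioned above can be computed directly from conversation logs. The log format and field names below are assumptions for illustration, not any particular platform's schema:

```python
# Illustrative metric calculations over hypothetical conversation logs.
# The "escalated" and "fallback" fields are assumed, not a real schema.

def containment_rate(conversations: list[dict]) -> float:
    """Share of conversations resolved without escalating to a human."""
    contained = sum(1 for c in conversations if not c["escalated"])
    return contained / len(conversations)

def fallback_frequency(conversations: list[dict]) -> float:
    """Share of conversations where the agent fell back to a default reply."""
    fallbacks = sum(1 for c in conversations if c["fallback"])
    return fallbacks / len(conversations)

logs = [
    {"escalated": False, "fallback": False},
    {"escalated": True,  "fallback": True},
    {"escalated": False, "fallback": False},
    {"escalated": False, "fallback": True},
]
# containment_rate(logs) -> 0.75, fallback_frequency(logs) -> 0.5
```

Tracking these per agent, rather than for the system as a whole, is what makes the "measure, measure, and measure" advice actionable: a regression shows up in one small agent instead of somewhere inside a monolith.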

Ervin Jagatić, Product Director at Infobip, explained how, with a human in the loop, AI agents can become more accurate and reliable:
“Humans assess the responses from the assistant, for example, whether they are grounded in the documentation or hallucinating. Then we reiterate: humans go through these analytics, mark the good and the bad answers, and make the system better, iteratively gathering data and improving the performance of the AI assistant.”
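The review loop Ervin describes can be sketched as a simple labeling queue. The in-memory store, labels, and field names below are assumptions made for the sketch, not an actual Infobip implementation:

```python
# Minimal human-in-the-loop sketch: reviewers label assistant answers as
# grounded or hallucinated; "bad" answers feed the next iteration of fixes.
# All data structures here are hypothetical.

answers = [
    {"id": 1, "text": "Reset it via the settings page.", "label": None},
    {"id": 2, "text": "Our moon base ships your order.", "label": None},
]

def mark(answer_id: int, grounded: bool) -> None:
    """A human reviewer marks one answer as grounded (good) or not (bad)."""
    for a in answers:
        if a["id"] == answer_id:
            a["label"] = "good" if grounded else "bad"

def improvement_queue() -> list[dict]:
    """Bad answers to investigate: fix retrieval, prompts, or documents."""
    return [a for a in answers if a["label"] == "bad"]

mark(1, grounded=True)
mark(2, grounded=False)
```

Each pass through the queue produces labeled data, so every iteration both fixes concrete failures and grows an evaluation set for the next round.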
MCP to Reduce Chaos
In a discussion centered around AI agents and system design, standardization was emphasized as a key enabler of scalability and simplicity.
According to the experts, when each team or agent follows its own approach, complexity grows rapidly, especially for engineers. They concluded that adopting standards like MCP helps reduce chaos by defining consistent input/output schemas and improving interoperability across agents.
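What a consistent input/output schema buys can be shown with a tiny shared message envelope. This is a simplified illustration in the spirit of standardization, not the actual MCP specification; the field names are assumptions:

```python
# Illustrative shared message schema between agents. This is NOT the real
# MCP wire format, only a sketch of why one shared schema reduces chaos.

from dataclasses import dataclass, asdict
import json

@dataclass
class AgentMessage:
    sender: str    # which agent produced the message
    intent: str    # what the message asks for
    payload: dict  # structured arguments, same shape for every agent

def serialize(msg: AgentMessage) -> str:
    """Every agent emits the same JSON envelope."""
    return json.dumps(asdict(msg))

def deserialize(raw: str) -> AgentMessage:
    """Every agent can parse any other agent's output."""
    return AgentMessage(**json.loads(raw))
```

With one envelope, adding an agent means implementing one schema rather than N pairwise integrations, which is the interoperability argument the panelists were making.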
They declared that standards and shared protocols are especially critical at this early stage for building stable, scalable architectures.
Ervin shared that for companies like Infobip, where APIs are the backbone of communication infrastructure, the goal is to open that stack to AI agents as well, allowing them to use the same tools and communication layers that human users and systems do.