3 Mega Trends Shaping How Developers Build AI Agents – Especially VoiceAI


So, AI agents don’t just chat anymore – they’re leveling up. They team up, take action, and even chat for your favorite brand over WhatsApp and text like teenagers.
But don’t worry, they still need developers to keep them running smoothly, especially when they fall back on human or legacy comms.
Platforms like AWS and Infobip, which announced their strategic partnership at this year’s Shift Conference in Zadar, orchestrate these digital personalities – keeping them compliant, connected, and (mostly) well-behaved in the wild world of business messaging.
Let’s see how.
We’re rewriting the future of customer interaction – right now
“We’re in an interesting moment in tech – three mega-trends are colliding to shake up marketing, sales, support, and more,” says Ivan Ostojic, Chief Business Officer at Infobip.
First, people love chatting – not just with friends, but with businesses too. ChatGPT, Anthropic, and friends have trained them to expect smart, fast answers. Three years ago, only some people preferred chatting with businesses – today, 7 out of 10 do.
Second, app fatigue is real. Even though apps are exhausting, people actually open and click on messaging – 90%+ open rates! Businesses are turning chat into super apps, packing in ads, rich media, payments, calendars, and almost everything an app can do, with way less friction.
Third, agentic AI is everywhere – chatbots, human helpers, personalized marketing, creative tools, and even fraud detection.
Put it together: on the demand side, people want seamless, instant interactions. On the supply side, messaging + AI make massive automation possible. And that equals huge opportunities for businesses – and for developers like you who’ll be building it.

Voice APIs are transforming AI experiences
So, messaging + AI are set to transform marketing, commerce, and support. Developers will build smart agents everywhere – banking, retail, travel… even sports and entertainment, where fans jump in first.
Speaking of sports, Ostojic shared an example of a project built by Infobip, AWS, and partners for Formula 1: a digital twin of driver Ollie Bearman. Fans can message him on WhatsApp and get interactive replies, including voice responses cloned from his real voice.
Real-time APIs even deliver live updates, like current race results. Fans can text or voice-message and get realistic answers, making it highly interactive and engaging.
And keep in mind this tech isn’t just for F1 – imagine Santa sending kids personalized voice messages at Christmas, or customer support replying instantly in the voice of a brand ambassador. The sky’s the limit.
How do all these agents play nice together?
All of these trends – chat-happy consumers, messaging turning into super apps, and AI taking real actions – set the stage for a bigger question: how do all these agents work together?
That question was at the heart of a panel discussion at the Shift Conference with Ervin Jagetic (Product Director, Infobip) and Andrei Shakirin (Senior Solutions Architect, AWS). “A couple of years ago, generative AI was the buzz. Now it’s agents – the hype is similar, but we started with conversational AI,” says Jagetic.
AI used to just generate content. Now it takes real actions – making purchases, sending hyper-personalized messages, and more. With many more agents arriving over the next 4–5 years, integration is key. That’s why Infobip is teaming up with AWS to build an agent marketplace where AI agents can communicate and connect.
MCP or agent-to-agent protocol?
Sometimes agents need to talk to each other to fetch data or execute actions. There are two main ways to do this – MCP (Model Context Protocol) or an agent-to-agent protocol, Shakirin explains:
MCP works like this: the MCP server exposes tools – functions with a description and arguments. The MCP client, built into the agent, reports these tools to the LLM. The model decides whether it needs a tool and sends a structured response back; the agent calls the tool on the server with the arguments and returns the result.
Essentially, the model tells the agent: “I need to call something to get data or execute an action.”
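That loop can be sketched in framework-free Python. The tool name, the mock data, and the stand-in “model” below are invented for illustration – a real MCP setup speaks JSON-RPC between client and server – but the shape of the flow (tool listing → model decision → tool call → result) is the same:

```python
# Tool registry standing in for an MCP server: each tool has a
# description and named arguments, just like an MCP tool listing.
TOOLS = {
    "get_race_results": {
        "description": "Return the latest race results for a series.",
        "args": {"series": "str"},
        "fn": lambda series: f"{series}: P1 Bearman (mock data)",
    }
}

def tool_listing():
    """What the MCP client reports to the LLM: names, descriptions, args."""
    return {name: {"description": t["description"], "args": t["args"]}
            for name, t in TOOLS.items()}

def mock_model(user_message, tools):
    """Stand-in for the LLM: decides whether a tool is needed and,
    if so, returns a structured tool-call response."""
    if "race" in user_message.lower() and "get_race_results" in tools:
        return {"tool": "get_race_results", "arguments": {"series": "F1"}}
    return {"tool": None, "answer": "No tool needed."}

def agent_turn(user_message):
    """The agent loop: model decides, agent calls the tool, result returned."""
    decision = mock_model(user_message, tool_listing())
    if decision["tool"]:
        tool = TOOLS[decision["tool"]]
        return tool["fn"](**decision["arguments"])
    return decision["answer"]

print(agent_turn("What are the current race results?"))
```

The important division of labor: the model never executes anything itself – it only emits a structured request, and the agent does the actual call.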

Agent-to-agent communication can also run through MCP, where the server might be an API, database, or another agent – you can chain calls between them. MCP was introduced by Anthropic in late 2024; Google’s newer agent-to-agent (A2A) protocol adds “agent cards” for discovering capabilities, async tasks with state tracking, and negotiation of data formats like images or video.
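An “agent card” is easiest to picture as a small, discoverable capability descriptor that one agent publishes and another fetches. A minimal sketch – the field names here are assumptions loosely modeled on the idea, not the exact A2A schema:

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class AgentCard:
    """Capability descriptor another agent can fetch to discover
    what this agent does and which data formats it handles."""
    name: str
    description: str
    skills: list = field(default_factory=list)       # tasks it can perform
    input_modes: list = field(default_factory=list)  # negotiable formats in
    output_modes: list = field(default_factory=list) # negotiable formats out
    supports_async_tasks: bool = False               # state-tracked tasks

card = AgentCard(
    name="race-updates-agent",
    description="Delivers live race results and driver voice replies.",
    skills=["get_race_results", "voice_reply"],
    input_modes=["text"],
    output_modes=["text", "audio"],
    supports_async_tasks=True,
)

# Published as JSON so peer agents can discover capabilities
# before deciding whether (and how) to talk to this agent.
print(json.dumps(asdict(card), indent=2))
```

Discovery plus format negotiation is exactly what plain MCP tool listings don’t give you between peers – which is where the next question comes in.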
When to use which?
For simple, synchronous, text-based chat where MCP is already set up, stick with MCP. For async communication, multiple calls, richer data, or negotiation, use the agent-to-agent protocol. Both have their place. “At Infobip, we love MCP (especially in Java), but we support both,” says Jagetic.
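That rule of thumb fits in a few lines. The function below is purely illustrative – the criteria mirror the panel’s guidance, nothing more:

```python
def pick_protocol(is_async=False, multiple_calls=False,
                  rich_media=False, needs_negotiation=False):
    """Return 'A2A' (agent-to-agent) when any of its strengths apply;
    otherwise stick with plain MCP for simple synchronous text chat."""
    if is_async or multiple_calls or rich_media or needs_negotiation:
        return "A2A"
    return "MCP"

print(pick_protocol())                 # simple sync text chat
print(pick_protocol(rich_media=True))  # e.g. images or video in replies
```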
Don’t forget DEVELOPERS
Remember, developers are front and center – working with clients, partners, and more. So, is there a career path here?
Andrei Shakirin says yes:
By 2026, we’ll see tons of verticalized agents – finance, banking, healthcare – built for specific use cases. Building them means mastering workflows, tasks, and integrations. It’s a whole new frontier and a huge business opportunity.
And for developers who want to get into this space, Shakirin says you need much more than just LLM communication. You need to:
- Store communication history.
- Use retrieval-augmented generation for additional data.
- Integrate MCP so models can call tools or execute actions.
- Ensure security, traceability, and monitoring.
These are common building blocks across apps, so frameworks like LangChain, LangGraph, and AutoGen save you from reinventing the wheel.
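Put together, those building blocks – history, retrieval, tool calls, traceability – look roughly like this in plain Python. In practice a framework like LangChain or LangGraph wires these up for you; every name and rule below is invented for illustration, and the real LLM call is stubbed out:

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")    # traceability / monitoring hook

KNOWLEDge_base = None  # placeholder name avoided; see KNOWLEDGE below
KNOWLEDGE = {                        # stand-in for a vector store (RAG)
    "opening hours": "Open 9-17 Mon-Fri.",
    "refunds": "Refunds within 30 days with receipt.",
}

TOOLS = {"escalate": lambda: "Ticket created for a human agent."}

class MiniAgent:
    def __init__(self):
        self.history = []            # block 1: conversation history

    def retrieve(self, message):
        """Block 2: naive retrieval-augmented lookup (keyword match
        here; a real system would embed and search a vector store)."""
        return [v for k, v in KNOWLEDGE.items() if k in message.lower()]

    def handle(self, message):
        self.history.append(("user", message))
        log.info("user message received")           # block 4: traceability
        context = self.retrieve(message)
        if "human" in message.lower():              # block 3: tool call
            reply = TOOLS["escalate"]()
        elif context:
            reply = context[0]
        else:
            reply = "Let me check with the model."  # LLM call would go here
        self.history.append(("agent", reply))
        return reply

agent = MiniAgent()
print(agent.handle("What are your opening hours?"))
```

Each piece here is a few lines; in production each becomes its own subsystem – which is precisely why the frameworks exist.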
Most are Python-based, but Java frameworks like Spring AI matter too – agents are moving from prototypes to enterprise systems, where Java and Kotlin dominate. For deployment, you can DIY on AWS or use the Amazon Bedrock agent framework, which handles runtime, memory, auth, scaling, and multi-framework support – self-manage, or let Bedrock do the heavy lifting.