I Tried Recreating OpenClaw – And The Hype Is Real

I was skeptical when I first ran OpenClaw; it looked like just another AI tool riding the hype. Turns out, it’s not.
After experimenting with it and extending its messaging, I found that much of its core power (its AI agent architecture and human-in-the-loop interactions) can be recreated with off-the-shelf tools like the Agents SDK and Messages API.
In this post, I’ll share what I learned from using OpenClaw, explain why messaging is what makes autonomous agents truly work, and show how developers can leverage existing tools to build something similar without starting from scratch.
The agent that broke the internet
The project took off on GitHub, earning 200k stars in just 84 days, along with thousands of forks. By mid-February, SecurityScorecard was tracking over 240k instances running in the wild.
With LLM token costs of $5-50 per instance, the project is already accounting for millions in inference spending, and it’s even causing Mac mini shortages as people rush to self-host OpenClaw. (You can actually run it on much cheaper hardware, which makes the story even crazier.)
The hype around the project is undeniable, even with a steep barrier to entry (users must install and run the server software themselves) and despite ongoing security concerns and reported vulnerabilities.
Why I think the hype is justified
OpenClaw’s aha moment is hard to ignore. It shows there’s real demand for autonomous AI agents, ones that free users from being stuck in a chat window on sites like chatgpt.com.
I’ve always felt that calling those website chatbots “agents” was a stretch – they’re more like conversation buddies than AI doing real work for you.
True agents, in my view, should run in the background, acting and reacting on their own without forcing users to stay glued to a single site. That’s exactly the experience OpenClaw delivers.
The “hold my beer” moment
As a developer, I was curious. Running OpenClaw was impressive, but I wanted to know: how does it actually work? And even more, what would it take to recreate its wow factor myself? Let’s break it down.
The first key ingredient is an AI agent, and I mean this in a very specific sense.
As Anthropic puts it, agents are systems where the LLM controls the program’s flow, instead of classic code deciding when to call the LLM. At a high level, agent apps are basically a while loop that calls the LLM and hooks in all the tools the AI might need. With the rise of MCP, connecting these tools has become easier and more standardized.
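As a sketch, that while loop might look like the following. Note that `call_llm` and `run_tool` are hypothetical stand-ins, not any real SDK’s API; the point is only the control flow, where the loop keeps going until the LLM decides it has a final answer:

```python
# A minimal agent loop: the LLM, not classic code, decides what happens next.
# call_llm() and run_tool() are hypothetical stand-ins, not a real SDK's API.

def call_llm(messages):
    # Placeholder: a real app would call a model API here and get back
    # either a final answer or a requested tool call.
    last = messages[-1]["content"]
    if "weather" in last and not any(m["role"] == "tool" for m in messages):
        return {"tool": "get_weather", "args": {"city": "Berlin"}}
    return {"answer": "It is sunny in Berlin."}

def run_tool(name, args):
    # Placeholder tool registry; MCP would standardize this part.
    tools = {"get_weather": lambda city: f"Sunny in {city}"}
    return tools[name](**args)

def agent(task):
    messages = [{"role": "user", "content": task}]
    while True:  # the "while loop" at the heart of every agent app
        result = call_llm(messages)
        if "answer" in result:  # the LLM decided it is done
            return result["answer"]
        # Otherwise, run the tool the LLM asked for and feed the result back.
        output = run_tool(result["tool"], result["args"])
        messages.append({"role": "tool", "content": output})

print(agent("What's the weather in Berlin?"))
```

Everything an SDK adds on top of this (retries, streaming, parallel tool calls, guardrails) is refinement of this same loop.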
On the surface this seemed simple, but I quickly got bogged down in edge cases and implementation details. Luckily, we don’t need to reinvent the wheel here: there are ready-to-use SDKs wrapping all the agent logic, the recently renamed Agents SDK being a prime example. That got the AI agent part covered. But there was still one secret ingredient missing.

Users still need to approve important actions
Let’s go back to the OpenClaw user experience. Even when freed from a chat website, agents still need a way to stay in touch with their users.
The human-in-the-loop approach remains essential for responsible AI: no one should discover their agent’s spending spree on a month-end bank statement. Critical actions still need user approval, and important results still need to be communicated.
That’s why messaging channels are the very first feature highlighted in OpenClaw’s documentation.
Messaging is what makes autonomous AI agents actually work for you. It lets them check in, keep you in the loop, and get your approval for important actions, without forcing you to refresh a page or babysit a chat window. It’s what gives you peace of mind, convenience, and, most importantly, control.
Cheat codes for messaging
Back to coding.
Connecting to mobile operators or chat services might sound intimidating at first, but I had a secret weapon: I work at Infobip. Luckily, you don’t need that advantage, anyone can pick up the unified Messages API and start sending and receiving messages on users’ phones.
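The appeal of a unified API is that one payload shape covers every channel. The sketch below is illustrative only: the URL and field names are placeholders I made up, not the actual Messages API contract.

```python
import json

# Illustrative only: the endpoint URL and payload fields below are
# placeholders, not the real Messages API contract.
API_URL = "https://api.example.com/messages"

def build_outbound_message(to, text, channel="WHATSAPP"):
    """Build a channel-agnostic outbound message payload.

    Swapping `channel` (e.g. to "SMS") is the whole point of a
    unified messaging API: the rest of the payload stays the same.
    """
    return {
        "channel": channel,
        "to": to,
        "content": {"type": "TEXT", "text": text},
    }

payload = build_outbound_message("+15551234567", "Your agent finished the task.")
print(json.dumps(payload, indent=2))
```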
With connectivity sorted, all I had to do was figure out how to hook the agent up to it.
There are a few flows to wire up:
- First, pass new messages from users to the agent as prompts; this is basically launching new tasks.
- Second, the agent needs a way to send out reports. MCP servers work best here, as they are easy to integrate and easy for LLMs to trigger.
- Finally, send the agent’s output to the phone and collect the user’s feedback or confirmation. This is the all-important human-in-the-loop part! Historically, interpreting free-form input from users was hard, but these days we can simply pass it to an LLM and ask it to summarize the intent: does the user approve of the suggested action or not? Easy.
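That last confirmation flow can be sketched like this. In a real system, `classify_intent` would be a single LLM call with a prompt along the lines of “Does this reply approve the suggested action? Answer approve, deny, or unclear”; the keyword heuristic below is just a stand-in so the example runs on its own:

```python
# Sketch of the human-in-the-loop approval flow: turn a free-form
# user reply into an approve/deny decision.
# classify_intent() is a keyword stand-in for a real LLM call.

DENY_HINTS = ("no", "stop", "cancel", "deny")
APPROVE_HINTS = ("yes", "ok", "go ahead", "approved", "sure", "do it")

def classify_intent(reply):
    """Stand-in for an LLM prompt like:
    'Does this reply approve the suggested action? Answer approve/deny/unclear.'"""
    text = reply.lower()
    if any(hint in text for hint in DENY_HINTS):
        return "deny"
    if any(hint in text for hint in APPROVE_HINTS):
        return "approve"
    return "unclear"

def handle_confirmation(reply, action):
    intent = classify_intent(reply)
    if intent == "approve":
        return f"Executing: {action}"
    if intent == "deny":
        return f"Cancelled: {action}"
    return "Could not tell; asking the user to rephrase."

print(handle_confirmation("sure, go ahead", "book the flight"))
```

The LLM-backed version handles replies a keyword list never could (“fine, but use the cheaper option”), which is exactly why delegating intent interpretation to the model is the easy path here.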
And with that, my experiment was over.
Do a few off-the-shelf components (like the Agents SDK and Messages API) replicate the full OpenClaw experience? Not entirely. But they can help you kickstart a new project, up to the point where you can focus on your core features. And that’s the part that really matters.
It’s time to pay attention to autonomous agents
If you’re already working in AI (or thinking about it), autonomous agents are where things are moving. OpenClaw shows the demand is real, and the tools to build agents that can reason, act, and communicate are already here. Messaging isn’t just nice to have; it’s how your agent stays useful without you having to babysit it. With unified messaging APIs and MCP, sending updates and notifications is easy, so you can focus on shaping how your agent thinks and acts.


