Forget the Model, It’s Workflows That Make LLM Products Run

Drawing on his experience leading AI product teams, Andrew Mende (Senior Product Manager, Machine Learning at Booking.com) explained what it actually takes to ship LLM-based products in production.
Making AI products reliable requires new workflows
For Mende, the buzz around AI marks a rare platform shift, comparable to the rise of smartphones. But what does that mean for product teams?
This moment unlocks new ways of solving customer problems that were previously impossible due to technical constraints.
He was clear: traditional product management approaches often fail with AI-driven products.
LLM-based systems behave differently, demand new workflows, and bring new types of risk.
Unlike deterministic software, LLMs are probabilistic: identical inputs can produce different outputs. That makes experimentation easy but production readiness hard, and it forces teams to rethink how they test, evaluate, and monitor features.
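To see why, consider a minimal sketch of the same prompt sent twice (assuming the OpenAI Python SDK; the model name is a placeholder, and any hosted LLM API behaves similarly):

```python
# The same prompt, sent twice: with temperature > 0 the outputs can differ.
# Assumes the OpenAI Python SDK; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def ask(prompt: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        temperature=0.7,  # sampling randomness: answers vary run to run
    )
    return response.choices[0].message.content

first = ask("Suggest a subject line for a booking confirmation email.")
second = ask("Suggest a subject line for a booking confirmation email.")
print(first == second)  # frequently False
```

An exact-match assertion against a golden answer would fail intermittently here, which is why LLM features need statistical evaluation rather than unit-test-style checks.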
One of the biggest traps, Mende explained, is confusing a successful prototype with a scalable solution:
It’s easy to paste a prompt into ChatGPT and see results; much harder to make it reliable across thousands of real customer inputs.
Teams need structured datasets, big tables of real customer examples, to track accuracy, spot regressions, and see if changes actually work. Without them, it’s all guesswork.
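Mende didn't show code on stage, but a minimal version of such a dataset-driven check might look like this (the file name, the stand-in run_llm, and the exact-match grader are all illustrative assumptions):

```python
# Minimal sketch of a dataset-driven evaluation: run the system under test over
# a table of real customer examples and compare accuracy to the production
# baseline. The file name, run_llm, and the grader are illustrative assumptions.
import json

def load_examples(path: str) -> list[dict]:
    """One JSON object per line: {"input": customer text, "expected": reference answer}."""
    with open(path) as f:
        return [json.loads(line) for line in f]

def run_llm(text: str) -> str:
    """Stand-in for the prompt + model under test; call the real system here."""
    return text  # placeholder

def grade(got: str, expected: str) -> bool:
    """Simplest possible grader: exact match. Real graders are usually fuzzier
    (normalization, rubric scoring, or an LLM judge)."""
    return got.strip().lower() == expected.strip().lower()

examples = load_examples("examples.jsonl")
accuracy = sum(grade(run_llm(ex["input"]), ex["expected"]) for ex in examples) / len(examples)

baseline = 0.90  # accuracy of the version currently in production
print(f"Accuracy: {accuracy:.2%} (baseline {baseline:.2%})")
if accuracy < baseline:
    raise SystemExit("Regression detected: do not ship this change.")
```

Run on every prompt or model change, a table like this turns "does it still work?" from guesswork into a number.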
Focus on accuracy, cost, and speed
Mende’s practical approach to model selection balances accuracy, cost, and latency: start with the most capable model to see whether the problem can be solved at all, then step down to smaller, faster models to cut cost and latency.
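One way to operationalize that ladder, sketched under assumptions (the candidate models, prices, and scores below are placeholders, and run_eval_suite stands in for the evaluation harness above):

```python
# Sketch of the selection ladder: prove feasibility with the most capable
# model, then walk down to cheaper, faster models and ship the smallest one
# whose accuracy stays within tolerance of the flagship. Every model name,
# price, and score below is a placeholder.

def run_eval_suite(model: str) -> tuple[float, float]:
    """Stand-in: run the evaluation dataset against this model and return
    (accuracy, p95 latency in ms). Wire this to the harness sketched above."""
    placeholder_scores = {
        "large-flagship": (0.93, 1800.0),
        "mid-tier": (0.91, 700.0),
        "small-fast": (0.84, 250.0),
    }
    return placeholder_scores[model]

candidates = [  # ordered from most to least capable
    {"model": "large-flagship", "cost_per_1k": 0.010},
    {"model": "mid-tier", "cost_per_1k": 0.002},
    {"model": "small-fast", "cost_per_1k": 0.0004},
]

results = []
for c in candidates:
    accuracy, p95_ms = run_eval_suite(c["model"])
    results.append({**c, "accuracy": accuracy, "p95_ms": p95_ms})

flagship = results[0]  # the most capable model answers "can this be solved at all?"
tolerance = 0.02       # acceptable accuracy drop relative to the flagship

pick = min(
    (r for r in results if flagship["accuracy"] - r["accuracy"] <= tolerance),
    key=lambda r: r["cost_per_1k"],
)
print(f"Ship {pick['model']}: {pick['accuracy']:.2%} accuracy, "
      f"{pick['p95_ms']:.0f} ms p95, ${pick['cost_per_1k']}/1k tokens")
```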
In practice that means testing multiple configurations (context size, prompts, and parameters), since even small changes affect results. Beyond the model itself, context selection, prompt instructions, and external tools are critical:
For example, when a customer asks about a specific order, the system should fetch real-time data instead of relying on static knowledge. This combination of LLMs and tools turns simple prompts into full systems, but also increases complexity and maintenance costs.
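As a rough illustration of that pattern (again assuming the OpenAI Python SDK; get_order_status and the order data are hypothetical stand-ins for a real order service):

```python
# Sketch of the prompt-plus-tools pattern: instead of answering from static
# knowledge, the model requests a real-time order lookup, gets the result back,
# and answers from live data. Assumes the OpenAI Python SDK; get_order_status
# is a hypothetical stand-in for a real backend call.
import json
from openai import OpenAI

client = OpenAI()

def get_order_status(order_id: str) -> dict:
    """Hypothetical backend lookup; in production this hits the order service."""
    return {"order_id": order_id, "status": "shipped", "eta": "2024-06-02"}

tools = [{
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Fetch real-time status for a customer order.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

messages = [{"role": "user", "content": "Where is my order #8613?"}]
first = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)

# The sketch assumes the model chose to call the tool rather than answer directly.
call = first.choices[0].message.tool_calls[0]
result = get_order_status(**json.loads(call.function.arguments))

messages += [first.choices[0].message,
             {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}]
final = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
print(final.choices[0].message.content)  # grounded in live data, not static knowledge
```

The extra round trip and the tool schema are exactly the moving parts Mende warned about: each one is another thing to version, test, and monitor.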
LLMs can transform how users interact – if teams build the right infrastructure
Mende closed his How to Web lecture on the point where LLMs shine: transforming how users interact. For the first time, digital products can understand plain language and turn customer requests directly into actions.
This shift brings digital experiences closer to human conversations and enables new product patterns that were out of reach just a few years ago.
The challenge now, Mende explained, is not whether LLMs work, but whether teams are willing to build the evaluation, monitoring, and infrastructure required to make them truly useful.


