How ‘Guess Who?’ Logic Shapes AI Decision Trees and Predictive ML

Emin Mulaimović

The logic behind a simple game of 'Guess Who?' is identical to how we code one of the most transparent AI algorithms. In Decision Trees, we don’t guess: we ask the question that gives the most information. Mastering that intuition teaches the core of predictive Machine Learning.

Guess Who? might seem like a nostalgic board game full of quirky characters and clever yes-or-no questions. Yet beneath its playful surface lies a robust logical structure that reflects the decision-making processes of modern AI.

In Guess Who?, players start with 24 possibilities. They ask a series of questions designed to eliminate as many options as possible with each answer, narrowing the field until only one character remains. This mirrors the core principle of decision trees in AI: strategically asking questions to split data into smaller, more organized groups, quickly reducing uncertainty.

Just as a player might ask whether the character wears glasses or has blonde hair, an AI model might ask whether a patient’s blood pressure is high or whether a customer opened a previous email.

Both processes aim to identify the most informative question at each step, resulting in an accurate decision with the fewest possible moves.

What makes a good question?

We want to gather as much information as possible when we ask a question. Ideally, each yes or no question cuts our choices in half.

Since there are 24 characters, if we assume that each question eliminates half the characters, we will need 5 questions to determine who the opponent chose. This is because:

2^4 = 16 < 24 < 2^5 = 32

Using the same logic, we can see that 26,000 diseases could be diagnosed with just 15 questions, since 2^14 = 16,384 < 26,000 < 2^15 = 32,768. Of course, this only works if perfectly balanced questions exist, which is rarely the case.
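This halving rule is just a base-2 logarithm rounded up. A quick sketch in plain Python (the function name is our own, nothing from the article):

```python
import math

def questions_needed(n_options: int) -> int:
    """Minimum number of yes/no questions to single out one of n options,
    assuming each question halves the remaining candidates."""
    return math.ceil(math.log2(n_options))

print(questions_needed(24))      # the 24 Guess Who? characters: 5 questions
print(questions_needed(26_000))  # the diseases example: 15 questions
```

Any n between 2^(k-1)+1 and 2^k needs exactly k perfectly balanced questions.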

First questions in ‘Guess Who?’

Here are some typical questions and how many characters they eliminate:

Most of these splits are unbalanced. But if we combine conditions, we can do better.

Combining questions for better splits

For example, asking:

“Does your character have black, brown, or orange hair?”

  • Yes: 13 characters
  • No: 11 characters

This is much closer to an ideal 50/50 split. We could even create a perfect 12/12 split with a bizarre question like “Does your character have black, brown, or orange hair and is not a female with a hat?”. But that’s essentially the same as the previous one, just moving Maria (the only female with a hat) to the other group.

Now, if the answer is yes, the remaining characters are:

Alex, Alfred, Anne, Bernard, Bill, Frans, Herman, Maria, Max, Philip, Richard, Robert, Tom
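This elimination step is easy to sketch in code. The hair-colour table below is illustrative (attributes vary between editions of the game); it is chosen so that exactly the 13 characters listed above answer yes:

```python
# Illustrative hair colours for the 24 classic characters; treat the exact
# assignments as an assumption, not official game data.
hair = {
    "Alex": "black", "Anne": "black", "Max": "black", "Philip": "black",
    "Tom": "black", "Bernard": "brown", "Maria": "brown", "Richard": "brown",
    "Robert": "brown", "Alfred": "orange", "Bill": "orange", "Frans": "orange",
    "Herman": "orange", "Charles": "blonde", "David": "blonde", "Eric": "blonde",
    "Joe": "blonde", "Anita": "white", "Claire": "white", "George": "white",
    "Paul": "white", "Peter": "white", "Sam": "white", "Susan": "white",
}

def split(characters, predicate):
    """Partition characters into the yes-group and no-group for one question."""
    yes = sorted(c for c in characters if predicate(c))
    no = sorted(c for c in characters if not predicate(c))
    return yes, no

yes, no = split(hair, lambda c: hair[c] in {"black", "brown", "orange"})
print(len(yes), len(no))  # prints "13 11"
```

Each question is just a predicate; the closer the two partitions are in size, the more the question tells us.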

Narrowing it down further

On this reduced set of 13, the next possible splits are:

Now we have questions that split the remaining characters as evenly as possible, such as “Does your character have facial hair?” or “Is your character blond?”. We will use “Does your character have facial hair?” and continue this process until a single character remains, arriving at the diagram below:

Decision Trees in the Real World

The technique we used to create the Guess Who? diagram can be generalized with a real-world example where we want to create an AI model to prescribe medicine.

We’ll use a simple dataset with 14 patients. Each has characteristics like age, gender, blood pressure, and cholesterol. A real dataset would be much larger, but this toy example illustrates the point.

For instance, asking “What is the age of the patient?” gives us three groups:

  • Young: 3 prescribed Drug A, 1 prescribed Drug B
  • Middle-aged: all prescribed Drug B
  • Senior: 3 prescribed Drug A, 2 prescribed Drug B

Other possible questions include:

  • What is the gender of the patient?
  • What is the blood pressure of the patient?
  • What is the cholesterol of the patient?

Entropy in action

To determine the best question, we measure entropy – the amount of disorder in the data.

  • High entropy = outcomes are evenly mixed (e.g., 5 patients with Drug A, 5 with Drug B)
  • Low entropy = outcomes are mostly one-sided (e.g., 9 patients with Drug A, 1 with Drug B)
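Shannon’s entropy formula makes these two bullet points precise. A minimal sketch:

```python
import math

def entropy(counts):
    """Shannon entropy in bits: 0 for a pure group, 1 for a 50/50 two-class mix."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

print(entropy([5, 5]))  # evenly mixed: 1.0 bit, maximum disorder
print(entropy([9, 1]))  # mostly one-sided: about 0.47 bits
```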

When we ask “What is the age of the patient?”:

  • Young group → medium entropy
  • Middle-aged group → low entropy (always Drug B)
  • Senior group → high entropy (mixed outcome)

By calculating information gain, we can rank each possible question and pick the one that reduces uncertainty the most. This is exactly what decision tree algorithms do.
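Information gain is the parent group’s entropy minus the weighted average entropy of the child groups. Using the Drug A / Drug B counts from the age split above (note: the middle-aged group size of 5 is inferred so the three groups add up to 14 patients):

```python
import math

def entropy(counts):
    """Shannon entropy (in bits) of a class-count distribution."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Entropy of the parent minus the weighted entropy of the child groups."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# (Drug A, Drug B) counts per age group from the article's example.
parent = [6, 8]                       # 6 Drug A, 8 Drug B overall
age_split = [[3, 1], [0, 5], [3, 2]]  # young, middle-aged, senior
print(round(information_gain(parent, age_split), 3))  # prints "0.407"
```

Computing this number for every candidate question and taking the maximum is the ranking step the algorithm performs at each node.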

Decision trees in AI and Machine Learning

This process is called building a decision tree. Decision trees are widely used in machine learning for classification and prediction problems.
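A single greedy step of that process can be sketched in a few lines. The toy patient rows below are illustrative (not the article’s exact table), and `best_feature` is our own helper implementing one ID3-style split choice:

```python
import math
from collections import Counter

def entropy(labels):
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def best_feature(rows, labels, features):
    """One greedy step: pick the feature whose split has the highest information gain."""
    def gain(feature):
        groups = {}
        for row, label in zip(rows, labels):
            groups.setdefault(row[feature], []).append(label)
        remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
        return entropy(labels) - remainder
    return max(features, key=gain)

# Toy patients: illustrative values only.
rows = [
    {"age": "young", "bp": "high"},  {"age": "young", "bp": "high"},
    {"age": "young", "bp": "low"},   {"age": "young", "bp": "low"},
    {"age": "middle", "bp": "high"}, {"age": "middle", "bp": "low"},
    {"age": "senior", "bp": "high"}, {"age": "senior", "bp": "low"},
]
labels = ["A", "A", "A", "B", "B", "B", "A", "B"]
print(best_feature(rows, labels, ["age", "bp"]))  # prints "age"
```

A full tree builder simply applies `best_feature` recursively to each resulting group until the groups are pure (or some stopping rule kicks in).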

In practice, companies like Infobip use ensembles of decision trees to predict outcomes – for example, the best time to send a message. These ensembles combine many decision trees for higher accuracy.

Some of the most popular ensemble methods include:

  • RandomForest
  • XGBoost
  • CatBoost
  • HistGradientBoosting (scikit-learn’s histogram-based gradient boosting)

They work extremely well and require relatively little tuning compared to deep learning models.
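The core idea behind all of these is the same: train many slightly different trees and let them vote on each prediction. A minimal majority-vote sketch, with trivial stand-in “trees” (in a real forest, each tree is trained on a bootstrap sample of the data, considering a random subset of features at every split):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine one prediction per tree into a single ensemble answer."""
    return Counter(predictions).most_common(1)[0][0]

# Stand-in "trees": each is just a hand-written rule mapping a patient to a drug.
trees = [
    lambda p: "Drug A" if p["bp"] == "high" else "Drug B",
    lambda p: "Drug A" if p["age"] == "senior" else "Drug B",
    lambda p: "Drug A" if p["cholesterol"] == "high" else "Drug B",
]

patient = {"age": "senior", "bp": "high", "cholesterol": "normal"}
votes = [tree(patient) for tree in trees]
print(majority_vote(votes))  # two of three trees vote "Drug A"
```

Because each tree errs in a different way, the vote tends to be more accurate than any single tree.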

Smarter questions lead to smarter decisions

Whether you’re playing Guess Who?, diagnosing diseases, or building an AI system, the principle is the same: ask the right questions to reduce uncertainty as efficiently as possible.

Decision trees make this systematic. They measure the disorder in the data, calculate information gain, and split the problem space step by step until the answer becomes clear.

From childhood board games to cutting-edge AI, the lesson is timeless: the smartest path forward begins with the right question.
