On Math Learning on WhatsApp

Designing Math Learning on WhatsApp: Why a Single-Image Model Becomes Inevitable

When you try to teach mathematics over a platform like WhatsApp, the problem is not a lack of features. It is a mismatch between the nature of mathematics and the structure of the medium.

Mathematics is inherently two-dimensional. Even at a basic level, meaning is carried by layout:

a fraction like this depends on vertical separation
$\frac{2x^2 + 3x - 5}{x - 1}$
the following root depends on visual enclosure
$\sqrt{2x - 8y^3}$
and even something simple like
$\frac{1}{2}$
communicates structure more clearly than 1/2

But when these are flattened into plain text, the burden shifts from perception to interpretation. For example, while:

(2x^2 + 3x - 5)/(x - 1)

sqrt(2x - 8y^3)

are not necessarily wrong, they require the student to reconstruct the structure mentally. That reconstruction adds unnecessary cognitive load, and it is a place where many learners struggle.

Why not compensate with interaction?

A reasonable response is to introduce steps. For instance:

first evaluate
$A = 2^2 + 3 \cdot 4$
then compute
$B = A - 5$
finally evaluate
$C = \frac{B}{1 + 1}$
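
Carried out in order, the three steps evaluate as follows (a plain arithmetic check using only the expressions given):

```latex
\begin{align*}
A &= 2^2 + 3 \cdot 4 = 4 + 12 = 16 \\
B &= A - 5 = 16 - 5 = 11 \\
C &= \frac{B}{1 + 1} = \frac{11}{2} = 5.5
\end{align*}
```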

Pedagogically, this is sound. But on WhatsApp it translates into multiple messages, multiple images, and constant context switching. The student is no longer just following a line of reasoning; they are navigating a thread.

Even if each step is small, the cumulative effect is fatigue. More importantly, the interaction pattern itself becomes unstable. Some questions are single-step, others multi-step, some require recalling earlier values, others do not. That inconsistency introduces a layer of overhead that has nothing to do with mathematics.

Why not keep everything in text?

If images are costly, one might try to move everything into text, including answer options. Consider the following simplification question:

$\sqrt{4x - 16y^3}$

Rendered options:

$\sqrt{2x - 8y^3}$
$2\sqrt{x - 4y^3}$
$\sqrt{2}(x - 4y^3)$
$\sqrt{2x} - \sqrt{8y^3}$

Flattened into text:

sqrt(2x - 8y^3)
2sqrt(x - 4y^3)
sqrt(2)(x - 4y^3)
sqrt(2x) - sqrt(8y^3)

The issue here is not correctness but readability. The student must now parse subtle structural differences through linear text. The very distinctions you want them to notice become harder to see.
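
For reference, the simplification itself is short. The sketch below assumes $x - 4y^3 \ge 0$, so that the root of the product can be split:

```latex
\[
\sqrt{4x - 16y^3} = \sqrt{4\,(x - 4y^3)} = \sqrt{4}\,\sqrt{x - 4y^3} = 2\sqrt{x - 4y^3}
\]
```

Only the second option matches. The other three differ from it in exactly the structural details (what sits under the root, what is multiplied outside it) that linear text makes hard to see.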

The constraint that emerges

At this point, three requirements are clearly in tension:

Expressions must be visually accurate for students to understand them reliably.
The interaction pattern must remain consistent across questions.
The number of messages and images must remain manageable to avoid fatigue.

Attempts to satisfy all three simultaneously tend to fail. Improving one usually degrades another.

The pattern that remains stable

What tends to hold up, both pedagogically and operationally, is a simple structure:

each question is delivered as a single image
the image contains the full problem and all answer options, properly rendered
the student responds with a fixed, simple input such as A, B, C, or D

Consider a few examples.

Example 1: Expression simplification

Image content:

Simplify $\sqrt{4x - 16y^3}$

Options:

A: $\sqrt{2x - 8y^3}$ B: $2\sqrt{x - 4y^3}$ C: $\sqrt{2}(x - 4y^3)$ D: $\sqrt{2x} - \sqrt{8y^3}$

User interaction:

Reply with A, B, C, or D

Example 2: Function evaluation

Image content:

$f(x) = \frac{2x^2 + 3x - 5}{x - 1}, \quad \text{find } f(3)$

Options:

A: 11 B: 10 C: 9 D: 8
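
As a check on the intended answer, substituting $x = 3$ gives:

```latex
\[
f(3) = \frac{2(3)^2 + 3(3) - 5}{3 - 1} = \frac{18 + 9 - 5}{2} = \frac{22}{2} = 11
\]
```

so the correct reply is A.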

Example 3: Multi-step reasoning embedded in one frame

Image content:

$A = 2^2 + 3 \cdot 4$

$B = A - 5$

$C = \frac{B}{1 + 1}$

Find $C$

Options:

A: 4 B: 5 C: 5.5 D: 6

In each case, the mathematical structure is preserved visually, while the interaction remains identical.
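
One way to picture this fixed interaction is as a small data structure plus a reply parser. The following is a minimal Python sketch, not a definitive implementation: the names `Question`, `image_source`, `parse_reply`, and `grade` are hypothetical, the rendering step that turns the composed text into an image is assumed to exist elsewhere, and no WhatsApp API calls are shown.

```python
from dataclasses import dataclass
from typing import Optional

VALID_CHOICES = ("A", "B", "C", "D")


@dataclass(frozen=True)
class Question:
    """One single-image question: a prompt plus four options (hypothetical model)."""
    prompt_latex: str    # text with inline LaTeX for the problem statement
    options_latex: dict  # maps "A".."D" to the LaTeX for each option
    correct: str         # letter of the correct option

    def image_source(self) -> str:
        """Compose the text that an (assumed) renderer would turn into one image."""
        lines = [self.prompt_latex, ""]
        for letter in VALID_CHOICES:
            lines.append(f"{letter}: ${self.options_latex[letter]}$")
        return "\n".join(lines)


def parse_reply(text: str) -> Optional[str]:
    """Normalize a typed reply such as ' b ' or 'B.' to a letter, else None."""
    cleaned = text.strip().upper().rstrip(".) ")
    return cleaned if cleaned in VALID_CHOICES else None


def grade(question: Question, reply_text: str) -> Optional[bool]:
    """Return True/False for a valid reply, or None if the reply is not A-D."""
    choice = parse_reply(reply_text)
    if choice is None:
        return None
    return choice == question.correct


# Example 1 from the text, expressed in this structure.
q = Question(
    prompt_latex=r"Simplify $\sqrt{4x - 16y^3}$",
    options_latex={
        "A": r"\sqrt{2x - 8y^3}",
        "B": r"2\sqrt{x - 4y^3}",
        "C": r"\sqrt{2}(x - 4y^3)",
        "D": r"\sqrt{2x} - \sqrt{8y^3}",
    },
    correct="B",
)
```

Because every question reduces to the same shape, the handling of a reply never changes: `grade(q, "b")` and `grade(q, "B.")` are both treated as choice B, while anything outside A to D returns `None`, so the student can simply be asked to reply again.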

What this approach does and does not do

It does preserve:

clarity of notation
consistency of interaction
low message overhead

It does not support:

free-form algebraic input
step-by-step submission
adaptive branching within a single problem

Those are real limitations. However, attempting to include them within WhatsApp tends to introduce more friction than value, especially for learners who are still developing fluency.

Where the complexity goes

An important shift happens here. Instead of distributing complexity across the interaction, it is concentrated inside the problem itself.

You can still ask questions that require:

multi-step reasoning
identification of incorrect steps
comparison of similar expressions
conceptual understanding of identities

The difference is that all of this is embedded within a single, well-constructed visual.

Final observation

If one were designing a dedicated learning application, the solution would look very different: structured input, equation editors, step tracking, and so on.

But when the goal is to reach learners through a ubiquitous tool like WhatsApp, the design space narrows significantly.

Within that space, a system that is visually faithful, interactionally consistent, and operationally scalable tends to converge on a simple pattern: present the full mathematical object clearly once, and ask for a single, unambiguous decision in response.

It is not the most expressive model. It is, however, the one that remains stable under all the constraints identified above.
