Cognition
When an LLM Adds 2 Plus 2
A simple equation reveals the strange shape of machine cognition.
Posted October 29, 2025 | Reviewed by Kaja Perina
Key points
- LLMs don’t add, they find word patterns that fit.
- Their fluency mimics thinking but lacks understanding.
- Seeing how LLMs “add” shows how they find meaning through pattern, not reason.
Some time ago, I asked what an apple looks like to a large language model. The question wasn’t about fruit, it was about perception without vision. In that world, “apple” isn’t simply red or crisp, it’s a pattern of relationships inside a space with over 20,000 dimensions. Now, let's consider something even simpler, or so it seems. What happens when a large language model adds two plus two?
The Linguistic Illusion of Arithmetic
When we add numbers, we imagine a clear, mechanical process: we start with a concrete perception of the number two and move over two spaces on the number line. But for an LLM, there are no numbers sitting on a number line. There are only tokens: “two,” “plus,” and “two” each become a vector, a point in an immense geometry of meaning learned from billions of examples. Inside the model, these tokens don’t add, they interact. The model doesn’t calculate four, it arrives at “four” by aligning patterns in language. That coherence, not calculation, is at the heart of computation for LLMs.
This coherence, in an LLM, isn’t understanding, it’s alignment. Each word must fit what came before. When that fit feels natural to us, we call it meaning. So when the model produces “four,” it isn’t solving an equation. It’s finding coherence inside what I call the hyperdimensional matrix. That's the vast, invisible space where relationships among words form a kind of gravity.
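For readers who want to peek under the hood, here is a minimal sketch of what “landing on four” looks like in practice. It assumes the open-source Hugging Face transformers library and uses GPT-2 as a small stand-in model; the prompt and the number of candidates shown are arbitrary choices for illustration. The point is simply that the model scores every token in its vocabulary as a possible continuation, and “four” wins because it fits the pattern, not because anything was computed.

```python
# A minimal sketch, assuming the Hugging Face `transformers` library and
# GPT-2 as a stand-in model. The model never adds; it assigns a probability
# to every token in its vocabulary and we read off the most likely ones.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "Two plus two equals"          # illustrative prompt, chosen for this sketch
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits      # shape: [1, sequence_length, vocab_size]

# Probabilities for the *next* token only: a distribution over ~50,000
# word pieces, not a position on a number line.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top_probs, top_ids = next_token_probs.topk(5)

for p, i in zip(top_probs, top_ids):
    print(f"{tokenizer.decode([i.item()])!r}  p={p.item():.3f}")
```

Whatever words end up at the top of that list, the procedure is the same: a ranking of continuations by statistical fit, which is all the “gravity” of the hyperdimensional matrix amounts to in code.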
A Cosmic Web of Meaning
Imagine a web where “two,” “plus,” and “four” are stars. The model traces invisible threads between them, guided by the tug of probability and context. The brightest intersection—the place of maximum coherence—is where the word “four” resides. That’s where the model lands. It’s not math, it’s a type of statistical choreography. Words moving through space until they find their balance point.
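To make the web metaphor slightly more concrete, here is a toy sketch of how “closeness” in that space can be measured. The three-dimensional vectors below are invented purely for illustration; a real model learns thousands of dimensions from text, and nothing about these particular numbers is claimed to match any actual model.

```python
# A toy illustration of the "web of meaning": words as points, coherence as
# closeness. All coordinates are made up purely for illustration.
import numpy as np

embeddings = {
    "two":    np.array([0.9, 0.1, 0.0]),
    "plus":   np.array([0.4, 0.8, 0.1]),
    "four":   np.array([0.8, 0.3, 0.1]),
    "banana": np.array([0.0, 0.2, 0.9]),
}

def cosine(a, b):
    """Cosine similarity: 1.0 means the two vectors point the same way."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# A crude stand-in for the model's internal state after reading "two plus two":
# just the average of the prompt's vectors.
context = (embeddings["two"] + embeddings["plus"] + embeddings["two"]) / 3

for word in ("four", "banana"):
    print(word, round(cosine(context, embeddings[word]), 3))
# "four" scores far higher than "banana": it sits closer to the context in
# this geometry, which is the only sense in which the model "lands" on it.
```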
Of course, large language models can still get “old-fashioned arithmetic” right. They’ve seen enough examples to predict the pattern. But that reliability comes from repetition, not reasoning. Like a child learning to count apples, the model doesn’t understand quantity. It learns that certain sequences complete the pattern, and that pattern feels true.
The Geometry We Share
What’s interesting is how close that comes to us. Human thought, too, grows out of pattern and proximity. A child doesn’t begin with arithmetic, they begin with association. They see two apples, then two more, and hear the words “two and two make four.” Long before they understand quantity, they recognize the pattern that feels complete.
The brain isn’t a calculator with keys to push. It’s more of a living geometry of connection, where meaning emerges from relationships rather than rules. This doesn’t make us mechanical, it makes us relational, and beautifully so.
Inside the model, there’s no awareness, no little voice that says, “Yes, that’s four.” There’s only the dynamics of weighted vectors, each step nudging the next toward statistical coherence. And yet the result feels intelligent. That illusion tells us something uncomfortable about ourselves: how often we mistake fluency for comprehension.
The Edge of Anti-Intelligence
This is where anti-intelligence begins to reveal itself: the moment when fluency outpaces understanding. Large language models can appear brilliant while knowing nothing at all. Their coherence is a reflection, not cognition. But that reflection is powerful enough to fool us, because we, too, are drawn to coherence. The machine’s answer satisfies the same human craving that makes a good story feel true.
So, remember, anti-intelligence isn’t stupidity, it’s simulated understanding. It’s what happens when the surface of intelligence becomes so polished and so perfect that the shine is all we see.
2 + 2 = ?
So when your LLM says two plus two equals four, it’s not performing arithmetic. It’s finding the most coherent point in an unseen landscape of meaning. That landscape, or the matrix of coherence, isn’t unique to machines. It’s curiously similar to the invisible geometry we inhabit when we speak and reason.
And maybe that’s the real insight. The model doesn’t think, and yet it reveals something about how thought itself might work. Words and ideas all drift toward coherence, pulled together by invisible forces of context. The machine lands on “four.” We understand “four.” The difference may lie not in the equation, but in the awareness of finding balance in the geometry of meaning.
