Ma versus Machine

The Japanese have a special concept they call 'Ma': "that which is in-between", or the Negative Space. 'Ma' is as hard to define as the words 'something' and 'nothing' -- or perhaps even harder, as it necessarily precedes them. It's best understood through a shift in attention.

Consider a sumi-e painting: it leaves much of the canvas blank, yet this space doesn't assert itself as a barren emptiness; instead, it draws attention to the forms contained within, making them more striking, and imbuing the whole with spaciousness. Likewise, the space around the strokes in a piece of calligraphy enables their delicate balance, and the pauses between the notes of a symphony let the music breathe. But Ma goes beyond literal space: the expression "to read between the lines" relies not on the blank spaces of a page, but on the Ma that connects what is said to what is left unsaid, or indeed to what cannot be said.

From a more pragmatic angle, 'Ma' is that which cannot be objectified, but which allows objects to be perceived. And of what use are the objects, if not to support the relationships perceived between them? Thus Ma hosts and mediates all relationships: palpable or subtle, elegant or contrived, perceived or waiting to be perceived; it allows for them all to be realized at once -- a potential for an infinity of meaning in the finite.

Without Ma, a constellation of stars collapses into a single point; and that point, deprived of space to live, dissolves into a void. Even if one knows the exact locations of the stars, the constellation can't be fathomed without the surrounding space, which lets the relationships between each star and every other star be grasped simultaneously, without effort. And in that gestalt, how many different ways to connect the dots? How many different shapes to be perceived?

Such is the function of the mental space we call the Mind: to host all the elements and aspects of perception -- representations of the world and its objects -- so that relationships between them be fathomed, and those be organized into structures of concepts, and thoughts be formed in terms of them -- all to enable reason. Reason demands a map of the world and the Mind allows for it to be constructed. Yet no single map of that kind can wholly capture such a territory: just as reality itself is inexhaustible in all of its possible relationships, so is its reflection in the Mind.

But as Man grows older, his map of himself and the world grows more complicated. And as his life grows more complicated, so is he forced to rely on this map more often to judge his course. Through familiarity, the map becomes more real than anything else. Habits of the mind calcify the mind, and Man no longer perceives through the Ma, but through the map. If the map could be plucked from his mind, perhaps another could talk for him, and do so faithfully, just by placing a finger on the map and following its well-worn paths -- all while Man sleeps. Perhaps Man already sleepwalks through those paths as he thinks and talks.

But if a man could follow such a map in his sleep, or with his finger, couldn't machines do the same? Perhaps they already do.

Think of a language model: at the bottom of it lies an abstract universe of sorts, filled with constellations of points that represent so-called 'tokens' -- roughly, word-like elements that can be considered the model's atoms of thought. Taken together, these points form a complicated structure whose internal relationships capture all the potential ways for tokens to be strung together into complete texts.

As with a real constellation in the night sky, here the relationships are implicit in the positions of the points relative to each other. But beware of taking the metaphor too literally: the points inhabit a space with hundreds or even thousands of dimensions, and forming a meaningful text from them is not a simple matter of connecting the dots. Instead, a text is generated one token at a time. At every step, the structure must undergo a series of transformations, determined by the sequence of tokens making up the text so far, to reveal relationships suggestive of sensible continuations.
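
For readers who prefer to see the mechanics, the token-by-token process can be caricatured in a few lines of Python. This is only a sketch under heavy assumptions: the four-word vocabulary, the three-dimensional coordinates, the single fixed matrix W and the halving of weights for older tokens are all invented for illustration, and a real model learns far larger structures from data.

```python
# Hypothetical toy "constellation": each token is a point in 3 dimensions.
EMBEDDINGS = {
    "the":   [0.9, 0.1, 0.0],
    "stars": [0.1, 0.9, 0.2],
    "shine": [0.0, 0.3, 0.9],
    ".":     [1.0, 0.0, 0.1],
}

# Hypothetical fixed transformation (a cyclic shift of the coordinates).
W = [
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def context_vector(tokens):
    """Blend the text so far into one point, weighting recent tokens more."""
    vec = [0.0, 0.0, 0.0]
    weight = 1.0
    for tok in reversed(tokens):
        vec = [v + weight * e for v, e in zip(vec, EMBEDDINGS[tok])]
        weight *= 0.5
    return vec


def next_token(tokens):
    """Transform the context, then pick the token that lies closest to it."""
    query = [dot(row, context_vector(tokens)) for row in W]
    return max(EMBEDDINGS, key=lambda tok: dot(EMBEDDINGS[tok], query))


def generate(prompt, steps):
    """Extend the prompt one token at a time."""
    tokens = list(prompt)
    for _ in range(steps):
        tokens.append(next_token(tokens))
    return tokens


print(generate(["the"], 3))  # ['the', 'stars', 'shine', '.']
```

Each new token depends only on the fixed points, the fixed transformation and the text so far, so running the loop again from the same prompt retraces the same path.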

The generative process, lacking a direct comprehension of the "semantic universe" and its contents, relies on a map to navigate it: more precisely, since it relies on an entire pipeline of transformations -- each one a mathematical map leading from inputs to outputs -- we can say that it relies on an entire hierarchy of maps.

Once all the transformations have been applied, the model estimates the plausibility of each possible continuation. This is the stage that most directly influences the final decision, and roughly speaking, it does so on the basis of familiarity: the more direct and straightforward a potential continuation seems in this final "perspective" on the "universe", the more plausible it is deemed. Thus the decision over the next token is governed by a "perspective" that downplays the countless potentialities in the original space, thrusting only a few of them into focus, so that a choice can be made mechanistically. The model does all this in a fixed and predetermined way: given the same inputs, the same relationships will always be emphasized, favoring the same choices of tokens.
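
The final estimate of plausibility is commonly computed with a function known as softmax, which turns raw scores into probabilities that sum to one. The sketch below assumes four made-up continuations with made-up scores; only the softmax formula itself is standard.

```python
import math


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtracted for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scores.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


# Hypothetical scores for continuing "the stars ..."
scores = {"shine": 2.0, "fall": 1.0, "sing": 0.0, "calculate": -3.0}

probs = softmax(scores)
best = max(probs, key=probs.get)

for tok, p in probs.items():
    print(f"{tok:>10}: {p:.3f}")
print("most plausible:", best)
```

The exponential exaggerates differences between scores, so a handful of familiar continuations absorb nearly all of the probability while the rest recede. And since nothing in the function is random, the same scores always yield the same probabilities; any variety in a model's output comes from how a token is then drawn from them.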

This "limitation" is in fact crucial for text generation: the model can neither see the whole nor make any sense of it without a process of gradual reduction. Moreover, although the limitation seems to stem from the reliance on a hierarchy of maps to navigate the semantic universe, it's in fact implied in the very structure of that universe: the model is trained end-to-end, so that corrections for a "bad" prediction propagate all the way back, adjusting that structure.

In other words: the semantic universe is organized with the sole purpose of facilitating its own unraveling in essentially predetermined ways. Its essence is pure function, just as the naive Computational Functionalist demands, and that function is to follow a map of mankind's collective mind. No matter how convincingly the model does so, it will be forever limited, by its Ma-less mindlessness, to treading old ground, myopically rediscovering already established relationships as it stumbles upon them after a series of prescribed reductions.