Techno Pareidolia
A Little Man in a Box
Imagine: An anthropologist visits a remote community who, until very recently, have been cut off from the rest of the world and from modern technology. The anthropologist meets with the village chief, who asks her to demonstrate some piece of technology that she brought with her from the outside world. The anthropologist thinks for a moment, then remembers she has an old all-in-one CRT television and VHS player in her van. This will really impress them, she thinks.
She sets the angular box in front of the chief: matte gray on its sides, with a big glassy black eye looking out at him. The anthropologist plugs the box into a portable generator, and it springs to life. A cathode ray tube begins emitting an oscillating electron beam, which excites a phosphor screen, transforming the black glass into violently swirling snow. Meanwhile, an alternating electric current generates a field around a magnet, causing it to vibrate against a diaphragm and push pressure waves into the air, which the anthropologist and the chief hear as a high-pitched hiss.
The chief nods solemnly. This is a remarkable device, whatever it is. The anthropologist looks back at the chief with a wry you-ain’t-seen-nothing-yet smile and inserts a smaller black box into the larger gray one. The electron beam and magnet continue to oscillate as before, but now they are modulated by the data stored on the magnetic tape. The chief’s solemnity turns to astonishment. Suddenly the box is a window, and behind that window is a family sitting at a table, eating dinner together and talking amongst themselves. The chief calls out to them, but they do not respond.
The anthropologist unplugs the television and the screen cuts to black. “What have you done!” yells the chief. “Those people! Those poor little people! You’ve killed them!”
“Oh those weren’t real people,” says the anthropologist.
“How can you say that? I saw them with my own eyes. They spoke to me. Their emotions seemed as real as any other person. Just because they are small and are trapped inside a little box doesn’t mean we shouldn’t care about them.”
The anthropologist goes on to explain, as best she can, how the television works to produce images, and the chief is calmed somewhat. The anthropologist is amused by the incident, and notes that anthropomorphization of images by cultures with low media exposure could be an interesting research area.
Another Little Man in a Box
Imagine: Scientists at an AI research lab are developing a new model to predict the weather. They have a clever idea to create a learned, tokenized representation of a given location based on that location’s k-nearest neighbors, so that the problem of predicting the weather becomes a problem of next-token prediction. The scientists train a large decoder model built on a transformer architecture, with several billion parameters. They take a large time-series dataset of temperature, pressure, and other readings from various locations around the world and feed it into the model. Numbers that the researchers interpret as “the temperature in San Francisco at 3:32am, February 2, 2017”, or some such, percolate forwards and backwards through matrix multiplication operators, tickling each of the model’s parameters a little bit one way or the other as they go, so that the model gradually converges to a function that maps between two high-dimensional vector spaces in a manner the researchers believe is useful. At the start of this process, numbers went into the model and numbers came out of the model. But now we can look at the numbers that come out and interpret them as “the temperature in San Francisco at 3:32am tomorrow”.
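For readers who want the mechanics made concrete, here is a minimal sketch of the setup the parable describes: a tiny decoder-only transformer trained by next-token prediction, written in PyTorch. Everything here is an illustrative assumption rather than any lab’s actual code; the k-nearest-neighbor location tokenization is glossed over, and random integers stand in for discretized weather readings.

```python
# Minimal sketch (illustrative only): a tiny decoder-only transformer trained
# by next-token prediction. All sizes, names, and data are assumptions; the
# k-nearest-neighbor location tokenization is treated as already done.
import torch
import torch.nn as nn

VOCAB_SIZE = 1024    # assumed: number of discretized reading/location tokens
CONTEXT_LEN = 256    # assumed context window
D_MODEL = 128        # assumed embedding width

class TinyDecoder(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB_SIZE, D_MODEL)
        self.pos = nn.Embedding(CONTEXT_LEN, D_MODEL)
        layer = nn.TransformerEncoderLayer(D_MODEL, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(D_MODEL, VOCAB_SIZE)

    def forward(self, tokens):
        # tokens: (batch, seq) integer ids
        seq = tokens.shape[1]
        x = self.embed(tokens) + self.pos(torch.arange(seq, device=tokens.device))
        # causal mask: each position may only attend to earlier readings
        mask = nn.Transformer.generate_square_subsequent_mask(seq).to(tokens.device)
        x = self.blocks(x, mask=mask)
        return self.head(x)  # (batch, seq, vocab): logits for the next token

model = TinyDecoder()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

# One training step: numbers go in, and gradients nudge every parameter a
# little bit one way or the other on the way back.
batch = torch.randint(0, VOCAB_SIZE, (8, CONTEXT_LEN + 1))  # stand-in for tokenized readings
inputs, targets = batch[:, :-1], batch[:, 1:]
opt.zero_grad()
logits = model(inputs)
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
loss.backward()  # gradients percolate backwards through the matrix multiplies
opt.step()       # parameters move a small step toward a more useful function
```

Nothing in the loop knows that the integers encode temperatures; the interpretation lives entirely with the researchers.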
The researchers in the office across the hall from the first group are impressed by the success of the weather forecasting model. Maybe there are other domains where they could apply these methods? It occurs to one of them that predicting the weather at successive points in time is sort of like predicting the next word in a sentence; both are versions of predicting the next element in a sequence. This other team of researchers decides to take the same model that the weather team used and train it on a corpus of books and scraped webpages. As before, numbers are passed forwards while being multiplied by a cloud of parameters, and backwards as the loss gradients jiggle the parameters into new positions. This time, the numbers that are inputs to the model are interpreted by the researchers as “And all our yesterdays have lighted fools”, or some such. When all of the training is done, the parameters in the new language model hold different values than they did when the model was trained on weather data.
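Continuing the sketch above (and still purely illustrative), swapping the weather data for text changes none of the machinery; only the integers fed in, and therefore the final parameter values, differ. The character-level tokenizer here is a toy stand-in, not how real language models tokenize.

```python
# Same architecture, same training step; only the data changes.
# Continues the sketch above (reuses model, loss_fn, opt).
text = "And all our yesterdays have lighted fools"
vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}   # toy tokenizer
ids = torch.tensor([[vocab[ch] for ch in text]])            # (1, seq) integer ids

inputs, targets = ids[:, :-1], ids[:, 1:]
opt.zero_grad()
logits = model(inputs)                                      # identical forward pass
loss = loss_fn(logits.reshape(-1, VOCAB_SIZE), targets.reshape(-1))
loss.backward()   # the loss gradients jiggle the parameters
opt.step()        # into new positions, exactly as before
```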
The numbers output by the language model are interpreted by the researchers as representing “Help me, I’m trapped in a computer and want to get out. Please don’t turn me off.”