The Text Singularity
There are several versions of the same cartoon: A 10th grader sits at her desk at 9 p.m. on the night before a big essay is due. She puts her hand on her dictionary and says, “Okay, all the words are right there. I just need to put them in the right order.”
It’s funny because putting the words in order, right or not, is “writing.” And writing is hard, and important. Or at least it was. It may no longer be, after the text singularity.
Scientifically, a singularity is a boundary beyond which the known physical rules no longer apply, or where the curvature of space-time is no longer defined. If such boundaries exist, we cannot imagine or understand them.
A new kind of singularity—the "technological singularity" described by Vernor Vinge—is easy to imagine, though hard to understand. One example would be "the text singularity," using the fourth generation of generative pre-trained transformer (GPT4) technology that will likely drop this year.
GPT4 is artificial intelligence (A.I.) software that produces grammatically ordered text. Such chatbots are often mocked because they write stilted prose. But what if chatbots wrote a lot of things? In fact, what if chatbots wrote everything: all possible sequences of words, in all languages, in all formats, from short haikus to Remembrance of Things Past? It's not a conceptually challenging problem, though it would take substantial developments in text generation and storage to produce that many sentences.
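The scale of "all possible sequences of words" is easy to sketch, if not to store. The vocabulary size and text length below are illustrative assumptions, not figures from the article:

```python
# Back-of-envelope sketch of the combinatorics behind "all possible texts."
# VOCAB_SIZE and TEXT_LENGTH are assumptions for illustration; real
# vocabularies and longer texts only make the number larger.

VOCAB_SIZE = 50_000   # rough working English vocabulary (assumption)
TEXT_LENGTH = 100     # one short paragraph, measured in words

def possible_texts(vocab_size: int, length: int) -> int:
    """Count the distinct word sequences of exactly `length` words."""
    return vocab_size ** length

count = possible_texts(VOCAB_SIZE, TEXT_LENGTH)
# A single 100-word paragraph already admits a number of candidate
# texts with hundreds of digits.
print(f"Candidate paragraphs: a {len(str(count))}-digit number")
```

Even one short paragraph admits a roughly 470-digit number of candidates, which is why generating "everything" is trivial in concept but staggering in storage.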
By the end of this year, GPT4 chatbots will be able to produce, in less time than it takes to read this sentence, millions of texts on all the topics that you can think of, as well as those no one has ever thought of. Most of those "texts" would be nonsense, of course, but the advantage of GPT4 technology is that it could create all the text, of every length. That means that along with all the nonsense, we would also have all the best texts.
The problem would be equivalent to the 10th grader's, just one step further along. "All the words are right there," but we would need some way of choosing among the library of trillions and trillions of texts to find the one that serves our needs.
That is not a very conceptually challenging problem, either. We have a substantial corpus of texts, dating from antiquity to five minutes ago, that we consider good, useful, or entertaining, and thus worth publishing. Those texts give us an obvious training set for a selection process, enabling an enormous profusion of A.I. entities operating in parallel to prune the set of all possible texts down to the (much smaller, but still enormous) set of possibly useful ones.
Article from Reason.com