The Subtle Unfolding: GPT and the Fundamental Essence of Intelligence
How Large Language Models (LLMs) Hint at the Simplicity of Intelligent Life
Like many people, I have been fascinated by the recent surge of Large Language Models (such as OpenAI’s GPT) and their almost unbelievable capacity for knowledge retrieval and reasoning tasks.
Yes, LLMs hallucinate, and they often fail at very simple contextual tasks such as character counting.
But despite these weaknesses, LLMs are a remarkable tool that expands the horizon of what computers can do with the cumulative knowledge we have captured as humans in books, press articles, research papers, and the web in general.
Even with their somewhat choppy reliability, LLMs are capable of piecing together information in ways that genuinely resemble intelligence and can display the attributes of an intelligent actor.
But what makes LLMs incredibly interesting is the simplicity behind what they do.
LLMs are just character/token prediction machines that rely on their hyper-scaled size, in terms of data and parameters, to exhibit the type of utility they have become popular for.
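To make that concrete, here is a deliberately tiny sketch of the generation loop. The "model" below is just a hand-written table of made-up probabilities standing in for a trained network, but the mechanism of repeatedly predicting the next token is the same one an LLM uses.

```python
# A toy sketch of next-token prediction (illustrative only: the "model" is a
# hand-written probability table, not a trained network). A real LLM replaces
# this table with a neural network over tens of thousands of tokens, but the
# generation loop itself is exactly this kind of repeated prediction.

toy_model = {
    "<start>":  {"the": 0.6, "a": 0.4},
    "the":      {"cat": 0.5, "dog": 0.3, "universe": 0.2},
    "cat":      {"sat": 0.7, "ran": 0.3},
    "universe": {"is": 1.0},
    "is":       {"simple": 0.8, "complex": 0.2},
}

def generate(start="<start>", max_tokens=5):
    """Greedily pick the most probable next token until no continuation exists."""
    token, output = start, []
    for _ in range(max_tokens):
        candidates = toy_model.get(token)
        if not candidates:
            break
        token = max(candidates, key=candidates.get)  # most likely next token
        output.append(token)
    return " ".join(output)

print(generate())  # -> "the cat sat"
```

Everything that makes a real LLM impressive lives inside how that probability table is computed; the loop wrapped around it stays this simple.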
Of course, the way these models are trained depends on multiple layers of technical complexity, so they are definitely a mind-blowing accomplishment that wouldn’t have been possible a few years ago.
But at the heart of it all, the transformer architecture in LLMs is just a rather simple mechanism to understand the order of words and how they connect to each other.
This relative simplicity is what makes them even more special.
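For readers curious what that “rather simple mechanism” looks like in practice, here is a minimal NumPy sketch of scaled dot-product attention, the core operation inside a transformer layer. The sequence length, embedding size, and random inputs are arbitrary, chosen only for illustration.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: each position blends information from all
    positions, weighted by how well its query matches the others' keys."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                   # pairwise match scores
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax: rows sum to 1
    return weights @ V                                # weighted blend of values

# Toy self-attention: 4 token positions, 8-dimensional embeddings (arbitrary).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
print(attention(x, x, x).shape)  # (4, 8): one mixed vector per position
```

A real transformer stacks many of these layers with learned projections for Q, K, and V, but the heart of each layer is just this weighted mixing of positions.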
If an LLM is capable of showing early signs of intelligence, then it’s highly likely that the fundamental rules that allow intelligence to emerge are more elemental and primitive than our own conscious perception of intelligence would suggest.
This discovery opens the door to amazing possibilities in our attempts to understand the very fabric of reality.
Is it possible that the elusive fundamental theory of everything could be as simple as the straightforward, scaled mathematics of language that make LLMs possible?
Perhaps the universe is just a system with similar or even simpler rules, and the only reason we have failed to decode its workings is that we have failed to explore in the direction of simplicity instead of complexity.
The promise and potential of LLMs is a mesmerizing and inspiring example of the simplicity from which the world seems to stem.
I’m not sure LLMs will be capable of achieving AGI, but I wouldn’t be surprised if they did. After all, language is the most important interface for human intelligence and the only mechanism that allows us to relay our stream of consciousness.
It would make perfect sense if decoding the mechanics of natural language, plus a little reinforcement learning, were all it takes for artificial general intelligence to happen. Maybe then we could ask that AGI what the meaning of life is, and we would be surprised by how obvious and elegantly simple the answer is.
LLMs are interesting, but I’m not convinced that they are a simple model of intelligence. For one, language is a very human way to understand intelligence, but it isn’t the only way. Species including rats, pigs, dolphins, apes, crows, and octopuses all exhibit intelligence through complex social structures, memory, tool use, puzzle solving, non-verbal communication, and more, without the ability to use language. Additionally, there are humans who, for a variety of reasons, aren’t capable of using language, but who are no less intelligent for it. Written or spoken language is only one way that we can understand and convey information to other humans. Demonstration, tone, drawings and diagrams, gestures, and more are other ways that our understanding is encoded and transferred to others.
LLMs are great at recognizing patterns, but they have no conceptual model of the world, which means they lack something that sets human intelligence apart from machines. Humans and other species are able to flexibly adapt their knowledge of one thing to another with little to no additional training. Part of that flexibility comes from building analogies between similar things in their minds: this new thing is kind of like this other thing that I’ve experienced before…I’m going to try this approach and adapt based on what happens, adjusting my analogy as I go. And humans are often able to make analogies between concepts that might seem rather disparate, even to other humans. Someone may compare designing a digital product to the process of writing music or to the scientific method, depending on what they’ve had previous experience with. In that process, they often find skills that translate to a new context, making it easier and faster to pick up the new one.
At the moment, the poor facsimiles of intelligence exhibited by LLMs actually make me more interested in how fascinating and wondrous humans are. If we think about how humans learn, they are able to absorb both the mechanics and the meaning of words from only a small sample of language modeled around them, typically in under 10 years. And even beyond that, their vocabulary and grammar continue to expand and evolve organically as new words and constructions are introduced to the lexicon. And they do that without having all the material ever authored by humans fed into them. There are efficiencies in this system of learning that we simply haven’t come to understand.
Instead, I’d argue that the LLMs being built give the illusion of intelligence, like very advanced automatons that can seemingly replicate the output of intelligent life, but not the process. We’ve become, possibly dangerously, enamored of our own creations without considering their limitations or our own. I’d highly recommend reading this article about the nature of LLMs: https://nymag.com/intelligencer/article/ai-artificial-intelligence-chatbots-emily-m-bender.html