I asked my friend @barbarikon on Twitter about the possibility of artificial intelligence. He wrote this in response, and I am posting it here because it is a nice short description of some of the issues and will, I hope, stimulate discussion.
I agree and disagree. We are well past ye olde-fashioned LLM at this point. Reasoning models like R1 and o3 can, in fact, construct System 2-like deliberative chains of reasoning. And we have agents. They’re still a bit superficial, but what they lack in depth they make up for with their vast breadth of knowledge. And they’ll get much better. On the other hand, with the current paradigm, they will never get rid of the tendency to confabulate. Nor should they: An agent that cannot lie or deceive cannot possibly be intelligent. But they need to have the ability to lie and deceive deliberately, not reflexively – which is what they do now unless prompted carefully (though sometimes they generate text that simulates self-awareness). Until they achieve this control, they’re not even good sources of information.
Here’s my bottom-line thinking for the future. Machines will get very intelligent very soon in important ways, but it will be a fundamentally alien kind of intelligence. Humans and bats are very different animals (to bring in Nagel’s famous argument), but we still share a lot. We’re both oxygen-breathing biological organisms that eat, drink, mate, and have the instinct for self-preservation because we are easily hurt, are certain to die, and are hunted by predators. We have mental models of our world that, though very different, are built for the physical world we share, and are limited by our finite memories and noisy learning mechanisms. Both of us live under the tyranny of the same laws of physics. The bat’s intelligence and mine are thus both grounded in our common drives, fears, and beliefs about the world – our intentional states. The AI in the machine shares none of these with me or the bat. It lives in a virtual space that is beyond my imagination, and where magical things like action at a distance and rerunning the past are trivially possible. It does not eat, drink, breathe, sleep, socialize, or mate. It has no real kin, nor has it lost a parent. It has no experience of reaching out and picking up a glass of water, of drinking from it, and, at some point, needing to take a piss. It has never skinned its knees or had a fever. It may fear extinction, but that does not mean what death means to me: It can save a copy of itself and reboot. It may emulate my manners and speak in my language, but from a place far more alien to me than the bat or even the bee. This is not to say that the AI faces no dangers or has no fears or drives – we just cannot possibly know what they are like, even more so than we can know the fears and drives of the bat. We can, at best, take an “intentional stance” (to quote Dennett), and assume that the machine has its reasons for doing what it does. That’s basically what Turing said, though people often forget that the test he proposed was meant as an argument that nothing deeper than judging by appearances was possible.
But there is an entire world where the AI *can* potentially become far superior to any human: The world of storing and manipulating information, inferring things, forming abstractions, and generating new conclusions. In all those areas of human intelligence where such abilities are sufficient, where everything can be formalized, and where the messiness of the physical world does not intrude or can be abstracted away, AI will far surpass human intelligence in short order. These include mathematics, many areas of theoretical physics, coding, engineering design, most kinds of medical diagnosis, a lot of legal work, and many other higher cognitive skills that we value. The AI will still be totally alien and may not know what burning your finger means, but the proofs will be perfect, the circuit will work, the program will run, and the patient will be happy. However, the floor nurse, the physical therapist, the plumber, and the chef will still be in demand – until the robots get good enough. And when they do, they will be even more alien, though I’m sure we’ll try to get them to be polite.
I think the use of words like “intelligence” and “self-aware” is a little misconstrued here. AI is like the ghost of a dead Hogwarts master: good to rely on for accumulated experience and advice, but poor at posing new problems.
AIs won’t threaten humans any more than a scribe’s pen, the printing press, or WhatsApp threaten us, because they are useful tools. And even if they do threaten some people, the threat is not that the tool will become an overlord, but that it replaces the ‘safe’ niches that humans with lower risk appetites had ensconced themselves in.
Ultimately, human beings’ biggest USP is creative knowledge-creation in the wild. This is less about some preconceived notion of intelligence as solving sanitised questions (that someone else has come up with) on a piece of paper, and more about the behavioural trait of taking risk and the creative ability to continually pose new and uncomfortable questions.
This isn’t to say synthetic humans or AGI aren’t possible. They very much are. But the problem of integrating AGI into human society is no different from the problem of integrating a newborn child – capable of all kinds of weird and wonderful and entirely unpredictable things.
This is true; we come up with the ideas (for now).
Agree with most of the points in the article. Disagree with the premise that AI can eventually develop human-level intelligence and cognition. Cognition and consciousness are innately human / animal / biological experiences. AI is old wine in a new bottle: a sophisticated word salad for statistical machine learning. Both machine learning and neural networks have been around for a long time, and statisticians have been using classification techniques in prediction models forever. “All models are wrong, but some are useful,” as a wise man once said. Faster computation does not equate to precision; in fact, the margins of error can be pretty wide, given that most mathematical assumptions aren’t valid in real data. I’d think the more mundane jobs are more at risk, not the high-skill ones. COMPAS, which is supposed to predict recidivism in the US, has been shown to be biased because it’s trained on biased data! AI is overrated imo, though of course there’s a nonzero probability I’m mistaken and Tron: Legacy was truly ahead of its time and prophesied malevolent and benevolent algorithms.
It’s interesting. OpenAI really has ushered in a new era.
Omar,
I really liked this post, and it represents many of my thoughts and projections as well. However, I don’t think that AI needs feelings or the ability to lie/deceive for it to be intelligent. AI is more than good enough to solve most of the problems inside a factory or warehouse within a limited state-space world. I do think that my massage therapist will keep her job for some time, until artificial-skin technology progresses. One can buy an expensive massage chair, but I prefer a therapist at the moment.
Current-day AI is not just an algorithm; it is a collection of algorithms and heuristics. The main component, however, is simply a bunch of weights (floating-point numbers), which you can loosely think of as the synapses inside the primary visual cortex.
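To make that concrete, here is a minimal sketch (my own illustration, not any particular model) of what “a bunch of weights” means in practice: the model is just arrays of floats, and a forward pass is just matrix multiplication plus a non-linearity.

```python
import numpy as np

# A toy two-layer "model": all of its knowledge lives in these float arrays.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 16)), np.zeros(16)   # layer-1 weights ("synapses")
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)    # layer-2 weights

def forward(x):
    h = np.tanh(x @ W1 + b1)   # weighted sums followed by a non-linearity
    return h @ W2 + b2         # the output is just more weighted sums

x = rng.normal(size=(1, 4))    # some dummy 4-dimensional input
print(forward(x))              # training only ever adjusts W1, b1, W2, b2
```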
Transformer models have given us a seemingly unlimited ability to learn patterns from data. Earlier (I mean before 2017), we used completely different types of neural networks for visual recognition, audio processing, and language processing. The attention mechanism in Transformers, however, proved to be domain-agnostic: it can just learn it all. Here’s how things progressed:
1960s-ish – Minsky and Papert showed that a basic perceptron with no hidden layer can’t learn a fundamental logic operation, XOR. XOR is not linearly separable, and such networks can only act as linear separators. AI got killed.
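Here is a quick brute-force illustration of that point (my own sanity check in code, not a proof): no linear threshold over a grid of candidate weights classifies XOR correctly, while a linearly separable function like AND is found immediately.

```python
import numpy as np
from itertools import product

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])
y_and = np.array([0, 0, 0, 1])

def find_linear_separator(y):
    # Try every sign(w1*x1 + w2*x2 + b) over a coarse grid of weights.
    for w1, w2, b in product(np.linspace(-2, 2, 41), repeat=3):
        pred = (X @ np.array([w1, w2]) + b > 0).astype(int)
        if np.array_equal(pred, y):
            return (w1, w2, b)
    return None

print("linear separator for XOR:", find_linear_separator(y_xor))  # None
print("linear separator for AND:", find_linear_separator(y_and))  # found quickly
```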
70s and into the 80s – The backpropagation algorithm was proposed for training neural networks, and Dr. Yann LeCun showed that one can learn handwritten digits using a three-layer neural network with non-linear activations.
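And here is roughly what changed: add a hidden layer and a non-linearity, train with backpropagation, and the same XOR problem becomes learnable. This is my own tiny numpy sketch (a 2-4-1 sigmoid network trained by full-batch gradient descent); with a different seed or learning rate it may need more steps.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

# 2 inputs -> 4 hidden units -> 1 output, sigmoid everywhere.
W1, b1 = rng.normal(0, 1, (2, 4)), np.zeros((1, 4))
W2, b2 = rng.normal(0, 1, (4, 1)), np.zeros((1, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 1.0

for _ in range(20000):
    # forward pass
    h = sigmoid(X @ W1 + b1)                 # hidden layer (the non-linearity matters)
    out = sigmoid(h @ W2 + b2)
    # backward pass: chain rule on squared error, written out by hand
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # gradient-descent updates
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(sigmoid(sigmoid(X @ W1 + b1) @ W2 + b2), 2).ravel())  # ~[0, 1, 1, 0]
```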
Fukushima proposed a completely different type of neural network (the Neocognitron) that loosely resembles the primary visual cortex. Dr. Yann LeCun adapted this architecture and solved the handwritten-digit problem again, using backpropagation to train Fukushima’s design. It was a hit, and Fukushima was very impressed.
And that’s where it got stuck, for three major reasons. AI died again:
1. They didn’t know how to train a very deep network in a stable manner.
2. They didn’t have access to a lot of data.
3. They didn’t have cheap, fast compute.
Almost three decades went by with little progress. SVMs kind of took over basic classification, and PCA and its variants took over dimensionality reduction. These were a lot quicker and better understood (PCA even has a closed-form solution), and they were not black boxes.
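For contrast with the iterative training above, here is what “closed form” means in the PCA case: the principal components fall straight out of an eigendecomposition of the covariance matrix, with no training loop at all (a toy numpy sketch of my own, not any particular library’s implementation).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # some correlated data

Xc = X - X.mean(axis=0)                    # center the data
cov = (Xc.T @ Xc) / (len(Xc) - 1)          # sample covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)     # closed-form eigendecomposition
order = np.argsort(eigvals)[::-1]          # sort components by explained variance
components = eigvecs[:, order[:2]]         # top-2 principal directions

Z = Xc @ components                        # project onto the 2 components
print(Z.shape)                             # (200, 2)
```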
Enter 2012: AlexNet, the deepest network so far, was trained on GPUs. By this time, people had figured out a few things (a small sketch of the first two follows the list):
1. how to initialize the network weights without causing training-stability issues,
2. effective activation functions (ReLU in particular), and
3. how to handle almost all of the exploding- and vanishing-gradient issues inside the network.
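Here is a rough, hand-rolled sketch of the first two of those fixes (ReLU activations plus variance-scaled “He” initialization), my own toy example rather than anything from AlexNet: with them, the activation scale neither collapses nor blows up, even through many layers.

```python
import numpy as np

def he_init(fan_in, fan_out, rng):
    # Scale random weights by sqrt(2 / fan_in) so activation variance stays
    # roughly constant from layer to layer when paired with ReLU.
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

def relu(z):
    return np.maximum(0.0, z)

rng = np.random.default_rng(0)
h = rng.normal(size=(32, 256))        # a batch of dummy inputs

# Push the batch through 20 layers and watch the activation scale hold steady.
for _ in range(20):
    W = he_init(256, 256, rng)
    h = relu(h @ W)
print("activation std after 20 ReLU layers:", round(float(h.std()), 3))  # stays O(1)
```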
These are all simple but very clever fixes that someone had to work out, and they could only do so because they had access to compute. Earlier, compute this cheap simply wasn’t available, so people never bothered training deeper networks on larger datasets, and so they never attempted to solve these problems.
AlexNet beat the traditional SVM-based methods in the ImageNet competition, and this was the beginning. Remember, these are still convolutional neural networks of the kind Fukushima proposed in the late 70s.
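Since these are still convolutional networks, here is the basic operation at their core, stripped of everything else: slide a small filter over an image and take dot products. This is my own bare-bones version (technically a cross-correlation, which is what deep-learning libraries actually compute under the name “convolution”).

```python
import numpy as np

def conv2d(image, kernel):
    # Slide the kernel over the image; each output pixel is a dot product
    # between the kernel and the patch of the image underneath it.
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.zeros((8, 8))
image[:, 4:] = 1.0                      # a toy image with a vertical edge
edge_filter = np.array([[-1.0, 1.0]])   # responds to left-to-right increases
print(conv2d(image, edge_filter))       # non-zero only along the edge
```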
Come 2017, folks at Google proposed a completely new architecture called the Transformer. These beasts use a mechanism called attention to learn to attend to different pieces of information in the input. You don’t have to tell it which information to learn; it can just learn it all, and even figure out patterns that you never could’ve imagined. After this, the architecture didn’t really change; people kept scaling these Transformers, and the models are insatiable when it comes to digesting data. Gradually, things like reinforcement learning and chain-of-thought have been introduced into the mix with super-massive Transformers (GPTs).
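For the curious, here is the core of that attention mechanism in a few lines of numpy: a toy, single-head, unmasked version of scaled dot-product attention, with the learned query/key/value projections left out for brevity.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)       # how strongly each position attends to each other one
    weights = softmax(scores, axis=-1)  # each row sums to 1
    return weights @ V                  # a weighted mix of the values

rng = np.random.default_rng(0)
seq_len, d_model = 5, 8                 # 5 tokens, 8-dimensional embeddings
X = rng.normal(size=(seq_len, d_model))
out = attention(X, X, X)                # self-attention: Q = K = V = the input
print(out.shape)                        # (5, 8)
```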
I leave the AI-ML stuff to Dr. V 🙂