I’ve been thinking a lot about thinking lately.
I really love frameworks that sort and simplify structures that might otherwise seem overwhelming, and the trick works across a lot of different arenas of thought. I did it for the universe in The Substance and the Fabric, and I wanted to do something similar here for thinking.
Thinking is a great candidate for such a simplifying framework. When you get right down to it, intelligence seems to boil down to just two components.
First, there’s the thinking architecture. For wet things like humans, we might make the mistake of equating this with the brain, but that’s not quite right: our minds are bigger than our brains. We use our hands for cognitive offloading, and we use all sorts of tools designed to sort things for us.
Calculators and computers have this sort of thinking architecture, although maybe it sounds offensive to use the word thinking there. We think of thinking as something mystical, but when you get all the way down to how decisions are made, it’s just lots of little binary battles:
That neuronal tug-of-war that happens whenever you make a decision, where millions of parameters are weighed in a split second, each playing out its own little battle across dendrites and axons.
That’s the same tug-of-war that’s happening any time you generate an image or ask a question of an LLM. This incredibly complex process happens beneath the surface of our conscious thought...
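If that tug-of-war sounds abstract, here’s a toy sketch of the idea (nothing like real neurons or a real LLM, just the spirit of the thing, with made-up numbers): a single artificial “neuron” where every weighted input pulls the decision one way or the other, and the sum settles the battle.

```python
# Toy illustration: one tiny binary battle. Each input pulls the
# decision toward "yes" (positive weight) or "no" (negative weight),
# and the weighted sum decides the winner. All numbers are invented.

def tiny_decision(inputs, weights, threshold=0.0):
    """Return True if the weighted evidence crosses the threshold."""
    evidence = sum(x * w for x, w in zip(inputs, weights))
    return evidence > threshold

inputs = [1.0, 1.0, 1.0]
weights = [0.7, -0.4, 0.2]   # two pulls for "yes", one pull for "no"

print(tiny_decision(inputs, weights))  # True: the "yes" side wins, 0.5 > 0
```

A real brain or model runs billions of these little battles at once; the sketch just shows the shape of a single one.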
It’s ironic that we initially set out to mimic human intelligence by programming AI to think like us, but what seems to be happening more and more is something completely different: we are learning how our own intelligence works. It turns out we never really understood this very well in the first place.
In this way, any system that can reach equilibrium can be said to be a thinking system, albeit incredibly simplified (and only interested in one parameter, not billions). You’re probably not going to be friends with your thermostat, but it acts as your agent, in a manner of speaking, making the room just a little cooler whenever it gets too hot.
Norbert Wiener, the founder of cybernetics and today considered one of the forebears of modern AI, famously used the thermostat as an example to make this sliding-scale point about thinking. There are degrees to how much architecture a system has and to how many of those tiny decisions it makes, even when every decision is based on feedback.
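To make the thermostat-as-thinker concrete, here’s a made-up little simulation: one parameter, pure feedback, one tiny decision per tick.

```python
# A thermostat as a minimal "thinking" system: a single parameter
# (temperature) and one binary feedback decision per tick. Values are
# invented for illustration.

def thermostat_step(room_temp, setpoint=21.0):
    """One feedback decision: nudge the room toward the setpoint."""
    if room_temp > setpoint:
        return room_temp - 0.5   # too hot: cool a little
    return room_temp + 0.1       # otherwise the room drifts warmer

temp = 25.0
for _ in range(20):
    temp = thermostat_step(temp)
print(round(temp, 1))  # hovers near the 21.0 setpoint: equilibrium
```

That’s the whole agent: no billions of parameters, just one, settled over and over by feedback.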
Besides this sliding scale of how good the architecture is, there’s another important component to thinking, one you may have heard quite a bit about lately if you’ve been paying attention to AI news: thinking costs energy.
When I first heard about these costs, they were relatively low and pretty easy to dismiss. However, today it’s no longer so easy to be sanguine: data centers today use as much energy as Finland or Chile, and if that doesn’t double within two years, I’ll be an anthropomorphic goat.
If you have an AI model and you want a much better answer, one trick is to ask it to think about the question for longer. This is a really clear example of how you can make up for a crappy architecture with more energy. It’s also why you constantly hear about how much more energy-efficient these models are getting, which has a direct bearing on their cost.
This is just like when you go on a road trip with friends. You can get to where you need to be, but you don’t exactly want to burn an extra gallon or two of gas getting there, so you think carefully about the most efficient route. Burning that physical gas is exactly what’s happening when those AI systems begin sorting: the computation spends real energy.
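Here’s a toy model of the trade, with entirely invented numbers: each extra “sample” costs compute, and therefore energy, but buys another shot at a better answer.

```python
# Toy model of "think about it longer": spending more compute (energy)
# on extra attempts raises the odds of a good answer. The scoring and
# all numbers are invented for illustration.

import random

def one_attempt():
    """One noisy guess; a higher score stands in for a better answer."""
    return random.random()

def best_of(n_samples):
    """Spend n_samples worth of 'energy' and keep the best attempt."""
    return max(one_attempt() for _ in range(n_samples))

random.seed(0)
print(round(best_of(1), 2))    # one cheap attempt
print(round(best_of(50), 2))   # fifty times the energy, a better answer on average
```

Same crappy architecture, better answer, paid for in energy.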
Stunningly, the seeds of this conclusion go back to James Clerk Maxwell in 1867, via a thought experiment we now refer to as Maxwell’s Demon: a little fellow opens a tiny trap door to allow only fast-moving particles over to one side, thus creating something akin to a perpetual motion machine. The catch, worked out over the following century, is that the demon’s measuring and remembering are themselves thinking, and that thinking has a cost in terms of energy with no way around it.
There’s really no free lunch here: stealing organization from the universe costs energy, because the sorting itself creates at least as much disorder elsewhere. The universe is in a state of constantly becoming more and more disordered, and that means organization gradually erodes as a rule.
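The modern, quantitative form of this no-free-lunch rule is Landauer’s principle: erasing even a single bit of information costs at least k·T·ln 2 of energy. A quick back-of-the-envelope:

```python
# Landauer's principle: the minimum energy cost of erasing one bit of
# information is k_B * T * ln(2). At room temperature this is tiny,
# but it is never zero.

import math

k_B = 1.380649e-23      # Boltzmann constant, joules per kelvin
T = 300.0               # roughly room temperature, in kelvin

min_energy_per_bit = k_B * T * math.log(2)
print(f"{min_energy_per_bit:.2e} J per erased bit")  # ~2.87e-21 J
```

A vanishingly small number per bit, but multiply it by the trillions of little binary battles running in a data center and you start to get Finland.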
This is creeping into some very thorny territory: systems tend to get more and more disordered over time, so keeping them organized costs energy. That’s because there are far more disordered states than ordered ones, so when particles wiggle around, they’re more likely to end up in a slightly more disordered state. Things decay over time.
This tendency toward disorder is baked right into the universe, and entropy may well explain the arrow of time. In other words, there is no getting around this expense.
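You can watch disorder win by sheer counting. With 100 coins there is exactly one way to be all heads, but an astronomical number of ways to be half heads and half tails:

```python
# Why disorder wins: counting microstates for 100 coin flips.

from math import comb

print(comb(100, 100))  # 1 way to be perfectly ordered (all heads)
print(comb(100, 50))   # ~1e29 ways to be maximally mixed (50/50)
```

Jiggle the coins at random and they will almost certainly land in one of the 10²⁹ mixed arrangements, not the single ordered one.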
Thinking is fighting against the arrow of time, in a manner of speaking. That’s why it can be so exhausting, and that’s why I need one more cup of coffee.