Stop Studying (for the Test)
You remember cramming, right?
Even if you were a good student, there was probably a night when you didn’t feel like you understood the material well enough to ace (or even pass) tomorrow’s exam. You made a calculated trade: you would swap a few hours of sleep for a few more hours of studying.
Did this actually work? The answer is a bit complicated, but at least on certain occasions and for certain tests, cramming can make a positive impact. Still, it’s far better to absorb the material steadily as you go along, so that as test time nears you’re reminding yourself rather than learning it for the first time.
Not everyone takes this safer approach, though—in fact, it’s fair to say that diligent students are the exception rather than the norm, and so you end up with a lot of cramming.
I think it’s conventional wisdom by now that cramming is generally a bad way to learn something for good. You might be able to retain the information for long enough to pass a test, but writing that info permanently into your brain is a completely different matter.
I want to take this a bit further today.
Cramming is bad, but studying in order to pass a test is the same kind of bad.
The ultimate goal of learning isn’t to cross an arbitrary finish line. It’s not to see if you can stay calm under pressure, in a room filled with other sweating students. It’s not to see what you can remember right now, during this nervous hour.
A test tells you these things about a student, but not whether they are learning the material. Will they know what you mean in three years if you ask a question based on the knowledge this test proves they have right now?
We are seeing the direct effects of studying for the test in the way we’re trying to test and understand today’s AI frontier models.
Frontier models are the ones at the bleeding edge of intelligence, reasoning, thinking, whatever you want to call it. They are undeniably smart, and useful in the right hands. Today, there are three names dominating the conversation, and it’s probably time to get familiar with them if you’re not already.
OpenAI’s ChatGPT is probably the most famous of all the LLMs at the frontier, but by now you may have heard of Anthropic’s Claude and Google’s Gemini. All three companies try to produce a model that wins the title of Best at Taking This Particular Test.
The particular tests have clever names like Humanity’s Last Exam and SimpleBench (shout-out to Why Try AI for keeping Substack up to speed on all this!). These tests all try to stump the models so that we poor humans can figure out just how smart they really are, but I think you already see the main issue here.
Indeed, when these models are being trained in the first place, they’re often built with passing these tests in mind and little else. The target isn’t becoming smarter or better—the target is to pass the test.
The problem is that you can get very good at passing a particular test, but still be dumb as a bag of bricks.





Bruh, only losers study for exams INSTEAD of sleeping.
Real pros just put their textbooks under the pillow and absorb knowledge through osmosis WHILE sleeping.
Look it up!
But funnily enough, I've had some discussions recently about this in a non-AI-related context. In Ukraine, the primary school I experienced focused a LOT on memorizing the "right" answer and then spitting it back at the teacher to get good grades, so that's what I was raised with. But in Denmark, a lot more emphasis (at least in university) is on actually applying that knowledge in the real world via case studies (applying what you've learned to help a company with a real issue/process) and group work, which teaches you more of the skillset you'll actually need to navigate the world.
In some sense, Ukraine here is a bit like training AI to pass benchmarks with static, baked-in knowledge, while Denmark is more about giving AI a "world model" that helps it navigate novel scenarios, which nobody has fully cracked yet.
I never studied, I meditated. Drove fellow students crazy as they crammed every second before the exam. Being calm and collected can achieve mighty results. I also wrote a program called Jeopardy that gave answers and asked for questions. Only needed for history classes that required memorizing, which I consider harmful.