by Malcolm Murray
Oops. During most of 2024, all the talk was of deep learning hitting a wall. Rumors were coming out of OpenAI and Anthropic that their latest training runs were disappointing. People were confidently stating that AI progress had hit a plateau. Importantly, this was not just a few pundits who have bet their careers on the current AI path being the wrong one, such as the Gary Marcuses of the world. It was all of tech and even mass media jumping on the AI-hitting-a-wall meme.
And then what happened just at the end of the year? OpenAI announced its o3 model: a new model, with an unknown architecture, that achieves benchmark scores hitherto unimaginable. o3 scores 88% on the ARC-AGI challenge, a benchmark designed to measure the efficiency of AI on novel tasks. o3 gets past 2,700 on Codeforces, making it comparable to the best developers in the world. Maybe most impressively, o3 gets 25% on FrontierMath, a benchmark of fiendishly difficult math questions created by Epoch AI just months earlier, on which other AI models had scored 0% (except o1, which scored 2%).
The point of this post is not to argue that these are mind-blowing scores and that AGI has been achieved, as some were quick to do. There are always question marks about how such scores were achieved. For example, some people question whether training the model on part of the public ARC-AGI training set unduly influenced the results. Or perhaps it solved only the easier questions in FrontierMath. But that is beside the point. The bigger point here is that, when it comes to AI evolution, prediction seems dead. As noted by Zvi Mowshowitz in his post on o3, no one saw this coming. Outside of OpenAI, the whole tech industry seemed to have bought into the AI-has-hit-a-wall meme. What this means is that we need to stop focusing on trying to predict AI capabilities and instead focus on building resilience to AI risk. Rather than trying to determine exactly when we will get specific AI capabilities, we need to start preparing society for their effects, so that the impact can be mitigated.



My great-grandparents were among the 12 million immigrants who passed through Ellis Island, part of the wave of 20 million immigrants who entered the United States between 1880 and 1920. America’s fast-growing economy needed more manpower than its existing population could supply, and the poorer classes of Europe were the beneficiaries, including four million Italians (largely southern) and two million Jews.
An empire, threatened on its flank, vents spleen



Some people use religion to get their life together. Good for them. I’m all for it. Although I myself am an atheist, I don’t think it much matters how someone gets their life together so long as they do.

On the one hand, nothing has changed since August 2020, when I wrote 
Anatomically, it’s the optic disc – the spot on each retina where neurons with news from all the light-sensitive rods and cones of the retina converge into the optic nerves. The optic disc itself,

Sughra Raza. Rorschach Landscape, Guilin, China, January 2020.


Of course, there was no guarantee that Gerver’s couch was the biggest possible; his approach made no promise of producing the best possible couch. A little more convincing is the fact that in 30 years no one has been able to do any better. But mathematics is a game of centuries and millennia, and a few decades is small potatoes. In 2018, Yoav Kallus and Dan Romik proved that the couch could be no larger than 2.37 square meters. But the gap between Gerver’s couch and the Kallus-Romik upper bound is an order of magnitude larger than the gap between the couches of Gerver and Hammersley.
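
To make the "order of magnitude" comparison concrete, here is a rough arithmetic sketch. The text gives only the 2.37 upper bound. The areas used for Hammersley's couch (π/2 + 2/π, about 2.2074) and Gerver's couch (about 2.2195) are the commonly cited values for a hallway one meter wide. They are assumptions brought in for illustration, as are the variable and function names.

```python
import math

# Areas in square meters for a hallway one meter wide.
# Only the Kallus-Romik bound appears in the text; the other two values
# are commonly cited figures, used here as assumptions.
HAMMERSLEY_AREA = math.pi / 2 + 2 / math.pi  # about 2.2074
GERVER_AREA = 2.2195                         # rounded
KALLUS_ROMIK_BOUND = 2.37                    # upper bound quoted in the text


def gap_ratio(lower, middle, upper):
    """Return (upper - middle) / (middle - lower)."""
    return (upper - middle) / (middle - lower)


ratio = gap_ratio(HAMMERSLEY_AREA, GERVER_AREA, KALLUS_ROMIK_BOUND)
print(f"Gerver minus Hammersley: {GERVER_AREA - HAMMERSLEY_AREA:.4f}")
print(f"Bound minus Gerver:      {KALLUS_ROMIK_BOUND - GERVER_AREA:.4f}")
print(f"Ratio of the two gaps:   {ratio:.1f}")
```

Under these assumed values, the gap up to the bound is a little over twelve times the gap between the two couches, which is consistent with the "order of magnitude" description.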