Alberto Romero at The Algorithmic Bridge:
Let me sum all this up because it’s too much information to process: What o3 just did is leap into uncharted territory. OpenAI trusted the trajectory and landed here. At 71.7% SWE-bench, 99.95th percentile Codeforces, 96.7% AIME, 87.7% GPQA Diamond, 25.2% FrontierMath and 87.5% ARC-AGI.
We don’t know what any of this means. We don’t know what lies further ahead. We don’t know what the next years hold. GPT-3 was four years ago for God’s sake.
Plenty of people are saying o3 is artificial general intelligence (AGI), or at least a soft form of AGI. Chollet denies the claim with an argument that reminds me of the idea that “no AGI is dumb at times.” He says beating ARC-AGI was a necessary but not sufficient condition to claim AGIness, and that there’s still research to do. I’m not sure what to think. The variance in intelligence across tasks is still high or o3 wouldn’t fail a single ARC-AGI task while striding through FrontierMath, but the last bastions resisting the unstoppable advance of AI seem to be falling one by one. Is it bitter? Is it even more bitter? I don’t know. Will new walls emerge to resist current techniques, as Chollet hopes to achieve with ARC-AGI-v2? I also don’t know.
More here.