A transformative month rewrites the capabilities of AI

Ethan Mollick at One Useful Thing:

Llama 3.3, running on my home computer passes the “rhyming poem involving cheese puns” benchmark with only a couple of strained puns.

At the end of last year, there was only one publicly available GPT-4/Gen2 class model, and that was GPT-4. Now there are between six and ten such models, and some of them are open weights, which means they are free for anyone to use or modify. From the US we have OpenAI’s GPT-4o, Anthropic’s Claude Sonnet 3.5, Google’s Gemini 1.5, the open Llama 3.2 from Meta, Elon Musk’s Grok 2, and Amazon’s new Nova. Chinese companies have released three open multi-lingual models that appear to have GPT-4 class performance, notably Alibaba’s Qwen, R1’s DeepSeek, and 01.ai’s Yi. Europe has a lone entrant in the space, France’s Mistral. What this word salad of confusing names means is that building capable AIs did not involve some magical formula only OpenAI had, but was available to companies with computer science talent and the ability to get the chips and power needed to train a model.

In fact, GPT-4 level artificial intelligence, so startling when it was released that it led to considerable anxiety about the future, can now be run on my home computer. Meta’s newest small model, released this month, named Llama 3.3, offers similar performance and can operate entirely offline on my gaming PC. And the new, tiny Phi 4 from Microsoft is GPT-4 level and can almost run on your phone, while its slightly less capable predecessor, Phi 3.5, certainly can. Intelligence, of a sort, is available on demand.

More here.

Enjoying the content on 3QD? Help keep us going by donating now.