Dean W. Ball in The New Atlantis:
This notion — that LLMs are “just” next-word predictors based on statistical models of text — is so common now as to be almost a trope. It is used, both correctly and incorrectly, to explain the flaws, biases, and other limitations of LLMs. Most importantly, it is used by AI skeptics like [Gary] Marcus to argue that there will soon be diminishing returns from further LLM development: We will get better and better statistical approximations of existing human knowledge, but we are not likely to see another qualitative leap toward “general intelligence.”
There are two problems with this deflationary view of LLMs. The first is that next-word prediction, at sufficient scale, can lead models to capabilities that no human designed or even necessarily intended — what some call “emergent” capabilities. The second problem is that increasingly — and, ironically, starting with ChatGPT — language models employ techniques that complicate the notion of pure next-word prediction of Internet text.
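To make the “next-word predictor” framing concrete, here is a minimal sketch (not from Ball’s article) of the simplest possible statistical model of text, a bigram counter. The corpus and helper names are illustrative; a real LLM replaces the count table with a neural network trained on Internet-scale text and words with subword tokens, but the autoregressive generation loop is the same.

```python
import random
from collections import Counter, defaultdict

# A toy "statistical model of text": bigram counts from a tiny corpus.
corpus = "the cat sat on the mat and the cat slept on the mat".split()

# Count how often each word follows each other word.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Sample the next word in proportion to how often it followed `word`."""
    counts = bigrams[word]
    if not counts:  # word never appeared with a successor; nothing to predict
        return None
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# Autoregressive generation: feed each prediction back in as the new context.
word, output = "the", ["the"]
for _ in range(8):
    word = predict_next(word)
    if word is None:
        break
    output.append(word)
print(" ".join(output))
```

Everything the model “knows” lives in those counts; the deflationary argument Ball describes is, in effect, that an LLM is this loop at vastly greater scale, and his counterpoint is that scale plus post-training techniques make that description misleading.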
More here.