Ben Brubaker at Quanta:
If language models really are learning language, researchers may need new theories to explain how they do it. But if the models are doing something more superficial, then perhaps machine learning has no insights to offer linguistics.
Noam Chomsky, a titan of the field of linguistics, has publicly argued for the latter view. In a scathing 2023 New York Times opinion piece, he and two co-authors laid out many arguments against language models, including one that at first sounds contradictory: Language models are irrelevant to linguistics because they learn too well. Specifically, the authors claimed that models can master “impossible” languages — ones governed by rules unlike those of any known human language — just as easily as possible ones.
Recently, five computational linguists put Chomsky’s claim to the test. They modified an English text database to generate a dozen impossible languages and found that language models had more difficulty learning these languages than ordinary English. Their paper, titled “Mission: Impossible Language Models,” was awarded a best paper prize at the 2024 Annual Meeting of the Association for Computational Linguistics.
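To get a feel for what an “impossible” language looks like, here is a minimal sketch of two corpus perturbations resembling the paper’s reverse and shuffle families. The function names and whitespace tokenization are my own illustration, not the authors’ code, which works with tokenized training data:

```python
import random

def full_reverse(sentence: str) -> str:
    """Reverse the order of all tokens in a sentence,
    yielding a word order no human language uses."""
    return " ".join(reversed(sentence.split()))

def deterministic_shuffle(sentence: str, seed: int = 0) -> str:
    """Shuffle tokens with a fixed seed, so sentences of equal
    length are always permuted the same way: a consistent rule,
    but one unlike any attested grammar."""
    tokens = sentence.split()
    rng = random.Random(seed)  # fixed seed keeps the rule consistent
    rng.shuffle(tokens)
    return " ".join(tokens)

if __name__ == "__main__":
    s = "the cat sat on the mat"
    print(full_reverse(s))           # mat the on sat cat the
    print(deterministic_shuffle(s))  # same fixed scramble every run
```

The point of transformations like these is that the result is still perfectly systematic, just governed by rules (global reversal, seeded permutation) that no natural language follows, which is exactly what makes them a test of Chomsky’s claim.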
More here.
