Reify This

Leif Weatherby, Tyler Shoemaker, and Benjamin Recht in The Ideas Letter:

Last month, Pope Leo XIV himself anointed an esoteric subfield of AI research. Midway through his encyclical Magnifica Humanitas: On Safeguarding the Human Person in the Time of Artificial Intelligence, he calls for a “deepening of scientific research” into AI. As he explains, this research is necessary to moral discernment because knowing how AI works is a precondition for serious ethical inquiry. The pope concludes that further research is needed in “interpretability”: AI research that aims to understand how AI systems work and explain why they behave as they do.

For interpretability researchers, answers to these questions lie in discovering some causal mechanism under the hood. Neural networks are black boxes that must be pried open. Researchers often speak of “internal representations,” which encode how models structure information about the world. Pope Leo explicitly invokes these representations in the encyclical. We know very little about them, he says, because AI developers do not design every detail of their models. Instead, “current AI systems are more ‘cultivated’ than ‘built,’ for developers do not directly design every detail, but instead create a framework within which the intelligence ‘grows.’” While elsewhere the pontiff is critical about the prospect of such intelligence, here he seems to quote copy directly from the major AI labs. In fact, he is. Sitting to his left at the encyclical launch was Chris Olah, a co-founder of Anthropic.

Olah is a leading advocate of interpretability, and the cultivation metaphor is his. For years, he has drawn analogies between biology and AI, declaring about machine learning that its elegance “is the elegance of biology, not the elegance of math or physics.” At the encyclical launch, Olah went even further: “… what has grown is far more subtle, odd, and beautiful than science fiction prepared us for.”

More here.
