Why AI Needs Physics to Grow Up

by Ashutosh Jogalekar

There has long been a temptation in science to imagine one system that can explain everything. For a while, that dream belonged to physics, whose practitioners, armed with a handful of equations, could describe the orbits of planets and the spin of electrons. In recent years, the torch has been seized by artificial intelligence. With enough data, we are told, the machine will learn the world. If this sounds like a passing of the crown, it has also become, in a curious way, a rivalry. Like the cinematic conflict between vampires and werewolves in the Underworld franchise, AI and physics have been cast as two immortal powers fighting for dominion over knowledge. AI enthusiasts claim that the laws of nature will simply fall out of sufficiently large data sets. Physicists counter that data without principle is merely glorified curve-fitting.

A recent experiment brought this tension into sharp relief. Researchers trained an AI model on the motions of the planets and found that it could predict their positions with exquisite precision. Yet when they looked inside the model, it had discovered no sign of Newton’s law of gravitation — no trace of the famous inverse-square relation that binds the solar system together. The machine had mastered the music of the spheres but not the score. It had memorized the universe, not understood it.
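For readers who want the score written out, the relation the model never recovered is Newton's law of universal gravitation, which says that the pull between two masses m1 and m2 separated by a distance r is

$$F = G\,\frac{m_1 m_2}{r^2},$$

with G the gravitational constant; double the distance and the force drops to a quarter.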

This distinction between reproducing a pattern and understanding its cause may sound philosophical, but it has real consequences. Nowhere is that clearer than in the difficult art of discovering new drugs.

Every effective drug is, at heart, a tiny piece of molecular architecture. Most are small organic molecules that perform their work by binding to a protein in the body, often one that is overactive or misshapen in disease. The drug’s role is to fit into a cavity in that protein, like a key slipping into a lock, and alter its function.

Finding such a key, however, is far from easy. A drug must not only fit snugly in its target but must also reach it, survive long enough to act, and leave the body without causing harm. These competing demands make drug discovery one of the most intricate intellectual endeavors humans have attempted. For centuries, we relied on accident and observation. Willow bark led us to aspirin; cinchona bark gave us quinine. Then, as chemistry, molecular biology, and computing matured in the latter half of the twentieth century, the process became more deliberate. Once we could see the structure of a protein – thanks to X-ray crystallography – we could begin to design molecules that might bind to it.

This gave rise to the practice of molecular docking: using computers to predict how a molecule might nestle into a protein's cavity, and how strongly the two might cling to each other. For decades, docking has been the workhorse of rational drug design: powerful, but computationally demanding and imperfect. Docking a list of billions of molecules – still a tiny fraction of all of chemical space – can easily cost tens of thousands of dollars.
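To make that concrete, here is a deliberately toy sketch in Python of the kind of arithmetic a docking score rests on: sum an interaction energy over every ligand-protein atom pair, rewarding comfortable contacts and punishing collisions. It is not the scoring function of any real docking program (real ones add electrostatics, hydrogen bonding, desolvation, and a search over millions of candidate poses, which is where the cost comes from); it is only the skeleton of the idea.

```python
import numpy as np

def lennard_jones(r, epsilon=0.2, sigma=3.4):
    """Toy 12-6 Lennard-Jones energy; epsilon and sigma are illustrative values."""
    return 4 * epsilon * ((sigma / r) ** 12 - (sigma / r) ** 6)

def toy_docking_score(ligand_xyz, pocket_xyz):
    """Sum a pairwise interaction energy over every ligand-pocket atom pair.

    ligand_xyz, pocket_xyz: (N, 3) and (M, 3) arrays of coordinates in angstroms.
    More negative means a snugger fit in this toy model; a large positive
    value means the molecule is colliding with the pocket walls.
    """
    diffs = ligand_xyz[:, None, :] - pocket_xyz[None, :, :]   # all ligand-pocket displacement vectors
    dists = np.linalg.norm(diffs, axis=-1)                    # all pairwise distances
    return lennard_jones(dists).sum()

# A made-up three-atom "ligand" sitting inside a made-up four-atom "pocket".
ligand = np.array([[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [0.0, 1.5, 0.0]])
pocket = np.array([[4.0, 0.0, 0.0], [0.0, 4.0, 0.0],
                   [0.0, 0.0, 4.0], [-4.0, 0.0, 0.0]])
print(f"toy interaction score: {toy_docking_score(ligand, pocket):.2f}")
```

Everything a physics-based engine does flows from evaluating terms like these, for every atom pair, over and over again; that is both its power and its expense.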

The promise of AI was irresistible: what if we could train models to perform docking thousands of times faster and at a fraction of the cost? If a machine could learn the patterns of molecular binding, perhaps it could screen millions of compounds in seconds, identifying the right ones with the perfect blend of properties that could turn them into drugs.

In the last two years, models like AlphaFold 3 and Boltz-2 have made exactly that claim. Yet researchers at the University of Basel decided to ask a deceptively simple question: what happens when you change the rules of the game just slightly? The study took a set of AI docking models and gave them a challenge: after a model predicted how a small molecule binds to a protein, the researchers subtly altered the protein's binding pocket, deleting or mutating just a few atoms, and asked the model to dock the molecule again. A perturbation of this kind is hardly academic: in nature, this is exactly what happens when a virus mutates or a cancer cell evolves drug resistance. The shape of the pocket shifts or enlarges, and the drug now rattles around. The old key no longer fits.

A physics-based model, which calculates real forces between atoms, should immediately register the change. It should find that the molecule now collides with the pocket walls or loses its grip. But the AI models blithely ignored these mutations. They kept placing the molecule exactly as before, as if the world had not changed. The reason is instructive. Those altered proteins weren’t in the training data, and AI can only see what it has seen before. To a pattern-matching algorithm, an unfamiliar shape looks like a familiar one. The models had learned to imitate the outcomes of docking, not to understand the physical laws that make those outcomes possible.

Just as the planetary AI model had done with astronomy, they were memorizing chemistry rather than thinking it.

And this problem is hardly unique to molecular modeling. It pervades every domain in which AI has made spectacular progress, from language models that string together words without grasping meaning, to image generators that reproduce style without substance. AI excels at interpolation – filling in gaps within the data it has already seen – but falters at extrapolation, at stepping into the unknown. It can play the next note perfectly, but it doesn’t know why the melody works.

Understanding, in contrast, is what allows generalization. It's what lets a physicist infer a universal law of gravity from planetary orbits, or a biologist infer evolution from the variation among finches on an island. Memorization can mimic knowledge, but it can never replace it.

A different group, led by researchers at Caltech and Berkeley, approached the problem from another direction. Instead of trying to replace physics with AI, they asked: what if we teach AI a little physics to begin with? Their model, named NucleusDiff, added a simple but powerful rule to the learning process: atoms cannot occupy the same space. In chemistry this limit is described by the so-called van der Waals radius – the “personal space” of atoms set by the repulsion of their electron clouds. As the previous paper showed, many machine-learning models happily violate this rule, producing physically absurd structures where atoms overlap or crowd too tightly.
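The rule is almost embarrassingly easy to state in code. The minimal check below, written with rounded textbook van der Waals radii (the numbers are illustrative, not taken from any particular force field), flags any pair of atoms that sit closer together than their combined radii allow:

```python
import numpy as np

# Rounded textbook van der Waals radii in angstroms (illustrative values only).
VDW_RADIUS = {"H": 1.2, "C": 1.7, "N": 1.55, "O": 1.52, "S": 1.8}

def steric_clashes(elements, coords, tolerance=0.4):
    """Flag atom pairs closer than the sum of their van der Waals radii.

    elements:  list of element symbols, e.g. ["C", "O", "N"]
    coords:    (N, 3) array of coordinates in angstroms
    tolerance: overlap (in angstroms) to forgive; a fuller check would also
               skip pairs of atoms that are covalently bonded to each other.
    """
    coords = np.asarray(coords)
    clashes = []
    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            dist = float(np.linalg.norm(coords[i] - coords[j]))
            allowed = VDW_RADIUS[elements[i]] + VDW_RADIUS[elements[j]] - tolerance
            if dist < allowed:
                clashes.append((i, j, round(dist, 2)))
    return clashes

# Two non-bonded carbons placed 1.0 angstrom apart: a physically absurd geometry.
print(steric_clashes(["C", "C", "O"],
                     [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [5.0, 0.0, 0.0]]))
```

A physics-based engine enforces this kind of constraint, and far subtler ones, at every step; the earlier study's point is that purely pattern-matching models will cheerfully emit geometries that fail even this test.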

The researchers designed a way to prevent that without drowning the model in calculations. Instead of checking every possible atom pair, they represented the molecular geometry on what mathematicians call a manifold – a smooth, multi-dimensional surface that encodes how atoms and electrons are distributed in space. By teaching the AI to stay within this manifold, they could enforce basic physical realism at high speed. The result was a hybrid system: an AI model guided by physical law. It produced far more accurate, physically consistent docking predictions. It even suggested a new small molecule capable of binding to a protein target. Equally telling was another finding: physics could not simply be bolted on after the fact. The constraints had to be part of the model’s internal reasoning from the start. The machine had to grow up with physics, not merely be corrected by it.
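NucleusDiff's actual machinery, the manifold construction, does not fit in a few lines, but the spirit of building physics into the learning rather than bolting it on afterwards can be sketched. In the hypothetical PyTorch fragment below, none of which is taken from the paper's code, a differentiable penalty for van der Waals overlap is simply added to the model's ordinary training loss (a placeholder mean-squared error here), so that every gradient step nudges the network away from impossible geometries:

```python
import torch

def overlap_penalty(coords, radii, tolerance=0.4):
    """Differentiable penalty for atoms closer than the sum of their vdW radii.

    coords: (N, 3) tensor of predicted atomic positions
    radii:  (N,)   tensor of van der Waals radii
    The penalty is zero for well-separated atoms and grows smoothly as they
    begin to overlap, so gradients can push the model toward legal geometries.
    """
    dists = torch.cdist(coords, coords)                      # (N, N) pairwise distances
    allowed = radii[:, None] + radii[None, :] - tolerance    # minimum permitted separation
    overlap = torch.relu(allowed - dists)                    # positive only where atoms intrude
    return (torch.triu(overlap, diagonal=1) ** 2).sum()      # upper triangle counts each pair once

def training_step(model, batch, optimizer, lam=1.0):
    """One hypothetical training step: ordinary task loss plus a physics term."""
    pred_coords = model(batch["inputs"])                     # placeholder forward pass
    task_loss = torch.nn.functional.mse_loss(pred_coords, batch["target_coords"])
    physics_loss = overlap_penalty(pred_coords, batch["vdw_radii"])
    loss = task_loss + lam * physics_loss                    # physics shapes every gradient step
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```

The detail worth noticing is where the penalty sits: inside the loss that is backpropagated, so the constraint shapes what the model learns rather than merely filtering what it produces afterwards. The naive all-pairs distance computation here is exactly the cost the manifold representation is designed to avoid; the sketch keeps only the essential idea that the physics lives inside training.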

The Basel and Caltech/Berkeley studies represent two ends of a spectrum: one shows what happens when AI forgets physics; the other shows what happens when it remembers. Together, they point toward a reconciliation rather than a rivalry.

Physics offers universality, a few principles that explain the many. AI offers scale, the ability to sift through more possibilities than the human mind ever could. Combine them, and you get something neither can achieve alone: speed allied with insight. The lesson here extends far beyond drug design. Whether in predicting weather, generating art, or writing code, AI's greatest challenge is not that it lacks data, but that it lacks understanding. It can weave endless patterns, but only within the fabric it already knows. What it cannot yet do – though it is slowly learning – is to lift its gaze from the tapestry to the laws that govern its weave.

Perhaps the war between AI and physics is a mirage. In Underworld, a vampire and a werewolf eventually fall in love and produce a hybrid, faster, stronger, and more resilient than either parent. Science, too, may be heading for such a synthesis: a hybrid intelligence that unites the reach of data with the discipline of law.