by Jonathan Halvorson
There has been a minor resurgence of interest in whether the social sciences live up to their billing as sciences. Economics in particular is undergoing well-deserved scrutiny for its ongoing failures of prediction and its inability to build consensus.
Is this just the classic scientific process of new theories replacing old, or does science itself have a credibility problem? There has always been one sort of scientific credibility problem, but it was easy to write off intellectually, if not politically: ideologues and fanatics threatened by the results of science become motivated deniers of the theories that threaten their Weltanschauungen. Today, that means mostly evolution, Big Bang cosmology, and global warming. But this new credibility problem, should we choose to accept it, undermines any discipline in which the truth is slippery and seems to change. Whether it’s because the underlying ground shifts beneath our feet, or because we cannot get a reliable footing on even stable ground, the value of the scientific process and its results is diminished. Put simply: you can’t trust what the researchers say, or even the consensus of the scientific community.
I confess that, for me, many sciences have had a credibility problem for a long time. I can’t read about the latest breakthrough result in the field of anthropology, medicine, nutrition or educational theory without thinking: how long before this, too, is contradicted by new research and turned into yesterday’s fad? How long before the expensive new pharmaceutical is shown to have been no better than aspirin, or a sugar pill? How long before the newly heralded educational technique racks up a string of failures and is written off as just another modest tool in the toolbox, or thrown out entirely?
And yet something is rotten. Lehrer’s examples of studies that were once widely taught but failed to replicate over the years all have something in common: they concern subjects that are not amenable to experiment under highly controlled conditions to determine constant relationships between a sharply limited set of variables. For now, let’s call them the sciences of complexity. For these, whether the subject is economics, medicine or ecology, statistical methods are used to support causal models that are not generalizable as causal laws. The complexity is never fully controlled in a way that allows wide generalization, the variables used in the model are incomplete and open to challenge, and “confirmation” of the result is done using a significance test rather than consensus around a numerical relationship that can be reliably replicated for prediction and control. It is these sciences of complexity that are struggling. Perhaps they are even dead, while their practitioners carry on, unable to acknowledge that they are animating a corpse. Are there sciences out there that live a zombie life, moving forward in a halting way, but emptied and barren?
What one can say with confidence is that, despite the methodological problems Lehrer cites and many others have documented, the defense most initiates are tempted to give is this: the methods are not failing; it is people who are failing to follow the methods appropriately.
For example, there is a strong publication bias. Using a standard significance test (say, a t-test at the conventional p < 0.05 threshold), about 1 out of every 20 studies of an effect that does not exist will be judged significant (and thus worthy of publication) simply by chance. Journals are much more interested in publishing positive results than negative ones, so they are biased toward accepting the one study showing that A has an effect on B and rejecting the many other studies that don’t identify any impact (until the A-B theory becomes well accepted, at which point journals have a bias in the opposite direction, toward publishing a newsworthy study rejecting the relationship between A and B).
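To see how easily this happens, here is a rough sketch in Python (my illustration only; the sample size of 30 subjects per group is invented): simulate thousands of studies of an effect that does not exist and count how many clear the conventional p < 0.05 bar.

```python
# Simulate many studies of a nonexistent effect and count how often a
# standard t-test at p < 0.05 declares the result "significant" anyway.
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)
n_studies, n_subjects = 10_000, 30
false_positives = 0

for _ in range(n_studies):
    # Treatment and control come from the same distribution: no real effect.
    treatment = rng.normal(0.0, 1.0, n_subjects)
    control = rng.normal(0.0, 1.0, n_subjects)
    _, p_value = ttest_ind(treatment, control)
    if p_value < 0.05:
        false_positives += 1

# Roughly 5% of the null studies come out "significant" -- about 1 in 20.
print(f"False-positive rate: {false_positives / n_studies:.3f}")
```

If journals publish mainly the studies in that last bucket, the literature fills with effects that were never there.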
There is also confirmation bias. A research team may conduct a study 10 times without confirming the hypothesis they’re testing, rejecting those attempts on one technical ground or another, but then, when the 11th study confirms the hypothesis, submit it for publication. In the pharmaceutical industry and other areas where big dollars are at stake, the biases are even more pronounced and have become notorious.
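The arithmetic behind that scenario is worth making explicit. Assuming the 11 attempts are independent tests of an effect that is not there, each with the usual 5% false-positive rate, the chance that at least one of them comes up “significant” is already over 40%:

```python
# Chance that at least one of 11 independent tests of a nonexistent effect
# crosses the p < 0.05 threshold: 1 - 0.95**11.
alpha, attempts = 0.05, 11
print(round(1 - (1 - alpha) ** attempts, 2))   # ~0.43
```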
There are many other known biases and limitations of statistical methods in identifying causal relationships. Despite it all, the typical response among social scientists and science pundits is a rousing “what’s the big deal?” Science is fine; it’s the institutions that may get sick. The solution to issues like publication bias, confirmation bias or the identification problem is simply to be more vigilant.
So, end of story: the honor of scientific methods across the sciences saved? Not quite. What no one in the several recent exchanges over the status of science has seriously addressed is the problem of variability. Even Lehrer seems to believe that the causal relationships are stable; there are just so many of them, and the subjects of study are heterogeneous in so many ways, that stable universal generalizations are hard to discover even once the cognitive biases are overcome. But dig down far enough and there is bedrock.
When it comes to natural sciences of complexity, including parts of ecology and medicine, this may be right and I don’t have anything to add. But the social sciences are another matter. There is excellent reason to believe that we haven’t found any fixed quantitative generalizations for social phenomena because they don’t exist, and they don’t exist because people change in response to interpersonal conditions, including the condition of acquiring new beliefs about causal relationships and statistical regularities, as well as the condition of having others exploit regular patterns to their advantage. There is no identified social model (fixed in terms of the numerical values and measurement procedures) that is impervious to human improvisation. Variability goes all the way down in the social sciences. There is no bedrock, only loose soil.
I discussed some of the reasoning behind this last time, and how it ties into Rational Expectations theory in economics. In short, to the extent that we are rational, we are not invariably set on the rails of any measured social regularity. But if variability is at work and not complexity (alone), then it isn’t just the measured correlations between variables in real world systems that shift, but the causal influence of the variables themselves. Strictly speaking, it is impossible to resolve the identification problem and fix the causal relationships in a model.
If you are skeptical or confused, and have some background in social science, conduct the following thought experiment: imagine any model of human behavior with a finite set of variables and coefficients that is supposed to determine an effect. Something of the form E = x1V1 + x2V2 + x3V3. Now imagine that even when people become aware of it, they cannot change it even if they want to, not just for the moment, but as a matter of law, forever. Try to write down what the variables could be, what the coefficients could look like, and how, even in principle, this could be known. I would submit that an example is only conceivable if it also implies that people are not free to act in their considered best interest. We would be like a more tormented version of Dennett’s digger wasps, or ants caught in a death spiral.
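To make the difficulty vivid rather than prove it, here is a toy two-variable version of that model in Python (the coefficients and the size of the agents’ reaction are invented): fit the “law,” let agents who have learned its coefficients adjust their behavior against it, and fit it again.

```python
# Toy illustration: a fitted "law" E = x1*V1 + x2*V2 stops holding once
# agents learn it and change their behavior in response.
import numpy as np

rng = np.random.default_rng(1)

def behave(V, responsiveness):
    # The relation E = 2*V1 - 1*V2, plus an adjustment that grows as agents
    # learn the published coefficients and act against them.
    x1, x2 = 2.0, -1.0
    return V @ np.array([x1, x2]) - responsiveness * V[:, 0] + rng.normal(0, 0.1, len(V))

def fit(V, E):
    # Ordinary least squares estimate of the coefficients.
    coef, *_ = np.linalg.lstsq(V, E, rcond=None)
    return coef

V = rng.normal(size=(500, 2))

before = fit(V, behave(V, responsiveness=0.0))  # before the model is known
after = fit(V, behave(V, responsiveness=1.5))   # after agents react to it

print("fitted before publication:", before.round(2))  # close to [ 2.0, -1.0 ]
print("fitted after agents adapt:", after.round(2))   # drifts toward [ 0.5, -1.0 ]
```

The point is not the particular numbers but that no amount of re-estimation restores a coefficient that people remain free to act against.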
There is, obviously, a philosophical corollary to this stance toward social science: if true, it means that we are free in a compatibilist sense when it comes to the measured generalizations of social science. But to say that we aren’t bound by any unchanging algorithm using the variables that an economist, psychologist or sociologist would deploy doesn’t mean that we can’t be predicted. One large-scale approach to replicable prediction, contrary to what seems to be the goal in much social science research, is to use continually modified and self-correcting algorithms in models that rely on massive computational power, rather than seek the refined elegance of a small number of variables (as one might find in physics). You already come across these methods when you type a search term or buy a book online and the helpful computer tries to find what you are really looking for, or suggests other products you might like.
The future of social prediction and control may lie in the methods used by Google, Amazon and Netflix. They do not predict well because they have discovered some invariant causal relationship, but because the algorithms are continually learning and updating from past predictions. As social circumstances change, the algorithms change, and there is no final resting place. And as this type of prediction is used more and more for commercial, political and other purposes, people will become aware of their inevitable exploitation and, no doubt incompletely and in messy ways, adopt counter-strategies to avoid it. What the best strategies are for doing so (aside from turning the machines off), and how successful they can be as computational power increases, are questions that mark the beginning of a whole new field of research waiting to be explored.
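As a rough sketch of what “continually learning and updating” means in code (the drift schedule and learning rate are invented for illustration), here is a predictor that corrects its coefficients after every observation and so never settles, because the relationship it is tracking keeps moving:

```python
# An online least-mean-squares predictor: nudge the coefficients after each
# observation, so the model follows a drifting relationship instead of
# freezing into a fixed "law."
import numpy as np

rng = np.random.default_rng(2)
weights = np.zeros(3)        # the model's current guess at the coefficients
learning_rate = 0.05

for t in range(5_000):
    x = rng.normal(size=3)
    # The "true" relationship drifts as circumstances (and people) change.
    true_weights = np.array([1.0, -0.5, 0.25]) * (1 + 0.0002 * t)
    y = true_weights @ x + rng.normal(0, 0.1)

    prediction = weights @ x
    error = prediction - y
    weights -= learning_rate * error * x   # correct the model from its latest mistake

print("coefficients after 5,000 updates:", weights.round(2))
```

There is no final set of weights to report, only whatever the model believes at the moment; that, rather than some discovered constant of social nature, is what does the predicting.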