Thinking About “Outliers” in Social Science, The WSJ’s Laughable Laffer-Curve Edition

This graphic has been making the rounds all over the left side of the blogosphere. It is from the WSJ, which seems to have reached a new low in right-wing mendaciousness and hackery in the cause of supply-side tax cut arguments.

[Figure: the WSJ Laffer-curve graphic]

Now the most obvious response is that Norway is an outlier, i.e., an observation whose value lies abnormally far from the other observations. Mark Thoma over at Economist’s View offered this chart instead. It shows no Laffer-curve relationship, but rather a positive relationship between tax rates and tax revenues. If Norway were excluded, the relationship would be stronger, in the sense that the error around the fitted line would be smaller.

[Figure: Mark Thoma's chart of tax rates against tax revenues]

In the midst of this collective blogosphere howl at the WSJ, Kieran Healy, over at Crooked Timber, takes the opportunity to discuss how we should think about observations like Norway.

In discussion threads about this kind of thing, you’ll find people saying stuff like, “I want to see a line showing x, y, or z”, or “I want to know what happens when you …”, and very often they’ll add “excluding outliers like Norway from the analysis.” Now, it’s true that in this plot Norway is very unlike the other countries. It’s also true that if you run regressions with data like this and don’t look at any plots while you do it then you will probably be misled by your coefficients, because some observations (like Norway) may have too much leverage or influence in the calculations. In this sense it’s important to take “outliers” into consideration.

But when your data set consists of just 18 or 25 advanced industrial democracies and your goal is to assess the empirical support for some alleged economic law, then you should be careful about tossing around the concept of “outlier.” In an important sense, Norway isn’t an outlier at all. It’s a real country, with a government and an economy and everything. Clearly they are doing something up there in the fjords to push the observed value up to the top of the graph. Maybe you don’t know what that is, but you shouldn’t just label it an outlying case and throw it away, at least not without re-specifying the scope of your question.
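For readers who want to see what "leverage or influence" means in practice, here is a rough sketch in Python. The numbers are made up for illustration and are not the WSJ's or Thoma's data; the variable names (`tax_rate`, `revenue`, `outlier_idx`) and the helper `ols_summary` are hypothetical. It fits a simple least-squares line with and without one Norway-like point, then reports the slope, the residual error, each point's leverage (the diagonal of the hat matrix), and Cook's distance, one common measure of influence.

```python
import numpy as np

def ols_summary(x, y):
    """Fit y = a + b*x by least squares and return basic diagnostics."""
    X = np.column_stack([np.ones_like(x), x])      # design matrix with intercept
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # [intercept, slope]
    resid = y - X @ beta
    n, p = X.shape
    s2 = resid @ resid / (n - p)                   # residual variance estimate
    hat = X @ np.linalg.inv(X.T @ X) @ X.T         # hat (projection) matrix
    h = np.diag(hat)                               # leverage of each observation
    cooks = resid**2 / (p * s2) * h / (1 - h)**2   # Cook's distance
    return {"slope": beta[1], "rmse": np.sqrt(s2), "leverage": h, "cooks": cooks}

# Made-up illustrative data: twelve "ordinary" countries plus one
# Norway-like point with a middling tax rate and very high revenue.
tax_rate = np.array([12., 20., 24., 25., 28., 30., 30., 33., 35., 35., 38., 40., 28.])
revenue = np.array([3.4, 2.6, 3.1, 2.4, 3.3, 2.9, 3.6, 2.8, 3.5, 3.0, 3.7, 3.2, 10.0])
outlier_idx = len(tax_rate) - 1

full = ols_summary(tax_rate, revenue)

# Refit with the Norway-like point dropped.
keep = np.arange(len(tax_rate)) != outlier_idx
trimmed = ols_summary(tax_rate[keep], revenue[keep])

print("slope with / without:", full["slope"], trimmed["slope"])
print("residual error with / without:", full["rmse"], trimmed["rmse"])
print("leverage of the odd point:", full["leverage"][outlier_idx])
print("most influential point (Cook's distance):", int(np.argmax(full["cooks"])))
```

A diagnostic like this tells you which observation is moving the fit. It does not tell you whether dropping that observation is justified, which is Healy's point: the refit without the odd country answers a different, narrower question than the one you started with.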