Sunday, May 25, 2014

What Is It With Economists and Spreadsheets?

The hot book in economics this year is Thomas Piketty's Capital in the Twenty-First Century.  The book addresses inequality, asserting that inappropriate levels of it are a natural outgrowth of capitalism.  It makes sweeping policy proposals, such as a global tax on wealth and much higher income tax rates on the upper income brackets.  It has been favorably reviewed by liberal economists like Paul Krugman, and criticized by conservative economists.  Recently we learned one more thing about it, as reported by the Financial Times: calculations critical to the argument relied on error-plagued spreadsheets.

Last year we went through the episode of the Reinhart-Rogoff paper that asserted that national public debts in excess of 90% of GDP killed economic growth.  Conservative politicians jumped on the paper as a justification for proposing drastic changes in US public policy.  As it turned out, critical calculations were done by spreadsheet, the spreadsheet had errors, and when the errors were corrected the "cliff" in economic growth at 90% disappeared.  The damage had already been done, though.  The 90% debt-to-GDP cliff was quoted in a variety of government reports, and those secondary sources continue to be cited in policy debates today.

What is it with economists and spreadsheets?  Spreadsheet software is a programming system.  There is a large literature on the frequency and nature of spreadsheet errors (the European Spreadsheet Risks Interest Group's annual conference on the subject will be held in July this year).  Even using best practices, complex spreadsheets contain errors at a rate that would be completely unacceptable in any other programming environment.  Nor has any evidence been offered that the economists mentioned above followed those best practices.  For example, I have yet to read that Reinhart and Rogoff conducted formal code reviews.
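To make the "programming system" point concrete, here is a minimal sketch in Python of the kind of check that a script makes easy and a spreadsheet formula does not.  The country labels, the growth figures, and the mean_growth helper are all invented for illustration; none of them come from either work discussed above.  The idea is simply that the code states how many rows it expects and refuses to produce an average if some are missing, whereas a formula whose range stops a few rows short returns a plausible-looking number without complaint.

```python
# Hypothetical growth figures, invented purely for illustration.
growth_by_country = {
    "A": 2.0,
    "B": 1.0,
    "C": 3.0,
    "D": -1.0,
    "E": 2.5,
}


def mean_growth(rows, expected_count):
    """Average the growth figures, refusing to run if rows are missing."""
    if len(rows) != expected_count:
        raise ValueError(f"expected {expected_count} rows, got {len(rows)}")
    return sum(rows.values()) / len(rows)


# With all five rows present, the average is computed normally.
full = mean_growth(growth_by_country, expected_count=5)

# Simulate a range that stops two rows short of the data.
truncated = dict(list(growth_by_country.items())[:3])
try:
    mean_growth(truncated, expected_count=5)
    caught = False
except ValueError:
    caught = True  # the omission is reported instead of silently averaged
```

A check like this does not make the analysis correct, but it turns one common class of silent mistake into a loud one, and it leaves a record that a reviewer can read.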

One part of this is particularly puzzling to me.  Some years back I spent two semesters in a PhD economics program.  The econometrics classes used R and Gauss.  There was never even a hint that Excel was an acceptable method for doing research calculations.  Certainly none of the graduate students I met who were working on their dissertations were using a spreadsheet to do the analysis.  So to find highly respected academic economists using spreadsheets is surprising.  They have to know that if the spreadsheet is even moderately complex, errors are creeping in.  Quite possibly embarrassing errors [1].

For some decades, economists have been accused of suffering from "physics envy."  That is, they want their field to be considered a hard science like physics, not one of the so-called soft sciences.  I'll offer economists a piece of free advice on that: hard sciences don't do data analysis with spreadsheets.  Clean up your act.  Require authors to certify that they use real tools for numerical work, reject out of hand papers that don't, and harshly punish people who lie about it.

[1]  Using better tools is no guarantee that errors won't creep in.  But they offer a better chance of catching them.