On Friday, the Financial Times published allegations by its economics editor Chris Giles that Thomas Piketty's wealth inequality data in his heralded Capital in the Twenty-First Century gives a suspiciously skewed impression of trends and cross-national rankings. I will confess that I clicked on the link full of schadenfreude; I believe that Piketty's book is irresponsibly speculative, that his inequality estimates sometimes give the wrong impression, and that his policy preferences would prove harmful to the middle class and poor in the long run.
However, I also believe that few researchers who achieve Piketty's prominence fake their data, and I have deep respect for how readily Piketty and his colleagues have made vast quantities of data available online for anyone to see. For that matter, having read the book, I know that it cannot be reduced to the charts Giles criticized. And as someone who criticizes research and who is criticized, I have an interest in promoting the fair adjudication of research controversies.
My initial assessment from Friday is mostly unchanged. The Financial Times blew the data issues it identified out of proportion. Giles discovered a couple of clear errors and a number of adjustments that look questionable but have barely any impact on Piketty's charts. Much of his critique could have been consigned to a footnote to the effect that he uncovered other mistakes and questionable choices that do not actually change Piketty's results. Giles's post is written in a way that makes you think the alleged problems with Piketty's data are more numerous than they are. And he's made some errors himself along the way.
Only a couple of issues Giles highlighted, for the United Kingdom in 2010 and the United States in 1970 and 1980, appear to matter, but in the worst case for Piketty, they would make the originally unimpressive trends look less ambiguously benign. I find it hard to believe that Piketty intentionally massaged his data to get the results he wanted, based on my familiarity with his previous work, on the relatively small impacts Giles's issues have on the results, and on the fact that Giles was able to discover the issues in question because Piketty put massive amounts of data online.
Above all, even if Giles is right, the issues would have minimal consequences for Capital's thesis. That thesis does not hinge at all on whether wealth inequality grew between 1980 and 2010 or whether Europe has higher wealth inequality than the U.S. And if it did, his original results should have been viewed as offering little support for those claims.
Let's take some of the basic questions one by one, over the course of a few “explainer” columns. (And, yes, I'm still doing a follow-up column about middle class income growth in the U.S.)
1. How much of Giles's critique is substantively important?
Giles claims that his analysis of Piketty's alleged data problems shows, “why these problems matter for each one of the four countries prof [sic] Piketty studies – France, Sweden, UK and the US.” That's not true. Let's take France and Sweden off the table immediately. Here are the charts Giles created (from his spreadsheet, here) showing Piketty's published results against what Giles says he should have shown (the red lines):
Giles just didn't have to say anything about either of these countries. Anything Piketty did wrong (and it doesn't seem to me that he did much wrong at all) had zero substantive impact on the trend for these two countries. In fact, many of Giles's specific issues are mostly unproblematic even for the U.S. and U.K.
Giles makes a big deal of the fact that, in computing wealth concentration estimates for Europe, Piketty doesn't weight Sweden, France, and the U.K. by their population sizes before averaging them. See, now I would have criticized Piketty for calling any average of three countries a valid estimate for “Europe,” but Giles doesn't go there. In fact, it doesn't make any difference whether one weights by population size or not. Here is Giles's money chart, except I've added lines using Giles's preferred estimates but taking simple averages of the three countries (and removed his “alternative” data point for 2010 which is based on incomparable U.K. data—more on that below).
His and my lines differ from Piketty's in showing somewhat lower wealth inequality in recent decades because Giles's preferred estimates differ from Piketty's. (And it's only the U.K. estimates that really matter, and only after 1980, and only for the top ten percent.) The averaging issue has nothing to do with it.
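For readers who want to see the mechanics of the averaging question, here is a minimal sketch comparing a simple average of three countries with a population-weighted one. The wealth shares and populations are made-up round numbers chosen for illustration, not values from Piketty's or Giles's spreadsheets, and the function names are my own.

```python
# Illustrative only: the shares and populations below are made-up round
# numbers, not values from Piketty's or Giles's spreadsheets.

def simple_average(shares):
    """Unweighted mean across countries."""
    return sum(shares.values()) / len(shares)

def weighted_average(shares, weights):
    """Mean across countries, weighting each country by weights[country]."""
    total_weight = sum(weights[country] for country in shares)
    return sum(shares[c] * weights[c] for c in shares) / total_weight

# Hypothetical top ten percent wealth shares, in percent.
top10_share = {"France": 62.0, "Sweden": 59.0, "UK": 64.0}
# Hypothetical populations, in millions.
population_millions = {"France": 65.0, "Sweden": 9.5, "UK": 63.0}

simple = simple_average(top10_share)
weighted = weighted_average(top10_share, population_millions)
gap = weighted - simple
print(f"simple: {simple:.1f}  weighted: {weighted:.1f}  gap: {gap:+.1f}")
```

Whatever the weights, a weighted average can only land somewhere between the lowest and highest country values, so when the three countries' shares are close together the choice of weights has little room to move the result.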
Giles cites an example using the Swedish data of an apparent transcription error by Piketty where a value for one year was mistakenly entered into a spreadsheet as the value for a different year. It's the one example of a transcription error he cites. You can see the inconsequential impact it has by looking at Piketty's 1920 data points in the Sweden chart above.
Giles spends a fair amount of space discussing “tweaks” Piketty made, which he views as uniformly suspect. As others have pointed out, when doing long-run trend analysis using multiple data sources, making adjustments to data is often unavoidable for purposes of estimating the trend validly. You might end up with a second-best estimate for this or that data point, but the trend itself might be more accurately depicted, which is the point. I and others have actually criticized Piketty for not making such adjustments in the specific case of U.S. income concentration trends before and after 1986, when major tax reforms altered the amount of ordinary income reported on individual tax returns.
The important thing is to document the adjustments and their rationales as clearly as possible (and, to be sure, if the assumptions required aren't sufficiently defensible, adjustments shouldn't be made). It's fair of Giles to criticize Piketty for insufficient documentation, but he seems overly sure that all of the “tweaks” are illegitimate, particularly given that he does not know these data sources nearly as well as Piketty (true of nearly everyone on the planet). At any rate, he leads this discussion with the French figures from 1810 to 1960. I would encourage you to look back up at that French chart to see whether it is worth worrying about Piketty's “tweaks” in this case or whether he's up to funny business. The third of his three examples relates to the British data from 1810 to 1870, but as you can see in the British chart further below, whether Piketty did something wrong or not makes no difference to the trend over those 60 years. I'll come back to his second example—the U.S. 1970 estimate—in my next post.
Another category of issues Giles considers is “constructed data,” where the provenance of Piketty's estimates for a number of years is impossible to determine. Here Giles is again right to ding Piketty for insufficient documentation, but with the exception of the U.S. top one percent share for 1970, it makes little difference whether Piketty has “constructed” good or bad estimates. I wish Piketty had clearly noted that his 1910-1950 trend for the top ten percent of Americans simply adds a constant to the top one percent estimates, but it doesn't change the conclusion about how 1870 compares to 1960 or about how 1960 compares to 2010. (And as I'll show, his estimates line up with another series quite well.)
Giles also has a point when he says that Piketty labels some estimates as being from a year ending in “0” when they actually come from nearby years. These instances should have been clearly documented. But in general they don't matter substantively. They don't for any of the three specific examples Giles cites. Assume the worst about Piketty and the evidence here fails to support your mistrust (except perhaps for the U.S. top one percent share estimate in “1980,” which I'll address in the next post).
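To make concrete what “adds a constant” implies, here is a small sketch with hypothetical numbers (neither the shares nor the offset come from Piketty's files). A top ten percent series built this way inherits the decade-to-decade changes of the top one percent series exactly, which is why it carries no independent information about the trend.

```python
# Hypothetical top one percent wealth shares, in percent. These are
# placeholder values, not Piketty's estimates.
top1_share = {1910: 45, 1920: 40, 1930: 38, 1940: 34, 1950: 30}

# Hypothetical constant gap between the top ten and top one percent shares.
OFFSET = 36

# "Constructed" top ten percent series: top one percent plus a constant.
top10_share = {year: share + OFFSET for year, share in top1_share.items()}

def decade_changes(series):
    """Change between each pair of consecutive years in the series."""
    years = sorted(series)
    return [series[later] - series[earlier]
            for earlier, later in zip(years, years[1:])]

print("top 1% changes: ", decade_changes(top1_share))
print("top 10% changes:", decade_changes(top10_share))
```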
Finally, Giles cites a few departures from Piketty's stated reliance on estate tax data and charges him with “cherry-picking data sources” when he does not rely on the tax data. Actually, he states these as two different sets of problems and suggests that they are more general than the examples he gives for the U.S. and U.K. They aren't, and Giles has his own problems here.
2. Which of Giles's specific claims are important for the U.K. estimates?
Here's a modified version of Giles's chart, where I've just changed the colors and markers to make the different data sources clearer:
The first thing to note is that if you use the estimates marked “Lindert” and “Atkinson,” Piketty's series through 1960 are just fine. Giles makes another mistake in his post when he says that rather than a “constructed” estimate for the top ten percent share in 1870 of 87.1 percent, Piketty could have used the published figure of 76.7 percent. Giles pulled that number from the wrong table, as is apparent in his own spreadsheet. The right table shows a published estimate of 83.8 percent—considerably closer to Piketty's. The most important data issue before 1970 goes unmentioned by Giles. The 1810 and 1870 estimates probably understate wealth inequality because, unlike the later years, they are for households rather than for individuals (see Table 4).
Giles makes a big deal out of the top one percent point for 1970 being higher if you use the “Atkinson” or “ATK” estimates. This is one of those times when Piketty has done himself no favors by failing to adequately document what he's done. But I think I know what he's done. Because Piketty wants to produce a consistent long-term series, and because the nineteenth century estimates are for England and Wales, he sticks with estimates for England and Wales rather than the full United Kingdom as long as he can (through 1980). Piketty uses the 1974 estimate for England and Wales as the “1970” estimate (see Table 1).
Why the 1974 estimate? It appears that the estimates through 1972 are not directly comparable to those from 1974 forward. There is a 9.1 point drop in the “Atkinson” estimates for the top one percent between 1972 and 1974. This discontinuity is suggested by the horizontal bar in Table 4.A2 (p. 151) here between 1973 and 1974. Unlike Giles, I gather, I am not a U.K. citizen, but it appears that inheritance tax policy changed in early 1972 so that for the first time, part of an estate could be passed on tax-free to a surviving spouse. In 1975, the inheritance tax was apparently replaced by a broader gift tax that exempted gifts to spouses.
I suspect Piketty had a similar rationale for using the 1981 England/Wales estimate for “1980.” Piketty hasn't done anything nefarious here, even though he should have included more detail about his choices. His decision to use 1974 for “1970” has the effect of overstating the 1960-70 decline in wealth inequality. Together with the choice of 1981 for “1980,” it also has the effect of understating the measured 1970 to 1980 decline. But that measured decline is partly an artifact of the tax law change. If you don't like the (meaningless) 0.1 point increase Piketty shows in the top one percent's share of wealth from “1970” to “1980,” think of it as a tiny (meaningless) increase from 1974 to 1981. By the way, using the 1981 instead of the 1980 value also understates the increase in wealth inequality Piketty would show for 1980-1990, assuming that the 1990 estimate is right.
At any rate, all of the 1980 estimates in the chart line up for the top one percent share. Meanwhile, for the top ten percent estimates, the “ATK” estimate lines up with Piketty's in 1970 while the “Atkinson” line is “too high,” but by 1980 the “ATK” estimate is “too low” and the “Atkinson” estimate lines up with Piketty. In part, what you are looking at here is data imprecision. In part, though, the issue is that the “ATK” estimates are better than the “Atkinson” estimates by 1980—but no longer comparable to the Piketty and “Atkinson” estimates (see the discussion here).
Things get murkier after 1980. Those red lines come from what appears to be the only data source on wealth inequality for years between 1982 and 2001, the Inland Revenue Statistics series from the U.K. government (“IRS” below, not the U.S. tax agency). They indicate less wealth inequality than Piketty's lines, especially for the top ten percent. But they also indicate less wealth inequality than the “ATK” estimates that are themselves incomparable by 1980 to the “Atkinson” estimates. To build a consistent series over time, the red line needs to be adjusted upward, but by how much? I spent a fair amount of time trying to figure out why Piketty did what he did without success.
Note, though, that the 1980-2000 trends for both the top one percent and for the top ten percent point in the direction of small increases in wealth inequality regardless of which data source is used. Piketty's one percent trend goes from 23 to 27 percent while the IRS data shows 19 to 23 percent—both four-point rises. His ten percent trend goes from 62-63 percent to 68-69 percent while the IRS indicates it rose from 50 to 56 percent—both six-point increases. Wealth inequality was higher in 2000 than in 1980.
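The arithmetic behind that comparison is simple enough to check directly. The sketch below uses the figures quoted in the paragraph above; the one assumption is that I take the midpoints of the “62-63” and “68-69” ranges (62.5 and 68.5) for Piketty's top ten percent series.

```python
# (1980 value, 2000 value) in percent, as quoted in the text.
# Assumption: Piketty's top ten percent ranges are replaced by their midpoints.
shares_1980_2000 = {
    "top 1%": {"Piketty": (23.0, 27.0), "Inland Revenue": (19.0, 23.0)},
    "top 10%": {"Piketty": (62.5, 68.5), "Inland Revenue": (50.0, 56.0)},
}

def point_change(start, end):
    """Change in percentage points between two share estimates."""
    return end - start

changes = {
    group: {source: point_change(*values) for source, values in sources.items()}
    for group, sources in shares_1980_2000.items()
}

for group, by_source in changes.items():
    for source, change in by_source.items():
        print(f"{group:8s} {source:15s} {change:+.1f} points")
```

The two sources disagree about the level of wealth concentration in every year, but the 1980 to 2000 change they imply is the same.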
What happened between 2000 and 2010? Damned if I know, and I doubt Piketty does either. Giles definitely doesn't. The source of Piketty's top ten percent share estimate looks like constructed figures deriving from 2009 data (see the short purple line in the chart). In Giles's spreadsheet, he shows pretty clearly that this source is not comparable to the earlier red-line series from IRS. It finds the top ten percent owned 72 percent of wealth from 2001 to 2003, compared with 54 percent in the IRS series for 2002. But then it's completely unclear whether it's comparable to the “Atkinson” series either, as Piketty seems to assume it is.
What I do know is that Giles's assertion that the Office for National Statistics Wealth and Assets Survey is more consistent with the earlier series is bunk. With the exception of this source, all of the other statistics rely on methods based on estate/inheritance tax data. The Wealth and Assets Survey collects information from household interviews. While it may produce more accurate estimates for any given year than the other sources, it almost surely produces an invalid downward trend if connected to the earlier sources. The way to think about this is to realize that if the survey had been conducted continuously back to 1980, the wealth inequality levels would most likely be much lower than the Piketty line in every year, but the trend need not be any different. Therefore no one should look at Giles's chart and take that decline seriously. This is Giles's most important error.
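Here is a toy version of that thought experiment. Every number is hypothetical, including the assumption that the survey reads a constant 20 points below the tax-based method in every year; the point is only to show how splicing a lower-level source onto the end of a higher-level series manufactures a decline that neither source shows on its own.

```python
# Hypothetical tax-based top ten percent shares, in percent.
tax_based = {1980: 62.0, 1990: 65.0, 2000: 68.0, 2010: 70.0}

# Assumption for illustration: the survey reads a constant 20 points lower.
SURVEY_OFFSET = -20.0
survey_based = {year: share + SURVEY_OFFSET for year, share in tax_based.items()}

def change(series, start, end):
    """Percentage-point change in a series between two years."""
    return series[end] - series[start]

# Splice: tax-based estimates through 2000, then the survey value for 2010.
spliced = {year: tax_based[year] for year in (1980, 1990, 2000)}
spliced[2010] = survey_based[2010]

print("tax-based 2000-2010:", change(tax_based, 2000, 2010))
print("survey    2000-2010:", change(survey_based, 2000, 2010))
print("spliced   2000-2010:", change(spliced, 2000, 2010))
```

In this setup both sources rise by the same amount between 2000 and 2010, while the spliced series drops by roughly the size of the level gap between the two methods.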
The 2010 data point is also Piketty's most important potential problem. I'm pretty sure it should have just been left off of the chart. At any rate, it looks to me like Piketty is on solid ground saying that wealth concentration in the U.K. rose by about five percentage points from 1980 to 2000.
So lots of smoke, but I don't see a fire here so far. The next explainer will look at Giles's criticisms of the U.S. data and the question of whether wealth inequality has risen in the United States.
Original Source: http://www.forbes.com/sites/scottwinship/2014/05/27/laffaire-piketty/