Politics and Tufte's Lie Factor
I admit it, I'm a political junkie. I'm also a math guy who loves design. Politics gets emotional fast and people are quick to stretch whatever data they have to fit their small, partisan aims.
Pundits and partisans misuse statistics all the time, but I happened upon a real gem that perfectly illustrates Edward Tufte's "Lie Factor."
The Lie Factor
In 1983 Edward Tufte wrote a book called The Visual Display of Quantitative Information. In it he states the following principle:
The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the quantities represented.
The Lie Factor measures the extent to which a graph violates this principle. Mathematically it can be stated as follows:
Lie Factor = (size of effect shown in graphic) / (size of effect in data)
The lie factor should be between 0.95 and 1.05. If it is outside that range then either the graph creator didn't know what they were doing or they were intentionally trying to distort the facts.
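Here's a rough Python sketch of how the Lie Factor plays out for a two-bar chart whose axis doesn't start at zero. It treats the Lie Factor as the effect shown in the graphic divided by the effect in the data. The function names and the numbers are made up for illustration, and measuring each effect as the difference relative to the larger value is just one reasonable convention, not the only one.

```python
def relative_difference(a, b):
    """Difference between two positive values as a fraction of the larger one."""
    larger, smaller = max(a, b), min(a, b)
    return (larger - smaller) / larger


def bar_chart_lie_factor(a, b, baseline=0.0):
    """Lie Factor for a two-bar chart whose value axis starts at `baseline`.

    The drawn bar heights are the values minus the baseline, so the effect
    shown in the graphic is the relative difference of those heights.
    """
    if a == b:
        raise ValueError("values are equal, so there is no effect to measure")
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    shown_effect = relative_difference(a - baseline, b - baseline)
    data_effect = relative_difference(a, b)
    return shown_effect / data_effect


# Hypothetical values: 100 vs. 120, drawn once from zero and once from 90.
honest = bar_chart_lie_factor(100, 120, baseline=0)
truncated = bar_chart_lie_factor(100, 120, baseline=90)
print(f"axis from 0:  lie factor = {honest:.2f}")
print(f"axis from 90: lie factor = {truncated:.2f}")
```

With these made-up numbers the zero-based chart scores 1.0, while the chart that starts at 90 scores about 4: the absolute gap is identical, but the bars make it look four times bigger than it is.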
Update: I realized after experimenting with Excel that Jay's graph looks the way it does because that's the Excel default. Stupid on Excel's part, but it's still careless not to notice.
The Culprit
On Friday Jay Cost over at Real Clear Politics made a post entitled "A Review of Obama's Voting Coalition." It contained no commentary, only six graphs. Here's the fifth graph:
For those who don't know, the Democrats nominate their candidate based on the number of delegates. Most states allocate their delegates proportionally based on the popular vote in each congressional district. One side-effect of this is that a vote in a sparsely populated congressional district can be worth more delegates than one in a densely populated congressional district.
But looking at this graph I was taken aback. Is it really true that each vote received by Obama was worth three times as many delegates as a vote received by Clinton? Take a closer look, though: the "zero point" on the graph is not zero but 10,200. The absolute difference is the same but the relative difference is skewed. Lie factor!
To see why this matters look at the corrected graph.
As you can see the difference is much less stark.
The Effect
It looks like Clinton gets around 11,750 votes per delegate and Obama gets around 10,800. That's a gap of about 950, or around an 8.1% difference in the data relative to Clinton's figure.
The graph, however, starts its axis at 10,200, so the bars are drawn about 1,550 and 600 units tall, a 61.3% difference. That's a Lie Factor of around 7.6! Someone needs to review their Tufte.
The Echo Chamber
One reason I don't like the political blogosphere is that it's totally predictable. The same characters say the same things and act the same way, all the time. They may as well be giving out advance copies of their script.
So, of course, Jerome Armstrong, the creator of MyDD and a vociferous Clinton supporter, placed this graph on the front page of his site without a hint of irony or self-reflection. He didn't even bother to analyze the graph and see if it really said what he thought it did.
This is a great example of another phenomenon called confirmation bias, where people search out or skew information so that it conforms to their currently held beliefs. In this case, Jerome just blindly posted a highly misleading graph because it supported his thesis that Clinton should be the Democratic nominee.
It's a comedy of errors, to be sure, but at least we can learn what not to do if we don't want to make ourselves look clueless.
P.S., this website has a list of the graph examples that Tufte himself used to illustrate the Lie Factor principle. Check it out.