
Common Fallacies That Cause People to Doubt Statistics October 23, 2012

Posted by Tim Rodgers in baseball, Process engineering, Quality.

Lately I’ve been reviewing some old textbooks and work files as part of my preparation for the ASQ Six Sigma Black Belt certification exam in March. It’s interesting, and often amusing, to contrast the principles of inferential statistics and probability theory with the ways they’re used in the real world. I think people tend to underestimate how easy it is to misuse statistical methods, or at least apply them incorrectly. Seeing those misuses can lead them to undervalue all statistical analysis, regardless of whether the methods were applied correctly.

I see this in baseball and political commentary all the time, particularly in the way people selectively or incorrectly use numbers to defend their point of view, while at the same time mocking those people who use numbers (correctly or not) to defend a different point of view.

Here are a few of the more common mistakes that I’ve seen in the workplace:

1. Conclusions based on small sample sizes or selective sampling. Yes, we often have to make do with less data than we’d like, but that makes it especially important to put confidence intervals around our conclusions and stay open-minded about the possibility of a completely different version of reality. Also, a sample is supposed to represent the larger population, and we have to beware of sampling bias that excludes relevant members of the population and skews any findings based on that sample. Otherwise the findings are meaningful only for a subset of the population.
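To make the point about confidence intervals concrete, here is a minimal sketch in Python. It assumes the quantity of interest is a simple pass/fail proportion and uses the Wilson score interval at roughly 95% confidence (z = 1.96). The function name and the sample counts are invented for illustration.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Approximate Wilson score interval for a proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width

# Same observed pass rate (80%), very different sample sizes (hypothetical counts).
small_lo, small_hi = wilson_interval(8, 10)
large_lo, large_hi = wilson_interval(800, 1000)

print(f"n=10:   {small_lo:.2f} to {small_hi:.2f}")
print(f"n=1000: {large_lo:.2f} to {large_hi:.2f}")
```

Both samples show the same 80% rate, but the interval for the 10-unit sample is far wider, which is exactly the "completely different version of reality" that a small sample cannot rule out. Note that no interval corrects for sampling bias; it only describes random sampling error.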

2. Unknown or uncontrolled measurement variability. We often assume that our measurement processes are completely trustworthy without considering the possible effects of variability due to equipment or people. If the variance of the measurement process exceeds the variance of the underlying processes that we’re trying to measure, we can’t possibly know what’s really going on.
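A small simulation can show how measurement noise swamps the process signal. This is only a sketch: it assumes the measurement error is independent of the true value and that both are normally distributed, and the standard deviations (1.0 for the process, 2.0 for the gauge) are made-up numbers.

```python
import random
import statistics

def simulate_observed(process_sd, measurement_sd, n=10_000, seed=1):
    """Return (true_values, observed_values) for a process read by a noisy gauge."""
    rng = random.Random(seed)
    true_values = [rng.gauss(100.0, process_sd) for _ in range(n)]
    # Each reading is the true value plus independent measurement error.
    observed = [t + rng.gauss(0.0, measurement_sd) for t in true_values]
    return true_values, observed

true_values, observed = simulate_observed(process_sd=1.0, measurement_sd=2.0)

var_true = statistics.variance(true_values)
var_obs = statistics.variance(observed)

# Fraction of the observed variance that comes from the measurement system.
measurement_share = 1 - var_true / var_obs
print(f"Share of observed variance due to measurement: {measurement_share:.0%}")
```

With these assumed values, most of what the data shows is the gauge, not the process. In practice the true values are not known, which is why a gauge repeatability and reproducibility study is needed to estimate the measurement variance separately.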

3. Confusing independent and dependent events. There is no such thing as “the law of averages.” If you flip a fair coin 10 times and it comes up heads every time, the probability of heads on the 11th flip is still 50%. The previous flips exert no influence whatsoever on future outcomes, because each flip is an independent event. That being said, the combined event “eleven consecutive heads” is extremely unlikely when judged before any coins are flipped: (1/2)^11, or about 1 in 2,048. If you take a large enough sample, the sample statistics will approximate the population statistics (50% heads and 50% tails for an honest coin). This is the law of large numbers, which is sometimes loosely confused with “regression to the mean.”
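Both halves of that argument can be checked with a short sketch. The exact probability of eleven straight heads is computed directly, and a simulation estimates the chance of heads on the flip that follows a streak. The simulation assumes a fair coin and uses a streak of 5 rather than 10 only so that enough streaks occur in a modest number of flips; the function name is hypothetical.

```python
import random
from fractions import Fraction

# Probability of eleven consecutive heads, judged before any flips are made.
p_eleven_heads = Fraction(1, 2) ** 11
print(p_eleven_heads)  # 1/2048

def next_flip_after_streak(streak=5, trials=200_000, seed=0):
    """Estimate P(heads) on a flip that follows at least `streak` heads in a row."""
    rng = random.Random(seed)
    run = 0            # current run of consecutive heads
    followups = 0      # flips that came right after a qualifying streak
    heads_after = 0    # how many of those flips were heads
    for _ in range(trials):
        flip_is_heads = rng.random() < 0.5
        if run >= streak:
            followups += 1
            heads_after += flip_is_heads
        run = run + 1 if flip_is_heads else 0
    return heads_after / followups

estimate = next_flip_after_streak()
print(f"P(heads after a streak) is roughly {estimate:.3f}")
```

The estimate stays near 0.5 no matter how long the preceding streak was, while the probability of the whole streak shrinks by half with every added flip.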

4. Seeing a trend where none exists. This is usually the result of confirmation bias, where we start with a conclusion and look for data to support it. It sometimes leads to selection bias, where we exclude data that doesn’t fit the expected behavior. Often we’re so eager for signs of improvement that we accept as proof a single data point that’s in the right direction. This is why it’s important to apply hypothesis tests to determine whether the difference between the before and after samples is statistically significant. It’s also why we should never fiddle with a process that varies randomly but operates within control limits.
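One way to run such a before-and-after comparison is a permutation test, sketched below. This is one of several reasonable choices (a two-sample t-test is another), and the measurements are invented purely for illustration. The test asks how often a difference in means at least as large as the observed one appears when the "before" and "after" labels are shuffled at random.

```python
import random

def mean(values):
    return sum(values) / len(values)

def permutation_p_value(before, after, n_resamples=10_000, seed=0):
    """Two-sided permutation test on the difference in sample means."""
    rng = random.Random(seed)
    observed = abs(mean(after) - mean(before))
    pooled = list(before) + list(after)
    n_before = len(before)
    extreme = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)  # reassign the before/after labels at random
        diff = abs(mean(pooled[n_before:]) - mean(pooled[:n_before]))
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_resamples + 1)

# Hypothetical cycle-time measurements before and after a process change.
before = [10.2, 9.8, 10.5, 10.1, 9.9, 10.3]
after = [10.0, 9.7, 10.4, 10.2, 9.6, 10.1]

p_value = permutation_p_value(before, after)
print(f"p-value: {p_value:.2f}")
```

The "after" mean is slightly lower, which looks like an improvement, but a large p-value here would mean that shuffled labels produce a gap that size routinely. A difference like that is not evidence of a real change.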

5. Correlation does not imply causation. You may be able to draw a regression line through a scatter plot, but that doesn’t necessarily mean there’s a cause-and-effect relationship between the two variables. This is where we have to use engineering judgment or even common sense. Earlier this year the Atlanta Braves baseball team lost 16 consecutive games that were played on a Monday. No one has been able to explain how winning or losing a baseball game could possibly be caused by the day of the week. A related logical fallacy is post hoc, ergo propter hoc (after it, therefore because of it). Chronological sequence does not imply causation, either.
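A simulation can illustrate how a strong correlation arises with no cause-and-effect link between the two variables. This sketch assumes one common mechanism, a hidden third factor that drives both variables; it is not the only way a non-causal correlation can occur, and all of the names and numbers are made up.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(42)

# A hidden factor (hypothetical) influences both x and y.
hidden = [rng.gauss(0, 1) for _ in range(5000)]
x = [h + rng.gauss(0, 0.5) for h in hidden]
y = [h + rng.gauss(0, 0.5) for h in hidden]

r = pearson_r(x, y)
print(f"correlation between x and y: {r:.2f}")
```

Neither x nor y appears in the formula for the other, yet the scatter plot would support a convincing regression line. Changing x directly would do nothing to y, which is the judgment the numbers alone cannot supply.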
