The Phantom Pattern Problem: The Mirage of Big Data

ISBN: 9780198864165

Gary Smith; Jay Cordes
272 pages
129 x 196 mm

Pattern-recognition prowess served our ancestors well, but today we are confronted by a deluge of data that is far more abstract, complicated, and difficult to interpret. The number of possible patterns that can be identified relative to the number that are genuinely useful has grown exponentially, which means that the chance that a discovered pattern is useful is rapidly approaching zero. Patterns in data are often used as evidence, but how can you tell if that evidence is worth believing? We are hard-wired to notice patterns and to think that the patterns we notice are meaningful. Streaks, clusters, and correlations are the norm, not the exception. Our challenge is to overcome our inherited inclination to think that all patterns are significant, because in this age of Big Data, patterns are inevitable and usually coincidental. Through countless examples, The Phantom Pattern Problem is an engaging read that helps us avoid being duped by data, tricked into worthless investing strategies, or scared out of getting vaccinations.


1 Survival of the Sweaty Pattern-Processors
2 Predicting What is Predictable
3 Duped and Deceived
4 Fooled Again and Again
5 The Paradox of Big Data
6 Fruitless Searches
7 The Reproducibility Crisis
8 Who Stepped In It?
9 Seeing Things for What They Are


Gary Smith is the Fletcher Jones Professor of Economics at Pomona College. He received his Ph.D. in Economics from Yale University and was an Assistant Professor there for seven years. He has won two teaching awards and written more than eighty academic papers and thirteen books.

Jay Cordes is a data scientist who enjoys tackling challenging problems, including how to guide future data scientists away from the common pitfalls he saw in the corporate world. He earned a Math degree from Pomona College and more recently graduated from UC Berkeley's Master of Information and Data Science (MIDS) program.