ISBN : 9780198844396
Data science has never had more influence on the world. Large companies are seeing the benefit of employing data scientists to interpret the vast amounts of data that now exist. However, the field is so new and is evolving so rapidly that the analysis produced can be haphazard at best.
The 9 Pitfalls of Data Science shows us real-world examples of what can go wrong. Written to be an entertaining read, this invaluable guide investigates the all-too-common mistakes of data scientists, who can be plagued by lazy thinking, whims, hunches, and prejudices, and shows how these mistakes have been at the root of many disasters, including the Great Recession.
Gary Smith and Jay Cordes emphasize how scientific rigor and critical thinking skills are indispensable in this age of Big Data, as machines often find meaningless patterns that can lead to dangerous false conclusions. The 9 Pitfalls of Data Science is loaded with entertaining tales of both successful and misguided approaches to interpreting data, from grand successes to epic failures. These cautionary tales will not only help data scientists be more effective, but also help the public distinguish between good and bad data science.
1 Pitfall #1: Using Bad Data
2 Pitfall #2: Putting Data Before Theory
3 Pitfall #3: Worshiping Math
4 Pitfall #4: Worshiping Computers
5 Pitfall #5: Torturing Data
6 Pitfall #6: Fooling Yourself
7 Pitfall #7: Confusing Correlation with Causation
8 Pitfall #8: Being Surprised By Regression Toward the Mean
9 Pitfall #9: Doing Harm
10 Case Study: The Great Recession