April 8th 2020
Big data has become such a buzzword that we often forget data is useless without the statistical methods that make meaning of it. It is truly fascinating how two parts of the same process have received such different levels of attention. Obviously, statistical ideas and methodologies are complex and don’t make a good headline, but the extent to which they have been put aside is perplexing.
More importantly, given how much of our lives is constantly influenced by data, and its likely impact on our future, it would make sense for schools to include Statistics as a core subject. I cannot speak for other countries, but unfortunately in Bhutan there is no trace of statistics in the middle and high school Math curriculum.
Although I am no expert in statistics, nor especially knowledgeable, I was mildly surprised by how basic and introductory this book seemed. Reading it made me realize how thankful I am to my college statistics professor, Michael Kahn. I took Accelerated Statistics and Methods of Data Analysis, which I not only enjoyed but which also changed what I want to study in graduate school. This book is perfect for those who have a basic idea of statistics, or for those who took introductory statistics and want to build a strong base of statistical knowledge.
One of the most recurring ideas in the book is that Statistics is just a tool, one that can be used for helpful as well as nefarious purposes. Sometimes people cause harm with statistics not knowingly but because they lack a proper understanding of the ideas involved. In the book 'Weapons of Math Destruction', the author writes about how New York City tried to use data and artificial intelligence to deter crime. Basically, the idea was to statistically predict and locate places where crimes would occur. However, because of the way the system was designed (without full awareness of the underlying assumptions), it targeted minority communities, and over time it exacerbated the effect, causing more harm. On the other side, statistical methods have improved lives. For instance, the book 'The Undoing Project' describes how researchers found that software designed using statistical methods was able to detect cancers from an x-ray picture more accurately than doctors.
With the advent of computers and numerous statistical software packages, it has become easy for anyone to perform a regression analysis. However, it is crucial that people know the underlying assumptions that need to be fulfilled before churning out a regression equation and implementing policies centered around it. And this is where I believe a lot of us fail.
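To illustrate how easy it is to churn out a regression equation without checking its assumptions, here is a minimal sketch in Python. The data is made up for illustration (a perfectly curved relationship, y = x squared), and the helper name fit_line is my own, not something from the book. The line fits without any complaint, but the residuals show a clear pattern, which is a sign that the linearity assumption does not hold.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: the true relationship is curved, not linear.
xs = list(range(11))
ys = [x ** 2 for x in xs]

intercept, slope = fit_line(xs, ys)

# Residual = observed value minus the value predicted by the fitted line.
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

print(f"fitted line: y = {intercept:.1f} + {slope:.1f} * x")
print("residuals:", [round(r, 1) for r in residuals])
```

If the linear model were appropriate, the residuals would scatter around zero with no visible shape. Here they are positive at both ends and negative in the middle, so any policy built on this straight line would be built on a model that does not describe the data.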
The government of Bhutan recently started an initiative called Economic Road Map for the 21st Century, which will plan out the economic road map of Bhutan for the decade. On their website and social media accounts, they have a poll to determine Bhutanese views on our economy. This is a good initiative, at least in terms of collecting data, since that is rare in Bhutan. However, I hope that the government doesn’t let these numbers influence their policies in any way, because these numbers are inherently flawed.
The method of data collection.
The issue with this data collection is that, in the end, the people carrying out this poll will most likely conclude that x percent of the Bhutanese population feels a certain way (doing well, not well or can’t say) about the economy. This is not true, because the sample is not random and so cannot represent the whole Bhutanese population. Given the way the data is being collected, the correct conclusion would be that x percent of Bhutanese who have internet access and are educated enough to know what the economy means feel that the Bhutanese economy is doing well, not well, or can’t say.
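The problem can be shown with a small simulation in Python. Every number here is an assumption I made up for illustration (the share of people online and how each group feels about the economy); none of it comes from the actual poll. The sketch compares the true share of the whole population that thinks the economy is doing well with the share an online-only poll would report.

```python
import random


def simulate_poll(population_size=100_000, internet_share=0.4,
                  well_if_online=0.6, well_if_offline=0.3, seed=0):
    """Return (true share, online-poll share) saying the economy is doing well.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(population_size):
        online = rng.random() < internet_share
        p_well = well_if_online if online else well_if_offline
        says_well = rng.random() < p_well
        population.append((online, says_well))

    # What we would learn by asking everyone.
    true_share = sum(well for _, well in population) / population_size

    # What an online poll sees: only people with internet access can answer.
    online_answers = [well for online, well in population if online]
    poll_share = sum(online_answers) / len(online_answers)
    return true_share, poll_share


true_share, poll_share = simulate_poll()
print(f"whole population: {true_share:.1%} say the economy is doing well")
print(f"online poll only: {poll_share:.1%} say the economy is doing well")
```

Under these made-up assumptions the online poll overstates the positive view by a wide margin, even though nobody answered dishonestly. Collecting more online responses would not fix this, because the gap comes from who is able to respond, not from how many respond.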
Michael Kahn: one of the best professors any student can have.