Chirag Shah, an associate professor at the University of Washington Information School, has authored a new academic textbook, “A Hands-on Introduction to Data Science.”
Shah sees data science as an interdisciplinary field in which many solutions extend beyond statistical and computational modeling and require people from different fields and backgrounds to work together.
“It is not just doing the programming, algorithms and data analysis, but also trying to understand data problems in context,” he said.
With the rapid growth of technology over the past decade and its impact on our daily lives, Shah argues that data literacy is important for everyone to learn. When he could not find enough teaching material that made data science accessible to his students, he decided to write this textbook.
“Whether you are a history, psychology or business major, you need to know how to read and write. Similarly, data literacy is one of those basic skills that everybody needs to have, whether or not they want to pursue a career in data science,” he explained.
Shah’s textbook introduces the field of data science in a practical and accessible manner. The first part of the book is for anyone who wants to learn about data literacy. The second part covers the basics of programming. The third part focuses on machine learning. The final part offers many real-life examples, with practice ranging from small to big data and applications involving social media, health and finance. Along the way, the book tackles multiple themes, including ethics, data privacy and fairness. The book is also accompanied by a suite of online material for both instructors and students, which includes solutions, code, datasets, slides and curriculum suggestions.
“Data is not pure. Discrimination and biases happen all the time. They are deeply rooted in the way we collect, store and use data, as well as the algorithms for processing that data,” Shah said.
The book warns readers about issues such as algorithmic and data bias.
“If you’re not careful, data analytics and machine learning techniques can create discrimination, with grave consequences,” he added.
Shah believes there needs to be more diversity in the data science field. He hopes that his textbook can help make the field more inclusive and remove the barriers that have kept underrepresented communities from being able to participate.
“We know there is huge inequality in terms of gender and racial and other kinds of characteristics and in terms of people who are not represented in data science as well as other science fields. The book is aimed to lower that barrier so we can have more inclusiveness,” he said.
“A Hands-on Introduction to Data Science” was published in April 2020 by Cambridge University Press.