Columbia University’s Data Science Institute is releasing some mooks, and I’m part of it. I’ll first give the official announcement and then share some of my thoughts.

The official announcement:

The Data Science Institute at Columbia University is excited to announce the launch of its first online-education series, Data Science and Analytics in Context, on Dec. 14. Available through the edX platform, the three-course series will run through April, featuring lectures, engaging exercises and community discussion.

The first course, Statistical Thinking for Data Science and Analytics, teaches the statistical foundations for analyzing large datasets. You will learn how data scientists design the data collection process, gain insights from visualizing data, find supporting evidence for data-based decisions and construct models for predicting future trends.

The second course, Machine Learning for Data Science and Analytics, is an introduction to machine learning and algorithms. In this course, you will develop a basic understanding of machine learning principles and how to find practical solutions using predictive analytics. We will also examine why algorithms play an essential role in Big Data analysis.

The third course, Enabling Technologies for Data Science and Analytics, explores the major components of the Internet of Things, including data-gathering sensors. You will develop an understanding of how software is able to analyze events, recognize faces, and interpret sentiment on social media, and how this information is fed into the decision-making process.

Learn from leading data scientists at Columbia University with guidance provided by Columbia graduate assistants during each course. Watch the video trailer for the series online at ColumbiaX and enroll today!

Link for video – https://www.youtube.com/watch?v=ahvuPvm-1YU

Link to enroll – https://www.edx.org/xseries/data-science-analytics-context

My perspective:

The mooks were organized by a group at our new Data Science Institute, including Prof. Tian Zheng, a friend and colleague of mine in the statistics department. I prepared two lectures, one on Bayesian data analysis and one on exploratory data analysis and visualization. The content was not super-organized; I just used some material I had around, including some of my favorite recent stories such as the Xbox polls and the age-adjusted death rates. I’m not sure how well they went because I hate looking at videos of myself. I did see clips from some of the other lectures and they looked pretty good.

Last year I prepared an intro stat course for the School of International and Public Affairs. I taped twelve 40-minute lectures, and along with each were R sessions with Ben Goodrich. These taped lectures were super-smooth; I actually ended up writing scripts for all of them because I sounded too awkward when I simply spoke as if I were giving a usual talk. In contrast, these new mooks are more like classroom lectures; it’s a different feel entirely.

Anyway, I hope this goes well. Organizing a remote course on data science seems like a real challenge, and it seems like a reasonable starting point to get different people to give different lectures on their areas of expertise. I suppose much will depend on the homework assignments and the student feedback. I was happy to contribute my parts, small as they were.