Author Archives: Rachel Schutt

Doing Data Science & Ada Lovelace Day

My book (with Cathy O’Neil), Doing Data Science, is now available as an ebook, and the print version will be available next week! The book is based on last year’s Introduction to Data Science class. In honor of Ada Lovelace Day, O’Reilly (our publisher) is offering 50% off books by women, and so because we’re women, […]

Announcing the Columbia Data Science Society

There is a new student group on campus called the Columbia Data Science Society. They’ve asked me to pass along the following information: Introducing Columbia Data Science Society! Columbia Data Science Society, CDSS, is an interdisciplinary society that promotes data science across Columbia University and the New York City community. Our goal is to understand […]

Introduction to Data Science Version 2.0

I’m teaching Introduction to Data Science for the second year. We just started last week, and here are some of the significant differences between this year and last year: (1) Added another professor: I am team-teaching this year with Dr. Kayur Patel, who is a computer scientist at Google. Crudely speaking, we can think […]

Philosophy of Data Science: Embrace the Practical and the Profound

This is my last blog post for Statistics 4242, Introduction to Data Science at Columbia University. All final projects have been turned in; grades have been given; the semester is over. I reserve the right to start blogging again at a later date. Dear Students, From the beginning, this course viewed Data Science simultaneously in […]

Kaggle Visualization Competition in Our Honor!

Dear Students, There is a new Kaggle Visualization Competition in our honor! I encourage you all to enter it! I received this email from Will Cukierski from Kaggle. This email was sent to me and Chris Mulligan. (See the p.s. for the Legend of Chris Mulligan.) Yours, Rachel Chris and Rachel, Thanks to your blog […]

Kaggle Competition Final Results!

Congratulations to Maura Fitzgerald for taking first place in our in-class Kaggle competition! First, a couple of comments, and then the final results are below. Were these Kaggle-competitive scores? The top scores were in the ballpark of the winning scores in the external version of this competition. The students in the class were given slightly different […]

My Strata Talk: Next-Gen Data Scientists

Dear Students, I’ll be giving a talk at Strata in February about this course and our experiences together: I’m bringing it up now, even though it’s more than two months off, because I plan to stop blogging about the class when the semester is finished. Here’s the abstract: Data Science is an emerging field […]