Category: Data Science Domains

Week 10: Observational studies, Confounders, Epidemiology

Each week Cathy O’Neil blogs about the class. Cross-posted from This week our guest lecturer was David Madigan, Professor and Chair of Statistics at Columbia. He received a bachelor’s degree in Mathematical Sciences and a Ph.D. in Statistics, both from Trinity College Dublin. He has previously worked for AT&T Inc., Soliloquy Inc., the University […]

Dallas Art Museum and New York Philharmonic

Dear Students, We are not in a vacuum! As you know, part of the philosophy of this course is to draw inspiration from the real world, including guest speakers and messy data sets. We also try to create some academic distance (we are not fully immersed in the “real world”) and develop and form our […]

Week 8: Data Visualization, Square, Fraud Detection

Each week Cathy O’Neil blogs about the class. Cross-posted from This week in Rachel Schutt’s Columbia Data Science course we had two excellent guest speakers. The first speaker of the night was Mark Hansen, who recently came from UCLA via the New York Times to Columbia with a joint appointment in journalism and statistics. […]

The New York Philharmonic: Don’t be incompetent!

A couple of weeks ago I was at the New York Philharmonic. The conductor, critically acclaimed Alan Gilbert, and the piano soloist, Emanuel Ax, “broke the fourth wall” and explained Schoenberg’s Piano Concerto to the audience before playing it. They described Schoenberg’s 12-tone technique for composing music as: the composer selects a range of 12 notes and […]

Data Science & Urban Planning

Here I describe some inklings of ideas around Data Science & urban planning based on recent conversations* I’ve had, and casual reading I’ve been doing. I will touch on Las Vegas, Brooklyn, the Hubway visualization competition, and FourSquare. Metric: Return on Community This weekend’s NYT magazine has an article by Timothy Pratt: “‘If You Fix […]

Exploring the Data Science Universe

Dear Students, We’ve now had six weeks of blog posts, guest lectures, labs and homework assignments that have brought up a vast number of topics and issues across multiple dimensions that cover some subspace of Data Science. Finding your own way of understanding Data Science I hope you are finding your own ways of figuring […]

“Big Data on Campus”

For the final project, you’re working on developing a story or hypothesis around the theme of Data Science and Education. This article, “Big Data on Campus”, appeared in the New York Times (in cooperation with the Chronicle of Higher Education) over the summer and explores ways in which universities are starting to use technology that […]