Category: Exploratory Data Analysis

Exploratory Data Analysis with Time-stamped Event Data

In the age of Big Data, one common data type is time-stamped event data. This post focuses on (1) explaining what time-stamped event data is and (2) describing the Exploratory Data Analysis (EDA) you can do with it. It’s best to start your analysis with EDA so you can gain intuition for the data […]

The Data Science Process

Dear Students, now that we’ve had our first guest lecture, I’d like to revisit the general framework for thinking about the data science process that I proposed on the first day of class (when I generalized the example from Google Plus), and show how Jake’s lecture fits within this framework. Throughout the semester we’ll see that […]

Data Scientist Profiles

An example of a data scientist profile of one of the students in our class

What were you thinking when you made us do those data scientist profiles?

I had four primary reasons for going through that exercise:
Reason 1: Cultivating self-awareness

Reason 2: Illustrating the importance of standardization in visualization
I wanted to demonstrate how to standardize visualizations of individuals as a mix of characteristics. (You should think about how you might do it, and then also ask yourself whether you think a standardized visualization has any value.) In this particular case:

(a) standardizing the x-axis: I used the main buckets that I thought roughly captured some of the skills one needs as a data scientist. I’m not tied to these

Exploratory Data Analysis

Exploratory Data Analysis (EDA) is often relegated to Chapter 1 (by which I mean the “easiest” and lowest-level chapter) of standard introductory statistics textbooks and then forgotten about for the rest of the book. Notable examples of textbooks used in statistics curricula that embrace EDA are Andrew Gelman’s books (which are by no means introductory). […]