Tonight we have two guest speakers, David Crawshaw and Josh Wills, both of whom I’ve had the pleasure of working with at Google. I hesitate to call them “data engineers” because that term is as problematic or potentially overloaded as “data scientist”, but suffice it to say that they’ve both worked as software engineers and dealt with massive amounts of data. Here’s more information about them:
David Crawshaw is a Software Engineer at Google who once accidentally deleted 10 petabytes of data with a bad shell script. Luckily he had a backup. After a stint in Social, he now builds infrastructure for better understanding search. He recently moved from San Francisco to New York!
Josh Wills is Cloudera’s Director of Data Science, working with customers and engineers to develop Hadoop-based solutions across a wide range of industries. Prior to joining Cloudera, Josh was at Google, where he worked on the ad auction system and then led the development of the analytics infrastructure used in Google+. He earned his Bachelor’s degree in Mathematics from Duke University and his Master’s in Operations Research from The University of Texas at Austin.
Josh is also known for pithy data science quotes, such as: “I turn data into awesome” and “Data Scientist (noun): Person who is better at statistics than any software engineer and better at software engineering than any statistician.”
(If you remember, when Will Cukierski of Kaggle came to our class, he countered with “Data Scientist (noun): Person who is worse at statistics than any statistician and worse at software engineering than any software engineer.” Of course these two definitions are not incompatible.)