Week 3: Scrubbing and Analysis


Today we will be talking about a necessary evil of working with data: cleaning it. Unfortunately, if you rely on someone else to collect the data for you, it's usually not in the format you need, and some data points may be incorrect or missing. Once you have scrubbed your data, you can move on to analysis. We will talk about the tools you can use to analyze your data, but ultimately you should use whichever you are most comfortable with. During analysis we will make quick-and-dirty visualizations to gain more insight; there is no need to worry about design at this stage.

Lecture Slides

In Class Assignment

Find a dataset online, clean it up, and put it in a format you can work with.
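If you are comfortable with a little code, the cleanup step can be sketched in Python using only the standard library. This is a minimal sketch, not a required method: the sample rows, the column names (`state`, `incidents`, `date`), and the helper `clean_rows` are all made up for illustration, and your real dataset will need its own rules.

```python
import csv
import io

# Made-up sample standing in for a downloaded CSV file.
# The column names are hypothetical.
RAW = """state, incidents ,date
 NY ,12,2016-01-04
ca,,2016-01-09
TX,n/a,2016-02-11
ny,7,2016-02-15
"""

def clean_rows(text):
    """Return rows with tidy headers, upper-case state codes, and integer counts."""
    reader = csv.DictReader(io.StringIO(text))
    cleaned = []
    for row in reader:
        # Strip stray whitespace from both the header names and the values.
        row = {key.strip(): (value or "").strip() for key, value in row.items()}
        # Skip rows whose count is missing or not a whole number.
        if not row["incidents"].isdigit():
            continue
        cleaned.append({
            "state": row["state"].upper(),
            "incidents": int(row["incidents"]),
            "date": row["date"],
        })
    return cleaned

rows = clean_rows(RAW)
for row in rows:
    print(row)
```

Dropping incomplete rows is only one option. Depending on the dataset, you may prefer to keep them and flag the missing values so you can see how much of the data is affected.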

Once the data is clean, use either Google Sheets or Excel to explore the values, using pivot tables and other tools and techniques to pull out interesting points. Create quick visualizations of the data to gain more insight.
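For anyone who would rather work in code than in a spreadsheet, here is a rough Python equivalent of a pivot table that sums a value for each row/column pair, followed by a throwaway text bar chart. It is a sketch under stated assumptions: the records, the field names, and the helper `pivot_sum` are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Made-up, already-cleaned records; the field names are hypothetical.
records = [
    {"state": "NY", "month": "2016-01", "incidents": 12},
    {"state": "NY", "month": "2016-02", "incidents": 7},
    {"state": "CA", "month": "2016-01", "incidents": 5},
    {"state": "CA", "month": "2016-01", "incidents": 3},
]

def pivot_sum(rows, row_key, col_key, value_key):
    """Sum value_key for every (row_key, col_key) pair, like a pivot table set to SUM."""
    table = defaultdict(lambda: defaultdict(int))
    for row in rows:
        table[row[row_key]][row[col_key]] += row[value_key]
    # Convert the nested defaultdicts to plain dicts for easier printing.
    return {r: dict(cols) for r, cols in table.items()}

pivot = pivot_sum(records, "state", "month", "incidents")

# Quick and dirty text bar chart: one '#' per incident.
for state, months in sorted(pivot.items()):
    for month, total in sorted(months.items()):
        print(f"{state} {month} {'#' * total} {total}")
```

The chart is deliberately crude. At this stage the goal is to spot patterns quickly, not to produce a finished graphic.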

Sample Datasets:
EPA Worst Emitters
Illegal Immigration Detainees
U.S. JCC Bomb Threats
U.S. Mass Shootings
Missing Migrants
Officer Involved Deaths 2016
Global Terrorist Attacks 2016


Reading: Pages 111-146

Decide on a topic, cause, or non-profit to research for your midterm. Find at least two datasets on that topic, analyze them, and visualize them for insight. Be prepared to discuss your findings next week.

Example: If you are interested in climate change, you can look at data on its causes and effects (e.g., temperature data, extreme weather, carbon emissions, methane emissions, extinction rates, government environmental policies) and see whether there is any correlation.
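One simple way to check for correlation between two series is the Pearson correlation coefficient, which ranges from -1 (perfectly opposite) to 1 (perfectly aligned). The sketch below computes it by hand in Python. The helper `pearson` and both lists of numbers are invented for illustration and are not real climate measurements.

```python
import math

def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sum of squared deviations for each list.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Made-up numbers, NOT real measurements.
emissions = [1.0, 2.0, 3.0, 4.0, 5.0]
temperature = [2.1, 3.9, 6.2, 8.0, 9.8]

r = pearson(emissions, temperature)
print(f"r = {r:.3f}")
```

A value near 1 or -1 suggests the two series move together, but correlation alone does not show that one causes the other, so treat it as a prompt for further digging.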
