Tidy Data

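There are three interrelated rules which make a dataset tidy:

  1. Each variable must have its own column.
  2. Each observation must have its own row.
  3. Each value must have its own cell.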

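The rules are interrelated because it is impossible to satisfy only two of the three. That interrelationship leads to an even simpler set of practical instructions:

  1. Put each dataset in a tibble.
  2. Put each variable in a column.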
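As a rough illustration of what tidy means in practice, the sketch below builds the same small table twice with tribble() from the tibble package: once with bridge names spread across the column headers, and once with one column per variable. The dates, column names, and counts are invented for this example and are not the lesson's bike counts data. Whether the first layout counts as untidy depends on treating bridge as a variable, which is the assumption made here.

```r
library(tibble)

# Hypothetical values, invented purely for illustration.
# Layout 1: the bridge names are really values of a "bridge" variable,
# but here they are spread across the column headers.
untidy <- tribble(
  ~date,        ~Hawthorne, ~Tilikum,
  "2016-06-01", 100,        80,
  "2016-06-02", 120,        90
)

# Layout 2: one column per variable (date, bridge, count),
# one row per observation (one bridge on one day),
# and one value per cell.
tidy <- tribble(
  ~date,        ~bridge,     ~count,
  "2016-06-01", "Hawthorne", 100,
  "2016-06-01", "Tilikum",    80,
  "2016-06-02", "Hawthorne", 120,
  "2016-06-02", "Tilikum",    90
)

# Both layouts hold the same four counts; only the shape differs.
print(untidy)
print(tidy)
```

In the second layout, adding another bridge means adding rows rather than a new column, so code that refers to the date, bridge, and count columns keeps working unchanged.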

Lesson

Download the script that generates the tables for the lesson here

  1. Tidy Data

Exercise

  1. Are the bike counts data tidy data?
  2. If not, why not? And how can we tidy it?
  3. Convert the total bike counts data to wide format, with one row per day and one column holding the total bike count for each of the three bridges.
  4. Convert the above data frame in wide format back to long format.
  5. [Challenge] After tidying the bike counts, using functions in the tidyr package, create tables summarizing the average bike counts by bridge and day of week in two different formats:

     Bike Counts by Day of Week and Bridge (1st Format)

     Bridge       Sun   Mon   Tue   Wed   Thur   Fri   Sat
     Hawthorne
     Tilikum

     Bike Counts by Day of Week and Bridge (2nd Format)

     Day of Week   Hawthorne   Tilikum
     Fri
     Mon
     Sat
     Sun
     Thur
     Tue
     Wed
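
The reshaping steps in the exercise can be sketched on a toy data frame. This is only one possible approach, not the lesson's tidy_counts.R: the column names (date, bridge, total) and every value below are assumptions made for illustration, and the sketch relies on pivot_wider() and pivot_longer() from tidyr (version 1.0.0 or later) together with dplyr for the averages.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical long-format counts; the column names and all values
# are assumptions, not the lesson's actual bike counts data.
counts_long <- tribble(
  ~date,        ~bridge,     ~total,
  "2016-06-05", "Hawthorne", 100,
  "2016-06-05", "Tilikum",    80,
  "2016-06-06", "Hawthorne", 300,
  "2016-06-06", "Tilikum",   200,
  "2016-06-12", "Hawthorne", 140,
  "2016-06-12", "Tilikum",   100
) %>%
  mutate(date = as.Date(date))

# Long to wide: one row per day, one column per bridge.
counts_wide <- counts_long %>%
  pivot_wider(names_from = bridge, values_from = total)

# Wide back to long: every column except date holds one bridge's counts.
counts_back <- counts_wide %>%
  pivot_longer(cols = -date, names_to = "bridge", values_to = "total")

# Average count by bridge and day of week, still in long format.
# format(date, "%a") returns an abbreviated weekday name, which
# depends on the locale and may differ from the labels in the tables.
avg_counts <- counts_long %>%
  mutate(day_of_week = format(date, "%a")) %>%
  group_by(bridge, day_of_week) %>%
  summarise(avg_total = mean(total), .groups = "drop")

# 1st format: one row per bridge, one column per day of week.
by_bridge <- avg_counts %>%
  pivot_wider(names_from = day_of_week, values_from = avg_total)

# 2nd format: one row per day of week, one column per bridge.
by_day <- avg_counts %>%
  pivot_wider(names_from = bridge, values_from = avg_total)

print(by_bridge)
print(by_day)
```

Both summary tables come from the same long-format averages; only the names_from argument changes, which is the main reason to tidy the data before summarizing.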

Sample code: tidy_counts.R

Resources

  1. Dataframe Manipulation with tidyr
  2. Data Wrangling Cheat sheet
  3. Introduction to tidyr