Split-Apply-Combine

A common analytical pattern is to split a data set into groups, apply the same operation to each group, and combine the results.

Generally, avoid loops when you need to do Split-Apply-Combine; consider these alternatives instead:

  1. Entry level: dplyr::group_by()
  2. General approach: nesting
  3. *apply functions and the plyr package (non-tidyverse solution)
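
As a rough illustration of the pattern without a loop, here is a minimal sketch in base R using split() and sapply(). The data frame `bikes`, its columns, and the numbers in it are made up for this example and are not the course data set.

```r
# Hypothetical toy data: daily bike counts on two bridges (made-up numbers).
bikes <- data.frame(
  bridge = c("A", "A", "A", "B", "B", "B"),
  count  = c(100, 150, 200, 50, 60, 70)
)

# Split: one data frame per bridge.
by_bridge <- split(bikes, bikes$bridge)

# Apply: compute the same summary (mean daily count) for each piece.
mean_counts <- sapply(by_bridge, function(d) mean(d$count))

# Combine: collect the per-bridge results into a single data frame.
result <- data.frame(
  bridge     = names(mean_counts),
  mean_count = unname(mean_counts)
)
print(result)
```

The same three steps carry over to the tidyverse options listed above; the function applied to each piece can be anything that takes a data frame, such as a model fit or a plot.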

Lesson

Exercise

  1. Fit linear regression models of the daily bike counts on precipitation and max temperature, first for both bridges together and then for each bridge separately using the split-apply-combine pattern.
  2. When using ggfortify to plot weekly variation, trend and noise separately, you need to plot each bridge separately (sample code here). Use the split-apply-combine pattern to avoid having to repeat the code for each bridge.

Resources:

  1. purrr package
  2. purrr tutorial
  3. Software Carpentry lesson on Split-Apply-Combine