Working with time-series data can be a challenge for new and experienced R users alike. You will often have to format the date, time, and timezone when working with raw data. R does not automatically recognize date-time formats, and there are many ways to write a date-time (e.g. yyyy-mm-dd, mm-dd-yy, mm/dd/yyyy hh:mm:ss).

lubridate is a handy package that is installed as part of the tidyverse installation. In tidyverse versions before 2.0.0 it does not load automatically when you call library(tidyverse), so you have to load it explicitly with library(lubridate) when you need it.
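As a minimal sketch of how lubridate handles the different formats mentioned above: its parsing functions are named after the order of the components in the string (y, m, d, then h, m, s). The input strings below are made up for illustration and are not part of the lesson data.

```r
library(lubridate)

# The same calendar day written in two different orders (hypothetical strings)
iso   <- ymd("2021-03-14")   # year-month-day
short <- mdy("03-14-21")     # month-day-year with a two-digit year

# A full date-time; with no tz argument lubridate assumes UTC
full <- mdy_hms("03/14/2021 15:09:26")

class(iso)    # a Date
class(full)   # a POSIXct date-time
iso == short  # both strings parse to the same day
```

Picking the function whose name matches the order of the components in your raw data is usually all that is needed; the separators (-, /, spaces) are handled for you.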

Lesson

Download the sample data set from here (right-click & Save link as…), and move it to the data folder of your working directory.

Load the library and sample dataset.

library(lubridate)
library(dplyr)

leesferry <- read.csv("data/leesferry.csv", stringsAsFactors = FALSE)

Check the structure of your data frame and you’ll notice that the date.time column is a character string, not a date-time. You’ll also notice that the timezone is MST.

str(leesferry)
## 'data.frame':    43391 obs. of  7 variables:
##  $ date.time  : chr  "1926-01-11 00:00:00" "1926-02-25 00:00:00" "1926-03-26 00:00:00" "1926-05-24 00:00:00" ...
##  $ timezone   : chr  "MST" "MST" "MST" "MST" ...
##  $ site.name  : chr  "Lees" "Lees" "Lees" "Lees" ...
##  $ parameter  : chr  "p71851" "p71851" "p71851" "p71851" ...
##  $ description: chr  "Nitrate, water, filtered, milligrams per liter as nitrate" "Nitrate, water, filtered, milligrams per liter as nitrate" "Nitrate, water, filtered, milligrams per liter as nitrate" "Nitrate, water, filtered, milligrams per liter as nitrate" ...
##  $ measurement: chr  "15" "8.4" "0.8" "0" ...
##  $ units      : chr  "mg/l asNO3" "mg/l asNO3" "mg/l asNO3" "mg/l asNO3" ...

When assigning a timezone to your date-time column, use its Olson name. OlsonNames() returns the full list of valid names, of which there are over 600. You can also look at a list here.
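A short sketch of how you might search that list rather than scrolling through it. The search pattern is only an illustration, and the exact names returned depend on the time zone database installed on your system.

```r
# OlsonNames() is part of base R; it returns a character vector of zone names
tz_names <- OlsonNames()

# How many names does this system know about?
length(tz_names)

# Narrow the list down with a pattern (an example pattern, adjust as needed)
mountain <- grep("Mountain|Denver|Phoenix", tz_names, value = TRUE)
mountain
```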

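To convert the date.time column from a character string to a date-time with the appropriate timezone, use the following lubridate function. Note that "US/Mountain" observes daylight saving time; if your records are in standard time all year, a zone that does not observe it (for example "America/Phoenix") may be a better fit.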

leesferry$date.time <- ymd_hms(leesferry$date.time, tz = "US/Mountain")
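To see what the tz argument actually does, here is a small sketch using the first timestamp from the str() output above. It contrasts with_tz(), which keeps the same instant and changes how it is displayed, with force_tz(), which keeps the clock time and changes the instant.

```r
library(lubridate)

# Parse one timestamp from the sample data as Mountain time
x <- ymd_hms("1926-01-11 00:00:00", tz = "US/Mountain")
tz(x)

# Same moment in time, shown on a UTC clock
x_utc <- with_tz(x, tzone = "UTC")

# Same clock reading (midnight), but now treated as UTC: a different moment
x_forced <- force_tz(x, tzone = "UTC")

x == x_utc     # TRUE: only the display changed
x == x_forced  # FALSE: the underlying instant changed
```

Use with_tz() when you want to report the data in another zone, and force_tz() only when the zone attached to the data is wrong.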

To extract the year, month, or day of a date:

# Extract and create new column for year
leesferry$year <- year(leesferry$date.time)

# Extract and create new columns for month and day of week
leesferry_dow <- leesferry %>%
  mutate(
    month = month(date.time),
    dow   = wday(date.time, label = TRUE, abbr = TRUE)
  )

lubridate has a number of other functions that make working with time-series data easier, including but not limited to:

  • rounding date-times (helpful with aggregations)
  • calculating intervals, periods, and durations
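The two items above can be sketched roughly as follows. The timestamps are invented for illustration and are not from the lesson data; see the lubridate documentation for the full set of options.

```r
library(lubridate)

t1 <- ymd_hms("2021-03-14 15:09:26", tz = "UTC")

# Rounding: snap a date-time to a unit, e.g. before grouping and summarising
month_start  <- floor_date(t1, unit = "month")  # first instant of the month
nearest_hour <- round_date(t1, unit = "hour")   # closest whole hour
next_day     <- ceiling_date(t1, unit = "day")  # start of the following day

# Interval: a span anchored to two specific dates
span   <- interval(ymd("2020-01-31"), ymd("2021-01-31"))
n_days <- time_length(span, unit = "day")       # length of the span in days

# Period: calendar units, so "one month" follows the calendar
one_month_later <- ymd("2020-01-01") + months(1)

# Duration: an exact number of seconds, independent of the calendar
one_hour_later <- t1 + dhours(1)
```

For aggregation, a common pattern is to add a rounded column (for example with floor_date(date.time, "month") inside mutate()) and then group by it.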

Check out the vignette for these additional functions.