Set up your computer

Installing R and RStudio

R and RStudio are two different software packages. You must install R first.

  - Download and install the latest version of R from https://ftp.osuosl.org/pub/cran/ . Alternatively, go to www.r-project.org, click on the “download R” link, and choose your preferred CRAN mirror. You will have the option to download and install R for Linux, Mac, or Windows. (NOTE: Requires administrator/root privileges.)
  - Install the latest free version of RStudio Desktop from https://www.rstudio.com/products/rstudio/download/ .

If you already have R and RStudio installed on your laptop, check their versions and upgrade them to the latest releases if they are out of date.
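One quick way to check which version of R is installed is from a terminal. The sketch below assumes R is on your PATH, which is typical on Linux and Mac but may not hold on Windows; the helper name show_version is only illustrative. For RStudio, the version is usually shown under Help > About RStudio.

```shell
# Print the first line of a tool's version output, or a note if the tool
# is not on the PATH. show_version is a hypothetical helper name.
show_version() {
  if command -v "$1" >/dev/null 2>&1; then
    "$1" --version 2>&1 | head -n 1
  else
    echo "$1 not found on PATH"
  fi
}

show_version R
```

If the reported version is older than the one offered on your CRAN mirror, download and run the newer installer.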

Installing Git

Directions are courtesy of Software Carpentry.

Windows

This will provide you with Git and Git Bash.

Video Tutorial

  1. Download the Git for Windows installer.
  2. Run the installer and follow the steps below:
    1. Click on “Next” four times (two times if you’ve previously installed Git). You don’t need to change anything in the Information, location, components, and start menu screens.
    2. Select “Use the nano editor by default” and click on “Next”.
    3. Keep “Use Git from the Windows Command Prompt” selected and click on “Next”. If you forget to do this, programs that you need for the workshop will not work properly. If this happens, rerun the installer and select the appropriate option.
    4. Click on “Next”.
    5. Keep “Checkout Windows-style, commit Unix-style line endings” selected and click on “Next”.
    6. Select “Use Windows’ default console window” and click on “Next”.
    7. Click on “Install”.
    8. Click on “Finish”.
  3. If your “HOME” environment variable is not set (or you don’t know what this is):
    1. Open the command prompt (open the Start Menu, type ‘cmd’ without the quotes, and press [Enter]).
    2. Type the following line into the command prompt window exactly as shown: setx HOME "%USERPROFILE%"
    3. Press [Enter]. You should see SUCCESS: Specified value was saved.
    4. Quit the command prompt by typing exit and then pressing [Enter].
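To confirm that the HOME variable is visible, you can run a small check in a newly opened Git Bash window (values saved with setx only apply to windows opened afterwards). This is a minimal sketch; the function name check_home is only illustrative.

```shell
# Report whether HOME is set and points to an existing directory.
# check_home is a hypothetical helper name.
check_home() {
  if [ -n "${HOME:-}" ] && [ -d "$HOME" ]; then
    echo "HOME is set to: $HOME"
  else
    echo "HOME is not set or does not point to a directory"
  fi
}

check_home
```

If the second message appears, repeat the setx step above and open a new window before checking again.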

Mac

Video Tutorial

  1. For OS X 10.9 and higher, install Git for Mac by downloading and running the most recent “mavericks” installer from this list. Because this installer is not signed by the developer, you may have to right-click (or Control-click) on the .pkg file, click Open, and then click Open in the pop-up window. After installing Git, there will not be anything in your /Applications folder, as Git is a command line program.
  2. For older versions of OS X (10.5-10.8), use the most recent installer labelled “snow-leopard” available here.

Linux

If Git is not already available on your machine, you can try to install it via your distro’s package manager. For Debian/Ubuntu, run sudo apt-get install git; for Fedora, run sudo dnf install git.
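Whichever platform you installed on, you can confirm that Git is available by asking it for its version in a terminal (Git Bash on Windows). The following is a minimal sketch; the function name check_git is only illustrative.

```shell
# Print the installed Git version, or a hint if Git is not on the PATH.
# check_git is a hypothetical helper name.
check_git() {
  if command -v git >/dev/null 2>&1; then
    git --version
  else
    echo "git not found on PATH; revisit the installation steps for your platform"
  fi
}

check_git
```

A line beginning with "git version" means the installation succeeded.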

Installation Verification

Launch RStudio and you should see a program window like this:

What is Data Science?

According to Wikipedia:

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from data in various forms, both structured and unstructured, similar to data mining.

Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.

Data Science Venn Diagram by Drew Conway

Why R