An introduction to using the R statistical programming language and the RStudio interface for multivariate statistics.

Learning objectives

  1. Prepare data in a spreadsheet program (e.g. Excel, LibreOffice Calc) for export to R
  2. Read data from files into R
  3. Run Principal Components Analysis (PCA) and graphically display results
  4. Perform Discriminant Function Analysis (DFA) and interpret the results

Setup

Workspace organization

First we need to set up our workspace by creating two folders: ‘data’ will store the data we will be analyzing, and ‘output’ will store the results of our analyses.

dir.create(path = "data")
dir.create(path = "output")
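
Both folders are created relative to R's current working directory (see getwd()), and dir.create() issues a warning if a folder already exists. If you expect to re-run the setup, a minimal sketch such as the following checks for each folder first; the loop variable name folder is only illustrative.

```r
# Create each folder only if it is not already present, so that re-running
# the setup does not produce "already exists" warnings.
for (folder in c("data", "output")) {
  if (!dir.exists(folder)) {
    dir.create(path = folder)
  }
}
```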

Preparing data in a format R can read

  • Download the data file from https://jcoliver.github.io/learn-r/data/otter-mandible-data.xlsx or http://tinyurl.com/otter-data (the latter just redirects to the former). These data are a subset of those used in a study on skull morphology and diet specialization in otters (doi: 10.1371/journal.pone.0143236).
  • Open this file, otter-mandible-data.xlsx, in a spreadsheet program like Microsoft Excel® or LibreOffice Calc.
  • Save a copy of the file as a CSV (comma-separated values) file named ‘otter-mandible-data.csv’ in the data folder you created above:
    • In MS Excel®, select File > Save As… and in the dialog that appears, select CSV from the type dropdown menu.
    • In LibreOffice Calc, select File > Save As… and in the dialog that appears, select Text CSV (.csv) in the Format dropdown in the lower-right portion of the dialog.
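
Once the CSV has been saved, it can be worth confirming from R that the file landed in the data folder and really is comma-separated. The sketch below is one possible check, not part of the original lesson: check_csv is a hypothetical helper name, and it only looks at the first line of the file, so it assumes that line is a header containing at least two columns.

```r
# Hypothetical helper (not part of the lesson): returns TRUE when the file
# exists and its first line contains at least one comma, FALSE otherwise.
check_csv <- function(path) {
  if (!file.exists(path)) {
    message("File not found: ", path)
    return(FALSE)
  }
  # Read only the first line; an empty file yields a zero-length vector.
  header <- readLines(path, n = 1, warn = FALSE)
  has_commas <- length(header) == 1 && grepl(",", header, fixed = TRUE)
  if (!has_commas) {
    message("No commas found in the first line of ", path)
  }
  has_commas
}

# Path assumes the file name and folder suggested in the steps above.
check_csv("data/otter-mandible-data.csv")
```

If the check returns FALSE, the usual causes are saving the file outside the data folder or choosing a different delimiter (such as semicolons or tabs) in the export dialog.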