An introduction to using the R statistics package and the RStudio interface for multivariate statistics.
- Prepare data in a spreadsheet program (e.g. Excel, LibreOffice Calc) for export to R
- Read data from files into R
- Run Principal Components Analysis (PCA) and graphically display results
- Perform Discriminant Function Analysis (DFA) and interpret the results
First we need to set up our development environment. We need to create two folders: ‘data’ will store the data we will be analyzing, and ‘output’ will store the results of our analyses.
dir.create(path = "data")
dir.create(path = "output")
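If you run the two lines above a second time, `dir.create()` will warn that the folder already exists. The warning is harmless, but as an optional variation (a minimal sketch, not part of the original lesson) you can check for each folder first with `dir.exists()`:

```r
# Create each project folder only if it is not already present,
# so that re-running the script does not produce a warning.
for (folder in c("data", "output")) {
  if (!dir.exists(folder)) {
    dir.create(path = folder)
  }
}
```

Both folders are created relative to R's current working directory, which you can check with `getwd()`.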
Preparing data in a format R can read
- Download the data file from https://jcoliver.github.io/learn-r/data/otter-mandible-data.xlsx or http://tinyurl.com/otter-data (the latter just redirects to the former). These data are a subset of those used in a study on skull morphology and diet specialization in otters (doi: 10.1371/journal.pone.0143236).
- Open this file, otter-mandible-data.xlsx, in a spreadsheet program such as Microsoft Excel® or LibreOffice Calc.
- Save a copy of the file as a CSV (comma-separated values) file named ‘otter-mandible-data.csv’ in the data folder you created above:
- In MS Excel®, select File > Save As… and in the dialog that appears, select CSV from the type dropdown menu.
- In LibreOffice Calc, select File > Save As… and in the dialog that appears, select Text CSV (.csv) in the Format dropdown in the lower-right portion of the dialog.
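Once the CSV file is saved, a quick way to confirm that the export worked is to read it into R and look at its structure. The following is a minimal sketch that assumes the file was saved under the name and in the folder given above; the object names `csv_path` and `otter_data` are only illustrative.

```r
# Path to the CSV file saved in the steps above
csv_path <- "data/otter-mandible-data.csv"

if (file.exists(csv_path)) {
  # read.csv() treats the first row as column names by default
  otter_data <- read.csv(file = csv_path)
  # Show column names and types, then the first few rows
  str(otter_data)
  print(head(otter_data))
} else {
  # Most often the file was saved elsewhere or under another name
  message("File not found: ", csv_path,
          ". Check the file name and the folder it was saved in.")
}
```

If the columns look wrong (for example, everything appears in a single column), the file was probably saved with a delimiter other than a comma, and re-exporting it as CSV from the spreadsheet program should fix the problem.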