Introduction to Statistics using R and RStudio

A Workshop

Author

Mark Andrews

This two-day workshop provides a comprehensive introduction to R and RStudio for data science and statistics in academic and professional settings. R is free, open-source, and widely used across research, government, and industry. The course begins with setting up a working R environment and proceeds through the fundamentals of using R, basic statistical analysis, data visualization with ggplot2, and data wrangling with the tidyverse. The topics are designed to be worked through in class; each guide serves as a reference and review resource alongside the live demonstrations.

Getting Started

R and RStudio

An introduction to R and RStudio: what they are, how to install them, and how to configure the working environment. Covers the four-pane layout, installing and loading packages, essential global settings, and RStudio Projects.

Using R

First steps in R: the console, functions, assignment, and the pipe operator. Covers R scripts (creating, running, commenting, and organizing them) and reading data from CSV files into R.
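As a rough preview of these first steps, the sketch below shows assignment, a function call, a pipe, and reading a CSV file. It is illustrative only, not the workshop's own material: the variable names and values are invented, the native pipe `|>` (R 4.1 or later) stands in for whichever pipe the guide uses, and base `read.csv()` is used where the guide may prefer `readr::read_csv()`.

```r
# Assignment: store a vector of (made-up) heights under a name using `<-`
heights_cm <- c(172, 165, 180, 158)

# Functions take arguments and return a value
mean_height <- mean(heights_cm)

# The native pipe passes the left-hand side as the first argument
# of the function on the right; this is equivalent to round(sd(heights_cm), 1)
rounded_sd <- heights_cm |> sd() |> round(1)

# Reading a CSV file: write a small temporary file first so that
# the example is self-contained (the file and its contents are hypothetical)
csv_path <- tempfile(fileext = ".csv")
write.csv(
  data.frame(height = heights_cm, weight = c(70, 58, 82, 51)),
  csv_path,
  row.names = FALSE
)
body_df <- read.csv(csv_path)

# Inspect the structure of the imported data frame
str(body_df)
```

Saved in a `.R` script, these lines can be run one at a time or all at once; lines starting with `#` are comments and are ignored by R.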

Data Analysis

Statistical Analysis

An introduction to statistical data analysis in R using a dataset of height, weight, age, and gender. Covers descriptive statistics with skimr, independent-samples t-tests, correlation, linear regression (simple, multiple, and with interactions), and one-way ANOVA with post-hoc comparisons.
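The analyses listed above map onto a handful of base R functions. The sketch below runs them on simulated data, since the workshop dataset is not reproduced here: the data frame `body_df`, its values, and the `age_group` factor are assumptions made for illustration, and base `summary()` stands in for `skimr::skim()`.

```r
# Simulate a hypothetical dataset of height, weight, age, and gender
set.seed(101)
n <- 200
body_df <- data.frame(
  gender = factor(rep(c("female", "male"), each = n / 2)),
  age    = round(runif(n, 18, 65))
)
body_df$height <- rnorm(n, mean = ifelse(body_df$gender == "male", 176, 163), sd = 7)
body_df$weight <- -60 + 0.8 * body_df$height + rnorm(n, sd = 9)
body_df$age_group <- cut(body_df$age,
                         breaks = c(17, 30, 50, 65),
                         labels = c("18-30", "31-50", "51-65"))

# Descriptive statistics (base R alternative to skimr::skim())
summary(body_df)

# Independent-samples t-test (Welch by default; var.equal = TRUE gives Student's)
t_res <- t.test(height ~ gender, data = body_df)

# Pearson correlation between height and weight
cor_res <- cor.test(~ height + weight, data = body_df)

# Linear regression: simple, multiple, and with an interaction
m_simple <- lm(weight ~ height, data = body_df)
m_multi  <- lm(weight ~ height + age, data = body_df)
m_inter  <- lm(weight ~ height * gender, data = body_df)
summary(m_inter)

# One-way ANOVA with Tukey post-hoc comparisons
aov_res <- aov(weight ~ age_group, data = body_df)
tukey   <- TukeyHSD(aov_res)
tukey
```

Because the data are simulated, the printed estimates say nothing about real populations; the point is only the shape of each call.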

Data Visualization

An introduction to data visualization using ggplot2. Covers scatterplots, regression lines, colour coding by group, palette customization, histograms, and faceted plots.
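A minimal sketch of the plot types named above, assuming the ggplot2 package is installed. The data are simulated and the choice of the `"Set1"` palette is arbitrary; neither comes from the workshop itself.

```r
library(ggplot2)

# Simulated stand-in for the workshop dataset (values are hypothetical)
set.seed(101)
n <- 200
body_df <- data.frame(
  gender = factor(rep(c("female", "male"), each = n / 2))
)
body_df$height <- rnorm(n, mean = ifelse(body_df$gender == "male", 176, 163), sd = 7)
body_df$weight <- -60 + 0.8 * body_df$height + rnorm(n, sd = 9)

# Scatterplot with points coloured by group, one regression line per group,
# and a ColorBrewer palette in place of the default colours
p_scatter <- ggplot(body_df, aes(x = height, y = weight, colour = gender)) +
  geom_point(alpha = 0.6) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_colour_brewer(palette = "Set1")

# Histogram of height, split into one panel per gender
p_hist <- ggplot(body_df, aes(x = height)) +
  geom_histogram(bins = 25, colour = "white") +
  facet_wrap(~ gender)

# Printing a plot object draws it
print(p_scatter)
print(p_hist)
```

Each plot is built by adding layers with `+`, so a scatterplot becomes a scatterplot with regression lines simply by adding `geom_smooth()`.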

Data Wrangling

An introduction to data wrangling with dplyr. Covers the core verbs — select, rename, relocate, slice, filter, mutate, and summarise — and how to chain them into readable pipelines using the pipe operator.
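The verbs listed above can be chained as in the following sketch, which assumes dplyr is installed. The small `body_df` table is invented for illustration, and the native pipe `|>` is used here; the magrittr pipe `%>%` would work the same way in this example.

```r
library(dplyr)

# A small hypothetical dataset
body_df <- tibble(
  id     = 1:6,
  gender = c("female", "male", "female", "male", "female", "male"),
  height = c(163, 178, 158, 182, 170, 175),
  weight = c(58, 80, 52, 90, 65, 74),
  age    = c(25, 41, 33, 29, 52, 47)
)

wrangled <- body_df |>
  select(id, gender, height, weight) |>               # keep only these columns
  rename(height_cm = height, weight_kg = weight) |>   # new_name = old_name
  relocate(gender, .after = weight_kg) |>             # move a column
  filter(height_cm > 160) |>                          # keep rows meeting a condition
  mutate(bmi = weight_kg / (height_cm / 100)^2)       # add a computed column

# slice() selects rows by position
first_three <- wrangled |> slice(1:3)

# summarise() collapses the rows into summary statistics
height_summary <- wrangled |>
  summarise(n = n(), mean_height = mean(height_cm))

wrangled
height_summary
```

Reading the pipeline top to bottom gives the order in which the steps are applied, which is the main readability gain over nesting the same function calls.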