Introduction to R and RStudio

Adapted from Software Carpentry

Overview

Today we will learn to:

  • Describe the purpose and use of each pane in RStudio
  • Locate buttons and options in RStudio
  • Define a variable and assign data to a variable
  • Manage a workspace in an interactive R session
  • Use mathematical and comparison operators
  • Call functions
  • Manage packages

Questions

  • How do you find your way around RStudio?
  • How do you interact with R?
  • How do you manage your environment?
  • How do you install packages?

Getting Started

RStudio is a free, open-source IDE that makes R easier to use. It provides an editor, integrated console, and project management tools.

RStudio Layout

When RStudio opens, you’ll see:

  • Console (left): Interactive R session where code runs
  • Environment/History (upper right): Variables and command history
  • Files/Plots/Packages/Help (lower right): File browser, plots, installed packages, and help pages

When you open an R script (.R file), an editor pane appears at the top left and the console moves to the bottom left.

Workflow

You can work in two ways:

  1. Interactive: Type code in the console, save useful lines to a .R file
  2. Script-based: Write code in a .R file, use Ctrl+Return to run lines

Tip: Running Code in RStudio

To run the current line: click Run or press Ctrl+Return (Windows/Linux) or ⌘+Return (Mac).

To run a code block: select it and click Run.

Using R as a Calculator

R can perform arithmetic:

1 + 100
[1] 101
3 + 5 * 2
[1] 13
(3 + 5) * 2
[1] 16

The [1] at the start of each output line is the index of the first element shown on that line. Every result here is a single value, so the index is always [1].
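
Beyond the operators used above, R also has operators for exponents, remainders, and integer division. The following is a small sketch; the variable names are only illustrative.

```r
# Exponentiation
power <- 2^3
power
# [1] 8

# Remainder (modulo) and integer division
remainder <- 7 %% 2
remainder
# [1] 1
quotient <- 7 %/% 2
quotient
# [1] 3

# ^ binds more tightly than the unary minus, so this is -(2^2)
neg_square <- -2^2
neg_square
# [1] -4
```

When in doubt about the order of operations, add parentheses to make the intent explicit.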

Scientific Notation

R prints very large or very small numbers in scientific notation, and you can type numbers the same way (e.g., 2e-4 means \(2 \times 10^{-4}\)):

2 / 10000
[1] 2e-04
5e3
[1] 5000
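
Scientific notation changes only how a number is written, not its value. As a brief illustration (the variable names are made up for this example), the comparison below confirms that, and format() shows one way to get the fixed-notation text:

```r
# The two spellings denote the same number.
same_value <- 5e3 == 5000
same_value
# [1] TRUE

# format() with scientific = FALSE returns the number as fixed-notation text.
fixed_text <- format(2e-4, scientific = FALSE)
fixed_text
# [1] "0.0002"
```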

Mathematical Functions

To call a function, type its name followed by its arguments (inputs) in parentheses. Note that log() computes the natural logarithm:

sin(1)
[1] 0.841471
log(1)
[1] 0
log10(10)
[1] 1
exp(0.5)
[1] 1.648721

Use ?function_name to view help for any function.
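
As a small sketch of how the help system can be used, the example below looks up log() and then uses one of the arguments described on its help page. The variable name log2_of_8 is only illustrative.

```r
# Two equivalent ways to open the help page for log().
# They are shown as comments because they open a viewer instead of returning a value.
#   ?log
#   help("log")

# args() prints a function's argument names and default values in the console.
args(log)

# The help page documents a base argument; here it is set by name.
log2_of_8 <- log(8, base = 2)
log2_of_8
# [1] 3
```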

Comparisons

Comparison operators:

1 == 1  # equal
[1] TRUE
1 != 2  # not equal
[1] TRUE
1 < 2   # less than
[1] TRUE
1 <= 1  # less than or equal
[1] TRUE
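
Warning: Comparing Numbers

Avoid using == to compare decimal (floating-point) numbers, because tiny rounding errors can make values that should be equal differ. Use all.equal() instead, wrapped in isTRUE() when you need a single TRUE or FALSE.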
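
A classic illustration of the problem, sketched with illustrative variable names: 0.1 + 0.2 is not stored as exactly 0.3, so == reports a difference that all.equal() tolerates.

```r
# Exact comparison fails because of floating-point rounding.
exact_match <- (0.1 + 0.2) == 0.3
exact_match
# [1] FALSE

# all.equal() compares within a small tolerance. It returns TRUE or a
# description of the difference, so isTRUE() is used to get a plain logical.
near_match <- isTRUE(all.equal(0.1 + 0.2, 0.3))
near_match
# [1] TRUE
```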

Variables and Assignment

Assign values to variables using <-:

x <- 1 / 40
x
[1] 0.025
log(x)
[1] -3.688879

Variables appear in the Environment tab. Reassign at any time:

x <- 100
x <- x + 1
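
To make the reassignment above concrete, here is the same sequence with the values printed. The right-hand side is evaluated first, using the current value of x, and the result is then stored back in x.

```r
x <- 100
x <- x + 1   # uses the old value (100), then overwrites x
x
# [1] 101

# Variables can also be used to define other variables.
y <- x * 2
y
# [1] 202
```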

Variable Naming Rules

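Variable names can contain letters, numbers, underscores, and periods. They must start with a letter or a period (a leading period cannot be followed by a number) and cannot contain spaces. Common conventions include periods.between.words, underscores_between_words, and camelCase. Whichever you choose, be consistent.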

Challenge 1: Valid Variable Names

Which of the following are valid R variable names?

min_height
max.height
_age
.mass
MaxLength
min-length
2widths
celsius2kelvin

Valid: min_height, max.height, .mass, MaxLength, celsius2kelvin

Invalid: _age (starts with underscore), min-length (hyphen not allowed), 2widths (starts with number)
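
One way to check answers like these in R itself is sketched below. It relies on make.names(), which rewrites a string into a syntactically valid name, so a string that comes back unchanged was already valid. The variable names candidates and is_valid are only illustrative.

```r
candidates <- c("min_height", "max.height", "_age", ".mass",
                "MaxLength", "min-length", "2widths", "celsius2kelvin")

# TRUE where the string is already a syntactically valid name.
is_valid <- candidates == make.names(candidates)

# The names that passed the check (the five listed as valid above).
valid_names <- candidates[is_valid]
valid_names

# The names that failed the check.
invalid_names <- candidates[!is_valid]
invalid_names
```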

Vectorization

R is vectorized, meaning that an operation on a vector is applied to every element:

1:5
[1] 1 2 3 4 5
2^(1:5)
[1]  2  4  8 16 32
x <- 1:5
2^x
[1]  2  4  8 16 32
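
The same element-by-element behavior holds for other arithmetic and for comparisons. A short sketch, with illustrative variable names:

```r
x <- 1:5

# Arithmetic with a single number is applied to each element.
scaled <- x * 10
scaled
# [1] 10 20 30 40 50

# Arithmetic between two vectors of equal length pairs up the elements.
doubled <- x + x
doubled
# [1]  2  4  6  8 10

# Comparisons return one TRUE or FALSE per element.
above_two <- x > 2
above_two
# [1] FALSE FALSE  TRUE  TRUE  TRUE
```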

Managing Your Environment

List all variables with ls():

x <- 1:5
y <- 10
ls()
[1] "x" "y"

Delete variables with rm():

rm(x)

To delete all variables: rm(list = ls())

Tip

When you pass an argument to a function by name, use = (not <-), as in rm(list = ls()). Using <- inside a call performs an assignment instead of naming the argument.
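
The difference can be seen with a harmless function such as round(), used here only as an illustration. With =, the value is matched to the digits argument. With <-, R first creates a variable called digits in your workspace and then passes its value by position.

```r
# Naming the argument with = does not create any variable.
rounded_named <- round(3.14159, digits = 2)
had_digits_before <- "digits" %in% ls()
had_digits_before
# [1] FALSE   (assuming no variable called digits existed already)

# Using <- gives the same result here, but as a side effect
# it leaves a variable called digits behind.
rounded_arrow <- round(3.14159, digits <- 2)
has_digits_after <- "digits" %in% ls()
has_digits_after
# [1] TRUE

rounded_named
# [1] 3.14
```

The stray variable can be removed afterwards with rm(digits).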

Challenge 2: Variable Values

What are the values of mass and age after each statement?

mass <- 47.5
age <- 122
mass <- mass * 2.3
age <- age - 20
mass
[1] 109.25
age
[1] 102

mass is 109.25 (47.5 × 2.3) and age is 102 (122 − 20).

Challenge 3: Comparisons

Is mass larger than age?

mass <- 47.5
age <- 122
mass <- mass * 2.3
age <- age - 20

mass > age
[1] TRUE

Yes, 109.25 > 102.

Challenge 4: Clean Up

Delete the mass and age variables.

rm(mass)
rm(age)
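
rm() also accepts several names in one call. A minimal sketch that removes both variables at once and then checks that they are gone (still_present is an illustrative name):

```r
mass <- 47.5
age <- 122

# Remove both variables in a single call.
rm(mass, age)

# Neither name should appear in the workspace any more.
still_present <- c("mass", "age") %in% ls()
still_present
# [1] FALSE FALSE
```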

Packages

Extend R with packages. Common package commands:

installed.packages()     # list installed packages
install.packages("name") # install a package
update.packages()        # update packages
library(name)            # load a package
detach(package:name)     # unload a package

Challenge 5: Install Packages

Install these packages: ggplot2, plyr, gapminder

install.packages(c("ggplot2", "plyr", "gapminder"))

Use the command above or install individually with install.packages("name").
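
If some of these packages may already be installed, one possible approach is to install only the missing ones. The sketch below assumes that comparing against installed.packages() is good enough for this purpose; missing_packages() is a hypothetical helper written for this example, not a function that ships with R.

```r
wanted <- c("ggplot2", "plyr", "gapminder")

# Hypothetical helper: return the names in pkgs that are not installed yet.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(wanted)
to_install

# Uncomment to install only what is missing (requires an internet connection):
# if (length(to_install) > 0) install.packages(to_install)
```

The install step is left commented out so that running the example never downloads anything unexpectedly.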

Key Points

  • Use RStudio to write and run R programs.
  • R has the usual arithmetic operators and mathematical functions.
  • Use <- to assign values to variables.
  • Use ls() to list the variables in a program.
  • Use rm() to delete objects in a program.
  • Use install.packages() to install packages (libraries).