A SC3L Workshop
University of Nebraska-Lincoln
Data visualization is an important step in understanding the relationships among the variables in your dataset.
R has multiple methods of creating visualizations, but our focus will be on the ggplot2 package. This package uses the Grammar of Graphics approach, layering different building blocks to produce a graph.
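To make the layering idea concrete, here is a minimal sketch that builds a plot one building block at a time. It uses the `economics` data bundled with ggplot2; the object names `base`, `with_line`, and `finished` and the choice of the `psavert` column are illustrative assumptions, not part of the workshop material.

```r
library(ggplot2)

# Block 1: the data and the aesthetic mapping (no geometry yet, so nothing is drawn).
base <- ggplot(data = economics, mapping = aes(x = date, y = psavert))

# Block 2: a geometry that decides how the mapped values are drawn.
with_line <- base + geom_line()

# Block 3: labels and a theme, which change the appearance but not the data.
finished <- with_line +
  labs(x = 'Date', y = 'Personal savings rate (%)') +
  theme_bw()

finished
```

Each `+` adds one block to the plot object, so intermediate objects such as `base` can be reused with a different geometry.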
Data needs to be tidy before plotting: each row holds one observation and each column holds one variable (measurement).
| Trt1 | Trt2 | Rep | response |
|---|---|---|---|
| 1 | 1 | 1 | 9.37 |
| 2 | 1 | 1 | 9.04 |
| 1 | 2 | 1 | 10.84 |
| 2 | 2 | 1 | 10.94 |
| 1 | 1 | 2 | 10.00 |
| 2 | 1 | 2 | 9.63 |
| 1 | 2 | 2 | 9.62 |
| 2 | 2 | 2 | 10.00 |
| color | Fair | Good | Very Good | Premium | Ideal |
|---|---|---|---|---|---|
| E | 11156 | 4535 | 1703 | 4739 | 1799 |
| G | 5924 | NA | 2684 | 5720 | 3266 |
| H | 3862 | NA | 4527 | 5438 | 3448 |
| D | NA | 2406 | 3424 | 1809 | 5855 |
| F | NA | 4549 | 8948 | 2880 | 1780 |
| I | NA | 1952 | 4532 | 4480 | 3734 |
| J | NA | NA | NA | 6348 | 1720 |
| cut | color | price |
|---|---|---|
| Fair | E | 11156 |
| Fair | G | 5924 |
| Fair | H | 3862 |
| Good | D | 2406 |
| Good | E | 4535 |
| Good | F | 4549 |
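The two tables above appear to hold the same values, first in a wide layout (one column per cut) and then in a tidy layout (one row per cut and color). As a hedged sketch of how such a conversion might be done, the code below uses `tidyr::pivot_longer()`. The tidyr package is an assumption, since it is not named in this section, and the data frame `wide` is a hand-typed subset of the wide table limited to the first two cut columns.

```r
library(tidyr)

# Wide layout: one row per color, one column per cut (subset of the table above).
wide <- data.frame(
  color = c("E", "G", "H", "D", "F"),
  Fair  = c(11156, 5924, 3862, NA, NA),
  Good  = c(4535, NA, NA, 2406, 4549)
)

# Stack the cut columns into cut/price pairs, dropping the empty cells.
long <- pivot_longer(
  wide,
  cols = c(Fair, Good),
  names_to = "cut",
  values_to = "price",
  values_drop_na = TRUE
)

# Sort by cut, then color, and put cut first to match the tidy table.
long <- long[order(long$cut, long$color), c("cut", "color", "price")]
print(long)
```

After the conversion, `cut` can be mapped to an aesthetic such as `x` or `fill`, which is not possible while each cut lives in its own column.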
| species | island | bill_length_mm | bill_depth_mm | flipper_length_mm | body_mass_g | sex | year |
|---|---|---|---|---|---|---|---|
| Adelie | Torgersen | 39.1 | 18.7 | 181 | 3750 | male | 2007 |
| Adelie | Torgersen | 39.5 | 17.4 | 186 | 3800 | female | 2007 |
| Adelie | Torgersen | 40.3 | 18.0 | 195 | 3250 | female | 2007 |
| Adelie | Torgersen | NA | NA | NA | NA | NA | 2007 |
| Adelie | Torgersen | 36.7 | 19.3 | 193 | 3450 | female | 2007 |
| Adelie | Torgersen | 39.3 | 20.6 | 190 | 3650 | male | 2007 |

    library(ggplot2)
    library(palmerpenguins)  # provides the penguins data

    # Bill depth vs. bill length, colored and shaped by species, one panel per island
    ggplot(data = penguins, mapping = aes(x = bill_length_mm, y = bill_depth_mm, color = species)) +
      geom_point(aes(shape = species), alpha = 2/3, size = 1) +
      theme_bw() +
      labs(x = 'Bill length (mm)', y = 'Bill depth (mm)',
           title = 'Bill length vs. Bill depth', subtitle = 'By species',
           color = 'Species', shape = 'Species', caption = 'Source: palmerpenguins') +
      facet_grid(. ~ island) +
      theme(aspect.ratio = 1/2,
            legend.position = 'bottom')

Does mass differ between penguin species?
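One way to approach the question above is a boxplot of body mass by species. The sketch below is an assumption about a reasonable answer, not the workshop's own solution; the object name `mass_plot` and the label text are illustrative.

```r
library(ggplot2)
library(palmerpenguins)  # provides the penguins data

# One box per species; na.rm = TRUE silently drops the rows with missing mass.
mass_plot <- ggplot(data = penguins,
                    mapping = aes(x = species, y = body_mass_g, fill = species)) +
  geom_boxplot(na.rm = TRUE) +
  theme_bw() +
  labs(x = 'Species', y = 'Body mass (g)', title = 'Body mass by species') +
  theme(legend.position = 'none')  # the x axis already names each species

mass_plot
```

Comparing the position and overlap of the boxes gives a first visual impression of whether mass differs between species; a formal comparison would need a statistical test.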
Using the ggplot2::economics dataset, plot the median duration of unemployment over time.

    library(ggplot2)
    library(dplyr)  # provides the %>% pipe used below

    # Median unemployment duration over time, drawn as a filled area
    ggplot(data = economics, mapping = aes(x = date, y = uempmed)) +
      # geom_line(color = 'black') +
      geom_area(fill = 'skyblue', color = 'black') +
      theme_bw() +
      labs(x = '',
           y = 'Median duration of unemployment\n(in weeks)',
           title = 'Longer unemployment during Great Recession',
           subtitle = 'in the United States',
           caption = 'Source: ggplot2::economics') +
      scale_x_date(date_breaks = '5 years', date_labels = '%Y') +
      scale_y_continuous(limits = c(0, 27), expand = c(0, 0)) +
      theme(aspect.ratio = 1/2)

    # economics2000s is not defined in this section; it is assumed to be a data
    # frame with one row per year and the columns year and mean_unemploy.
    myplot <- economics2000s %>%
      ggplot(mapping = aes(x = year, y = mean_unemploy)) +
      geom_bar(stat = 'identity', width = 1,
               color = 'black', fill = 'skyblue') +
      labs(x = '', y = 'Unemployment rate (%)', title = 'Unemployment rate in the United States') +
      scale_y_continuous(limits = c(0, 5), expand = c(0, 0)) +
      scale_x_continuous(breaks = 2000:2015) +
      theme_bw() + theme(aspect.ratio = 1/2)

    # Save the stored plot to a PNG file, passing it explicitly
    ggsave('unemploy.png', plot = myplot, width = 6, dpi = 600)

Additional resources
Visit our website to schedule an appointment! https://statistics.unl.edu/sc3lhelp-desk/