olympics <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2024/2024-08-06/olympics.csv')
ggplot2
a system for creating graphics, based on the Grammar of Graphics
readr
a fast and friendly way to read rectangular data (csv, txt…)
tibble
a modern reimagining of the data frame, keeping what time has proven to be effective and throwing out what has not
stringr
provides a cohesive set of functions designed to make working with strings as easy as possible
forcats
provides a suite of useful tools that solve common problems with factors
dplyr
provides a grammar of data manipulation: a consistent set of verbs that solve the most common data manipulation challenges
tidyr
provides a set of functions that help you get to tidy data
purrr
enhances R’s functional programming (FP) toolkit by providing a complete and consistent set of tools for working with functions and vectors
Let’s work with the same data as in the carpentry lesson.
library(ggplot2)
Installing:
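A minimal sketch of the one-time setup, assuming the package is installed from CRAN; the `requireNamespace()` guard is an addition here so that the install only runs when ggplot2 is missing:

```r
# One-time setup: install ggplot2 from CRAN only if it is not available yet
if (!requireNamespace("ggplot2", quietly = TRUE)) {
  install.packages("ggplot2")
}
```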
Based on Leland Wilkinson’s “The Grammar of Graphics”:
In summary, the grammar of graphics says that a statistical plot consists of mapping data to aesthetic attributes (position, colour, shape, size…) of geometric objects (points, lines, bars).
This mapping may also include statistical transformations (logs, smooths…) and specific coordinate systems (cartesian, polar…).
We can fix the value of an aesthetic on the geometry, outside aes().
Or we can map an aesthetic to a data variable with aes().
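The difference between a fixed and a mapped aesthetic can be sketched on a tiny example. The `toy` data frame below is hypothetical (it only mimics a few columns of the olympics table), and the colour "blue" is an arbitrary choice:

```r
library(ggplot2)

# Hypothetical toy data standing in for the olympics table
toy <- data.frame(
  sport = c("Basketball", "Basketball", "Luge", "Luge"),
  age   = c(24, 29, 21, 33),
  medal = c("Gold", "Silver", "Gold", "Bronze")
)

# Fixed: colour is set outside aes(), so every point is blue and no legend is drawn
p_fixed <- ggplot(toy) +
  geom_point(aes(x = sport, y = age), color = "blue", size = 3)

# Mapped: colour is set inside aes(), so it depends on the medal column
p_mapped <- ggplot(toy) +
  geom_point(aes(x = sport, y = age, color = medal), size = 3)
```

Printing `p_fixed` and `p_mapped` side by side shows the effect: only the mapped version gets a colour legend.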
Common aesthetics:
position (x, y)
colour (color, fill, alpha)
shape (shape, linetype)
size (size, linewidth)
olympics |>
  select(sport, age, sex, medal) |>
  filter(sport %in% c("Basketball", "Luge"), !is.na(medal)) |>
  ggplot() +
  geom_point(
    # shape and color are mapped inside aes(); alpha and size are fixed outside it
    aes(x = sport, y = age, shape = sex, color = medal),
    alpha = 0.3, position = 'jitter', size = 3
  ) +
  geom_boxplot(
    aes(x = sport, y = age),
    fill = 'transparent',
    outliers = FALSE
  )
olympics |>
  select(sport, age, sex, medal) |>
  filter(sport %in% c("Basketball", "Luge"), !is.na(medal)) |>
  ggplot() +
  geom_point(
    aes(x = sport, y = age, shape = sex, color = medal),
    alpha = 0.3, position = 'jitter', size = 3
  ) +
  geom_boxplot(
    # linetype is mapped to sport; linewidth is fixed for both boxes
    aes(x = sport, y = age, linetype = sport),
    fill = 'transparent', linewidth = 2,
    outliers = FALSE
  )
olympics |>
  select(sport, age, sex, medal) |>
  filter(sport %in% c("Basketball", "Luge"), !is.na(medal)) |>
  ggplot() +
  geom_point(
    aes(x = sport, y = age, shape = sex, color = medal),
    alpha = 0.3, position = 'jitter', size = 3
  ) +
  geom_boxplot(
    aes(x = sport, y = age),
    fill = 'transparent',
    outliers = FALSE
  ) +
  # one panel (column) per sex
  facet_grid(cols = vars(sex))
olympics |>
  select(sport, age, sex, medal) |>
  filter(sport %in% c("Basketball", "Luge"), !is.na(medal)) |>
  ggplot() +
  geom_point(
    aes(x = sport, y = age, shape = sex, color = medal),
    alpha = 0.3, position = 'jitter', size = 3
  ) +
  geom_boxplot(
    aes(x = sport, y = age),
    fill = 'transparent',
    outliers = FALSE
  ) +
  # a grid of panels: one column per sex, one row per medal
  facet_grid(cols = vars(sex), rows = vars(medal))
Statistical transformations can happen automatically when data are mapped to some geometries. We have already seen this: geom_boxplot() computes the median and quartiles it draws.
But sometimes we need or want to add a transformation layer ourselves.
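One way to add such a layer explicitly is `stat_summary()`. The sketch below is only an illustration: the `toy` data frame is hypothetical, and the choice of the mean as summary and of a red point to draw it are assumptions, not part of the original example.

```r
library(ggplot2)

# Hypothetical toy data: three athletes per sport
toy <- data.frame(
  sport = rep(c("Basketball", "Luge"), each = 3),
  age   = c(22, 25, 31, 19, 24, 32)
)

p <- ggplot(toy, aes(x = sport, y = age)) +
  # raw observations
  geom_point(alpha = 0.3, size = 3) +
  # explicit statistical layer: collapse each sport to its mean age
  stat_summary(fun = mean, geom = "point", color = "red", size = 5)
```

The second layer contains one row per sport rather than one per athlete, which can be checked with `layer_data(p, 2)`.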
Scales apply to every mapped aesthetic: not only position (x, y…), but also colors, shapes, sizes…
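A small sketch of a position scale and a colour scale used together. The `toy` data, the axis title, the limits and the colour values are all illustrative assumptions:

```r
library(ggplot2)

# Hypothetical toy data
toy <- data.frame(
  sport = rep(c("Basketball", "Luge"), each = 3),
  age   = c(22, 25, 31, 19, 24, 32)
)

p <- ggplot(toy, aes(x = sport, y = age, color = sport)) +
  geom_point(size = 3) +
  # position scale: axis title, limits and breaks of the y axis
  scale_y_continuous(
    name = "Age (years)",
    limits = c(10, 50),
    breaks = seq(10, 50, by = 10)
  ) +
  # colour scale: one named colour per level of sport
  scale_color_manual(values = c(Basketball = "darkorange2", Luge = "steelblue"))
```

Naming the colour values ties each colour to a level explicitly, so the result does not depend on the order of the levels.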
olympics |>
  select(sport, age, sex, medal) |>
  filter(sport %in% c("Basketball", "Luge"), !is.na(medal)) |>
  ggplot() +
  geom_point(
    aes(x = sport, y = age, shape = sex, color = medal),
    alpha = 0.3, position = 'jitter', size = 8
  ) +
  geom_boxplot(
    aes(x = sport, y = age),
    fill = 'transparent',
    outliers = FALSE
  ) +
  # values follow the alphabetical order of the levels: Bronze, Gold, Silver
  scale_colour_manual(values = c("darkorange2", "gold", "gray50")) +
  scale_shape_manual(values = c("♀", "♂"))
ggplot2 ships with predefined themes:
olympics |>
  select(sport, age, sex, medal) |>
  filter(sport %in% c("Basketball", "Luge"), !is.na(medal)) |>
  ggplot() +
  geom_point(
    aes(x = sport, y = age, shape = sex, color = medal),
    alpha = 0.3, position = 'jitter', size = 8
  ) +
  geom_boxplot(
    aes(x = sport, y = age),
    fill = 'transparent',
    outliers = FALSE
  ) +
  scale_colour_manual(values = c("darkorange2", "gold", "gray50")) +
  scale_shape_manual(values = c("♀", "♂")) +
  theme_minimal()
But every element of a theme can be modified with theme().
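As a hedged sketch of what such a modification can look like: the example below starts from `theme_minimal()` and overrides three elements. The `toy` data and the particular elements chosen (legend position, axis titles, minor grid lines) are illustrative assumptions.

```r
library(ggplot2)

# Hypothetical toy data
toy <- data.frame(
  sport = rep(c("Basketball", "Luge"), each = 3),
  age   = c(22, 25, 31, 19, 24, 32)
)

# Start from a predefined theme, then override individual elements
custom_theme <- theme_minimal() +
  theme(
    legend.position  = "bottom",                  # move the legend below the plot
    axis.title       = element_text(face = "bold"), # bold axis titles
    panel.grid.minor = element_blank()            # drop the minor grid lines
  )

p <- ggplot(toy, aes(x = sport, y = age, color = sport)) +
  geom_point(size = 3) +
  custom_theme
```

Storing the theme in its own object makes it easy to reuse the same look across several plots.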
25-29 Nov 2024