Application: Elementome Trajectory Analysis

Miquel De Cáceres & Javier de la Casa

EMF-CREAF

Outline

  1. Introduction
  2. The dataset for today
  3. Defining the trajectories
  4. Calculation of trajectory metrics
  5. Comparing trajectories
  6. Hearing the trajectories
  7. Use your own data

Lake Caldeirao, Corvo, Azores

1. Introduction

Elementome Trajectory Analysis

What is Elementome Trajectory Analysis?

Elementome Trajectory analysis is the use of Ecological Trajectory Analysis in a multivariate spece defined by the elemental composition of an item through time

What is an elementome?

  • In 2019, Peñuelas et al., defined elementome to describe the elemental composition that characterize an organism, linking it to their physiological requirements to carry out its functions

  • The concept has been extended to characterize the change in the elemental composition of ecosystem through time: ecosystem elementome shifts

Studying ecosystem elementomes

  • Today you will use elemental measurements from paleoecological records to reconstruct changes in the ecosystem elementome

  • This data is published in de la Casa et al. 2025. Unveiling two millennia of ecosystem changes in the Azores through elementome trajectory analysis

2. The dataset for today

The dataset

For this exercise we are going to move to the Azores, where we will study sediment cores from two shallow lakes

Loading the data

So lets start with R - Load packages and the database

library(readr)
library(dplyr)
library(ecotraj)
library(ggplot2)
library(ggrepel)

ETA_azores_example <- read_delim("../exercises/StudentCSVdata/ETA_azores_example.csv",
                                 delim = ";", escape_double = FALSE, trim_ws = TRUE)

And we inspect inspect its structure:

ETA_azores_example
# A tibble: 181 × 21
   record_ID island archipelago   lat   log elevation first_arrival_BP
   <chr>     <chr>  <chr>       <dbl> <dbl>     <dbl>            <dbl>
 1 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 2 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 3 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 4 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 5 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 6 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 7 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 8 caldeirao Corvo  Azores       39.7 -31.1       400             1100
 9 caldeirao Corvo  Azores       39.7 -31.1       400             1100
10 caldeirao Corvo  Azores       39.7 -31.1       400             1100
# ℹ 171 more rows
# ℹ 14 more variables: XV_arrival <dbl>, DEPTH_numeric <dbl>, TIME_BP <dbl>,
#   K <dbl>, Si <dbl>, Ca <dbl>, S <dbl>, Ti <dbl>, Mn <dbl>, Fe <dbl>,
#   Sr <dbl>, Zr <dbl>, C <dbl>, N <dbl>

What are the variables in the dataset?

The elemental composition (C, N, Ca, Fe, Fe, K, Mn, Si, Ti, Sr, Zr) from chronologically dated sediment samples of the two lakes

  • C and N are measured with spectrometer, the rest with X-Ray fluorescence.
  • We will transform the data so it is comparable (first root square and then min-max transformation)

Data transformation

  • record_ID identifies eack lake
  • TIME_BP is the calibrated age in years before present (cal. years BP)
element_list <- c("C","N", "Ca",  "Fe", "K",
                  "Mn",  "S", "Si", "Ti",  "Sr",  "Zr")


azores <- ETA_azores_example %>%
  group_by(record_ID) %>%
  # Apply square root transformation
  mutate(across(all_of(element_list), ~ sqrt(.), .names = "{.col}")) %>%
  # Min-max transformation (between 0 and 1)
  mutate(across(all_of(element_list), ~ (. - min(.)) / (max(.) - min(.)), .names = "{.col}")) %>%
  # Remove duplicates
  distinct(TIME_BP, .keep_all = TRUE) %>%
  ungroup()

Note

Now all elements are between 0 and 1. Thus, we work with relative quantifications.

3. Defining the trajectories

Common vs. lake-individual spaces

We have to choose now

  • We can define a common multivariate space for the two sites. This is better to make comparisons between sites
  • Generate independent multivariate spaces for each site. This could be useful if the elements available in both sites are very different
  • Here, we will use the first approach :)

What do we need?

To define trajectories you need:

  • Distance matrix: defined using the elemental composition of each lake sedimentary record

  • Sites: This will define how many independent trajectories are

  • Surveys: This is the steps within the trajectories. Here surveys = time measurements

Building the distance matrix

we are going to use use euclidean distances (thus our ordination analysis will be similar to a PCA). This differ depending on the data you are working with, Give it a thought!

Euclidean distance assumes:

  • No constant-sum constraint (elemental data is often considered compositional)

  • No relative dependence

  • Proper variable scaling

azores <- azores[, c("record_ID","TIME_BP",all_of(element_list))]

#this is the distance matrix
distance_azores<-dist(azores[, -c(1:2)]) #eliminate non-element columns
distance_azores <- as.matrix(distance_azores)

Defining the site vector

Each of the lakes is considered a site here.

site_azores <- azores[, 1]
site_azores <- site_azores[[1]] #Make sure it is a vector

Defining the survey vector

Time here is years before present. ecotraj will understand older ages (bigger numbers) as the most recent. This could be changed adding a minus but it can drives problems in the calculation of the metrics

survey_azores <- azores[, 2]

survey_azores <- survey_azores[[1]]  #should be a vector

Let’s create the trajectories:

We call function defineTrajectories() with the three elements:

traj_azores <- defineTrajectories(distance_azores, site_azores, times = survey_azores)

Let’s inspect the trajectory metadata:

head(traj_azores$metadata, 20)
       sites surveys  times
1  caldeirao       1 -35.56
2  caldeirao       2  -5.24
3  caldeirao       3  24.54
4  caldeirao       4  54.86
5  caldeirao       5  84.64
6  caldeirao       6 114.96
7  caldeirao       7 144.74
8  caldeirao       8 175.06
9  caldeirao       9 204.84
10 caldeirao      10 235.16
11 caldeirao      11 265.49
12 caldeirao      12 295.26
13 caldeirao      13 325.59
14 caldeirao      14 355.36
15 caldeirao      15 385.69
16 caldeirao      16 422.16
17 caldeirao      17 466.65
18 caldeirao      18 510.34
19 caldeirao      19 554.83
20 caldeirao      20 599.32

Let’s plot the trajectories

traj_azores <- defineTrajectories(distance_azores, site_azores, times= survey_azores)


trajectoryPCoA(traj_azores , traj.colors = c("yellow", "green"), lwd = 2,
               survey.labels = F)
legend("topright", col=c("yellow", "green"),
       legend=c("Caldeirao", "Empadadas"), bty="n", lty=1, lwd = 2)

To better interpret the trajectories, we can plot the position of the elements (weight/loadings) in the multivariate space.

This in not yet possible in ecotraj, we will use the prcomp() function from base

Plotting the elements in the plot

pca_result <- prcomp(azores[, -c(1:2)], center = TRUE, scale. = FALSE)
pca_scores <- as.data.frame(pca_result$x)
pca_scores$age <- azores$TIME_BP
pca_scores$group <- azores$record_ID
loadings <- as.data.frame(pca_result$rotation)
loadings$element <- rownames(loadings)

arrow_scale <- 3
loadings <- loadings %>%
  mutate(PC1 = PC1 * arrow_scale,
         PC2 = PC2 * arrow_scale)

Plotting the elements in the plot

ggplot() +
  # Trajectory path
  geom_path(data = pca_scores, aes(x = PC1, y = PC2, group = 1, color = group), linewidth = 1, alpha=0.2) +
  # Loadings arrows
  geom_segment(data = loadings,
               aes(x = 0, y = 0, xend = PC1, yend = PC2),
               arrow = arrow(length = unit(0.3, "cm")),
               color = "black", linewidth = 0.3) +
  # Labels for elements
  geom_text_repel(data = loadings,
                  aes(x = PC1, y = PC2, label = element),
                  color = "black", size = 3) +
  # Points
  geom_point(data = pca_scores, aes(x = PC1, y = PC2, color = group), size = 3, alpha=0.2) +

  theme_bw() +
  labs(title = "PCA Trajectory with Element Loadings",
       x = "PC1",
       y = "PC2",
       color = "Age Segment")

Plotting the elements in the plot

Plotting sites individually

We can plot each trajectory individually as well using subsetTrajectories

traj_caldeirao <- subsetTrajectories(traj_azores, site_selection = "caldeirao")
trajectoryPCoA(traj_caldeirao)

traj_empadadas <- subsetTrajectories(traj_azores, site_selection = "empadadas")
trajectoryPCoA(traj_empadadas)

Plotting specific moments of time

traj_azores_ModernEra<- subsetTrajectories(traj_azores,  window_selection = c(0,500)) # remember the minus

trajectoryPCoA(traj_azores_ModernEra , traj.colors = c("yellow", "green"), lwd = 2,
               survey.labels = F)
legend("topright", col=c("yellow", "green"),
       legend=c("Calderiao", "Empadadas"), bty="n", lty=1, lwd = 2)

Plotting smoothed trajectories

traj_azores_ModernEra_smooth<- smoothTrajectories(traj_azores_ModernEra) 

trajectoryPCoA(traj_azores_ModernEra_smooth , traj.colors = c("yellow", "green"), lwd = 2,
               survey.labels = F)
legend("topright", col=c("yellow", "green"),
       legend=c("Calderiao", "Empadadas"), bty="n", lty=1, lwd = 2)

4. Calculation of trajectory metrics

Overall trajectory metrics

The function trajectoryMetrics() calculates different metrics for all the trajectory

metrics_azores_allperiod <- trajectoryMetrics(traj_azores)

metrics_azores_allperiod
  trajectory   n   t_start     t_end  duration   length mean_speed mean_angle
1  caldeirao  65 -35.56000 2000.9700 2036.5300 31.23727 0.01533848  109.55773
2  empadadas 116 -60.82425  592.7947  653.6189 33.56347 0.05135022   96.00328
  directionality internal_ss internal_variance
1      0.3416622    30.07345         0.4698976
2      0.3840902    75.97777         0.6606763

Calculating metrics across the trajectory

  • We may be interested in how this metric changes through time.

  • We need to use moving windows, lets do a 500 years one (for each survey, it generates a moving window and calculate the metrics within)

metrics_azores<- trajectoryWindowMetrics(traj_azores, 500, type = "times", add = TRUE)

metrics_azores <- metrics_azores%>% mutate(
  meantime = ((t_start + t_end)/2))

Choosing the right moving windows

Caldeirao record spans 2035 years, while Empadadas only 660. - we should use windows spanning proportional spans, lets say 15%

metrics_caldeirao<- trajectoryWindowMetrics(traj_caldeirao, 305, type = "times", add = TRUE)
metrics_caldeirao <- metrics_caldeirao%>% mutate(
  meantime = ((t_start + t_end)/2))

metrics_empadadas<- trajectoryWindowMetrics(traj_empadadas, 99, type = "times", add = TRUE)
metrics_empadadas <- metrics_empadadas%>% mutate(
  meantime = ((t_start + t_end)/2))

Adding more layers:

The Azores had been subject to two waves of human arrival (Raposeiro et al., 2021). The arrival of Portuguese around 1450 CE (500 years BP), and potential earlier settlements 600 years before (1100 years BP)

  • GREEN for prehuman conditions

  • BLUE for first settlers

  • RED for Portuguese colonization

How to interpret the metrics

Internal variance (elemental turnover)

ggplot(metrics_caldeirao, aes(x = meantime, y = internal_variance)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 2000, xmax = 1100,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "green") +
  # Blue: first settlers (TIME_BP ~1100)
  annotate("rect", xmin = 1100, xmax = 498,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 498, xmax = -50,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)",
       y = "Internal Variance",
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

ggplot(metrics_empadadas, aes(x = meantime, y = internal_variance)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 600, xmax = 518,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 518, xmax = -50,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)",
       y = "Internal Variance",
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

Directionality

ggplot(metrics_caldeirao, aes(x = meantime, y = directionality)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 2000, xmax = 1100,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "green") +
  # Blue: first settlers (TIME_BP ~1100)
  annotate("rect", xmin = 1100, xmax = 498,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 498, xmax = -50,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)",
       y = "Directionality",
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

ggplot(metrics_empadadas, aes(x = meantime, y = directionality)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 600, xmax = 518,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 518, xmax = -50,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)",
       y = "Directionality",
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

Speed

ggplot(metrics_caldeirao, aes(x = meantime, y = mean_speed)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 2000, xmax = 1100,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "green") +
  # Blue: first settlers (TIME_BP ~1100)
  annotate("rect", xmin = 1100, xmax = 498,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 498, xmax = -50,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)",
       y = "Speed",
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

ggplot(metrics_empadadas, aes(x = meantime, y = mean_speed)) +
  geom_line(linewidth = 1) +
  # Green: no humans
  annotate("rect", xmin = 600, xmax = 518,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "blue") +
  # Red: European (TIME_BP ~450)
  annotate("rect", xmin = 518, xmax = -50,
           ymin = -Inf, ymax = Inf, alpha = 0.2, fill = "red") +
  labs(x = "Time (years BP)",
       y = "Speed",
       color = "Trajectory") +
  theme_minimal() +
  scale_x_reverse()

5. Comparing trajectories

Estimating convergence/divergence of trajectories

Now we are going to calculate metrics of the divergence/convergence of the trajectories. For that we are going to subset our trajectories for the period that we have data in both, the modern era that we subsetted before

We are going to use the Relative Trajectory Movement Assessment (RTMA) (Djeghri et al., under review) using the ecotraj trajectoryRMA . . .

trajectoryRMA(traj_azores_ModernEra)

It gives error because the number of surveys does not match

Transforming trajectories

We are going to interpolate the trajectories so they have the same number of surveys

traj_azores_mod_inter <- interpolateTrajectories(traj_azores_ModernEra, times = c(50, 100, 150, 200, 250, 300, 350, 400, 450))

trajectoryPCoA(traj_azores_mod_inter , traj.colors = c("yellow", "green"), lwd = 2, survey.labels = F)
legend("topright", col=c("yellow", "green"), legend=c("Caldeirao", "Empadadas"), bty="n", lty=1, lwd = 2)

RMA analysis

trajRMA<- trajectoryRMA(traj_azores_mod_inter)

trajRMA
$dynamic_relationships
          caldeirao                      empadadas                     
caldeirao NA                             "weak convergence (symmetric)"
empadadas "weak convergence (symmetric)" NA                            

$symmetric_convergence
$symmetric_convergence$tau
           caldeirao  empadadas
caldeirao         NA -0.6666667
empadadas -0.6666667         NA

$symmetric_convergence$p.value
           caldeirao  empadadas
caldeirao         NA 0.01266534
empadadas 0.01266534         NA


$asymmetric_convergence
$asymmetric_convergence$tau
           caldeirao  empadadas
caldeirao         NA -0.4444444
empadadas -0.5555556         NA

$asymmetric_convergence$p.value
          caldeirao empadadas
caldeirao        NA 0.1194389
empadadas 0.0446153        NA


$correspondence
          caldeirao empadadas
caldeirao        NA  1.913445
empadadas     0.077        NA

$parameters
          alpha corrected_alpha           nperm 
     0.05000000      0.01274146    999.00000000 

$dynamic_relationships_taxonomy
                                       dynamic_relationship    conv_div_group
neutral (symmetric)                     neutral (symmetric)              <NA>
parallel (symmetric)                   parallel (symmetric)              <NA>
antiparallel (symmetric)           antiparallel (symmetric)              <NA>
convergence (symmetric)             convergence (symmetric) convergence group
weak convergence (symmetric)   weak convergence (symmetric) convergence group
divergence (symmetric)               divergence (symmetric)  divergence group
weak divergence (symmetric)     weak divergence (symmetric)  divergence group
weak approaching (approacher) weak approaching (approacher) convergence group
weak approaching (target)         weak approaching (target) convergence group
weak departing (departer)         weak departing (departer)  divergence group
weak departing (origin)             weak departing (origin)  divergence group
approaching (approacher)           approaching (approacher) convergence group
approaching (target)                   approaching (target) convergence group
departing (departer)                   departing (departer)  divergence group
departing (origin)                       departing (origin)  divergence group
catch-up (leader)                         catch-up (leader) convergence group
catch-up (follower)                     catch-up (follower) convergence group
pursuit (leader)                           pursuit (leader)              <NA>
pursuit (follower)                       pursuit (follower)              <NA>
escape (leader)                             escape (leader)  divergence group
escape (follower)                         escape (follower)  divergence group
                                      oriented_group
neutral (symmetric)                             <NA>
parallel (symmetric)                            <NA>
antiparallel (symmetric)                        <NA>
convergence (symmetric)                         <NA>
weak convergence (symmetric)                    <NA>
divergence (symmetric)                          <NA>
weak divergence (symmetric)                     <NA>
weak approaching (approacher)  oriented group (back)
weak approaching (target)     oriented group (front)
weak departing (departer)     oriented group (front)
weak departing (origin)        oriented group (back)
approaching (approacher)       oriented group (back)
approaching (target)          oriented group (front)
departing (departer)          oriented group (front)
departing (origin)             oriented group (back)
catch-up (leader)             oriented group (front)
catch-up (follower)            oriented group (back)
pursuit (leader)              oriented group (front)
pursuit (follower)             oriented group (back)
escape (leader)               oriented group (front)
escape (follower)              oriented group (back)

attr(,"class")
[1] "RTMA" "list"

Plot it

trajectoryRMAPlot(trajRMA)

6. Hearing the trajectories

soundtraj

soundtraj is a framework developed to produce sound out of multivariate trajectories

We hope that you have now a better understanding of Ecological Trajectory Analysis and motivation to use it as an analytically tool.