On this page we get more familiar with the ERA dataset and zoom in on the agroforestry data within ERA. We use a mix of general R packages designed for exploratory data analysis (EDA) and functions from the ERAg package, such as ERAComboSplit, ERAHexPlot, PrepareERA and, of course, ERAAnalyze. A combined outlier removal approach will be applied to the output of ERAAnalyze to get a better understanding of outliers. Finally, the ERAAnalyze results are assessed visually by looking at the response ratios of all agroforestry practices across a number of selected outcomes. Press “Show code” to view the R code used to perform a given analysis or produce a given visualisation.
This part of the document is where we get to the nitty-gritty of the ERA agroforestry data, so we first need to load a number of R packages: general packages for exploratory data analysis, and the ERAg package itself.
Loading general R packages
# Use pacman to install (if needed) and load the required packages
if (!require("pacman", character.only = TRUE)) {
  install.packages("pacman", dependencies = TRUE)
  library(pacman)  # attach pacman after a fresh install so p_load() is available
}
# Packages needed in general
required.packages <- c(
  "knitr", "data.table", "kableExtra",
  "DT", "tidyverse", "ggExtra",
  "lme4", "treemap", "pillar", "threadr",
  "readr", "broom.mixed", "dotwhisker",
  "skimr", "hablar", "see", "ggpubr",
  "RColorBrewer", "ruler", "rstatix",
  "corrr", "GGally", "cowplot", "ggridges",
  "report", "rcompanion", "ggbeeswarm",
  "ggfortify", "DataExplorer", "gstat",
  "superheat", "patchwork", "Boruta",
  "stringr", "dplyr", "ggplot2",
  "ggfittext", "d3treeR", "grid", "recipes"
)
# `char` already takes a character vector, so `character.only` is not needed
p_load(char = required.packages, install = TRUE)
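With a package list this long, a failed installation can be easy to miss in the console output. As a minimal sketch, assuming nothing beyond base R, the check below reports which requested packages are not available; the short `pkgs` vector and the package name `aPackageThatDoesNotExist` are hypothetical stand-ins for `required.packages`.

```r
# Hypothetical short list standing in for required.packages
pkgs <- c("stats", "utils", "aPackageThatDoesNotExist")

# requireNamespace() returns TRUE/FALSE without attaching the package
available <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that could not be found
missing.packages <- names(available)[!available]
print(missing.packages)
```

Running the same check on `required.packages` after `p_load()` should give an empty character vector if everything installed and loaded correctly.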
Loading the ERAg package
The ERAg package includes all the ERA data we need for the EDA.
#devtools::install_github("peetmate/ERAg",
# auth_token = Sys.getenv("GITHUB_PAT"), # read the token from an environment variable, never hard-code it
# build_vignettes = TRUE,
# dependencies = TRUE)
library(ERAg)
We first need to “split” the ERA data using the ERAg function ERAComboSplit. An observation in ERA can combine several practices and several products. ERAComboSplit duplicates such observations into individual rows, each containing one unique practice x product combination present in the original observation.
ERA.Compiled.Split <- ERAComboSplit(Data = ERAg::ERA.Compiled)
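To make the idea of combination splitting concrete, here is a minimal base R sketch on invented data. It is not the ERAComboSplit implementation: the column names, the "-" delimiter and the helper `split_row()` are assumptions made for illustration only.

```r
# Toy observations; column names and the "-" delimiter are hypothetical
obs <- data.frame(
  Code     = c("S1", "S2"),
  Practice = c("Alleycropping-Mulch", "Parklands"),
  Product  = c("Maize-Bean", "Sorghum"),
  stringsAsFactors = FALSE
)

# Expand one observation into every practice x product pair it contains
split_row <- function(i) {
  combos <- expand.grid(
    Practice.Combo = strsplit(obs$Practice[i], "-", fixed = TRUE)[[1]],
    Product.Combo  = strsplit(obs$Product[i], "-", fixed = TRUE)[[1]],
    stringsAsFactors = FALSE
  )
  cbind(Code = obs$Code[i], combos, stringsAsFactors = FALSE)
}

# Stack the expanded rows of all observations
obs_split <- do.call(rbind, lapply(seq_len(nrow(obs)), split_row))
print(obs_split)
```

In this toy case study S1 (two practices, two products) becomes four rows and S2 stays a single row, which is why counts on split data should be made on unique study codes rather than on rows.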
# We count the number of studies per level of the practice hierarchy (see PracticeCodes
# object for more information on the practice hierarchy)
ERA.Compiled.Split.Pracs <- ERA.Compiled.Split[,list(N.Studies=length(unique(Code))),
by=list(SubPrName.Combo,PrName.Combo,Theme.Combo)]
# Visualize with the treemap function proportions of ERA Theme
# png(filename="tree.png",width=800, height=800)
ERA.tree.theme.and.practices <- treemap::treemap(
  ERA.Compiled.Split.Pracs,
  index = c("Theme.Combo", "PrName.Combo"),
  vSize = "N.Studies",
  type = "index",
  palette = "Set3",
  border.col = c("black", "white"),
  border.lwds = c(5, 1),
  fontsize.title = 12,
  title = "Proportion of ERA theme and practices\nbased on the number of studies for each ERA theme and practice"
)
# Interactive treemap using the d3treeR package function d3tree()
ERA.tree.theme.and.practices.d3viz <- d3tree(
  ERA.tree.theme.and.practices,
  id = "name",
  celltext = "name",
  valueField = "size",
  width = 800, height = 600,
  rootname = "ERA Themes and Practices"
)
Let's explore the proportions of data in ERA for each practice under each theme, based on the number of studies. Explore the treemap interactively!
Visualising each ERA theme and practice in a treemap gives a good understanding of how the ERA data are distributed across themes and practices, based on the number of studies. We see that agroforestry, our focus, accounts for only a limited share of the total ERA data.
Let us now focus on the agroforestry data within ERA by selecting only the rows of ERA.Compiled whose Theme includes “Agroforestry”.
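The proportions shown in the treemap can also be expressed as numbers. The sketch below uses the same data.table counting idiom as above on a small invented table; the values and the `Share` column are illustrative assumptions, not ERA results.

```r
library(data.table)

# Toy stand-in for the split data: several observations per study (Code)
toy <- data.table(
  Code        = c("S1", "S1", "S2", "S3", "S3", "S4"),
  Theme.Combo = c("Agroforestry", "Agroforestry", "Nutrient Management",
                  "Nutrient Management", "Nutrient Management",
                  "Nutrient Management")
)

# Distinct studies per theme; uniqueN(x) is equivalent to length(unique(x))
theme_counts <- toy[, .(N.Studies = uniqueN(Code)), by = Theme.Combo]

# Each theme's share of the summed study counts
theme_counts[, Share := N.Studies / sum(N.Studies)]
print(theme_counts)
```

Note that a study contributing to more than one theme is counted once under each, so the shares describe study-theme pairs rather than a partition of the studies.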
af <- ERAg::ERA.Compiled[grepl("Agroforestry" , Theme)]
dim(af)
[1] 9605 144
Now we can view the subsetted agroforestry data.
rmarkdown::paged_table(af)
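Because the subset is made with grepl(), it keeps every row whose Theme contains the text "Agroforestry", not only rows where Theme equals it exactly. The toy example below illustrates the difference; the combined label "Agroforestry-Soil Management" is an assumed format used only for illustration.

```r
library(data.table)

# Invented rows; the combined theme label is a hypothetical example
toy <- data.table(
  Code  = c("S1", "S2", "S3"),
  Theme = c("Agroforestry", "Agroforestry-Soil Management",
            "Nutrient Management")
)

# Pattern match: any Theme containing "Agroforestry"
af_pattern <- toy[grepl("Agroforestry", Theme)]

# Exact match: Theme is exactly "Agroforestry"
af_exact <- toy[Theme == "Agroforestry"]

nrow(af_pattern)  # 2
nrow(af_exact)    # 1
```

Comparing the two counts on the real data would be one way to see how many agroforestry observations are recorded in combination with other themes.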