library(haven) # For working with SPSS datafiles
library(sjlabelled) # To remove labels from SPSS datafiles
library(tidyverse) # For so much
Case Study: Linking ISSP survey data
The ISSP data
The International Social Survey Programme (ISSP) is a cross-national programme conducting annual surveys on diverse topics like:
- the role of government 🏛,
- inequality 💰,
- work orientations 👷,
- the environment 🏞,
- or national identity 🇫🇷 🇿🇦 🇮🇳.
The ISSP was established in 1984 by the founding members Australia, Germany, Great Britain, and the US. Currently, the ISSP has 44 member states. Since its foundation, over one million respondents have participated in the surveys of the ISSP. All datasets are publicly available and free of charge.
For this case study, we are working with the cumulation of the ISSP's environment module, which integrates the four existing survey rounds on the environment (1993, 2000, 2010, and 2020). Let's assume we are interested in how long-term climate change patterns affect environmental attitudes at the country level. We first show a "manual" approach to retrieving and processing the EO indicator before introducing an easy alternative with the gxc-package.
Setup
For loading and wrangling the data, we need a few packages. Remember to install them (install.packages()) before loading them with library().
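A one-time installation could look like this (a minimal sketch; extend the vector with whatever is missing on your machine):
# Install the packages used in this section (only needed once)
install.packages(c("haven", "sjlabelled", "tidyverse"))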
After downloading the data file from the website, place it in your project folder. We load the SPSS file and directly remove all labels.
issp <- haven::read_spss("./data/issp_env/ZA8793_v1-0-0.sav") |>
  sjlabelled::remove_all_labels()
Let's clean the dataset. We want to investigate the relationship between the experience of extreme heat and climate change concern. We subset the data file to the variables we need and rename them directly: the survey year, the respondent's country of residence, and the climate change concern item (v42).
issp <- issp |>
  select(
    year,
    country,
    concern = v42
  )
You can check the codebook of the dataset to find the country names behind the numeric values. Let's label them. We will also combine responses from Northern Ireland and Great Britain into "United Kingdom" and store the result in a new variable.
issp <- issp |>
  mutate(
    country = factor(country,
                     levels = c(36, 40, 100, 124, 152, 158, 191, 203,
                                208, 246, 250, 276, 348, 352, 372, 376,
                                380, 392, 410, 428, 440, 484, 528, 554,
                                578, 608, 620, 643, 703, 705, 710, 724,
                                752, 756, 826, 840, 82602),
                     labels = c("Australia", "Austria", "Bulgaria", "Canada",
                                "Chile", "Taiwan", "Croatia", "Czechia",
                                "Denmark", "Finland", "France", "Germany",
                                "Hungary", "Iceland", "Ireland", "Israel",
                                "Italy", "Japan", "South Korea", "Latvia",
                                "Lithuania", "Mexico", "Netherlands",
                                "New Zealand", "Norway", "Philippines",
                                "Portugal", "Russia", "Slovakia",
                                "Slovenia", "South Africa", "Spain",
                                "Sweden", "Switzerland", "Great Britain",
                                "USA", "Northern Ireland"))
  )
issp <- issp |>
  mutate(
    country_new = case_when(
      country == "Great Britain" | country == "Northern Ireland" ~ "United Kingdom",
      TRUE ~ as.character(country)
    ),
    .after = country
  )
Let’s also reverse the concern-scale so that higher values indicate higher concern.
issp <- issp |>
  mutate(
concern = case_match(concern,
1 ~ 5,
2 ~ 4,
3 ~ 3,
4 ~ 2,
5 ~ 1)
)
The dataset is ready for linking with our temperature data.
head(issp)
year country country_new concern
1 1993 Australia Australia 4
2 1993 Australia Australia 4
3 1993 Australia Australia 5
4 1993 Australia Australia 3
5 1993 Australia Australia 3
6 1993 Australia Australia 3
Before we do that, let’s explore the distribution of climate concern across countries for the 2020 wave.
# Define Likert-theme
likert_theme <- theme_gray() +
  theme(text = element_text(size = 12, face = "bold"),
plot.title = element_text(size = 13, face = "bold",
margin = margin(10, 0, 10, 0)),
plot.margin = unit(c(.4,0,.4,.4), "cm"),
plot.subtitle = element_text(face = "italic"),
legend.title = element_blank(),
legend.key.size = unit(1, "line"),
legend.background = element_rect(fill = "grey90"),
panel.grid = element_blank(),
axis.text.x = element_blank(),
axis.ticks = element_blank(),
axis.title = element_blank(),
panel.background = element_blank(),
strip.text = element_text(size = 12, face = "bold"))
# Let's look at 2020 only
issp_2020 <- issp |>
  filter(year == "2020")
# Plot
issp_2020 |>
  filter(!is.na(concern)) |>
mutate(country_new = forcats::fct_reorder(country_new, concern,
.fun=mean, .desc=FALSE)) |>
arrange(country_new) |>
group_by(country_new, concern) |>
summarize(count = n()) |>
group_by(country_new) |>
mutate(prop_value = count / sum(count)) |>
ggplot() +
geom_bar(mapping = aes(x = country_new,
y = prop_value,
fill = forcats::fct_rev(factor(concern))),
position = "fill",
stat = "identity")+
geom_text(aes(x = country_new, y = prop_value, label = round(100*prop_value)),
position = position_stack(vjust = 0.5),
fontface = "bold") +
scale_fill_brewer(type = "div", palette = "PRGn", direction = -1,
labels = c("5 - High concern", "4", "3", "2", "1 - No concern")) +
coord_flip() +
  likert_theme +
  theme(legend.position = "bottom") +
  guides(fill = guide_legend(reverse = TRUE, nrow = 1))
EO indicators
We want to investigate whether temperature anomalies in the survey year, compared to a long-running average, are associated with climate change concern. For this example we focus only on the 2020 survey wave, but the process can be replicated for each survey round. The visualization below helps us to conceptualize our climate indicator (a minimal code sketch of the anomaly logic follows the list):
- Indicator: Temperature - annual average
- Intensity: Anomaly (mean deviation)
- Focal time period: 2020
- Baseline period: 1961-1990
- Spatial buffer: Country
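In code terms, the anomaly logic boils down to three steps: average the focal year's monthly layers, average the baseline period's layers, and take the difference before aggregating to country polygons. A minimal sketch with hypothetical rasters focal_raster and baseline_raster and hypothetical polygons country_polygons (the full, reproducible version follows below):
# Hypothetical inputs: monthly rasters for the focal year and the baseline period
# focal_mean      <- terra::app(focal_raster, fun = mean, na.rm = TRUE)
# baseline_mean   <- terra::app(baseline_raster, fun = mean, na.rm = TRUE)
# anomaly         <- focal_mean - baseline_mean
# country_anomaly <- terra::extract(anomaly, country_polygons, fun = mean, na.rm = TRUE)
Below we instead aggregate focal and baseline values per country first and difference them afterwards; for a mean-based indicator, both orders yield the same result.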
The ERA5-Land Reanalysis from the Copernicus Climate Change Service is a suitable data product for this temperature indicator. It records air temperature at 2 meters above the surface from 1950 onwards, has a spatial resolution of 0.1 x 0.1 degrees, and offers global coverage.
To access the data, we need an ECMWF account. With the ecmwfr-package, we can retrieve the data directly in R.
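In recent versions of ecmwfr, the API key can be registered once per machine with wf_set_key() (shown commented out; the function's signature differs in older releases):
# ecmwfr::wf_set_key(key = "YOUR-API-KEY")
Below, we use a keyring-based alternative instead.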
Data access and preparation
Given that we want to aggregate the data at the country level, we first load country shapefiles and download the data according to the spatial extent of the countries included in the survey. The ISSP has a diverse membership from North and South America, Europe, Africa, and Asia. Thus, we can work with a global spatial extent when downloading the EO indicator.
We need some packages to load and prepare the world map and to process the raster files (rnaturalearth, sf, terra, and tidyverse). We also need the keyring-package to safely store our ECMWF API key and the devtools-package to load the gxc-package.
# Install and load required packages
<- c("keyring", "rnaturalearth", "sf", "tidyverse", "terra", "devtools")
required_packages <- required_packages[!(required_packages %in% installed.packages()[,"Package"])]
new_packages if(length(new_packages)) install.packages(new_packages)
lapply(required_packages, library, character.only = TRUE)
We load the shapefile containing country-level polygons and subset it to the most relevant variables.
# Download world map data
world <- ne_countries(scale = "medium", returnclass = "sf")
st_geometry(world)
Geometry set for 242 features
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -180 ymin: -89.99893 xmax: 180 ymax: 83.59961
Geodetic CRS: WGS 84
First 5 geometries:
MULTIPOLYGON (((31.28789 -22.40205, 31.19727 -2...
MULTIPOLYGON (((30.39609 -15.64307, 30.25068 -1...
MULTIPOLYGON (((53.08564 16.64839, 52.58145 16....
MULTIPOLYGON (((104.064 10.39082, 104.083 10.34...
MULTIPOLYGON (((-60.82119 9.138379, -60.94141 9...
# Subset to relevant variables
world <- world |>
  select(admin, iso_a3, geometry)
# Plot world map
plot(world[1])
A final step before we can access the data from the Copernicus API is to store our API key. By storing it under the service name "wf_api_key", the download function can retrieve the key automatically.
# Store as environment variable
# Sys.setenv(WF_API_KEY = "YOUR-API-KEY")
api_key <- Sys.getenv("WF_API_KEY")

keyring::key_set_with_value(service = "wf_api_key", password = api_key)
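To verify that the key was stored, it can be read back with keyring::key_get() (commented out here so the key is not accidentally printed):
# keyring::key_get(service = "wf_api_key")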
Now we can access the data. We loop the download over the four years of the survey programme (1993, 2000, 2010, 2020) in order to create four separate files.
# Year vector
<- c("1993", "2000", "2010", "2020")
years
# # API access looped over four years
# for (yr in years) {
#
# # Create file names which include year
# file_name <- paste0("era5_temperature", yr, ".grib")
#
# # Specify API request
# request <- list(
# data_format = "grib",
# variable = "2m_temperature",
# product_type = "monthly_averaged_reanalysis",
# time = "00:00",
# year = yr,
# month = c("01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"),
# area = c(90, -180, -90, 180),
# dataset_short_name = "reanalysis-era5-land-monthly-means",
# target = file_name
# )
#
# # Download data from C3S
# file_path <- ecmwfr::wf_request(
# request = request,
# transfer = TRUE,
# path = "./data/EO_data/C3S_data",
# verbose = FALSE
# )
#
# }
temp_1993 <- terra::rast("./data/EO_data/C3S_data/era5_temperature1993.grib")
temp_2000 <- terra::rast("./data/EO_data/C3S_data/era5_temperature2000.grib")
temp_2010 <- terra::rast("./data/EO_data/C3S_data/era5_temperature2010.grib")
temp_2020 <- terra::rast("./data/EO_data/C3S_data/era5_temperature2020.grib")
Let's inspect the 2020 datacube and plot its first layer (January 2020). The file attributes tell us about the dimensions (number of rows, columns, and layers), the resolution, the spatial extent, the coordinate reference system, the units, and the time points.
temp_2020
class : SpatRaster
dimensions : 1801, 3600, 12 (nrow, ncol, nlyr)
resolution : 0.1, 0.1 (x, y)
extent : -180.05, 179.95, -90.05, 90.05 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat Coordinate System imported from GRIB file
source : era5_temperature2020.grib
names : SFC (~e [C], SFC (~e [C], SFC (~e [C], SFC (~e [C], SFC (~e [C], SFC (~e [C], ...
unit : C, C, C, C, C, C, ...
time : 2020-01-01 01:00:00 to 2020-12-01 01:00:00 UTC
plot(temp_2020[[1]])
Now we can aggregate the monthly values by year and country. We check that our country polygons and the raster files have the same CRS and align them, if necessary.
for (yr in years) {
  temp_data <- get(paste0("temp_", yr))

  # Check CRS of both datasets and adjust if necessary
  if (!identical(crs(world), terra::crs(temp_data))) {
    world <- world |>
      st_transform(crs = st_crs(temp_data))
  }

  # Collapse the month layers into one layer by averaging across months
  annual_values <- terra::app(temp_data, fun = mean, na.rm = TRUE)

  # Aggregate by country
  country_values <- terra::extract(
    annual_values,
    world,
    fun = mean,
    na.rm = TRUE
  )

  # Add values to shapefile
  world[paste0("temp_", yr)] <- country_values[, 2]
}
print(head(world))
Simple feature collection with 6 features and 6 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -73.36621 ymin: -22.40205 xmax: 109.4449 ymax: 41.9062
Geodetic CRS: Coordinate System imported from GRIB file
admin iso_a3 geometry temp_1993 temp_2000 temp_2010
1 Zimbabwe ZWE MULTIPOLYGON (((31.28789 -2... 294.3789 293.3119 294.4570
2 Zambia ZMB MULTIPOLYGON (((30.39609 -1... 294.8202 294.7319 295.1083
3 Yemen YEM MULTIPOLYGON (((53.08564 16... 297.5421 297.9850 298.3186
4 Vietnam VNM MULTIPOLYGON (((104.064 10.... 295.9980 295.8958 296.7712
5 Venezuela VEN MULTIPOLYGON (((-60.82119 9... 297.7739 297.5062 298.3977
6 Vatican VAT MULTIPOLYGON (((12.43916 41... 288.3732 289.0528 288.5110
temp_2020
1 294.6193
2 295.2352
3 298.2873
4 296.7613
5 298.7989
6 289.6033
Now that we have the focal values for all four survey years, we repeat the process for the baseline period (1961-1990).
# Year vector
baseline_years <- as.character(1961:1970)
# # Specify API request
# request <- list(
# data_format = "grib",
# variable = "2m_temperature",
# product_type = "monthly_averaged_reanalysis",
# time = "00:00",
# year = baseline_years,
# month = c("01", "02", "03", "04", "05", "06", "07", "08", "09", "10", "11", "12"),
# area = c(90, -180, -90, 180),
# dataset_short_name = "reanalysis-era5-land-monthly-means",
# target = "era5_temperature1961-1990.grib"
# )
#
# # Download data from C3S
# file_path <- ecmwfr::wf_request(
# request = request,
# transfer = TRUE,
# path = "./data/EO_data/C3S_data",
# verbose = FALSE
# )
temp_base <- terra::rast("./data/EO_data/C3S_data/era5_temperature1961-1990.grib")
# Check CRS of both datasets and adjust if necessary
if(!identical(crs(world), terra::crs(temp_base))) {
  world <- world |>
    st_transform(crs = st_crs(temp_base))
}
# Collapse all into one layer by averaging across months and years
annual_values <- terra::app(temp_base, fun = mean, na.rm = TRUE)
# Aggregate by country
country_values <- terra::extract(
  annual_values,
  world,
  fun = mean,
  na.rm = TRUE
)
# Add values to shapefile
world$temp_base <- country_values[, 2]
print(head(world))
Simple feature collection with 6 features and 7 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -73.36621 ymin: -22.40205 xmax: 109.4449 ymax: 41.9062
Geodetic CRS: Coordinate System imported from GRIB file
admin iso_a3 geometry temp_1993 temp_2000 temp_2010
1 Zimbabwe ZWE MULTIPOLYGON (((31.28789 -2... 294.3789 293.3119 294.4570
2 Zambia ZMB MULTIPOLYGON (((30.39609 -1... 294.8202 294.7319 295.1083
3 Yemen YEM MULTIPOLYGON (((53.08564 16... 297.5421 297.9850 298.3186
4 Vietnam VNM MULTIPOLYGON (((104.064 10.... 295.9980 295.8958 296.7712
5 Venezuela VEN MULTIPOLYGON (((-60.82119 9... 297.7739 297.5062 298.3977
6 Vatican VAT MULTIPOLYGON (((12.43916 41... 288.3732 289.0528 288.5110
temp_2020 temp_base
1 294.6193 294.1319
2 295.2352 294.8179
3 298.2873 297.3228
4 296.7613 295.6958
5 298.7989 297.8074
6 289.6033 288.1982
Now that we have the focal and baseline values, we calculate the deviations for each survey year.
world <- world |>
  dplyr::mutate(
    diff_1993 = temp_1993 - temp_base,
diff_2000 = temp_2000 - temp_base,
diff_2010 = temp_2010 - temp_base,
diff_2020 = temp_2020 - temp_base
)
# Plot 2020 deviation from baseline
ggplot(data = world) +
geom_sf(aes(fill = diff_2020)) +
scale_fill_viridis_c() +
theme_minimal() +
labs(
title = "Absolute deviation between 2020 and baseline temperature",
subtitle = "Averaged across countries",
fill = "Temperature (K)"
)
Integrate survey and EO data
Turning to the survey data, we aggregate climate change concern across country-waves and link it with the shapefile.
mean_concern <- issp |>
  group_by(country_new, year) |>
summarize(mean_concern = mean(concern, na.rm=TRUE),
se_concern = sd(concern, na.rm=TRUE) / sqrt(n()))
`summarise()` has grouped output by 'country_new'. You can override using the
`.groups` argument.
ggplot(mean_concern, aes(x = year, y = mean_concern, color = country_new, group = country_new)) +
geom_point(size = 3) +
geom_errorbar(aes(ymin = mean_concern - se_concern, ymax = mean_concern + se_concern), width = .5) +
geom_line()+
labs(title = "Mean climate change concern across countries",
x = "Survey Wave", y = "Mean climate change concern", color = "Country") +
facet_wrap(~country_new, ncol=5)+
theme_minimal() +
theme(legend.position = "none")
mean_concern_wide <- mean_concern |>
  select(!se_concern) |>
pivot_wider(
names_from = year,
values_from = mean_concern,
names_glue = "{.value}_{year}",
names_sort = TRUE)
world <- left_join(world, mean_concern_wide, by = c("admin" = "country_new"))
print(head(world |>
filter(!is.na(mean_concern_2020)) |>
arrange(admin)))
Simple feature collection with 6 features and 15 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: -61.79409 ymin: -54.74922 xmax: 158.9589 ymax: 70.06484
Geodetic CRS: Coordinate System imported from GRIB file
admin iso_a3 temp_1993 temp_2000 temp_2010 temp_2020 temp_base diff_1993
1 Australia AUS 294.8993 294.2218 294.7285 295.7058 295.0203 -0.1210429
2 Austria AUT 278.7910 279.8814 278.3763 280.4286 278.1573 0.6337671
3 Croatia HRV 283.9717 285.6562 283.9234 285.5485 283.5202 0.4515663
4 Denmark DNK 280.9850 282.4755 280.1209 283.3281 280.6341 0.3508730
5 Finland FIN 274.8894 276.5970 274.3150 277.6573 274.4078 0.4816045
6 France -99 285.0149 285.6563 284.6554 286.9104 284.6791 0.3357822
diff_2000 diff_2010 diff_2020 mean_concern_1993 mean_concern_2000
1 -0.7985866 -0.29184766 0.6854605 3.73951 NA
2 1.7241737 0.21901826 2.2712879 NA 3.996714
3 2.1360691 0.40327552 2.0283470 NA NA
4 1.8414213 -0.51319651 2.6940181 NA 3.402542
5 2.1891151 -0.09287108 3.2494134 NA 3.302792
6 0.9772204 -0.02364492 2.2312917 NA NA
mean_concern_2010 mean_concern_2020 geometry
1 3.306898 3.833178 MULTIPOLYGON (((143.1789 -1...
2 3.804325 4.182692 MULTIPOLYGON (((9.527539 47...
3 4.010915 3.990964 MULTIPOLYGON (((13.57793 45...
4 3.498276 3.642857 MULTIPOLYGON (((12.56875 55...
5 3.511832 3.924270 MULTIPOLYGON (((24.15547 65...
6 3.474752 3.832879 MULTIPOLYGON (((9.480371 42...
The data is ready to use for any further analysis.
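For instance, a quick country-level check of the association between the 2020 anomaly and mean concern could look like this (a minimal sketch, not part of the workflow above):
# Correlation between the 2020 temperature anomaly and 2020 mean concern
world |>
  st_drop_geometry() |>
  filter(!is.na(mean_concern_2020)) |>
  summarize(r = cor(diff_2020, mean_concern_2020, use = "complete.obs"))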
Subnational data
The ISSP data also stores subnational information. We zoom into Canada and its 10 provinces to further analyse climate change concern and changes in temperature.
library(haven)
library(sjlabelled)
library(tidyverse)
library(sf)
library(rnaturalearth)
library(rnaturalearthdata)
library(terra)
issp <- haven::read_spss("./data/issp_env/ZA8793_v1-0-0.sav") |>
  sjlabelled::remove_all_labels()

provinces <- ne_states(country = "Canada", returnclass = "sf") |>
  select(
    name,
    postal,
    geometry
  )
In a first step, we aggregate the temperature data for the three survey waves in 1993, 2000, and 2010 across the Canadian provinces.
<- c("1993", "2000", "2010")
years
<- terra::rast("./data/EO_data/C3S_data/era5_temperature1993.grib")
temp_1993 <- terra::rast("./data/EO_data/C3S_data/era5_temperature2000.grib")
temp_2000 <- terra::rast("./data/EO_data/C3S_data/era5_temperature2010.grib")
temp_2010 <- terra::rast("./data/EO_data/C3S_data/era5_temperature1961-1990.grib")
temp_base
for (yr in years) {
  temp_data <- get(paste0("temp_", yr))

  # Check CRS of both datasets and adjust if necessary
  if (!identical(crs(provinces), terra::crs(temp_data))) {
    provinces <- provinces |>
      st_transform(crs = st_crs(temp_data))
  }

  # Collapse the month layers into one layer by averaging across months
  annual_values <- terra::app(temp_data, fun = mean, na.rm = TRUE)

  # Aggregate by province
  province_values <- terra::extract(
    annual_values,
    provinces,
    fun = mean,
    na.rm = TRUE
  )

  # Add values to shapefile
  provinces[paste0("temp_", yr)] <- province_values[, 2]
}
# Check CRS of both datasets and adjust if necessary
if(!identical(crs(provinces), terra::crs(temp_base))) {
  provinces <- provinces |>
    st_transform(crs = st_crs(temp_base))
}
# Collapse all into one layer by averaging across months and years
annual_values <- terra::app(temp_base, fun = mean, na.rm = TRUE)
# Aggregate by province
province_values <- terra::extract(
  annual_values,
  provinces,
  fun = mean,
  na.rm = TRUE
)

# Add values to shapefile
provinces$temp_base <- province_values[, 2]
provinces <- provinces |>
  dplyr::mutate(
    diff_1993 = temp_1993 - temp_base,
diff_2000 = temp_2000 - temp_base,
diff_2010 = temp_2010 - temp_base
)
provinces <- provinces |>
  st_transform(3347)
Let’s also prepare the ISSP data on Canada.
canada <- issp |>
  filter(country == 124) |>
  filter(!is.na(CA_REG)) |>
  select(
    year,
    CA_REG,
    concern = v42
  )
canada <- canada |>
  mutate(
CA_REG = factor(CA_REG, levels=c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
labels=c("Newfoundland and Labrador", "Nova Scotia",
"Prince Edward Island", "New Brunswick",
"Québec", "Ontario", "Manitoba", "Saskatchewan",
"Alberta", "British Columbia"))
)
canada <- canada |>
  mutate(
concern = case_match(concern,
1 ~ 5,
2 ~ 4,
3 ~ 3,
4 ~ 2,
5 ~ 1)
)
canada <- canada |>
  group_by(CA_REG, year) |>
summarize(mean_concern = mean(concern, na.rm=TRUE),
se_concern = sd(concern, na.rm=TRUE) / sqrt(n()))
ggplot(canada, aes(x = year, y = mean_concern, color = CA_REG, group = CA_REG)) +
geom_point(size = 3) +
geom_errorbar(aes(ymin = mean_concern - se_concern, ymax = mean_concern + se_concern), width = .5) +
geom_line()+
labs(title = "Mean climate change concern across Canadian provinces",
x = "Survey Wave", y = "Mean climate change concern", color = "Province") +
facet_wrap(~CA_REG, ncol=5)+
theme_minimal() +
theme(legend.position = "none")
We can now join both datasets to integrate the survey data with the temperature data.
canada_wide <- canada |>
  select(!se_concern) |>
pivot_wider(
names_from = year,
values_from = mean_concern,
names_glue = "{.value}_{year}",
names_sort = TRUE)
canada_sf <- left_join(provinces, canada_wide, by = c("name" = "CA_REG"))
canada_sf <- canada_sf |>
  select(-temp_base) |>
  pivot_longer(
    cols = c(temp_1993:mean_concern_2010),
    names_to = c("variable", "year"),
    names_pattern = "^(.*)_(\\d{4})$",
    values_to = "value"
  ) |>
  pivot_wider(
    names_from = variable,
    values_from = value
  ) |>
  mutate(year = as.integer(year))
Let’s visualize both variables in one plot.
CA_plot_1 <- ggplot(canada_sf) +
  geom_sf(aes(fill = mean_concern)) +
facet_wrap(~ year) +
scale_fill_viridis_c() +
geom_sf_text(aes(label = postal), size = 3, color = "black") +
labs(fill = "Mean concern") +
theme_minimal()
CA_plot_2 <- ggplot(canada_sf) +
  geom_sf(aes(fill = diff)) +
facet_wrap(~ year) +
scale_fill_viridis_c() +
geom_sf_text(aes(label = postal), size = 3, color = "black") +
labs(fill = "Mean\ntemperature\ndeviation (K)") +
theme_minimal()
library(patchwork)
CA_plot_1 / CA_plot_2
We further investigate the relationship between both variables with a scatterplot.
ggplot(canada_sf, aes(x = diff, y = mean_concern, color = as.factor(year))) +
geom_point() +
geom_smooth(method = "lm", se = FALSE) +
labs(
x = "Mean temperature deviation",
y = "Mean concern",
color = "Year"
  ) +
  theme_minimal()
Spatial linking made easy
The manual approach described above is time- and code-intensive. Our gxc-package helps to automate these steps. You can select from various gxc functions for polygons, points, or grids. For this case study, we use the poly_link_monthly-function to link every observation in a dataset directly with the discussed EO indicator.
We need devtools to load the gxc-package.
# # Install and load required packages
# required_packages <- c("devtools", "keyring", "rnaturalearth", "sf", "tidyverse")
# new_packages <- required_packages[!(required_packages %in% installed.packages()[,"Package"])]
# if(length(new_packages)) install.packages(new_packages)
# lapply(required_packages, library, character.only = TRUE)
#
# # Load gxc package
# if (!requireNamespace("gxc", quietly = TRUE)) {
# remotes::install_github("denabel/gxc")
# }
#
# # Now load the package
# library(gxc)
We exemplify the process for the 2020 wave. The function requires the last month of the focal time period as a variable in the dataset. We therefore add a date-variable to the dataset that records the last month of the focal period (December 2020).
# # Download world map data
# world <- ne_countries(scale = "medium", returnclass = "sf")
# st_geometry(world)
#
# # Subset to relevant variables
# world <- world |>
# select(admin, iso_a3, geometry)
#
# # Create fixed date-variable
# world$date_raw <- "12-2020"
Check out the vignette for poly_link_monthly for detailed documentation.
# ?gxc::poly_link_monthly
Specification of poly_link_monthly and data access.
# dataset_out <- gxc::poly_link_monthly(
# indicator = "2m_temperature",
# data = world,
# date_var = "date_raw",
# time_span = 11,
# time_lag = 0,
# baseline = c("1961", "1970"),
# order = "my",
# path = "./data/EO_data/C3S_data",
# catalogue = "reanalysis-era5-land-monthly-means",
# by_hour = FALSE,
# keep_raw = FALSE,
# parallel = FALSE,
# chunk_size = 50
# )
# print(head(world |>
# filter(!is.na(mean_concern_2020)) |>
# arrange(admin)))
This was a relatively simple example in which we linked data at the country level. Data with more fine-grained georeferencing or a more complex temporal resolution require even more flexible approaches. The gxc-package allows for such custom-made linking. The next example, with the GLES Panel, shows how to do this for observations with varying linking dates and small spatial buffers.