Eurostat map using Python

I’ve started using Python a while ago and I really like using geopandas (for one, the documentation is pretty good).

Using Eurostat data, this is a map comparing tertiary education in European countries in 2022. Tertiary education refers to anything university-level.

First, I’ve used a Natural Earth shapefile to get the base map of the world. Then, after I’ve read in the Eurostat file and made sure the country colunms in both files are called ‘Country’, I merge them like so:

df  = wb.merge(land10, on ='Country', how ='left')
gdf = gpd.GeoDataFrame(df)

A few countries were missing from Eurostat so I coerced them into missing, ‘NaN’ values. The final map was created by the following code:

from mpl_toolkits.axes_grid1 import make_axes_locatable

fig, ax = plt.subplots(1, 1) 

divider = make_axes_locatable(ax)

#axes on which to draw the legend in case of color map
cax = divider.append_axes("bottom", size="5%", pad=0.1)

ax.set_axis_off();
gdf.plot(ax=ax, 
        column="Percentage", 
        cmap= 'YlGn',
        legend=True,
        cax=cax,
        legend_kwds={"label": "Percentage Tertiary Educated", "orientation": "horizontal"},

    )

xmin, ymin, xmax, ymax = gdf[gdf.Country == 'Germany'].total_bounds
pad = 20 # add a padding around the geometry
ax.set_xlim(xmin-pad, xmax+ pad)
ax.set_ylim(ymin-pad, ymax+ pad)

# Figure footer title
fig.text(0.4, 0.9, "Source: Eurostat 2022",
         horizontalalignment="center", fontsize=8)

# Save plot to file 
plt.savefig("eurostat.png", dpi=600)
Tertiary education in Europe map

I would like to know how to make the map flat and not so squished (if you know what I mean). I’m going to explore other base maps in the future.

The script is also available in my repo.

  1. February 18, 2024

First time using LiDAR data for a 3D map

I had heard about LiDAR data in GIS projects - it’s a remote-sensing technique that uses laser light to densely sample the surface of the earth to produce 3D measurements (x,y,z). Pretty cool.

I came across this tutorial which uses LiDAR raster data from the Netherlands to make a 3D city map on R. I’ve copied the code and adjusted it slightly to make an image of the town I grew up in; Kuopio, Finland. The National Land Survey of Finland provides various raster and vector data for free (shout out to Maanmittauslaitos!). See my code in my repo.

The code uses the terra package to overlay 3D measurement data onto an aerial image (orthophoto) and then renders it on rayshader. Now, this might still be a work in progress as I’m not sure why my map came out so smudgy and unclear. I increased the lightsize option, which helped with the grayness, but I didn’t have my own HDR file to use under environment_light. It was my first time using rayshader and to be honest, I am not so clear on all the options as I don’t have a GIS (never mind STEM) background!

Anyway, I really enjoyed making this - thank you Milos for the excellent tutorial and inspiration!

3D map of Kuopio
  1. November 7, 2023

Pollution mortality in Latin America

I was thinking about how pollution levels, and therefore where we live, affect our health so much. (This is discussed in urban studies and related fields through environmental inequality.)

The OECD has a lot of data on health as well as environment indicators at a global scale. The rates of mortality pollution in Latin America recorded for 2019 range from 148 in Paraguay to the highest two: 430 in Venezuela and 516 per 1 million population in Cuba (quite a leap).

I am thinking of making an R Shiny app for a group of indicators for Latin America, and it could be around health and/or the environment. Any ideas, let me know.

Pollution mortality in Latin America made with mapsf

Data and more information about the indicator: OECD (2023), Air pollution effects indicator. doi: 10.1787/573e3faf-en (Accessed on 22 June 2023).

For this one, I used mapsf, the successor of the cartogaphy package. I really liked using this one, as the main bit of the code ie. map_choro() incroporates options (palette, number of breaks…) that you would commonly use anyway but maybe have to add in or build up before plotting, which is helpful. I’ll add the code below, as I have some comments on it:

#Load packages
library(tidyverse) 
library(dplyr)
library("ggplot2")
library("sf")
library("rnaturalearth")
library("rnaturalearthdata")
library(mapsf)

#Read in data
pollution <- read_csv("pollutioneffectmortality.csv") 

#Check the location column and define a list that we can match countries to and filter the data by
unique(pollution$LOCATION)
latam <- c("MEX","ARG","BOL","BRA","COL","CHL" ,"CUB","DOM" ,"ECU","HTI" ,"GTM", "NIC","PER",
           "PRY", "URY"  ,"VEN","CRI","JAM", "SLV")

df <- pollution %>% 
  filter(TIME >= "2019") %>% 
  filter(LOCATION %in% latam)

#Check the mortality values' range
range(df$Value) 

#Get basemap from naturalearth package

world <- ne_countries(scale = "medium", returnclass = "sf")

latamap <- world %>% 
  filter(iso_a3 %in% latam) %>% 
  select("iso_a3", "name", "admin")

#Merge these together

pollutionmap <- latamap %>% 
  left_join(df, by = c("iso_a3" = "LOCATION"))

## Use the mapsf package ###

#Initialise
mf_init(pollutionmap, expandBB = rep(0, 4), theme = 'brutal') 

#Plot the base map
mf_map(x = pollutionmap)

# Plot the choropleth map
mf_choro(
  x = pollutionmap, var = "Value",
  pal = "Heat",
  nbreaks = 6,  # here, I decided on the number of breaks after testing a few different ones
  leg_title = "Mortality per\n1000,000 population",
  leg_val_rnd = -1,
  add = TRUE
)

# Plot a title
mf_title("Pollution mortality in Latin America, 2019")

# Plot credits
mf_credits("Johanna Jokio, 2023\nSource: OECD\n") # the Source would sit very close to the horizontal line, hence the linebreak

# Plot a scale bar - I found this strange in the 'brutal' theme, as it just draws a horizontal line rather than showing the scale, and I didn't particularly want the line to stretch over the map
mf_scale(size = 2)