In this project I’m going to show you how to build an interactive visual to display election results using R, RStudio and the leaflet package. To follow along you’ll need a basic understanding of the R programming language.
library(rgeos)
library(maptools)
library(leaflet)
The map data comes as a shapefile. I downloaded mine from the Australian Electoral Commission to minimise inconsistencies with the election data (I still found some, though): https://aec.gov.au/Electorates/gis/gis_datadownload.htm
Election data: https://results.aec.gov.au/24310/Website/HouseDivisionalResults-24310.htm
Using readShapeSpatial() from the maptools package, we pass in the shapefile as its only required parameter and assign the result to a variable. Use the plot() function to see how it looks.
vicmapdata <- readShapeSpatial(".\\vic-july-2018-esri\\E_AUGFN3_region.shp")
plot(vicmapdata)
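A note for readers on a newer setup: maptools and rgeos have since been retired from CRAN, so readShapeSpatial() may not be installable. A rough equivalent with the sf package might look like the sketch below. This is an assumption on my part, since sf is not used anywhere else in this post. It reads a demo shapefile that ships with sf so that it runs anywhere; the path to the AEC shapefile would go in its place.

```r
# Assumption: the sf package is installed (this post otherwise uses maptools)
library(sf)

# Demo shapefile bundled with sf, standing in for the AEC shapefile
demo_path <- system.file("shape/nc.shp", package = "sf")

# st_read() returns an sf data frame and also reads the CRS from the .prj file
demo_map <- st_read(demo_path, quiet = TRUE)

# Draw only the boundaries, much like plot(vicmapdata)
plot(st_geometry(demo_map))
```

leaflet() also accepts sf objects, so the later steps should carry over with little change, though I have not run them that way here.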
Read the election data and assign it to a variable.
electiondata <- read.csv(".\\vic_elect_results.csv")
First we take a look at the structure of the data
summary(electiondata)
##  Division      State    Successful.Party            TCP.Margin
##  Aston   : 1   VIC:38   Australian Labor Party:21   1,090  : 1
##  Ballarat: 1            Independent           : 1   10,934 : 1
##  Bendigo : 1            Liberal               :12   11,289 : 1
##  Bruce   : 1            The Greens            : 1   11,326 : 1
##  Calwell : 1            The Nationals         : 3   12,134 : 1
##  Casey   : 1                                        12,453 : 1
##  (Other) :32                                        (Other):32
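A side note on this output: TCP.Margin is listed value by value, like a category, rather than summarised with a minimum and a mean. That suggests the thousands separators ("1,090") stopped R from reading the column as numbers. We only display the margin as text in this post, so it does no harm here, but if you wanted to sort or colour by margin, a minimal sketch of the conversion could look like this (using three values copied from the output above):

```r
# Sample values copied from the summary() output above
margin_text <- c("1,090", "10,934", "11,289")

# Remove the thousands separators, then convert the text to numbers
margin_num <- as.numeric(gsub(",", "", margin_text, fixed = TRUE))

margin_num
```

The same two steps should apply to the real column, for example as.numeric(gsub(",", "", as.character(electiondata$TCP.Margin), fixed = TRUE)).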
I don’t like the names of the variables, so let’s rename them.
colnames(electiondata) <- c("Divisions", "State", "Party", "Marginal Votes")
summary(electiondata)
##  Divisions     State    Party                       Marginal Votes
##  Aston   : 1   VIC:38   Australian Labor Party:21   1,090  : 1
##  Ballarat: 1            Independent           : 1   10,934 : 1
##  Bendigo : 1            Liberal               :12   11,289 : 1
##  Bruce   : 1            The Greens            : 1   11,326 : 1
##  Calwell : 1            The Nationals         : 3   12,134 : 1
##  Casey   : 1                                        12,453 : 1
##  (Other) :32                                        (Other):32
This is looking better. Next we’ll take a look at the Division names in the shapefile data. This is important because the electiondata Divisions have to match the shapefile Divisions exactly.
summary(vicmapdata)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
##         min       max
## x 140.96168 149.97668
## y -39.15919 -33.98043
## Is projected: NA
## proj4string : [NA]
## Data attributes:
##  E_div_numb      Elect_div     Numccds         Actual
##  Min.   : 1.00   Aston   : 1   Min.   :267.0   Min.   :100151
##  1st Qu.:10.25   Ballarat: 1   1st Qu.:337.5   1st Qu.:105494
##  Median :19.50   Bendigo : 1   Median :354.0   Median :107416
##  Mean   :19.50   Bruce   : 1   Mean   :359.1   Mean   :106954
##  3rd Qu.:28.75   Calwell : 1   3rd Qu.:387.0   3rd Qu.:109115
##  Max.   :38.00   Casey   : 1   Max.   :467.0   Max.   :112265
##                  (Other) :32
##  Projected        Total_Popu  Australian  Area_SqKm          Sortname
##  Min.   :107381   Min.   :0   Min.   :0   Min.   :   40.46   Aston   : 1
##  1st Qu.:109070   1st Qu.:0   1st Qu.:0   1st Qu.:   80.59   Ballarat: 1
##  Median :109986   Median :0   Median :0   Median :  170.28   Bendigo : 1
##  Mean   :110372   Mean   :0   Mean   :0   Mean   : 5987.93   Bruce   : 1
##  3rd Qu.:111484   3rd Qu.:0   3rd Qu.:0   3rd Qu.: 2619.61   Calwell : 1
##  Max.   :113924   Max.   :0   Max.   :0   Max.   :81962.21   Casey   : 1
##                                                              (Other) :32
It looks like the Divisions in the shapefile are under the variable Elect_div
vicmapdata$Elect_div
## [1] Aston Ballarat Bendigo Bruce Calwell Casey
## [7] Chisholm Cooper Corangamite Corio Deakin Dunkley
## [13] Flinders Fraser Gellibrand Gippsland Goldstein Gorton
## [19] Higgins Holt Hotham Indi Isaacs Jagajaga
## [25] Kooyong La Trobe Lalor Macnamara Mallee Maribyrnong
## [31] Mcewen Melbourne Menzies Monash Nicholls Scullin
## [37] Wannon Wills
## 38 Levels: Aston Ballarat Bendigo Bruce Calwell Casey Chisholm ... Wills
So far so good. There are 38 levels in both electiondata$Divisions and vicmapdata$Elect_div, which is a good start, although equal counts alone don’t prove the names match. We can inspect them manually at this size, but what if we were doing the whole country, or an even larger dataset? We need to do this intelligently and have R do the heavy lifting for us.
setdiff(electiondata$Divisions, vicmapdata$Elect_div)
## [1] "McEwen"
We do have an inconsistency. It’s telling us that “McEwen” is in the first vector, but not in the second. Let’s find the position of “McEwen” in the first vector, and see what is in the corresponding position in the second vector.
which(electiondata$Divisions == "McEwen")
## [1] 31
So it’s at position 31. What’s at position 31 in the other vector?
vicmapdata$Elect_div[31]
## [1] Mcewen
## 38 Levels: Aston Ballarat Bendigo Bruce Calwell Casey Chisholm ... Wills
OK, so it’s a small difference: the first vector has an ‘E’ where the second has an ‘e’. To stay inside the code rather than fix this by hand (remember, this needs to be scalable), we’ll copy the values from one vector into the other. Note that this assumes both vectors list the divisions in the same order, as the matching position 31 suggests.
vicmapdata$Elect_div <- electiondata$Divisions
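The direct assignment above relies on both objects listing the divisions in the same order. If that can't be guaranteed, a more defensive option is to match the names on a case-folded key and reorder by the result. The sketch below uses toy vectors (three division names from this post, with the second vector deliberately shuffled) rather than the real data, and the variable names are made up for illustration:

```r
# Toy stand-ins for the two name vectors; the second is deliberately shuffled
election_names <- c("Mallee", "McEwen", "Melbourne")
shape_names    <- c("Melbourne", "Mallee", "Mcewen")

# Compare on a lower-cased key so "McEwen" and "Mcewen" count as the same name
idx <- match(tolower(shape_names), tolower(election_names))

# Stop early if any shapefile name has no counterpart in the election data
stopifnot(!anyNA(idx))

# Election-data spellings, reordered to follow the shapefile's row order
aligned_names <- election_names[idx]
aligned_names
```

With the real objects, the same index could be used to reorder the whole election table, along the lines of electiondata[idx, ], so that every row lines up with its polygon.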
Now let’s run a check like we did before to make sure the values match up
setdiff(electiondata$Divisions, vicmapdata$Elect_div)
## character(0)
That’s better!
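One caveat on this check: setdiff(x, y) only reports values in x that are absent from y, so a name that exists only in the shapefile would go unnoticed. A minimal sketch of a two-way check, using toy vectors rather than the real data:

```r
# Toy vectors: each side has one spelling the other lacks
a <- c("Aston", "McEwen", "Wills")
b <- c("Aston", "Mcewen", "Wills")

only_in_a <- setdiff(a, b)  # names missing from b
only_in_b <- setdiff(b, a)  # names missing from a

only_in_a
only_in_b
```

Both results should come back as character(0) before you trust the join.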
Here comes the fun bit! We’re going to use leaflet instead of the plot() function we used above. We’ll pass it our vicmapdata object; addPolygons() draws the boundaries for us when we set stroke = TRUE. We’ll use white lines with a weight (thickness) of 1.5.
leaflet(vicmapdata) %>%
  addPolygons(
    stroke = TRUE,
    color = 'White',
    weight = 1.5
  )
At this point we have two data objects, but only one of them, the mapping data, is on the map so far. Now we need to attach our election data to the plot.
First we’ll create the labels, so that hovering the mouse over an electorate shows the relevant statistics. We wrap each label in htmltools::HTML() so that the <br/> tags are rendered as line breaks rather than shown as literal text.
mylabels <- paste(
  "Electorate: ", vicmapdata$Elect_div, "<br/>",
  "Party: ", electiondata$Party, "<br/>",
  "Margin(votes): ", electiondata$`Marginal Votes`
) %>%
  lapply(htmltools::HTML)
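paste() is vectorised, so the call above produces one label string per row, in row order. The sketch below shows the same idea with sprintf(), which avoids the extra spaces that paste() puts between its arguments. The values are made-up placeholders, not real results, and the htmltools::HTML() step is left out so that it runs in base R:

```r
# Hypothetical placeholder rows, not real election results
divisions <- c("Division A", "Division B")
parties   <- c("Party X", "Party Y")
margins   <- c("1,000", "2,000")

# sprintf() is vectorised: one formatted string per row
label_text <- sprintf(
  "Electorate: %s<br/>Party: %s<br/>Margin (votes): %s",
  divisions, parties, margins
)

label_text[1]
```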
Now we’ll plug our mylabels variable into our previous leaflet code chunk
leaflet(vicmapdata) %>%
  addPolygons(
    stroke = TRUE,
    color = 'White',
    weight = 1.5,
    label = mylabels,
    labelOptions = labelOptions(
      style = list("font-weight" = "normal", padding = "3px 8px"),
      textsize = "13px",
      direction = "auto"
    )
  )