Introduction

In this project I’m going to show you how to build an interactive visual to display election results using R, RStudio and the leaflet package. To follow along you’ll need a basic understanding of the R programming language.

Load Libraries for the Project

library(rgeos)
library(maptools)
library(leaflet)

The Data

The map data is a shapefile. I downloaded mine from the Australian Electoral Commission to minimise inconsistencies with the election data (though I still found some): https://aec.gov.au/Electorates/gis/gis_datadownload.htm

Election data: https://results.aec.gov.au/24310/Website/HouseDivisionalResults-24310.htm

Plotting a shapefile

Using readShapeSpatial from the maptools package, we pass the shapefile in as its only required parameter and assign the result to a variable. Then we use the plot() function to see how it looks.

vicmapdata <- readShapeSpatial(".\\vic-july-2018-esri\\E_AUGFN3_region.shp")
plot(vicmapdata)
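The maptools and rgeos packages were retired from CRAN in 2023, so readShapeSpatial may not be available on a fresh installation. Here is a minimal sketch of one alternative, assuming the sf package is installed (it is not used in this post). The helper name read_boundaries is hypothetical, and file.path is used so the path does not depend on Windows-style backslashes.

```r
# Sketch only: read_boundaries() is a hypothetical helper, and it assumes
# the sf package is installed (the original post uses maptools instead).
read_boundaries <- function(path) {
  if (!requireNamespace("sf", quietly = TRUE)) {
    stop("This sketch needs the 'sf' package.")
  }
  shapes <- sf::st_read(path, quiet = TRUE)
  # Leaflet expects longitude/latitude in WGS84 (EPSG:4326); reproject only
  # when the file actually declares a coordinate reference system.
  if (!is.na(sf::st_crs(shapes))) {
    shapes <- sf::st_transform(shapes, 4326)
  }
  shapes
}

# file.path() joins path components with "/", which R accepts on Windows,
# macOS and Linux alike.
shp_path <- file.path("vic-july-2018-esri", "E_AUGFN3_region.shp")

# Only attempt the read when the shapefile is actually present.
if (file.exists(shp_path)) {
  vicmap_sf <- read_boundaries(shp_path)
  plot(sf::st_geometry(vicmap_sf))
}
```

An sf object read this way can be passed to leaflet() in the same way as the SpatialPolygonsDataFrame used in the rest of this post.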

Read the election data and pass it into a variable

electiondata <- read.csv(".\\vic_elect_results.csv")

Data Wrangling

First we take a look at the structure of the data

summary(electiondata)
##      Division  State                  Successful.Party   TCP.Margin
##  Aston   : 1   VIC:38   Australian Labor Party:21      1,090  : 1  
##  Ballarat: 1            Independent           : 1      10,934 : 1  
##  Bendigo : 1            Liberal               :12      11,289 : 1  
##  Bruce   : 1            The Greens            : 1      11,326 : 1  
##  Calwell : 1            The Nationals         : 3      12,134 : 1  
##  Casey   : 1                                           12,453 : 1  
##  (Other) :32                                           (Other):32

First of all, I don’t like the names of the variables, so let’s rename them.

colnames(electiondata) <- c("Divisions", "State", "Party", "Marginal Votes")
summary(electiondata)
##     Divisions  State                       Party    Marginal Votes
##  Aston   : 1   VIC:38   Australian Labor Party:21   1,090  : 1    
##  Ballarat: 1            Independent           : 1   10,934 : 1    
##  Bendigo : 1            Liberal               :12   11,289 : 1    
##  Bruce   : 1            The Greens            : 1   11,326 : 1    
##  Calwell : 1            The Nationals         : 3   12,134 : 1    
##  Casey   : 1                                        12,453 : 1    
##  (Other) :32                                        (Other):32
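The summary shows the margins stored as text with thousands separators (for example 1,090), so they cannot be sorted or compared as numbers in that form. Here is a minimal sketch of the conversion, using three margin strings copied from the summary output rather than the full data set:

```r
# Margin strings as they appear in the summary output.
margins_raw <- c("1,090", "10,934", "11,289")

# as.character() guards against the column being a factor, gsub() strips
# the thousands separators, and as.numeric() does the conversion.
margins_num <- as.numeric(gsub(",", "", as.character(margins_raw), fixed = TRUE))

margins_num
sort(margins_num, decreasing = TRUE)
```

Applied to the post's data, the same expression would take electiondata$`Marginal Votes` as input. Storing the result in a new column keeps the original comma-formatted text available for the labels.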

This is looking better. Next we’ll take a look at the Division names in the shapefile data. This is important because the electiondata Divisions have to match the shapefile Divisions exactly.

summary(vicmapdata)
## Object of class SpatialPolygonsDataFrame
## Coordinates:
##         min       max
## x 140.96168 149.97668
## y -39.15919 -33.98043
## Is projected: NA 
## proj4string : [NA]
## Data attributes:
##    E_div_numb       Elect_div     Numccds          Actual      
##  Min.   : 1.00   Aston   : 1   Min.   :267.0   Min.   :100151  
##  1st Qu.:10.25   Ballarat: 1   1st Qu.:337.5   1st Qu.:105494  
##  Median :19.50   Bendigo : 1   Median :354.0   Median :107416  
##  Mean   :19.50   Bruce   : 1   Mean   :359.1   Mean   :106954  
##  3rd Qu.:28.75   Calwell : 1   3rd Qu.:387.0   3rd Qu.:109115  
##  Max.   :38.00   Casey   : 1   Max.   :467.0   Max.   :112265  
##                  (Other) :32                                   
##    Projected        Total_Popu   Australian   Area_SqKm            Sortname 
##  Min.   :107381   Min.   :0    Min.   :0    Min.   :   40.46   Aston   : 1  
##  1st Qu.:109070   1st Qu.:0    1st Qu.:0    1st Qu.:   80.59   Ballarat: 1  
##  Median :109986   Median :0    Median :0    Median :  170.28   Bendigo : 1  
##  Mean   :110372   Mean   :0    Mean   :0    Mean   : 5987.93   Bruce   : 1  
##  3rd Qu.:111484   3rd Qu.:0    3rd Qu.:0    3rd Qu.: 2619.61   Calwell : 1  
##  Max.   :113924   Max.   :0    Max.   :0    Max.   :81962.21   Casey   : 1  
##                                                                (Other) :32

It looks like the Divisions in the shapefile are under the variable Elect_div

vicmapdata$Elect_div
##  [1] Aston       Ballarat    Bendigo     Bruce       Calwell     Casey      
##  [7] Chisholm    Cooper      Corangamite Corio       Deakin      Dunkley    
## [13] Flinders    Fraser      Gellibrand  Gippsland   Goldstein   Gorton     
## [19] Higgins     Holt        Hotham      Indi        Isaacs      Jagajaga   
## [25] Kooyong     La Trobe    Lalor       Macnamara   Mallee      Maribyrnong
## [31] Mcewen      Melbourne   Menzies     Monash      Nicholls    Scullin    
## [37] Wannon      Wills      
## 38 Levels: Aston Ballarat Bendigo Bruce Calwell Casey Chisholm ... Wills

So far so good. There are 38 levels in both electiondata$Divisions and vicmapdata$Elect_div, which is a good start. We can inspect them manually at this size, but what if we were doing the whole of Australia, or a much larger country? We need to do this intelligently and have R do the heavy lifting for us.

setdiff(electiondata$Divisions, vicmapdata$Elect_div)
## [1] "McEwen"

We do have an inconsistency. It’s telling us that “McEwen” is in the first vector, but not in the second. Let’s find the position of “McEwen” in the first vector, and see what is in the corresponding position in the second vector.

which(electiondata$Divisions == "McEwen")
## [1] 31

So it’s at position 31. What’s at position 31 in the other vector?

vicmapdata$Elect_div[31]
## [1] Mcewen
## 38 Levels: Aston Ballarat Bendigo Bruce Calwell Casey Chisholm ... Wills
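The position lookup works here because only one name differs. setdiff() is one-directional: it returns values found in its first argument but not in its second. Running it in both directions surfaces both spellings at once. Here is a short sketch using small stand-in vectors rather than the full data:

```r
# Stand-in vectors: three division names as spelled in each data source.
election_names <- c("Aston", "Ballarat", "McEwen")
map_names      <- c("Aston", "Ballarat", "Mcewen")

setdiff(election_names, map_names)  # names found only in the election data
setdiff(map_names, election_names)  # names found only in the shapefile
```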

OK, so it’s a small difference: the first vector has an ‘E’ where the second has an ‘e’. To stay inside the code rather than fix this by hand (remember this needs to be scalable), we’ll overwrite the shapefile’s names with the names from the election data. This is safe only because we now know that both vectors list the same divisions in the same order.

vicmapdata$Elect_div <- electiondata$Divisions

Now let’s run a check like we did before to make sure the values match up

setdiff(electiondata$Divisions, vicmapdata$Elect_div)
## character(0)

That’s better!
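Overwriting one vector with the other relies on both data sets listing the divisions in the same order. A more defensive sketch, assuming the names differ only by letter case, is to look up each shapefile name in the election data with match() on a lower-cased key and reorder the election data accordingly. The vectors below are stand-ins deliberately placed in different orders, and the Party values are placeholders:

```r
map_names <- c("Aston", "Mcewen", "Wills")         # shapefile order

results <- data.frame(
  Divisions = c("Wills", "Aston", "McEwen"),       # a different order
  Party     = c("party C", "party A", "party B"),  # placeholder values
  stringsAsFactors = FALSE
)

# For each shapefile name, find the row of the matching election record,
# ignoring letter case.
idx <- match(tolower(map_names), tolower(results$Divisions))
stopifnot(!anyNA(idx))  # fail loudly if any division has no match

# Reorder the election data so that row i describes polygon i.
results_aligned <- results[idx, ]
results_aligned$Divisions
```

Aligning this way matters later on, because the labels and fill colours are matched to the polygons purely by position.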

Creating the Visual

Here comes the fun bit! We’re going to use leaflet instead of the plot function we used above. We’ll pass it our ‘vicmapdata’. addPolygons will draw the boundaries for us when we set stroke = TRUE. We’ll use white lines with a weight (thickness) of 1.5.

leaflet(vicmapdata) %>%
  addPolygons(
    stroke = TRUE, 
    color = 'White', 
    weight = 1.5
  )

At this point we have two data objects: vicmapdata, the mapping data we have just plotted, and electiondata. Now we need to map our election data onto the plot.

First we’ll create the labels, so that hovering the mouse over an electorate shows the relevant statistics. Wrapping each label in htmltools::HTML tells leaflet to render the <br/> tags as line breaks instead of displaying them as text.

mylabels <- paste(
  "Electorate: ", vicmapdata$Elect_div, "<br/>",
  "Party: ", electiondata$Party, "<br/>",
  "Margin(votes): ", electiondata$`Marginal Votes`
) %>%
  lapply(htmltools::HTML)
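paste() separates its arguments with a space by default, so each label picks up extra spaces around the values (harmless in HTML, which collapses whitespace). Here is a sketch of the same label built with sprintf(), where the template makes the final string easier to see. The helper name make_label and the sample values are illustrative only:

```r
# Hypothetical helper: fills a fixed HTML template, one label per electorate.
make_label <- function(electorate, party, margin) {
  sprintf(
    "Electorate: %s<br/>Party: %s<br/>Margin (votes): %s",
    electorate, party, margin
  )
}

# Placeholder inputs; sprintf() is vectorised, so this returns two labels.
labels_txt <- make_label(
  c("Division A", "Division B"),
  c("Party X", "Party Y"),
  c("1,090", "10,934")
)
labels_txt[1]
```

The resulting character vector can be passed through lapply(htmltools::HTML) in the same way as mylabels.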

Now we’ll plug our mylabels variable into our previous leaflet code chunk

leaflet(vicmapdata) %>%
  addPolygons(
    stroke = TRUE, 
    color = 'White', 
    weight = 1.5, 
    label = mylabels,
    labelOptions = labelOptions( 
      style = list("font-weight" = "normal", padding = "3px 8px"), 
      textsize = "13px", 
      direction = "auto"
    )
  )

So far so good. Now we’ll add some colours. We create a palette function called factpal (factor palette) using the colorFactor function from the leaflet library. topo.colors(5) supplies 5 colours from the topo.colors palette, one for each winning party, and unique(electiondata$Party) gives colorFactor the distinct party names to map those colours to. The result is that every electorate with the same winning party is assigned the same colour.

factpal <- colorFactor(topo.colors(5), unique(electiondata$Party))
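topo.colors(5) hands out colours without regard to which party is which. If specific colours per party are wanted, one option is a named vector that is then given to colorFactor() with explicit levels. The hex values below are placeholders picked for illustration, not official party colours, and factpal_named is a hypothetical name:

```r
# Placeholder colours keyed by the party names in the election data.
party_colours <- c(
  "Australian Labor Party" = "#D7191C",
  "Independent"            = "#7F7F7F",
  "Liberal"                = "#2C7BB6",
  "The Greens"             = "#1A9641",
  "The Nationals"          = "#FDAE61"
)

# Base R lookup: index the named vector by party name.
parties <- c("Liberal", "The Greens", "Liberal")
unname(party_colours[parties])

# The same mapping as a leaflet palette function (needs the leaflet package).
if (requireNamespace("leaflet", quietly = TRUE)) {
  factpal_named <- leaflet::colorFactor(
    palette = unname(party_colours),
    domain  = NULL,
    levels  = names(party_colours)
  )
  factpal_named(parties)
}
```

A palette built this way could stand in for factpal in the leaflet calls, with the advantage that a party keeps its colour even if the data is reordered.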

And now we insert the factpal function into our leaflet code chunk

leaflet(vicmapdata) %>%
  addPolygons(
    fillColor = ~factpal(electiondata$Party), 
    stroke = TRUE, 
    color = 'White', 
    weight = 1.5, 
    label = mylabels,
    labelOptions = labelOptions( 
      style = list("font-weight" = "normal", padding = "3px 8px"), 
      textsize = "13px", 
      direction = "auto"
    )
  )

Looking better still.

From here we just need to add some text and a legend.

Text: We need a title and some reference links to the data

Legend: Which colours equate to which party

Here is the code for the text labels. We’re essentially just feeding in HTML

htmltitle <- "<h5> How Victoria voted in the 2019 Federal Election | House of Representatives</h5>"

references <- "<h5>References</h5><a target='_blank' href='https://results.aec.gov.au/24310/Website/HouseDivisionalResults-24310.htm'><h5>Election Data</h5></a><a target='_blank' href='https://aec.gov.au/Electorates/gis/gis_datadownload.htm'><h5>Geometric Data</h5></a>"

Now we just plug in our title and references from above and add the legend, which you can see below in the addLegend call.

leaflet(vicmapdata) %>%
  addPolygons(
    fillColor = ~factpal(electiondata$Party), 
    stroke = TRUE, 
    color = 'White', 
    weight = 1.5, 
    label = mylabels,
    labelOptions = labelOptions( 
      style = list("font-weight" = "normal", padding = "3px 8px"), 
      textsize = "13px", 
      direction = "auto"
    )
  ) %>%
  addLegend(
    pal = factpal,
    values = ~electiondata$Party,
    opacity = 0.3,
    title = "Political Party",
    position = "bottomleft"
  ) %>%
  addControl(html = htmltitle, position = "topright") %>%
  addControl(html = references, position = "bottomright")

Possible Continuations

  1. Add a timescale to show how election results have changed over time
  2. Create another visual to show which electorates changed hands
  3. Scale up for the whole of Australia