Juan Carlos Borrás

Visualizing political divergences: 2012 local elections in Helsinki

For more than a decade the newspaper Helsingin Sanomat has been running Vaalikone, a voter advice web application. In short: candidates answer a questionnaire and their answers are stored; voters visiting the web site can then take the same questionnaire and are shown the candidates whose answers are closest to their own.

It is fairly easy to construct arguments against it (the most obvious being the presence or absence of important political issues in the questionnaire), but I personally believe that such a tool helps to navigate among the more than a thousand candidates running here in Helsinki in next Sunday's local elections.

At the very least it is an opportunity to put the candidates' positions on some issues on display, should you need to remind them later on.

Politics aside, I was wondering whether I could use the questionnaire results to represent the differences between candidates and/or parties in some way. That is, I would like to map every single candidate onto a 2-D map where their political differences are somehow displayed. Kohonen's SOM comes to mind first, but I think I am going to go spectral and see what a quick PCA solution would look like.

Some quick notes on the questionnaire. All possible answers come from discrete sets (e.g. {Yes, No} or {Totally agree, Mostly agree, Indifferent, Somewhat disagree, Totally disagree}). Luckily an order relation can be established among the elements of each set, so I have mapped them to equidistant numerical values between -1 and 1, which comes in handy. For the examples above the mapped values are \( \{-1, 1\} \) and \( \{-1, -0.5, 0, 0.5, 1\} \) respectively.

No political interpretation is put on the scale beyond the order relation: for any issue, full agreement is mapped to 1 and full disagreement to -1, regardless of whether the issue is considered progressive or conservative, and the remaining answers are placed consistently with the order of their set.
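The mapping described above can be sketched in a few lines of R. This is only an illustration: the helper name `encode.answers` and the English answer labels are assumptions made for the example, not the code or labels actually used to prepare the data.

```r
## Hypothetical helper: map an ordered answer set to equidistant values in [-1, 1].
## `levels.ordered` lists the possible answers from full disagreement to full agreement.
encode.answers <- function(answers, levels.ordered) {
    scores <- seq(-1, 1, length.out = length(levels.ordered))
    scores[match(answers, levels.ordered)]
}

## Assumed labels, ordered from disagreement to agreement
likert <- c("Totally disagree", "Somewhat disagree", "Indifferent",
            "Mostly agree", "Totally agree")

five.point <- encode.answers(c("Totally agree", "Indifferent", "Somewhat disagree"), likert)
yes.no <- encode.answers(c("Yes", "No"), c("No", "Yes"))
print(five.point)  # 1.0  0.0 -0.5
print(yes.no)      # 1 -1
```

An answer that is not in `levels.ordered` comes back as `NA`, which makes unanswered questions easy to spot before running the PCA.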

Once the questionnaire answers are in place, and I have chosen only the candidates for Helsinki for this analysis, a little bit of R code (which you will find at the end of this post) can go a long way.

Let's start by visualizing parties. The selection is arbitrary, made only to avoid cluttering the graph with parties drawn in similar colors.

comparison.plot(subset(X, Party %in% c("Keskusta", "Kokoomus", "Perussuomalaiset", "SDP")))
Comparison KESK-KOK-PS-SDP

SDP, KESK and KOK have been the largest parties, representing approximately the left, center and right of the political spectrum respectively. Notice that their cluster centers roughly define a straight line. PS is a late arrival to the local political arena, with distinctly different views on many political aspects.

comparison.plot(subset(X, Party %in% c("Vasemmistoliitto", "Vihreät", "RKP", "Kristillisdemokraatit")))
Comparison KD-RKP-VAS-VIHR

VAS, VIHR, RKP and KD are parties with less national support, though Helsinki is a stronghold of VIHR. Notice that while VIHR and VAS have clearly separated cluster centers, RKP and KD show a significant amount of overlap on local issues.

comparison.plot(subset(X, Party %in% c("Piraattipuolue", "SKP")))
Comparison PIR-SKP

The graph above shows the disparity between the views of members of the Pirate Party and the Communist Party of Finland. Notice for instance that the variation of views within the latter is much narrower, especially when it comes to political views.

OK, so now that we have seen how things work, let's clear up a little of the clutter in some of the graphs above.

comparison.plot(subset(X, Party %in% c("Kokoomus", "Keskusta")))
Comparison KESK-KOK
comparison.plot(subset(X, Party %in% c("Kokoomus", "SDP")))
Comparison KOK-SDP
comparison.plot(subset(X, Party %in% c("Kokoomus", "Perussuomalaiset")))
Comparison KOK-PS

In general these show more agreement on local issues than on overall political views (larger overlap in the graphs on the left), although in the SDP/KOK comparison the overlap is smaller.

Let's have a closer look at the RKP/KD case, where the two parties overlap heavily on local issues.

comparison.plot(subset(X, Party %in% c("RKP", "Kristillisdemokraatit")))
Comparison KD-RKP

Then we can compare the parties that got the most seats in the city council. We will do side-by-side comparisons so as to avoid confusing SDP and VAS:

comparison.plot(subset(X, Party %in% c("Kokoomus", "Vihreät")))
Comparison KOK-VIHR
comparison.plot(subset(X, Party %in% c("SDP", "Vihreät")))
Comparison SDP-VIHR
comparison.plot(subset(X, Party %in% c("Vasemmistoliitto", "Vihreät")))
Comparison VAS-VIHR

And I will stop here.

Voters do have plenty of choices, and may in some circumstances find candidates with views very similar to their own coming from unsuspected parties.

From the graphs above, however, I find particularly suggestive the idea of voting not for the candidate that agrees most with one's views but for a candidate that pulls away from the party cluster center in a particular opposing direction, so that it will hopefully bias the decision-making process a little bit more. But that is just food for thought.
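As a rough sketch of that idea, the offset of each candidate from the center of their own party cluster can be computed directly from the projected coordinates. The data frame `toy`, its values and the preferred `direction` below are all made up for illustration; only the column names `Number`, `Party`, `V1` and `V2` follow the code in this post.

```r
## Toy projection with made-up values (column names follow the post's code)
toy <- data.frame(
    Number = 1:6,
    Party  = c("A", "A", "A", "B", "B", "B"),
    V1     = c(0, 1, 2, -1, -2, -3),
    V2     = c(0, 3, 0, 1, 1, 4),
    stringsAsFactors = FALSE)

## Offset of every candidate from the center of their own party cluster
toy$dV1 <- toy$V1 - ave(toy$V1, toy$Party)
toy$dV2 <- toy$V2 - ave(toy$V2, toy$Party)

## Assumed voter preference: "more of the first component"
direction <- c(1, 0)
toy$pull <- toy$dV1 * direction[1] + toy$dV2 * direction[2]

## Candidates of party "A" that pull away from its center in that direction
pulling <- toy[toy$Party == "A" & toy$pull > 0, c("Number", "pull")]
print(pulling)
```

With the real data the same computation would have to be done separately for each question type, since local issues and political views are projected independently.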

The R code for manipulating the questionnaire results and the function for generating the graphs above is as follows:


library(ggplot2)  # needed by comparison.plot

x <- read.csv("helsinki.csv", stringsAsFactors = FALSE)
questions <- sapply(seq(15), function(x) paste("X", x, sep = ""))
values <- sapply(seq(16, 25), function(x) paste("X", x, sep = ""))

xq <- x[, c("number", "party", questions)]  ## answers on local politics
xv <- x[, c("number", "party", values)]  ## answers on the candidate values

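pca.encoding <- function(x.in, k = 2) {
    z <- scale(x.in, scale = FALSE)  # center only; keep the scale (input in [-1,1] anyway)
    w <- t(z) %*% z  # scatter matrix, proportional to the covariance matrix
    W <- svd(w)  # columns of W$u are the principal directions
    as.data.frame(z %*% W$u[, 1:k])  # scores on the first k components (V1..Vk)
}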

princomp.encoding <- function(x.in, k = 2) {
    pc <- princomp(x.in, scores = TRUE)  # princomp always centers; it takes no center argument
    r <- predict(pc, x.in)[, seq(k), drop = FALSE]  ## Could have used pc$scores too
    colnames(r) <- paste0("V", seq(k))
    r
}

## Pick either one; the results are the same up to the sign of each component
## f <- pca.encoding
f <- princomp.encoding

Z <- lapply(list(local.issues = xq, political.views = xv), function(x.in) {
    xx <- as.matrix(x.in[, grepl("^X\\d{1,2}$", colnames(x.in))])
    data.frame(Number = x.in$number, Party = x.in$party, f(xx), stringsAsFactors = FALSE)
})

## Sorting out questions by type
Z$local.issues$Qtype <- "Local Issues"  ## questions on local politics
Z$political.views$Qtype <- "Political Views"  ## Personal political views by the candidate
X <- Reduce(rbind, Z)

parties.colormap <- c(Keskusta = "#1b9345ff", Kokoomus = "#00577dff", 
  Kristillisdemokraatit = "#f7931dff", Perussuomalaiset = "#edd866ff", 
  Piraattipuolue = "#000000ff", RKP = "#007ac9ff", SDP = "#ed1b24ff", 
  SKP = "#ff0000", Vasemmistoliitto = "#cd0009ff", Vihreät = "#61bf1aff")

# Solid shapes only (not used by comparison.plot)
parties.shapes <- c(Keskusta = 15, Kokoomus = 16, Kristillisdemokraatit = 17, 
    Perussuomalaiset = 15, Piraattipuolue = 16, RKP = 17, SDP = 15, SKP = 16, 
    Vasemmistoliitto = 17, Vihreät = 15)

comparison.plot <- function(x.in) {
    ## Density contours per party, one panel per question type
    p <- ggplot(x.in, aes(V1, V2, color = Party)) + geom_density2d() +
      facet_grid(. ~ Qtype)
    ## One point per candidate, colored with the party palette
    p <- p + geom_point(alpha = 0.5, size = 5) + 
      scale_color_manual(values = parties.colormap[unique(x.in$Party)])
    p <- p + scale_x_continuous(name = "First component")
    p <- p + scale_y_continuous(name = "Second component")
    p + labs(title = "Two dimensional projection of candidates' profile")
}
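The remark in the code that both encodings give the same result can be checked on synthetic data. The sketch below does not use the election data: the matrix `A` is randomly generated and stands in for 50 hypothetical candidates answering 5 questions. It compares the SVD route with `princomp` on absolute values, since each component is only defined up to its sign.

```r
set.seed(1)
## Synthetic answers: 50 hypothetical candidates, 5 questions on a 5-point scale
A <- matrix(sample(seq(-1, 1, by = 0.5), 50 * 5, replace = TRUE), ncol = 5)

z <- scale(A, scale = FALSE)                # center each question
via.svd <- z %*% svd(t(z) %*% z)$u[, 1:2]   # scores from the scatter matrix
via.princomp <- princomp(A)$scores[, 1:2]   # scores from princomp

## Largest discrepancy, ignoring the sign of each component
max.diff <- max(abs(abs(via.svd) - abs(via.princomp)))
print(max.diff)
```

The difference should be at the level of floating-point noise. If one of the two plots ever looks mirrored relative to the other, a flipped sign on one component is the likely reason.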