Hands-on Exercise 5.4: Visual Multivariate Analysis with Parallel Coordinates Plot
1 Getting Started
1.1 Installing and loading R packages
pacman::p_load(GGally, parallelPlot, tidyverse)
1.2 Importing the data
wh <- read_csv("../../data/WHData-2018.csv")
2 Plotting Static Parallel Coordinates Plot
2.1 Plotting a simple parallel coordinates
ggparcoord(data = wh,
columns = c(7:12))
Notice that only two arguments, namely data and columns, are used. The data argument maps the data object (i.e. wh), and the columns argument selects the columns used to prepare the parallel coordinates plot.
2.2 Plotting a parallel coordinates with boxplot
ggparcoord(data = wh,
columns = c(7:12),
groupColumn = 2,
scale = "uniminmax",
alphaLines = 0.2,
boxplot = TRUE,
title = "Parallel Coordinates Plot of World Happiness Variables")
Things to learn from the code chunk above.
- groupColumn argument is used to group the observations (i.e. parallel lines) by a single variable (i.e. Region) and to colour the parallel coordinates lines by region name.
- scale argument is used to scale the variables in the parallel coordinates plot by using the uniminmax method. The method scales each variable individually so that its minimum is zero and its maximum is one.
- alphaLines argument is used to reduce the intensity of the line colour to 0.2. The permissible values range from 0 to 1.
- boxplot argument is used to turn on the boxplot by using the logical value TRUE. The default is FALSE.
- title argument is used to give the parallel coordinates plot a title.
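To make the uniminmax idea concrete, here is a minimal base R sketch of per-variable min-max rescaling. It is an illustration only, not GGally's internal code: the helper name rescale_minmax and the toy data frame are hypothetical and do not come from the exercise data.

```r
# Min-max rescaling of one numeric vector to the range [0, 1].
# `rescale_minmax` is a hypothetical helper name, not a GGally function.
rescale_minmax <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Toy data standing in for two variables measured on very different scales.
toy <- data.frame(gdp = c(0.5, 1.0, 2.0),
                  life_expectancy = c(50, 65, 80))

# Rescale each column on its own, so every variable spans 0 to 1.
scaled <- as.data.frame(lapply(toy, rescale_minmax))
print(scaled)
```

Because each column is rescaled separately, variables with very different units end up on a common 0 to 1 axis, which is what lets their lines be compared side by side in the plot.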
2.3 Parallel coordinates with facet
Since ggparcoord() is developed by extending the ggplot2 package, we can combine it with some ggplot2 functions when plotting a parallel coordinates plot.
In the code chunk below, facet_wrap() of ggplot2 is used to plot 10 small multiple parallel coordinates plots. Each plot represents one geographical region, such as East Asia.
ggparcoord(data = wh,
columns = c(7:12),
groupColumn = 2,
scale = "uniminmax",
alphaLines = 0.2,
boxplot = TRUE,
title = "Multiple Parallel Coordinates Plots of World Happiness Variables by Region") +
facet_wrap(~ Region)
One aesthetic defect of the current design is that some of the variable names overlap on the x-axis.
2.4 Rotating x-axis text label
To make the x-axis text labels easier to read, let us rotate them by 30 degrees. We can rotate axis text labels by using the theme() function of ggplot2, as shown in the code chunk below.
ggparcoord(data = wh,
columns = c(7:12),
groupColumn = 2,
scale = "uniminmax",
alphaLines = 0.2,
boxplot = TRUE,
title = "Multiple Parallel Coordinates Plots of World Happiness Variables by Region") +
facet_wrap(~ Region) +
theme(axis.text.x = element_text(angle = 30))
Thing to learn from the code chunk above:
- To rotate x-axis text labels, we use axis.text.x as an argument to the theme() function, and we specify element_text(angle = 30) to rotate the x-axis text by 30 degrees.
2.5 Adjusting the rotated x-axis text label
Rotating the x-axis text labels by 30 degrees makes the labels overlap with the plot. We can avoid this by adjusting the text position with the hjust argument of element_text(). The element is passed to theme() through axis.text.x because we want to change the look of the x-axis text.
ggparcoord(data = wh,
columns = c(7:12),
groupColumn = 2,
scale = "uniminmax",
alphaLines = 0.2,
boxplot = TRUE,
title = "Multiple Parallel Coordinates Plots of World Happiness Variables by Region") +
facet_wrap(~ Region) +
theme(axis.text.x = element_text(angle = 30, hjust=1))
3 Plotting Interactive Parallel Coordinates Plot: parallelPlot methods
parallelPlot is an R package specially designed to plot parallel coordinates plots by using the ‘htmlwidgets’ package and d3.js. In this section, you will learn how to use the functions provided in the parallelPlot package to build interactive parallel coordinates plots.
3.1 The basic plot
The code chunk below plots an interactive parallel coordinates plot by using parallelPlot().
wh <- wh %>%
select("Happiness score", c(7:12))
parallelPlot(wh,
width = 320,
height = 250)
Notice that some of the axis labels are too long. You will learn how to overcome this problem in the next step.
3.2 Rotate axis label
In the code chunk below, the rotateTitle argument is used to avoid overlapping axis labels.
One useful interactive feature of parallelPlot is that when we click on a variable of interest, for example Happiness score, the default uniform blue colour is replaced by a blue colour scheme whose intensity varies with the values of that variable.
parallelPlot(wh,
rotateTitle = TRUE)
3.3 Changing the colour scheme
We can change the default blue colour scheme by using the continuousCS argument, as shown in the code chunk below.
parallelPlot(wh,
continuousCS = "YlOrRd",
rotateTitle = TRUE)
3.4 Parallel coordinates plot with histogram
The histoVisibility argument is used to plot a histogram along the axis of each variable.
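A minimal sketch of this step, assuming wh is the data frame prepared in section 3.1 and that histoVisibility takes one logical value per data column (TRUE to show that column's histogram). Switching every histogram on is an illustrative choice here, not necessarily the setting intended by the exercise.

```r
# Assumes the parallelPlot package is loaded and `wh` exists from section 3.1.
library(parallelPlot)

# One logical flag per column of `wh`; TRUE turns on that axis's histogram.
histoVisibility <- rep(TRUE, ncol(wh))

parallelPlot(wh,
             rotateTitle = TRUE,
             histoVisibility = histoVisibility)
```

To show histograms for only some variables, set the corresponding positions of the vector to FALSE before passing it to parallelPlot().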