An introduction to multivariate analysis

Load today's dataset, the dune meadow vegetation data shipped with vegan, and count the sites per management type

library(vegan)
data(dune)
data(dune.env)
table(dune.env$Management)


BF HF NM SF 
 3  5  6  6

Cluster analysis of the dune vegetation

We calculate two dissimilarity indices between sites: the Bray-Curtis distance and the chord distance.

bray_distance <- vegdist(dune)
# Chord distance: Euclidean distance between site vectors scaled to unit length.
chord_distance <- dist(decostand(dune, "norm"))
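To make the two indices concrete, here is a minimal sketch that recomputes both by hand for the first two sites and compares the results with the function output. It assumes the defaults used above (`vegdist()` with its default Bray-Curtis method on raw abundances, `decostand(..., "norm")` scaling each row to unit length). The names `x`, `y`, `bray_manual` and `chord_manual` are only illustrative.

```r
library(vegan)
data(dune)

# Abundance vectors of the first two sites
x <- as.numeric(dune[1, ])
y <- as.numeric(dune[2, ])

# Bray-Curtis: sum of absolute differences over the sum of all abundances
bray_manual <- sum(abs(x - y)) / sum(x + y)

# Chord: scale each site vector to unit length, then take the Euclidean distance
x_unit <- x / sqrt(sum(x^2))
y_unit <- y / sqrt(sum(y^2))
chord_manual <- sqrt(sum((x_unit - y_unit)^2))

# Compare with the values computed by vegdist() and dist(decostand())
all.equal(bray_manual, as.matrix(vegdist(dune))[1, 2])
all.equal(chord_manual, as.matrix(dist(decostand(dune, "norm")))[1, 2])
```

Because abundances are never negative, the chord distance between two sites lies between 0 and sqrt(2), while Bray-Curtis lies between 0 and 1.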

We perform the cluster analysis with hclust(). What is its default clustering method?

Let’s use "average" linkage, which joins clusters according to the mean dissimilarity between their members

library(cluster)
b_cluster <- hclust(bray_distance, method = "average")
c_cluster <- hclust(chord_distance, method = "average")
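To see what the linkage method changes, here is a small sketch on a made-up three-point example (the matrix `m` and the labels A, B, C are hypothetical, not part of the dune data). A and B merge first in every case; the height at which C joins them depends on the linkage.

```r
# Toy dissimilarities: d(A,B) = 1, d(A,C) = 4, d(B,C) = 6
m <- matrix(c(0, 1, 4,
              1, 0, 6,
              4, 6, 0),
            nrow = 3,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))
toy <- as.dist(m)

h_average  <- hclust(toy, method = "average")
h_single   <- hclust(toy, method = "single")
h_complete <- hclust(toy, method = "complete")

h_average$height   # 1 5: C joins at the mean of 4 and 6
h_single$height    # 1 4: C joins at the smallest distance
h_complete$height  # 1 6: C joins at the largest distance
```

Average linkage sits between the two extremes, which is one reason it is a common choice for community data.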

Let’s plot them side by side

par(mfrow = c(1, 2))
plot(b_cluster)
plot(c_cluster)

par(mfrow = c(1, 1))

OK, that’s a little ugly. Let’s drop the titles, align the leaves at the baseline, and draw a cleaner height axis

par(mfrow = c(1, 2))
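# Bray-Curtis ranges from 0 to 1
plot(b_cluster, hang = -1, main = "", axes = FALSE)
axis(2, at = seq(0, 1, 0.1), labels = seq(0, 1, 0.1), las = 2)
# Chord distance can reach sqrt(2), so extend the tick positions past 1
plot(c_cluster, hang = -1, main = "", axes = FALSE)
axis(2, at = seq(0, 1.4, 0.1), labels = seq(0, 1.4, 0.1), las = 2)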

par(mfrow = c(1, 1))