R Clustering Tree Plot


The hclust() function performs hierarchical cluster analysis on a set of dissimilarities. The cutree() function cuts a tree, such as one produced by hclust(), into several groups, either by specifying the desired number(s) of groups or the cut height(s).


hclust(d, method = "complete", members = NULL)
cutree(tree, k = NULL, h = NULL)

d: a dissimilarity structure as produced by dist().
method: the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average", "mcquitty", "median" or "centroid". Older versions of R used the single name "ward".
members: NULL or a vector with length size of d. See the ‘Details’ section of ?hclust.
tree: a tree as produced by hclust(). cutree() only expects a list with components merge, height, and labels, each of appropriate content.
k: an integer scalar or vector with the desired number of groups.
h: a numeric scalar or vector with heights where the tree should be cut.
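
To make the difference between k and h concrete, here is a minimal sketch on toy data. The six values and the names m, tree, by_k and by_h are made up for illustration and are not part of the data set used below.

```r
# Toy data (hypothetical): six points on a line that form three obvious pairs.
m <- matrix(c(0, 1, 10, 11, 50, 51), ncol = 1,
            dimnames = list(paste0("p", 1:6), "value"))

tree <- hclust(dist(m), method = "complete")

# Cut by number of groups.
by_k <- cutree(tree, k = 3)

# Cut by height. With complete linkage each pair merges at height 1,
# the first two pairs join at height 11, and everything joins at height 51,
# so any cut between 1 and 11 leaves three groups.
by_h <- cutree(tree, h = 5)

print(by_k)
print(by_h)

# Passing a vector for k returns a matrix with one column per requested k.
by_k_matrix <- cutree(tree, k = 2:3)
print(by_k_matrix)
```

Cutting by k is convenient when the number of groups is known in advance; cutting by h is useful when a dissimilarity threshold is meaningful for the data.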

Let's first have a look at our data file, named clustering.csv:

elements S1 S2 S3 S4 S5 S6 S7 S8
R1 -0.0027 0.1057 0.1976 0.0209 0 0.0089 0.0082 0.0209
R2 0 -0.1204 0.2627 0 0 0.283 0.2076 -0.0158
R3 0 -0.1204 0.2627 0 0 0.283 0.2076 -0.0158
R4 0.0142 0 -0.454 0.0101 -0.0213 -0.0084 -0.0121 0.0083
R5 0 0 -0.2334 0.007 0.4151 0 0.0987 0.021
R6 0.0381 0.0644 0.2302 0 0 -0.0476 0.2432 -0.0069
R7 0.0381 0.0644 0.2302 0 0 -0.0476 0.2432 -0.0069
R8 0.0381 0.0644 0.2302 0 0 -0.0476 0.2432 -0.0069
R9 0.0891 -0.1022 -0.4466 -0.4877 -0.0175 -0.0523 -0.4792 -0.0547
R10 0.0046 -0.1539 -0.4645 0 -0.0282 0 -0.0217 0.017
R11 0.0706 0.028 0.3626 0 0.0196 -0.0094 0.3086 0
R12 0.0311 0.0759 0.2119 0 -0.0022 0 0 0.0117
R13 0.0013 0.0702 -0.3176 0.0152 0.0095 -0.0224 0.2069 0.005
R14 0.0491 0.0525 -0.4329 0.0237 -0.0038 -0.0224 0.2065 0.005
R15 0.0256 0.0579 0.1846 0.0024 0.0029 -0.0165 0.4781 -0.0123
R16 -0.0061 -0.1554 -0.0635 0.0121 -0.0282 0 -0.016 0.017
R17 -0.0061 -0.1554 -0.0635 0.0121 -0.0282 0 -0.016 0.017


A simple unsupervised hierarchical clustering of the samples (the columns S1 to S8):

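> # Read the comma-separated data file; the first column holds the element names
> x <- read.csv("clustering.csv", header = TRUE, dec = ".", sep = ",")
> # Drop the first column and transpose so that the samples (columns) are clustered
> data.hclust <- hclust(dist(t(x[, 2:ncol(x)])), method = "complete")
> # Draw the dendrogram
> plot(data.hclust)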
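
The object returned by hclust() can also be inspected directly. The sketch below uses a small hypothetical data frame with the same layout as clustering.csv (an elements column followed by sample columns); the values and the names samples and fit are invented for illustration.

```r
# Hypothetical stand-in for clustering.csv: a label column plus four sample columns.
x <- data.frame(
  elements = paste0("R", 1:5),
  S1 = c(0.0, 0.1, 0.0, 0.2, 0.1),
  S2 = c(0.0, 0.1, 0.1, 0.2, 0.1),
  S3 = c(0.9, 0.8, 1.0, 0.7, 0.9),
  S4 = c(1.0, 0.8, 0.9, 0.7, 0.8)
)

# Drop the label column; t() turns samples into rows so that dist() compares samples.
samples <- t(x[, -1])
fit <- hclust(dist(samples), method = "complete")

print(fit$labels)  # the items that were clustered (here the sample names)
print(fit$merge)   # (n - 1) x 2 matrix; negative entries are single observations
print(fit$height)  # height at which each merge happened
print(fit$order)   # left-to-right order of the leaves in the dendrogram
```

Without the call to t(), dist() would compare the rows instead, and the tree would group the elements R1, R2, ... rather than the samples.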



Let's add some annotations to the labels and use cutree() to divide the tree into four groups:

> label <- data.hclust$labels
> # Prefix every odd-numbered sample with "control_"
> for (i in seq_along(label)) {
+   if (i %% 2 == 1) label[i] <- paste("control_", label[i], sep = "")
+ }
> data.hclust$labels <- label
> plot(data.hclust, main = "Hierarchical Clustering", xlab = "Samples")
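> # Outline the four clusters on the dendrogram
> rect.hclust(data.hclust, k = 4, border = "blue")
> # Group membership (1 to 4) for each sample
> groups <- cutree(data.hclust, k = 4)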
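
The vector returned by cutree() is a named integer vector, so it can be summarised with ordinary R tools. The following is a minimal sketch on hypothetical data with three well-separated pairs of samples; the values and the names fit, sizes and members are assumptions made for the example.

```r
# Hypothetical data: six samples that fall into three clearly separated pairs.
x <- data.frame(
  elements = c("R1", "R2", "R3"),
  S1 = c(0, 0, 0),
  S2 = c(0.1, 0, 0),
  S3 = c(5, 5, 5),
  S4 = c(5, 5.1, 5),
  S5 = c(10, 10, 10),
  S6 = c(10, 10, 10.1)
)

fit <- hclust(dist(t(x[, -1])), method = "complete")
groups <- cutree(fit, k = 3)

# Number of samples in each group.
sizes <- table(groups)
print(sizes)

# Sample names listed per group.
members <- split(names(groups), groups)
print(members)
```

The group numbers follow the order in which the samples first appear, so they are labels only and carry no ranking.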