R Significance Analysis of Microarrays (samr)


Significance Analysis of Microarrays (SAM) can be performed with the 'samr' package. To install the package:

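    > # BiocManager replaces the older biocLite script on current R versions
    > install.packages("BiocManager")
    > BiocManager::install("samr")
    > library(samr)   # load the package before calling samr()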

Suppose we have a log2-transformed microarray file named "samr.csv" (it can be downloaded at the end of the article) that contains 48 samples. The first 24 samples have phenotype A, and the other 24 have phenotype B. We want to find the Differentially Expressed Genes (DEGs) between these two groups.

Let's first read in the data from the file:

    > x1 <- read.csv("samr.csv", header=TRUE, sep=",", dec=".")

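Prepare the data in the form 'samr' expects: a list holding the expression matrix, the class labels, and a flag saying the values are log2-transformed:

    > xcol <- ncol(x1)
    > xrow <- nrow(x1)-1
    > x2 <- x1[2:xrow,2:xcol]         # drop the first column, the first row and the last row
    > y1 <- c(rep(1,24), rep(2,24))   # 1 = phenotype A, 2 = phenotype B
    > data <- list(x=as.matrix(x2),y=y1,logged2=TRUE)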
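
To see what the slicing above keeps, here is a minimal sketch on a made-up data frame. The name `toy` and all of its values are assumptions for illustration, not the contents of samr.csv. It mirrors the steps above on a smaller scale: the first column, the first row and the last row are dropped, and the rest becomes a numeric matrix with one class label per column.

```r
# Toy stand-in for samr.csv: 6 rows, an ID column plus 4 sample columns
# (all names and values here are invented for illustration).
toy <- data.frame(
  id = paste0("gene", 1:6),
  s1 = c(1.0, 2.0, 3.0, 4.0, 5.0, 6.0),
  s2 = c(1.5, 2.5, 3.5, 4.5, 5.5, 6.5),
  s3 = c(2.0, 3.0, 4.0, 5.0, 6.0, 7.0),
  s4 = c(2.5, 3.5, 4.5, 5.5, 6.5, 7.5)
)

xcol <- ncol(toy)              # 5
xrow <- nrow(toy) - 1          # 5
x2 <- toy[2:xrow, 2:xcol]      # rows 2..5, columns 2..5
y1 <- c(rep(1, 2), rep(2, 2))  # two samples per group in this toy case

data <- list(x = as.matrix(x2), y = y1, logged2 = TRUE)

dim(data$x)                            # 4 rows (genes) by 4 columns (samples)
stopifnot(ncol(data$x) == length(y1))  # one label per sample column
```

Note that row k of `x2` corresponds to row k + 1 of the original data frame, which matters when mapping results back to the input file.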

Calculate all significant genes:

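    > # permutation-based SAM for two unpaired classes
    > samr.obj <- samr(data, resp.type="Two class unpaired", nperms=100)
    > # FDR estimates over a grid of 200 delta thresholds
    > delta.table <- samr.compute.delta.table(samr.obj, min.foldchange=0.1, nvals=200)
    > # del=0 with all.genes=TRUE lists every gene, split into up and down tables
    > siggenes.table <- samr.compute.siggenes.table(samr.obj, del=0, data, delta.table, all.genes=TRUE)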

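Note: "min.foldchange" sets the minimum fold change between the two groups that a gene must show to be called significant. The samr documentation says this value should be greater than 1 (the default of 0 applies no fold-change criterion), so check that 0.1 gives the filtering you intend.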
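
For log2-transformed data, a fold change between two groups is commonly obtained by exponentiating the difference of the group means. The sketch below illustrates that arithmetic in base R on made-up numbers. It is not samr's internal code, and the variable names (`x`, `y`, `m1`, `m2`, `fold_change`) are assumptions for illustration.

```r
# Made-up log2 expression values: 3 genes (rows), 2 samples per group (columns).
x <- matrix(c(5, 5, 7, 7,
              8, 8, 7, 7,
              6, 6, 6, 6),
            nrow = 3, byrow = TRUE)
y <- c(1, 1, 2, 2)

m1 <- rowMeans(x[, y == 1, drop = FALSE])  # group 1 mean per gene (log2 scale)
m2 <- rowMeans(x[, y == 2, drop = FALSE])  # group 2 mean per gene (log2 scale)

fold_change <- 2^(m2 - m1)  # group 2 relative to group 1, on the raw scale
fold_change                 # 4, 0.5, 1
```

Here a value of 4 means a fourfold increase in group 2, 0.5 means a halving, and 1 means no change.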

Let's write all DEGs with FDR < 10% into a file (the 8th column of the up and down gene tables holds the q-value, in percent):

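    > a <- siggenes.table$genes.up   # all up-regulated genes
    > b <- siggenes.table$genes.lo   # all down-regulated genes
    > degs <- rbind(a,b)             # named 'degs' to avoid masking base::c
    > lo <- degs[as.numeric(degs[,8])<10, , drop=FALSE]   # keep q-value < 10%
    > for (i in seq_len(nrow(lo)))
    + {
    +     # column 1 is the row index reported by samr; check that row tp of x1
    +     # is the right gene for your file layout (x2 started at row 2 of x1)
    +     tp <- as.numeric(as.vector(as.matrix(lo[i,1])))-1
    +     lo[i,3] <- as.character(as.vector(as.matrix(x1[tp,1])))
    + }
    > write.csv(lo,"DEGs_samr.csv")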
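
The filtering and lookup above can also be written without an explicit loop. The following is a minimal sketch on a made-up table. The column layout (row index in column 1, name in column 3, q-value in percent in column 8) follows the description above, while the toy values and the names `tab`, `ids`, `keep`, `sig` and `idx` are assumptions for illustration.

```r
# Made-up stand-in for the combined up/down table (character matrix, 8 columns).
tab <- matrix(c("2", "g1", "", "3.1",  "1.2",  "0.4", "2.0", "0",
                "4", "g3", "", "2.5",  "0.9",  "0.4", "1.8", "5",
                "3", "g2", "", "-0.7", "-0.2", "0.3", "0.9", "40"),
              ncol = 8, byrow = TRUE)

# Made-up ID column, standing in for the first column of the input file.
ids <- c("probeA", "probeB", "probeC", "probeD")

# Keep rows whose 8th column (q-value, %) is below 10.
keep <- as.numeric(tab[, 8]) < 10
sig  <- tab[keep, , drop = FALSE]  # drop = FALSE keeps a matrix even for one row

# Look up all IDs at once, using the same "index minus one" rule as the loop.
idx <- as.integer(sig[, 1]) - 1
sig[, 3] <- ids[idx]

sig[, 3]  # "probeA" "probeC"
```

Indexing with a whole vector replaces the per-row loop, and `drop = FALSE` avoids the case where a single surviving row collapses into a plain vector.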