R split Function


The split() function divides the data in a vector or data frame into groups defined by a factor. The unsplit() function reverses the operation.

split(x, f, drop = FALSE, ...)
split(x, f, drop = FALSE, ...) <- value
unsplit(value, f, drop = FALSE)


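x: vector or data frame containing the values to be divided into groups
f: factor (or list of factors) that defines the grouping
drop: logical; if TRUE, levels of f that do not occur in the data are dropped
value: list of vectors or data frames compatible with a splitting of x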
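
As a minimal sketch of what the drop argument does, the following uses a small made-up vector and factor (the names values and groups are hypothetical and are not part of the data file used below). The factor declares a level "c" that never occurs in the data:

```r
# Toy data, invented for illustration only
values <- c(10, 20, 30, 40)
# "c" is a declared level that never appears in the data
groups <- factor(c("a", "b", "a", "b"), levels = c("a", "b", "c"))

kept    <- split(values, groups)               # drop = FALSE (the default)
dropped <- split(values, groups, drop = TRUE)

names(kept)      # "a" "b" "c"
length(kept$c)   # 0, an empty group is kept for the unused level
names(dropped)   # "a" "b"
```

With the default drop = FALSE the result always has one element per level, which can be convenient when several data sets have to be compared level by level.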

The following file, which has been used for ANOVA analysis, provides the example data:

(Download the data file)

Let's first read in the data from the file:

>x <- read.csv("anova.csv", header=TRUE, sep=",")


Split the "Expression" values into two groups based on the "Gender" variable: "f" for the female group and "m" for the male group:

>g <- split(x$Expression, x$Gender)
>g

$f
[1] -0.66 -1.15 -0.30 -0.40 -0.24 -0.92 0.48 -1.68 -0.80 -0.55 -0.11 -1.26
[13] -0.11 0.13 0.81 0.45 0.74 -0.31 -0.18 -0.08 0.54 -0.35 0.38 -0.39
[25] -1.49 -0.77 -0.92 -0.35 0.26 -0.78 1.20 0.06 -0.68 -0.44 0.93 -0.35
[37] 0.11 -0.12 -0.22 0.29 -0.67 -0.03 -0.57 0.19 -1.80 -0.81 1.80 -0.99
[49] -2.22 -1.06 0.06 -1.68 -0.64 0.29 -0.13 -0.84 0.44 -1.32 -0.54 -0.05
[61] 0.23 0.38 0.35 0.30 -0.33 0.79 -0.06 -0.88 0.32 -0.45 0.21 -2.03
[73] 0.59 -0.92 -0.07 -0.39 -0.98 -0.11 -0.73 -1.01 -0.50 -0.16 -0.59 1.13
[85] 1.01 0.21 -0.21 -1.05 0.10 -1.81 -1.18 0.49 -1.74 -1.57 0.46 1.31
[97] 0.44 -2.08 -1.62 -1.53 0.03 -0.42 -1.86 -1.99 -0.25 -2.11 -0.93 0.42
[109] -1.13 -0.92 0.38 -2.01 1.42 0.10 -2.17 0.13 -1.75 -1.18 0.85 0.64
[121] 0.97 -0.72 -0.04 0.38 -1.87 -2.09 -1.54 0.09 -0.25 0.51 0.33 -1.29
[133] -0.51 -0.50 -0.52

$m
[1] -0.54 -0.80 -1.03 -0.41 -1.31 -0.43 1.01 0.14 1.42 -0.16 0.15 -0.62
[13] -0.42 -0.35 -0.42 0.32 -0.57 -0.07 -0.06 0.02 -0.39 -0.74 -0.09 -0.03
[25] 0.18 0.25 -0.39 -0.24 -0.30 0.25 -0.42 0.54 0.03 -0.66 0.30 -0.38
[37] -0.03 -0.62 0.14 -0.77 -0.09 -0.80 -0.41 -0.88 -0.27 -0.07 -1.60 -0.79
[49] -0.33 1.31 -0.33 -0.43 -0.92 -0.29 -1.02 0.41 -0.81 0.61 -0.63 -0.49
[61] 0.18 0.17 0.24 -0.12 -0.24 -0.26 1.48 0.04 -0.56 -1.12 -0.19 0.27
[73] -1.28 -0.38 -0.83 0.25 -0.14 0.29 0.18 0.44 -0.28 0.08 -0.29 -0.62
[85] -0.87 0.19 0.34 0.54 0.02 -0.39 1.25 -0.51 0.05 -0.36 -0.19 -0.10
[97] 0.08 -1.16 1.58 0.59 -0.19 0.56 -0.22 -0.77 -0.12 -0.76 0.35 -0.69
[109] -0.20 -0.44 -1.98 0.00 -0.54 -0.61 -1.39 0.44 0.20 -0.78 -0.96 -0.10
[121] 0.39 -1.11 -1.78 -1.46 1.00 -1.34 -0.72 -0.47 0.15 1.67 0.81 0.16
[133] -0.39 -0.40 1.18 -0.30 -1.91 -1.14 0.13 -0.34 -0.44 0.52 1.11 -0.89
[145] -0.17 -1.62
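
split() also accepts a whole data frame, in which case each group is itself a data frame. The sketch below assumes a tiny stand-in for the real file: the data frame df and its five rows are hypothetical, and only the column names "Gender" and "Expression" are taken from the example above.

```r
# Hypothetical stand-in for the contents of anova.csv
df <- data.frame(
  Gender     = c("f", "m", "f", "m", "m"),
  Expression = c(-0.66, -0.54, -1.15, -0.80, -1.03)
)

# Split every column at once, one data frame per gender
by_gender <- split(df, df$Gender)

names(by_gender)        # "f" "m"
nrow(by_gender$f)       # 2
nrow(by_gender$m)       # 3
by_gender$m$Expression  # -0.54 -0.80 -1.03
```

Splitting the data frame rather than a single column keeps the other variables available within each group.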


Calculate the length and mean value of each group:

>sapply(g,length)

  f   m
135 146


>sapply(g,mean)

         f          m
-0.3946667 -0.2227397
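
Both summaries can also be computed in a single call. The following is a sketch on a small hand-made list (g_toy is hypothetical and merely stands in for the list g above): when the function returns a named vector for each group, sapply() arranges the results in a matrix with one column per group.

```r
# Small hand-made list standing in for g
g_toy <- list(f = c(-0.66, -1.15, -0.30), m = c(-0.54, -0.80))

# One column per group, one row per statistic
stats <- sapply(g_toy, function(v) c(n = length(v), mean = mean(v)))

dim(stats)       # 2 2
rownames(stats)  # "n" "mean"
stats["n", ]     # f = 3, m = 2
```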


You may also use lapply(), which returns a list instead of a vector:

>lapply(g,mean)

$f
[1] -0.3946667

$m
[1] -0.2227397
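
The difference between the two return types can be checked directly. This is a minimal sketch on a hand-made list (g_toy is a hypothetical stand-in for g):

```r
# Small hand-made list standing in for g
g_toy <- list(f = c(-0.66, -1.15, -0.30), m = c(-0.54, -0.80))

as_list   <- lapply(g_toy, mean)   # list with elements $f and $m
as_vector <- sapply(g_toy, mean)   # named numeric vector

is.list(as_list)        # TRUE
is.numeric(as_vector)   # TRUE
# Flattening the list gives the same named vector that sapply() returns
all.equal(unlist(as_list), as_vector)   # TRUE
```

A list is the safer choice when the per-group results may differ in length or type; a vector is easier to print and to pass on to other functions.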


The unsplit() function recombines the groups into a single vector, with the values restored to their original positions:

>unsplit(g,x$Gender)
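
The round trip can be verified on a small example. The sketch below uses made-up data (values and gender are hypothetical) and also illustrates the replacement form split(x, f) <- value from the usage section, here used to center each group on its own mean:

```r
# Toy data, invented for illustration only
values <- c(-0.66, -0.54, -1.15, -0.80, -1.03)
gender <- factor(c("f", "m", "f", "m", "m"))

# split() followed by unsplit() restores the original order
g_toy    <- split(values, gender)
restored <- unsplit(g_toy, gender)
identical(restored, values)   # TRUE

# Replacement form: write modified groups back into their original positions
centered <- values
split(centered, gender) <- lapply(split(values, gender), function(v) v - mean(v))

# Each group of the centered vector now has a mean of (numerically) zero
sapply(split(centered, gender), mean)
```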