Commonly used R commands (statistics)

When I say Ease of Use Improved, I mean you can simply copy, paste, and run the code in this post without referring to other places and without downloading a data file and reading it into R. This is how I like a blog article to be. You don’t need to read the whole article. Just Ctrl+F for what you need, copy the code there, and run it.

I use R on Windows and sometimes on Linux. The version is 2.13.0. The following scripts should also work in other versions.

Read a File to a Table

congold <- read.table("C:/Users/Jun/Dropbox/Research/LQCD/exp/96.24.24.24/congrad.old.txt", header=TRUE)

Hmm... You can’t copy and run this on your system, since you don’t have that file. congold is a table (a data frame); the first argument of read.table() is the path of the file. On Windows, you should use “/” in the path instead of “\” (or escape each backslash as “\\”).
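
To make the read.table() call runnable without the original file, here is a minimal sketch that writes a small table to a temporary file and reads it back. The column names iter and resid and their values are invented for illustration; they are not taken from congrad.old.txt.

```r
# Hypothetical data: the columns iter and resid are made up for this sketch.
tmpfile <- tempfile(fileext = ".txt")
writeLines(c("iter resid",
             "1 0.50",
             "2 0.25",
             "3 0.125"), tmpfile)

# header=TRUE tells read.table() that the first line holds the column names.
congold <- read.table(tmpfile, header = TRUE)
str(congold)

# file.path() joins path pieces with "/", which R also accepts on Windows.
file.path("C:", "Users", "me", "data.txt")
```

Using tempfile() keeps the example independent of any folder layout on your machine.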

Boxplot

 d = rnorm(10)
 t = rep(c(1,2),c(5,5))
 boxplot(d~t)

Get subset

 df = data.frame(col1=c(1,2,3,4),col2=c(1,1,2,2))
 subset(df,col2==2)

Find out how many unique items are in a vector

 a = c(5,5,6,6,6)
 length(unique(a))
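
If you also want to know how often each unique item occurs, table() is one way to get that. A small sketch using the same vector:

```r
a <- c(5, 5, 6, 6, 6)
counts <- table(a)      # one count per distinct value
counts
names(counts)           # the distinct values, as character strings
as.vector(counts)       # how often each one occurs
length(counts)          # same number as length(unique(a))
```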

Viewing Several Graphs

In Windows

 windows()

In Linux

 X11()

In Mac

 quartz()

Delete Columns by Names

 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 # Logical negation is safer than -which(): if no name matches, -which() drops every column.
 df <- df[, !(names(df) %in% c("z","t"))]

An easier way:

 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 df <- subset(df, select=-c(z,t))

Actually, it is done by selecting the columns you want. So we have the following:

Select Columns by Names

 # Index with a character vector of column names
 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 df[, c("x","y")]
 # Or use subset() with unquoted names
 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 subset(df, select=c(x,y))
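
Since deleting columns is really just selecting the remaining ones, the two tasks can be tied together with setdiff(). This is a sketch, and the variable names to_drop, keep and df2 are only illustrative:

```r
df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
to_drop <- c("z", "t")                  # columns to remove
keep <- setdiff(names(df), to_drop)     # everything else, in the original order
df2 <- df[, keep, drop = FALSE]         # drop=FALSE keeps a data frame even if one column remains
names(df2)
```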

Print out Column Names

 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 names(df)

Change Column Names

 # Rename one column by position
 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 names(df)[[1]]="newNameForColumn1"
 # Rename all columns at once
 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 names(df)=c("newNameForColumn1", "newNameForColumn2", "newNameForColumn3","newNameForColumn4")
 # Rename one column by its current name
 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 names(df)[which(names(df)=="y")]= "NewNameOf_y"

Regression Plot

 library(lattice)
 x = 1:100
 y = rnorm(100)
 xyplot(x~y, type=c("r","p"))

Find the 95th and 99th Percentiles of Each Category

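 library(doBy)
 # Put x and y into a data frame first; data=df needs df to hold these columns.
 df = data.frame(x=rep(c(1,2),50), y=rnorm(100))
 summaryBy(y~x, data=df, FUN=function(x){quantile(x,c(0.95,0.99))})
 # The same with base R's aggregate(), one percentile at a time.
 df = data.frame(x=rep(c(1,2),50), y=rnorm(100))
 aggregate(y~x, data = df, function(x){quantile(x,0.95)})
 aggregate(y~x, data = df, function(x){quantile(x,0.99)})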
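
If you would rather not install doBy, both percentiles can also be computed per category in one step with split() and sapply() from base R. This is only a sketch; the seed is arbitrary and is there so the random numbers are repeatable.

```r
set.seed(1)   # arbitrary seed, only for repeatability
df <- data.frame(x = rep(c(1, 2), 50), y = rnorm(100))
# split() groups y by x; sapply() applies quantile() to each group.
q <- sapply(split(df$y, df$x), quantile, probs = c(0.95, 0.99))
q   # rows are the percentiles, columns are the categories
```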

Get Median of Each Factor in a data frame (each type has many rows)

 df = data.frame(x=rep(c(1,2),50), y=rnorm(100))
 aggregate(y~x, data = df, median)

To count rows or columns

 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 nrow(df)
 df <- data.frame(x=rep(1,3), y=rep(2,3), z=rep(3,3), t=rep(4,3))
 ncol(df)

Create empty matrix or vector

 mymatrix <- mat.or.vec(2,3)
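
Note that mat.or.vec() fills the result with zeros rather than leaving it truly empty. A short sketch of what it returns, next to two alternatives you might want instead (the variable names are only illustrative):

```r
m0 <- mat.or.vec(2, 3)                        # 2 x 3 matrix of zeros
v0 <- mat.or.vec(3, 1)                        # one column requested: a plain vector of three zeros
mNA <- matrix(NA_real_, nrow = 2, ncol = 3)   # filled with NA instead of 0
vEmpty <- numeric(0)                          # a numeric vector with no elements at all
```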

Replace data in data frame

 tmp = data.frame("a"=c(1,2,3,4))
 selected = tmp == 2
 selected
 tmp[selected] = 22
 tmp
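
When the data frame has more than one column, comparing the whole data frame to 2 matches that value in every column. If only one column should change, a sketch like the following restricts the replacement to it (column b is added here just for illustration):

```r
tmp <- data.frame(a = c(1, 2, 3, 4), b = c(2, 2, 5, 6))
tmp$a[tmp$a == 2] <- 22     # only column a is touched
tmp
```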

Convert Factor to Number

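 size <- factor(c(55,44,33,22,11))
 size
 as.numeric(size)                  # wrong: returns the internal level codes 5 4 3 2 1
 levels(size)[size]                # the labels as strings: "55" "44" "33" "22" "11"
 as.numeric(levels(size)[size])    # right: 55 44 33 22 11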
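
Another common way to do the same conversion goes through as.character(). The sketch below uses made-up values to show that the level codes from as.numeric() are not the numbers you started with:

```r
size <- factor(c("10", "9", "100"))       # illustrative values
levels(size)                              # levels are sorted as text, not as numbers
codes <- as.numeric(size)                 # internal level codes, not the values
values <- as.numeric(as.character(size))  # the actual numbers
codes
values
```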

Change the order of columns

 df = data.frame("a"=c(1,1), "b"=c(2,2), "c"=c(3,3))
 df
 df = subset(df, select=c(c,b,a))
 df

Order Data Frame

 df = data.frame(a=c(4,5,6),b=c(9,8,7))
 df = df[order(df$b),]
 df = data.frame(a=c(4,5,6),b=c(9,8,7),c=c(11,12,12))
 df
 df[order(df$c,df$b),]
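
order() sorts in ascending order by default. For descending order, one option is decreasing=TRUE; for numeric columns, negating a key also lets you mix directions. A sketch with the same data (desc and mixed are illustrative names):

```r
df <- data.frame(a = c(4, 5, 6), b = c(9, 8, 7), c = c(11, 12, 12))
desc <- df[order(df$c, decreasing = TRUE), ]   # largest c first
mixed <- df[order(-df$c, df$b), ]              # c descending, ties broken by b ascending
desc
mixed
```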

Too much to organize from my notes…

Maybe I’ll pick it up later, or not…