- Scribbled Notes
- Info
- Input/Output
- Operators
- Data Structures
- Vector
- List
- Matrix
- Data Frame
- Vector
- Plotting
- Syntax
- Functions
- Libraries
- Environment
- Emacs
Note: in Blogger's dynamic template, unfortunately name anchors do not work, so you cannot use the above list to jump to the section of interest.
Scribbled Notes
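This section contains raw scribbled notes that still have to be revised.
- return(x): return is written as a function call, with parentheses.
- Compute matrix^-1 with solve(matrix).
- Compute x'A^-1x as x %*% solve(A,x), which solves the linear system instead of forming the inverse.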
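As a rough sketch of the note on solve(), the two ways of computing the quadratic form can be compared on a small example. The matrix A and the vector x below are made-up values, not taken from the notes.

```r
# Hypothetical example data: a small symmetric matrix and a vector.
A <- matrix(c(2, 1, 1, 3), nrow = 2)
x <- c(1, 2)

# Variant 1: form the inverse explicitly, then multiply.
A_inv <- solve(A)
q_inverse <- drop(t(x) %*% A_inv %*% x)

# Variant 2: solve the system A z = x and take the inner product with x.
q_solve <- drop(x %*% solve(A, x))

# Both variants should agree up to rounding.
all.equal(q_inverse, q_solve)
```

The second variant is usually preferable because it never builds the full inverse.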
Info
search() lists the environments on the search path, starting with the global environment. The other entries are usually attached packages. The contents of an environment listed by search() may then be listed by ls(index) or ls('name'). Just ls() is like ls(1), which refers to ".GlobalEnv". For listing the contents of a package, use ls('package:libname').
dir() instead lists objects in directories on the file system, by default the current directory.
library() lists all installed packages, or loads one when called with a package name.
help(name) shows the documentation for an exact name, while apropos(name) lists all objects whose name contains the given word somewhere. A shortcut for help(name) is ?name.
args(name) shows the arguments and default values of a function.
Typing the name of any function without parentheses lists the source code of this function. This is great for finding out in detail what it does, and for learning to program in R.
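The lookup functions above can be tried out in a fresh session roughly like this. The choice of the base package and of paste() as the function to inspect is arbitrary.

```r
# The search path: the global environment comes first, then attached packages.
path <- search()
path[1]

# List the contents of one attached package by name.
base_objects <- ls("package:base")
head(base_objects)

# Show the arguments and default values of a function.
args(paste)

# Typing the name without parentheses prints the source code.
paste
```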
Input/Output
Save data with save(obj, file="filename") and load it back with load("filename"). The data file is binary, and its name should end in .rda.
When data() is used to load a dataset, R searches for data files in the data subdirectories of the working directory or of the directories of loaded packages. Files are handled according to their extension:
- .R and .r files are source()ed as R source code
- .RData and .rda are loaded as binary files
- .tab .txt .csv are read with read.table().
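A minimal round trip with save() and load() might look like the following. The object name scores and its values are invented for illustration, and tempfile() is used only so that the sketch does not write into the working directory.

```r
# Hypothetical object to store.
scores <- c(Frodo = 10, Sam = 8)

# Save it to a binary .rda file in the temporary directory.
rda_file <- tempfile(fileext = ".rda")
save(scores, file = rda_file)

# Remove the object, then restore it under its original name.
rm(scores)
load(rda_file)
scores
```

Note that load() recreates the object under the name it was saved with; it does not return the value.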
Operators
- <-, = assignment
- ==, <= comparison
- %o% outer product
- %*% matrix multiplication
- : sequence generation
- *, /, +, - elementwise multiplication, division, addition and subtraction
- |, & elementwise or, and (applied to whole vectors)
- ||, && short-circuiting or, and (for single values, as in if conditions)
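A few of these operators applied to two small, made-up vectors, as a quick sketch of how the elementwise and the matrix variants differ:

```r
a <- c(1, 2, 3)
b <- c(4, 5, 6)

elementwise <- a * b        # multiplies position by position
inner <- a %*% b            # matrix product, here a 1 x 1 matrix
outer_prod <- a %o% b       # 3 x 3 outer product
sequence <- 1:5             # sequence generation

# Elementwise logical "and" works on whole vectors.
both <- a > 1 & b < 6

# The short-circuit form stops as soon as the result is known,
# so the stop() call on the right is never evaluated.
ok <- TRUE || stop("never evaluated")
```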
Data Structures
The most irritating thing for me as a beginner with R is the data structures, which vary quite a bit from other programming languages, seem redundant and sometimes not very, well, structured. For starters, INDEXES START FROM 1. Not from zero, like any well-behaved index should.
There are vectors, arrays, matrices, factors, lists, and data frames. R knows no scalars. Most of the basic indexing and naming stuff that applies to all these datastructures is covered under Vector.
 | linear | rectangular |
all same type | vector | matrix |
mixed type | list | data frame |
Literals and Names
TRUE, FALSE, NA
Names are case sensitive, must start with a letter and may contain digits, letters and the dot. Very old versions of R did not allow the underscore in names; current versions do.
Vector
Vectors are the simplest kind of data object. All elements must be of the same type (logical, integer, real, complex or character). Even vectors can be indexed by name. Note that literal vectors are created by the c() function, not just by parentheses. Missing values are represented by NA.
Creation | c(2,3,4) 1:10 seq(-5,5,by=.2) rep(x,times=5) a>2 |
Names | names(x) = c("Frodo", "Bilbo", "Sam") c("Frodo"="Ringbearer", "Bilbo"="Old One", "Sam"="Sidekick") |
Indexing | |
Useful funcs | sum mean var length sort |
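The following sketch runs through creation, naming and a few common ways of indexing a vector. The indexing forms shown (by position, by name, by logical condition, by negative index) are standard R, but the selection is mine and is not taken from the notes.

```r
# Creation
v <- c(2, 3, 4)
s <- seq(-5, 5, by = 0.2)
r <- rep(v, times = 2)
flags <- v > 2                 # a logical vector

# Names
names(v) <- c("Frodo", "Bilbo", "Sam")

# Indexing
first <- v[1]                  # by position, counting from 1
bilbo <- v["Bilbo"]            # by name
large <- v[v > 2]              # by logical condition
rest <- v[-1]                  # negative index drops an element

# There are no scalars: a single number is a vector of length one.
length(5)
```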
Factors
Factors are vectors whose values fall into discrete classes. Levels are the different unique values of a factor.
Creation | factor(c("Man", "Orc", "Orc", "Elf", "Man")) |
Levels | levels(x) |
Useful funcs | tapply(vector, factor, function) |
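As a small sketch of how a factor and tapply() work together, the example below groups a made-up height vector by the factor from the table above. The heights are invented values.

```r
race <- factor(c("Man", "Orc", "Orc", "Elf", "Man"))
levels(race)                       # the unique classes, sorted

# Hypothetical measurements, one per element of the factor.
height <- c(180, 150, 160, 190, 170)

# Apply mean() to the heights within each level of the factor.
mean_height <- tapply(height, race, mean)
mean_height
```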
List
Lists are like vectors, but can contain mixed elements of any kind of object, especially other lists. So you can build up complex data structures from them (hello, Lisp!).
Creation | list(elements) as.list(vector) |
Indexing | |
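A short sketch of list creation and the usual ways of getting at list elements. The indexing forms are standard R rather than something taken from the notes, and the example data is invented.

```r
# A list may mix types and contain other lists.
hobbit <- list(name = "Frodo", age = 50, friends = list("Sam", "Merry"))

name <- hobbit$name              # element by name
age <- hobbit[["age"]]           # element by name, double brackets
sub <- hobbit["age"]             # single brackets return a sub-list
friend <- hobbit$friends[[2]]    # element of a nested list

# Convert a vector into a list with one element per entry.
as_list <- as.list(1:3)
```

The difference between [[ ]] and [ ] is the main thing to remember: the first returns the element itself, the second a list containing it.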
Array
Arrays are vectors with more than one dimension (they carry a dim attribute).
Matrix
A matrix is a two-dimensional vector.
Creation | matrix(data,nrow,ncol) as.matrix(object) rbind(vec1, vec2) row-wise |
Useful funcs | dim |
Indexing | There are two ways of indexing matrices. One, treating the matrix as one large vector: this method is used if an index of only one dimension is given. Elements are counted running down each column from top to bottom, then through the columns from left to right; compare as.vector() and the indexing under Vector. Two, treating the matrix as two-dimensional: this is used if a two-dimensional index is given (using a comma). |
Notes on indexing: Unlike in data frames, indexing with only a single dimension returns a single element, not a whole column.
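The two indexing styles can be seen side by side on a small matrix. The values 1:6 are chosen only so that each element shows its own position in column order.

```r
# matrix() fills column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2, ncol = 3)
dim(m)

# One-dimensional index: the matrix is treated as one long vector.
fourth <- m[4]

# Two-dimensional index: row, then column.
corner <- m[2, 3]

# Leaving one dimension empty selects a whole row or column.
second_col <- m[, 2]

# rbind() builds a matrix row-wise from vectors.
rows <- rbind(c(1, 2), c(3, 4))
```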
Data Frame
A data frame looks like a matrix but may have different types in different columns. Each column is a vector.
Creation | |
Useful funcs | |
Indexing | |
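Since the table cells for data frames are empty, here is a minimal sketch of creation and indexing using standard R calls. The column names and values are invented, and stringsAsFactors = FALSE is set explicitly so that the result does not depend on the R version.

```r
# Creation: each argument becomes a column.
df <- data.frame(
  name = c("Frodo", "Bilbo", "Sam"),
  age = c(50, 111, 38),
  stringsAsFactors = FALSE
)

dim(df)                           # rows, columns
names(df)                         # column names

ages <- df$age                    # one column as a vector
second <- df[2, ]                 # one row as a data frame
old <- df[df$age > 40, "name"]    # rows by condition, one column
age_col <- df["age"]              # single index selects a column
```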
Plotting
plot() for general plotting. Use pch='.' to plot with dots as plotting characters.
abline(intercept, slope) draws a line into the existing plot.
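A small sketch combining plot() and abline(). The data is simulated, and the plot is written to a temporary PDF file only so that the example also runs without a screen device; in an interactive session the pdf() and dev.off() lines can be dropped.

```r
set.seed(1)
x <- seq(-5, 5, by = 0.2)
y <- 1 + 2 * x + rnorm(length(x))   # hypothetical noisy line

plot_file <- tempfile(fileext = ".pdf")
pdf(plot_file)                      # open a file device
plot(x, y, pch = ".")               # scatter plot with dots
abline(1, 2)                        # line with intercept 1 and slope 2
dev.off()                           # close the device, finishing the file
```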
Syntax
- # starts a comment
- Lexical (static) scoping: all variables that are parameters of a function or assigned to within it are local; all others are free and are looked up in the enclosing environments, up to the global one.
- Object access (indices count from 1, not from 0): A[M==2] # all elements of A at positions where M == 2
- Function definition: funcname <- function(param,..,defparam=expr) expr
- A { } block is also an expression; it evaluates to the last statement within.
- The expression ... may be used for pass-through argument lists.
- if (expr1) expr2 else expr3
- for (var in vector) expr
- break, next
- switch (var, key1 = statement, key2 = statement)
- while (cond) expr
- repeat expr # must be broken by break from within
- is.null(item)
- # Method calls
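The syntax elements above can be put together in a short sketch. All function and variable names here are made up for illustration.

```r
# Function with a default parameter and pass-through arguments.
# The { } block evaluates to its last statement, so no return() is needed.
power_sum <- function(x, p = 2, ...) {
  sum(x^p, ...)
}
plain <- power_sum(c(1, 2, 3))
no_na <- power_sum(c(1, 2, NA), na.rm = TRUE)   # na.rm is passed on to sum()

# switch with named alternatives and an unnamed default.
describe <- function(kind) {
  switch(kind, hobbit = "small", elf = "tall", "unknown")
}

# for loop with next and break.
total <- 0
for (i in 1:10) {
  if (i %% 2 == 0) next     # skip even numbers
  if (i > 7) break          # stop at the first odd number above 7
  total <- total + i
}

# repeat must be left with break.
n <- 0
repeat {
  n <- n + 1
  if (n >= 3) break
}

# Lexical scoping: offset is a free variable inside shift(),
# so it is looked up in the enclosing (here global) environment.
offset <- 10
shift <- function(k) offset + k
shifted <- shift(5)
```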
Useful Functions
Packages update.packages package.contents library/require search
Object creation c vector array matrix data.frame list environment rep seq
Lists/Vectors unlist
Hashes/Environments environment ls get exists
Vectors c vector names
Arrays (Vectors with dim) array aperm dim outer
Matrices (2D-arrays) matrix t crossprod diag cbind rbind solve det eigen svd lsfit dist nrow ncol row col scale cor var cov
Lists list attach detach
Data Frames data.frame names row.names methods as.matrix
Interactive getwd edit
Coding dir mode any all lapply substitute eval table iter length unique as.function as.numeric
Debugging/Optimizing system.time
Regexen grep sub match
Info help apropos/find example search ls/objects methods data library
I/O data source load cat write.table read.table library/require
Math sqrt prod sum cumsum/cumprod density
Visualization heatmap image plot rug boxplot pairs coplot qqplot hist dotchart persp Low-level: points lines text axis title legend General params: par
Stat sd var mean median stem hist qqnorm qqline qqplot ecdf norm (dnorm=density, pnorm=cumulative distribution, qnorm=quantile function, rnorm=simulation)
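The four functions of the norm family follow one naming scheme (d, p, q, r), which a short sketch may make easier to remember. The seed and sample size are arbitrary.

```r
density_at_zero <- dnorm(0)      # height of the standard normal density at 0
prob_below <- pnorm(1.96)        # probability of a value below 1.96
quantile_975 <- qnorm(0.975)     # value below which 97.5% of the mass lies

set.seed(42)
draws <- rnorm(1000)             # 1000 simulated standard normal values
mean(draws)
sd(draws)
```

pnorm() and qnorm() are inverses of each other, so qnorm(pnorm(z)) gives back z up to rounding.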
Libraries
Rcmd INSTALL pkgs # where pkgs is a tar.gz file or dir location
Libraries are installed under .Library in the following structure:
mylib                    lib name
|   CONTENTS
|   DESCRIPTION
|   INDEX                created by Rdindex man > INDEX
|   TITLE                deprecated, put it in Title: under DESCRIPTION
|   README               optional
|
+---chtml
|       ?
+---help
|       AnIndex
|       00Titles         R help files, may be in zip file
|       caha
|       clin2mim ... etc
|
+---html
|       00Index.html     html help files, may be in zip file
|       caha.html
|       clin2mim.html ... etc
+---latex
|       caha.tex         latex help files, may be in zip file
|       clin2mim.tex ... etc
|
+---Man
|       caha.rd          R help files in R documentation format, may be in zip file
|       clin2mim.rd ... etc
|
+---R
|       mylib            the actual library file with R code
|
\---R-ex
        fetchAvgDiff.R   code examples, may be in zip file
        firstpass.R ...
Environment
Initialisation sequence: Rprofile.site, .Rprofile, .RData, .First()
- $R_PROFILE || $R_HOME/etc/Rprofile.site is the site init file
- .Rprofile is sourced if
  - R is invoked from the same dir or
  - it's in your home dir
- .First() in any of the files is executed
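A minimal sketch of what a .Rprofile could contain. The option value and the greeting are arbitrary examples, not settings from the notes.

```r
# Hypothetical contents of a .Rprofile file.

# Session-wide settings can be placed at the top level.
options(digits = 4)

# .First() is called once the startup files have been processed.
.First <- function() {
  cat("Session started in", getwd(), "\n")
}
```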
R and Emacs
To add R to your Emacs, first install R on your machine. On Windows there is a program called Rterm, which provides a command line interface to R.
Then install the Emacs ESS package (if it was not in the default packages), byte compile it like this: (byte-compile-file "d:/Programme/emacs-21.2/lisp/progmodes/perl-mode.el") and tell Emacs to load it at startup in your .emacs file, like this: (load "d:/Programme/emacs-21.2/ess-5.1.24/lisp/ess-site" t)
Now you only have to let Emacs know where to look for the Rterm executable. This is done by adding the path to the executable to your Windows path variable; on Win2000 you can do this via Properties on the My Computer icon.
You start an R-process with M-x R.
You send a buffer region to R with C-c C-r, a function with C-c C-f and the whole buffer with C-c C-b. (Mnemonic: copy region/function/buffer.)