Bookclub: Data Mining with R



bryangoodrich
03-27-2012, 04:11 PM
Here I want to begin a book club for talking about Data Mining with R (http://www.liaad.up.pt/~ltorgo/DataMiningWithR/). I know a few of us here at TS have the book or have looked at it. The format is ongoing and participation is at will. If you don't have the book, check it out this summer and participate! The point is to build a repository of information, questions, and discussion on the contents. Questions may be theoretical, which anyone can answer, or specific to the book, which only those with a copy can help with. In any case, I hope this minimal format will produce more participation than we've seen in the past (you slackers!).

You can get the data from the website linked above, but better yet, just use their package


install.packages("DMwR")

This gives you the data sets



algae                    Training data for predicting algae blooms
test.algae (testAlgae)   Testing data for predicting algae blooms
algae.sols (algaeSols)   The solutions for the test data set for predicting algae blooms
GSPC                     A set of daily quotes for SP500
sales                    A data set with sale transaction reports
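
If you want to check that the data sets in the listing are actually available on your machine, here is a minimal sketch. The helper peek_data() is hypothetical (my own name, not part of DMwR), and I'm assuming the usual data() listing convention, where the name in parentheses is the file to load and the name before it is the object you get. It skips quietly if the package isn't installed.

```r
# Hypothetical helper (not part of DMwR): load one data() file from a package
# into a scratch environment and return the dimensions of what it defines.
peek_data <- function(file, pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    message("Package '", pkg, "' is not installed; skipping ", file)
    return(NULL)
  }
  env <- new.env()
  # data() warns (rather than errors) if the file is not found in the package
  data(list = file, package = pkg, envir = env)
  lapply(mget(ls(env), envir = env), dim)
}

# Assumed file names, read off the listing: the name in parentheses is the
# file, the name before it is the object that file defines.
for (f in c("algae", "testAlgae", "algaeSols", "GSPC", "sales")) {
  print(peek_data(f, "DMwR"))
}
```

The same helper works on any installed package, e.g. peek_data("cars", "datasets"), so you can try it before installing anything.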


The package covers the first three case studies, but not the last (microarray samples). For that one, you have to run this once:



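# biocLite() was the Bioconductor installer when this was posted;
# current Bioconductor releases use the BiocManager package instead.
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("ALL")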


Then you can access the data



library(Biobase)
library(ALL)
data(ALL)
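
For a first look at what you get, something like the following should do. It's only a sketch: it assumes Biobase and ALL installed without problems, and it just prints a message if they didn't. The variable name have_all is mine.

```r
# Guard so the sketch runs (and just reports) when the packages are missing.
have_all <- requireNamespace("Biobase", quietly = TRUE) &&
  requireNamespace("ALL", quietly = TRUE)

if (have_all) {
  library(Biobase)
  library(ALL)
  data(ALL)
  print(class(ALL))        # should report an ExpressionSet
  print(dim(exprs(ALL)))   # probes (rows) by samples (columns)
  print(table(ALL$BT))     # BT: B- or T-cell type and stage annotation
} else {
  message("Biobase and/or ALL not installed; nothing to inspect.")
}
```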

vinux
03-28-2012, 08:38 AM
I am in. I could do something productive in the Finance case study.

laurits
06-05-2012, 07:25 AM
Hi

Have any of you worked through chapter 3 - Predicting Stock Markets?

It's a really good introduction to many useful R functions for predicting and testing. However, the final section leaves you a bit lost. Has anyone worked out how to obtain the predicted signal for today's (or the most recent) data point? I would like to hear from you.

Regards,

Laurits