And now that we have a book club we could use it to hide any plot twists!
It shouldn't be hard to implement. They do it over at the Math Help Forum, and it comes in really handy, especially when you have a big block of something you don't want to clog the visual space; it lets you just toggle it hidden or visible.
And now that we have a book club we could use it to hide any plot twists!
I hope I wasn't a spoiler--I certainly did not intend to be. If I did spoil, please accept my apologies.
Dason I'm proficient (just a smidge away from what a real statistician would describe as dangerous) with [R]. I tried to figure it out for some time; to no avail. I failed and peeked Don't judge.
"If you torture the data long enough it will eventually confess."
-Ronald Harry Coase -
Haha. I wouldn't actually expect anybody to get it. Even the original version I wrote that doesn't do silly things like rename functions and use generic variable names and use convoluted ways to get the letter "R" saved isn't necessarily easy to read.
I decided to try to improve upon my search function to make it more versatile for searching through large data sets (similar to the search button in Microsoft Excel). I changed grep to agrep and added an ignore.case argument so the function is no longer case sensitive and takes approximate matches. I added a variation argument (agrep's max.distance) set at .02. Adjust this to 0 to narrow the results or higher to broaden.
Currently the function works on specific columns of a data frame. I wanted to make it work on a data frame and return all rows that contain any columns with the search term. I thought about using the apply function and then unique to eliminate duplicates. Unfortunately, I can't seem to get this to work. Any ideas?
Search Function Code
EXAMPLECode:Search<-function(term,dataframe,column.name,variation=.02){ te<-substitute(term) te<-as.character(te) cn<-substitute(column.name) cn<-as.character(cn) HUNT<-agrep(te,dataframe[,cn],ignore.case =TRUE,max.distance=variation) dataframe[c(HUNT),] }
Code:#CREATING A FAKE DATA SET SampDF<-data.frame("islands"=names(islands),mtcars[1:48,]) #EXAMPLES Search(cuba,SampDF,islands) Search(New,SampDF,islands) Search(ho,SampDF,islands)#Too much variation Search(ho,SampDF,islands,var=0) Search("Axel Hbeierg",SampDF,islands)#not enough variation Search("Axel Hbeierg",SampDF,islands,var=2) Search(19,SampDF,mpg,0)
Last edited by trinker; 07-02-2011 at 11:02 PM.
"If you torture the data long enough it will eventually confess."
-Ronald Harry Coase -
Link (06-29-2011)
Thanks a lot trinker for posting this! I had been wanting to do something like this for a while...but was too busy (read lazy) to do it. Since it's started though...here's one of my favorites. I should probably edit it though to work for words with spaces as well (it currently doesn't work if there are spaces in the word).
Code:#This function allows you to type in names without the need for " or '. Great when you have lots of text to type. words <- function(...) paste(substitute(list(...)))[-1] #CREATING A FAKE DATA SET SampDF<-data.frame("islands"=names(islands),mtcars[1:48,]) #EXAMPLES words(test, test2, test3) subset(SampDF, select=words(am, gear, carb)) SampDF$islands %in% words(Cuba, Newfoundland, Africa)
trinker (06-29-2011)
I don't think you could keep it the way you have it and modify it so that you could have spaces in your "words". But if you do have words with spaces in them you can just put quotes around that and it will work.
Code:nullfun <- function(...){} words <- function(...) paste(substitute(list(...)))[-1] # works words(hey, you, guys) # Doesn't work words(hey, you, guys, specifically link) # Doesn't work either so I don't think you could #Just modify your function to make it work. nullfun(hey, you, guys, specifically link) #But this works. So you just have to do a little more work # for words with spaces. words(hey, you, guys, "specifically link")
lol. Yeah...I guess that's one solution. That's actually what I currently do when I have words with spaces.
I think that's about the best you'll be able to do too without wrapping the entire input with quotes like so:
Note that I was lazy and if you put spaces after the commas then the spaces are in the words. You could modify it to trim the beginning and ending spaces easily though.Code:words <- function(input){ strsplit(input, ",")[[1]] } #example words("Hey,you,guys")
Search Function (Danger)
In my previous post for the search() function I named an object within the function "t". This is dangerous because [R] has already assigned "t" to something else. This is what lead to my issues with the snow library not loading. Inside of the function I think you're ok, but if you run individual lines outside of the function it's risky. I updated the code in my previous post to represent this change. my apologies.
Now any ideas on how to apply search to all columns of a data frame. My idea was to try apply and then unique but I can't get the apply to work properly.
"If you torture the data long enough it will eventually confess."
-Ronald Harry Coase -
Imagine if you would have overwrote T.
One of the meanest things you could do to somebody who leaves an R script open is put this at the top of the file and run it so they don't notice
which is partially why I try to always use TRUE or FALSE instead of T or F. Imagine trying to debug that.Code:T <- FALSE F <- TRUE
Since I want to contribute something here, I thought I would include an example I coded myself as part of my ALSM project. This comes from chapter 16 (not included on website, yet). The example considers taking all the permutations of the given (base) 9-sequence as assigned to three treatments: (1.1, 0.5, -2.1), (4.2, 3.7, 0.8), and (3.2, 2.8, 6.3). The example concluded by drawing a histogram of the Randomization (Permutation) distribution and overlaying the appropriate F-distribution that approximates it.
I originally coded this using the 'lm' summary function and appropriate indexing to grab the F-values. It took nearly a minute. I decided later to recode this algorithm by doing the calculations manually and the whole thing reduced to mere seconds. If anyone has any suggestions on better organization or more efficient coding, let me know. I like my code to be concise, especially in examples like these. I don't want to burden the reader with having to decipher a whole bunch of code to appreciate the example.Code:################################################################################ ## TABLE 16.5 (p 715) # ## Randomization Samples and Test Statistics--Quality Control Example # ## FIGURE 16.8 # ## Randomization Distribution of F* and Corresponding # ## F Distribution--Quality Control Example # ## # ## Since there is no algorithm to compute this example we had to devise one. # ## It should come as rather straight-forward. The Xi's are as in the above # ## examples. The 'y' will hold the 1,680 cases of 9-sequences consisting of # ## the response variables. The 'ti' implies the treatment group. In this case # ## t1 is the first group (3-sequence) and t12 is the composite of t1 and t2. # ## The 'remainder' function is a wrapper for grabing a subset of 'set' based # ## on those values not in 'x'. The 'seq6' is the 6-sequence remainder after t1 # ## is defined. The whole process took less than 10 seconds on a 2.4 GHz # ## processor. As for the output, the columns are arbitrarily labeled 1-9. # ## Clearly they represent the three treatment groups based on groups of three. # ## The function 'f' uses the matrix algebra discussed in Ch. 5. It is possible # ## to get away with merely fitting an 'lm' object, and then extract the # ## f-statistic in a single call. However, this requires a lot of additional # ## work for each of the 1680 rows. It took somewhere between 30-60 seconds to # ## produce the same result. # ################################################################################ remainder <- function(x, set) set[!set %in% x] f <- function(Y, X) { Y <- matrix(Y) ## Turn row-vector into column p <- ncol(X); n <- nrow(X) J <- matrix(1, n, n) ## (5.18) H <- X %*% solve(t(X) %*% X) %*% t(X) ## (5.73a) SSE <- t(Y) %*% (diag(n) - H) %*% Y ## (5.89b) SSR <- t(Y) %*% (H - (1/n)*J) %*% Y ## (5.89c) fstar <- (SSR / (p - 1)) / (SSE / (n - p)) ## (6.39b) } base <- c(1.1, 0.5, -2.1, 4.2, 3.7, 0.8, 3.2, 2.8, 6.3) t2 <- t12 <- t123 <- list() y <- NULL X <- cbind( X1 = c(1, 1, 1, 0, 0, 0, 0, 0, 0), X2 = c(0, 0, 0, 1, 1, 1, 0, 0, 0), X3 = c(0, 0, 0, 0, 0, 0, 1, 1, 1) ); t1 <- t(combn(base, 3)) seq6 <- t(combn(base, 3, remainder, set = base)) for (i in 1:84) t2[[i]] <- t(combn(seq6[i, ], 3)) for (i in 1:84) t12[[i]] <- cbind(t1[i, 1], t1[i, 2], t1[i, 3], t2[[i]]) for (i in 1:84) t123[[i]] <- cbind(t12[[i]], t(apply(t12[[i]], 1, remainder, set = base))) for (i in 1:84) y <- rbind(y, t123[[i]]) fstar <- apply(y, 1, function(Y) f(Y, X)) cbind(y, data.frame(f = fstar)) hist(fstar, freq = FALSE, ylim = c(0, 1), col = "gray90", main = "") curve(df(x, 2, 6), add = TRUE, lwd = 2) rm(base, fstar, i, remainder, seq6, t1, t2, t12, t123, f, X, y)
And if anyone is wondering how we figure out there are 1680 possible permutations, consider the fact that we first have to choose the first treatment (3-sequence). There are 84 (= 9 choose 3) ways to do this. For each of those 84, there remains a 6-sequence for which we need to choose another 3-sequence for the second treatment. Upon that selection the last (remainder) treatment (3-sequence) is automatically determined. Thus, the number of permutations is
Last edited by bryangoodrich; 07-01-2011 at 04:04 AM.
Random card, die and coin outcome generator
I feel it is my duty to revive this thread every so often (mostly for selfish reasons so I can see everyone else's good ideas)
Game Probabilities Function
I tutor middle school students and am often in the need of a card, die, or coin generator (the web is rank with these but I didn't always get internet so [R] to the rescue). So…
Description
This sfunction is outcome generators for: die, coin & cards. It's very basic but I've used it many times tutoring (one of the first functions I wrote).
Function note: the game argument does not need to be in quotesFunction CodeCode:game.prob(game=c(coin,die,cards),n,replace=T)
ExamplesCode:game.prob<-function(game,n=10,replace=T){ g<-substitute(game) g<-as.character(g) switch(g, coin=paste(sample(c("heads","tails"), n, replace),sep=" "), die=paste(sample(c("one","two","three","four","five","six"), n, replace),sep=" "), cards=paste(sample(c("Ace of Hearts","Ace of Diamonds","Ace of Spades","Ace of Clubs", "2 of Hearts","2 of Diamonds","2 of Spades","2 of Clubs", "3 of Hearts","3 of Diamonds","3 of Spades","3 of Clubs","4 of Hearts","4 of Diamonds","4 of Spades","4 of Clubs", "5 of Hearts","5 of Diamonds","5 of Spades","5 of Clubs","6 of Hearts","6 of Diamonds","6 of Spades","6 of Clubs", "7 of Hearts","7 of Diamonds","7 of Spades","7 of Clubs","8 of Hearts","8 of Diamonds","8 of Spades","8 of Clubs", "9 of Hearts","9 of Diamonds","9 of Spades","9 of Clubs","10 of Hearts","10 of Diamonds","10 of Spades","10 of Clubs", "Jack of Hearts","Jack of Diamonds","Jack of Spades","Jack of Clubs","Queen of Hearts","Queen of Diamonds","Queen of Spades","Queen of Clubs", "King of Hearts","King of Diamonds","King of Spades","King of Clubs"), n, replace),sep=" ")) }
Code:game.prob(die,6) game.prob(die,40) game.prob(cards,40) game.prob(cards,40,replace=F) game.prob(coin,100) paste(game.prob(die,40), game.prob(die,40),sep=" & ") # 2 die paste(game.prob(coin,10), game.prob(die,10),sep=" & ") # a coin and a die #TABLING THE OCCURANCES table(game.prob(die,4000)) table(game.prob(coin,100)) table(game.prob(cards,40)) table(paste(game.prob(coin,1000), game.prob(die,1000),sep=" & "))# a coin and a die
Last edited by trinker; 07-15-2011 at 08:42 PM. Reason: Code Simplification
Tweet |