Today I Learned: ____

Dason

Ambassador to the humans
It's really nice when writing reports. I don't use it much when writing code though. My code files tend to be fairly short, so I can easily scroll through the entire thing and keep it all in mind.
 

TheEcologist

Global Moderator
TIL RStudio not only lets you skip around tabs from a drop down list (the double-arrows on the tabs when you've more loaded than can show). It also has within a document a drop down list to move to a function! It's next to the line:column number display. I was sitting here staring at my package script wondering "wtf is that?" clicked on it and it dawned on me "whoa, here's all the functions in this file!" More than that, however, I have a few "# ====== Section Title ======" and it has those in the list, too! Considering I'm throwing a bunch of stuff into these files sometimes, this is mightily useful for quick navigation. Maybe. We'll see. At least I know it is there now. It may have always been there. My ignorance has been obliterated, though!
Sounds like it will someday become a bloated version of emacs ;)
 

TheEcologist

Global Moderator
"bloated version of emacs"?

We call the bloated version of emacs: "emacs"
What age are you from? That used to be a valid argument back in the day when 512k of memory was considered large.
These days, emacs is tiny.

Edit: I just checked, with all my extensions, emacs is < 5 MB.

Long ago, in an age forgotten, emacs stood for “Eight Megabytes And Constantly Swapping”. Right now, your web browser needs about as much RAM per tab as Emacs does for 100 open files. The Emacs bloat, once true, is now just a myth.

The way it works with Emacs is that if you don't use it, you don't have to know it's there. How do all the extra features work with RStudio? Does it get heavier on RAM and CPU?
 
bryangoodrich

Probably A Mammal
TIL how to take my palette of hex color numbers and easily slap some transparency on there. Needed it in like 5 minutes and I sure as hell wasn't going to figure out how to handle all of that. Luckily, somebody already did!

http://stackoverflow.com/questions/...t-points-in-scatterplot-more-transparent-in-r

Code:
addTrans <- function(color,trans)
{
  # This function adds transparency to a color.
  # Define transparency with an integer between 0 and 255,
  # 0 being fully transparent and 255 being fully visible.
  # Works with color and trans either as vectors of equal length,
  # or with one of the two of length 1.

  if (length(color)!=length(trans)&!any(c(length(color),length(trans))==1)) stop("Vector lengths not correct")
  if (length(color)==1 & length(trans)>1) color <- rep(color,length(trans))
  if (length(trans)==1 & length(color)>1) trans <- rep(trans,length(color))

  num2hex <- function(x)
  {
    hex <- unlist(strsplit("0123456789ABCDEF",split=""))
    return(paste(hex[(x-x%%16)/16+1],hex[x%%16+1],sep=""))
  }
  rgb <- rbind(col2rgb(color),trans)
  res <- paste("#",apply(apply(rgb,2,num2hex),2,paste,collapse=""),sep="")
  return(res)
}
Now I just took my (base) plotting function and slapped on "col = addTrans(pal, trans)" and made a parameter for trans. Easily beautiful solution worked like a charm. Is there no base functionality that already does this??
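On the base question: a minimal sketch, assuming the same 0 to 255 alpha scale as addTrans above. The name add_alpha is hypothetical; col2rgb() and rgb() come from grDevices, which ships with R. adjustcolor() from the same package is an alternative that takes opacity as a 0 to 1 fraction through alpha.f.

```r
# add_alpha is a hypothetical helper name, not a base function.
# col: vector of colors (names or hex); alpha: 0 (transparent) to 255 (opaque)
add_alpha <- function(col, alpha) {
  m <- col2rgb(col)  # 3 x n integer matrix with rows red, green, blue
  unname(rgb(m[1, ], m[2, ], m[3, ], alpha = alpha, maxColorValue = 255))
}

pal <- c("#FF0000", "#0000FF")
add_alpha(pal, 128)

# Alternative: opacity given as a fraction between 0 and 1
adjustcolor(pal, alpha.f = 0.5)
```

Either result can be passed straight to col = in a base plotting call.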
 

trinker

ggplot2orBust
@BG I think the alpha function from the scales package does this as well.

Code:
library(scales)
cols <- c("pink", "red", "yellow", "blue", "green", "purple")
alpha(cols, 1)
alpha(cols, 0.5)
 

trinker

ggplot2orBust
TIRL

I often have a list and want to assign individual elements to an environment or the global environment. I keep making my own function to do this and keep forgetting about the list2env function.

Here's what I'm talking about:

Code:
dat <- list(A=1, B=2, C=3, D=4)

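# Manual approach: assign each element under its own name
invisible(lapply(seq_along(dat), function(i) {
	assign(names(dat)[i], dat[[i]], envir = .GlobalEnv)
}))

A
rm(A)
exists("A")  # check that A is gone before recreating it

# Built-in equivalent
list2env(dat, .GlobalEnv)
A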
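The same call also covers the "some other environment" case. A minimal sketch, where e is just an illustrative name:

```r
dat <- list(A = 1, B = 2, C = 3, D = 4)

# Unpack into a fresh environment so the global workspace stays untouched
e <- list2env(dat, envir = new.env())

sort(ls(e))           # names of the objects created in e
get("A", envir = e)   # look one of them up by name
e$B                   # or use $ on the environment
```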
 

Lazar

Phineas Packard
TIL how to link R and Fortran (not as easy as f2py in Python, and with limited online documentation). Fortran code and compiling instructions:
Code:
! ./fcn1.f90

! A test function for trying out .Fortran in R:
!   $gfortran -shared -o fcn1.dll fcn1.f90  
!   $ R
!   Should give approx 2.7182818284590451
! Not the most exciting example but gives an idea


subroutine fcn(x,f1)
    double precision x
    double precision f1
    f1 = exp(x)
end
then in R:
Code:
dyn.load("./fcn1.dll")
is.loaded("fcn")
.Fortran("fcn", x=as.double(1.0), f=as.double(1.0))
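A rough sketch of wrapping the call so the file extension is not hard-coded. It assumes fcn1.f90 has already been compiled into a shared library in the working directory (R CMD SHLIB fcn1.f90 is one way to do that). The names lib_path and call_fcn are hypothetical helpers, not part of R.

```r
# Sketch only: assumes the shared library was built beforehand.
# lib_path and call_fcn are hypothetical helper names.
lib_path <- function(name) paste0(name, .Platform$dynlib.ext)  # platform extension

call_fcn <- function(x, lib = lib_path("fcn1")) {
  if (!file.exists(lib)) stop("shared library not found: ", lib)
  lib <- normalizePath(lib)   # full path, so no library search path is involved
  dyn.load(lib)
  on.exit(dyn.unload(lib))    # release the library when the function exits
  # .Fortran returns a list of the arguments after the call; f1 holds the result
  .Fortran("fcn", x = as.double(x), f1 = double(1))$f1
}

lib_path("fcn1")
# call_fcn(1)   # run only after compiling
```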
 

TheEcologist

Global Moderator
This is great. I have never called Fortran from R, but it is exactly how I call C. Good to know.

It's great for simple speed-ups in your program (Rcpp for the win if you do anything more complex).

Note that this is even easier if you are on Linux.

;)
 

bryangoodrich

Probably A Mammal
TIL that Vectorize is awesome! Functional programming FTW!

I have a list of column vectors (data frames) from a query I ran on my database and I wanted to lapply a random sample of them to see some of the variety of these data that met a condition I'm looking for. So of course this errors

Code:
lapply(res, sample, 5)
So instead, I did

Code:
lapply(res, Vectorize(sample), 5)
All is right in the world :D

I'm sure I'll find other uses now that I can perceive this pattern in my code.
 

Dason

Ambassador to the humans
Just so we're clear - that won't actually sample 5 rows from each dataframe. What that does is samples five items from each column in each dataframe. So in the first row in the result you'll get values that could have come from different rows in your dataframe.

Code:
> mydf <- data.frame(letters = letters, LETTERS = LETTERS)
> res <- list(mydf, mydf)
> lapply(res, Vectorize(sample), 5)
[[1]]
     letters LETTERS
[1,] "e"     "I"    
[2,] "g"     "C"    
[3,] "y"     "R"    
[4,] "s"     "S"    
[5,] "t"     "M"    

[[2]]
     letters LETTERS
[1,] "h"     "O"    
[2,] "e"     "T"    
[3,] "o"     "Y"    
[4,] "x"     "Q"    
[5,] "f"     "U"
Don't get me wrong - Vectorize can be **** useful but if your goal was to sample rows then this isn't the way to do it.
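If whole rows are what you want, one way is to sample row indices and subset. A minimal sketch using the same toy data; the seed is arbitrary and only there for reproducibility:

```r
mydf <- data.frame(letters = letters, LETTERS = LETTERS)
res <- list(mydf, mydf)

# Draw 5 row indices per data frame, then subset whole rows,
# so the values in each sampled row stay together
set.seed(1)  # arbitrary seed
rows <- lapply(res, function(df) df[sample(nrow(df), 5), , drop = FALSE])
rows[[1]]
```

The drop = FALSE keeps the result a data frame even when there is only one column.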
 

bryangoodrich

Probably A Mammal
I have a single column in each data frame, so it's equivalent to what I'm after. Of course, if I were returning a multi-field result set, I'd have to do something different. I just never used Vectorize before, and I liked how it fit into the workflow nicely.
 

Dason

Ambassador to the humans
Vectorize isn't the fastest way to do things in most cases - but it can be one of the most convenient ways to just make things work sometimes. In this case writing an anonymous function in your call to lapply like function(x){sample(x[,1], 5)} will be much faster than using Vectorize.
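Spelled out, that looks something like the sketch below. The res here is made-up stand-in data (a list of single-column data frames), not the actual query result:

```r
# Stand-in data: a list of single-column data frames
res <- list(data.frame(v = 1:20), data.frame(v = 101:120))

# Pull the column out as a plain vector, then sample 5 values from it
picks <- lapply(res, function(x) sample(x[, 1], 5))
picks
```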
 

bryangoodrich

Probably A Mammal
Yeah, I wouldn't expect Vectorize to be efficient, but this is where R's lack of good lambda expressions for succinct anonymous functions is troublesome. Much easier in this case to literally wrap sample than make an anonymous wrapper for it! Of course, all of this begs the question why I'm not just using Python to interface my database, though :p
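For what it's worth, R 4.1.0 and later accept a backslash shorthand for function, which shortens the anonymous wrapper a little. A sketch with made-up stand-in data; it will not parse on older R versions:

```r
# Requires R >= 4.1.0 for the \(x) shorthand
res <- list(data.frame(v = 1:20), data.frame(v = 101:120))

# \(x) is shorthand for function(x)
short <- lapply(res, \(x) sample(x[[1]], 5))
lengths(short)
```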
 

Jake

Cookie Scientist
Efficiency issues aside, I don't think the use of Vectorize() is really appropriate here in the first place. The first big clue that something is not quite right is that you create a vectorized version of sample(), but then you don't actually pass in a vectorized argument! So right off the bat the use of Vectorize() seems a bit confusing and apparently pointless here.

Upon closer scrutiny we can see that although Vectorize() happens to produce the desired result for your task, this is entirely a lucky accident arising from the way that the vectorized function handles the data types of its arguments. Specifically, it internally calls mapply(), which makes sure to call sample() on the vectors comprising the data frames rather than on the data frames themselves. Thus, your original command is essentially equivalent to this:
Code:
lapply(res, function(x) mapply(sample, x, 5))
I think a far clearer solution is one that straightforwardly addresses the root of the problem, namely that sample() takes vectors and not data frames. Thus I would go for something like this:
Code:
lapply(lapply(res,unlist), sample, 5)
IMO this is far more transparent and principled than the Vectorize() solution.
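One way to check the claimed equivalence is to run both forms from the same seed and compare. A sketch using Dason's toy data from earlier in the thread; the seed value is arbitrary:

```r
mydf <- data.frame(letters = letters, LETTERS = LETTERS)
res <- list(mydf, mydf)

set.seed(42)  # arbitrary seed so both calls see the same random stream
a <- lapply(res, Vectorize(sample), 5)

set.seed(42)
b <- lapply(res, function(x) mapply(sample, x, 5))

identical(a, b)  # compare the two results
dim(a[[1]])      # one column of draws per data frame column
```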
 

Dason

Ambassador to the humans
Efficiency issues aside, I don't think the use of Vectorize() is really appropriate here in the first place. The first big clue that something is not quite right is that you create a vectorized version of sample(), but then you don't actually pass in a vectorized argument!
A list is a vector. A data.frame is a list. He is sending in a one-element vector in that regard.
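A quick sketch of those facts at the console, using a throwaway one-column data frame:

```r
df <- data.frame(a = 1:3)

is.vector(list(1, "a"))  # a bare list counts as a vector
is.list(df)              # a data frame is a list of columns
length(df)               # so its length is the number of columns
is.vector(df)            # is.vector() also looks at attributes such as class
```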
 

Jake

Cookie Scientist
I am aware of that technicality but didn't want to split hairs. It's not a vectorized argument as the term is commonly/colloquially understood.