R - "\\" and "$" signs

#1
[SOLVED] R - "\\" and "$" signs

I was looking into some packages on GitHub and trying to make sense of functions in these packages.

I came across this function to read csv-files (from ProjectTemplate package). I was however wondering what the "\\" and "$" signs stand for.

I suppose "\\" is something like "any characters" or has to do with the "." preceding zip. I also don't understand why the "$"-sign is used at the end of zip. If you just want to check if ".zip" is part of a filename why not just use "\\.zip"?

To conclude (although this isn't strictly speaking an R related question), what is the best way to search for the meaning of symbols (or e.g. logical operators) in google or what is a good R-resource to look up such things?

Code:
csv.reader <- function(data.file, filename, variable.name)
{
  if (grepl('\\.zip$', filename))
  {
    tmp.dir <- tempdir()
    tmp.path <- file.path(tmp.dir, data.file)
    file.copy(filename, tmp.path)
    unzip(filename, exdir = tmp.dir)
    filename <- file.path(tmp.dir, sub('\\.zip$', '', data.file))
  }
  
  assign(variable.name,
         read.csv(filename,
                  header = TRUE,
                  sep = ','),
         envir = .GlobalEnv)
}

Thanks!
 

bryangoodrich

Probably A Mammal
#2
The "\\" is used in the regular expressions for grep and sub. Read the help files on them. The "\" is an escape character saying "within this string, the following character should be taken as is." Thus, if you want to actually look for "\" you need to escape it. However, when doing regular expressions in R, you have to "double escape" which is why you see "\\". In R regular expressions like these, if you wanted to actually include "\" you would have to use "\\\". Again, see the help file.
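To see the two layers separately, it can help to look at what the string actually contains once R has parsed it. A minimal sketch (the variable name `pattern` is only illustrative):

```r
# The pattern exactly as it is typed in the function above
pattern <- "\\.zip$"

# writeLines() shows the characters the regex engine receives
writeLines(pattern)   # prints \.zip$

# Six characters: \ . z i p $  (the doubled backslash is stored as one)
n_chars <- nchar(pattern)
first_char <- substr(pattern, 1, 1)
```

So the doubled backslash exists only in the source code; the string itself holds a single one.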

The "$" has a number of meanings, such as identifying a list element.

Code:
x <- list()
x$foo <- rnorm(10)
x$bar <- runif(10)
x
However, in the function above it is being used in a regular expression, again. See the help files or Google about this. The "$" means "look at the end of the string." Thus, the two instances are looking for files that end in ".zip" and the "\\" escapes the "." which has a special meaning in regex.
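As a rough illustration of what the sub() call in that function does, here is a small sketch. The file names are made up for the example and are not from ProjectTemplate:

```r
# Hypothetical file name, chosen only for illustration
data.file <- "survey.csv.zip"

# Anchored pattern: removes ".zip" only when it ends the string
stripped <- sub("\\.zip$", "", data.file)

# A name that merely contains ".zip" is left untouched by the anchored pattern
untouched <- sub("\\.zip$", "", "my.zipped.notes.txt")

# Without the "$", the first ".zip" anywhere in the string is removed
mangled <- sub("\\.zip", "", "my.zipped.notes.txt")

print(c(stripped, untouched, mangled))
```

The last call shows why the "$" matters: without it, a file name that only contains ".zip" somewhere in the middle would be altered as well.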
 

Dason

Ambassador to the humans
#3
To answer your question about the R code: grepl is a form of grep that returns a logical value (TRUE or FALSE) for each element instead of the positions of the matching elements. If you want to understand this better you should probably try to understand regular expressions. The period is a special symbol in regular expressions. When left by itself it just means "any character". That's not what the programmer wanted - they wanted the literal period character. To do that we need to escape the period so that the regular expression knows we want a period and not the special "any character" symbol it thinks it should mean. In most languages we only need to use one backslash to escape the period ("\.") but in an R string literal the backslash itself has to be escaped, so we need two ("\\."). The reason they use the dollar sign ($) at the end of the regular expression is to denote that they only want to match the expression if it occurs at the end of the string of interest. For instance they want to match "My_Wallaby_Videos.zip" or "CATSLOL.zip" but not "My.zipper.txt" (notice that .zip occurs in that last one but not at the very end).
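The file names mentioned above can be checked directly. This is just a quick sketch; "gzip" is an extra made-up name added to show what the unescaped period does:

```r
files <- c("My_Wallaby_Videos.zip", "CATSLOL.zip", "My.zipper.txt")

# Escaped period plus "$": only names that end in ".zip" match
anchored <- grepl("\\.zip$", files)     # TRUE TRUE FALSE

# Without the "$", ".zip" anywhere in the name is enough
unanchored <- grepl("\\.zip", files)    # TRUE TRUE TRUE

# Unescaped period means "any character", so the "g" in "gzip" satisfies it
unescaped <- grepl(".zip$", "gzip")     # TRUE
escaped   <- grepl("\\.zip$", "gzip")   # FALSE

print(anchored)
print(unanchored)
```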


To answer the question about searching for symbols: I've heard decent things about http://www.symbolhound.com/. It's a search site that doesn't strip symbols away.

To answer your question about searching for R stuff: R Seek is probably the way to go there!

Edit: **** ninjas...
 

Dason

Ambassador to the humans
#4
Quote (bryangoodrich):
In R regular expressions like these, if you wanted to actually include "\" you would have to use "\\\". Again, see the help file.
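Not quite. The double escape is there because the pattern passes through two layers. R's string parser handles backslash escapes first, so typing "\\" in code gives a string that contains a single "\", and that single backslash is what the regular expression engine sees. If we don't double it, R treats the backslash plus the next character as a string escape (like "\n" or "\"") and it'll mess up before the regex engine ever gets involved. So "\\\" doesn't work, and "\\" on its own isn't a valid pattern either. To match a literal backslash you can use a character class or fixed = TRUE: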

Code:
> x <- c("\\", "C:\\Program Files", "CATSLOL.zip")
> grep("[\\]", x)
[1] 1 2
> grep("\\\", x)
+ Oh noes! At the moment it still thinks I'm continuing the string
+ Because I started the string then had \\ which gives a slash and
+ then I followed that with \" which is an escaped quote.
+ So it's going to think this whole thing is the regular expression.
+ Oh my.
+ ", x)
integer(0)
>
> grep("\\", x)
Error in grep("\\", x) : 
  invalid regular expression '\', reason 'Trailing backslash'
>
> grep("\\", x, fixed = TRUE)
[1] 1 2
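For completeness, a literal backslash can also be matched by escaping it at both layers, which means four backslashes in the source code. A short sketch reusing the same x as above:

```r
x <- c("\\", "C:\\Program Files", "CATSLOL.zip")

# Four backslashes in source -> two in the string -> one literal "\" for the regex
four_slashes <- grep("\\\\", x)

# The alternatives from the session above, for comparison
char_class <- grep("[\\]", x)
fixed_match <- grep("\\", x, fixed = TRUE)

# The four-backslash pattern is only two characters long once parsed
pattern_length <- nchar("\\\\")

print(four_slashes)
```

All three calls pick out the first two elements, which are the ones containing a backslash.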