Split string

Dason

Ambassador to the humans
#2
Can you explain why you want to split it that way? What is the logic? And can you give several examples?

Will it always be 4 characters, 2 characters, 2 characters, 4 characters?
 
#3
Basically I have a load of citation keys in the format AnonEtAl2013 (therefore no spaces). I need to search for the bibtex references on Google Scholar, and searching for AnonEtAl2013 doesn't tend to work.

It won't always be 4,2,2,4 characters, but if I know how to split AnonEtAl2013 then I could figure out how to do the rest.
 

Dason

Ambassador to the humans
#4
You haven't actually described the pattern though. What is it that you want to split on? Capital letters and/or numbers? I can't read your mind - neither can the computer. The first step is to break the problem down into small pieces. In this case that means identifying the actual pattern you want to split on. If you can adequately describe that then we're a lot closer to achieving your goal.

This is partially why I wanted several examples. A single example doesn't do anything in establishing a pattern.
 
#5
In every instance the pattern is NameEtAlYear, like this:

Code:
AnonEtAl2010

AnotherEtAl2011

OnemoreEtAl2012

LastoneEtAl2013
Each of these should be split into:

Code:
Anon Et Al 2010

Another Et Al 2011

Onemore Et Al 2012

Lastone Et Al 2013
 

Dason

Ambassador to the humans
#7
Code:
x <- c("AnonEtAl2010", "AnotherEtAl2011", "OnemoreEtAl2012", "LastoneEtAl2013")

gsub("([A-Z]|[[:digit:]]+)", " \\1", x)
That will add a space at the beginning but you can probably deal with that. What this does is add a space before every capital letter and every group of numbers.
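
One way to deal with the leading space, as a minimal sketch rather than the only option: trim it afterwards with base R's trimws(). The variable names spaced and trimmed are just for illustration and are not from the posts above.

```r
x <- c("AnonEtAl2010", "AnotherEtAl2011", "OnemoreEtAl2012", "LastoneEtAl2013")

# Add a space before every capital letter and every run of digits.
# Each string starts with a capital, so each result starts with a space.
spaced <- gsub("([A-Z]|[[:digit:]]+)", " \\1", x)

# Strip the leading (and any trailing) whitespace.
trimmed <- trimws(spaced)
print(trimmed)
```

sub("^ ", "", spaced) would do the same job here if you only want to drop a single leading space.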
 
#8
There are two other possible patterns:

(1) NameYear, e.g. Anon2013

(2) NameNameYear, e.g. SomeoneAnother2013

I thought that if I knew how to split AnonEtAl2013, I could apply similar code to these other two patterns.
 

Dason

Ambassador to the humans
#9
We can avoid adding the space at the beginning if we use Perl-compatible regexes.

Code:
x <- c("AnonEtAl2010", "AnotherEtAl2011", "OnemoreEtAl2012", "LastoneEtAl2013")
gsub("(?!^)([A-Z]|[[:digit:]]+)", " \\1", x, perl = TRUE)

# which gives
> gsub("(?!^)([A-Z]|[[:digit:]]+)", " \\1", x, perl = TRUE)
[1] "Anon Et Al 2010"    "Another Et Al 2011" "Onemore Et Al 2012"
[4] "Lastone Et Al 2013"
Edit: Adding your new stuff in...

Code:
> x <- c("AnonEtAl2010", "AnotherEtAl2011", "OnemoreEtAl2012", "LastoneEtAl2013", "Anon2013", "SomeoneAnother2013")
> 
> gsub("(?!^)([A-Z]|[[:digit:]]+)", " \\1", x, perl=T)
[1] "Anon Et Al 2010"      "Another Et Al 2011"   "Onemore Et Al 2012"  
[4] "Lastone Et Al 2013"   "Anon 2013"            "Someone Another 2013"
 

Dason

Ambassador to the humans
#10
Here is a slightly modified version of the code with a very small amount of explanation.
Code:
x <- c("AnonEtAl2010", "AnotherEtAl2011", "OnemoreEtAl2012", "LastoneEtAl2013", "Anon2013", "SomeoneAnother2013")

# (?!^)                       : Don't match the beginning of the string (this requires using perl=TRUE)
# ([[:upper:]]|[[:digit:]]+)  : Match either an upper case letter or a group of numbers and create a 'group' out of it
# " \\1"                      : replace the match with a space followed by the 'group' (ie the match)
gsub("(?!^)([[:upper:]]|[[:digit:]]+)", " \\1", x, perl = TRUE)
For more info, read ?regex and/or the Wikipedia page on regular expressions.
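
If it helps, the same call can be wrapped in a small function so it is easy to reuse across all your keys. This is only a sketch: split_citation_key is a made-up name, and the "McDonald" key is an invented example to show a limitation of the pattern, not one of the keys from the thread.

```r
# Hypothetical helper (name is illustrative): wraps the gsub() call above.
split_citation_key <- function(keys) {
  gsub("(?!^)([[:upper:]]|[[:digit:]]+)", " \\1", keys, perl = TRUE)
}

keys <- c("AnonEtAl2010", "Anon2013", "SomeoneAnother2013")
spaced <- split_citation_key(keys)
print(spaced)

# If you need the separate pieces rather than one spaced string,
# split the result on the literal space.
pieces <- strsplit(spaced, " ", fixed = TRUE)
print(pieces[[1]])

# Caveat: every capital letter starts a new word, so a name with an
# internal capital gets split as well.
print(split_citation_key("McDonaldEtAl2013"))
```

So the approach works as long as capitals only appear at the start of each name or word; keys with internal capitals or acronyms would need a more specific pattern.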