LoganB
01-16-2010, 06:52 PM
I have a large data set (~40k rows) with many spelling variants of taxa names that actually refer to the same species (e.g. Ephemerella invaria, E. invaria, Ephem invaria, or Ephemerellainvaria). I need to resolve these spelling differences, and I know there must be a better way than going through and fixing them by hand. Any ideas?
lumhearts
01-17-2010, 10:11 AM
I'm not sure of the R procedure, but in other software you would use find-and-replace functions. For example, a function would search for taxa names that begin with E and change that value to Ephemerella. You would probably run frequency tables, then your function, and repeat until all is good. Maybe someone else here knows how this is done in R. Otherwise, you could import your data into Excel and use Find and Replace (Ctrl+H).
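To make the tabulate, replace, repeat loop concrete, here is a rough sketch in Python rather than R. The sample names and the `fixes` lookup are made up for illustration. Note that it maps each full variant to the accepted name instead of matching on the first letter, since a rule like "begins with E" would also catch any other genus starting with E.

```python
from collections import Counter

# Hypothetical sample of the taxa column; real data would be read from the file.
names = [
    "Ephemerella invaria",
    "E. invaria",
    "Ephem invaria",
    "Ephemerellainvaria",
    "Ephemerella invaria",
]

# Step 1: frequency table, so every spelling variant is visible.
freq = Counter(names)

# Step 2: hand-built lookup from each variant to the accepted name.
fixes = {
    "E. invaria": "Ephemerella invaria",
    "Ephem invaria": "Ephemerella invaria",
    "Ephemerellainvaria": "Ephemerella invaria",
}

# Step 3: replace; names missing from the lookup are left unchanged.
cleaned = [fixes.get(n, n) for n in names]

# Step 4: re-run the table and repeat until only accepted names remain.
freq_after = Counter(cleaned)
print(freq_after)
```

With ~40k rows the frequency table is what keeps this manageable: the lookup only needs one entry per distinct misspelling, not one per row.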
bugman
01-17-2010, 03:44 PM
The best idea is to tidy it up before importing, using find and replace as lumhearts said. It should only take a couple of minutes. Also make sure that the case is consistent, that there are no spaces before or after each entry, and that there are no double spaces between the genus and species names.
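If you would rather script the whitespace and case checks than do them by hand, something like the following might work. It is a Python sketch, not R, and `tidy_name` is a hypothetical helper. It assumes the convention of a capitalised genus and a lowercase species, which may not match how your data is meant to look.

```python
def tidy_name(raw):
    """Trim, collapse repeated spaces, and standardise case of one taxon name."""
    # split() with no arguments drops leading/trailing whitespace
    # and treats any run of spaces as a single separator.
    parts = raw.split()
    if not parts:
        return ""
    # Assumed convention: genus capitalised, everything after it lowercase.
    genus = parts[0].capitalize()
    rest = [p.lower() for p in parts[1:]]
    return " ".join([genus] + rest)


# Hypothetical messy entries
messy = ["  ephemerella   INVARIA ", "Ephemerella invaria", "E.  Invaria"]
tidied = [tidy_name(m) for m in messy]
print(tidied)
```

This only fixes spacing and case. Abbreviations and run-together names such as Ephemerellainvaria still need a find and replace afterwards.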
Phil