cor() command in R for non-numeric data



DCohen
05-24-2008, 07:40 PM
Is there a way to create a correlation matrix in R from data that includes non-numeric columns? I'm after something similar to the way the regression command "lm" gives each distinct value of a character column its own coefficient.

For the time being, when I build a correlation matrix I take all the possible values of such a column and create one new column per value, filled with 0s and 1s according to the original value. For instance, if one column is "Department" and the options are HR, Engineering, or Finance, I end up with 3 new columns labeled HR, Engineering, and Finance, each holding a 0 or a 1 to show which department the row actually belongs to. This is ridiculously time-consuming and I'd like a quicker way to do it. Thanks.

TheEcologist
05-25-2008, 04:41 AM
See whether this discussion on the R-help list helps (the thread includes a list of replies):

http://tolstoy.newcastle.edu.au/R/e2/help/06/10/2642.html

Otherwise you can search the R-Help archives:

http://finzi.psych.upenn.edu/search.html

There is plenty on the subject:

http://finzi.psych.upenn.edu/cgi-bin/namazu.cgi?query=categorical+correlation&max=100&result=normal&sort=score&idxname=functions&idxname=Rhelp02a&idxname=Rhelp01&idxname=R-devel
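
As for the manual 0/1 coding described in the question, base R can probably do that step for you. Here is a minimal sketch, assuming the non-numeric columns are stored as factors; the data frame `staff` and its columns are made up for illustration and are not from the original post:

```r
# Hypothetical example data; names and values are invented for illustration.
staff <- data.frame(
  Department = factor(c("HR", "Engineering", "Finance",
                        "HR", "Engineering", "Finance")),
  Salary     = c(50, 80, 70, 55, 85, 65),
  Years      = c(2, 5, 7, 3, 6, 4)
)

# model.matrix() expands factors into 0/1 indicator columns.
# "- 1" drops the intercept, so the (single) factor here keeps
# one indicator column per level instead of losing a baseline level.
dummies <- model.matrix(~ . - 1, data = staff)

# Correlation matrix over the indicator and numeric columns together.
cor_mat <- cor(dummies)
print(round(cor_mat, 2))
```

Note that with several factors in the data frame, only the first one keeps all of its levels under this formula; the others lose a baseline level. Also keep in mind that the indicator columns of one factor are negatively correlated with each other by construction, so those entries of the matrix say little on their own.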

Lastly, I would consider whether another method, say a GLM, wouldn't be more informative for your purpose.
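
To illustrate what I mean, here is a rough sketch of the model-based route, again with invented data (the `staff` data frame, the choice of `Salary` as response, and the Gaussian family are all assumptions for the example). As with "lm", the factor is expanded into coefficients automatically, so no manual 0/1 columns are needed:

```r
# Hypothetical example data; names and values are invented for illustration.
staff <- data.frame(
  Department = factor(c("HR", "Engineering", "Finance",
                        "HR", "Engineering", "Finance")),
  Salary     = c(50, 80, 70, 55, 85, 65),
  Years      = c(2, 5, 7, 3, 6, 4)
)

# glm() handles the factor itself: each non-baseline level of Department
# gets its own coefficient, measured relative to the first level.
fit <- glm(Salary ~ Department + Years, data = staff, family = gaussian())

# Inspect the estimated coefficients.
print(coef(fit))
```

Which family and response make sense depends entirely on your data, so treat this only as a starting point.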

cheers,