# How to create a Pearson Correlation and others with this data?

#### Joker

##### New Member
Hello everyone..
I am struggling to create a Pearson correlation with this data below:

My goal here is very simple: identify correlations between each module and the severity number. (It's related to defects.)
I tried a Pearson correlation, but it did not work out well. Can someone give me a hint on how I can turn this into an outstanding analysis? It's for a work meeting...
Thank you very much!

#### Omerikooo

##### New Member
Hello Joker,

Pearson correlation measures the linear relationship between two continuous variables. For example, you could look for a correlation between the age and height of a group of people.
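To make the age and height example concrete, here is a minimal sketch using NumPy. The numbers are made up purely for illustration and are not from the data in this thread.

```python
import numpy as np

# Made-up ages (years) and heights (cm), for illustration only.
age = np.array([6, 8, 10, 12, 14, 16])
height = np.array([115, 128, 138, 149, 163, 170])

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal
# entry is the Pearson correlation between the two variables.
r = np.corrcoef(age, height)[0, 1]
print(f"Pearson r = {r:.3f}")
```

Both inputs here are numeric, which is why Pearson's r makes sense for them.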

In your case you have a module variable (which I understand is a nominal variable) and severity, which is an ordinal variable. To understand the relationship between module and severity, you can use a chi-square test (the better option) or a t-test (acceptable if you compare two modules at a time and treat severity as a number).

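The chi-square test simply tells you whether the distribution of severity levels differs between modules, that is, whether the two variables are associated.

A t-test would be a better option if your severity variable were recorded as a continuous measurement rather than as categories.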
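As a rough sketch of the chi-square approach, assuming Python with pandas and SciPy (the thread does not say which software is in use): the column names `Module` and `Severity` and all values below are hypothetical stand-ins for the real defect data.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical defect records; replace with the real data.
defects = pd.DataFrame({
    "Module": ["Login"] * 6 + ["Billing"] * 6 + ["Reports"] * 6,
    "Severity": [1, 1, 1, 2, 2, 3,
                 1, 2, 2, 2, 3, 3,
                 2, 3, 3, 3, 3, 1],
})

# Contingency table: one row per module, one column per severity level.
table = pd.crosstab(defects["Module"], defects["Severity"])

# Chi-square test of independence between module and severity.
chi2, p_value, dof, expected = chi2_contingency(table)

print(table)
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3f}")
```

A small p-value would suggest that severity is distributed differently across modules. With counts as small as in this toy table the test is not reliable; a common rule of thumb is that expected counts should be around 5 or more.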

#### Joker

##### New Member
Do the variables need to be of type "Double" for the correlation to work? I ask because the Module column is a string.

#### Omerikooo

##### New Member

A string column is probably fine for your purpose, but it would help to know which software you are using.

It is not statistically sound to call this a correlation; the chi-square test checks for an association between categorical variables. You can refer to Wikipedia: "Pearson's chi-squared test is used to determine whether there is a statistically significant difference between the expected frequencies and the observed frequencies in one or more categories of a contingency table."
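To show what "expected" versus "observed" frequencies means in that definition, here is a minimal sketch that computes the statistic by hand with NumPy. The table is hypothetical (rows are modules, columns are severity levels), not the real data.

```python
import numpy as np

# Hypothetical observed counts: rows = modules, columns = severity 1..3.
observed = np.array([
    [3, 2, 1],
    [1, 3, 2],
    [1, 1, 4],
])

row_totals = observed.sum(axis=1, keepdims=True)
col_totals = observed.sum(axis=0, keepdims=True)
grand_total = observed.sum()

# Expected count in each cell if module and severity were independent:
# row total * column total / grand total.
expected = row_totals * col_totals / grand_total

# Pearson's chi-squared statistic: sum of (observed - expected)^2 / expected.
chi2 = ((observed - expected) ** 2 / expected).sum()

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected)
print(f"chi2 = {chi2:.3f}, dof = {dof}")
```

The larger the gap between the observed and expected counts, the larger the statistic, and the stronger the evidence that severity depends on the module.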

#### Joker

##### New Member
Thank you so much!