# Homework Help

##### New Member
Hello!

We're currently studying regression in our basic statistics class. My professor gave each of us in the class a set of data and we were tasked with describing the relationship between the two variables within.

The two variables are Patient Satisfaction (KPasTot) and Evaluation of Nurse's Therapeutic Communication Skills (KTerTot).

To be honest, I'm still not entirely sure what I'm doing. I looked at the data and neither looks to be normal.

I then performed Pearson and Spearman's rho correlation tests (I think Spearman's is for non-parametric data?) and both returned a significant positive correlation between the two variables. So then I plotted the data and got this ugly mess.

Those points are all over the place, but they look to GENERALLY be following that linear fit line? Would I just describe the relationship as significantly positive, report the correlation coefficient and significance, and include the equation for the fit line and its R^2? I feel like describing the relationship as "significantly positively correlated" doesn't properly explain the relationship between the two, but (based on the super simple analysis I've done) I don't see a clear relationship between the two.

I'm not looking for answers but how would you approach this question? And what would you do if faced with a scatterplot like the one above? I want to understand the thinking process behind the analysis and reporting more than to know the "correct" answer to this assignment.

Thanks in advance to anyone nice enough to help with suggestions or advice.

#### Karabiner

##### TS Contributor
> I then performed Pearson and Spearman's rho correlation tests (I think Spearman's is for non-parametric data?)

There is no such thing as nonparametric data. There are non-parametric tests
(tests which do not assume certain distributional properties of the data or of
the prediction errors). Spearman is used for rank data (or for interval data transformed
into ranks). Personally, I would use both Spearman (a coefficient for the degree
of monotonic association) and Pearson (degree of linear association) to describe
the relationship in the sample.
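
To make the difference between the two coefficients concrete, here is a minimal sketch in plain Python. The data are made up for illustration (they are not the KPasTot/KTerTot values), and the helper names `pearson`, `average_ranks`, and `spearman` are hypothetical. Spearman is computed as Pearson on the ranks, with tied values sharing their average rank.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation: degree of linear association."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks; tied values all get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Made-up example: y rises with x in every step, but not along a line.
x = [1, 2, 3, 4, 5, 6]
y = [1, 8, 27, 64, 125, 216]

r = pearson(x, y)
rho = spearman(x, y)
print(f"Pearson r = {r:.3f}, Spearman rho = {rho:.3f}")
```

In this toy example the association is perfectly monotonic, so Spearman reaches its maximum, while Pearson stays below 1 because the points do not lie on a straight line.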

It is interesting to note that both variables show high frequencies for the most extreme
positive value, although I do not know what to make of this.

> So then I plotted the data and got this ugly mess.

I do not think it is ugly. The R² indicates that there's a large correlation between the
variables, which is more or less reflected by the scatterplot.

> Would I just describe the relationship as significantly positive, report the correlation coefficient and significance, and include the equation for the fit line and its R^2?

Sounds OK.
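
For the fit line and R^2 part of such a report, a minimal sketch of an ordinary least-squares fit in plain Python could look like the following. The numbers are invented for illustration, and `fit_line` is a hypothetical helper, not something from the assignment or from any particular statistics package. The significance test is left out here.

```python
def fit_line(x, y):
    """Least-squares line y = intercept + slope * x, plus R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    # With a single predictor, R^2 equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Made-up scores standing in for the two questionnaire totals.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

slope, intercept, r_squared = fit_line(x, y)
print(f"y = {intercept:.2f} + {slope:.2f} * x, R^2 = {r_squared:.2f}")
```

Because R^2 in simple linear regression is just the square of Pearson's r, reporting both the coefficient and R^2 gives the same information in two forms.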

Just my 2pence

Karabiner

##### New Member
Thanks a lot for your input.

I obviously have a lot more learning to do on this topic, and I really appreciate your help steering me in the right direction.

I decided to graph how the average of the Communication Therapeutic Skills related to the Patient Satisfaction and got a graphic that shows a much clearer relationship between the two variables.

Is this a valid way of going about looking at the data? I understand the linear fit line no longer describes the raw results but rather the average value of the samples. How valuable is this graphic in describing the general relationship between the two variables?

#### Karabiner

##### TS Contributor
Unfortunately, I have no idea what you actually did there. "the average of the Communication Therapeutic Skills related to the Patient Satisfaction": what average? There is only one average in the sample, but you display a graph.

With kind regards

Karabiner

#### Dason

Looks like they averaged over any replicated values. You would still want to do the regression on the original data though.
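
A small sketch of why the averaged plot looks so much cleaner, using invented data with repeated x values (not the poster's data). The helpers `r_squared` and `group_means` are hypothetical names. Averaging y within each x value removes the scatter around the line, so R^2 on the means overstates how well x predicts an individual y.

```python
from collections import defaultdict

def r_squared(x, y):
    """R^2 of a simple linear regression (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

def group_means(x, y):
    """Average y over every replicated x value."""
    groups = defaultdict(list)
    for a, b in zip(x, y):
        groups[a].append(b)
    xs = sorted(groups)
    return xs, [sum(groups[a]) / len(groups[a]) for a in xs]

# Made-up data: three observations at each of three x values.
x = [1, 1, 1, 2, 2, 2, 3, 3, 3]
y = [0, 2, 4, 1, 3, 5, 2, 4, 6]

raw_r2 = r_squared(x, y)
mean_x, mean_y = group_means(x, y)
mean_r2 = r_squared(mean_x, mean_y)
print(f"R^2 on raw data: {raw_r2:.2f}, R^2 on group means: {mean_r2:.2f}")
```

In this toy case the group means fall exactly on a line, while the raw observations are widely scattered around it. The averaged graphic is fine as a picture of the trend, but the fit statistics should come from the original data.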

#### shaolu

##### New Member
> Looks like they averaged over any replicated values. You would still want to do the regression on the original data though.

Hi Dason, I have a math question that's not related to this (sorry), but could you help me with it please?

#### shaolu

##### New Member
> Looks like they averaged over any replicated values. You would still want to do the regression on the original data though.

Question: find lambda given that P(X <= 5) = 0.9896
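
The post does not say which distribution X follows. Purely as an assumption, if X is Poisson with rate lambda, the value can be found numerically, since the Poisson CDF at a fixed point decreases as lambda grows. The sketch below uses bisection; `poisson_cdf` and `solve_lambda` are hypothetical helper names, and the search bounds are arbitrary choices.

```python
from math import exp, factorial

def poisson_cdf(k, lam):
    """P(X <= k) for X ~ Poisson(lam). Assumes a Poisson model."""
    return exp(-lam) * sum(lam ** i / factorial(i) for i in range(k + 1))

def solve_lambda(k, target, lo=0.0, hi=50.0, tol=1e-10):
    """Bisection on lam: the CDF at k is strictly decreasing in lam."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poisson_cdf(k, mid) > target:
            lo = mid  # CDF still too large, so lam must be larger
        else:
            hi = mid
    return (lo + hi) / 2

lam = solve_lambda(5, 0.9896)
print(f"lambda is approximately {lam:.3f}")
```

If the course intends a different distribution (for example exponential), the same bisection idea applies with that distribution's CDF swapped in.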