# Thread: [Solved] Significance in R

1. ## [Solved] Significance in R

Ok, this is quite hard to explain, but I'm at a complete loss what to do. I'm a relative newcomer to R and although I can completely admire how powerful it is, I'm not too good at actually using it....

Basically, I have some very contrived data that I need to analyse (it wasn't me who chose this, I can assure you!). I have the right and left hand lengths of lots of people, as well as some numeric data that shows their sociability.

Now I would like to know if people who have significantly different lengths of hand are more or less sociable than those who have the same (leading into the research that 'symmetrical' people are more sociable and intelligent, etc.).

I have got as far as loading the data into R, then I have no idea where to go from there. How on Earth do I start to separate those who are close to symmetrical from those who aren't to then start to do the analysis?

2. ## Re: Significance in R

Originally Posted by Gemsie
I have got as far as loading the data into R, then I have no idea where to go from there. How on Earth do I start to separate those who are close to symmetrical to those who aren't to then start to do the analysis?
Hi Gemsie,

Start by determining which test you want to use; here's a guide.

Then give us an example of how your data file is structured. For example what does the output of this command look like:
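The command itself did not survive in this post; a later reply refers to it as `head`, so it was presumably something along these lines. A minimal sketch, assuming the data frame is called `measurements` with columns `l.hand`, `r.hand` and `social` (names taken from later posts; the values here are simulated stand-ins):

```r
# Hypothetical stand-in for the real data; replace with your own data frame.
set.seed(42)
measurements <- data.frame(
  l.hand = rnorm(10, mean = 18, sd = 1),  # left hand length (cm), simulated
  r.hand = rnorm(10, mean = 18, sd = 1),  # right hand length (cm), simulated
  social = rpois(10, lambda = 12)         # sociability score, simulated
)

head(measurements)  # first six rows, enough to show the structure
str(measurements)   # column types and dimensions
```

Posting the output of `head()` shows column names and data layout without sharing the whole file.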

It's also useful to read this.

3. ## Re: Significance in R

I definitely need to apologise for the ridiculous way I've asked the question...I'm definitely under the stupid category on your guide so thanks for your patience.

Thank you for sending me the other link too, I'll certainly be using it in future.

I've tried to simplify my question. I would like to do two things:
1. Test whether the lengths of the left and right hands are significantly different from each other.
2. Test whether the sociability is significantly affected by hand length (i.e. is there a difference between those people who have similar left and right hand lengths, to those who have different lengths?)

I have 150 people, and thought that at some point (when I eventually figure out the stuff before), I'd have to do something along the lines of:
glm(sociable~ ???, family="binomial" )

??? : I've got no idea what to put here...

4. ## Re: Significance in R

Originally Posted by Gemsie
I definitely need to apologise for the ridiculous way I've asked the question...I'm definitely under the stupid category on your guide so thanks for your patience.

Originally Posted by Gemsie
???
Why don't you post an example of how you structured your data (with the command head, as I suggested; we will only see the first few lines)? This will make it easier for me (or someone else) to help you with the code. That way we will have an idea of what to put there.

With the data example I can also help you more easily with 1.

5. ## Re: Significance in R

What I've done so far is this:

hist(social)
This shows me that the data isn't normally distributed (i.e. it's a reverse J distribution).

I then:
stand=scale(measurements$l.hand-measurements$r.hand)
m<-lm(measurements$social~stand)
m
summary(m)
anova(m)
par(mfrow=c(2,2))
plot(m)

This obviously gives me the model plot checks, and again shows the data isn't normally distributed.

I then tried to plot the results on a graph, so:
plot(l.hand-r.hand,social,
ylab="Sociability (number of people spoken to)",
xlab="Difference in Hand Length (cm)")

But I'm unable to plot a line of best fit on it....I tried abline, but that just runs a horizontal line straight through 0. So then I looked at scatter.smooth but that looks wrong too.....
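One possible cause, offered as an assumption since the `abline` call itself isn't shown: `abline()` draws a fitted line only when it is given a model whose predictor is the same variable as the plot's x axis. The model `m` above was fitted on the scaled difference, while the plot uses the raw difference. A minimal sketch on simulated data (all values are made up; the `diff` column is a name introduced here for illustration):

```r
# Simulated stand-in for the real measurements (assumption, not the thread's data)
set.seed(1)
n <- 152
l.hand <- rnorm(n, mean = 18, sd = 1)
r.hand <- l.hand + rnorm(n, mean = 0, sd = 0.3)
social <- rpois(n, lambda = 12)
measurements <- data.frame(l.hand, r.hand, social)

# Use the same raw difference for both the model and the plot
measurements$diff <- measurements$l.hand - measurements$r.hand
fit <- lm(social ~ diff, data = measurements)

plot(social ~ diff, data = measurements,
     ylab = "Sociability (number of people spoken to)",
     xlab = "Difference in Hand Length (cm)")
abline(fit)  # line of best fit taken from the fitted model
```

If the fitted slope is close to zero, the line will look nearly horizontal, which is itself informative.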

Finally, I'm trying to do some kind of analysis on the data, but I'm totally lost here. I've muddled my way through some ideas:

1.
stand=scale(measurements$l.hand-r.hand,center=FALSE)
fake=abs(stand)<1.96
t.test(measurements$social[fake],measurements$social[!fake])

But I don't think I have enough observations to do this (150)...

2.
cor(abs(measurements$l.hand-measurements$r.hand),measurements$social)

But again, I have no idea if this is right and don't know how to interpret this.

3.
set.seed(1)
DF <- data.frame(l.hand = rnorm(100, 15, sd = 2), r.hand = rnorm(100, 15, sd = 2), social = runif(100))
DF <- within(DF, hands <- l.hand - r.hand)
mod <- lm(social ~ hands, data = DF)
summary(mod)
plot(social ~ hands, data = DF)

So I've messed about with all of these and they all work, but I just feel like I'm blindly trying anything, feeling optimistic when it doesn't error, but in essence I have absolutely no idea what I'm doing.

6. ## Re: Significance in R

Originally Posted by Gemsie
1. Test whether the lengths of the left and right hands are significantly different from each other.
2. Test whether the sociability is significantly affected by hand length (i.e. is there a difference between those people who have similar left and right hand lengths, to those who have different lengths?)
OK, now I have a slightly better idea of how your data looks (all I wanted to know from the data example was how you had structured your data).

No offense meant, but it always amazes me how much (often sophisticated) statistics people conduct on their data without first looking at them (now if you did the below first, then this comment is of course not for you). From experience I know that doing some simple data exploration sometimes saves you hours of fussing with analysis. It (1) gives you a 'feel' for your data, (2) helps you understand the trends you find and (3) makes your analysis much more directed.

For your objective 1. Try this: plot a boxplot of both statistics.

Code:
```r
# create a data frame of handstats
handstats=data.frame(length=c(measurements$l.hand,
measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
# boxplots
boxplot(handstats$length~handstats$hand)
```
Now that should already give you a pretty good idea of the existence of any differences, the distribution of the data and what test would be best.
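Since each person contributes both a left and a right measurement, a paired comparison is one natural candidate for objective 1. A minimal sketch on simulated data (the numbers are made up and only stand in for `l.hand` and `r.hand`; whether the t-test or the rank-based version is more suitable depends on how the differences are distributed):

```r
# Simulated stand-in data: right hand is the left hand plus a small offset and noise
set.seed(2)
n <- 152
l.hand <- rnorm(n, mean = 18, sd = 1)
r.hand <- l.hand + rnorm(n, mean = 0.1, sd = 0.3)

# Paired t-test: tests whether the mean within-person difference is zero
paired <- t.test(l.hand, r.hand, paired = TRUE)
paired

# Rank-based alternative if the differences look clearly non-normal
wilcox.test(l.hand, r.hand, paired = TRUE)
```

A histogram of `l.hand - r.hand` is a quick way to judge which of the two is more defensible.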

Report on how this looks, but best of all would be if you posted the graph here.

For 2. Plot these scatter plots.
Code:
```r
par(mfrow=c(2,2))
# 1
plot(social~l.hand)
# 2
plot(social~r.hand)
# 3
plot(social~c(r.hand-l.hand))
# 4 this tells you whether right and left hand lengths are related,
# and thus if for 1 a different test would be more appropriate (e.g. ANCOVA).
plot(r.hand~l.hand)
```
Again when you report back, it would be best to post the graph. We should then be able to see which course of action makes sense.

Hope this helps,

7. ## Re: Significance in R

Thank you so much for helping me....you have no idea how grateful I am.

I tried to do the first thing you said, but it just errors:

> handstats=data.frame(length=c(measurements$l.hand,
+ measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
+ boxplot(handstats$length~handstats$hand)

Error: unexpected symbol in:
"measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
boxplot
"
I have retyped it word for word, just in case it's a formatting error, etc. but to no avail.

Part two graphs:
This worked! I have no idea how to post a graph in the thread though (how terrible do I appear with computers?!) but have attached it.

8. ## Re: Significance in R

Originally Posted by Gemsie
Thank you so much for helping me....you have no idea how grateful I am.

I tried to do the first thing you said, but it just errors:

> handstats=data.frame(length=c(measurements$l.hand,
measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
boxplot(handstats$length~handstats$hand)

Error: unexpected symbol in:
"measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
boxplot
"
I have retyped it word for word, just in case it's a formatting error, etc. but to no avail.
Aah, I missed one parenthesis; here it is with the closing parenthesis added:

Code:
```r
handstats=data.frame(length=c(measurements$l.hand,
measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right')))
boxplot(handstats$length~handstats$hand)
```
That should work.
Also don't forget this plot (scatter plot left right hand lengths):
Code:
```r
plot(measurements$l.hand~measurements$r.hand)
```
Originally Posted by Gemsie
Part two graphs:
This worked! I have no idea how to post a graph in the thread though (how terrible do I appear with computers?!) but have attached it.
Looks like there is some positive relationship (the graphs also lead me to believe there is a relationship between l.hand and r.hand as well). But let's see the scatter plot of left and right hand lengths first.

There seems to be no relationship between sociability and the difference in hand sizes (r.hand-l.hand), so you can forget about that.

9. ## Re: Significance in R

Ok, I tried it again, but got this:

> handstats=data.frame(length=c(measurements$l.hand,
+ + measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right')))

Error in rep(150, "left") : invalid 'times' argument
In data.frame(length = c(measurements$l.hand, +measurements$r.hand), hand = c(rep(150, :
NAs introduced by coercion
>
boxplot(handstats\$length~handstats\$hand)

What does this mean? I've noticed that in the output code, there is an extra +, which I haven't written in the coding window....strange.

The new graph was interesting with a definite relationship (except for one very strange person who has particularly odd hands)!

10. ## Re: Significance in R

Originally Posted by Gemsie
Ok, I tried it again, but got this:

What does this mean? I've noticed that in the output code, there is an extra +, which I haven't written in the coding window....strange.
Hi Gemsie,

That error message means that there are not exactly 150 samples (I thought there were 150 from your previous remarks). Let's see if this finally works:

Code:
```r
# I'm adding code to find the exact length of the measurements
N=dim(measurements)[1]
handstats=data.frame(length=c(measurements$l.hand,
measurements$r.hand), hand=c(rep(N,'left'),rep(N,'right')))
boxplot(handstats$length~handstats$hand)
```

Originally Posted by Gemsie
The new graph was interesting with a definite relationship (except for one very strange person who has particularly odd hands)!
That's what I expected: if you have a large right hand, odds are you have a large left hand! This also tells us that both variables give us basically the same information, so you don't need to, and should not, use both (in statistics this is called collinearity). You need to choose either left or right hand lengths in the rest of your analysis (you can let model fits guide your decision later, but I suspect right hand lengths will be best, as this is the dominant hand for most people).
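To put a number on how strongly the two hand lengths move together, a quick correlation check can help. A minimal sketch on simulated data (the values are made up; substitute the real `l.hand` and `r.hand` columns):

```r
# Simulated stand-in data: the two hands are built to be strongly related
set.seed(3)
n <- 152
l.hand <- rnorm(n, mean = 18, sd = 1)
r.hand <- l.hand + rnorm(n, mean = 0, sd = 0.3)

r <- cor(l.hand, r.hand)  # Pearson correlation between the two predictors
r
cor.test(l.hand, r.hand)  # same estimate with a confidence interval
```

A correlation close to 1 means the two predictors carry nearly the same information, which is why including both in one model makes their individual coefficients hard to interpret.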

Next try this code (we start with a very simple linear regression model).
Code:
```r
m1=lm(social~r.hand) # or whatever hand you choose
# see what these commands tell you:
# summary of the fitted model
summary(m1)
# evaluate if the residuals are normal
hist(m1$residuals)
shapiro.test(m1$residuals)
```

11. ## Re: Significance in R

> N=dim(measurements)[1]
> handstats=data.frame(length=c(measurements$l.hand,
+ measurements$r.hand), hand=c(rep(N,'left'),rep(N,'right')))

Error in rep(N, "left") : invalid 'times' argument
In data.frame(length = c(measurements$l.hand, measure$r.hand), hand = c(rep(N, :
NAs introduced by coercion

> boxplot(handstats\$length~handstats\$hand)

BTW, I actually have 152 samples (I just rounded down for simplicity, sorry), so I redid the original using 152...still didn't work!

12. ## Re: Significance in R

You can't use a character as the times parameter. You probably want it switched around
Code:
```r
rep("left", 152)
```
You should also learn how to debug these things yourself (it's a very good skill). R has a good built in help system. To get help on using rep you would do
Code:
```r
?rep
#or
help(rep)
```
But for your purposes you would probably want to explore and use something like
Code:
```r
N <- 152
rep(c("left", "right"), each = N)
```
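Putting that together with the earlier data frame code, the whole step might look like the sketch below. It assumes a data frame named `measurements` with `l.hand` and `r.hand` columns, as in the posts above; the values here are simulated stand-ins.

```r
# Simulated stand-in for the real data (assumption, not the thread's data)
set.seed(4)
measurements <- data.frame(
  l.hand = rnorm(152, mean = 18, sd = 1),
  r.hand = rnorm(152, mean = 18, sd = 1)
)

N <- nrow(measurements)  # number of people, taken from the data itself

# Stack left then right lengths; label the first N rows "left", the next N "right"
handstats <- data.frame(
  length = c(measurements$l.hand, measurements$r.hand),
  hand   = rep(c("left", "right"), each = N)
)

table(handstats$hand)                      # N labels per hand
boxplot(length ~ hand, data = handstats)   # side-by-side boxplots
```

Using `each = N` keeps the labels in the same order as the stacked lengths, whereas `times = N` would alternate "left", "right", "left", ... and mislabel the rows.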

13. ## Re: Significance in R

Yay! I got it to work! Thank you, thank you, thank you to TheEcologist and Dason!

I use the help function...a lot, but I always think it's super complex. I could feasibly have sat there for hours trying to figure out which is the wrong way round, etc. I have used Crawley's text too, but didn't find it particularly helpful

Does anybody have any recommendations for textbooks, etc?

> m1=lm(social~r.hand)
> summary(m1)

Call:
lm(formula = social ~ r.hand)

Residuals:
Min 1Q Median 3Q Max
-16.922 -8.468 -4.238 4.568 34.874

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -14.5674 7.5104 -1.940 0.054300 .
r.hand 3.0612 0.8054 3.801 0.000209 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 12.19 on 150 degrees of freedom
Multiple R-squared: 0.08785, Adjusted R-squared: 0.08177
F-statistic: 14.45 on 1 and 150 DF, p-value: 0.0002091

> hist(m1$residuals)
> shapiro.test(m1$residuals)

Shapiro-Wilk normality test

data: m1$residuals
W = 0.8571, p-value = 7.608e-11

14. ## Re: Significance in R

Originally Posted by Dason
You can't use a character as the times parameter. You probably want it switched around
Code:
``rep("left", 152)``

Aaah yes, of course, first the times argument. Thanks for paying attention, Dason! Guess I should have checked it in R (I was coding from memory).

I was a little lazy. Shame on me.

15. ## Re: Significance in R

Coding from memory?

Oh.my.gosh.
