
Thread: [Solved] Significance in R

  1. #1
    Gemsie

    Ok, this is quite hard to explain, but I'm at a complete loss what to do. I'm a relative newcomer to R and although I can completely admire how powerful it is, I'm not too good at actually using it....

    Basically, I have some very contrived data that I need to analyse (it wasn't me who chose this, I can assure you!). I have the right and left hand lengths of lots of people, as well as some numeric data that shows their sociability.

    Now I would like to know if people who have significantly different lengths of hand are more or less sociable than those who have the same (leading into the research that 'symmetrical' people are more sociable and intelligent, etc.).

    I have got as far as loading the data into R, then I have no idea where to go from there. How on Earth do I start to separate those who are close to symmetrical to those who aren't to then start to do the analysis?

  2. #2
    TheEcologist

    Quote Originally Posted by Gemsie View Post
    I have got as far as loading the data into R, then I have no idea where to go from there. How on Earth do I start to separate those who are close to symmetrical to those who aren't to then start to do the analysis?
    Hi Gemsie,

    I'm going to ask for more information first.

    Start with determining which test you want to use, here's a guide.

    Then give us an example of how your data file is structured. For example what does the output of this command look like:

    head(mydataset)

    It's also useful to read this.
    Last edited by TheEcologist; 01-07-2011 at 10:17 AM. Reason: typo
    The true ideals of great philosophies always seem to get lost somewhere along the road..

  3. #3
    Gemsie

    I definitely need to apologise for the ridiculous way I've asked the question...I'm definitely under the stupid category on your guide so thanks for your patience.

    Thank you for sending me the other link too, I'll certainly be using it in future.

    I've tried to simplify my question. I would like to do two things:
    1. Test whether the lengths of the left and right hands are significantly different from each other.
    2. Test whether the sociability is significantly affected by hand length (i.e. is there a difference between those people who have similar left and right hand lengths, to those who have different lengths?)

    I have 150 people, and thought that at some point (when I eventually figure out the stuff before), I'd have to do something along the lines of:
    glm(sociable~ ???, family="binomial" )

    ??? : I've got no idea what to put here...
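
    One way to frame the two questions above, offered as a rough sketch rather than a recommendation: a paired t-test compares left and right lengths within each person, and an absolute difference gives a per-person asymmetry score that can be related to sociability. The data frame `demo`, the column `asym` and all simulated values are made up for illustration; only the names `l.hand`, `r.hand` and `social` come from this thread.

```r
# Simulated stand-in data (hypothetical values, not the real measurements)
set.seed(42)
n <- 150
l.hand <- rnorm(n, mean = 18, sd = 1)
demo <- data.frame(
  l.hand = l.hand,
  r.hand = l.hand + rnorm(n, mean = 0, sd = 0.3),  # right hand tracks left hand
  social = rpois(n, lambda = 10)                   # a count, e.g. people spoken to
)

# Question 1: paired test, because both lengths come from the same person
q1 <- t.test(demo$l.hand, demo$r.hand, paired = TRUE)
print(q1)

# Question 2: build an asymmetry score, then relate it to sociability
demo$asym <- abs(demo$l.hand - demo$r.hand)
q2 <- lm(social ~ asym, data = demo)
print(summary(q2))
```

    Whether a linear model is the right choice for question 2 depends on how `social` is distributed, so treat `q2` as a starting point only.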

  4. #4
    TheEcologist

    Quote Originally Posted by Gemsie View Post
    I definitely need to apologise for the ridiculous way I've asked the question...I'm definitely under the stupid category on your guide so thanks for your patience.
    Don't worry about it.

    Quote Originally Posted by Gemsie View Post
    ???
    Why don't you post an example of how you structured your data (with the head command, as I suggested; we will only see the first few lines)? This will make it easier for me (or someone else) to help you with the code. That way we will have an idea of what to put there.

    With the data example I can also help you more easily with point 1.

  5. #5
    Gemsie

    What I've done so far is this:

    hist(social)
    This shows me that the data isn't normally distributed (i.e. it's a reverse J distribution).

    I then:
    stand=scale(measurements$l.hand-measurements$r.hand)
    m<-lm(measurements$social~stand)
    m
    summary(m)
    anova(m)
    par(mfrow=c(2,2))
    plot(m)


    This obviously gives me the model plot checks, and again shows the data isn't normally distributed.

    I then tried to plot the results on a graph, so:
    plot(l.hand-r.hand,social,
    ylab="Sociability (number of people spoken to)",
    xlab="Difference in Hand Length (cm)")


    But I'm unable to plot a line of best fit on it....I tried abline, but that just runs a horizontal line straight through 0. So then I looked at scatter.smooth but that looks wrong too.....
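
    A minimal sketch of adding a line of best fit to a scatter plot, assuming the model is fitted on the same x variable that is plotted. The data frame `demo` and its values are simulated for illustration; `abline()` accepts a fitted `lm` object with a single predictor and draws its intercept and slope.

```r
# Simulated stand-in data (hypothetical values)
set.seed(1)
demo <- data.frame(l.hand = rnorm(100, 18, 1), r.hand = rnorm(100, 18, 1))
demo$diff <- demo$l.hand - demo$r.hand
demo$social <- rpois(100, lambda = 10)

# Fit on the raw difference, i.e. the same variable used on the x axis
fit <- lm(social ~ diff, data = demo)

plot(social ~ diff, data = demo,
     xlab = "Difference in Hand Length (cm)",
     ylab = "Sociability (number of people spoken to)")
abline(fit, col = "red")   # draws intercept + slope * x from the fitted model
```

    If the model was fitted on a scaled variable but the plot uses the raw difference, the line may not match the points; that may or may not be what happened here, since the abline call itself was not posted.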

    Finally, I'm trying to do some kind of analysis on the data, but I'm totally lost here. I've muddled my way through some ideas:

    1.
    stand=scale(measurements$l.hand-r.hand,center=FALSE)
    fake=abs(stand)<1.96
    t.test(measurements$social[fake],measurements$social[!fake])


    But I don't think I have enough observations to do this (150)...

    2.
    cor(abs(measurements$l.hand-measurements$r.hand),measurements$social)

    But again, I have no idea if this is right and don't know how to interpret this.
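
    For reading a correlation like the one in idea 2, `cor.test()` returns both the estimate and a p-value. The sketch below uses Spearman's rank correlation because it does not assume normally distributed data; that choice, the variable `asym` and the simulated values are assumptions for illustration, not something settled in this thread.

```r
# Simulated stand-in data (hypothetical values)
set.seed(7)
l.hand <- rnorm(152, 18, 1)
r.hand <- l.hand + rnorm(152, 0, 0.3)
social <- rpois(152, lambda = 10)

# Size of the left/right difference, ignoring its direction
asym <- abs(l.hand - r.hand)

# exact = FALSE avoids the warning about ties in count data
ct <- cor.test(asym, social, method = "spearman", exact = FALSE)
print(ct$estimate)   # rho: direction and strength of the monotonic association
print(ct$p.value)    # small values are evidence against "no association"
```

    A rho near 0 with a large p-value would suggest no detectable monotonic link between asymmetry and sociability.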

    3.
    set.seed(1)
    DF <- data.frame(l.hand = rnorm(100, 15, sd = 2), r.hand = rnorm(100, 15, sd = 2), social = runif(100))
    DF <- within(DF, hands <- l.hand - r.hand)
    mod <- lm(social ~ hands, data = DF)
    summary(mod)
    plot(social ~ hands, data = DF)


    So I've messed about with all of these and they all work, but I just feel like I'm blindly trying anything, feeling optimistic when it doesn't error, but in essence I have absolutely no idea what I'm doing.

  6. #6
    TheEcologist

    Quote Originally Posted by Gemsie View Post
    1. Test whether the lengths of the left and right hands are significantly different from each other.
    2. Test whether the sociability is significantly affected by hand length (i.e. is there a difference between those people who have similar left and right hand lengths, to those who have different lengths?)
    OK, now I have a slightly better idea of how your data look (all I wanted to know from the data example was how you had structured your data).

    No offense meant, but it always amazes me how much (often sophisticated) statistics people conduct on their data without first looking at them (if you already did the steps below, then this comment is of course not for you). From experience I know that some simple data exploration can save you hours of fussing with analysis. It (1) gives you a 'feel' for your data, (2) helps you understand the trends you find and (3) makes your analysis much more directed.

    For your objective 1, try this: draw a boxplot of both hand lengths.

    Code: 
    #create a data frame of handstats
    handstats=data.frame(length=c(measurements$l.hand,
    measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
    # boxplots
    boxplot(handstats$length~handstats$hand)
    Now that should already give you a pretty good idea of the existence of any differences, the distribution of the data and what test would be best.

    Report on how this looks, but best of all would be if you posted the graph here.

    For 2. Plot these scatter plots.
    Code: 
    par(mfrow=c(2,2))
    # 1
    plot(social~l.hand)
    # 2
    plot(social~r.hand)
    # 3
    plot(social~c(r.hand-l.hand))
    # 4: this tells you whether right and left hand lengths are related,
    # and thus whether a different test would be more appropriate for 1 (e.g. ANCOVA).
    plot(r.hand~l.hand)
    Again, when you report back, it would be best to post the graph. We should then be able to see which course of action makes sense.

    Hope this helps,

  7. #7
    Gemsie

    Thank you so much for helping me....you have no idea how grateful I am.

    I tried to do the first thing you said, but it just errors:

    > handstats=data.frame(length=c(measurements$l.hand,
    + measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
    + boxplot(handstats$length~handstats$hand)

    Error: unexpected symbol in:
    "measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
    boxplot
    "
    I have retyped it word for word, just in case it's a formatting error, etc. but to no avail.

    Part two graphs:
    This worked! I have no idea how to post a graph in the thread though (how terrible do I appear with computers?!) but have attached it.
    Attached Files
    Last edited by Gemsie; 01-08-2011 at 09:54 AM.

  8. #8
    TheEcologist

    Quote Originally Posted by Gemsie View Post
    Thank you so much for helping me....you have no idea how grateful I am.

    I tried to do the first thing you said, but it just errors:

    > handstats=data.frame(length=c(measurements$l.hand,
    measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
    boxplot(handstats$length~handstats$hand)

    Error: unexpected symbol in:
    "measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right'))
    boxplot
    "
    I have retyped it word for word, just in case it's a formatting error, etc. but to no avail.
    Aah I missed one parenthesis, here it is correct:

    Code: 
    handstats=data.frame(length=c(measurements$l.hand,
    measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right')))
    boxplot(handstats$length~handstats$hand)
    That should work.
    Also don't forget this plot (scatter plot left right hand lengths):
    Code: 
    plot(measurements$l.hand~measurements$r.hand)
    Quote Originally Posted by Gemsie View Post
    Part two graphs:
    This worked! I have no idea how to post a graph in the thread though (how terrible do I appear with computers?!) but have attached it.
    Looks like there is some positive relationship (the graphs also lead me to believe there is a relationship between l.hand and r.hand as well). But let's see the scatter plot of left and right hand lengths first.

    There seems to be no relationship between sociability and the difference in hand sizes (r.hand-l.hand), so you can forget about that.

  9. #9
    Gemsie

    Ok, I tried it again, but got this:

    > handstats=data.frame(length=c(measurements$l.hand,
    + + measurements$r.hand), hand=c(rep(150,'left'),rep(150,'right')))

    Error in rep(150, "left") : invalid 'times' argument
    In addition: Warning message:
    In data.frame(length = c(measurements$l.hand, +measurements$r.hand), hand = c(rep(150, :
    NAs introduced by coercion
    >
    boxplot(handstats$length~handstats$hand)
    Error in eval(expr, envir, enclos) : object 'handstats' not found

    What does this mean? I've noticed that in the output code, there is an extra +, which I haven't written in the coding window....strange.

    The new graph was interesting with a definite relationship (except for one very strange person who has particularly odd hands)!
    Attached Files

  10. #10
    TheEcologist

    Quote Originally Posted by Gemsie View Post
    Ok, I tried it again, but got this:

    What does this mean? I've noticed that in the output code, there is an extra +, which I haven't written in the coding window....strange.
    Hi Gemsie,

    That error message means that there are not exactly 150 samples (I thought there were 150 from your previous remarks). Let's see if this finally works:

    Code: 
    # I'm adding code to find the exact length of the measurements
    N=dim(measurements)[1]
    handstats=data.frame(length=c(measurements$l.hand,
    measurements$r.hand), hand=c(rep(N,'left'),rep(N,'right')))
    boxplot(handstats$length~handstats$hand)

    Quote Originally Posted by Gemsie View Post
    The new graph was interesting with a definite relationship (except for one very strange person who has particularly odd hands)!
    That's what I expected: if you have a large right hand, odds are you have a large left hand! This also tells us that both variables give us basically the same information, so you don't need to, and should not, use both (in statistics this is called collinearity). You need to choose either left or right hand lengths for the rest of your analysis (you can let model fits guide your decision later, but I suspect right hand lengths will be best, as this is the dominant hand for most people).
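
    The collinearity point can be checked numerically. In this sketch, with simulated values that are hypothetical and built so the two hand lengths are strongly related, the correlation is close to 1 and the standard error for `r.hand` grows when `l.hand` is added to the model.

```r
# Simulated stand-in data (hypothetical values)
set.seed(3)
l.hand <- rnorm(152, 18, 1)
r.hand <- l.hand + rnorm(152, 0, 0.3)   # strongly related by construction
social <- rpois(152, lambda = 10)

# A correlation near 1 means the two predictors carry almost the same information
r <- cor(l.hand, r.hand)
print(r)

# Compare standard errors with both predictors versus one
both <- lm(social ~ l.hand + r.hand)
one  <- lm(social ~ r.hand)
se_both <- coef(summary(both))[, "Std. Error"]
se_one  <- coef(summary(one))[, "Std. Error"]
print(se_both)
print(se_one)
```
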

    Next try this code (we start with a very simple linear regression model).
    Code: 
    m1=lm(social~r.hand) # or whatever hand you choose
    See what these commands tell you:
    #summary of the fitted model
    summary(m1)
    # evaluate if the residuals are normal
    hist(m1$residuals)
    shapiro.test(m1$residuals)

  11. #11
    Gemsie

    > N=dim(measurements)[1]
    > handstats=data.frame(length=c(measurements$l.hand,
    + measurements$r.hand), hand=c(rep(N,'left'),rep(N,'right')))

    Error in rep(N, "left") : invalid 'times' argument
    In addition: Warning message:
    In data.frame(length = c(measurements$l.hand, measure$r.hand), hand = c(rep(N, :
    NAs introduced by coercion

    > boxplot(handstats$length~handstats$hand)
    Error in eval(expr, envir, enclos) : object 'handstats' not found



    BTW, I actually have 152 samples (I just rounded down for simplicity, sorry), so I redid the original using 152...still didn't work!

  12. #12
    Dason

    You can't use a character string as the times argument. You probably want the arguments switched around:
    Code: 
    rep("left", 152)
    You should also learn how to debug these things yourself (it's a very good skill). R has a good built-in help system. To get help on using rep you would do
    Code: 
    ?rep
    #or
    help(rep)
    But for your purposes you would probably want to explore and use something like
    Code: 
    N <- 152
    rep(c("left", "right"), each = N)
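
    Putting the pieces together, the stacked data frame and boxplot from earlier in the thread could be built roughly as follows. The `measurements` data frame here is a simulated stand-in with hypothetical values; with the real data, only the lines from `N <- nrow(measurements)` onward would be needed.

```r
# Hypothetical stand-in for the thread's `measurements` data frame
set.seed(10)
measurements <- data.frame(l.hand = rnorm(152, 18, 1))
measurements$r.hand <- measurements$l.hand + rnorm(152, 0, 0.3)

# Take the row count from the data instead of typing it in
N <- nrow(measurements)

# Stack both hands into long format: one row per hand measurement
handstats <- data.frame(
  length = c(measurements$l.hand, measurements$r.hand),
  hand   = rep(c("left", "right"), each = N)   # value first, repeat count second
)

print(table(handstats$hand))
boxplot(length ~ hand, data = handstats, ylab = "Hand length (cm)")
```

    Using `each = N` keeps the labels in the same order as the stacked lengths: all left-hand rows first, then all right-hand rows.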

  13. #13
    Gemsie

    Yay! I got it to work! Thank you, thank you, thank you to TheEcologist and Dason!

    I use the help function...a lot, but I always think it's super complex. I could feasibly have sat there for hours trying to figure out which is the wrong way round, etc. I have used Crawley's text too, but didn't find it particularly helpful.

    Does anybody have any recommendations for textbooks, etc?

    > m1=lm(social~r.hand)
    > summary(m1)

    Call:
    lm(formula = social ~ r.hand)

    Residuals:
        Min      1Q  Median      3Q     Max
    -16.922  -8.468  -4.238   4.568  34.874

    Coefficients:
                Estimate Std. Error t value Pr(>|t|)
    (Intercept) -14.5674     7.5104  -1.940 0.054300 .
    r.hand        3.0612     0.8054   3.801 0.000209 ***
    ---
    Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

    Residual standard error: 12.19 on 150 degrees of freedom
    Multiple R-squared: 0.08785, Adjusted R-squared: 0.08177
    F-statistic: 14.45 on 1 and 150 DF, p-value: 0.0002091

    > hist(m1$residuals)
    > shapiro.test(m1$residuals)




    Shapiro-Wilk normality test

    data: m1$residuals
    W = 0.8571, p-value = 7.608e-11
    Attached Files
    Last edited by Gemsie; 01-08-2011 at 04:19 PM.
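
    The Shapiro-Wilk output above indicates clearly non-normal residuals. Since sociability was described earlier as the number of people spoken to, one possible alternative (an assumption on my part, not something proposed in the thread) is a count model with a log link. The sketch below uses simulated, right-skewed counts with hypothetical values; `quasipoisson` is chosen so the variance is allowed to exceed the mean.

```r
# Simulated stand-in data (hypothetical values)
set.seed(5)
r.hand <- rnorm(152, 18, 1)
# Right-skewed counts whose mean rises with hand length
social <- rnbinom(152, mu = exp(-1 + 0.18 * r.hand), size = 1.5)

# Log-link count model; quasipoisson estimates a dispersion parameter
m2 <- glm(social ~ r.hand, family = quasipoisson)
print(summary(m2))

# Dispersion well above 1 indicates more spread than a plain Poisson allows
disp <- summary(m2)$dispersion
print(disp)
```

    Coefficients from this model are on the log scale, so `exp(coef(m2))` gives multiplicative effects on the expected count.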

  14. #14
    TheEcologist

    Quote Originally Posted by Dason View Post
    You can't use a character as the times parameter. You probably want it switched around
    Code: 
    rep("left", 152)


    Aaah yes, of course: the value comes first, then the times argument. Thanks for paying attention, Dason! Guess I should have checked it in R (I was coding from memory).

    I was a little lazy. Shame on me.

  15. #15
    Gemsie

    Coding from memory?

    Oh.my.gosh.
