
Thread: Association/ correlation binary and continuous data non-normal distribution

  1. #1

    Hi,
    I want to perform an association analysis between a binary (dichotomous) variable and a continuous variable. The binary variable records whether something is present or not ("Yes" or "No", "1" or "0", etc.). The continuous variable takes values between ~150 and ~170. It is not normally distributed: there are more low values than high values. My question is whether high or low values of the continuous variable are associated with the 1 or 0 of the binary variable. For example, do low values go with "1"? My sample size is ~150.

    I have tried a point-biserial correlation test and a Spearman's rho test so far. I'm not sure whether either of them is the right one. Can someone give me some advice on this?

    Many Thanks!

  2. #2
    Karabiner (TS Contributor)

    Compare the means of your continuous variable between the "yes" and "no" group (t-Test, or rather Welch test).
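
For anyone following along, here is a minimal sketch of the Welch comparison suggested above, using only the Python standard library. The data values and the names `welch_t`, `days_yes`, and `days_no` are made up for illustration and are not from this thread. The sketch computes the test statistic and the Welch-Satterthwaite degrees of freedom; turning them into a p-value requires the t distribution.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(group_a), len(group_b)
    # Squared standard error of each group mean; variance() uses n - 1.
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / math.sqrt(se2_a + se2_b)
    # The degrees of freedom are estimated, because the two group
    # variances are not assumed to be equal.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical day-of-year values, split by the binary variable.
days_yes = [151.0, 152.5, 153.0, 154.0, 155.5, 156.0]
days_no = [155.0, 158.0, 160.5, 162.0, 165.0, 168.0, 169.5]

t_stat, dof = welch_t(days_yes, days_no)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

A negative t here means the "yes" group has the lower mean. If SciPy is available, `scipy.stats.ttest_ind(days_yes, days_no, equal_var=False)` runs the same test and also returns the p-value.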

    With kind regards

    Karabiner
    »Jetzt kann mich der Führer mal am Arsch lecken.« (Ernst Kuzorra, 1941)

  3. The Following User Says Thank You to Karabiner For This Useful Post:

    Hannah212 (04-27-2017)

  4. #3
    hlsmith (Omega Contributor)

    Yeah, I was going to propose the Wilcoxon rank sum test. Is your continuous variable bounded between ~150 and ~170, or was that just where most values landed?
    Stop cowardice, ban guns!

  5. The Following User Says Thank You to hlsmith For This Useful Post:

    Hannah212 (04-27-2017)

  6. #4

    Thank you!

    The continuous variable is bounded between ~150 and ~170. If this is a problem I can shift the values to ~0-20, but I don't think it is.

    A t-test and a point-biserial correlation test are basically the same thing, is that right? When I apply a t-test I get a really low p-value for every case I'm testing.
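
On the question above: the point-biserial correlation is the Pearson correlation with a 0/1 variable, and its significance test gives the same t statistic as the classic equal-variance (Student) t-test. It is not identical to the Welch test. A small sketch to check the identity t = r * sqrt((n - 2) / (1 - r^2)) numerically; the data and the helper names `pearson_r` and `pooled_t` are invented for illustration.

```python
import math
from statistics import mean


def pearson_r(x, y):
    """Plain Pearson correlation; with a 0/1 x this is the point-biserial r."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pooled_t(group_1, group_0):
    """Student t statistic (equal variances assumed) for mean_1 - mean_0."""
    n1, n0 = len(group_1), len(group_0)
    m1, m0 = mean(group_1), mean(group_0)
    ss = sum((v - m1) ** 2 for v in group_1) + sum((v - m0) ** 2 for v in group_0)
    pooled_var = ss / (n1 + n0 - 2)
    return (m1 - m0) / math.sqrt(pooled_var * (1 / n1 + 1 / n0))


# Hypothetical data: binary flag and day-of-year value per observation.
flags = [1, 1, 1, 1, 0, 0, 0, 0, 0]
days = [151.0, 153.5, 154.0, 157.0, 156.0, 160.5, 162.0, 165.0, 169.0]

r_pb = pearson_r(flags, days)
n = len(days)
t_from_r = r_pb * math.sqrt((n - 2) / (1 - r_pb ** 2))
t_direct = pooled_t(
    [d for f, d in zip(flags, days) if f == 1],
    [d for f, d in zip(flags, days) if f == 0],
)
print(f"r_pb = {r_pb:.4f}, t from r = {t_from_r:.4f}, pooled t = {t_direct:.4f}")
```

The two t values agree up to rounding, so both tests give the same p-value on n - 2 degrees of freedom.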

  7. #5
    hlsmith (Omega Contributor)

    I was asking about the bounds because, when a continuous variable is bounded, you can often get confidence intervals that extend beyond the allowable range. For example, an estimate of 99% for a quantity bounded by 100% might come with a 95% CI of 94% to 109%, which may be nonsensical.

    What are you trying to say with the results? Also, can the continuous variable take non-integer values, e.g., 156.89?


    Given your data, I would think an exact (Monte Carlo) Wilcoxon rank sum test would be appropriate. Another person on this forum would likely also recommend a permutation test, based on, say, the t-test framework.
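
A rough sketch of what a Monte Carlo version of the rank sum test could look like, standard library only. Tied values get the average of their ranks (midranks), and the group labels are reshuffled many times to build the null distribution. The function names, the data, the number of permutations, and the one-sided direction (group "1" tends to be lower) are all assumptions made for illustration.

```python
import random


def midranks(values):
    """Rank values from 1 to n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_permutation_p(group_1, group_0, n_perm=10_000, seed=0):
    """One-sided Monte Carlo p-value for 'group_1 tends to have lower values'."""
    rng = random.Random(seed)
    ranks = midranks(list(group_1) + list(group_0))
    n1 = len(group_1)
    observed = sum(ranks[:n1])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(ranks)  # random reassignment of ranks to the two groups
        if sum(ranks[:n1]) <= observed:
            hits += 1
    # Add 1 to both counts so the estimated p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical day-of-year values, including ties within and across groups.
days_1 = [150.0, 151.0, 151.0, 152.5, 153.0, 154.0, 155.0, 156.0]
days_0 = [155.0, 157.5, 158.0, 160.0, 161.0, 163.0, 165.5, 168.0]

p_value = rank_sum_permutation_p(days_1, days_0)
print(f"one-sided Monte Carlo p-value: {p_value:.4f}")
```

Swapping the rank sum for a difference in means (and skipping the ranking step) would give the permutation test in the t-test framework mentioned above.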


  9. #6

    The continuous variable is the day of the year. The event whose occurrence I am testing under condition "0" or "1" can occur approximately between 150 and 170 days after January 1st. Does this mean I have ties in my data? The values can also be floats, as I am also working with the mean value over several years.
    With the results I am trying to tell whether condition "1" leads to lower values of the continuous variable, that is, whether the event occurs earlier when condition "1" occurs.
