
Thread: Point-biserial correlation when some of the test assumptions are violated

  1. #1

    Hi!

    I originally posted my question on Stack Exchange (Cross Validated), but no one there seems to be able to answer it.

    This is the situation: I've got 15 raters who assess if there is deep invasion of a cancer type or not (dichotomous), and they can be either accurate or inaccurate. I've also asked raters how certain they are in their assessment, using a Visual Analogue Scale from 0-100.

    I would like to check for an association between these variables, answering the question "Do raters know when they are accurate/inaccurate?", so that I can later see whether more experienced raters are more aware of their limitations than inexperienced raters, who can be expected to be more hit-and-miss without knowing it.

    I've learned from other forums, where researchers ask the same question, that point-biserial correlation seems to be the way to go. However, some assumptions are violated: I get a significant Shapiro-Wilk test for some raters, a significant Levene's test for some others, and so forth.
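For readers unfamiliar with the statistic, here is a minimal sketch of the point-biserial correlation for a single rater, using only the Python standard library. The data are made up for illustration and the function names are hypothetical, not taken from any package. The sketch also shows that the point-biserial coefficient is simply Pearson's r computed with a 0/1 variable.

```python
import math

def point_biserial(binary, scores):
    """Point-biserial correlation between a 0/1 variable and a numeric one."""
    n = len(scores)
    group1 = [s for b, s in zip(binary, scores) if b == 1]
    group0 = [s for b, s in zip(binary, scores) if b == 0]
    n1, n0 = len(group1), len(group0)
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups need at least one observation")
    mean_all = sum(scores) / n
    # Population standard deviation of all scores (divide by n, not n - 1).
    sd = math.sqrt(sum((s - mean_all) ** 2 for s in scores) / n)
    if sd == 0:
        raise ValueError("scores have zero variance")
    mean_diff = sum(group1) / n1 - sum(group0) / n0
    return mean_diff / sd * math.sqrt(n1 * n0 / n ** 2)

def pearson(x, y):
    """Plain Pearson correlation, for comparison."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical ratings from one rater: 1 = accurate, 0 = inaccurate.
accurate = [1, 1, 0, 1, 0, 1, 1, 0, 1, 0]
certainty = [90, 75, 40, 85, 60, 100, 70, 30, 95, 55]  # VAS, 0-100

r_pb = point_biserial(accurate, certainty)
r = pearson(accurate, certainty)
print(round(r_pb, 3), round(r, 3))  # the two values agree
```

Computing the coefficient itself needs no distributional assumptions; the normality and equal-variance assumptions only matter for the usual significance test attached to it.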

    http://stats.stackexchange.com/quest...-continuous-in

    Any help is greatly appreciated, I'm totally stuck!

    //Rasmus Green, MD., PhD student, Karolinska Institute, Stockholm, Sweden

  2. #2
    hlsmith

    Yo, Dr. Green,

    Is this three variables or two?

    invasive (y/n)
    accurate (y/n)
    certainty (0-100)

    Which variables are in play here?

    Why not use logistic regression, regardless of whether there are 2 or 3 variables in the model? Also, do you have any 0s or 100s in the certainty variable? Please post its distribution.

  3. #3

    The pathologist provides the 'gold standard', true invasiveness (Y/N). The examiner provides his/her assessment, measured invasiveness (Y/N); thus var_accurate (Y/N) can easily be obtained.

    And yes, I have 0s and 100s; some raters are very confident, some are more humble.

    I don't know why most posts I've read on this recommend point-biserial correlation over logistic regression. I've asked but haven't got an explanation. Even Mann-Whitney or ANOVA could be used, I guess, but surely there must be some pros and cons which aren't clear to me.
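As a rough illustration of the rank-based option mentioned above, the sketch below computes the Mann-Whitney U statistic by direct pair counting, using made-up certainty scores for one rater. It needs no normality or equal-variance assumption. The function name and data are hypothetical, and no p-value is computed here.

```python
def mann_whitney_u(sample_a, sample_b):
    """U statistic for sample_a: number of (a, b) pairs with a > b; ties count 0.5."""
    u = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u

# Hypothetical VAS certainty scores, split by whether the assessment was accurate.
certainty_when_accurate = [90, 75, 85, 100, 50, 95]
certainty_when_inaccurate = [40, 60, 30, 80]

u = mann_whitney_u(certainty_when_accurate, certainty_when_inaccurate)
# U divided by the number of pairs estimates the probability that a randomly
# chosen accurate assessment carries higher certainty than an inaccurate one.
prob_higher = u / (len(certainty_when_accurate) * len(certainty_when_inaccurate))
print(u, prob_higher)
```

A value near 0.5 would suggest the rater's certainty does not distinguish accurate from inaccurate assessments. Like the point-biserial approach, this treats the observations as independent.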

  4. #4
    hlsmith

    Well, we have to get your details before making a suggestion that we can then explain.

    Your outcome is binary (invasive: y/n), so you are predicting a dichotomous variable. That narrows down your options. I'm pretty sure point-biserial correlation can't handle three variables, and why would you use it when its interpretation is not as clear as that of logistic regression? So you can run a regression model predicting invasive from an oncologist's/radiologist's/lab tech's interpretation plus their confidence.

    Model: invasive (y/n) = interpretation (y/n) + confidence. You may also want to examine an interaction between interpretation and confidence as another predictor.

    Question: Is more than one rater going to interpret the same sample or observation? If so, that is going to make things more complex, in that you have to control for this in a multilevel logistic regression.

    Side note: the replies you receive are based on the amount and quality of information provided. In addition, there is a subjective element in choosing a statistical approach. I think you want a simple solution, but I am going to bet the best approach will be more complex than you imagined, meaning you may need to recruit a statistician to help you, especially if you are going to use the results to guide decisions or if you plan to disseminate results to others with your name attached.

  5. #5
    hlsmith

    Alright, I just read your Stack Exchange post. If that is all you want to do, then I would still recommend logistic regression.

    accurate = confidence.

    However, such a model won't tell you whether they were inaccurate due to a false positive or a false negative. Plus, as mentioned, you need to control for multiple raters of the same observation. So a mixed-model multinomial logistic regression could work, though your sample may be too small to power the model.
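To make the "accurate = confidence" model concrete, here is a minimal sketch that fits a one-predictor logistic regression by Newton-Raphson in plain Python. The data are made up, the function names are hypothetical, and the sketch deliberately ignores the clustering of ratings within raters and samples, which is the part a mixed model would address. In practice one would use a statistics package rather than hand-rolled code.

```python
import math

def predict(b0, b1, x):
    """Predicted probability of being accurate at predictor value x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

def fit_logistic(x, y, max_iter=25, tol=1e-10):
    """Fit logit P(y = 1) = b0 + b1 * x by Newton-Raphson; returns (b0, b1)."""
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = predict(b0, b1, xi)
            w = p * (1.0 - p)
            g0 += yi - p              # score for the intercept
            g1 += (yi - p) * xi       # score for the slope
            h00 += w                  # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0:
            break
        # Newton step: solve the 2x2 system (information matrix) * d = score.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1

# Hypothetical data: VAS confidence (0-100) and accuracy (1 = accurate).
confidence = [90, 75, 40, 85, 60, 100, 70, 30, 95, 55, 80, 50]
accurate = [1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 0, 1]

# Rescale so the slope is per 10 VAS points (this also helps the numerics).
x10 = [c / 10 for c in confidence]
b0, b1 = fit_logistic(x10, accurate)
odds_ratio_per_10 = math.exp(b1)
print(b0, b1, odds_ratio_per_10)
```

The slope is read as the change in log-odds of being accurate per 10-point increase in confidence. Note that this kind of fit does not converge if confidence separates accurate from inaccurate assessments perfectly.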

  6. #6
    GretaGarbo

    I am thinking more of the Visual Analogue Scale as a prior in a Bayesian model for the binomial model correct/incorrect where the prior is beta-distributed.

    But there is a problem with the scaling.

  7. #7
    hlsmith

    Unique perspective, Greta.

  8. #8
    GretaGarbo


    Hmmm, and maybe I am completely wrong.

    I would like to modify my statement by saying that the VAS (Visual Analogue Scale) score should be a sort of inverse of a hyperparameter in a beta distribution that expresses the variance of the prior. So if the scale value is high, the variance of the prior beta distribution is low.

    Example: suppose ten skillful doctors had evaluated a patient, 8 of 10 said he has got the disease, and they gave high scores on the VAS. Then we would believe them to a large extent; that is, our prior for their statement would have a small variance, so that the prior density would have a narrow range. (Hmm, a beta distribution needs two parameters, so the location also needs to be established. Let's say this is how correct they have been previously.)

    Let's also assume that ten complete amateurs made an assessment. Then we would still let the previous result give the location and their VAS result give the variance. But if we also know that they are amateurs, we would add something to increase their variance in comparison to the experts.

    We would estimate the probability of being correct from the observed correct/incorrect counts. That is the likelihood. Then we would adjust the likelihood by the prior from the VAS assessment, to obtain a posterior distribution.

    Does this make sense?
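One way to sketch this idea in code is below. It is only an illustration under stated assumptions: the prior mean comes from a hypothetical past accuracy, and the mapping from VAS score to prior strength (the sum of the two beta parameters) is an arbitrary linear rule invented for this example, which is exactly the scaling problem raised earlier in the thread. All names and numbers are made up.

```python
def beta_from_mean_and_strength(mean, strength):
    """Parameters (a, b) of a beta distribution with the given mean and a + b = strength."""
    if not 0 < mean < 1 or strength <= 0:
        raise ValueError("need 0 < mean < 1 and strength > 0")
    return mean * strength, (1 - mean) * strength

def beta_mean(a, b):
    return a / (a + b)

def beta_variance(a, b):
    return a * b / ((a + b) ** 2 * (a + b + 1))

def vas_to_strength(vas, low=2.0, high=20.0):
    """ASSUMPTION: an arbitrary linear map from VAS 0-100 to prior strength."""
    return low + (high - low) * vas / 100

def posterior(a, b, correct, incorrect):
    """Conjugate update of a beta prior with binomial correct/incorrect counts."""
    return a + correct, b + incorrect

past_accuracy = 0.8  # hypothetical location of the prior
prior_confident = beta_from_mean_and_strength(past_accuracy, vas_to_strength(90))
prior_hesitant = beta_from_mean_and_strength(past_accuracy, vas_to_strength(30))

# A higher VAS score gives a narrower prior (smaller variance).
print(beta_variance(*prior_confident), beta_variance(*prior_hesitant))

# Hypothetical observed data: 7 correct and 3 incorrect assessments.
post_confident = posterior(*prior_confident, correct=7, incorrect=3)
post_hesitant = posterior(*prior_hesitant, correct=7, incorrect=3)
print(beta_mean(*post_confident), beta_mean(*post_hesitant))
```

With these numbers both posterior means fall between the prior mean (0.8) and the observed proportion (0.7), and the posterior built from the hesitant prior sits closer to the observed data because that prior carries less weight.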
