Thread: parameter estimation based on observed survival probabilities

  1. #1
    schleprock
    Guest

    Say 100 samples have been pulled from a Normal distribution with unknown mean and variance. All I know about the results are the following:

    23 out of 100 are greater than 200
    9 out of 100 are greater than 300
    2 out of 100 are greater than 400

    How would I find the best estimate of the mean and SD of the distribution based on this information?

    Thanks in advance!

  2. #2
    Dason

    Do you know anything about maximum likelihood estimation?
    I don't have emotions and sometimes that makes me very sad.

  3. #3
    schleprock2

    Yes, but I'm not clear how that would apply in this situation. Individual observations aren't available, so it doesn't seem like MLE applied to the PDF is helpful. The CDF has no closed form solution. I did think about using MLE and a binomial distribution (combined w/ an approximation to the Normal CDF) to solve for the mu and sigma, but it's unwieldy, and honestly I feel like I'm over-complicating things. Seems like a straightforward problem, but for some reason the answer isn't obvious to me. Thanks for your help!

  4. #4
    Dason

    If I tell you mu and sigma can you tell me the probability of an observation being less than 200? Greater than 200 but less than 300?

  5. #5
    schleprock2

    Originally posted by Dason:
        If I tell you mu and sigma can you tell me the probability of an observation being less than 200? Greater than 200 but less than 300?

    Of course. But if I tell you Pr(X<200), can you tell me mu and sigma? No, because there is not a unique solution.

    If I also tell you Pr(X>=200 AND X<300), then -- with some effort -- you could give me mu and sigma. But if I also give you Pr(X>=300 AND X<400), and those probabilities are based on observed results (not a Normal distribution w/ known parameters), then I think there probably is no mu and sigma which would describe that. I would be interested to know how you would solve for the most likely mu and sigma in that case.
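
    To make the "with some effort" step concrete, here is a minimal Python sketch that solves for mu and sigma from two exceedance probabilities and then checks what that fit implies at the third cutoff. It assumes the counts in the first post are cumulative (23% above 200, 9% above 300, 2% above 400). The function name `solve_two_points` is made up for illustration.

```python
from statistics import NormalDist

def solve_two_points(x1, s1, x2, s2):
    """Return (mu, sigma) of the Normal with P(X > x1) = s1 and P(X > x2) = s2."""
    std = NormalDist()              # standard Normal
    z1 = std.inv_cdf(1.0 - s1)      # z-score of the first cutoff
    z2 = std.inv_cdf(1.0 - s2)      # z-score of the second cutoff
    sigma = (x2 - x1) / (z2 - z1)   # the cutoffs are (z2 - z1) standard deviations apart
    mu = x1 - sigma * z1
    return mu, sigma

# Assumed reading: 23% of samples exceed 200 and 9% exceed 300.
mu, sigma = solve_two_points(200.0, 0.23, 300.0, 0.09)
implied = 1.0 - NormalDist(mu, sigma).cdf(400.0)
print(f"mu = {mu:.1f}, sigma = {sigma:.1f}")
print(f"implied P(X > 400) = {implied:.4f} (observed: 0.02)")
```

    Two cutoffs pin down mu and sigma exactly, so the third observed proportion will generally not be reproduced. That mismatch is why some estimation criterion is needed once there are three or more cutoffs.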

  6. #6
    Dason

    I think you're forgetting how maximum likelihood works though. You basically say "if mu = 50 and sigma = 30 what is the probability of observing the data" (the likelihood) and your goal is to find the values of mu and sigma that maximize that. So pretend that you know mu and sigma for a second - can you write out the joint distribution of the observed data? Then it becomes a task of finding which values maximize that.
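
    As a rough illustration of "the probability of observing the data" for given mu and sigma, the sketch below evaluates a multinomial log-likelihood of the binned counts. It assumes the counts in the first post are cumulative, so the disjoint bins hold 77, 14, 7 and 2 samples. The names `survival`, `bin_probs` and `log_likelihood` are illustrative, not from the thread.

```python
import math

CUTOFFS = (200.0, 300.0, 400.0)
# Assumed reading of the first post: the counts are cumulative, so the
# disjoint bins (<200, 200-300, 300-400, >400) hold 77, 14, 7 and 2 samples.
BIN_COUNTS = (77, 14, 7, 2)

def survival(x, mu, sigma):
    """P(X > x) for X ~ Normal(mu, sigma), via the complementary error function."""
    return 0.5 * math.erfc((x - mu) / (sigma * math.sqrt(2.0)))

def bin_probs(mu, sigma):
    """Probabilities of the four bins implied by mu and sigma."""
    s = [1.0] + [survival(c, mu, sigma) for c in CUTOFFS] + [0.0]
    return [hi - lo for hi, lo in zip(s, s[1:])]

def log_likelihood(mu, sigma):
    """Multinomial log-likelihood of the bin counts, up to an additive constant."""
    return sum(n * math.log(p) for n, p in zip(BIN_COUNTS, bin_probs(mu, sigma)))

# Compare a few candidate parameter pairs; higher is better.
for mu, sigma in [(50.0, 30.0), (100.0, 100.0), (80.0, 160.0)]:
    print(mu, sigma, round(log_likelihood(mu, sigma), 2))
```

    With these assumed counts, the illustrative values mu = 50, sigma = 30 score far lower than the other two candidates, because they make values above 300 almost impossible.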

  7. #7
    schleprock2

    It's funny you say that, because I thought the problem might be that I'm too stuck on EXACTLY how MLE works in normal circumstances! Normally, we'd be dealing with individual observations, not some result like "23 of 100 obs are >200". Given individual observations, the path is to use the probability density function to establish a likelihood function, and then maximize the likelihood with respect to the individual parameters.

    But what is the likelihood function here?? We're dealing with a cdf, not a pdf, and the Normal cdf is not closed form.

  8. #8
    Dason

    Well, it might have been normally distributed originally, but that's not what you see now. Now all you have is binned data, but given the parameters you can find the probabilities of the bins.

    Forget the original problem exists and pretend that you're trying to solve this problem:

    The probability of an observation being Red is \alpha, the probability of an observation being Blue is \beta and the probability of it being neither is 1-\alpha-\beta. If you observe 32 reds, 23 blues, and 17 neithers, then what are the MLEs for \alpha and \beta?

  9. #9
    schleprock2

    Originally posted by Dason:
        Well, it might have been normally distributed originally, but that's not what you see now. Now all you have is binned data, but given the parameters you can find the probabilities of the bins.

        Forget the original problem exists and pretend that you're trying to solve this problem:

        The probability of an observation being Red is \alpha, the probability of an observation being Blue is \beta and the probability of it being neither is 1-\alpha-\beta. If you observe 32 reds, 23 blues, and 17 neithers, then what are the MLEs for \alpha and \beta?

    I think you're hitting on one of the complications, which is that the information given speaks more directly to a binomial distribution than a Normal distribution. In your example, I don't think any MLE is needed; the best estimate for \alpha is 44.4% (32/72) and for \beta it's 31.9% (23/72).

    But that doesn't really get me anywhere, does it? I want to know something about the Normal distribution that underlies the percent of observations falling into the various "bins".

  10. #10
    Dason

    Originally posted by schleprock2:
        I don't think any MLE is needed

    That's not a good way to think about this. Take a step back and ask yourself how you would find the MLE in this case.

  11. #11
    schleprock2

    In the example you gave -- what are the MLEs of alpha (prob. of an observation being red) and beta (prob. of an observation being blue) -- you're talking about a binomial distribution: k successes in n trials. This page (https://onlinecourses.science.psu.edu/stat504/node/28) walks through the calculation better than I could in this quick reply, but the conclusion is that the MLE of alpha is k/n, or number of successes (32) divided by number of trials (72), or 44.4%.

    So that's solved. But it doesn't appear to get me any closer to using that information to understand the underlying Normal distribution.
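
    For reference, the calculation behind that result can be written out for the three-category (Red/Blue/neither) example. This is the standard multinomial derivation, sketched here rather than taken from the linked page:

```latex
\begin{align*}
\ell(\alpha,\beta) &= 32\ln\alpha + 23\ln\beta + 17\ln(1-\alpha-\beta) + \text{const} \\
\frac{\partial\ell}{\partial\alpha} &= \frac{32}{\alpha} - \frac{17}{1-\alpha-\beta} = 0, \qquad
\frac{\partial\ell}{\partial\beta} = \frac{23}{\beta} - \frac{17}{1-\alpha-\beta} = 0 \\
\Rightarrow\quad \frac{32}{\hat\alpha} &= \frac{23}{\hat\beta} = \frac{17}{1-\hat\alpha-\hat\beta} = 72
\quad\Rightarrow\quad \hat\alpha = \tfrac{32}{72} \approx 0.444, \quad \hat\beta = \tfrac{23}{72} \approx 0.319
\end{align*}
```

    The common value 72 follows because the three probabilities sum to 1, so the three counts (32 + 23 + 17) must sum to that constant.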

  12. #12
    Dason

    No, it's technically not a binomial distribution. It would be a multinomial distribution (with a binomial there are only two possibilities).

    Plus it's not solved. The whole point of it was to get you to think about what the likelihood function actually is. There is a direct connection between the problem I gave you and the problem you're trying to do.
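
    One way to act on that connection, sketched under stated assumptions: treat the four bin counts as multinomial with cell probabilities determined by mu and sigma, and maximize the log-likelihood numerically. The bin counts 77/14/7/2 assume the first post's counts are cumulative. The compass search is a simple stdlib-only stand-in for a proper optimizer (such as `scipy.optimize.minimize`), and all function names are illustrative.

```python
import math

CUTOFFS = (200.0, 300.0, 400.0)
BIN_COUNTS = (77, 14, 7, 2)   # assumed: counts in the first post are cumulative

def bin_probs(mu, sigma):
    """Bin probabilities for X ~ Normal(mu, sigma) with the cutoffs above."""
    surv = [0.5 * math.erfc((c - mu) / (sigma * math.sqrt(2.0))) for c in CUTOFFS]
    s = [1.0] + surv + [0.0]
    return [hi - lo for hi, lo in zip(s, s[1:])]

def log_likelihood(mu, sigma):
    """Multinomial log-likelihood (additive constant dropped)."""
    total = 0.0
    for n, p in zip(BIN_COUNTS, bin_probs(mu, sigma)):
        if p <= 0.0:
            return -math.inf   # these parameters make an observed bin impossible
        total += n * math.log(p)
    return total

def fit(mu=100.0, sigma=100.0, step=50.0, tol=1e-4):
    """Compass search: try +/- step in mu and sigma, halve the step when stuck."""
    best = log_likelihood(mu, sigma)
    while step > tol:
        improved = False
        for d_mu, d_sigma in ((step, 0.0), (-step, 0.0), (0.0, step), (0.0, -step)):
            cand_mu, cand_sigma = mu + d_mu, sigma + d_sigma
            if cand_sigma <= 0.0:
                continue       # sigma must stay positive
            value = log_likelihood(cand_mu, cand_sigma)
            if value > best:
                mu, sigma, best = cand_mu, cand_sigma, value
                improved = True
        if not improved:
            step /= 2.0
    return mu, sigma, best

mu_hat, sigma_hat, ll_hat = fit()
print(f"mu_hat = {mu_hat:.1f}, sigma_hat = {sigma_hat:.1f}, log-likelihood = {ll_hat:.3f}")
print("fitted bin probabilities:", [round(p, 3) for p in bin_probs(mu_hat, sigma_hat)])
```

    Comparing the fitted bin probabilities with the observed proportions (0.77, 0.14, 0.07, 0.02) gives a quick sanity check on the fit.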

  13. #13
    GretaGarbo

    Are the conditions here that:
    Say 100 samples have been pulled from a Normal distribution
    23 out of 100 are greater than 200
    9 out of 100 are greater than 300
    2 out of 100 are greater than 400?

    OR is it that:

    66 out of 100 are less than 200
    23 out of 100 are greater than 200 and less than 300
    9 out of 100 are greater than 300 and less than 400
    2 out of 100 are greater than 400?
    And that all are statistically independent?

  14. #14
    schleprock2

    Originally posted by Dason:
        No, it's technically not a binomial distribution. It would be a multinomial distribution (with a binomial there are only two possibilities).

        Plus it's not solved. The whole point of it was to get you to think about what the likelihood function actually is. There is a direct connection between the problem I gave you and the problem you're trying to do.

    Yes, it's technically multinomial. However, it's just as easily stated as a binomial ("blue" or "not blue").

    Thanks for the feedback, but I don't think this is really getting me anywhere. I was really looking for suggested solutions, not additional problems related to the original question.

  15. #15
    schleprock2


    Originally posted by GretaGarbo:
        Are the conditions here that:
        Say 100 samples have been pulled from a Normal distribution
        23 out of 100 are greater than 200
        9 out of 100 are greater than 300
        2 out of 100 are greater than 400?

        OR is it that:

        66 out of 100 are less than 200
        23 out of 100 are greater than 200 and less than 300
        9 out of 100 are greater than 300 and less than 400
        2 out of 100 are greater than 400?
        And that all are statistically independent?

    Hi Greta-

    I think the problem could be equivalently stated that way, but I don't think those outcomes are statistically independent, since the number of samples is fixed. More outcomes less than 200 means that (statistically) fewer will be greater than 400.
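
    Whether the two statements describe the same data depends on how the original counts are read. A small sketch, assuming the counts in the first post are cumulative (the 23 samples above 200 include the 9 above 300, and so on); `exceedance_to_bins` is an illustrative helper, not something from the thread:

```python
def exceedance_to_bins(n_total, exceed_counts):
    """Convert counts of samples above increasing cutoffs into disjoint bin counts."""
    padded = [n_total] + list(exceed_counts) + [0]
    return [hi - lo for hi, lo in zip(padded, padded[1:])]

# Bins: below 200, 200 to 300, 300 to 400, above 400.
bins = exceedance_to_bins(100, [23, 9, 2])
print(bins)  # [77, 14, 7, 2]
```

    Under that reading the disjoint bins hold 77, 14, 7 and 2 samples, which differs from the 66/23/9/2 split in the second listing. Either way the bin counts are constrained to sum to 100, so they are jointly multinomial rather than independent.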
