Thread: Struggling with finding a distribution...

  1. #1
    Hi All,

    When I use a histogram to view my data, I get the following... (see age.png)

    I'm trying to figure out the distribution...

    From visual inspection, I assumed it would be lognormal. The data is right-skewed (most of the mass sits on the left) and there are no other "peaks".

    However, in R I try the following...

    > u <- sum(log(age+.1)) / length(age)
    > # rlnorm() expects the standard deviation of the logs, so take the square root of the variance
    > s <- sqrt( sum( (log(age+.1) - u )^2 ) / length(age) )
    > my_lnorm <- rlnorm(length(age), u, s)
    > qqplot( my_lnorm, age )
    (I add .1 because some of the ages are 0, and log(0) returns -Inf.)

    See qqplot.png for the result.

    So according to the qq-plot, the data clearly does not match...

    Is there a better way to determine distributions?

  2. #2
    It looks lognormal to me. To test that, take the logarithm, then do a test of normality for the transformed data.
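
    A rough sketch of that check in R. The real `age` vector isn't posted, so the data below is a simulated stand-in (an assumption, purely for illustration); the 0.1 shift is the one from post #1. Shapiro-Wilk is just one possible normality test, picked here because it ships with base R.

```r
# Hypothetical stand-in for the poster's `age` vector (not the real data).
set.seed(1)
age <- rlnorm(200, meanlog = 3, sdlog = 0.5)

# Same shift as in post #1, so that zero ages would not give log(0) = -Inf.
log_age <- log(age + 0.1)

# Visual check: points should fall close to the line if log_age is normal.
qqnorm(log_age)
qqline(log_age)

# Formal check: Shapiro-Wilk test of normality (accepts 3 to 5000 observations).
sw <- shapiro.test(log_age)
print(sw$p.value)  # a small p-value is evidence against lognormality
```

    With a large sample, even minor departures from normality can give a small p-value, so it is worth reading the test together with the Q-Q plot.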

  3. #3
    Quote Originally Posted by squareandrare View Post
    It looks lognormal to me. To test that, take the logarithm, then do a test of normality for the transformed data.
    Thanks for the response.

    I have to admit, I don't understand the rationale behind transforming into normal...

    but even in doing so, the qq-plot doesn't appear to fit the data...

    I try the following in R:
    > u <- sum(log(age+.1)) / length(age)
    > # standardize with the standard deviation, not the variance
    > s <- sqrt( sum( (log(age+.1) - u )^2 ) / length(age) )
    > age_trans <- ( log(age+.1) - u ) / s
    > qqnorm(age_trans)
    > qqline(age_trans)

    And I get the results in "age_transformed.png".

    It still doesn't seem to fit the distribution all that well... What should I interpret from this?

  4. #4
    Quote Originally Posted by nami1234 View Post
    What should I interpret from this?
    That it isn't lognormal.

    First, I would try the exponential distribution. If that doesn't look good, maybe try the Weibull. You could also try a Box-Cox transformation. There should be some documentation online about how to calculate the maximum likelihood estimates for the parameters of the exponential and Weibull distributions and for the Box-Cox transformation.

    The reality is that your data might not fit any well-known distributions. The distribution is what it is.
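
    For what it's worth, here is a minimal sketch of those fits in R using `fitdistr()` and `boxcox()` from the MASS package. The data is again a simulated stand-in for `age` (an assumption, not the poster's data), and comparing the fits by AIC is my own choice of yardstick rather than anything prescribed above.

```r
library(MASS)  # provides fitdistr() and boxcox()

# Hypothetical stand-in for the poster's `age` vector (not the real data).
set.seed(1)
age <- rexp(200, rate = 1 / 30)

# Shift as in post #1: the Weibull fit and Box-Cox need strictly positive values.
x <- age + 0.1

# Maximum likelihood fits of three candidate distributions.
fit_exp   <- fitdistr(x, "exponential")
fit_wei   <- fitdistr(x, "weibull")
fit_lnorm <- fitdistr(x, "lognormal")

# Lower AIC suggests a better trade-off between fit and parameter count.
aic <- c(exponential = AIC(fit_exp),
         weibull     = AIC(fit_wei),
         lognormal   = AIC(fit_lnorm))
print(aic)

# Box-Cox: profile log-likelihood over lambda for an intercept-only model.
bc <- boxcox(x ~ 1, lambda = seq(-2, 2, by = 0.05), plotit = FALSE)
lambda_hat <- bc$x[which.max(bc$y)]
print(lambda_hat)  # near 0 points to a log transform, near 1 to no transform
```

    A Q-Q plot of the data against the quantiles of the best-scoring fit is still worth a look, since a low AIC only says that a candidate is better than the others, not that it fits well.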

  5. #5
    Quote Originally Posted by nami1234 View Post

    ...It still doesn't seem to fit the distribution all that well...

    You might want to consider fitting a generalized lambda distribution (GLD) to your data. See, for example (there are a number of links),

    http://www.jstatsoft.org/v21/i09/paper

    http://www.algorithmics.com/EN/media...3-3_lambda.pdf
