
Thread: help with Normality test?

  1. #1

    Hi all!

    This is probably a simple question, but my statistics skills have gotten a bit rusty and I cannot find an appropriate solution on the internet.

    This is the problem: let X_1, ..., X_n be a series of values drawn from a normal distribution. All I know about them is their mean u = (sum X_i)/n and their standard deviation s (n is unknown and can be assumed to be large). I want to compute the likelihood (i.e. a p-value) that X_1, ..., X_n come from the normal distribution N(U,S^2), with U and S known.

    I need something like Student's or Welch's t-test; however, those tests (1) require n to be known and (2) test the hypothesis that two populations have equal means (whereas I want to test for both equal mean and equal standard deviation). Problem (1) could probably be solved with the assumption that n is large, so the t-distribution tends to N(0,1).

    Can someone help me with this? Thank you very much!

  2. #2
    Dason

    Without the sample size I think you're out of luck for what you want to do. At least if I'm understanding you correctly.
    I don't have emotions and sometimes that makes me very sad.

  3. #3

    I think there should be a solution (it's just that I cannot find it). Consider the simpler version of this problem where I only take into account the mean u of my data and ignore its standard deviation. This can be solved with a Student's t-test. Since I can assume large n, I can simply approximate the t-distribution of the test with N(0,1).

    The question is: is there something like Student's t-test that takes into account both the sample mean and the sample standard deviation?

  4. #4
    gene2420

    Look into normality tests, e.g. the Shapiro-Wilk test.

  5. The Following User Says Thank You to gene2420 For This Useful Post:

    nicola (06-17-2015)

  6. #5
    Dason

    Quote Originally Posted by nicola View Post
    Since I can assume large n, I simply can approximate the t-distribution of the test with N(0,1)...
    Sure you get rid of the issue of needing to care about the degrees of freedom but... how are you getting your t-statistic?

    T = \frac{\bar{X} - \mu_o}{s/\sqrt{n}}

    You need n to get the t-statistic.
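
To make the dependence on n concrete, here is a minimal C++ sketch of the statistic above. The function name and parameter names are made up for illustration, and the numbers in the note below are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// One-sample t-statistic T = (xbar - mu0) / (s / sqrt(n)).
// xbar: sample mean, s: sample standard deviation, n: sample size,
// mu0: hypothesized mean. All names are illustrative.
double t_statistic(double xbar, double s, double n, double mu0) {
    assert(s > 0.0 && n > 1.0);  // the statistic is undefined otherwise
    return (xbar - mu0) / (s / std::sqrt(n));
}
```

With the same summary values xbar = 21, s = 3 and mu0 = 21.5, the statistic is about -0.83 for n = 25 and about -2.64 for n = 250, so the mean and standard deviation alone do not determine the test result.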

  7. The Following User Says Thank You to Dason For This Useful Post:

    nicola (06-17-2015)

  8. #6

    Yes, you are totally right. OK, I'll see if I can somehow derive n from the data I have. Thank you!

  9. #7
    Dason

    What data do you have?

  10. #8

    Hi! In the end, I managed to obtain the sample size n, which makes everything much easier.

    To summarize, this is an example instance of the problem I wish to solve: I know that mean(X_1,...,X_n)=21, stdev(X_1,...,X_n)=3, and n=250. How can I compute the likelihood that X_1, ..., X_n have been generated from the distribution N(21.5,4)?

    I could perform a Student's or Welch's t-test, but those tests only give me the likelihood that the means are equal, right? Is there a way to compute the likelihood that both the mean and the standard deviation are the same?

    Thanks!

  11. #9
    hlsmith

    What is the purpose of this endeavor? Does it have to be compared to (21.5, 4), or can you just test whether your data are normally distributed?

    In the earlier posts, before you had n, you could have plugged in a range of n-values and stated that your conclusion holds for n between some lower and upper value.

    Now you can also plot your data, if you actually have them, overlay a normal distribution with mean 21.5 and SD 4, and visually compare the distributions.
    Stop cowardice, ban guns!

  12. #10

    It must be compared to N(21.5,4). I already know the data are normally distributed, so that is not a concern. I could use the test T = \frac{\bar{X} - \mu_o}{s/\sqrt{n}}, but this would only test the sample mean \bar{X} against the reference, not the sample standard deviation (I also want to use the standard deviation to make the estimate more accurate).

    Unfortunately, I need an automatic method to perform this task (I cannot use graphical methods) because I am implementing this as a C++ routine to be called hundreds of times per second. This problem comes from the analysis of DNA sequencing data.

  13. #11
    hlsmith

    If you gave N(21.5,4) the same sample size as the comparison sample, and you confirmed the normality assumptions for the t-test, then you can put the other pieces together and do a t-test. Also, if you gave N(21.5,4) the same sample size, you could run a K-S test to compare the distributions.

  14. #12

    I solved the problem. I am posting the solution here in the hope that it will be useful to others.

    Again, the problem formulation is:

    compute the likelihood of observing sample mean \bar\mu and sample standard deviation \bar\sigma in n samples drawn from the distribution N(\mu,\sigma^2)

    The quantity we are interested in is \log P(\bar \mu, \bar\sigma | \mu, \sigma) (I use log-likelihood since using log simplifies notation). Since sample mean \bar \mu and sample variance \bar\sigma^2 of a normally distributed population are two independent random variables, we have that

    \log P(\bar \mu, \bar\sigma | \mu, \sigma) = \log P(\bar \mu | \mu, \sigma) + \log P(\bar\sigma | \mu, \sigma)

    The random variable M=\frac{\bar\mu-\mu}{\sigma/\sqrt{n}} is distributed as N(0,1), because \sigma here is the known population standard deviation. (It would be t-distributed with n-1 degrees of freedom only if \sigma were replaced by the sample standard deviation \bar\sigma, and for large n that t-distribution tends to N(0,1) anyway.) Taking the standard normal density of M as the likelihood of \bar\mu, which drops the change-of-variables factor \sqrt{n}/\sigma because it does not depend on the data, gives

    \log P(\bar \mu | \mu, \sigma) \approx \log\left(\frac{1}{\sqrt{2\pi}}\exp(-M^2/2)\right) = - \frac{1}{2}\log{2\pi} - \frac{(\bar\mu-\mu)^2n}{2\sigma^2}

    The random variable S=\frac{(n-1)\bar\sigma^2}{\sigma^2} is chi-squared distributed with n-1 degrees of freedom, so it has mean n-1 and variance 2(n-1). Again, we assume n to be large, so that n-1\approx n. Then the distribution of the random variable Q=\frac{S-n}{\sqrt{2n}} tends to N(0,1) and we have:

    \log P(\bar\sigma | \mu, \sigma) \approx \log\left( \frac{1}{\sqrt{2\pi}}\exp(-Q^2/2) \right) =  - \frac{1}{2}\log{2\pi} - \frac{\left((n-1)\bar\sigma^2-n\sigma^2\right)^2}{4n\sigma^4}

    Note that n\approx n-1 (n is large), so the above quantity simplifies to
    - \frac{1}{2}\log{2\pi} - \frac{n^2\left(\bar\sigma^2-\sigma^2\right)^2}{4n\sigma^4} = - \frac{1}{2}\log{2\pi} - \frac{n\left(\bar\sigma^2-\sigma^2\right)^2}{4\sigma^4}
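
Since the original question asked for p-values, the two standardized statistics M and Q can also be turned into approximate two-sided p-values. The C++ sketch below assumes both statistics are treated as N(0,1), which is only the large-n approximation used in this post. The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Two-sided p-value for a statistic treated as N(0,1):
// P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
double two_sided_normal_p(double z) {
    return std::erfc(std::fabs(z) / std::sqrt(2.0));
}

// M = (xbar - mu) / (sigma / sqrt(n)), with sigma the known population SD.
double mean_z(double xbar, double mu, double sigma, double n) {
    assert(sigma > 0.0 && n > 0.0);
    return (xbar - mu) / (sigma / std::sqrt(n));
}

// Q = (S - n) / sqrt(2n), with S = (n - 1) * s^2 / sigma^2,
// where s is the sample SD and sigma the known population SD.
double variance_z(double s, double sigma, double n) {
    assert(sigma > 0.0 && n > 0.0);
    const double S = (n - 1.0) * s * s / (sigma * sigma);
    return (S - n) / std::sqrt(2.0 * n);
}
```

Calling two_sided_normal_p(mean_z(...)) and two_sided_normal_p(variance_z(...)) gives one approximate p-value for the mean and one for the variance. These are two separate tests, not a joint test.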

    Putting it all together, we finally obtain

    \log P(\bar \mu, \bar\sigma | \mu, \sigma) \approx - \log{2\pi} - \frac{n}{2\sigma^2}\left( (\bar\mu-\mu)^2 + \frac{(\bar\sigma^2-\sigma^2)^2}{2\sigma^2} \right)
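
For anyone who wants to drop this into code, the final formula could be written as a small C++ function along the following lines. This is only a sketch of the large-n approximation derived above; the function and parameter names are invented for illustration.

```cpp
#include <cassert>
#include <cmath>

// Approximate log-likelihood of observing sample mean xbar and sample
// standard deviation s in n draws from N(mu, sigma^2), for large n:
//   -log(2*pi) - n / (2 sigma^2)
//       * ((xbar - mu)^2 + (s^2 - sigma^2)^2 / (2 sigma^2))
double approx_log_likelihood(double xbar, double s, double n,
                             double mu, double sigma) {
    assert(sigma > 0.0 && n > 0.0);
    const double kLog2Pi = 1.8378770664093453;  // log(2*pi)
    const double var = sigma * sigma;           // population variance
    const double dmean = xbar - mu;             // deviation of the mean
    const double dvar = s * s - var;            // deviation of the variance
    return -kLog2Pi
           - n / (2.0 * var) * (dmean * dmean + dvar * dvar / (2.0 * var));
}
```

For the example from earlier in the thread (mean 21, standard deviation 3, n = 250, reference N(21.5,4)), and assuming the 4 is the variance so that sigma = 2, approx_log_likelihood(21.0, 3.0, 250.0, 21.5, 2.0) comes out at about -107.3. The function only does a handful of multiplications, so calling it hundreds of times per second should not be an issue.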
