
Thread: introduction to stats

  1. #1

    introduction to stats



    I am a new student of statistics and am looking for some help.
    My first question: I have made a bar graph, and the exercise asks what the main feature of the graph is. I'm not quite sure what it means by "main feature."
    My second question: what are some ways to describe the shape, center, and spread of the distribution, and any striking deviations?
    Thank you.

    - I have read the chapter three times but cannot find any help, and my teacher did not explain anything.

  2. #2
    Read the chapter of what book?

    Try looking up descriptive statistics. Basically, it's about describing your data.

    You should be looking at finding mean, median, mode, possibly standard deviation and skewness, depending on what level of study you are undertaking.

    You can find the basics of descriptive statistics on Wikipedia.
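As a rough illustration of the summary measures mentioned above, here is a minimal Python sketch using the standard-library `statistics` module. The scores are made-up values, not data from this thread, and the helper name `describe` is hypothetical.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "sample_sd": statistics.stdev(values),  # divides by n - 1
        "min": min(values),
        "max": max(values),
    }


# Made-up data for illustration only.
scores = [2, 3, 3, 4, 5, 5, 5, 6, 9]
summary = describe(scores)
for name, value in summary.items():
    print(f"{name}: {value}")
```

The mean, median, and mode describe the center; the standard deviation together with the minimum and maximum describes the spread. Comparing the mean with the median gives a first hint about skewness.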

  3. #3

    New to Stats too...

    Sorry to post under your question, but I am also new to stats and have a question about the standard deviation of a sample. My book has two different formulas for the sample standard deviation: a definitional formula and a computational one. What is the difference?
    I'm trying to find the standard deviation of a set of 30 numbers, which are distances. The numbers are huge and I'm lost! Any help would be great. Thanks!

  4. #4
    Originally posted by acarnle1:
    > Sorry to post under your question, but I am also new to stats and have a question about the standard deviation of a sample. My book has two different formulas for the sample standard deviation: a definitional formula and a computational one. What is the difference?
    > I'm trying to find the standard deviation of a set of 30 numbers, which are distances. The numbers are huge and I'm lost! Any help would be great. Thanks!
    One formula is for calculating the standard deviation of a population (divide by n); the other is for the standard deviation of a sample (divide by n-1).
    Most of the time in statistics you deal with samples rather than the entire population, so the (n-1) formula is used.

    If the 30 numbers you mention are a sample from a larger population, use the (n-1) formula. If the 30 numbers are the entire population you care about, use the "divide by n" formula.

    Good Luck.
    P
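To make the n versus n-1 distinction concrete, here is a small Python sketch that computes both versions by hand and checks them against the standard-library `statistics` module. The eight values are made up for illustration; they are not the 30 distances from the question.

```python
import math
import statistics

# Made-up values for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
mean = sum(data) / n
sum_sq_dev = sum((x - mean) ** 2 for x in data)

population_sd = math.sqrt(sum_sq_dev / n)      # divide by n
sample_sd = math.sqrt(sum_sq_dev / (n - 1))    # divide by n - 1

# The standard library provides both versions directly.
assert math.isclose(population_sd, statistics.pstdev(data))
assert math.isclose(sample_sd, statistics.stdev(data))

print(f"population SD: {population_sd:.4f}")
print(f"sample SD:     {sample_sd:.4f}")
```

The sample version is always slightly larger than the population version for the same data, and the gap shrinks as n grows.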

  5. #5
    > it asks what the main feature of the graph is

    A key reason for graphing data is to gain insight into its distribution. I suggest you look to see whether the data seems to cluster about some single value or multiple values, and how much the data seems to vary around such central values.

  6. #6

    Really trying to learn and frustrated!!!

    I am trying so hard to learn stats, but it does not come naturally. I am working on a problem.

    BETA CORP makes 80% of their parts in the US and the remainder at non-US facilities. It was found that 30% of the parts having been manufactured at non-US sites were good, and 20% of parts having been manufactured at domestic (US) sites were bad.


    a. What proportion of recently tested parts were made outside the US?

    b. Suppose a part was selected at random. What is the probability that it was a defective part manufactured at a foreign site?

    c. A good part was randomly chosen. What is the probability it was manufactured in a domestic plant?

    d. A part manufactured at a foreign site was randomly chosen. What is the probability it is a good part?

    I have tried a decision tree, and I am pretty sure b is a Bayes' theorem problem. I just don't know where to begin....

    Any guidance would be much appreciated.

  7. #7
    JohnM (TS Contributor)
    Don't get frustrated. Sometimes taking a deep breath and rereading what the question is asking for will help lead you to the answer. These questions are pretty basic and don't require fancy analytical tools such as decision trees or Bayes' theorem (those can lead to the correct answer, but they aren't necessary in this case).

    I won't just hand out the answers, but hopefully I can show you what the questions are really asking....

    (a) Since 80% of the parts are made in the US, then 20% are made outside the US.

    (b) Here, 20% are foreign parts, and we're given that 30% of them are good, so 70% of them are bad. Therefore
    P(foreign AND bad) = 0.20 x 0.70 = 0.14 = 14%

    (c) Here, we can use conditional probability

    P(domestic, given it was good) = P(domestic and good) / P(good)

    (d) Again, conditional probability

    P(good, given it was foreign) = P(foreign and good) / P(foreign)
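For anyone who wants to check their own working, here is a hedged Python sketch of the four calculations. It takes the figures from the problem statement in post #6 (80% of parts made in the US, 30% of non-US parts good, 20% of US parts bad); the variable names are just illustrative.

```python
# Figures taken from the problem statement.
p_domestic = 0.80
p_foreign = 1 - p_domestic                 # (a)
p_good_given_foreign = 0.30
p_bad_given_domestic = 0.20

# Complements of the given conditional probabilities.
p_bad_given_foreign = 1 - p_good_given_foreign
p_good_given_domestic = 1 - p_bad_given_domestic

# (b) joint probability: foreign AND bad
p_foreign_and_bad = p_foreign * p_bad_given_foreign

# Total probability of a good part, summed over both origins.
p_domestic_and_good = p_domestic * p_good_given_domestic
p_foreign_and_good = p_foreign * p_good_given_foreign
p_good = p_domestic_and_good + p_foreign_and_good

# (c) conditional probability: domestic GIVEN good
p_domestic_given_good = p_domestic_and_good / p_good

# (d) conditional probability: good GIVEN foreign
p_good_given_foreign_check = p_foreign_and_good / p_foreign

print(f"(a) P(foreign)         = {p_foreign:.2f}")
print(f"(b) P(foreign and bad) = {p_foreign_and_bad:.2f}")
print(f"(c) P(domestic | good) = {p_domestic_given_good:.4f}")
print(f"(d) P(good | foreign)  = {p_good_given_foreign_check:.2f}")
```

Part (d) simply recovers the 30% figure that the problem states directly, which is a useful sanity check on the joint probabilities.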

  8. #8
    Thank you so much, JohnM. Just what I was looking for. I need a push in the right direction, but I really want to figure this out on my own. Millacam

  9. #9
    OK, thank you so much, that helped. I have a population. One more thing, if possible: if I have a standard deviation of, say, 100 and a mean of 135, do I count 100 to the left and 100 to the right on the number line to get the two values above and below? And then do I plot the rest of the numbers as they are, or the values after I subtract them from the mean?
    Thanks again!

  10. #10
    Originally posted by acarnle1:
    > OK, thank you so much, that helped. I have a population. One more thing, if possible: if I have a standard deviation of, say, 100 and a mean of 135, do I count 100 to the left and 100 to the right on the number line to get the two values above and below? And then do I plot the rest of the numbers as they are, or the values after I subtract them from the mean?
    > Thanks again!
    Not sure I fully understand your question...but to get the upper and lower bounds of one st. deviation away from the mean you would add 100 to 135 to get the upper bound and subtract 100 from 135 to get the lower bound.

    All other data points I would plot "as is." In this way you can see what data points (of the 30) fall within one st. deviation of the mean.

    Make sense?
    Good luck.
    Paul
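The bounds described above can be sketched in a few lines of Python. The mean of 135 and standard deviation of 100 come from the question; the list of distances is hypothetical, since the actual 30 values were never posted.

```python
mean = 135
sd = 100

# One standard deviation below and above the mean.
lower = mean - sd
upper = mean + sd

# Hypothetical distances; the thread does not list the actual 30 values.
distances = [12, 40, 95, 135, 180, 236, 310]

# Plot or list the raw values "as is" and compare them with the bounds.
within_one_sd = [d for d in distances if lower <= d <= upper]
outside = [d for d in distances if d < lower or d > upper]

print(f"bounds: {lower} to {upper}")
print(f"within one SD: {within_one_sd}")
print(f"outside: {outside}")
```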

  11. #11
    Originally posted by edwar377:
    > One formula is for calculating st. devs for populations (divide by n); the other is for when you want the st. dev. of a sample (divide by n-1).
    > Most of the time in statistics you deal with samples, not the entire population, so the (n-1) formula is used.
    >
    > If the 30 numbers you mention are from a sample of numbers, use the (n-1) formula. If the 30 numbers are ALL the numbers you were given, use the "divide by n" formula.
    >
    > Good Luck.
    > P
    Yes, there is a difference between computing a population standard deviation and a sample standard deviation. However, the difference between a definitional and a computational formula is that the definitional formula is the mathematical 'definition' (the theory, I guess you could say), whereas the computational formula is the one you would actually use, because it's easier.

    Take for example z-scores of skewness:

    Z_skewness = (S - 0) / SE_skewness

    Obviously the 0 is redundant. This is the definitional formula. In the computational formula you would leave out the zero, since it plays no role in this context. Theoretically, however, it's there.

  12. #12
    Originally posted by Cohen's monkey:
    > ...whereas the computational formula is the one you would use, it's easier.

    I respectfully disagree with you.
    The context of the problem determines what formula to use. In addition, I am not sure what you mean by "easier". Both formulas require the same types of calculations. I would argue that the sample st. dev formula is more difficult for students to comprehend, since it involves bias and/or degrees of freedom.

  13. #13
    JohnM (TS Contributor)
    "Easier" means just that. The computational formula is easier to use when you need to compute the standard deviation by hand, and it's also easier to use when you need to write computer code to calculate it.

  14. #14
    I agree with you about the context, i.e., n for populations and n-1 for samples. I think the definitions that you and I are using are the source of the dispute.

    I was taught that the formulae are population formulae and sample formulae, which differ in whether you use n or n-1 and in notation, whereas the difference between computational and definitional formulae is as I stated above.

    Does someone else want to clarify?

  15. #15
    Dragan (Super Moderator)

    Originally posted by Cohen's monkey:
    > Does someone else want to clarify?
    Sure, I would be more than happy to clarify.

    The primary purpose of using computational formulae in lieu of definitional formulae is to reduce numerical error. It has nothing to do with dividing by a constant such as N or N -1 (as mentioned above).

    For example, consider computing the Sums of Squares associated with a random variable X.

    Using the definitional formula we have:

    SSX = Sum[(X - XBar)^2]
    where XBar must be estimated and thus introduces error for each observation of X.

    On the other hand, using a computational formula the numerical error is reduced by using only the actual data points (there are no estimates involved) as follows:

    SSX = Sum(X^2) - [(Sum X)^2 / N].

    Note: Simplifying formulae (e.g., variance, skew, kurtosis) by using standard scores, i.e., z-scores, does not reduce error, because it requires estimates of both the mean and the standard deviation.

    Mkay.
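The two sums-of-squares formulas above can be compared with a short Python sketch. The function names and the data are hypothetical; the second part imitates a hand calculation in which the mean is rounded to one decimal place before the deviations are taken.

```python
def ss_definitional(xs, mean=None):
    """Sum of squared deviations about the mean (definitional form)."""
    if mean is None:
        mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)


def ss_computational(xs):
    """Sum(X^2) - (Sum X)^2 / N (computational form)."""
    n = len(xs)
    return sum(x * x for x in xs) - sum(xs) ** 2 / n


# With an unrounded mean, the two forms agree.
data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values
print(ss_definitional(data), ss_computational(data))

# Hand-calculation scenario: the mean 7/3 is rounded to 2.3 first.
small = [1, 2, 4]  # made-up values
exact = ss_computational(small)
by_hand = ss_definitional(small, mean=2.3)
print(exact, by_hand)
```

In the second case the rounded mean feeds a small error into every deviation, so the definitional result drifts away from the value obtained directly from the raw data.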
