1. ## Introduction to stats

I am a new student of statistics and am looking for some help.
My first question: I have made a bar graph, and the assignment asks what the main feature of the graph is. I'm not quite sure what it means by "the main feature."
My second question: what are some ways to describe the shape, center, and spread of the distribution, and any striking deviations?
Thank you.

- I have read the chapter three times but cannot find any help, and my teacher did not explain any of this.

2. Read the chapter of what book?

You should be looking at finding mean, median, mode, possibly standard deviation and skewness, depending on what level of study you are undertaking.

You can find the basics of descriptive stats on Wikipedia.
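As a rough illustration of the measures mentioned above, here is a minimal Python sketch using the standard-library `statistics` module. The `scores` list is made-up data (it is not from your book or chapter), and `summarize` is just a hypothetical helper name.

```python
import statistics

# Hypothetical data: ten made-up quiz scores, used only for illustration.
scores = [4, 7, 7, 8, 9, 10, 10, 10, 12, 13]

def summarize(values):
    """Return basic measures of center and spread for a list of numbers."""
    return {
        "mean": statistics.mean(values),      # arithmetic average
        "median": statistics.median(values),  # middle value of the sorted data
        "mode": statistics.mode(values),      # most frequent value
        "stdev": statistics.stdev(values),    # sample standard deviation (n - 1)
    }

summary = summarize(scores)
for name, value in summary.items():
    print(name, value)
```

Comparing the mean with the median gives a first hint about shape: a mean noticeably below the median suggests a tail to the left, and a mean above it suggests a tail to the right.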

3. ## New to Stats too...

Sorry to post under your question...but I am also new to stats and have a question about standard deviation with sample size. My book has two different formulas to find the standard deviation for a sample...a definitional formula and then a computational one...what is the difference?
I'm trying to find the standard deviation of a set of 30 numbers, which are distances. The numbers are huge and I'm lost! Any help would be great. Thanks!

4. Originally Posted by acarnle1
> Sorry to post under your question...but I am also new to stats and have a question about standard deviation with sample size. My book has two different formulas to find the standard deviation for a sample...a definitional formula and then a computational one...what is the difference?
> I'm trying to find the standard deviation of a set of 30 numbers, which are distances. The numbers are huge and I'm lost! Any help would be great. Thanks!

One formula is for calculating st. devs for populations (divide by n); the other is for when you want to find the st. dev of a sample (divide by n-1).
Most of the time in statistics you deal with samples, not the entire population, so the (n-1) formula is used.

If the 30 numbers you mention are a sample drawn from a larger set of numbers, use the (n-1) formula. If the 30 numbers are the entire population you care about, use the "divide by n" formula.
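To make the n versus n-1 choice concrete, here is a small Python sketch. The `distances` values are invented for illustration (they are not your 30 numbers), and `std_dev` is a hypothetical helper written out by hand so the divisor is visible; Python's own `statistics.pstdev` and `statistics.stdev` do the same two jobs.

```python
import math
import statistics

# Hypothetical distances, made up for illustration only.
distances = [1200, 1350, 1100, 1500, 1250]

def std_dev(values, sample=True):
    """Standard deviation: divide by n - 1 for a sample, by n for a population."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    divisor = n - 1 if sample else n
    return math.sqrt(sum_of_squares / divisor)

sample_sd = std_dev(distances, sample=True)        # divide by n - 1
population_sd = std_dev(distances, sample=False)   # divide by n

# The standard library gives the same answers.
print(sample_sd, statistics.stdev(distances))
print(population_sd, statistics.pstdev(distances))
```

The sample version always comes out a little larger, and the gap shrinks as n grows.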

Good Luck.
P

5. > it asks what the main feature of the graph is

A key reason for graphing data is to gain insight into its distribution. I suggest you look to see whether the data seems to cluster about some single value or multiple values, and how much the data seems to vary around such central values.

6. ## Really trying to learn and frustrated!!!

I am trying so hard to learn stats, however this does not come naturally. I am working on a problem.

BETA CORP makes 80% of their parts in the US and the remainder at non-US facilities. It was found that 30% of the parts having been manufactured at non-US sites were good, and 20% of parts having been manufactured at domestic (US) sites were bad.

a. What proportion of recently tested parts were made outside the US?

b. Suppose a part was selected at random. What is the probability that it was a defective part manufactured at a foreign site?

c. A good part was randomly chosen. What is the probability it was manufactured in a domestic plant?

d. A part manufactured at a foreign site was randomly chosen. What is the probability it is a good part?

I have tried a decision tree, and I am pretty sure b is a Bayes' Theorem problem. I just don't know where to begin....

Any guidance would be much appreciated.

7. Don't get frustrated - sometimes taking a deep breath and reading what the question is asking for will help lead you to the answer. These questions are pretty basic and don't require fancy analytical tools such as decision trees or Bayes' Theorem (those can lead to the correct answer, but they're not necessary in this case).

I won't just hand out the answers, but hopefully I can show you what the questions are really asking....

(a) Since 80% of the parts are made in the US, 20% are made outside the US.

(b) Here, 20% are foreign parts, and we're given that 30% of them are good, which means 70% of them are bad,
so, P(foreign AND bad) = 0.20 x 0.70 = 0.14 = 14%

(c) Here, we can use conditional probability

P(domestic, given it was good) = P(domestic and good) / P(good)

(d) Again, conditional probability

P(good, given it was foreign) = P(foreign and good) / P(foreign)
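If you want to check your own work on (b) through (d), the setup can be sketched in a few lines of Python. The variable names are my own, and exact fractions are used so that rounding doesn't get in the way. The only inputs are the three percentages from the problem statement.

```python
from fractions import Fraction

# Given in the problem statement.
p_domestic = Fraction(80, 100)            # 80% of parts are made in the US
p_good_given_foreign = Fraction(30, 100)  # 30% of non-US parts are good
p_bad_given_domestic = Fraction(20, 100)  # 20% of US parts are bad

# Complements.
p_foreign = 1 - p_domestic                       # (a)
p_bad_given_foreign = 1 - p_good_given_foreign
p_good_given_domestic = 1 - p_bad_given_domestic

# Joint probabilities: P(A and B) = P(A) * P(B given A).
p_foreign_and_bad = p_foreign * p_bad_given_foreign        # (b)
p_foreign_and_good = p_foreign * p_good_given_foreign
p_domestic_and_good = p_domestic * p_good_given_domestic

# Total probability of a good part, then the conditional probabilities.
p_good = p_foreign_and_good + p_domestic_and_good
p_domestic_given_good = p_domestic_and_good / p_good       # (c)
p_good_given_foreign_check = p_foreign_and_good / p_foreign  # (d)

print(p_foreign, p_foreign_and_bad, p_domestic_given_good, p_good_given_foreign_check)
```

Notice that (d) just hands back a number the problem already gave you, which is a useful sanity check on the setup.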

8. Thank you so much, JohnM. Just what I was looking for. I need help in the right direction, but I really want to get this figured out on my own. Millacam

9. Ok, thank you so much....that helped, I have a population...one more thing if possible. If I have a standard deviation of say 100 and a mean of 135, do I count 100 to the left and 100 to the right on the number line to get the two values above and below? And then do I plot the rest of the numbers as they are, or the value after I subtract them from the mean???
thanks again!!!

10. Originally Posted by acarnle1
> Ok, thank you so much....that helped, I have a population...one more thing if possible. If I have a standard deviation of say 100 and a mean of 135, do I count 100 to the left and 100 to the right on the number line to get the two values above and below? And then do I plot the rest of the numbers as they are, or the value after I subtract them from the mean???
> thanks again!!!

Not sure I fully understand your question...but to get the upper and lower bounds of one st. deviation away from the mean you would add 100 to 135 to get the upper bound and subtract 100 from 135 to get the lower bound.

All other data points I would plot "as is." In this way you can see what data points (of the 30) fall within one st. deviation of the mean.
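Here is a short Python sketch of that idea, assuming the mean of 135 and standard deviation of 100 from your question. The `sample_points` list is made up (I don't have your 30 values), and the function names are hypothetical.

```python
def one_sd_bounds(mean, sd):
    """Lower and upper bounds of the interval one standard deviation from the mean."""
    return mean - sd, mean + sd

def within_one_sd(values, mean, sd):
    """Return the values, unchanged ("as is"), that fall inside the one-sd interval."""
    lower, upper = one_sd_bounds(mean, sd)
    return [x for x in values if lower <= x <= upper]

lower, upper = one_sd_bounds(135, 100)

# Hypothetical data points, for illustration only.
sample_points = [20, 90, 135, 240, 300]
inside = within_one_sd(sample_points, 135, 100)

print(lower, upper)
print(inside)
```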

Make sense?
Good luck.
Paul

11. Originally Posted by edwar377
> One formula is for calculating st. devs for populations (divide by n); the other is for when you want to find the st. dev of a sample (divide by n-1).
> Most of the time in statistics you deal with samples, not the entire population, so the (n-1) formula is used.
>
> If the 30 numbers you mention are a sample drawn from a larger set of numbers, use the (n-1) formula. If the 30 numbers are the entire population you care about, use the "divide by n" formula.
>
> Good Luck.
> P

Yes, there is a difference between computing a population and a sample standard deviation. However, the difference between a definitional and a computational formula is that the definitional formula is the mathematical 'definition' (the theory, I guess you could say), whereas the computational formula is the one you would actually use; it's easier.

Take for example z-scores of skewness:

Z_skewness = (S - 0) / SE_skewness

Obviously the 0 is redundant. This is the definitional formula. In the computational formula, you would leave out the zero, since it plays no role in the calculation. However, theoretically, it's there.

12. Originally Posted by Cohen's monkey
> ...whereas the computational formula is the one you would actually use; it's easier.

I respectfully disagree with you.
The context of the problem determines what formula to use. In addition, I am not sure what you mean by "easier". Both formulas require the same types of calculations. I would argue that the sample st. dev formula is more difficult for students to comprehend, since it involves bias and/or degrees of freedom.

13. "Easier" means just that. The computational formula is easier to use when you need to compute the standard deviation by hand, and it's also easier to use when you need to write computer code to calculate it.

14. I agree with you about the context, i.e., n for populations, n-1 for samples. I think the definitions that you and I are using are the source of the dispute.

I was taught that the formulae are population formulae and sample formulae, and differ in whether or not you use n or n-1, and by notation; whereas the difference between computational and definitional formulae is as I stated above.

Does someone else want to clarify?

15. Originally Posted by Cohen's monkey
> Does someone else want to clarify?

Sure, I would be more than happy to clarify.

The primary purpose of using computational formulae in lieu of definitional formulae is to reduce numerical error. It has nothing to do with dividing by a constant such as N or N - 1 (as mentioned above).

For example, consider computing the Sums of Squares associated with a random variable X.

Using the definitional formula we have:

SSX = Sum(X - XBar)^2
where XBar must be computed first, so any rounding error in XBar carries into the deviation for every observation of X.

On the other hand, using a computational formula, the numerical error is reduced by using only the actual data points (no intermediate estimate such as XBar is involved) as follows:

SSX = SumX^2 - [(SumX)^2 / N].
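The two formulas above can be sketched side by side in Python. The function names and the `data` list are my own, chosen only to illustrate; on a small set of whole numbers like this, both give the same sum of squares.

```python
import math

def ss_definitional(xs):
    """SSX = Sum(X - XBar)^2: compute the mean first, then sum squared deviations."""
    n = len(xs)
    x_bar = sum(xs) / n
    return sum((x - x_bar) ** 2 for x in xs)

def ss_computational(xs):
    """SSX = SumX^2 - (SumX)^2 / N: uses only running totals of X and X^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x_squared = sum(x * x for x in xs)
    return sum_x_squared - (sum_x ** 2) / n

# Hypothetical data, for illustration only.
data = [2, 4, 4, 4, 5, 5, 7, 9]

print(ss_definitional(data))
print(ss_computational(data))
print(math.isclose(ss_definitional(data), ss_computational(data)))
```

One caveat when moving from hand calculation to floating-point code: the computational form subtracts two large totals, so when the values are huge relative to their spread it can lose precision, and it is worth cross-checking the result against the definitional form.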

Note: Simplifying formulae (e.g. variance, skew, kurtosis) by using standard scores, i.e. Z-scores, does not reduce error, because it requires estimates of both the mean and the standard deviation.

Mkay.