This issue comes up every year, so we may as well deal with it. In our lectures on
calculating
the Standard Deviation, we report that the denominator is n – 1. That is, after
adding up the squared deviations of observations from the mean of those observations,
you must divide by the number of observations minus 1 and then take the square root. Many
students enrolled in first-year math and/or stats classes like to point out
this denominator as an error because it conflicts with information they receive
in those courses. It seems that the equation for Standard Deviation that they
often encounter in those courses reports the denominator as N, instead of n –
1. In actuality, both equations are correct and the contradiction can be easily
resolved by pointing out that the Standard Deviation has different equations,
depending on whether it is the Standard Deviation of a POPULATION of observations
that is being computed versus a SAMPLE of observations drawn from a larger
POPULATION. Imagine that we had all of the high school GPAs of every student
beginning their studies at the U of M. We could calculate the Standard Deviation
from that Population of scores by using N (the # of GPA scores we have) as the
denominator in the equation. However, if we didn’t have the entire population
of scores, we could obtain a smaller number of scores drawn from that
population (say, 100 scores out of the much larger number of first-year students).
In that case, we must compute the Standard Deviation using n – 1 as the
denominator. The reason is that our goal is to estimate how much scores vary in
the population, based on the information we have from a much smaller sample.
Because we are using a sample, rather than the entire population of first-year student
GPAs, dividing by n would tend to underestimate the population’s variability: the
scores in a sample sit closer to their own sample mean than to the true population mean.
Using n – 1 in the denominator corrects for this, so that, on average, the Standard
Deviation we compute comes closer to the Standard Deviation of the Population of scores,
even though the denominator for computing Standard Deviation for a population
of scores is N, rather than n – 1. We report the equation for computing the
Standard Deviation of a Sample of scores, instead of a Population of scores,
because psychological studies most typically rely on observations obtained from
a SAMPLE, and it is really quite rare for such studies to involve scores obtained
from an entire POPULATION. It is simply too difficult to get observations from
every single member of a POPULATION and it is usually much more trouble than it’s
worth. We hope that clarifies things.
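
For anyone who would rather check than trust us, here is a minimal Python sketch of the GPA example. Everything in it is an assumption made for illustration: the GPAs are made-up numbers (5,000 values drawn from a bell curve), the sample size of 10 and the 5,000 repeated samples are arbitrary, and the helper name sd is ours. It computes the Standard Deviation from the squared deviations around the mean, once with N for the whole population and then with both n and n – 1 for many small samples.

```python
import math
import random
import statistics


def sd(scores, denominator):
    """Square root of (sum of squared deviations from the mean / denominator)."""
    mean = sum(scores) / len(scores)
    sum_of_squares = sum((x - mean) ** 2 for x in scores)
    return math.sqrt(sum_of_squares / denominator)


# Hypothetical POPULATION: 5,000 made-up high school GPAs (not real data).
rng = random.Random(0)
population = [rng.gauss(3.0, 0.5) for _ in range(5000)]
N = len(population)
population_sd = sd(population, N)  # population formula: divide by N

# Draw many small SAMPLES and compute the SD both ways for each one.
n = 10
trials = 5000
sd_with_n = []
sd_with_n_minus_1 = []
for _ in range(trials):
    sample = rng.sample(population, n)
    sd_with_n.append(sd(sample, n))
    sd_with_n_minus_1.append(sd(sample, n - 1))

avg_sd_n = statistics.mean(sd_with_n)
avg_sd_n_minus_1 = statistics.mean(sd_with_n_minus_1)

print(f"Population SD (denominator N):       {population_sd:.3f}")
print(f"Average sample SD, denominator n:     {avg_sd_n:.3f}")
print(f"Average sample SD, denominator n - 1: {avg_sd_n_minus_1:.3f}")
```

Under these assumptions, the average of the n – 1 version should land closer to the population value than the average of the n version, which is the point of the sample formula. Python's built-in statistics module makes the same distinction: statistics.pstdev divides by N (population) and statistics.stdev divides by n – 1 (sample).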
yes DR