"Average households" and sampling


The top graph shows the sizes of different households in a village, based on representative data for the UK.

The distribution of household sizes is clearly skewed. There are a large number of one- or two-person households but relatively few five- or six-person households.

Suppose we take a sample of households from the village and record the "average" household size in that sample. If we do this again and again, what will the results look like?

Click on the button opposite to take one sample.

You can now take a succession of samples. A distribution will gradually build up on the bottom graph: not a distribution of household sizes but a distribution of average household sizes.

What shape do you expect this distribution to take?

You should find that the distribution of sample averages is roughly symmetrical, not skewed like the population.

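Except for really small sample sizes, the shape of the distribution of sample averages is, surprisingly, largely unaffected by the shape of the distribution that you are sampling from: it is approximately normal (bell-shaped), a result known as the central limit theorem. Its centre and spread do still depend on the population and on the sample size.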
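
The experiment can also be tried away from the graphs. The following is a minimal sketch in Python, not the code behind this page: the village of 1,000 households, the household counts, the sample size of 30 and the 2,000 repeats are all assumptions chosen for illustration, and the counts are not the UK data shown in the top graph. It compares the skewness of the household sizes with the skewness of the sample averages.

```python
import random
import statistics

# Hypothetical village of 1,000 households. These counts are made up to give
# a right-skewed shape; they are not the UK figures behind the graph.
sizes = [1, 2, 3, 4, 5, 6]
counts = [300, 350, 150, 130, 50, 20]
village = [size for size, count in zip(sizes, counts) for _ in range(count)]


def skewness(values):
    """Population skewness: third central moment divided by sd cubed."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


rng = random.Random(0)  # fixed seed so repeated runs give the same samples
sample_size = 30        # households per sample (assumed)
num_samples = 2000      # number of repeated samples (assumed)

# Each "click": draw one sample of households and record its average size.
sample_means = [
    statistics.fmean(rng.sample(village, sample_size))
    for _ in range(num_samples)
]

print("skewness of household sizes:", round(skewness(village), 2))
print("skewness of sample averages:", round(skewness(sample_means), 2))
```

A skewness near zero indicates a roughly symmetrical distribution. Setting sample_size to a very small value such as 2 should leave noticeably more skew in the sample averages, which is the exception for really small samples mentioned above.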