Thursday, March 1, 2012

What an average doesn't tell you

Recently, I had a hunch that the average age of males was lower than the average age of females. My logic was that women tend to live longer than men and that slightly more male babies are born than female babies. I wondered whether my reasoning was correct, so I downloaded the demographic statistics for NSW from the Australian Bureau of Statistics and found the following:
  • There were 4.3% more males than females under 31 years of age
  • There were equal numbers of males and females aged exactly 31
  • There were 6.3% more females than males over 31 years of age
  • 41.2% of the population were less than 31 years old while 57.4% were more than 31 years old
When I crunched the figures I found that the average male is 1 year 8 months younger than the average female. So my hunch was correct. But the question is: what does this tell us about any given person?

The answer is: nothing!

Averages are about populations, not about individuals. So if you were to see a statistic such as "Males are on average 1 year 8 months younger than females", you aren't being given an answer so much as a provocation to ask "Why would this be the case? What does it mean?"
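To make this concrete, here is a small Python sketch. The head counts are invented for illustration and are not the ABS figures; the helper names `mean_age` and `share_first_older` are my own. It computes the mean age of each sex and then asks a question about individuals: if you pick one male and one female at random, how often is the male the older of the two?

```python
# Hypothetical head counts by age (age in years -> number of people).
# Invented for illustration; these are NOT the ABS figures for NSW.
males = {10: 520, 31: 70, 55: 410}
females = {10: 500, 31: 70, 55: 440}


def mean_age(counts):
    """Population-weighted mean age from an {age: count} table."""
    people = sum(counts.values())
    years = sum(age * n for age, n in counts.items())
    return years / people


def share_first_older(first, second):
    """Fraction of all (first, second) pairs in which the first person is older."""
    older = sum(
        n1 * n2
        for age1, n1 in first.items()
        for age2, n2 in second.items()
        if age1 > age2
    )
    return older / (sum(first.values()) * sum(second.values()))


male_mean = mean_age(males)
female_mean = mean_age(females)
male_older = share_first_older(males, females)
female_older = share_first_older(females, males)

print(f"mean age: males {male_mean:.2f}, females {female_mean:.2f}")
print(f"random pair: male older {male_older:.1%}, female older {female_older:.1%}")
```

With these made-up counts the female mean comes out a little over a year higher, yet in a randomly drawn pair the male is older about 27% of the time and the female about 29% of the time. The remaining pairs are ties, which are common here only because the toy table has just three ages. The gap between the averages is real, but it tells you almost nothing about which of two particular people is older.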

When an average surprises you, it can help you to surface your assumptions about whatever was being measured.

For instance, if I were to ask you what the average height of a human being is, the chances are that you would say something like 5'2" or thereabouts. So it would surprise you if I were to tell you that the true figure is under 5'. I will explain this claim in a moment, but first, would it surprise you to learn that short people have, on average, lower literacy levels than taller people? Yes?

Well, interestingly, the reason for both of these statistics is the same: children are human beings too!

Most children are under 5' tall and most children have lower literacy levels than adults, and these two facts result in the given statistics. So if you were surprised at the claims I made, it would be because you assumed that we weren't counting children in the averages. Even if we weren't aware of it, we implicitly assumed that we were talking about adult humans.
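The arithmetic behind this is just a weighted average. Here is a minimal Python sketch; the group sizes and mean heights are invented purely to show the effect and are not measurements, and `pooled_mean` is a name I made up for the helper.

```python
# Hypothetical population: (label, number of people, mean height in inches).
# Invented figures, chosen only to show the effect; not measurements.
groups = [
    ("adults", 700, 66.0),
    ("children", 300, 45.0),
]


def pooled_mean(groups):
    """Mean over everyone, weighting each group's mean by its head count."""
    total_people = sum(n for _, n, _ in groups)
    total_height = sum(n * mean for _, n, mean in groups)
    return total_height / total_people


adults_only = pooled_mean(groups[:1])  # the population we silently assumed
everyone = pooled_mean(groups)         # the population actually being averaged

print(f"adults only: {adults_only:.1f} in, everyone: {everyone:.1f} in")
```

Nothing about any individual changes between the two figures. The only thing that changes is who is being counted, and that alone is enough to move the average by several inches in this toy example.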

This can be true of almost any average. The question we always need to ask is: what is the true population that is being averaged? What assumptions are we making about this population? Are there groups we are excluding which we should be including? Does it make sense to use a single average when you are dealing with a 'mixed' population (e.g. adults vs children, males vs females)? Does comparing separately calculated averages lead us to ask new questions about the reasons for any differences? Are the reasons obvious or do we need to dig deeper?

The other question is whether we are comparing the right averages.

In Australia, a serious issue is that average Aboriginal life expectancy is significantly lower than that of other Australians. However, is this the right comparison? If we were to segment the two populations by socio-economic level, would we find that the issue isn't race but poverty, i.e. that people in lower socio-economic groups have much lower life expectancies than those in higher socio-economic groups? In other words, the underlying reason may not be racial but economic. Whatever the answer may be, it determines what sort of strategy you use to approach the issue: do we directly target health services to Aboriginal people, or do we adopt a broader strategy of reducing poverty across the board?

I don't know what the answer is, but I raise this issue as just one example where the populations you choose for comparing averages can make a significant difference to what strategy you use to investigate the causes of any differences and the consequential strategies you adopt to deal with such causes.
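To show what segmenting can reveal, here is a Python sketch with two abstract groups, A and B, each split into a "low" and a "high" stratum. Every number is invented, the group and stratum labels are placeholders, and the sketch says nothing about what any real data set would show. It only demonstrates that two groups can have different overall averages even when they are identical within each stratum, simply because the groups are spread differently across the strata.

```python
# (group, stratum) -> (number of people, mean outcome in that cell).
# All values are invented for illustration only.
data = {
    ("A", "low"): (800, 70.0),
    ("A", "high"): (200, 80.0),
    ("B", "low"): (200, 70.0),
    ("B", "high"): (800, 80.0),
}


def overall_mean(group):
    """Mean outcome for one group, pooling its strata by head count."""
    rows = [(n, mean) for (g, _), (n, mean) in data.items() if g == group]
    return sum(n * mean for n, mean in rows) / sum(n for n, _ in rows)


def stratum_gap(stratum):
    """Difference between group B and group A within a single stratum."""
    return data[("B", stratum)][1] - data[("A", stratum)][1]


print("overall:", overall_mean("A"), "vs", overall_mean("B"))
print("gap within low:", stratum_gap("low"), "gap within high:", stratum_gap("high"))
```

In this made-up table the overall averages differ by six units, while the gap inside each stratum is zero. Real data could just as easily show a gap that persists within every stratum, which is exactly why the segmented comparison is worth making before choosing a strategy.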

Averages are among the simplest statistics to calculate, and this is one of the reasons that they are so frequently used. However, they may conceal differences and assumptions that need to be surfaced if you truly want to understand what is actually going on.
