Rethinking the population pyramid

If you’ve ever browsed the U.S. Census population statistics pages, you’ve no doubt come across the famous population pyramid that they so frequently use to display the distribution of the U.S. population by age and gender.


I was reading up about population pyramids last weekend and ran across an interesting quote that caught my eye:

the use of a population pyramid is considered the best way to graphically illustrate the age and sex distribution of a given population.

Now, I’m no expert at displaying population statistics, but I was shocked at this claim. Could it really be true that population pyramids are considered the best method for displaying population distributions?

That line of thought ultimately led to the article below, where I raise three critiques of the population pyramid and present simpler and — in my view — more effective visualization methods.

For this article, I used the 2010 U.S. Census population statistics, which you can find here in a machine-readable format.

You can also find all of the code for these charts in my GitHub repository.

Problems with the population pyramid

1) Violates the standard expectation of having the causal variable on the x-axis

One of the most noticeable mistakes that the population pyramid makes is flipping the chart on its side to form a “pyramid” shape. I can only view this as an aesthetic flourish, since it violates one of the standard expectations of plotting: The causal variable should always be on the x-axis.

When it comes to plotting, the x-axis is typically reserved for the independent variable, i.e., a fixed setting that has some sort of effect on another variable. In contrast, the y-axis is reserved for the dependent variable, i.e., the variable that shows some effect from varying the independent variable.

The implication is that values on the x-axis cause some measurable effect on the values in the y-axis. This is why we always put the passage of time on the x-axis: it doesn’t make sense to think of some other factor causing changes in the passage of time. (Until we discover time travel, anyway.)

Since it doesn’t make sense to think about a population’s gender distributions having an effect on age — and it makes far more sense to think about age having an effect on a population’s gender distributions — let’s flip the axis of the pyramid so it’s more in line with standard visualization practices.


Now we don’t have to reorient ourselves every time we look at the population pyramid, since the data is displayed more naturally.

Ideally, the x-axis labels would be in between the “women” and “men” bars, but that was a bit tedious to pull off in my plotting software. Moving on…

2) Doesn’t allow direct comparisons between the two categories

The second flaw with population pyramids is that they make it difficult to compare the age distributions of men and women.

For example, can you tell me at a glance if there are more men or women in the 25-29 age group? You’d have to look up the number of men and women in the 25-29 age group separately and make the comparison that way, when there’s really no reason that the chart shouldn’t be performing those comparisons for you.

Let’s rework the population pyramid to group the people by age, with separate bars for men and women.
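Here is a minimal sketch of how such a grouped bar layout could be built in Python. It is not the code from my repository. The function names (`grouped_bar_positions`, `plot_grouped_bars`) and the output file name are made up for illustration, and the plotting part assumes matplotlib is installed.

```python
def grouped_bar_positions(n_groups, n_series, bar_width=0.4):
    """Return one list of x positions per series.

    Group g is centered on x = g, and the bars of the different
    series sit side by side around that center.
    """
    positions = []
    for s in range(n_series):
        # Shift each series left or right of the group center.
        offset = (s - (n_series - 1) / 2) * bar_width
        positions.append([g + offset for g in range(n_groups)])
    return positions


def plot_grouped_bars(age_groups, women, men, path="grouped_bars.png"):
    """Draw women and men as adjacent bars for every age group.

    Hypothetical helper: matplotlib is imported lazily so the layout
    function above stays usable without it.
    """
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    bar_width = 0.4
    xs_women, xs_men = grouped_bar_positions(len(age_groups), 2, bar_width)

    fig, ax = plt.subplots(figsize=(10, 4))
    ax.bar(xs_women, women, width=bar_width, label="Women")
    ax.bar(xs_men, men, width=bar_width, label="Men")
    ax.set_xticks(list(range(len(age_groups))))
    ax.set_xticklabels(age_groups)
    ax.set_xlabel("Age group")
    ax.set_ylabel("Population")
    ax.legend()
    fig.savefig(path)
```

Calling `plot_grouped_bars(labels, women_counts, men_counts)` with three equally long lists should write a chart with age on the x-axis and one pair of bars per age group.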


Now we have the exact same benefits of the population pyramid, with the additional benefit of being able to immediately discern whether there are more women or men in each age group. Arguably, we can now perform the same comparison between age groups as well — for example, are there more 50-54-year-old women than 30-34-year-old men? — but those comparisons become difficult the further the age groups are from each other.

What’s immediately apparent from this version of the population pyramid is:

  1. There are more young men than young women in the U.S.,
  2. we reach gender parity around age 30,
  3. then men start dying out younger and leaving droves of widows behind starting at age 45.

There are some really interesting implications in that data for the evolution of human sex ratios, but I’ll leave that for another time.

3) Relative trends between the categories are masked by displaying absolute values

There’s clearly an interesting trend going on in the age 45+ groups where there are more women than men. But what’s going on with the M:F ratio, especially in the 90+ categories? It’s incredibly difficult to tell because these trends are masked when we display absolute values.

If we’re more interested in the relative trends between the two categories, we can drop the absolute values and instead show the percentage breakdown of the groups as I’ve done below.
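The conversion from absolute counts to a percentage breakdown is simple arithmetic, roughly as sketched below. The function name `percent_breakdown` is hypothetical, and the example counts are made-up placeholders, not the 2010 Census figures.

```python
def percent_breakdown(men, women):
    """Return (pct_men, pct_women), one percentage per age group."""
    pct_men, pct_women = [], []
    for m, w in zip(men, women):
        total = m + w
        if total == 0:
            # Empty age group: report 0% for both rather than divide by zero.
            pct_men.append(0.0)
            pct_women.append(0.0)
            continue
        pct_men.append(100.0 * m / total)
        pct_women.append(100.0 * w / total)
    return pct_men, pct_women


# Made-up placeholder counts (arbitrary units), NOT Census data.
age_groups = ["25-29", "85-89", "90+"]
men = [10, 4, 1]
women = [10, 6, 3]

pct_men, pct_women = percent_breakdown(men, women)
for label, pm, pw in zip(age_groups, pct_men, pct_women):
    print(f"{label}: {pm:.1f}% men, {pw:.1f}% women")
```

Plotting these two percentage series as stacked bars per age group gives a chart in which every bar has the same height, so only the relative split between men and women is visible.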


Now those trends I discussed above become abundantly clear, and we see that roughly 75% of U.S. adults aged 90+ are women. Sorry, straight men: your wife is probably going to outlive you.

Of course, whether we would use this third chart solely depends on whether we care more about relative differences between the gender categories or the age distribution of the population. As with all charts, what data you should display depends on what story you want to tell with the data. In either case, I hope I’ve convinced you that the population pyramid — as it’s currently used — is not quite ideal for telling either story.

Lessons learned

As with all of my long-winded articles critiquing a data visualization, I’ll end with a brief summary of the main lessons we’ve learned.

  1. The causal variable (e.g., time or a parameter you control in an experiment) should always go on the x-axis.
  2. Group related data when within-group comparisons can be useful.
  3. What chart you use and what data you display depends on the story you want to tell. Don’t try to force a story out of the wrong chart.

Are there some other ways the population pyramid could be improved? Leave your suggestions in the comments.

If you liked what you saw in this post and want to learn more, check out my Python data visualization video course that I made in collaboration with O’Reilly. In just one hour, I will cover these topics and much more, which will provide you with a strong starting point for your career in data visualization.

Dr. Randy Olson is a Senior Data Scientist at the University of Pennsylvania, where he develops state-of-the-art machine learning algorithms with a focus on biomedical applications.

Posted in data visualization, tutorial
  • KeithWM

    I agree it is more conventional to have the causal/independent variable on the abscissa, but I would definitely advise against the plot as you depict in 1). The question which group goes on top has a large influence on the perception of the graph.

    Also, the second option rather destroys the _shape_ of either side’s age group. I believe the idea of the pyramid is to show this shape, and for comparing numbers only to a lesser extent.

    A fourth option would be to plot a stacked bar chart. Plotting for each age group that part of the population where for every male there is a female and vice versa, stacked with the surplus of the gender of which there are more.

    • Lavi Shpigelman

      I agree with KeithWM’s comments. The main info that the pyramid shows is the absolute size by age. The bottom-to-top axis serves to heighten the notion that these age groups are getting older and moving up the pyramid.

      The thing that is missing from these charts which is often useful to people that look at them is a visualization of predicted population size in 10 / 20 / 30 years. To make such predictions you would need to use life tables (current ones available from the WHO) which tell you the (current) expected fraction of deaths per age group. As for filling in new bottom rows (which may not necessarily go into a visualization), you would need age-specific fertility rates to guess the number of newborns.

    • +1 to the stacked bar chart idea, especially if the use case is to compare age/gender distributions between countries. Although with a stacked bar chart, it would again be difficult to compare the categories (men vs. women) in the age groups.

  • Eelco Doornbos

    Good points, although I think there are exceptions to the rule that the causal variable should always go on the x-axis. For example, when plotting temperature measurements or model values as a function of altitude, it is more natural to put altitude on the vertical axis.

    • I agree with that – it’s very difficult to give a hard rule in dataviz. As you say, there are some (IMO rare) situations where it makes more sense to plot the independent variable on the y-axis owing to its nature.

  • >This is why we always put time on the x-axis:

    Not so. In fact, the page linked to illustrate the “standard expectations” gives an example in which the dependent variable has the dimension of “time”.

    > it doesn’t make sense to think of some other factor causing changes in time.

    You may be muddling up the question of time as the dimension of a measurement and the actual passage of time.

    • Hmmm, I see where that didn’t come through correctly in my writing. I’ll address that soon. Thanks!

  • While I generally prefer your last graph for age/population, I might use a tornado diagram for focusing only on the incremental differences over time, rather than the population, to a business audience that might expect such a diagram. Although a bar chart would still be preferred, using other chart types to highlight differences and break up the bar chart monotony is not a bad thing.

    • >Although a bar chart would still be preferred, using other chart types to highlight differences and break up the bar chart monotony is not a bad thing.

      This is something I regularly struggle with. Bar and line charts are often the most effective tools for communicating the data, but they’re often considered plain or boring. I often err on the side of the more effective (yet plain) visualization technique in the hope that the data and story communicated by the visualization is the real interesting part. No need to distract from the data with pretty makeup. 🙂

      • Great analysis and discussion thread, lots of really interesting points. Strongly agree with your point about keeping the charts as simple as possible, to most effectively communicate the data. I think the key point is that gender breakdown is useful for some analyses, but a combined “horizontal” stacked bar is probably the best way to visualize the distribution by age. Personally, I would prefer using two separate graphs to show the best possible interpretation of the data for each perspective, rather than one graph that is potentially misleading for both.

  • The only time I’ve ever seen the traditional pyramid used is to compare what the pyramid of country x looks like compared to the pyramid of country y, or to compare the pyramid of country x now with the pyramid of country x in 1900, and I’m not sure you could do those comparisons as easily with the second re-worked figure.

    • In that use case, I think it would make more sense to use a stacked bar chart, similar to my second chart but with the men and women bars stacked on top of each other. Then we can still look at the age distribution of the population without potentially being fooled by imbalances in the gender distribution.

  • Nice – thank you for sharing this version! This definitely makes for a less-muddled version of my second chart.

    • The problem with the stacked version (and to a lesser extent the grouped bar version) is that because the length of the visual elements is dominant, it’s harder to get a clear comparison of total population sizes at an age, i.e., for some of the length we need to double count and for some of it not. For example, age group 0-5 reads as around 20m where actually it’s double (and a bit) that. I might consider two lines; that way crossing points are highlighted and it’s clearer that the count for each is separate. This also gets rid of the visual noise problem that column charts (particularly the grouped column variety) can suffer from.

      Overall I think it’s great to break down these conventional chart types and look at alternatives when a particular aspect of the data is to be highlighted (e.g., relative M-F proportions), but I think it’s important not to overlook the value inherent in a convention, i.e., when you see a population pyramid you immediately understand its content without recourse to reading the title, axis labels, etc. Further, if you’ve seen a few of these things you immediately look for patterns in the right places (where’s the notch for the second world war? how high up the plot is the baby boom?) and can compare broad shapes with a mental catalogue of other population pyramids. The more generic column chart form doesn’t have this affordance.

      I think the criticism that really sticks is the difficulty in comparing the left and right halves of the plot. This can be mitigated to some extent by clear vertical sections, but it’s still not ideal. Personally, I’m happy with the trade-offs of the conventional approach most of the time.

  • Henk Doorlag

    Hi Randy,

    the population pyramid is not often used to compare sex distribution within age groups, but rather to show at a glance the current and future makeup of an entire population, often in comparison with many other populations.

    Is there a bulge, and how far away from retirement is it?

    see also


    the pyramid helps me do this better than a sideways arrow would, but that might just be familiarity.

    • Thank you for your insight, Henk! That sounds like a fine use case to me, but in that case, wouldn’t a stacked bar chart (as discussed with KeithWM in the comments here) work better for showing the shape (distribution) of the population by age? It seems we could be easily misled to think a population is stagnant or declining, for example, if there is a large imbalance between the genders in some age groups.

  • T.T
  • Adam Perry

    I strongly disagree with this. Unless I’m missing something, I think that looking at gender distribution must be the least useful way you could use a population pyramid. Population pyramids provide many interesting insights into the current and future state of a country. Gender distribution is not one of them.

    • Christopher Steffen

      Rotating the graph 90º doesn’t change the graph’s effectiveness. And gender distribution is an important demographic to measure, which is why population graphs depict each separately. Sure, there are plenty of other uses for population graphs, but none of those are hampered by turning it 90º.

  • TheGrue

    I definitely prefer age on the x axis, much easier for me to understand, because pretty much every other analysis is presented the same way.
    Maybe the pyramids are a hoax? 🙂

  • Anonymous

    You have misunderstood the use of time. Time is not the x-axis because it is not passing. Without the passage of time the expectation that time is causal is merely the fallacy of vividness.

    Age is bucketed into <5, 6-10 and so on. These are not continuous variables but discrete. Taking your expectation that time ought to be the x-axis and the x-axis being synonymous with the causal axis suggests that the age group is a determinant of the y-axis. This is far more problematic than it first appears.

    The variations you propose make sense if time is both continuous and causal. They are undermined by the fact that the population pyramid uses time in several senses: first, to bucket blocks of data with a basing year; second as a block of ages. The distinction is subtle. The first use case would be better represented as an XY graph with the origin at the basing year of the Pyramid; the second use case would be better served as a variation on stacked bar charts. The basing year is a hugely important concept in a population pyramid.

    Switching the time and population axes is a problematic piece of representation as it promotes the idea of a "uniform causality". Which is demonstrably untrue – see, for example, Hume. It further promotes the idea that the data can be used to infer a continuous function; which may or may not be true.

    The use of time in the pyramid is obviously not causal and the fact that it is not causal is demonstrated by time not being on the x-axis. This is not intuitive for physicists, in particular, but it is clear representation. Time does determine the age but it is population that determines future population, not time.

    The question of "how can population be causal?" is tricky, but it is one population biologists can respond to with concepts such as minimal viable populations and the notion of extinction. No amount of time will "cause" an extinct population to rise but a population below a minimum viable limit will, in the future, fall to zero. Which is, in fact, shown on the population pyramid at the very top. This can be intuited by examining some population pyramids of baby booms with different basing years. The examination of time as a causal variable is between pyramids not within pyramids. Which is a lot harder with the stacked bar charts.

    The comparison of male and female is useful as sudden changes in male population in a bucket, for example, can indicate wars; sudden changes in female population within a series of buckets can illustrate the Chinese one Child per Family policy. None of the alternatives are as rich in interpretive hints as the population pyramid.

  • Gary Stroble

    IMO the first example is much easier to visually process.

  • Anon

    Given that this is census data taken from a certain year (2010), the x-axis also represents birth year. It would be interesting to label it as such and then indicate whether events such as the Vietnam War, Korean War, and World War II had a significant influence on the smaller male populations, since these events would disproportionately affect males in their 20s.

  • C. T.

    Oof, as a demographer, this makes my head hurt.

    As far as graph 1 goes, I didn’t find it particularly more intuitive than the Census pop pyramid. For one thing, women are almost always presented on the right side of pop pyramids, so putting them on top was a little confusing at first. Second, when I wanted to make comparisons, my eyes still had to travel back and forth to the left side of the graph twice (to compare men/women size) and then to the bottom to look at the age cohorts.

    The second graph does have the advantage of direct comparison, however, it’s more difficult for me to draw conclusions about the age of the population. Perhaps this is because of my training and the amount of time I spent looking at pop pyramids. In the Census’s example, I can immediately see the baby boom and then the echo of the baby boom. This shape isn’t as apparent to me in your second visualization, although I will concede that my personal training is probably the reason for this.

    As far as sex ratios… these numbers are typically presented at birth and for a whole population as one number. When they are presented for all age groups, rather than have a bar chart, it’s more common to see a line graph. Finally, sex ratios are always presented in relation of men/women (i.e. a number above 100 (or 1) means more men than women). Again, I’d like to see the women on the bottom side of the chart there.

    I do agree that demographers might revisit pop pyramids as the primary visualization, but I still think there is some value in the original. As an example, here are two overlaid pop pyramids in an early visualization from a publication of mine.

    Here, it’s easy to see several things at once. First, you can clearly see the differences in population size for age groups across time. Second, the contrasting shapes of the pyramids immediately tell me something about the composition of the population. The anvil shape of the later pyramid shows that the population is top-heavy compared to the first one. You also see the decreasing size of future cohorts, something I think is missing when you flip the x-axis.

    Interesting topic. I’m enjoying the comments from others.

    • C. T.

      Oh, the other thing that I’d add, pop pyramids show cohorts rather than time. That may be one of the reasons for the flipped axis.

  • Emil Kirkegaard

    “Now those trends I discussed above become abundantly clear, and we see that roughly 75% of U.S. adults aged 90+ are women. Sorry, straight men: your wife is probably going to outlive you.”

    When older people (re-)marry, the gender difference in age is larger. You will want to use actual death data within married couples to see who dies off first. Remember also that for new weds, the man is about 3 years older too. These facts increase the chance that the male will die off first.

  • sparkyb

    Your improvements make a lot of sense. One area where I see this could still be improved upon is how the chart changes over time. For instance, how much of the downward slope after the 50s is due to death and how much is because those people were born at a time where the birthrate was lower? Similarly, what is responsible for the dip in the 30s? You could have a third axis (maybe not a 3D plot, but a slider or something) to see data from different years, but that might still make those questions hard to answer. Instead of having the two independent axes being year of survey and age group, it could be birth year and age group so we could see just the effect of mortality on a specific generation.

  • johnlocke

    >Sorry, straight men: your wife is probably going to outlive you.

    Why only straight men? Gay men married to women will have their wives outlive them too.

  • ABeagleKnots

    In the side-by-side bar chart, the arbitrary choice to put the women bar to the left of the men bar in each pair hides the difference by creating a smooth-looking decline. Reversing the order would make the difference stand out, creating a more vivid sawtooth effect.

  • J.D.

    This is an interesting discussion I just stumbled upon. With Community Accounts ( we have used animated population pyramids to show shifting age cohorts. For the average person, Keith’s chart will not be as intuitively seen by users of demographic data, as the concept of male/female harmony in population figures and the slow decrease of population in each cohort in a sustainable population. To us, each of these charts would solve a different problem.

  • Kevin W

    This is an ideal situation to use Tukey’s hanging rootogram, except with so many bars I would prefer to use lines instead of bars to reduce visual clutter. The rootogram gives you the global changes in population size over time and the deviations for each subgroup. Critically, putting the deviations at the bottom of the graph means that the deviations are on the same position of the axis over time, facilitating the interpretation of trends in deviations over time.