U.S. Racial Diversity by County

The U.S. is typically viewed as a melting pot of races and cultures, but recent maps showing the ethnic distribution of the U.S. seem to hint that the U.S. isn’t as well-mixed as we all thought. In this visualization, I mapped out the racial/ethnic diversity of the U.S. to give us a better sense of the hotspots of diversity.


To calculate racial/ethnic diversity, I computed the entropy on the “% ethnicity” data for each county using the 6 ethnic categories the U.S. Census tracks: White (non-Latino), African American, Native American, Asian American, Latino, and Other. A county will come out with high entropy when all 6 ethnic categories are as even as possible (i.e., each ~16.7%), whereas it will come out with low entropy if the county is only inhabited by people of one ethnic category. I’ve included the raw census data if you want to tinker with it yourself.
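For readers who want to try the calculation, here is a minimal sketch of the entropy score described above. The function name `diversity_entropy` and the choice of base-10 logarithms are my assumptions for illustration. The base only rescales the score, and base 10 happens to reproduce the 0.024 figure quoted for Hand County in the comments.

```python
import math

def diversity_entropy(shares, base=10):
    """Shannon entropy of population shares given as fractions (not percents)."""
    # A category with a zero share contributes nothing: 0 * log(0) is taken as 0.
    return sum(-p * math.log(p, base) for p in shares if p > 0)

# Six perfectly even categories give the highest possible score.
even = [1 / 6] * 6
# A county with a single category gives the lowest possible score.
single = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

print(round(diversity_entropy(even), 3))    # 0.778
print(round(diversity_entropy(single), 3))  # 0.0
```

With base-10 logs the score therefore runs from 0 (one category only) to about 0.778 (all six categories at ~16.7%).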

One of the most notable features is that the Midwest and Northeast are fairly homogeneously white. Vermont, New Hampshire, and Maine stand as the pinnacle of racial homogeneity, each with only one or two counties with even a blip of diversity. The only exceptions to this trend are the major cities throughout the U.S., which seem to attract people of all ethnicities regardless of the state the city is in.

As a Michigander, I’m the most surprised to see how diverse the Upper Peninsula is. I thought only crazy white people lived up that far in Michigan.

Here are the least diverse counties:

  1. Tucker County, West Virginia (100% White, non-Latino)
  2. Robertson County, Kentucky (100% White, non-Latino)
  3. Hooker County, Nebraska (100% White, non-Latino)
  4. Hand County, South Dakota (99% White, non-Latino and 1% Latino)
  5. Owsley County, Kentucky (98% White, non-Latino and 2% Latino)

And the most diverse counties:

  1. Aleutians West Census Area, Alaska (31.4% White (non-Latino), 5.7% African American, 15.1% Native American, 28.3% Asian American, 13.1% Latino, and 6.4% Other)
  2. Aleutians East Borough, Alaska (13.5% White (non-Latino), 6.7% African American, 27.7% Native American, 35.4% Asian American, 12.3% Latino, and 4.4% Other)
  3. Queens County, New York (27.6% White (non-Latino), 17.7% African American, 0.3% Native American, 22.8% Asian American, 27.5% Latino, and 4% Other)
  4. Alameda County, California (34.1% White (non-Latino), 12.2% African American, 0.3% Native American, 25.9% Asian American, 22.5% Latino, and 5.1% Other)
  5. Solano County, California (40.8% White (non-Latino), 14.2% African American, 0.5% Native American, 14.3% Asian American, 24% Latino, and 6.2% Other)
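As a rough check, the ranking above can be recomputed from the rounded percentages listed for each county. This is only a sketch: the `entropy` helper and the base-10 logarithm are assumptions on my part, and the published shares are rounded, so they do not always sum to exactly 100%.

```python
import math

def entropy(shares):
    """Shannon entropy (base 10) of shares given as fractions; zero shares are skipped."""
    return sum(-p * math.log10(p) for p in shares if p > 0)

# Shares in the order: White (non-Latino), African American, Native American,
# Asian American, Latino, Other. Values are copied from the list above.
counties = {
    "Aleutians West Census Area, Alaska": [0.314, 0.057, 0.151, 0.283, 0.131, 0.064],
    "Aleutians East Borough, Alaska": [0.135, 0.067, 0.277, 0.354, 0.123, 0.044],
    "Queens County, New York": [0.276, 0.177, 0.003, 0.228, 0.275, 0.040],
    "Alameda County, California": [0.341, 0.122, 0.003, 0.259, 0.225, 0.051],
    "Solano County, California": [0.408, 0.142, 0.005, 0.143, 0.240, 0.062],
}

# Sort from most to least diverse by entropy score.
ranked = sorted(counties, key=lambda name: entropy(counties[name]), reverse=True)
for name in ranked:
    print(f"{name}: {entropy(counties[name]):.3f}")
```

Sorting by this score should put the five counties in the same order as the list, with every score below the six-category maximum of log10(6), roughly 0.778.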

Dr. Randy Olson is the Chief Data Scientist at FOXO Bioscience, where he is bringing advanced data science and machine learning technology to the life insurance industry.

10 comments on “U.S. Racial Diversity by County”
  1. Casual Reader says:

    13.1% of the population of the Aleutian West Census area is Hispanic and only 15.1% is Native American? Something no smell right…

    • Randy Olson says:

      Look it up. It’s there in the data. 🙂

      • Tlaloc says:

        But, of course, it is possible the data itself is incorrect. Not saying it is for sure, but the possibility remains. Particularly since this relies, I believe, on self reporting, there’s the possibility for misunderstanding of the instructions or deliberate attempts to skew the results. In large populations that’s unlikely to produce big shifts from the ‘true’ values but I’m assuming the Aleutian populations are small enough that they might well be noticeable.

  2. Paul says:

    Nice work and nice map!
    Data is better when it can be viewed in multiple formats. Could you post a ranking of all 6,286 counties, or a formula we can use to replicate your methodology? The Wikipedia article does not make it very clear how to calculate entropy.

    • Randy Olson says:

      Thanks Paul! Entropy for a single county can be calculated by summing up -p * log(p) for each racial category (White (non-Latino), African American, etc.), where p is the fraction of people of that race in the county (e.g., 0.99 for 99%). So say we wanted to calculate entropy for Hand County, South Dakota (99% White, non-Latino and 1% Latino). Using base-10 logs and counting 0 * log(0) as 0, entropy would be:

      Entropy(Hand County, SD) = -0.99 * log(0.99) - 0.01 * log(0.01) - 0.0 * log(0.0) - 0.0 * log(0.0) - 0.0 * log(0.0) - 0.0 * log(0.0) = 0.024

  3. Claudia says:

    Yay Tompkins County! I suspect we’d be even more diverse if it weren’t for our location. There aren’t a lot of people eager to move to Upstate NY, where we freeze 8 months out of the year and New York City is 220 miles away. That may help explain the lack of diversity in Vermont and NH, too.

    A conversation I’ve had more than once with a graduating Cornell senior who is a member of a minority group:

    Me: Congratulations! So what’s your next move?

    S/he: Oh, I’m heading back to NYC/Florida/Malaysia/NJ/etc.

  4. A says:

    This is great. Thanks for posting it. Excellent use of Shannon Entropy! I can’t tell you how excited I was to see it used. Do you have the actual raw data? The link you posted is to the percentages of each ethnicity in each county. Presumably there were some counts originally. I ask because, like many map visualizations, this one puts too much emphasis on geographic area and not enough on population. For instance, the first and second most diverse counties earned their spots in large part because of their high percentage of Native American residents (15 and 27%). These counties have 5500 and 3200 residents respectively. Queens, the third most diverse, has only a 0.3% Native American population. But with 2.273 million residents, its Native American population of 6189 is several times the combined Native American population of the previous two entries. There are undoubtedly city blocks in Queens with more people than either of these Alaskan counties.

    Now, maps that skew area to reflect population are usually unpleasant to look at, which is why the actual raw data would be useful. Then your readers can have their pretty pictures and also play around with the data to get a better idea of the context of each county’s score.

    • Randy Olson says:

      Thank you! I was pretty happy when I found another good application for entropy. 🙂

      Unfortunately the data I posted is the rawest form I was able to find. But I’m fairly sure the raw population count data is out there somewhere!

  5. Daniela says:

    Do you have the diversity score for every county in a table that I could take a look at?