U.S. Racial Diversity by County

The U.S. is typically viewed as a melting pot of races and cultures, but recent maps showing the ethnic distribution of the U.S. seem to hint that the U.S. isn’t as well-mixed as we all thought. In this visualization, I mapped out the racial/ethnic diversity of the U.S. to give us a better sense of the hotspots of diversity.

[Figure: U.S. racial diversity map]

To calculate racial/ethnic diversity, I computed the entropy on the “% ethnicity” data for each county using the 6 ethnic categories the U.S. Census tracks: White (non-Latino), African American, Native American, Asian American, Latino, and Other. A county will come out with high entropy when all 6 ethnic categories are as even as possible (i.e., each ~16.7%), whereas it will come out with low entropy if the county is only inhabited by people of one ethnic category. I’ve included the raw census data if you want to tinker with it yourself.
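As a rough sketch of the calculation described above (not the code used to produce the map), the two extremes can be checked directly. The function name `shannon_entropy` is hypothetical, and the base-10 logarithm is an assumption: it is the base that reproduces the 0.024 quoted for Hand County in the comments.

```python
import math

def shannon_entropy(shares, base=10):
    """Entropy of a list of population shares given as fractions (0.167, not 16.7).

    Hypothetical helper; base 10 is an assumption, not confirmed by the post.
    Categories with a share of 0 are skipped, i.e. 0 * log(0) is counted as 0.
    """
    return sum(-p * math.log(p, base) for p in shares if p > 0)

# All six categories equally represented: the highest possible entropy.
even_county = [1 / 6] * 6
# Everyone in a single category: the lowest possible entropy.
single_category_county = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

max_entropy = shannon_entropy(even_county)
min_entropy = shannon_entropy(single_category_county)

print(f"even split:      {max_entropy:.3f}")   # equals log10(6)
print(f"single category: {min_entropy:.3f}")
```

With six categories, every county's score falls between these two values, so dividing by `max_entropy` would give a 0-to-1 scale if one is easier to read.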

One of the most notable features is that the Midwest and Northeast are fairly homogeneously white. Vermont, New Hampshire, and Maine stand as the pinnacle of racial homogeneity, each with only one or two counties with even a blip of diversity. The only exceptions to this trend are the major cities throughout the U.S., which seem to attract people of all ethnicities regardless of the state the city is in.

As a Michigander, I’m most surprised to see how diverse the Upper Peninsula is. I thought only crazy white people lived up that far in Michigan.

Here are the least diverse counties:

  1. Tucker County, West Virginia (100% White, non-Latino)
  2. Robertson County, Kentucky (100% White, non-Latino)
  3. Hooker County, Nebraska (100% White, non-Latino)
  4. Hand County, South Dakota (99% White, non-Latino and 1% Latino)
  5. Owsley County, Kentucky (98% White, non-Latino and 2% Latino)

And the most diverse counties:

  1. Aleutians West Census Area, Alaska (31.4% White (non-Latino), 5.7% African American, 15.1% Native American, 28.3% Asian American, 13.1% Latino, and 6.4% Other)
  2. Aleutians East Borough, Alaska (13.5% White (non-Latino), 6.7% African American, 27.7% Native American, 35.4% Asian American, 12.3% Latino, and 4.4% Other)
  3. Queens County, New York (27.6% White (non-Latino), 17.7% African American, 0.3% Native American, 22.8% Asian American, 27.5% Latino, and 4% Other)
  4. Alameda County, California (34.1% White (non-Latino), 12.2% African American, 0.3% Native American, 25.9% Asian American, 22.5% Latino, and 5.1% Other)
  5. Solano County, California (40.8% White (non-Latino), 14.2% African American, 0.5% Native American, 14.3% Asian American, 24% Latino, and 6.2% Other)
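The ranking above can be reproduced from the listed percentages alone. The sketch below is only an illustration under a few assumptions: the helper name `entropy_from_percentages` is made up, the log base is taken to be 10, and the rounded percentages are used as printed (they do not always sum to exactly 100).

```python
import math

def entropy_from_percentages(percentages, base=10):
    """Hypothetical helper: entropy of a county given its six percentages."""
    shares = [pct / 100 for pct in percentages]
    return sum(-p * math.log(p, base) for p in shares if p > 0)

# Percentages copied from the list above, in the order:
# White (non-Latino), African American, Native American,
# Asian American, Latino, Other.
most_diverse = {
    "Aleutians West Census Area, Alaska": [31.4, 5.7, 15.1, 28.3, 13.1, 6.4],
    "Aleutians East Borough, Alaska": [13.5, 6.7, 27.7, 35.4, 12.3, 4.4],
    "Queens County, New York": [27.6, 17.7, 0.3, 22.8, 27.5, 4.0],
    "Alameda County, California": [34.1, 12.2, 0.3, 25.9, 22.5, 5.1],
    "Solano County, California": [40.8, 14.2, 0.5, 14.3, 24.0, 6.2],
}

# Sort county names from highest to lowest entropy.
ranked = sorted(
    most_diverse,
    key=lambda name: entropy_from_percentages(most_diverse[name]),
    reverse=True,
)

for position, name in enumerate(ranked, start=1):
    score = entropy_from_percentages(most_diverse[name])
    print(f"{position}. {name}: {score:.3f}")
```

The choice of log base only rescales the scores, so it does not affect the order of the counties.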

Randy is a PhD candidate in Michigan State University's Computer Science program. As a member of Dr. Chris Adami's research lab, he studies biologically-inspired artificial intelligence and evolutionary processes.

Posted in data visualization
7 comments on “U.S. Racial Diversity by County”
  1. Casual Reader says:

    13.1% of the population of the Aleutian West Census area is Hispanic and only 15.1% is Native American? Something no smell right…

  2. Paul says:

    Nice work and nice map!
    Data is better when it can be viewed in multiple formats. Could you post a ranking of all 6,286 counties, or a formula we can use to replicate your methodology? The Wikipedia article does not make it very clear/easy on how to calculate entropy.

    • Randy Olson says:

      Thanks Paul! Entropy for a single county can be calculated by summing up -p * log(p) over the racial categories (White (non-Latino), African American, etc.), where p is the fraction of the county’s population in that category (e.g., 0.99 for 99%). Any category with p = 0 contributes 0 to the sum. So say we wanted to calculate entropy for Hand County, South Dakota (99% White, non-Latino and 1% Latino). Entropy would be:

      Entropy(Hand County, SD) = -0.99 * log(0.99) - 0.01 * log(0.01) - 0.0 * log(0.0) - 0.0 * log(0.0) - 0.0 * log(0.0) - 0.0 * log(0.0) = 0.024 (with log taken in base 10)
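For readers who want to check the Hand County arithmetic, here is a minimal sketch. It evaluates the same sum with three common log bases; the reply does not state which base was used, but only base 10 gives the quoted 0.024. The function name `entropy` is hypothetical.

```python
import math

# Hand County, SD: 99% White (non-Latino), 1% Latino, 0% for the other four.
hand_county = [0.99, 0.01, 0.0, 0.0, 0.0, 0.0]

def entropy(shares, log):
    """Sum of -p * log(p); zero shares are skipped since log(0) is undefined."""
    return sum(-p * log(p) for p in shares if p > 0)

base10 = round(entropy(hand_county, math.log10), 3)
natural = round(entropy(hand_county, math.log), 3)
base2 = round(entropy(hand_county, math.log2), 3)

print("base 10:", base10)   # 0.024, matching the value quoted above
print("natural:", natural)  # 0.056
print("base 2: ", base2)    # 0.081
```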

  3. Claudia says:

    Yay Tompkins County! I suspect we’d be even more diverse if it weren’t for our location. There aren’t a lot of people eager to move to Upstate NY, where we freeze 8 months out of the year and New York City is 220 miles away. That may help explain the lack of diversity in Vermont and NH, too.

    A conversation I’ve had more than once with a graduating Cornell senior who is a member of a minority group:

    Me: Congratulations! So what’s your next move?

    S/he: Oh, I’m heading back to NYC/Florida/Malaysia/NJ/etc.

  4. A says:

    This is great. Thanks for posting it. Excellent use of Shannon Entropy! I can’t tell you how excited I was to see it used. Do you have the actual raw data? The link you posted is to the percentages of each ethnicity in each county. Presumably there were some counts originally. I ask because, like many map visualizations, this one puts too much emphasis on geographic area and not enough on population. For instance, the first and second most diverse counties earned their spots in large part because of their high percentage of Native American residents (15% and 27%). These counties have 5500 and 3200 residents respectively. Queens, the third most diverse, has only a 0.3% Native American population. But with 2.273 million residents, its Native American population of 6189 is several times the combined Native American population of the previous two entries. There are undoubtedly city blocks in Queens with more people than either of these Alaskan counties.

    Now, maps that skew area to reflect population are usually unpleasant to look at, which is why the actual raw data would be useful. Then your readers can have their pretty pictures and also play around with the data to get a better idea of the context of each county’s score.

    • Randy Olson says:

      Thank you! I was pretty happy when I found another good application for entropy. :-)

      Unfortunately the data I posted is the rawest form I was able to find. But I’m fairly sure the raw population count data is out there somewhere!
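The population point raised in comment 4 can be roughed out from the percentages plus the resident totals the commenter quotes. This is only a sketch: the totals come from the comment rather than from the posted data set, and because the percentages are rounded, the estimates are approximate and need not match the commenter's own figure for Queens.

```python
# (total residents as quoted in comment 4, % Native American from the post)
counties = {
    "Aleutians West Census Area, Alaska": (5_500, 15.1),
    "Aleutians East Borough, Alaska": (3_200, 27.7),
    "Queens County, New York": (2_273_000, 0.3),
}

# Estimated head count = total residents * percentage / 100.
estimated = {
    name: round(residents * pct / 100)
    for name, (residents, pct) in counties.items()
}

alaska_combined = (
    estimated["Aleutians West Census Area, Alaska"]
    + estimated["Aleutians East Borough, Alaska"]
)
queens = estimated["Queens County, New York"]

for name, count in estimated.items():
    print(f"{name}: ~{count} Native American residents")
print(f"Queens / both Alaskan counties combined: {queens / alaska_combined:.1f}x")
```

Even with rounding error, the gap is large enough to support the comment's point that a small percentage of a very large county can outnumber a large percentage of a small one.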
