
SearchResearch Challenge (6/11/18): How do you plot out data by region? The case of regional boundaries.

Dan Russell • July 11, 2018
Republished with permission from SearchReSearch

It's time for a Challenging Challenge!

As you know, every so often I like to mix up the SRS Challenge with something that's a bit more in-depth. (And if this is overwhelming, just take the week off--I'll be back next week with an easier one.)

The Setup: If you read the news these days you'll see all kinds of claims about various kinds of data. In an earlier SRS post we talked about immigration rates, and found that the data is a bit complicated, but you can figure it out.

Among the things you'll see in the news are charts like this one:

[Map: median household income by COUNTY (not MSA or CSA).]

This is the "Median household income in 2012 by county" chart from Wikimedia, which shows the median income by county in the US. Of course, counties are sometimes just arbitrary boundaries. They may or may not make sense. (For instance, Los Angeles County has around 10M souls living inside the county, while only 600K people live in Providence County, Rhode Island. That's roughly a 16X difference in population.)

There are many ways to draw regional boundaries that make some kind of sense. For instance, gerrymandering is the practice of drawing political boundaries to give a particular party more (or less) voting power.

There are commercial regional boundaries (such as the "Designated Market Areas," aka DMAs, defined by the polling / survey company Nielsen). These regions correspond to media markets.
More often, though, people who are looking at data use "Metropolitan Statistical Areas" (MSAs). An MSA is “a geographical region with a relatively high population density at its core and close economic ties throughout the area.”
For instance, the San Francisco-Oakland-Hayward Metropolitan Statistical Area (with a population of 4.5 million) and the larger San Jose-San Francisco-Oakland Combined Statistical Area (8.4 million) are both near where I live in Silicon Valley.
A slightly different version of the MSA is the "Combined Statistical Area" (CSA), which is composed of "adjacent metropolitan (MSA) and micropolitan (μSA) regions in the United States and Puerto Rico that can demonstrate economic or social linkage." (This is primarily defined by commuting patterns.)
A map of the combined metropolitan and micropolitan statistical areas of the US looks like this:
[Map image. Credit: Wikimedia]

I'm telling you all of this background because it leads to today's Challenge.
1. Can you make a map of the median household income for each of the MSAs in the United States? (Or equivalent statistical areas, if you're from another country.)
That is, you'll need to:
A. Find a source of recent data that's organized by MSAs. 2017 would be best, but you should look for the most recent data.
B. Find a visualization application that can ingest both the median income data and the shape of the MSA.
C. Figure out a way to create a visualization of the US MSAs that color-codes the income. It should look a bit like the above example, except with the income level determining the color of the MSA region.
This is a bit of a Challenge, but it doesn't require programming. (If you want to program, be my guest, but this doesn't really need it.)
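For readers who do want to program, steps B and C come down to two operations: joining the income table to the region shapes on a shared identifier, and binning each income into a color class. Here is a minimal sketch in plain Python, under loud assumptions: the region IDs, income figures, class breaks, and colors are all made-up placeholders (not real statistics), and the "shapes" are toy polygons standing in for real MSA boundary files. A real mapping tool does the drawing; this only shows the join-and-color logic.

```python
from bisect import bisect_right

# PLACEHOLDER data for illustration only: these are NOT real income figures.
median_income = {"region_a": 48_000, "region_b": 61_500, "region_c": 95_000}

# PLACEHOLDER shapes: toy polygons as (x, y) points, keyed by the same IDs.
region_shapes = {
    "region_a": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "region_b": [(1, 0), (2, 0), (2, 1), (1, 1)],
    "region_d": [(2, 0), (3, 0), (3, 1), (2, 1)],  # a shape with no income row
}

# Assumed class breaks and a light-to-dark palette (one more color than breaks).
BREAKS = [50_000, 75_000]
COLORS = ["#deebf7", "#9ecae1", "#3182bd"]
NO_DATA = "#cccccc"  # gray for regions missing from the income table


def color_for(income):
    """Return the fill color for an income, or gray when there is no data."""
    if income is None:
        return NO_DATA
    # bisect_right counts how many breaks are <= income, which is the class index.
    return COLORS[bisect_right(BREAKS, income)]


def build_choropleth(shapes, incomes):
    """Join shapes to incomes by ID and attach a fill color to each shape."""
    return [
        {"id": rid, "shape": shape, "fill": color_for(incomes.get(rid))}
        for rid, shape in shapes.items()
    ]


layers = build_choropleth(region_shapes, median_income)
for layer in layers:
    print(layer["id"], layer["fill"])
```

Note what the join exposes: a region with a shape but no income row gets the "no data" color, and an income row with no matching shape never reaches the map. Mismatched identifiers between the two sources are the usual stumbling block in this kind of Challenge, whichever tool you use.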
And, if you really don't like MSAs as the boundaries of map regions... find a different one, and tell us why you like yours better.
Once you figure out how to do this, you'll have the means to do your own analysis, looking at data in your own way.
Search on!


P.S. This is the kind of thing that Data Scientists do all the time. With this Challenge, I'm hoping to instill some of the skills and values that Data Scientists bring to the job every day. Hope you have fun with it. I'm looking forward to your comments!





This post was republished. Comments can be viewed and shared via the original site.

About the Author

Dan Russell

I study the way people search and research. I guess that makes me an anthropologist of search. While I work at Google, my blog and G+ posts reflect my own thoughts and not those of my employer. I am FIA's Future-ist in Residence.

© 2023 The Future of Information Alliance, University of Maryland | Privacy Policy | Web Accessibility