Tuesday, March 8th, 2011
Thanks to those of you who checked out Telestrian, my new data analytics and visualization venture. I want to kick off an occasional series of posts demonstrating what it can do. To keep it from being too salesy, I’m going to use the tool to derive some interesting insights along the way.
Today I want to look at thematic maps. Even if you need to map data I don’t have, you can do it with Telestrian for the map types I support. So if you have any type of data – election data, sales data, etc. – and want to map it, read to the bottom to see how.
The Telestrian system goes well beyond just letting you look up facts by place and date. You can run actual queries against the data. And for most of the queries, you can render the results directly to a thematic (choropleth) map in about 30 seconds end to end. It’s almost trivial to make extremely powerful charts using this. Let’s look at a few examples.
Here’s one, looking at the results of the 2010 Census so far. This is the percentage change in population since the last Census, for counties in states where the redistricting files had been released as of the end of last week:
There are lots of web sites that create maps, but they almost all have major weaknesses. One of them is that almost all of them use Flash as the technology, which is great for web interactivity, but terrible if you actually want to take your map with you and put it in a presentation or document. Telestrian is different because its maps are rendered as actual image files (PNG) that you can right click and save off to take with you. As I demonstrated in a previous example last week, you can also scale the image to pretty much any size you want with no resolution loss or distortion. Images are great for presentations or basic documents, but if you are interested in using this in a pamphlet or other production graphics work, Telestrian will also export the map in scalable vector graphics (SVG) format you can hand off to your graphic designer to format and put right in your publication. So the first Telestrian difference is that it generates formats you can actually use.
Another issue with a lot of mapping sites is that they are designed as web eye candy. They look cool, but you really can’t do a lot to change them. The data is presented how it is presented. But Telestrian gives you tons of control over how you render the data, ranging from the query you execute to the colors and more.
Here I decided to run a map based on a percent change function. I also took advantage of a built-in capability that lets me assign a different color scheme to values above and below a threshold value. In this case, I used zero as the threshold, so blue counties are those that grew in population, and red are those that declined. I used a five-bucket (quintile) sort to assign colors, with five quintiles on the positive side and five on the negative side. You can also see that I don’t need data for every single county just to create the map.
You can use other values as your threshold. Two built-in functions let you threshold on the national value or, for state maps, the state value. It’s automatic – the system will pull the appropriate value for you. The meaning of this obviously is dependent on the data. Or you can just pick a threshold yourself. Here’s an example where I plotted the unemployment rate in October 2010, highlighting only those counties with an unemployment rate of 11% or more:
Here I selected a monochrome color scheme just to highlight the counties, and just set the ones below the threshold to white (empty).
Here’s the last national county map I’ll do. It’s a monochrome map showing the counties that had any migration (in or out) with Franklin County, Ohio (Columbus) between 2000 and 2010. I’m just using this to map the migration shed of that county:
Think states are completely irrelevant today? Think again. Lest you think this is entirely Ohio State driven, note that it is based on tax return data, so undergrads leaving home to go to school might not even show up. Also, if you map Marion County, Indiana (Indianapolis) you get a similar pattern. There’s clearly still some level of migrational affinity for staying within the same state. I’ve really enjoyed comparing the size and composition of migration sheds for various counties and metro areas as it is very illuminating.
Speaking of metro areas, let’s do a map of those. I mentioned that you can apply various functions in your queries. In addition to just raw value, you can also map things like total change, percent change, per capita, density, percentage of a parent value, etc. One of the functions supported is location quotient. This is normally used to measure the concentration of employment in various industry clusters. But an economist in Columbus was telling me he found it useful for all sorts of things, especially as the math works for any hierarchical data element. So here let’s take a look at the change in location quotient for college degree attainment for US metro areas from 2000 to 2009:
Again, positive changes are in blue, negative in red. You can easily get the raw data if you want too, and further analyze it. Because location quotient and change in location quotient are built in, again it takes virtually no time to run this. You just pick the value from a drop down.
I find this interesting because it shows where degrees are concentrating and deconcentrating. In order to have a positive change in LQ, you basically need to increase your educational attainment at a faster rate than the nation as a whole. That’s harder to do when you’re on a high base. Some places like NYC managed it. Others did not. So while a map of total attainment might show major spikes, this shows some areas with lower attainment that really upped their game. If you pull up the bottom a lot, you can actually deconcentrate something on a national basis even if the folks at the top still improved.
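To make the arithmetic concrete, here is a small Python sketch of location quotient using the standard definition: the local share of something divided by the national share of the same thing. The function name and every number below are invented purely for illustration; they are not actual attainment figures and this is not Telestrian's code.

```python
def location_quotient(local_part, local_total, national_part, national_total):
    """Local share divided by national share; 1.0 means 'same as the nation'."""
    local_share = local_part / local_total
    national_share = national_part / national_total
    return local_share / national_share


# Hypothetical metro: degree holders go from 40% to 42% of adults,
# while the (equally hypothetical) national share goes from 24% to 28%.
lq_start = location_quotient(400, 1000, 240, 1000)
lq_end = location_quotient(420, 1000, 280, 1000)
lq_change = lq_end - lq_start

print(f"LQ start: {lq_start:.2f}, LQ end: {lq_end:.2f}, change: {lq_change:+.2f}")
```

In this made-up case the metro's own attainment rose, but its LQ fell from about 1.67 to 1.50 because the national share rose proportionally faster. That is the high-base effect described above.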
I’m still exploring this way of measuring it, and it would be interesting to see a true pro’s take, but I think this is an interesting dimension, though certainly not the only one, on which to look at some of these measures.
Here’s one of the percentage of the population that are binge drinkers (2009 data):
Hello, Wisconsin! This shows a state map and that I have colors other than red and blue. It also shows another method of assigning colors, this one what I call the “intensity” method, which is actually the default. In this method, we don’t create discrete buckets, but rather scale the color continuously in a manner directly proportional to the data value. In a case like this, a quintile sort might have implied major differences in value that aren’t there in the data. Maps like this don’t work well with outliers, but for a lot of things – population of the states for example – it can be perfect.
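Here is a rough Python sketch of what a continuous "intensity" scale looks like in general. It is not Telestrian's implementation. I am assuming a simple linear blend from white to a base color between the smallest and largest values, and the function name, parameters, and sample values are hypothetical.

```python
def intensity_color(value, vmin, vmax, base_rgb=(0, 0, 255)):
    """Blend linearly from white (at vmin) to base_rgb (at vmax); return a hex color."""
    if vmax == vmin:
        t = 1.0  # all values identical: use the full color
    else:
        t = (value - vmin) / (vmax - vmin)
    t = max(0.0, min(1.0, t))  # clamp anything outside the range
    r, g, b = (round(255 + (c - 255) * t) for c in base_rgb)
    return f"#{r:02x}{g:02x}{b:02x}"


# Made-up values with one large outlier at the end.
values = [10, 12, 14, 100]
colors = [intensity_color(v, min(values), max(values)) for v in values]
```

With the outlier setting the top of the scale, the first three values all land within a few percent of white and become nearly indistinguishable, which is why this method works best when the data has no extreme outliers.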
All of the examples above show system-assigned colors. That’s the fastest way to do it. But you can choose your own color assignment thresholds if you’d like. In fact, you absolutely need to do that to do time series maps, etc. Here’s an example from my Indiana census post of total change in Hispanic population, where I selected the thresholds myself:
This also shows the state level county maps Telestrian can do. The system supports national maps of states, metros and counties, or state level maps of counties. You can’t define or create arbitrary maps, which is the one price you pay for making everything so simple.
Here’s one more state map, this one Michigan counties for percent change in jobs from 2000 to 2009, positive change in blue, negative in red, intensity color assignment:
Lastly, I promised that you could map literally any data you wanted, so let’s see how. It can be so difficult to create thematic maps without special tools or technical skills that I thought I would open up the mapping to any type of data you want. You can’t (yet) save the data into the Telestrian system and use the rest of the system functions against it, but you can map it.
You do this by uploading a comma delimited file with the data. Let’s see an example of the original red-blue map, this one the 2008 presidential election results. I created a file that contains President Obama’s margin of victory. Here’s a sample of it:
As you can see, that’s pretty simple. In fact, I have downloadable templates for all available maps with the names and codes. All you have to do is drop your data into it and upload the file.
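If your numbers start out as raw vote counts, a few lines of Python can turn them into a comma delimited file of margins. Treat this strictly as a sketch: the column names (`code`, `name`, `value`), the place codes, and the vote counts are placeholders I made up, and I am assuming the margin is measured in percentage points of the two-candidate vote. The real layout comes from the downloadable templates.

```python
import csv
import io


def margin_rows(results):
    """Turn (code, name, dem_votes, rep_votes) tuples into (code, name, margin) rows."""
    rows = []
    for code, name, dem, rep in results:
        total = dem + rep
        # Margin in percentage points; left blank if there were no votes.
        margin = round(100.0 * (dem - rep) / total, 1) if total else ""
        rows.append((code, name, margin))
    return rows


def to_csv(rows):
    """Serialize rows to comma delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["code", "name", "value"])  # hypothetical header
    writer.writerows(rows)
    return buf.getvalue()


# Placeholder places and invented vote counts, for illustration only.
sample = [
    ("00001", "Example County A", 600, 400),
    ("00002", "Example County B", 450, 550),
]
csv_text = to_csv(margin_rows(sample))
print(csv_text)
```

A positive value means the first candidate won that place and a negative value means the second did, which is what makes a threshold of zero a natural split between the two colors.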
Let’s do a red-blue monochrome based on a threshold of zero and we get this map:
No legend needed for this map, so I turned it off.
I can’t think of how this could get much easier for a non-technical user.
I won’t show this here, but if you do have a bit of technical skill, you can actually directly control the colors by including an RGB color string in the file instead of a data value. This can be useful for highlighting a particular county or state.
I hope this shows how easy mapping can be. Other than the Obama one, which took longer since I had to download and process the data by hand, and the Indiana Hispanic map, which took about a minute because I had to type in my own thresholds, the rest of these maps only took me about 30 seconds each to create. And they are in an image format I could plop directly in here.
You can read much more about the nuts and bolts of mapping in The Telestrian Guide to Thematic Maps.
Again, I hope you found these maps interesting, and that you’ll give Telestrian serious consideration for your own thematic mapping needs.