Sunday, February 27th, 2011

A Better Way to Find, Look At, Analyze and Display Civic Data

You all know that I love doing data driven posts. But I found myself frustrated that it would take me literally hours to create even simple blog posts doing what I figured was very basic analysis, like putting up something about what happened in the latest Census estimates release. There was just tons of tedious work involved.

It can be surprisingly difficult to answer seemingly basic questions about cities, like:

  • Which large metro areas grew their GDP the most in the last year?
  • How does Chicago benchmark against New York on job creation?
  • What counties in Indiana increased their Hispanic population share in the last decade by the most?
  • How did the population growth rate in the city of Chicago compare to Cook County, the metro area, state, and nation last year?
  • Where do the people who move to Indianapolis come from?

Answering these questions can involve lots of drudge work: downloading raw data, manipulating it in Excel to find what you want, then typing it into an HTML table or putting it into a chart you can use in a post, presentation, or document. It can literally take hours, sometimes days.

There are tons of free tools that let you access data, but every one I’ve seen is almost useless for real data analysis. They more or less only let you look up facts – like the population of Chicago – or display grids of numbers. It’s telling that the Census Bureau’s tool is actually called “Fact Finder.” If they create graphs, it’s mostly what they want to show, not what you want to, and almost invariably only in Flash, so that you can’t take it out of their system without doing a screen shot.

Conversely, there are tons of pro tools that do fantastic stuff, programs like SAS, ArcGIS, or Moody’s Economy.com. The problem is that these cost huge amounts of money, are aimed at high end power users doing hard core statistical analysis and the like, or both; and are often hard to use as a result. There’s a reason that there’s an entire job category out there called “GIS Analyst”.

So I gave up in my search to find something that met my needs, and instead decided to build my own private database and query tools. Then I discovered that’s what half the world is doing, which seems like a waste. So I figured if this is so valuable to me, which it is, maybe it’s valuable to others and they might use it too.

And so my latest venture, the Telestrian data terminal, was born (see www.telestrian.com). For people who work with data about cities, counties, regions, and states, Telestrian is all about providing three big-time benefits:

  • Huge time and money savings. I can honestly say that having Telestrian for my own use during development has reduced the amount of time I spend on many data analysis tasks by over 95%. I’m serious. Stuff that would have taken me hours or been nearly impossible before I can now do in a few seconds. And as we all know, time is money.
  • New capabilities. Notice that I’ve been posting more maps lately? That’s because I can actually make them now. And with Telestrian, so can you – and a lot more.
  • New revenue opportunities. If you are a consultant, I’ll show you how Telestrian can power new types of engagements you can sell. In fact, I originally thought it would make a nice proprietary tool for my own consulting business.

And if you’re wondering whether this is the system with the IRS migration data, the answer is yes, so read on.

You can read more about the benefits and walk through a few examples in my white paper, A Better Way to Find, Look At, Analyze, and Display Civic Data. I’ll highlight a few examples of the benefits in action.

Massive Time Savings

You’ve seen lots of studies that rate metro areas on college degree attainment, like Brookings’ wonderful State of Metropolitan America. Let’s say we’re doing an update to that study for them, and want to look specifically at growth in the share of people who have professional or graduate degrees. Which of the top 100 metro areas had the greatest change in their percentage of population with graduate or professional degrees?

With the Telestrian system, you can answer that question in about 30 seconds. We just go to that data element and do it. Telestrian gives you a common toolset on every data element. The Query tab is what most people gravitate to, since it is what lets you look up data by geography and date like other sites do. But that’s arguably the least powerful thing in the system. If you go to Analyze, you can run powerful parameterized queries that let you mine the data in a snap. Here’s the query you want.

Note that we set a threshold to only look at places with a population greater than 510,000. This gives us the top 100, which is what Brookings looks at.

Bam, here’s the answer:

You’ll note that on the left there are a ton of options for working with the results. Maybe we want to dump that into a blog post like this one in a form you can actually read, for example. In just a couple clicks we can export an HTML table that we can paste right here:


Row  Metro                                         2000             2009             Change in % of Total Adult (25+) Population
1    Washington-Arlington-Alexandria, DC-VA-MD-WV  607,122 (19.1%)  820,534 (22.6%)  3.49%
2    Buffalo-Niagara Falls, NY                     74,319 (9.5%)    96,625 (12.5%)   3.05%
3    Baltimore-Towson, MD                          201,072 (11.9%)  267,724 (14.8%)  2.95%
4    Boston-Cambridge-Quincy, MA-NH                455,971 (15.4%)  574,092 (18.3%)  2.94%
5    Poughkeepsie-Newburgh-Middletown, NY          41,647 (10.5%)   57,859 (13.3%)   2.85%
6    Worcester, MA                                 50,857 (10.3%)   70,294 (13.1%)   2.82%
7    Hartford-West Hartford-East Hartford, CT      96,943 (12.5%)   123,378 (15.3%)  2.80%
8    St. Louis, MO-IL                              158,331 (9.0%)   220,061 (11.6%)  2.61%
9    Portland-South Portland-Biddeford, ME         34,082 (10.2%)   46,163 (12.8%)   2.54%
10   Columbia, SC                                  37,534 (9.1%)    55,623 (11.6%)   2.52%

Or maybe put these into a bar chart. Voilà!

Yes, the Telestrian system even truncates those overly long metro area names if you want it to. At this point we’ve spent about one total minute in the system.

If you’ve worked with this data at all, you’ll know that it comes from two completely different data sets. The 2000 data comes from Census 2000 and the 2009 data comes from the American Community Survey. So you’d have to manually extract both, merge them, merge in the population data somehow, trim it down to the top 100 metros, calculate the percentage attainment, calculate the percentage point change, sort on that, then hand create the tables or charts. But what’s worse, you may remember that the Census 2000 data is distributed in that old 1990’s era CMSA/PMSA stuff that isn’t comparable with today’s metro area definitions. So you have to download the county data and manually re-aggregate all the 2000 data to current metros yourself, unless you find a source that did it for you already.
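To make the manual route concrete, here is a rough Python sketch of those steps: roll county counts up to current metro definitions, compute the share of adults with graduate or professional degrees in each year, take the percentage point change, and sort. Everything in it is an assumption for illustration. The county names, the crosswalk, the counts, and the function names are made up, and this is not how Telestrian itself is implemented.

```python
from collections import defaultdict

# Hypothetical crosswalk from counties to current metro definitions.
COUNTY_TO_METRO = {
    "County A": "Metro X",
    "County B": "Metro X",
    "County C": "Metro Y",
}

# Toy rows of (county, graduate/professional degree holders, adults 25+).
# These are made-up numbers, not Census 2000 or ACS figures.
ROWS_2000 = [("County A", 100, 1000), ("County B", 50, 1000), ("County C", 200, 2000)]
ROWS_2009 = [("County A", 150, 1000), ("County B", 70, 1000), ("County C", 260, 2000)]


def roll_up(rows):
    """Sum county counts into metro totals: metro -> (degree holders, adults)."""
    totals = defaultdict(lambda: [0, 0])
    for county, holders, adults in rows:
        metro = COUNTY_TO_METRO[county]
        totals[metro][0] += holders
        totals[metro][1] += adults
    return {metro: tuple(pair) for metro, pair in totals.items()}


def rank_by_share_change(rows_start, rows_end, min_adults=0):
    """Rank metros by percentage point change in degree-holder share."""
    start, end = roll_up(rows_start), roll_up(rows_end)
    ranked = []
    for metro in start.keys() & end.keys():
        holders_0, adults_0 = start[metro]
        holders_1, adults_1 = end[metro]
        if adults_1 < min_adults:
            continue  # size cutoff, a stand-in for the population threshold
        share_0 = 100 * holders_0 / adults_0
        share_1 = 100 * holders_1 / adults_1
        ranked.append(
            (metro, round(share_0, 2), round(share_1, 2), round(share_1 - share_0, 2))
        )
    # Largest percentage point gain first.
    return sorted(ranked, key=lambda row: row[3], reverse=True)


ranking = rank_by_share_change(ROWS_2000, ROWS_2009)
for row in ranking:
    print(row)
```

With real data, most of the work is in building the crosswalk and reconciling the two sources, which is exactly the tedious part.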

Or you can just spend about a minute in the Telestrian tool.

Beyond change in the percentage of a parent data value, there are several other functions you can use in your search too, such as total raw value, total change, percent change, density, and location quotient. The Telestrian data terminal can almost turn you into a one man Brookings Institution.
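For readers who have not run into these functions before, here is a minimal Python sketch of how they are commonly defined. The function names and signatures are my own for illustration and are not Telestrian’s. The location quotient uses the usual definition: a place’s share of something divided by the national share of the same thing. The Washington figures come from the table above; the location quotient inputs are toy numbers.

```python
def total_change(start, end):
    """Raw difference between two periods."""
    return end - start


def percent_change(start, end):
    """Change between two periods as a percentage of the starting value."""
    return 100 * (end - start) / start


def density(value, land_area):
    """Value per unit of land area, e.g. people per square mile."""
    return value / land_area


def location_quotient(local_part, local_total, national_part, national_total):
    """Local share divided by national share; above 1 means over-represented."""
    return (local_part / local_total) / (national_part / national_total)


# Washington metro graduate/professional degree holders, 2000 and 2009,
# taken from the table above.
print(total_change(607122, 820534))               # 213412
print(round(percent_change(607122, 820534), 2))   # 35.15

# Toy numbers: 25% locally versus 12.5% nationally.
print(location_quotient(250, 1000, 125, 1000))    # 2.0
```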

New Capabilities

Calculating data like the above is tedious, but conceivably doable. But there are some things that are almost impossible to do yourself without the right tools. One of them is to create thematic maps of your results, like those red-blue election maps. Most people create those with ArcGIS, but if you don’t have or can’t use it, or don’t have a graphic designer on call, making one can be almost impossible. I sure didn’t know how to make them.

Using ArcGIS to make a simple thematic map is like using a tactical nuclear weapon to get rid of the spider in your bath tub. That’s why I built it right into the system, letting you render almost any of those Analyze queries directly to a thematic map. We do that in the app on the Map tab, which is similar to Analyze but gives you some other options. Let’s just map our same query for all US metros:

In blues, we see places where the percentage of people with graduate degrees increased, in reds those where it actually decreased. I could have picked my own thresholds for coloring, but decided to go with one of the built in algorithms, in this case a 5 bucket sort. This took about 30 seconds total to create by the way, so don’t think that just because I filed this under “new capability” it wouldn’t save you lots of time too, even if you already have and can use ArcGIS.
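If you are curious what a bucketing algorithm for map colors can look like, here is a small Python sketch of one common approach, quantile classes, where each bucket holds roughly the same number of places. This is an assumption about what a five bucket scheme might do, not a description of Telestrian’s built in algorithm. The values and the red-to-blue palette are arbitrary.

```python
def quantile_breaks(values, buckets=5):
    """Return the upper bound of each bucket so buckets hold similar counts."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[max(0, (n * (i + 1)) // buckets - 1)] for i in range(buckets)]


def bucket_of(value, breaks):
    """Index of the first bucket whose upper bound is at least the value."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1  # anything above the last bound goes in the top bucket


# Arbitrary five-step palette running from red to blue.
PALETTE = ["#b2182b", "#ef8a62", "#f7f7f7", "#67a9cf", "#2166ac"]


def color_for(value, breaks):
    """Map a value to a palette color via its bucket."""
    return PALETTE[bucket_of(value, breaks)]


# Toy percentage point changes for ten places.
changes = [-1.0, -0.5, 0.2, 0.8, 1.1, 1.5, 2.0, 2.5, 2.9, 3.5]
breaks = quantile_breaks(changes)
print(breaks)
print(color_for(-1.0, breaks), color_for(3.5, breaks))
```

Note that quantile classes do not guarantee the red/blue split falls at zero. If the sign matters, picking your own thresholds is the safer choice.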

By the way, these maps are image files (PNG), not Flash, so you can actually right click and save them to use them as you see fit. And you can make them pretty much as big or small as you want with no resolution loss or distortion. To see an example of what I mean by that, just click here.

Make More Money

This one also saves time and gives you new capabilities, but additionally it enables consultants to make more money too. Cities and states spend hundreds of millions of dollars on human capital and “brain drain” initiatives. But frankly very few places have much of a clue about their human capital networks. Where do people who move in come from? Where do people who leave go? How much money and how big of families are they taking or leaving?

A big problem is the data. The Census Bureau only publishes net migration, but doesn’t talk about where people come from or go to. The IRS publishes that in its migration data, but it is super painful to use. For one thing, other than the last handful of years, the data only comes in the form of over 3,500 Excel spreadsheets. (They will mail those to you on a CD for $500). And the data only tracks state-state and county-to-county when often what we really care about is metro-metro or metro-state. Unless you have and can use (sparse matrix, anyone?) a tool like SAS (which is thousands of dollars a year and doesn’t come with any data) and crack the code on data import, it’s virtually hopeless.

But with Telestrian, all that data has been processed for you, and presented not just at the county-county and state-state level, but also at the metro-metro, and metro-state level. And there are tons of summary metrics taken from the IRS files, as well as other bespoke calculations of things like migration rates and intra-metro migration (e.g., core to suburb moves). Over 100 items in all.
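As a rough illustration of what that processing involves, here is a Python sketch that rolls county-to-county flows up to metro-to-metro flows and separates out moves within the same metro. The county names, the crosswalk, the numbers, and the choice of fields (tax returns and income) are assumptions for the example, not the actual IRS file layout or Telestrian’s code.

```python
from collections import defaultdict

# Hypothetical county-to-metro crosswalk.
COUNTY_TO_METRO = {
    "Core County": "Metro X",
    "Suburb County": "Metro X",
    "Far County": "Metro Y",
}

# Toy rows of (origin county, destination county, returns, income in $000s).
FLOWS = [
    ("Core County", "Suburb County", 120, 6000),
    ("Core County", "Far County", 30, 1800),
    ("Far County", "Suburb County", 45, 2700),
    ("Suburb County", "Far County", 10, 500),
]


def metro_flows(flows):
    """Aggregate county flows into metro-to-metro and intra-metro totals."""
    between = defaultdict(lambda: [0, 0])  # (origin metro, dest metro) -> totals
    within = defaultdict(lambda: [0, 0])   # metro -> totals for intra-metro moves
    for origin, dest, returns, income in flows:
        metro_from = COUNTY_TO_METRO[origin]
        metro_to = COUNTY_TO_METRO[dest]
        target = within[metro_from] if metro_from == metro_to else between[(metro_from, metro_to)]
        target[0] += returns
        target[1] += income
    return dict(between), dict(within)


between, within = metro_flows(FLOWS)
out_returns = between[("Metro X", "Metro Y")][0]
in_returns = between[("Metro Y", "Metro X")][0]
print(between)
print(within)
print("Net returns gained by Metro X from Metro Y:", in_returns - out_returns)
```

The real job is the same idea applied to thousands of spreadsheets, which is where the pain comes from.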

Want to know where the money is going when it leaves Atlanta and how much of it ends up there? Here you go, looking at 2000-2008:

Of course this data is available in raw form, exportable to Excel if you want it. Again, it’s about 30 seconds or so to make this.

This only scratches the surface of what you can do with migration. I hope it is easy to see that there are huge market opportunities for consultants to use this to start helping cities and states map out their human capital networks and find ways to take advantage of them. Much more on this later.

So What Is Telestrian?

So what does Telestrian actually do? A full feature summary is available for your perusal, but in brief, Telestrian provides the following.

  • Data Repository. It contains an aggregated data repository of over 600 data elements, including core data such as population, sex, age, race, migration, education, immigration, commuting, highway congestion, health data, labor force and unemployment, jobs and wages, GDP, personal and household income, poverty, and more. I consider this a “starter set” and there’s virtually unlimited room to expand, which I have big ambitions to do.
  • Common Analysis Toolset. Run parameterized queries to mine the data and analyze result sets. Includes things like filtering by state or population; applying functions like percent change, total change, or location quotient; and calculating CAGR, index values, percentage of parent, and much more.
  • Task Automation. In addition to automatically applying functions like the above, Telestrian also automatically applies rollups of regions, allows saving of commonly used geography lists so you don’t have to recreate them over and over, defining custom regions, etc. The various components of the system are also integrated to enable rapid end to end processing.

  • Visualization. Render results to bar, column, area, line, and pie charts (Flash or image), or export to Excel/CSV or HTML tables. Thematic maps can be made at the national level for states, counties or metros, or at the state level for counties.
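For anyone unfamiliar with CAGR (compound annual growth rate), here is a minimal Python sketch of the standard formula: the ending value divided by the starting value, raised to one over the number of years, minus one. The function name is my own, not Telestrian’s. The example figures are the Washington metro degree-holder counts from the table earlier in this post.

```python
def cagr(start, end, years):
    """Compound annual growth rate as a fraction (0.05 means 5% a year)."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Washington metro graduate/professional degree holders, 2000 to 2009.
rate = cagr(607122, 820534, 9)
print(f"{rate:.2%} per year")
```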

The focus of the system is data about cities, counties, regions (MSA, CSA, EA, etc), and states, though national level data is also available.

Pricing is currently on an annual basis at only $495/year (a bit over $40 a month). But for a limited time to my loyal readers who work for organizations who might be able to use this, I am offering it at $395/year (less than $35/month). If you use it for one project like that grad degree one, it has already paid for itself. I might offer a monthly plan in the future, but it will be at a price premium to the annual, and not include access to IRS data. A free trial is available with no credit card required and no obligation so you can try it for yourself without risk. IRS data is not included in the trial.

Consider that just to have the IRS send you their raw data on a CD – in the form of over 3,500 spreadsheets – is $500 by itself. You’d pay well over $300/year just for GIS-free mapping with something like Indiemapper. To say nothing of the untold thousands you could spend on high end products.

For those of you who work as consultants, planners, journalists, analysts, economic developers, agency staffers, etc. who work with this data and need to do more than just look up simple facts, I’d ask you to take a look, and if you see the value – which I’m confident you will – please buy.

Since this is the official launch day, I’d ask that you please be gentle if we run into performance or other type issues right here at the start. I will increase site capacity as fast as I can if need be, and of course candid feedback is always welcome. Again, the link is www.telestrian.com.

I’ll wrap up with a couple more fun examples, but before I do I want to tell you a few problems Telestrian is NOT designed to solve:

  • If you need statistical analysis like multi-variate regressions, you need SAS or SPSS or something.
  • If you need data at the zip code, Census tract, or other level below city or county, you need tools from ESRI or one of the many specialist providers who will help you decide where to locate your store or whatever else you need.
  • If you need to look at detailed breakdowns like jobs at the 4-digit NAICS code or black-female-and-hispanic, look for something like Moody’s Economy.com.
  • If you need to know the unemployment rate the minute it hits the wire, get a Bloomberg terminal.
  • If you need non-US data, again go get Moody’s Economy.com.

If you have problems like these that involve very detailed, complex, or time sensitive considerations, I’m sorry. You probably do need to spend a lot of money and hire some specialists.

Fun With Data

Here are a couple more fun pieces of data analysis.

First, a comparison of job growth in New York vs. Chicago vs. the US. I actually go through how to do this example in the Telestrian User’s Guide (yes, the system actually has documentation).

This is a great example of how you can query data at any geography level simultaneously if the data supports it, and the use of indexes for comparison of regions with very different sizes. If you’re familiar with the Current Employment Statistics, you’ll also know that the US data and Metro data come from two separate data sets, but I allow you to query them together.
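If indexing is new to you, the idea is simple: divide every value in a series by the value in a base period and multiply by 100, so every region starts at 100 and you compare growth paths rather than raw sizes. Here is a minimal Python sketch. The function name and the job counts are made up for illustration and are not actual Current Employment Statistics figures.

```python
def index_series(values, base_position=0):
    """Rescale a series so the value at base_position equals 100."""
    base = values[base_position]
    if base == 0:
        raise ValueError("base value must be non-zero")
    return [100 * value / base for value in values]


# Toy job counts (thousands) for a big region and a smaller one.
big_region = [8000, 8080, 8160]
small_region = [4000, 4100, 4200]

print(index_series(big_region))    # [100.0, 101.0, 102.0]
print(index_series(small_region))  # [100.0, 102.5, 105.0]
```

On raw numbers the big region added more jobs, but the indexed series shows the smaller region grew faster.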

By the way, I created every single chart in my Chicago vs. New York blog post from last fall combined in about five minutes using a development version of the system. If you at all benchmark or compare cities, I think you’re in the sweet spot of the product. That’s doubly true if you compare places at different geographic levels (such as metro vs. nation or county vs. state, etc) since Telestrian puts no arbitrary restrictions on what geographies you can query together.

One more. Here’s a national county map of unemployment rates for October 2010 (not seasonally adjusted):

It’s a cool graphic, but I especially posted it because data visualization guru Nathan Yau wrote a long blog post at his widely read blog Flowing Data that explained how to create a map almost like this in 14 easy steps – easy if you know how to program in Python that is. As he put it, “There are about a million ways to make a choropleth map. You know, the maps that color regions by some metric. The problem is that a lot of solutions require expensive software or have a high learning curve…or both.” Yau’s solution requires you to know how to write computer software. Telestrian is almost de minimis in cost to any real organization and only requires you to know how to surf the net. With that, about 30 seconds later you can have your map.

You can also check out my recent metro GDP post, or my Chicago Census post, which used this system to power the data analysis.

Thanks so much for reading and I hope you’ll check it out and decide to buy – remember, it’s www.telestrian.com. It’s a great way to support the work I do – but much more importantly I’m confident the business value is very real and significant because I’m enjoying it every day myself.

Topics: Demographic Analysis, Economic Development, Technology, Transportation

11 Responses to “A Better Way to Find, Look At, Analyze and Display Civic Data”

  1. Mordant says:

    Very cool – good luck with it!

  2. Chris Hawley says:

    Buffalo MSA had the second highest growth rate in the proportion of adults with graduate or professional degrees from 2000 to 2009, behind only Washington, DC, and ahead of Boston and Portland?? STUNNING.

  3. Greg says:

    Yea I thought that Buffalo stat was surprising too!

    Lol I’m going to guess it’s due to the fact that Buffalo has an overall decreasing population. That way, it would affect the numbers in the way it did.

  4. I thought that was interesting too. In fact, when I first saw it, it made me wonder if I had an error. I didn’t hand recalculate all the data, but I did go back and add up Buffalo’s total number of graduate and professional degrees in 2000, and got the number I expected.

    Keep in mind, the ACS values have a much lower sample size than Census 2000. This means there is a greater margin of error. Most people seem to report simply the headline number. That’s what Brookings did, and if it’s good enough for them, it’s good enough for me. But there’s a lot of year to year variability with the ACS. In the 2008 ACS Brookings used for the original state of metro America, Indy ranked #4 on this metric for total college degrees. Had the 2009 data been used, Indy would have ranked 19th. Metrics are only as good as the underlying data.

    Keep in mind that there are many ways to slice this data. For example, Buffalo only ranked 24th out of the 100 largest metros on percent attainment for grad and professional degrees in 2009.

    You may recall, however, that in my recent GDP post, Buffalo made the top 10 (out of 52) metros with over 1 million people for its percent increase in per capita real GDP. This would foot with an increase in college degrees IMO.

  5. David says:

    This looks to be a great tool. Is the current version incorporating the 2010 Census data that was recently released? If not, when will that data be incorporated? How about for the full release of 2010 Census data?

    Thanks

  6. David, I have all of the total population for US states. I also have the redistricting data that has been released. I’m generally doing a sweep of the states that come out at the end of the week. This is county and municipal data, but can be rolled up to MSA if all the counties are available. It includes population, race (I haven’t yet done all the multi-race groups, however), and Hispanic origin. I plan to bring in the data as it is released.

  7. Alon Levy says:

    You could use the migration data to see if the emigrants from the Buffalo MSA are unusually poor, which would explain this. As far as I can tell from eyeballing the table, they’re not, which would indicate real positive change.

  8. Chris Barnett says:

    Note that the difference among the top 10 cities for grad/prof degrees is negligible; every “change” value on that table rounds to 3%.

    This is partly a demographic phenomenon: as the Depression/WW2 generation dies off, better-educated Boomers, Xers, and Millennials make up a larger portion of the population. Every major metro on the map showed an increase, too, generally in the 1-3% share increase range. Separating out a “real” change that could be attributed to some non-demographic factor or another would take some more-sophisticated analysis.

    Looking at a really stark difference: I’d love to understand how Muncie (home of Ball State) and Terre Haute (ISU and Rose-Hulman) LOST share of population with advanced degrees. Students, professors, and staff represent significant numbers in those cities.

    One would have to assume that the transient student population was about the same over the decade (i.e. a constant inflow of students and outflow of newly-minted grad degrees) so the change in percentage would measure the “base” of permanent residents, including professors and university staff with advanced degrees…which shrank. How does that happen in a small city with a good-sized university base?

  9. Chris, wrt Muncie, lots of undergrads don’t count as they don’t have advanced degrees. The population base here is people over age 25. How many people over age 25 with advanced degrees are sticking around Muncie these days unless they teach at Ball State?

  10. John Morris says:

    Chris Hawley,

    Chris Barnett almost certainly solved that puzzle. In both Pittsburgh and Buffalo, a huge factor is less the gaining of the highly educated than the loss of the older generation of less educated.

    Pittsburgh is in a very rapid transformation from a very old city to one dominated by the young.

  11. Chris Barnett says:

    Aaron, I meant that BSU produces grad degrees and wanted to hold off the “of course they’re a grad degree exporter” argument. The flow of people entering Muncie with undergrad degrees and leaving with grad degrees is a constant and wouldn’t impact the “base” level of “percentage of residents with a grad degree”.

    But you got my point: people other than newly-minted degree-holders are leaving Muncie (and Terre Haute) with their grad and professional degrees. That seems counterintuitive since it’s not also happening in South Bend, West Lafayette, and Bloomington.

    South Bend is the most like TH and Muncie: it has suffered mightily in the exodus of auto-related manufacturing of the past 30 years. I would have expected to see a decline in grad/prof degree percentage there too.

    But perhaps South Bend’s decline is tempered by two factors: close enough to Chicago for a commute, and home of Indiana’s one major “national” university. Those would serve as grad-degree magnets. Or perhaps all the poorly-educated have moved out, to nearby Elkhart, Mishawaka, and Warsaw where there are jobs making RVs and orthopedic implants. Or perhaps a little of each.

About the Urbanophile

Aaron M. Renn is an opinion-leading urban analyst, consultant, speaker, and writer on a mission to help America’s cities thrive and find sustainable success in the 21st century.

Copyright © 2006-2014 Urbanophile, LLC, All Rights Reserved - Copyright Information