Iron Viz Geospatial: Building Invisible Walls

This post recounts my experience building my first-ever Iron Viz feeder contest submission. View my complete viz on Tableau Public: INVISIBLE WALLS: The Reality of Racial Segregation in America.

You can cast a vote for my viz by using the hashtag #GeoIronVizMSoares on Twitter.

View all the Iron Viz Geospatial submissions on the Tableau Public blog.

 

Finding a story to tell

The first order of business was to decide on a topic for my viz and to find the appropriate data and shapefiles. I started exploring open city data, inspired by Tableau product manager Kent Marten’s examples in his recent post on the Tableau blog.

After trying out a few shapefiles of New York City population, I thought it could be interesting to compare population and density across the largest cities in North America: Mexico City, New York, Los Angeles, Toronto, Chicago. But I soon realized that obtaining comparable data from each city was going to be problematic; the available data was not necessarily in the same format, at the same level of granularity, or even in the same language. Mexico City didn’t appear to have an easily accessible public data library. And the Statistics Canada website just happened to be down at the same time as this feeder contest was announced.

So, for about a week I bounced ideas around while searching for other sources of data and shapefiles. I knew I needed a good story, but for many of the shapefiles I found, I couldn’t create an interesting story, and for some story ideas I thought could be interesting, I couldn’t find relevant shapefiles.

With about a week left before the contest deadline, I returned to my initial ideas about cities and population. I downloaded a shapefile from the U.S. Census Bureau that contained population data for every census tract in the country. I was able to import this mass of data into Tableau without issue, but it slowed the program to a crawl: there were just too many rows to map and explore all at once. Still, this led me to my ultimate Iron Viz story.

This census dataset contained racial and ethnic population breakdowns for each census tract. Seeing this data triggered a memory of a fascinating visualization I had seen a few years back: the Racial Dot Map by Dustin Cable. In this interactive map, every person in the United States is represented as a dot coloured according to their race or ethnicity. The intriguing result is the ability to see how different races are often clustered into ethnic neighbourhoods within the same city.

 

Chicago, as shown in Dustin Cable’s Racial Dot Map

 

Dustin Cable’s Racial Dot Map was inspired by the work of Eric Fischer, Brandon Martin-Anderson, and Bill Rankin before him. So, in the same spirit, I took this as inspiration for my Iron Viz submission: I decided to tell a story about racial segregation in the cities of America.

 

Compiling the data

As I discovered from my first try working with U.S. Census data, it would not be feasible to use a single data source. But if I was going to focus on a few major cities, I didn’t need data for the entire country anyway. Thus, I found my way to the U.S. Census Bureau’s TIGER/Line Shapefiles and American FactFinder.

For the shapefiles, I downloaded the county file of census blocks for each of the cities in my viz. In the case of a city spanning multiple counties, I downloaded the state file instead. In Tableau 10.2, it’s not possible to combine (or “union”) multiple shapefiles into a single data source, so it was necessary to have a single shapefile for each city.

The data comes from the 2010 Census Summary File 1, Table P5, which contains data on race and ethnicity (which for the purposes of the census are two distinct characteristics). It took a few tries to figure out how to use the American FactFinder tool, but once I got the hang of it, compiling a dataset for each city was relatively straightforward.

In Tableau, a simple join on the census block ID gave me a data source for each city.
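To illustrate what that join does outside of Tableau, here is a minimal Python sketch. It assumes an inner join (only blocks present in both sources are kept), and the field names (`block_id`, `geometry`) and sample rows are hypothetical stand-ins, not the actual column names in the Census files.

```python
def inner_join(shapes, population, key="block_id"):
    """Join two lists of dicts on a shared key, keeping only matching rows."""
    by_id = {row[key]: row for row in population}
    joined = []
    for shape in shapes:
        match = by_id.get(shape[key])
        if match is not None:
            # Merge the shape attributes with the population counts.
            joined.append({**shape, **match})
    return joined

# Hypothetical sample rows: one list from the shapefile, one from the census table.
shapes = [
    {"block_id": "0001", "geometry": "polygon-1"},
    {"block_id": "0002", "geometry": "polygon-2"},
    {"block_id": "0003", "geometry": "polygon-3"},
]
population = [
    {"block_id": "0001", "White": 120, "Black": 300},
    {"block_id": "0002", "White": 80, "Black": 20},
]

city = inner_join(shapes, population)
```

The IDs are kept as strings here so that any leading zeros survive the match.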

I created a few calculated measures to give me the percent of the total for each racial/ethnic category (i.e. White, Black, Hispanic, Asian, Other), and another calculated field to return the majority:

Tableau calculation for majority ethnicity

For each map in Tableau, I dropped Majority on the Color shelf to colour each census block according to its majority population.
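The calculation itself appears above only as a screenshot. As a rough illustration of the logic, here is a minimal Python sketch of the percent-of-total and majority steps. It assumes that "majority" means the largest single group in a block (a plurality), which may differ from the actual Tableau calculation; the function names and sample counts are hypothetical.

```python
# Categories as named in the post.
CATEGORIES = ["White", "Black", "Hispanic", "Asian", "Other"]

def percent_of_total(counts):
    """Return each category's share of the block's population, in percent."""
    total = sum(counts.get(c, 0) for c in CATEGORIES)
    if total == 0:
        return {c: 0.0 for c in CATEGORIES}
    return {c: 100.0 * counts.get(c, 0) / total for c in CATEGORIES}

def majority(counts):
    """Return the category with the largest share, or None for an empty block.

    Assumption: the largest single group wins (a plurality); ties go to
    the category listed first in CATEGORIES.
    """
    shares = percent_of_total(counts)
    if all(share == 0.0 for share in shares.values()):
        return None
    return max(CATEGORIES, key=lambda c: shares[c])

# Hypothetical counts for one census block.
block = {"White": 120, "Black": 300, "Hispanic": 60, "Asian": 15, "Other": 5}
shares = percent_of_total(block)
label = majority(block)
```

Returning `None` for unpopulated blocks keeps them from being coloured as if one group dominated.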

With the data all sorted out, I proceeded to refine the design, analysis, and storytelling of my viz.

 

Building the framework

I played around with a few different formats for my viz: Should I use a single column? A grid? Maybe an actual Tableau story? As an Iron Viz submission, I wanted it to be visually impactful, and so I thought it should fill the screen. I wanted it to have the feel of an infographic or poster. And with geospatial being the central theme, I wanted the maps to be front-and-centre, not an accessory to a bunch of other charts and tables. A grid layout seemed to be the most natural choice, so the question then was how many cities would be enough?

Though I started with only New York, Los Angeles, Chicago, and Houston (the 4 largest cities in the U.S. by population), I realized that in order to achieve the impact and scale that I envisioned for this viz, I would need to cover a lot more cities. So, I started doing research on racial segregation in the U.S. and came across several different articles and studies about the most segregated cities in the country. As a result, I started adding to my list: Philadelphia, Baltimore, Milwaukee, Atlanta, St. Louis. In the end, I had 13 cities visualized in my Tableau workbook. In order to create an even 4-by-3 grid, I had to drop one city (sorry Detroit).

 

Designing the details

As this was my first-ever Iron Viz submission, I didn’t want to leave anything to chance. Thus, I obsessed over every detail of my viz: font, colour, spacing, size. I even deliberated over whether I should use the American spellings of “color” and “neighborhood”. Here are just some of the most important design details:

  • Colour: I used the same colours as Dustin Cable’s racial dot map, i.e. blue for White, green for Black, orange for Hispanic, and red for Asian. However, I chose lighter, more muted shades from Tableau’s “Summer” palette.
    The Summer colour palette in Tableau
  • Legend: The default Tableau colour legend was far too inflexible for the design I wanted. Instead, I created a custom legend using dashboard layout containers for the top and bottom of my viz.
    Colour legend for my viz
    When I shared an initial draft with some of my colleagues, they raised a concern that readers may forget the colour association once they reached the middle of the viz and the legend was out of sight. To help alleviate this problem, I inserted a “mini-legend” between each row of maps to provide a constant reference.
    Mini colour legend for my viz
  • Maps: In order to eliminate as much distraction as possible, I turned off all map layers so that only the data is visible. I love how the geography of the city is still discernible from coloured census blocks, even without a background. To me, this is just an example of maximizing the data-ink ratio, à la Edward Tufte.
    The only exception is Houston, for which I left the “streets and highways” layer enabled. This is because as I was researching the population patterns of Houston, I found that major highways running through the city have heavily shaped adjacent ethnic neighbourhoods. So, showing the roads makes this relationship more evident.
    Map of Houston

Telling the story

Once all the maps were done, the final step was weaving a coherent story throughout the viz. I decided to write an introduction at the top and a few comments below each city map. I researched each city using Wikipedia and other sources in order to understand the racial segregation patterns that were apparent in the viz. It was interesting to learn how certain cities have been shaped by historical policies, others by immigration, and others by geography and infrastructure. Ultimately, I hope every person who reads this viz will learn something new.

 

Building this viz over the past couple of weeks has been an exciting yet exhausting experience. It has required the application of my entire repertoire of data visualization knowledge and Tableau skills. My goal, of course, was to create a viz worthy of being selected as an Iron Viz finalist, and I can only hope that I have done so.

View my completed viz on Tableau Public: INVISIBLE WALLS: The Reality of Racial Segregation in America.

Makeover Monday: Employment Growth in the G7

The subject of this week’s Makeover Monday is a pair of pie charts displaying employment data from the G7. The charts appear in an article on Business Insider, but the original source is a report by the White House Council of Economic Advisers.


These charts are used to support the conclusion that the United States has been responsible for a disproportionate share of employment growth in the G7. Since 2010, the US has generated 55% of net employment growth, although it accounts for about 42% of total G7 employment.

What works well?

  • Pies show the composition of the whole, which is appropriate given that we’re dealing in percentages
  • Each country is given the same colour in both charts
  • Numerical labels make comparison easier, rather than having to compare the size of each wedge

What could be improved?

  • The pies prioritize showing the composition of the whole, but the focus of the analysis is on the comparison between the measures
  • The slices of the pie are ordered alphabetically; sorting by size would make it easier to see how the countries rank relative to each other

I decided to take a minimalist approach to my #MakeoverMonday viz and represent this data with a slope graph. I was inspired by the work of Edward Tufte, who is credited with inventing the slope graph. I also aimed to maximize the data-ink ratio—a principle championed by Tufte—by eliminating non-data ink. Here is the result:

Makeover Monday: Scottish Index of Multiple Deprivation

This week’s #MakeoverMonday dataset presented a lot of options for visualization. The data is from the Scottish Index of Multiple Deprivation (SIMD), a national report that scores and ranks the relative “deprivation”, or poverty level, across Scotland. The population is divided into “datazones”, which are each associated with a local authority. Each datazone is evaluated on seven different aspects: employment, income, health, education, crime, housing, and geographic access to services. These scores are then combined into an overall deprivation index. The objective of the SIMD is to identify the areas in Scotland suffering from deprivation in multiple aspects.

In the 2012 SIMD report, the following “barcode chart” is used to present the level of deprivation in each local authority:

Scottish Index of Multiple Deprivation by local authority

What works well?

  • Using the bars allows a lot of information to be encoded into a compact graphic
  • Dense clusters of bars make it easy to spot regions of concentrated deprivation

What could be improved?

  • The local authorities are sorted alphabetically. It may be better to sort by level of deprivation.
  • The datazones are plotted by ranking (1 to 6,505), but this does not allow for comparison based on deprivation score. Most likely, the level of deprivation is not linear along the ranking.

Here is my version, improving upon the original barcode chart. After playing around with circle marks, boxplots, and other forms of viz, I decided to keep it simple and make incremental improvements:

  • Local authorities are sorted by Local Share of Deprivation. Those at the top have a higher percentage of their datazones among the 15% most deprived in Scotland.
  • Bars are plotted according to the overall SIMD score, not the ranking. This makes the relative levels of deprivation more apparent.
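To make the sorting measure concrete, here is a minimal Python sketch of how a local share of deprivation could be computed from national ranks. It assumes rank 1 is the most deprived datazone and that the 15% cutoff is rounded up to a whole number of datazones; the authority names and ranks are made-up sample data, not SIMD figures.

```python
from collections import defaultdict

def local_share(datazones, percent=15):
    """Percent of each authority's datazones that fall within the most deprived cutoff.

    datazones: list of (local_authority, national_rank) pairs, rank 1 = most deprived.
    """
    # Ceiling division in integers; rounding up is an assumption of this sketch.
    cutoff = (len(datazones) * percent + 99) // 100
    totals = defaultdict(int)
    deprived = defaultdict(int)
    for authority, rank in datazones:
        totals[authority] += 1
        if rank <= cutoff:
            deprived[authority] += 1
    return {a: 100.0 * deprived[a] / totals[a] for a in totals}

# Hypothetical sample: 20 datazones across three authorities.
sample = (
    [("A", r) for r in (1, 2, 10, 11)]
    + [("B", r) for r in (3, 4, 5, 6, 7, 8, 9, 12)]
    + [("C", r) for r in range(13, 21)]
)

shares = local_share(sample)
# Sort authorities from highest to lowest local share, as in the chart.
ordered = sorted(shares, key=shares.get, reverse=True)
```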

Makeover Monday: U.S. National Debt

This week’s #MakeoverMonday features a really simple dataset: the size of the U.S. national debt ($19.5 trillion) compared to that of the rest of the world combined ($39 trillion).

I aimed to maintain simplicity in my visualization, representing the debt in blocks of $500 billion:
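As a quick check of the arithmetic behind that block size, here is a small Python sketch using the two debt figures quoted above, expressed in billions of dollars. The function name is just for illustration.

```python
BLOCK_BILLIONS = 500  # size of one block, in billions of dollars

def blocks(debt_billions, block=BLOCK_BILLIONS):
    """Number of whole blocks needed to represent a debt figure."""
    return debt_billions // block

us_blocks = blocks(19_500)     # $19.5 trillion
world_blocks = blocks(39_000)  # $39 trillion, the rest of the world combined
```

At this block size the U.S. debt takes exactly half as many blocks as the rest of the world combined.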

Canadian Federal Election Results

It’s been one year since Justin Trudeau and the Liberal Party of Canada swept into power. Here’s the viz I created last year to explore the results of the most recent Canadian federal election:

Makeover Monday: Clinton vs. Trump U.S. Election Viz

This week’s #MakeoverMonday viz takes a look at polling results in each state leading up to the U.S. Presidential election. The original viz is this interactive “Votamatic” tool at Daily Kos.

Daily Kos polling results tracker

This is honestly an awesome visualization. The filter allows you to view the results for each state individually. And the interactive chart lets you hover and view the polling numbers at any point during the race.

So, I decided to look at the data in a slightly different way.

Here’s my Tableau #ElectionViz:

The History of the NHL

With the start of a new NHL season upon us, I thought I would visualize the historical performance of each NHL team. I used a format similar to my NHL Barcode Viz, except this time it’s not binary; rather, it charts Points Percentage above and below the .500 mark.

I took my inspiration from similar vizzes created by Chris Jones (MLB Franchise Performance) and Matt Chambers (The History of the NFL). I also took some pointers from Andy Kriebel in this blog post.

I have focused on seasons from 1967 to the present, i.e. the NHL’s “Expansion Era”, as only the Original Six teams were in existence prior to ’67. The franchises are ordered alphabetically within their current divisions, and with their current team names. Winning seasons are shown in the team’s colour, while losing seasons are in grey.
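For readers unfamiliar with the measure, here is a minimal Python sketch of the calculation behind each bar. It assumes points percentage is points earned divided by the maximum possible (two points per game), which is the common definition; the source data may compute it differently. The colour value and season figures are hypothetical, and treating a season at exactly .500 as grey is an assumption.

```python
def points_percentage(points, games_played):
    """Share of available points earned, assuming two points are on offer per game."""
    return points / (2 * games_played)

def deviation_from_500(points, games_played):
    """Distance above (positive) or below (negative) the .500 mark."""
    return points_percentage(points, games_played) - 0.5

def season_colour(points, games_played, team_colour):
    """Team colour for a winning season, grey otherwise.

    Assumption: a season at exactly .500 is treated as non-winning.
    """
    return team_colour if deviation_from_500(points, games_played) > 0 else "grey"

# Hypothetical 82-game seasons.
strong = season_colour(100, 82, "#003E7E")
weak = season_colour(70, 82, "#003E7E")
```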

Makeover Monday: Public Transit Satisfaction Survey

Here’s my first attempt at a Makeover Monday viz.

This week’s featured graphic is from the Financial Times. It displays the results of a public transit satisfaction survey conducted across Europe. Respondents were asked to rate their satisfaction with public transit in their city.

Satisfaction with public transport - Financial Times

So how can we improve this viz? Let’s find out.

Here’s my version:

What improvements did I make?

  • Taking a hint from Zen Master Andy Kriebel, I centred the bars around a central axis, showing positive sentiment to the right and negative to the left. This makes it easier to judge the overall response.
  • I used an orange-green colour palette, with positive responses in green and negative responses in orange.
  • Instead of a traditional colour legend, I used colour-coded text labels along the top of the chart. I also aligned the city labels immediately beside the data bars. And I added data labels on the bars, instead of using a horizontal axis. The intent of these changes is to make the chart easier to read by more directly labeling the data.
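The centred layout in the first point can be sketched in a few lines of Python. This is only an illustration of the positioning logic, not how Tableau builds the chart; the response labels and percentages are hypothetical, and each list is assumed to be ordered from the response nearest the axis outward.

```python
def diverging_segments(negative, positive):
    """Return (label, start, end) x-positions for a bar centred on zero.

    negative / positive: lists of (label, percent), ordered from the
    segment nearest the central axis outward.
    """
    segments = []
    left = 0.0
    for label, pct in negative:
        # Negative responses stack leftward from the axis.
        segments.append((label, left - pct, left))
        left -= pct
    right = 0.0
    for label, pct in positive:
        # Positive responses stack rightward from the axis.
        segments.append((label, right, right + pct))
        right += pct
    return segments

# Hypothetical responses for one city.
city_bar = diverging_segments(
    negative=[("Somewhat dissatisfied", 10.0), ("Very dissatisfied", 5.0)],
    positive=[("Somewhat satisfied", 50.0), ("Very satisfied", 35.0)],
)
```

Because every bar shares the same zero point, the total length on each side can be compared directly across cities.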

All-Time Summer Olympic Medals

Prior to this past summer’s Olympic Games in Rio, I came across an interesting graphic from Reuters titled Precious Medals. It displays the all-time medal standings of the Summer Olympics, and allows you to drill into each country to view its performance over time.

All-Time Summer Olympic Medals: Reuters Graphics

I was impressed by how simple yet sophisticated this presentation is. The clean bar charts and appropriate use of colour keep the viz uncluttered. Yet, the interactivity allows a lot of information to be embedded within the graphic.

Naturally, I decided I would try to replicate this visualization myself with Tableau.

The result of my efforts is below:

NHL Barcode Viz

The “barcode” chart has been used to great effect for visualizing the performance of sports teams over the course of a season. After seeing the work of Peter Gilks (BallCode) and Craig Wortman (MLB Bar Code Chart), I decided to take a run at creating my own barcode viz using the results of the most recent NHL season.

If barcodes can work for basketball and baseball, then why not hockey?

My interactive viz below represents the final standings of the 2015-16 NHL regular season.

The barcode chart shows the win-loss record of each team. I used custom shapes to add the team logos and a custom colour scheme for the barcodes.

I’m generally a proponent of using colour in a chart only when it serves a purpose. In this case, the logos and corresponding colours enable any hockey fan to easily find a team based on these visual cues alone.

To fit my vertical blog format, this version allows you to view one division at a time, which you can choose from the drop-down filter at the top of the viz:


To see all teams at once, check out the full version on Tableau Public.

NHL Standings Barcode Tableau Viz