Takeapart Tuesday: Year-over-Year Change in Google Data Studio

In her recent article for Forbes, Eva Murray writes about How to use simple visual designs to communicate data. She describes how simplicity in design helps make data visualization clearer and easier to understand. One of the examples she showcases is my The World’s Top Exporters of Medicine viz from Makeover Monday Week 8, 2018.

Inspired by Eva’s post and my viz, Charlie Hutcheson wrote a #TakeapartTuesday post detailing his process of replicating my viz in Tableau. I feel honoured to have my work featured by Eva and Charlie, and I recommend that you go read both of their posts:

I created this viz almost a year ago with the intention of showcasing the features of Google Data Studio, specifically the ease of adding year-over-year calculations to a table and bar chart. In this post, I’ll show you exactly how to achieve this without any complicated workarounds.

Let’s go step-by-step through creating this chart in Google Data Studio.

1. Create a new data source

We begin with the data on drug and medicine exports by country (from Makeover Monday 2018 Week 8). This dataset contains 3 columns: exporter (country), year, and exports in USD.

To create the data source in Data Studio, we can either upload the data as a Google Sheet or connect directly to data.world. The resulting schema in Data Studio should look like this:

2. Create a new report

We now add our data source to a new report.

In the report editor, we should first set the canvas size in the Layout panel. My viz uses a width of 600px and height of 500px.

3. Add a new chart

Next, we can add a table with bars. This is a default option in the chart presets menu.

Before we worry about formatting, let’s get the data to display in the way we want.

4. Filter and transform the data

At this point, we see the aggregate exports for all years, as well as the “World” summary row. So, we need to add a filter to our chart to exclude World:

To filter for 2016 only and show the year-over-year change, we will use Data Studio’s built-in date functionality. However, we first need a full date field. Although we have the year available, Data Studio’s date range filters and comparison metrics rely on a full date (i.e. YYYYMMDD).

We need to add a new calculated field in our data source that appends “0101” to the year:

To ensure everything works properly, we need to format both year and date as “Text”. You may think we should use a date format, but trust me, it needs to be text. While we’re here, we should also format the exports metric as currency (USD).

Now we can return to our chart. In the Data panel, first set the date field as the Date Range Dimension. Next, set a custom date range for the chart as Jan 1, 2016 to Dec 31, 2016. Lastly, select “Previous year” as the comparison date range.

These date settings will automatically produce a % change column in our table:

Just like that, all the hard work is done! Everything else is layout and formatting.

5. Apply formatting and style

Let’s start by renaming the column headers to “Exporter” and “Exports in USD (2016)”. This can be done by editing the fields in the chart via the Data panel. We also need to change the “Rows per page” setting to 10.

Moving to the Style panel, let’s give the table header a transparent background, increase the font size of the table labels, make the cell borders transparent, and hide the pagination in the table footer.

Almost there! In the style settings for the exports column, we need to enable “Show number” to display the metric value. We can then select “Compact Numbers” and set decimal precision to 1. We can also change the bar colour using the fill selector.

Our table now looks like this:

Finally, we can add some overall report formatting. We need to insert text boxes for the chart title and footer. The grey stripes at the top and bottom of the viz can be inserted using rectangles. The report font can also be set in the Layout panel; I used “Quicksand” in my original viz.

We’re done!

Although some of my Data Studio work involves what I consider to be “hacks”, the process above requires only standard data source and report functionality. I hope you will try it out yourself and share your thoughts.

If you would like to see a similar step-by-step breakdown of any of my other Data Studio vizzes, let me know which ones!

Designing a Better Tile Map of the United States

Tile maps. A favourite among data viz practitioners and enthusiasts for visualizing geographic data. By representing each region at an equal size, a tile map eliminates the visual bias created by differences in physical area.

Maps of the United States are particularly prone to this bias. On a traditional map, Texas and California are much more visually prominent than, say, Delaware or Rhode Island. If the data presented is unrelated to physical area, this bias makes it difficult to fairly compare states and see patterns.

A traditional map makes it difficult to fairly compare states and see patterns.

A tile map, however, comes with its own challenges. Turning states into equal-area tiles means that compromises must be made when arranging them on a map. The spatial relationships between states cannot always be preserved. With the exception of the Four Corners, state borders are anything but straight lines and 90 degree angles.

There are many variations of a square tile map of the United States. Here are several that I have come across:

Sources: NPR (top L), Bloomberg (top R), FiveThirtyEight (bottom L), New York Times (bottom R)

When I was creating my first U.S. tile map viz about a year ago, I searched for the perfect tile layout. However, each one I found had something about it I didn’t like: North Carolina beside South Carolina; North Dakota beside South Dakota; odd placement of Wisconsin, California, or D.C.

Ultimately, I decided to create my own layout to better respect the geographic relationships of the states, in particular:

  • Washington-Oregon-California along the west coast
  • Washington-Idaho-Montana-North Dakota-Minnesota along the 49th parallel
  • Space between Michigan and New York
  • Washington D.C. adjacent to Maryland and Virginia
  • West Virginia west of Virginia
  • North Carolina above South Carolina
  • North Dakota above South Dakota
  • Texas and Florida extending south
  • Alaska at top left, Hawaii at bottom left

My final tile map looks like this:

I have used this layout in two different visualizations so far:

U.S. Household Income Distribution by State
View in Data Studio
View in Power BI

Let me know if you decide to use this layout in your own work. I would love to see your vizzes!

Creating Jitter Plots in Google Data Studio

Jitter plots (or strip plots) are a method for better visualizing the distribution of a set of data. When plotting a large number of data points, overlapping marks can often be a problem. By introducing a small amount of random noise into the data, a jitter plot slightly disperses the data points to make the overall distribution easier to analyze.

Google Data Studio can easily display a scatterplot, but there isn’t a built-in way to add random noise into a data set. At the moment, Data Studio doesn’t have a “RANDOM” function available for use in calculated fields. However, that just means you have to generate the randomness in your dataset before connecting to Data Studio.

If your data is in a spreadsheet, the “RAND” function in Google Sheets is all you need to power your jitter plot. Simply add a column with =RAND() to generate a random number between 0 and 1.

Once you have a “random” factor as a metric in Data Studio, you can add it to your x-axis metric in order to create the desired effect.

Here’s an example created for Makeover Monday. It shows the results of a survey that asked men and women across 20 countries about their preferences in a romantic partner. By jittering the points, it is easier to see clusters in the data that would otherwise be obscured by overlapping points.

Makeover Monday 2018 Week 2

Click to view interactive viz

Makeover Monday with Google Data Studio

Makeover Monday is a social data visualization project that started within the Tableau community. Over the past 2 years, I have participated in many of the weekly challenges to rework a given chart and dataset. Although I’ve been inactive for several months, I’m taking the new year as an opportunity to get back into completing #MakeoverMonday on a regular basis—but with a new twist. I’m going to use Google Data Studio!

I have become enamoured with Data Studio over the past year for, among other things, the ease with which it lets you connect to data, create visualizations, and share them with others. Of course, it still lacks many of the powerful features that Tableau possesses for data transformation and analysis. But I think there are many aspects where Data Studio comes out ahead. For instance, Data Studio’s drag-and-drop dashboard canvas makes Tableau’s layout containers look archaic.

Datasets for Makeover Monday are usually posted in XLS or CSV format, in addition to TDE files for Tableau. For 2018 however, all the datasets are being hosted on the data.world platform. This will make it easier for users of other tools to participate in Makeover Monday through the integrations available with data.world. For Data Studio, you can connect directly to the data using the data.world Community Connector.

Connecting to data.world in Data Studio

  1. Open the data.world project for the dataset.
  2. Click the “Export” button and select “Google Data Studio”:
  3. Click to “Copy” the dataset URL. Then click “Launch Google Data Studio”:
  4. Paste the dataset URL into the “Dataset or Project URL” field:
  5. Enter a SQL query to select your data. For example:
  6. Click the Connect button at the top right.
  7. Create your viz!

Data for a Cause: Marine Conservation

I recently participated in my first Data for a Cause challenge. In case you haven’t heard of it before, Data for a Cause is an initiative that connects data visualization professionals with non-profit organizations to help bring attention to social and environmental issues.

Each challenge is focused on a specific theme—such as global hunger, public education, or environmental conservation—in collaboration with a particular non-profit organization. Participants are provided with a dataset and a defined objective and tasked with creating a visualization.

For this 9th Data for a Cause Challenge, we were asked to visualize data on international marine conservation for the United Nations Environment World Conservation Monitoring Centre.

I’m happy to say that my viz below earned a 3rd place mention! Read the full results here.

Data for a Cause Marine Conservation

Comment from UNEP-WCMC team: Marc Soares’ image is very elegant, it has a very clear message.

Click to view interactive viz