Takeapart Tuesday: Year-over-Year Change in Google Data Studio

In her recent article for Forbes, Eva Murray writes about How to use simple visual designs to communicate data. She describes how simplicity in design helps make data visualization clearer and easier to understand. One of the examples she showcases is my The World’s Top Exporters of Medicine viz from Makeover Monday Week 8, 2018.

Inspired by Eva’s post and my viz, Charlie Hutcheson wrote a #TakeapartTuesday post detailing his process of replicating my viz in Tableau. I feel honoured to have my work featured by Eva and Charlie, and I recommend that you go read both of their posts:

I created this viz almost a year ago with the intention of showcasing the features of Google Data Studio, specifically the ease of adding year-over-year calculations to a table and bar chart. In this post, I’ll show you exactly how to achieve this without any complicated workarounds.

Let’s go step-by-step through creating this chart in Google Data Studio.

1. Create a new data source

We begin with the data on drug and medicine exports by country (from Makeover Monday 2018 Week 8). This dataset contains 3 columns: exporter (country), year, and exports in USD.

To create the data source in Data Studio, we can either upload the data as a Google Sheet or connect directly to data.world. The resulting schema in Data Studio should look like this:

2. Create a new report

We now add our data source to a new report.

In the report editor, we should first set the canvas size in the Layout panel. My viz uses a width of 600px and height of 500px.

3. Add a new chart

Next, we can add a table with bars. This is a default option in the chart presets menu.

Before we worry about formatting, let’s get the data to display in the way we want.

4. Filter and transform the data

At this point, we see the aggregate exports for all years, as well as the “World” summary row. So, we need to add a filter to our chart to exclude World:

To filter for 2016 only and show the year-over-year change, we will use Data Studio’s built-in date functionality. However, we first need a full date field. Although we have the year available, Data Studio’s date range filters and comparison metrics rely on a full date (i.e. YYYYMMDD).

We need to add a new calculated field in our data source that appends “0101” to the year:

To ensure everything works properly, we need to format both year and date as “Text”. You may think we should use a date format, but trust me, it needs to be text. While we’re here, we should also format the exports metric as currency (USD).

Now we can return to our chart. In the Data panel, first set the date field as the Date Range Dimension. Next, set a custom date range for the chart as Jan 1, 2016 to Dec 31, 2016. Lastly, select “Previous year” as the comparison date range.

These date settings will automatically produce a % change column in our table:

Just like that, all the hard work is done! Everything else is layout and formatting.

5. Apply formatting and style

Let’s start by renaming the column headers to “Exporter” and “Exports in USD (2016)”. This can be done by editing the fields in the chart via the Data panel. We also need to change the “Rows per page” setting to 10.

Moving to the Style panel, let’s give the table header a transparent background, increase the font size of the table labels, make the cell borders transparent, and hide the pagination in the table footer.

Almost there! In the style settings for the exports column, we need to enable “Show number” to display the metric value. We can then select “Compact Numbers” and set decimal precision to 1. We can also change the bar colour using the fill selector.

Our table now looks like this:

Finally, we can add some overall report formatting. We need to insert text boxes for the chart title and footer. The grey stripes at the top and bottom of the viz can be inserted using rectangles. The report font can also be set in the Layout panel; I used “Quicksand” in my original viz.

We’re done!

Although some of my Data Studio work involves what I consider to be “hacks”, the process above requires only standard data source and report functionality. I hope you will try it out yourself and share your thoughts.

If you would like to see a similar step-by-step breakdown of any of my other Data Studio vizzes, let me know which ones!

Creating Jitter Plots in Google Data Studio

Jitter plots (or strip plots) are a method for better visualizing the distribution of a set of data. When plotting a large number of data points, overlapping marks can often be a problem. By introducing a small amount of random noise into the data, a jitter plot slightly disperses the data points to make the overall distribution easier to analyze.

Google Data Studio can easily display a scatterplot, but there isn’t a built-in way to add random noise into a data set. At the moment, Data Studio doesn’t have a “RANDOM” function available for use in calculated fields. However, that just means you have to generate the randomness in your dataset before connecting to Data Studio.

If your data is in a spreadsheet, the “RAND” function in Google Sheets is all you need to power your jitter plot. Simply add a column with =RAND() to generate a random number between 0 and 1.

Once you have a “random” factor as a metric in Data Studio, you can add it to your x-axis metric in order to create the desired effect.

Here’s an example created for Makeover Monday. It shows the results of a survey that asked men and women across 20 countries about their preferences in a romantic partner. By jittering the points, it is easier to see clusters in the data that would otherwise be obscured by overlapping points.

Makeover Monday 2018 Week 2

Click to view interactive viz

Makeover Monday with Google Data Studio

Makeover Monday is a social data visualization project that started within the Tableau community. Over the past 2 years, I have participated in many of the weekly challenges to rework a given chart and dataset. Although I’ve been inactive for several months, I’m taking the new year as an opportunity to get back into completing #MakeoverMonday on a regular basis—but with a new twist. I’m going to use Google Data Studio!

I have become enamoured with Data Studio over the past year for, among other things, the ease with which it lets you connect to data, create visualizations, and share them with others. Of course, it still lacks many of the powerful features that Tableau possesses for data transformation and analysis. But I think there are many aspects where Data Studio comes out ahead. For instance, Data Studio’s drag-and-drop dashboard canvas makes Tableau’s layout containers look archaic.

Datasets for Makeover Monday are usually posted in XLS or CSV format, in addition to TDE files for Tableau. For 2018 however, all the datasets are being hosted on the data.world platform. This will make it easier for users of other tools to participate in Makeover Monday through the integrations available with data.world. For Data Studio, you can connect directly to the data using the data.world Community Connector.

Connecting to data.world in Data Studio

  1. Open the data.world project for the dataset.
  2. Click the “Export” button and select “Google Data Studio”:
  3. Click to “Copy” the dataset URL. Then click “Launch Google Data Studio”:
  4. Paste the dataset URL into the “Dataset or Project URL” field:
  5. Enter a SQL query to select your data. For example:
  6. Click the Connect button at the top right.
  7. Create your viz!

Makeover Monday: Employment Growth in the G7

The subject of this week’s Makeover Monday is a pair of pie charts displaying employment data from the G7. The chart is used in an article on Business Insider, but the original source is a report by the White House Council of Economic Advisers.

chart13_JOBS_employmentgrowthg7

These charts are used to support the conclusion that the United States has been responsible for a disproportionate share of employment growth in the G7. Since 2010, the US has generated 55% of net employment growth, although it accounts for about 42% of total G7 employment.

What works well?

  • Pies show the composition of the whole, which is appropriate given that we’re dealing in percentages
  • Each country is given the same colour in both charts
  • Numerical labels make comparison easier, rather than having to compare the size of each wedge

What could be improved?

  • The pies prioritize showing the composition of the whole, but the focus of the analysis is on the comparison between the measures
  • The slices of the pie are ordered alphabetically; Sorting by size would make it easier to see how the countries rank relative to each other

I decided to take a minimalist approach to my #MakeoverMonday viz and represent this data with a slope graph. I was inspired by the work of Edward Tufte, who is credited with inventing the slope graph. I also aimed to maximize the data-ink ratio—a principle championed by Tufte—by eliminating non-data ink. Here is the result:

Makeover Monday 2017 Week 5

Click to view interactive viz

Makeover Monday: Scottish Index of Multiple Deprivation

This week’s #MakeoverMonday dataset presented a lot of options for visualization. The data is from the Scottish Index of Multiple Deprivation (SIMD), a national report that scores and ranks the relative “deprivation”, or poverty level, across Scotland. The population is divided into “datazones”, which are each associated with a local authority. Each datazone is evaluated on seven different aspects: employment, income, health, education, crime, housing, and geographic access to services. These scores are then combined into an overall deprivation index. The objective of the SIMD is to identify the areas in Scotland suffering from deprivation in multiple aspects.

In the 2012 SIMD report, the following “barcode chart” is used to present the level of deprivation in each local authority:

Scottish Index of Multiple Deprivation by local authority

What works well?

  • Using the bars allows a lot of information to be encoded into a compact graphic
  • Dense clusters of bars make it easy to spot regions of concentrated deprivation

What could be improved?

  • The local authorities are sorted alphabetically. It may be better to sort by level of deprivation.
  • The datazones are plotted by ranking (1 to 6,505), but this does not allow for comparison based on deprivation score. Most likely, the level of deprivation is not linear along the ranking.

Here is my version, improving upon the original barcode chart. After playing around with circle marks, boxplots, and other forms of viz, I decided to keep it simple and make incremental improvements:

  • Local authorities are sorted by Local Share of Deprivation. Those at the top have a higher percentage of their datazones among the 15% most deprived in Scotland.
  • Bars are plotted according to the overall SIMD score, not the ranking. This makes the relative levels of deprivation more apparent.


Makeover Monday 2016 Week 44

Click to view interactive viz