How to measure and visualise racial segregation? Lessons from the Washington Post

The Post’s examination of race and segregation in the US is a definite highlight among recent data journalism projects. We spoke to the reporter behind the story to find out more about its creation.

Few visual projects have tackled the issue of race and segregation as effectively as the Washington Post’s ‘America is more diverse than ever – but still segregated’. The story was widely shared on Twitter, with many posting admiring comments about the visualisations and sharing findings from the story’s interactive map.

To learn more about this complex and extensive project, we caught up with Aaron Williams, a reporter on the Post’s graphics team who authored the story with Armand Emamdjomeh. Williams told us about the inspiration behind the project, why static maps can be better than interactive ones, and why having beautiful graphics is not only a question of aesthetics.

Analysing a changing country

Aaron Williams is a graphics reporter at the Washington Post.

Williams originally got the idea to explore race data from the book We Gon’ Be Alright by Jeff Chang, which among other topics discusses racial separation in housing in the US. One section of the book looks at the worsening segregation in the San Francisco Bay Area, and as it’s where Williams grew up, he was able to recognise the changes and became interested in analysing the related data.

Fortunately, his colleagues Dan Keating and Ted Mellnik already had normalised census data going back 30 years. Using this data, Williams first started to explore the topic about a year ago, initially working in sprints and alongside other reporting.

To make the massive dataset easier to handle, Williams initially concentrated on the Washington DC area. But he eventually also included other major cities, and in the end his superiors encouraged him to expand the scope even further: ‘So what ended up happening was, I did the entire country,’ he said laughing.

Williams initially focused on a few major cities in the US, before expanding his analysis to the whole country.

Balancing interactive and static elements

Before arriving at the final version, the project went through several iterations, the first one being basically an online database with no maps or narrative component. ‘We don’t really do this in data journalism anymore, but we used to publish big datasets that let you explore the data,’ Williams said.

This evolved into an interactive ‘scrollytelling’ version of the story. ‘But about a week into building that version, I realised it was way too complicated and that we would be better served just doing a more traditional story layout.’

The team also wanted to make sure the beautiful maps got the attention they deserved. ‘Our director of graphics, Chiqui Esteban, thought the maps look gorgeous, and it would be a shame to lead with an interactive version when the static maps are so compelling.’

The focus on beautiful graphics is not only a question of aesthetics, Williams said: attractive visualisations are part of the Post’s strategy to keep its readers engaged. ‘You’re fighting for attention all of the time. Some people have really bad UX tricks they use to do that, but we try to use striking visuals instead.’

In addition to the static maps, Williams felt the interactive layer was also important: ‘My goal as a data journalist is not just to tell you a story, but also to allow you to go deeper in it.’ Thus the final story includes an interactive map that lets readers explore specific places – for example where they grew up – and see how the areas have changed over the last 30 years.

The data showed that in the DC metro area, the Hispanic American population increased by almost 300 percent from 1990 to 2016.

Support from academia

Beyond visualising the data, Williams also wanted the story to address the reasons why segregation persists to the degree it still does in today’s America. For this, he needed a good understanding of the related policies. ‘I’m not a social scientist by trade, and I hadn’t studied segregation in America extensively. So I needed people with authority, who knew this data and its limitations, to make sure I wasn’t making any wrong assumptions.’

The researchers quoted in the story – Michael Bader, Maria Krysan, Kyle Crowder – all contributed to the project’s creation and saw its early versions. For example, the ‘multigroup entropy index’, which Williams used to measure the level of segregation, was suggested by Bader. More generally, the researchers brought academic rigour to the story, Williams said.
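
For readers curious about how such a measure works, here is a minimal Python sketch of one common formulation of a multigroup entropy index, often called Theil's H. It is an illustration under stated assumptions, not the Post's code: the story does not spell out the exact formula, weighting or geographic units used, and the function names and sample counts below are hypothetical.

```python
import math


def entropy(counts):
    """Diversity (entropy) of one area, given a population count per group."""
    total = sum(counts)
    if total == 0:
        return 0.0
    score = 0.0
    for count in counts:
        if count > 0:  # an absent group contributes nothing
            share = count / total
            score -= share * math.log(share)
    return score


def multigroup_entropy_index(tracts):
    """Theil's H for a region (assumed formulation, not confirmed by the story).

    `tracts` is a list of per-group count lists, one list per small area,
    with the groups in the same order in every list.
    """
    totals_by_group = [sum(group) for group in zip(*tracts)]
    region_total = sum(totals_by_group)
    region_entropy = entropy(totals_by_group)
    if region_total == 0 or region_entropy == 0:
        return 0.0  # empty region or only one group present: index undefined
    # Population-weighted gap between regional diversity and each area's diversity.
    weighted_gap = sum(
        sum(tract) * (region_entropy - entropy(tract)) for tract in tracts
    )
    return weighted_gap / (region_total * region_entropy)


# Hypothetical two-group region with two equally sized areas.
fully_mixed = [[50, 50], [50, 50]]
fully_separated = [[100, 0], [0, 100]]
partly_separated = [[80, 20], [20, 80]]
```

In this formulation the index is 0 when every area has the same racial mix as the region as a whole, and 1 when no area contains more than one group. Real census tracts fall somewhere in between, as `partly_separated` does here.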

‘I think the academic parts are really important, because journalists are very good at knowing a lot of things, but very few of us know one topic incredibly well,’ he said. ‘And academics know one topic incredibly well, but tend to write for other academics.’

According to Williams, the interactive proved to be especially popular among readers: many tweeted screenshots, comparing the data to their own experiences of how the neighbourhoods familiar to them had changed.

Chicago is one of the most segregated cities in the US.

‘I knew that people were going to dig it, but I think I gained like a thousand Twitter followers in two days, I did not think it was going to be that big,’ Williams said. ‘I’m glad people liked it: there’s nothing worse than working on something really complicated and then getting no response.’

Some, however, criticised the colour choices of the visualisations, mainly because the maps used primary colours for black, white and Hispanic populations, and secondary colours for smaller populations. ‘We didn’t plan to use primary colours for the biggest racial groups, it just kind of happened,’ Williams said. ‘I tried to make sure that the colours weren’t obviously racist, but I didn’t key into the question about primary versus secondary colours.’

As prisoners are included in the census data, some readers discovered that you can use the interactive tool to zoom in on prisons. ‘So you can actually see the diversity of prisons over time. Some people thought that was cool, others found it upsetting,’ Williams said.

Houston has seen large growth in Asian and Hispanic populations.

Value in going slow

Although the story required intricate data analysis, including the use of a complex statistical measure (the ‘multigroup entropy index’ mentioned above), Williams is keen to emphasise that all the people who worked on the project are part of the Post’s graphics team.

‘Oftentimes data visualisation people and graphics teams aren’t really considered journalists, they’re considered kind of art and design people, though the attitudes on this have certainly got better since I started doing this,’ he said. ‘This project is a reminder that graphics journalists are real journalists.’

In fact, he considers himself a storyteller and journalist first, and a data visualisation person second. ‘But dataviz is a very effective way of communicating, it lets you communicate things that you can’t with text alone.’

He also underlined that the freedom to invest a lot of time was crucial to the project’s success and he is grateful that his editor made sure he had the needed space and time. ‘I think it’s a testament to the fact that there are articles and projects that are worth it. There’s merit in taking your time to tell a really powerful story.’