Good visualizations are a key factor in overcoming the “inhuman” nature of data

May 15, 2019

A visualization created from an analysis on the combination of instruments and their effect on the energetic quality of songs on the Spotify platform, from the project What's That Sound Again?

Have you ever felt confused or uninterested when faced with an XLS table or a JSON document full of long strings of letters and numbers? What you need are clear conclusions and parameters which illustrate the reality of a situation, but open data comes in a format suited to machines, not to humans. This is one of the major obstacles to the practical use of the information available to us as open data, and it is most easily overcome through data visualization.

“If we show the data in a manner that is comprehensible, interesting, exciting or provocative, open data gains a dimension that can alter our understanding of the world and phenomena in it.”

Jovan Leković spent a long time in a center where data and their processing influence the attitudes and decisions of readers around the world. While working on the BBC team in London, he focused not only on data visualization but also on defining internal standards, which the team published as the BBC Audiences Tableau Style Guide. Jovan often faced the challenge of making the results of data processing and analysis easily understandable to very different target audiences.

Processing, analysis and attractive visualization of data are also our great challenges and that’s why we talked to him about the importance of data visualization, the role of open data in media, difficulties in visualization and defining public perception, as well as examples of good practices.

What is the relevance of data visualization?

I wouldn’t be surprised if historians (in the future) characterized the present period as a data revolution. Currently, data helps us make decisions in everyday life, work, and social relations. In essence, visualization is a way through which we can transform the abstract nature of data into something visible, condensed and intelligible. For example, we can turn a large dataset into an image accessible to everyone. We can encapsulate phenomena that have lasted for decades into an animation lasting for a few seconds. Or we can allow the user to find an important detail in a complex hierarchy.

How does data visualization help in using open data and reaching conclusions based on them?

I am biased, of course, but I believe data visualization plays a big, if not crucial, role in working with open data. If the purpose of open data is to enable a more transparent society and create insights about the real state of the world, visualization is one of the most relevant ways of communicating those insights.

Open data has had an incredible effect on visualization as a profession. Prior to the open data initiative, finding data for visualization was extremely hard, expensive and slow. On one hand, the existence of open data enables us to focus on interesting visualizations of that data. On the other, it allows us to use the data as a foundation and context for completely new ideas.

For the project What's That Sound Again?, using the Spotify API, Jovan processed around 200,000 songs and 28,000 playlists, which feature 57,000 different artists.

What is the role of data and open data in (online) media?

One of the most interesting phenomena in the last two years, in the context of Britain’s open data, is the reporting on the gender pay gap. Earlier, this would be a topic mostly connected to academic institutions and limited to an academic audience. Given that datasets relevant to this issue are being published each year, this created a serious discussion on an annual level which involved media, companies and the general public. And this is just one of the most recent examples. Open data is changing the way we discuss health institutions, inequality in owning real estate and political phenomena. What is especially interesting is a less visible aspect of using open data:

Open data enables the creation of an informative context. If someone is presenting data about homelessness in a specific region, open data allows them to also show the history of unemployment which is surely relevant for that topic. A good data visualization permits us to create a more comprehensive understanding of social issues.

How is data visualized in the BBC team?

My role at the BBC was within a team of around a hundred people (a size that itself speaks to the importance of analyzing different parameters in a media outlet), which worked on understanding the relationship between the BBC and the British public. To understand that relationship, our team used different research methods: from TV audience measurement to complex surveys about the quality of content, focus groups, and long-term studies on viewership; from reporting on how radio listenership is changing to thorough analyses of the role of social media in the lives of young people. All those methods, in one way or another, produce enormous amounts of data which should allow for a better understanding of BBC viewers, listeners and readers. After collection and analysis, all of this data has to be compiled and clearly communicated to journalists, editors, directors, producers, commentators, computer programmers, and BBC management.

I was lucky enough to initiate the creation of a position specialized in data visualization. More precisely, this meant I had to educate the team on how to use visualization techniques, set quality standards for data visualization within the team, and develop visualizations and applications relevant to the BBC. A good example of a project that I developed with the team is an application that allows editors to comprehend the effect of social media. This also involved collaborating with various teams, gathering and automating data, defining relevant measurements, designing the application, and educating users.



From the project BBC Audiences Tableau Style Guide

Now that you visualize data in your own studio, what does your job look like?

As a freelancer, I have the freedom to use my time in three important ways: research for personal projects, lectures, and project development for clients. The main difference is that this kind of work allows for a more flexible approach to projects, because it demands the ability to comprehend the specific area of work of each individual client, understand the systematization of their data and their questions, programs, and data culture. This means that each project is different and presents me with interesting challenges, topics, and methods.

How did you decide to focus on visualization full time?

I have been doing data visualization in one way or another for over ten years now. For most of my career, it has been a sort of work hobby that my colleagues noticed and appreciated. At the BBC, I got the opportunity to transform that “work hobby” into an official position. Last year, I decided to create The Synthesis Bureau so that I could fully devote myself to visualization.

I can’t imagine a more interesting profession in a more interesting period. Visualization combines several disciplines which were previously not considered to be connected: statistics and design, programming and the theory of art, journalism and mathematics. This means that I constantly have to learn something new, and that I have opportunities to explore different methods and try new concepts and ideas.

Which problems are you faced with in the process of selecting data and visualizations?

It may be a cliche, but it’s frequently true that 80% of the effort in data visualization is devoted to data preparation. Data often comes in complicated formats and is not adapted for machine learning, which represents a technical challenge. I find conceptual challenges interesting as well. How was the data collected? Is it data we have already seen? Is there a new way to present this data? What is missing in the data and how do we show it? Is the data unveiling something unexpected? How much do I trust the data source? Is the data important for a topic that is currently relevant?

What issues arise when people perceive data and reach conclusions based on inadequate analyses and visualizations?

Data visualization seems like a simple process, but it is extremely easy to make mistakes. One of the classic examples is Edward Tufte’s argument that NASA engineers could have prevented the explosion of the space shuttle Challenger with better data visualization (this topic has been widely discussed within the community). In everyday visualization, it is important to fully comprehend the data, to find the best way to present it and, most importantly, to focus on how the readers of the visualization will interpret it. Mistakes are possible in all three steps. When comprehending the data, there may be flaws in the data collection methods which should be understood and explained to the readers. In presenting the data one could, for example, use a graph with 3D effects which decrease the readability of the visualization. And finally, one should understand the users of the visualization. For example, a specific use of colors can be unclear to users who are color blind. There are cases when readers don’t understand the terminology, acronyms, and presented measurements, or we may run into issues with user experience design: for example, users might not understand how to use the interactive components of the visualization.

It is important to say that all of us still have a lot to learn about data visualization. The Economist magazine, which has a long tradition of innovatively presenting data, published a list of examples where they think they made mistakes. In the last ten years, we’ve seen fantastic work in the academic sector which is trying to set visualization standards that are easy to implement. For example, it’s good to follow Multiple Views, which publishes the principles researched by the academic community, or Data Visualisation Society, which started an excellent discussion on the standards in this profession.

How do we popularize open data use?

Keep in mind that the data visualization community loves a good challenge. Makeover Monday and Storytelling With Data Challenge are excellent examples of initiatives which use public data to present interesting stories through data visualization. I personally find inspiration in projects that connect data experts with charity organizations facing particular challenges or questions. This gives open data use a very pragmatic dimension. There are several organizations dealing precisely with this topic; DATA4CHANGE and DataKind are great examples. Finally, open data that is accessible and easy to process will always be an interesting candidate for visualization. Having quality metadata, a good data structure, expert commentary, clear labels, and standardized tables is extremely important in developing projects.

Given that successful examples are the best motivator, can you share with us a few examples where data visualization made a difference?

A classic, and my favorite, example is the graph that Florence Nightingale made during the Crimean War. In order to convince policymakers to invest in better hospital conditions, she created a graph presenting the causes of death in the Crimean War in three categories: deaths on the battlefield, deaths which could have been prevented, and deaths from other causes. Looking at the graph, it becomes clear that deaths from preventable causes are the largest category, and something that British policymakers could change. This was the moment that transformed the understanding of health and sanitary conditions.

A more contemporary example is the work of Hans Rosling, who was a leading voice in transforming how we see the world on a macro level. Rosling’s methods are genius. To create his visualizations, he used everything available to him to change old perceptions: from Lego bricks to toilet paper, from animated visualizations to sword swallowing. Crucially, he used all of those techniques to create a significantly different picture of modern times, whether he was discussing “the third world”, overpopulation, or challenges during the Ebola outbreak.

Can data visualization change the world (this sounds optimistic) in areas of green energy, environmental protection, data privacy, prevention of distribution of fake news...?

It is interesting that most of the problems we are dealing with today have a quantitative dimension.

Whether we’re talking about climate change, endangered animal species, gender equality, or political instability, there is data that allows us to understand these issues. Fortunately, the data visualization profession is already working on these topics in many ways, whether we’re talking about journalists using data to turn the public’s focus to particular issues, academic institutions which visualize what our future world will look like, or institutions which create dashboards allowing users to understand areas where they should make investments.

As much as I would like it to be true, it’s hard to say that data visualization on its own can change the world. However, it can certainly present the world to us in a whole new way. And in that process of presentation, it can sharpen our image of the world. It can show us where the problems are. It can demonstrate which solutions are the best. And it can definitely enable a more informed discussion about the state of the world and the potential for change.

The interview was prepared by Katarina Popović for the Serbian Open Data Portal, within the project "Open Data - Open Opportunities", implemented by the Office for IT and E-Government and UNDP Serbia, with the support of the World Bank and the Good Governance Fund of Great Britain (GGF), May 2019.