Projects submitted to the Data Journalism Awards competition

Below is a list of all the projects submitted to the Data Journalism Awards competition.

Follow the Money
Country: Canada
Organisation: Postmedia
Open data award
Interactive
Investigation
Public institutions
Applicant
Zane Schwartz
Team Members
Brice Hall, Julie Traves, Postmedia digital team
Project Description
The political machine runs on cash, lots of it. But who’s giving it, and how much, was all but impossible to track in much of Canada until now. Postmedia gathered more than six million records from across the country to create Canada’s first central, searchable database of political donations in every province and territory. The database allows searches by recipient (e.g. Prime Minister Justin Trudeau) and donor (e.g. Royal Bank of Canada, the country’s largest bank). In addition to the searchable database, three tools were built to help readers understand the data: donations over time, donations by region, and biggest donors. The last of these aggregates all money given by the same donor so readers can identify who is giving the most money in a specific election, to a particular politician, or over any time period. To encourage academics, journalists and engaged citizens to make the best use of this data, both individual search results and the entire data set (6,464,220 rows) are downloadable. The interactive tools are accompanied by stories highlighting important aspects of the data, including: the ten largest donors in every province and territory; donors from foreign companies (including state-owned Chinese companies); potentially illegal donations from one of the most powerful families in Canada; and donations from municipal politicians using public funds to enrich political parties.
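The "biggest donors" aggregation described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation (which runs on the Qlik engine): the `Donation` record shape, its field names and the `biggestDonors` function are all assumptions made for the example.

```typescript
// Hypothetical record shape; the real data set's column names are not given here.
interface Donation {
  donor: string;
  recipient: string;
  amount: number; // donation amount in dollars
  date: string;   // ISO date, e.g. "2015-10-19"
}

interface DonorTotal {
  donor: string;
  total: number;
}

// Sum all donations per donor, optionally restricted to one recipient
// and/or a date range, and return donors sorted from largest to smallest total.
function biggestDonors(
  donations: Donation[],
  filter: { recipient?: string; from?: string; to?: string } = {}
): DonorTotal[] {
  const totals = new Map<string, number>();
  for (const d of donations) {
    if (filter.recipient !== undefined && d.recipient !== filter.recipient) continue;
    // ISO-formatted dates compare correctly as plain strings.
    if (filter.from !== undefined && d.date < filter.from) continue;
    if (filter.to !== undefined && d.date > filter.to) continue;
    totals.set(d.donor, (totals.get(d.donor) || 0) + d.amount);
  }
  const result: DonorTotal[] = [];
  totals.forEach((total, donor) => result.push({ donor, total }));
  return result.sort((a, b) => b.total - a.total);
}
```

Calling `biggestDonors(rows, { recipient: "Some Party" })` would rank donors to a single recipient, while passing `from` and `to` restricts the ranking to a time period.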
What makes this project innovative?
There is a great deal of conversation in Canada about the accessibility of political funding information. None of it is secret, but much of it is released through a process that is difficult for the average person to navigate. This project converted millions of records of financial data that were nominally available to the public into an easy-to-use searchable database. It is a major improvement in access to information about political financing in Canada. We used the Qlik Associative Engine and search APIs to facilitate searching such a large data set, adding fields that search the data as you type. To display the results quickly, we created a table that uses virtual scrolling and paging. This table receives new data from the server as users scroll. It is so quick, you would never know you are scrolling through over six million records. Such quick and easy access to this information is the first of its kind for political donations in Canada.
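The virtual scrolling and paging idea can be illustrated with a short sketch: only the rows near the viewport are rendered, and only the pages containing those rows are requested from the server. The function name, parameters and the `overscan` buffer below are assumptions for illustration and do not describe the project's actual table component.

```typescript
interface ScrollWindow {
  firstRow: number;      // index of the first row to render (inclusive)
  lastRow: number;       // index one past the last row to render (exclusive)
  pagesNeeded: number[]; // zero-based page indices to request from the server
}

// Given the scroll position, work out which rows should be in the DOM
// and which server pages contain them. `overscan` adds a few extra rows
// above and below the viewport so fast scrolling does not show gaps.
function visibleWindow(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  pageSize: number,
  overscan: number = 5
): ScrollWindow {
  const firstRow = Math.max(0, Math.floor(scrollTop / rowHeight) - overscan);
  const lastRow = Math.min(
    totalRows,
    Math.ceil((scrollTop + viewportHeight) / rowHeight) + overscan
  );
  const pagesNeeded: number[] = [];
  if (lastRow > firstRow) {
    const firstPage = Math.floor(firstRow / pageSize);
    const lastPage = Math.floor((lastRow - 1) / pageSize);
    for (let p = firstPage; p <= lastPage; p++) pagesNeeded.push(p);
  }
  return { firstRow, lastRow, pagesNeeded };
}
```

With this approach the number of rendered rows depends on the viewport size rather than on the 6,464,220 rows in the data set, which is why scrolling stays responsive.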
What was the impact of your project? How did you measure it?
Our project launched just before the entry deadline so we don’t have a full suite of analytics yet. The site generated about 40,000 page views and 30,000 unique visitors in the first few days, and we expect more in the weeks to come. Anecdotally, the project created quite a buzz among those interested in this area. Brian Lee Crowley, of the Macdonald-Laurier Institute, said: “Congratulations. An important tool of democratic accountability that should be available without a private organization having to invest its scarce resources to do it. Well done.” Harold Jansen, a University of Lethbridge political scientist, said, “Wow! I am on study leave for 2018-19 and will be focusing on provincial political and election financing. You have saved me so much work!!! I will be making significant use of this.” Going forward, we’ll measure the impact of the project not just in traffic numbers, but also by time spent and engagement. We intend to promote the site at targeted intervals and to targeted audiences, for example by focusing on a province during an election cycle. Another factor we weigh heavily in impact is the longer-tail cycle of stories and research findings that come out of deep-dive use of the database. As this is meant to be a public tool, the stories and findings that emerge in the weeks and months ahead will be evidence of its ongoing value.
Source and methodology
Election officials are legally required to maintain donation records and make them available to the public. However, the format of those records varies widely in terms of quality and legibility: nine of thirteen Canadian provinces and territories keep records in PDFs, separated by year and political party. Determining how much money was given over multiple years, or to multiple parties, is effectively impossible from these files alone. Data was collected from the 14 election agencies in the country, each of which maintains records in a slightly different format. Some allow hand-written records; others collect donations separately for each candidate in a particular election. Over 54,000 pages of records were available only in PDF format. These were converted into comma-separated values files using optical character recognition (OCR) technology. Twenty per cent of these files were then manually checked at intervals to weed out errors.
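The source does not spell out how the 20 per cent sample was drawn. One plausible reading of "checked at intervals" is systematic sampling, where every fifth converted file is pulled for manual review. The sketch below illustrates that reading only; the function name and the file names in the usage note are hypothetical.

```typescript
// Pick every `interval`-th item from a list, starting at index `offset`.
// With interval = 5 this selects roughly 20 per cent of the items,
// spread evenly across the list rather than clustered at the start.
function sampleAtIntervals<T>(items: T[], interval: number, offset: number = 0): T[] {
  if (!Number.isInteger(interval) || interval < 1) {
    throw new RangeError("interval must be a positive integer");
  }
  if (!Number.isInteger(offset) || offset < 0) {
    throw new RangeError("offset must be a non-negative integer");
  }
  const picked: T[] = [];
  for (let i = offset; i < items.length; i += interval) {
    picked.push(items[i]);
  }
  return picked;
}
```

For example, `sampleAtIntervals(["a.csv", "b.csv", "c.csv", "d.csv", "e.csv", "f.csv"], 5)` would select `a.csv` and `f.csv` for manual checking. Sampling evenly across the whole set makes it more likely that errors specific to one agency or one year of scans are caught.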
Technologies Used
The data visualizations and analytics in this project are powered by the Qlik Analytics Platform. The Qlik platform allowed us to embed visual analytics into our website using Qlik’s open APIs. Qlik’s Associative Engine, the core of its products, allows users to quickly and easily explore and search through over six million donations to answer their questions about who backs Canadian politicians. We used enigma.js to interface with Qlik’s engine API and React to build the user interface. The result was an application that summarizes millions of rows of data in the blink of an eye. On top of the Qlik engine, React.js is designed to work by simply reacting to changes in data, and its efficient diffing algorithm makes DOM updates very fast, resulting in a quick experience. The website was built using Node.js and the Express framework. The database is pulled in via an iframe from Qlik’s servers. In addition, two optical character recognition programs, Adobe Export PDF and Cometdocs, were used to convert PDFs into machine-sortable formats.