Projects submitted to the Data Journalism Awards competition

Right here you will find a list of all the projects submitted to the Data Journalism Awards competition.  

Paradise Papers: Secrets of the Global Elite; plus a massive expansion of the ICIJ Offshore Leaks Database
Country: United States
Organisation: International Consortium of Investigative Journalists, Süddeutsche Zeitung and more than 90 other media partners in more than 60 countries
Investigation of the year
Walker Guevara
Team Members
Pierre Romera, Emilia Díaz-Struck, Mar Cabra, Vanessa Wormer, Bastian Obermayer, Frederik Obermaier, Matthew Caruana Galizia, Miguel Fiandor, Rigoberto Carvajal, Julien Martin, Cécile S. Gallego, Delphine Reuter, Felix Ebert, Hamish Boland-Rudder, Marina Walker Guevara, Álvaro Ortiz, Fernando Blat, Jorge Gómez Sancha, Jorge González, Didier Orel, Helena Bengtsson, Jérémy Boulery, Manuel Villa, Paola de Perthuis, Ricardo Ortiz
Project Description
Based on a massive leak, the “Paradise Papers” exposes secret tax machinations of some of the world’s most powerful people and corporations. The project revealed offshore interests and activities of more than 120 politicians and world leaders, including Queen Elizabeth II, and 13 advisers, major donors and members of U.S. President Donald J. Trump’s administration. It exposed the tax engineering of more than 100 multinational corporations, including Apple, Nike, Glencore and Allergan, and much more. The leak, at 13.4 million records, contained even more records than the Panama Papers and was technically even more complex to manage. And unlike in previous leaks, much of the leak came from elite firms, including Appleby, which serves not a roster of international outlaws and other rogues but the pillars of global capitalism. Exposed here are not problems within the system but the system itself. The project triggered a global uproar and official fallout around the world, as outlined in the last story in our entry. Then, between November 2017 and February 2018, the International Consortium of Investigative Journalists added more than 290,000 companies from the Paradise Papers to its already massive Offshore Leaks Database. This includes data from the offshore law firm Appleby, as well as information from corporate registries from seven secrecy jurisdictions. In all, the database now contains more than 785,000 trusts, companies or funds, and more than 720,000 officers, a major contribution to global public knowledge about the offshore financial system. The offshore project represents a coming-of-age for a new journalism paradigm: A wholly independent news organization mobilizing the world’s journalism power – in this case 380 journalists from more than 90 news outlets – to hold accountable actors at the highest level. The operation is tech-savvy, networked, collaborative, global and thoroughly professional.
What was once ad hoc is now a model for the future.
What makes this project innovative?
The project’s technical innovations are too numerous to be fully listed here, but it should be understood that the Paradise Papers posed greater challenges even than those of previous leaks. The record set came from an array of sources, including Appleby, Asiacity Trust and corporate registries from 19 secrecy jurisdictions. It also contained more than 110,000 files in database or spreadsheet formats (Excel, CSV and SQL). ICIJ’s data unit used reverse-engineering techniques to reconstruct corporate databases. The team scraped the records in the files and created a database with information on companies and the individuals behind them, categorized by jurisdiction. The team then used “fuzzy matching” techniques and other algorithms to compare the names of the people and companies in all these databases to lists of individuals and companies of interest, including prominent politicians and America’s 500 largest publicly traded corporations. ICIJ’s data unit converted the databases into graph databases, later visualized using a privately hosted version of software called Linkurious. In this case, we created a separate graph database for each source, allowing reporters to explore connections between individuals and entities more easily. Based on the data fields, ICIJ added a new type of node to the database, allowing reporters to identify companies that were part of a holding company. More tech-savvy team members could perform social network analysis using Cypher, Neo4j’s query language. But all the technical innovation is in service of ICIJ’s biggest innovation of all: its model of “radical collaboration.”
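The “fuzzy matching” step described above can be illustrated with a minimal sketch. It is not ICIJ’s code: it assumes a simple approach built on Python’s standard-library `difflib` rather than whichever matching libraries and thresholds the team actually used, and the function names (`normalize`, `similarity`, `match_names`), the 0.85 threshold and the sample names are all hypothetical.

```python
import difflib
import unicodedata


def normalize(name: str) -> str:
    """Lowercase, strip accents and punctuation, and sort tokens so word order does not matter."""
    decomposed = unicodedata.normalize("NFKD", name)
    without_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in without_accents.lower())
    return " ".join(sorted(cleaned.split()))


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio between 0 and 1 for two names after normalization."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_names(leaked, watchlist, threshold=0.85):
    """Return (leaked_name, watchlist_name, score) for every pair at or above the threshold.

    The threshold is an illustrative assumption, not a value taken from the project.
    """
    matches = []
    for leaked_name in leaked:
        for target in watchlist:
            score = similarity(leaked_name, target)
            if score >= threshold:
                matches.append((leaked_name, target, round(score, 2)))
    return matches


if __name__ == "__main__":
    # Hypothetical records: a registry-style name and an unrelated company.
    leaked = ["EXAMPLE, Jane Q.", "Acme Holdings Ltd"]
    watchlist = ["Jane Example"]
    for hit in match_names(leaked, watchlist):
        print(hit)
```

Normalizing before comparing means that differences in case, accents, punctuation and word order do not hide a match. Any hit produced this way is only a lead that a reporter still has to verify against the underlying documents.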
What was the impact of your project? How did you measure it?
Since publication, governments opened tax investigations in Vietnam, Lithuania, Indonesia, Belgium, Ireland, Greece, the Netherlands, New Zealand, Australia, Nigeria, Pakistan, South Korea, India and elsewhere. The Paradise Papers exposed Apple Inc.’s haven-hopping practices for its quarter-trillion-dollar offshore stash just as the U.S. Congress was moving to reward such behavior with a sweeping new tax law. After ICIJ published, Republicans in Congress twice increased the proposed tax rate to 15.5 percent, from just 10 percent, on repatriated overseas profits, as an incentive to bring foreign profits home. While still a huge giveaway to Apple, the change could net roughly an additional $12 billion to the U.S. Treasury. ICIJ’s work forced U.S. Commerce Secretary Wilbur Ross to divest secret holdings in a shipping company with links to Vladimir Putin’s inner circle and sparked Congressional calls for investigation. Weeks after publication, Israeli billionaire Dan Gertler was sanctioned by the U.S. government over “hundreds of millions of dollars’ worth of opaque and corrupt mining and oil deals” in the Democratic Republic of the Congo – essentially repeating facts in our project. The Paradise Papers is part of a body of work that has dramatically shifted the global discourse and policy on offshoring from malign neglect to international action. After the Nov. 5, 2017, publication, the European Union accelerated adoption of a “blacklist” of tax havens. In December 2017, the European Parliament overhauled anti-money laundering rules to require European countries to publish companies’ true owners. In March 2018, the European Commission imposed strict reporting rules on offshore tax advisors. “The Panama Papers and the Paradise Papers … make it obvious to everyone that tax avoidance and evasion is a global problem that requires co-ordinated rules and collaborative action,” said John Peterson, of the Organisation for Economic Co-operation and Development.
“The world has changed.”
Source and methodology
We believe ICIJ’s data team is unlike any other in the journalism world. Working from Washington, D.C.; Madrid; Paris and Costa Rica, the team is nonetheless a tight-knit, collaborative crew of developers, computer scientists, data journalists and researchers pushing the frontiers of data-journalism collection, security, reporting, analysis and interactive presentation – all while collaborating securely, in several languages, with newsrooms around the world. What makes ICIJ’s data team unique is that it both develops technologies that support reporters in their jobs and at the same time works with them on the stories themselves to add a data-analysis perspective. For Paradise Papers, ICIJ used spreadsheets, relational databases, machine-learning techniques and other advanced approaches to understand trends and patterns at a global scale, but also to zero in on specific issues like the use of trusts, special loan agreements, the complicated tax schemes of multinationals, yacht and jet registrations, and more. A big part of the work consisted of developing the tools needed to explore and analyze the data and enable collaboration. ICIJ’s team structured the data and cross-matched information on companies, shareholders and jurisdictions, using public data for verification purposes. The team also helped present the results of all the analysis and reporting to readers through interactive elements. It integrated politicians’ profiles into the Offshore Leaks Database, for instance, allowing users to explore interactively the “Power Players” profiled in stories and enabling searches for related entities, officers and addresses – potentially opening up new avenues of reporting. But the biggest trick wasn’t just working with data but sharing it securely, in this case with more than 90 global media partners.
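Searching for “related entities, officers and addresses” amounts to walking a graph of connections. The sketch below shows the idea with a small in-memory graph and a breadth-first search. It is a stand-in for illustration only: the project itself used Neo4j and Linkurious, and the `OffshoreGraph` class, the node labels and the sample records here are hypothetical.

```python
from collections import defaultdict, deque


class OffshoreGraph:
    """A tiny undirected graph of officers, entities and addresses (illustrative only)."""

    def __init__(self):
        self._edges = defaultdict(set)

    def link(self, a, b):
        """Record a connection between two nodes, in both directions."""
        self._edges[a].add(b)
        self._edges[b].add(a)

    def related(self, start, max_hops=2):
        """Return {node: hops} for every node reachable from start within max_hops."""
        hops = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            if hops[node] == max_hops:
                continue  # do not expand beyond the hop limit
            for neighbour in self._edges.get(node, ()):
                if neighbour not in hops:
                    hops[neighbour] = hops[node] + 1
                    queue.append(neighbour)
        del hops[start]
        return hops


if __name__ == "__main__":
    graph = OffshoreGraph()
    officer = ("officer", "Jane Example")
    company = ("entity", "Example Holdings Ltd")
    graph.link(officer, company)
    graph.link(company, ("address", "1 Harbour Road"))
    graph.link(company, ("entity", "Sample Trust"))
    for node, distance in sorted(graph.related(officer).items()):
        print(distance, node)
```

Limiting the search to a small number of hops keeps results readable: one hop from an officer gives the companies they are attached to, and two hops give the addresses and sister entities those companies share.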
Marina Walker Guevara, ICIJ’s deputy director and manager of the Paradise Papers project, likes to say that choosing journalism partners is like organizing a dinner party: The chemistry among the guests just has to work. But, unlike most dinner parties, all the guests here must put their trust in a group of strangers. ICIJ’s “radical sharing” model requires journalists to resist natural competitive instincts by not publishing or even allowing word of the project to leak until a prearranged time, and, critically, sharing what they know. Ultimately, this dinner party came to include more than 380 journalists on six continents. In April 2017, dozens of journalists converged on Süddeutsche Zeitung’s headquarters in Munich for a town-hall type meeting to fix a publication date (Nov. 5, 2017, 1 p.m. Washington, D.C., time), cement relationships and coordinate on collecting, analysing and verifying data. Working on a giant, secure private data-sharing-cum-social-media-platform known as I-Hub, the partners identified stories relevant to their respective countries and set to work. ICIJ’s staff produced more than a dozen major pieces of their own. In a coordinated action about a month before publication, reporters around the globe began to contact the subjects of their pieces, triggering an intense period of back-and-forth with Appleby and other law firms representing elite clients to be named in the piece. The exercise, while grueling, is a hallmark of ICIJ projects and critical to the verification process. It is a major reason a project of such massive scale, exposing so many explosive facts, ran virtually error-free. Besides the massive leak from Appleby, Asiacity Trust and 19 secrecy jurisdictions and other documents, ICIJ obtained court records from the U.S., UK, Australia, Argentina, Jersey and elsewhere. ICIJ reporters used records from tax courts, civil courts and criminal courts around the world.
ICIJ reporters searched U.S. Securities and Exchange Commission filings to determine tax rates for, and how (or if) tax-haven shopping affected the bottom line of, some of the world’s largest – and most complex – corporations, including Apple, Nike and Facebook. ICIJ reporters used federal and state public records laws to submit records requests to numerous agencies around the U.S. In reporting on the offshore connections of a payday lending scheme, ICIJ reporter Spencer Woodman submitted public records requests to more than two dozen state law enforcement agencies that provided hundreds of pages of documents in response detailing the human cost of the lender’s practices. FOIA-responsive documents from the U.S. Consumer Financial Protection Bureau yielded essential details regarding information the lender had provided enforcement authorities and details – revealed in the offshore leak – it appears to have omitted regarding its ownership. ICIJ reporters also obtained confidential tax assessments from government sources that revealed large companies were being probed for tax avoidance involving structures found in the Paradise Papers. Recognizing the complexity of the subject matter, ICIJ reporters established relationships with tax, finance and business experts in the U.S. and other jurisdictions to help make sense of the files. The reporting took many months and involved hundreds of phone calls and in-person meetings with retired tax lawyers, accountants and academics who helped explain the meaning of hundreds of pages of corporate transactions, loans, contracts and PowerPoint presentations found in the Paradise Papers. The documents that made up the Paradise Papers were often highly technical and rarely self-explanatory. In many cases, understanding or verifying the documents’ contents required ICIJ reporters to seek help from the few experts with highly specific knowledge of offshore maneuvers.
This often meant a careful process of cultivating experts who in many cases were (or had been) practitioners in the world we were critically scrutinizing. This labor-intensive process typically involved a series of conversations to build trust to the point that ICIJ reporters felt comfortable asking experts if they would consider reviewing and explaining confidential documents to ICIJ. In other cases, ICIJ reporters cultivated human sources with direct knowledge of the principals of their stories. For a story about billionaire James Simons’s offshore holdings, for instance, a person with knowledge of Simons’s wealth told ICIJ of a Bermudan foundation that held about $8 billion in assets. It would count as one of the top 10 largest foundations derived from a U.S. fortune, but until our reporting, the public never knew it existed. Methodologies for reporting on Nike and Apple merit particular attention. In both instances, the relevant files from the Paradise Papers amounted to a few detailed pieces of a much bigger jigsaw – and were meaningless on their own. Complete pictures had to be assembled from months of open-source research and extensive discussions with experts. Simon Bowers traveled extensively to gather insights from top corporate advisers, academics, policy experts and former tax inspectors in several countries. Meanwhile, painstaking efforts were made to identify patterns of behavior among a group of multinationals that appeared in Paradise Papers files. These patterns allowed us to develop hypotheses about the likely cross-border flows of money between scores of subsidiary companies at both Nike and Apple. These hypotheses were then tested against open-source data.
Apple presented particular challenges. Having had its tax maneuvers already exposed once – when a 2013 U.S. Senate inquiry discovered Apple had shifted hundreds of billions in non-U.S. income through Irish tax loopholes – the iPhone-maker was especially determined to keep its tax moves secret. ICIJ learned that Apple had secretly reorganized its Irish companies in a complex maneuver involving the tax haven of Jersey – and continued to keep the iPhone maker’s effective tax rate ultra-low, even after the Irish scandal. Apple took care to use only subsidiaries that were not required to release financial statements in company registry filings. Nevertheless, after months of open-source research – including an analysis of 12 years of tax disclosures from Apple 10-Ks, information from European company registry filings, Irish national economic and tax receipts data and EU competition judgments – ICIJ built a dossier of compelling circumstantial evidence completing the picture of Apple’s new tax structure. We took the dossier to eminent tax academics, including two who had been the lead experts assisting the U.S. Senate with its 2013 Apple probe. Each gave on-record statements confirming they were entirely persuaded by our findings. With that backstop, we ran the story – in spite of Apple’s refusal even to properly respond. After publication, Apple made generalized remarks about “inaccuracies” but confirmed for the first time that it had indeed used a subsidiary that secretly held tax-residency in the tax haven of Jersey. Meanwhile, the chief minister in Jersey and an influential parliamentary committee in Ireland announced inquiries into ICIJ’s findings. In the U.S., publication of ICIJ’s story came as Republicans in Congress were refining a major tax bill that would change the way multinationals were to be taxed. Some of the many tax breaks contained in the bill were targeted at multinationals – such as Apple – that had for years been parking billions of dollars in earnings overseas. By the time the bill was signed in December, however, these breaks – though still present – had been substantially reduced.
This last-minute change to the Tax Cuts and Jobs Act of 2017 has effectively wiped tens of billions off the tax savings Apple had expected to receive from the legislation. And it all started with data, and the team behind it.
Technologies Used
While in the Panama Papers the millions of records came from a single source, Mossack Fonseca, the Paradise Papers, as noted, came from 21 different sources with different formats and structures. This required adapting ICIJ’s custom-built platforms to share information and allow advanced search options. These included new applications that could identify the source of the information and run searches within a specific source of leaked records. ICIJ also created an online Support Desk that allowed journalists to report bugs and propose innovations while working on the project. ICIJ used the cloud to process millions of files efficiently and worked on developing and improving the process of extracting the text from files to make them searchable, using open-source technology and more than 30 EC2 instances (virtual servers) in Amazon Web Services that worked in parallel. The presence of file formats ICIJ had not dealt with in the past, including some non-traditional ones, led ICIJ to review and improve its processes to make the information available and searchable to the full network of partners. As in the past, ICIJ open-sourced the code of this software in its GitHub account.
Technologies used:
- For data extraction and analysis: Talend Open Studio for Big Data, SQL Server, PostgreSQL, Python (nltk, beautifulsoup, pandas, csvkit, fuzzywuzzy), Google Maps API, OpenStreetMap API, Microsoft Excel, Tesseract, RapidMiner. ICIJ open-sourced the code of its document processing chain: “Extract” is a cross-platform command-line tool for parallelised, distributed content analysis.
- For the collaborative platforms: Linkurious, Neo4j, Apache Solr, Apache Tika, Blacklight, Xemx (our homemade single-sign-on platform), Oxwall, MySQL and Semaphor. Jira helped us track issues and follow changes to the code base of each piece of software developed by ICIJ.
- For the interactive products: JavaScript, Webpack, Node.js, D3.js, Vue.js, Leaflet.js and HTML.
- For security and sources protection: GPG, VeraCrypt, Tor, Tails, Google Authenticator, SSL (client certificates) and OpenVPN.
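The parallel text-extraction step described in this section can be sketched on a much smaller scale. The example below is an assumption-laden illustration, not ICIJ’s “Extract” tool: a pool of worker threads on one machine stands in for the EC2 instances, and the hypothetical `extract_text` placeholder simply reads plain text where the real pipeline would use tools such as Apache Tika or Tesseract.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def extract_text(path: Path) -> str:
    """Placeholder extractor: read the file as UTF-8 text, replacing undecodable bytes.

    A real pipeline would dispatch on file type (PDF, spreadsheet, scanned image)
    and call a dedicated parser or OCR engine here.
    """
    return path.read_text(encoding="utf-8", errors="replace")


def extract_all(paths, workers=4):
    """Extract text from every path using a pool of worker threads."""
    paths = list(paths)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        texts = list(pool.map(extract_text, paths))  # map preserves input order
    return dict(zip(paths, texts))


def search(index, term):
    """Return the paths whose extracted text contains the term, ignoring case."""
    needle = term.lower()
    return sorted(path for path, text in index.items() if needle in text.lower())


if __name__ == "__main__":
    import tempfile

    with tempfile.TemporaryDirectory() as folder:
        root = Path(folder)
        (root / "a.txt").write_text("Share register of Example Holdings Ltd", encoding="utf-8")
        (root / "b.txt").write_text("Unrelated memo", encoding="utf-8")
        index = extract_all(root.glob("*.txt"))
        print([p.name for p in search(index, "example holdings")])
```

Because each file is processed independently, the same pattern scales from threads on one machine to many machines working through a shared queue of files, which is the general shape of the setup the text describes.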