Projects submitted to the Data Journalism Awards competition


Portfolio: Saimi Reyes
Country: Cuba
Category: Student and young data journalist of the year
Reyes Carmona
Team Members
Project Description
In 2016 I founded, together with three other colleagues, Postdata, the first and only site dedicated to data journalism in Cuba. In its first year the site was nominated for Website of the Year at the Data Journalism Awards. At Postdata I work as chief editor and as a journalist. We are a very small team (currently three journalists, a programmer, and a translator) and we work on Postdata voluntarily, without a salary, so I have often had to take on more than one role in producing a story. In these almost two years we have published research on various topics of interest to our audience, most of whom are Cuban. In a country where public, accessible data is difficult to obtain because of limited connectivity and outdated records, I have often had to build or complete, together with my team, the databases we work with. Many of the articles I have published on Postdata have had a high social impact, because the seriousness and commitment with which we approach them have helped Cubans stay informed and understand some phenomena. In addition, as a member of the team, I teach data journalism classes at the Faculty of Communication of the University of Havana.
What makes this project innovative?
In Cuba there was no initiative dedicated to data journalism, and when we founded Postdata we put in the hands of the public new tools that allow us to understand the reality in which we live. In the last year, Postdata has increased its commitment to its audience, and we have addressed issues of public interest at the moments when they were most newsworthy. Thus, in recent months we have covered topics of high impact for our visitors, such as private-sector work in Cuba or the elections that will culminate in April and change the course of Cuban history, because for the first time Raúl Castro has announced his retirement as President, and the generation that has historically held power in Cuba is stepping down. The analysis we have done over this period has allowed us to understand some aspects of the elections that the traditional Cuban press does not address because of the particularities of the electoral process in Cuba. My work at Postdata has allowed me to understand, and then explain, the characteristics of Cuban reality, a complex reality that can nevertheless be approached through data.
What was the impact of your project? How did you measure it?
Since its launch, Postdata has had a great social impact, because it offers valuable journalistic analysis combined with interactive graphics, and that impact has remained high over the last year. In addition, we publish the databases that we create ourselves on GitHub, which gives anyone interested access to our working methodology. As a representative of Postdata I was invited to share our experiences at the Latin American Congress of Investigative Journalism (COLPIN 2017) and at the first congress of the Information Technology Union of Cuba, CIBERSOCIEDAD 2017, where I won the prize at the Solutions Fair, a recognition of our work. According to Google Analytics statistics, visits to the site have increased, and some high-impact articles continue to receive steady traffic. In addition, every time we publish a new text we receive many more visits, which shows readers' interest in our articles. Many of our texts have been cited by other media, and the journalistic tools and databases we share serve as sources for other journalistic works.
Source and methodology
Many of the sources that I use at Postdata to do data journalism are public, open and accessible, yet had not been used by other media. On other occasions, I have created my own databases from freely available information collected by me and my teammates. In some cases, because of connectivity problems and outdated online content, we have gone to ministries and public institutions to request the information, and they have provided it to us. Also, in many cases, to verify the data we have cross-checked the official sources against those of Cuba's National Office of Statistics and Information (ONEI), against the information provided by the Cuban press, and against the information about Cuba offered by international bodies. We never publish a Postdata article without the information being rigorously reviewed and verified.
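The cross-checking step described above can be illustrated with a minimal sketch. It is not Postdata's actual procedure: the function name `find_discrepancies`, the tolerance parameter, and the indicator names and figures are all hypothetical placeholders invented for illustration.

```python
def find_discrepancies(source_a, source_b, tolerance=0.0):
    """Return indicators whose values differ by more than `tolerance`,
    or that appear in only one of the two sources."""
    issues = {}
    for key in sorted(set(source_a) | set(source_b)):
        a = source_a.get(key)
        b = source_b.get(key)
        if a is None or b is None:
            # Reported by one source only: needs manual review.
            issues[key] = (a, b)
        elif abs(a - b) > tolerance:
            # Both sources report it, but the figures disagree.
            issues[key] = (a, b)
    return issues


# Placeholder figures, invented for illustration only (not real data).
official = {"indicator_a": 100, "indicator_b": 250, "indicator_c": 40}
statistics_office = {"indicator_a": 100, "indicator_b": 245}

flagged = find_discrepancies(official, statistics_office, tolerance=1)
print(flagged)
```

Anything the function flags would still need a human decision about which source to trust; the sketch only narrows down where to look.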
Technologies Used
At Postdata we decided to use GitHub as our work platform, so all our articles, databases and tools are available for study and use by anyone interested. Publishing is thus simply a matter of committing to our GitHub repository. We have always used JSON as the base format for the data in our stories. To create databases we have had to use tools such as pdftotext, download web pages with wget or Scrapy, or write our own Python programs to process the data. We have often created CSV files as an intermediate format to be processed with Python or LibreOffice. All the data analysis is done in Python, using libraries such as NumPy, NLTK, scikit-learn and NetworkX, which have allowed us to do statistical processing, natural language processing, clustering and network analysis. All the web pages have been programmed by us using HTML5, CSS and JavaScript. jQuery has been used to manipulate the HTML DOM. D3.js and C3.js are the main libraries used for graphics, although ECharts has also been used. jVectorMap and Google Maps have been used for maps. For the scrollytelling visualizations we used the Scrollama library. Other libraries used include Tooltipster, Flip.js and Horizon-swiper, among others.
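As an illustration of the CSV-to-JSON step mentioned above, here is a minimal sketch using only the Python standard library. It is an assumption about how such a conversion might look, not code from the Postdata repository; the helper names, column names and values are hypothetical placeholders.

```python
import csv
import io
import json


def csv_to_records(csv_text):
    """Parse CSV text into a list of dicts, one per row, keyed by the header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


def records_to_json(records):
    """Serialize records as indented JSON, keeping non-ASCII characters readable."""
    return json.dumps(records, ensure_ascii=False, indent=2)


# Hypothetical intermediate CSV; the columns and values are placeholders.
sample_csv = "province,candidates\nprovince_a,10\nprovince_b,7\n"

records = csv_to_records(sample_csv)
for record in records:
    # The csv module reads every field as a string, so convert numeric columns.
    record["candidates"] = int(record["candidates"])

json_text = records_to_json(records)
print(json_text)
```

Setting `ensure_ascii=False` keeps accented characters, common in Spanish-language data, legible in the resulting JSON file instead of escaping them.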