Project description

I am a 26-year-old data journalist based in Brazil, currently working at Colaboradados, a collaborative outlet focused on data transparency and open data in Brazil. I previously worked in the newsroom of Folha de S.Paulo, a major Brazilian newspaper, and at the fact-checking agency Aos Fatos, the first Brazilian platform for verifying public discourse. I am a member of the PyLadies group and an enthusiast of free and open source teaching projects.

At Folha de S.Paulo I was a data journalism trainee, producing reports with the Python programming language, the first language I had contact with. In the newsroom, I published some reports on my own and others in partnership with other journalists. I took part in different stages of production for each of them, and was sometimes responsible for the entire process, from development to publication. Because Folha is one of the largest Brazilian newspapers, the reports had wide repercussions in society, and one of them even led to changes in the transparency policies of the Brazilian National Council of Justice.

At Aos Fatos, I worked as a reporter and fact-checker, producing several reports in which data was the main subject. Most of my job was to check whether statements made by candidates in the 2018 elections were truthful, which was only possible with real, reliable data. At Aos Fatos I also published in-depth reports, investigating data and requesting information from Brazilian government agencies.

After I left the fact-checking outlet, I felt the need to dig deeper into the world of public data. I have always been active in producing open source materials, aiming for them to be not only open but also collaborative and transparent. That is how Colaboradados came about, a nonprofit project of which I am founder and director. It is a cross-platform project that promotes greater public awareness of the benefits of government transparency and of how useful it is for monitoring political entities.

I currently participate actively in the Python community, and try to help both programmers and journalists understand how beneficial programming is to journalistic practice. Because of my active role in the community, I am often invited to speak on the topic at conferences and events, and was recently accepted to speak on the subject at the largest international Python conference, PyCon, which will take place in May this year in Cleveland, in the United States of America.

I have chosen each of these projects to illustrate my work. Each of them was the result not only of my own efforts, but also of the partners who have accompanied me and continue to accompany me on my journey. My current goal is to keep working with data and to bring more knowledge to the community, not only of journalists but also of developers, joining the two professions for a greater good: bringing information to society.

What makes this project innovative?

Link 1: "Justice system violates secrecy and exposes children victims of rape" - A Brazilian National Council of Justice database exposed sensitive information about victims, including names and details of sexual crimes committed against children. In this project, we used Python to scrape the information and to analyze the data we found.

Link 2: "Radiant Heart: analysis of feelings of the Brazilian fans in the 2018 World Cup" - Using a program written in Python, tweets were scraped from Twitter during every match played by the Brazilian soccer team in the 2018 World Cup. The scraped messages were then analyzed to try to understand how Brazilians reacted minute by minute during each match.

Link 3: "2014 World Cup host cities still have work to do and transparency issues" - A series of reports on the construction works promised for the 2014 World Cup in Brazil showed significant gaps in what the public administration had committed to deliver for the event. In addition, the Comptroller General of the Union, responsible for maintaining and updating the data on the works, took the information offline after the series began, showing a lack of compliance with the Brazilian Access to Information Law. The series covered 6 Brazilian cities, and for all of them it relied on information requests to government agencies and an extensive investigation into the origin of the data.

Link 4: "Reports of lack of money and abandonment of the national museum date back to the 1950s" - The report identified records of warnings about the risk of fire at the Brazilian National Museum going back to the 1970s. Complaints about budget cuts and lack of maintenance are even older and date back to the 1950s, according to the survey carried out. The report examined old documents and cross-checked them against statements made about investments in the Museum.

Link 5: "Bread and butter" - This is a study of various aspects of the professions and occupations of the candidates in the 2018 elections in Brazil. All the analysis was performed with the Pandas library and is available for consultation and study in the project's GitHub repository. Flourish was used for visualization.

Link 6: "Colaboradados, the collaborative vehicle on data transparency and open data in Brazil" - A project that, through programming and various tools, sets out to talk about data and its importance in different spheres of society.

Our DJA18 submission:

More stories are found on my site:

What was the impact of your project? How did you measure it?

The published reports had a wide reach and were read by many people. Some of them had an even greater impact, leading to changes by Brazilian government bodies. Thanks to the report "Justice system violates secrecy and exposes children victims of rape", the Brazilian National Council of Justice changed the way it publishes its data, hiding the information that should not be exposed. With Colaboradados, we are disseminating information about public transparency in Brazil and informing those who wish to join this field.

Source and methodology

For most of the projects mentioned above, we requested the data from federal departments. For data extraction and analysis, I used Python, a programming language widely used by data scientists. The standard techniques of any journalistic work were also used: interviews with sources, fact-checking, and ethical research. For all the projects that were individual or that were not published in large outlets, I also tried to publish my code, so that other people could replicate it.

Technologies Used

I use Python for all the data processing in my work, whether for extracting information from sites that are hard to work with or for analyzing it, using libraries like Pandas and NumPy. For presentation, I use HTML and CSS, along with Jekyll for my front end. For the charts in my personal projects, like Link 2 and Link 5, I used the Flourish platform.
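As an illustration of the kind of Pandas analysis described above, here is a minimal sketch that counts how many candidates declare each occupation. It is not the code from the published projects: the sample data, the column names and the function `count_occupations` are hypothetical and exist only to show the general approach.

```python
import io

import pandas as pd

# Hypothetical sample data: names, states and occupations are invented
# for illustration and do not come from the published projects.
SAMPLE_CSV = """candidate,state,occupation
Candidate A,SP,Lawyer
Candidate B,RJ,Teacher
Candidate C,SP,lawyer
Candidate D,MG,Farmer
Candidate E,RJ,Lawyer
"""


def count_occupations(csv_text: str) -> pd.Series:
    """Return the number of candidates per occupation, most common first."""
    df = pd.read_csv(io.StringIO(csv_text))
    # Normalize spacing and capitalization so that "lawyer" and "Lawyer"
    # are counted as the same occupation.
    df["occupation"] = df["occupation"].str.strip().str.title()
    return df["occupation"].value_counts()


if __name__ == "__main__":
    print(count_occupations(SAMPLE_CSV))
```

A table like the one this returns could then be exported to CSV and loaded into a visualization tool such as Flourish.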

Project members

Judite Cypreste

