Project description

The project is an interactive tool that supports journalists, academics, researchers and data enthusiasts by revealing information about government and business power. Research and development took 10 months and was carried out by a team of journalists and data specialists.
At Open Knowledge we seek to strengthen democracy and free expression, and we build tools together with journalists and researchers who need data. The platform is powered by open databases, machine learning, scrapers and requests for access to information. Its purpose is to make power relations easier to understand through visualizations, so that relationships can be identified and information is readily accessible for in-depth investigations, such as those into corruption. It lets users search by name and explore public officials and their level of importance through graphs, visualizations and filters. The resulting analysis is easy to share with colleagues, and the graph can be downloaded as an image or as data. The platform is in beta; it involves large amounts of data and an updated data card. The more data it holds, the more power relations we can uncover.
The project is supported by the Open Government Partnership (OGP), the Civil Association for Justice, the journalism team of the Tres de Febrero University, the University of San Martín and 20 journalists who collaborated in the research and usability testing process.
To learn more, visit the platform and enter a name in the search box, for example Mauricio Macri, to see that person's links. In the left menu you can set the graph size from level 1 to level 3; level 3 shows the most relationships and links.
We met with journalists and newsroom managers to gather feedback. The business plan includes a subscription model and, in the future, a proposal for international cooperation support. The platform must be independent to earn the trust of its users.
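One way to picture the three graph levels is as a breadth-first expansion around the searched name, where each level adds one more hop of relationships. The sketch below is only an illustration under that assumption: the function name `neighborhood`, the adjacency-dictionary layout and the sample data are hypothetical and are not taken from the platform.

```python
from collections import deque


def neighborhood(graph, start, max_level):
    """Return {node: level} for every node within max_level hops of start."""
    if max_level not in (1, 2, 3):
        raise ValueError("level must be 1, 2 or 3")
    levels = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if levels[node] == max_level:
            continue  # do not expand beyond the requested level
        for neighbor in graph.get(node, ()):
            if neighbor not in levels:
                levels[neighbor] = levels[node] + 1
                queue.append(neighbor)
    return levels


# Hypothetical, made-up relationships (not real data from the platform).
EXAMPLE_GRAPH = {
    "Person A": ["Company X"],
    "Company X": ["Person A", "Person B"],
    "Person B": ["Company X", "Company Y"],
    "Company Y": ["Person B"],
}

if __name__ == "__main__":
    for level in (1, 2, 3):
        print(level, sorted(neighborhood(EXAMPLE_GRAPH, "Person A", level)))
```

Under this reading, raising the level never removes nodes; it only adds those one hop further away, which is why level 3 shows the most links.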

What makes this project innovative?

State bureaucracy, the long turnaround of requests for access to information (the State often refuses to provide it), and the need to protect the identity of sources and of the journalist investigating a corruption case all make this a tool that works alongside the journalist, providing verified, high-quality and easy-to-read information through visualizations, graphs and filters. By combining scrapers, machine learning, algorithms and open data techniques, the tool lets a journalist discover in minutes relationships that could take weeks in a traditional investigation. Besides supporting journalists and researchers, it saves time for work teams and newsrooms.

What was the impact of your project? How did you measure it?

The first (beta) launch was covered by newspapers such as Pagina12, presented on the C5N news channel, and picked up by several media outlets that seek to show the relationship between companies and officials. As in many countries, information in Argentina is polarized, so this free tool strengthens democracy and free expression by helping people to be informed. We held a series of meetings with journalists and researchers, who showed great interest in it as a research support tool that saves time and uncovers data and power relations that were not previously known. Although this is not an online metric, it is a valid one for a tool that does not aim for mass-audience numbers but to contribute to research, open data and support for journalists. On the landing page we offered an e-mail sign-up so that journalists and academics could try the tool. In one week we received about 150 e-mails.

Source and methodology

Argentina has a public registry of companies and NGOs that lists all their members and how they change over time. The first problem is that the data is not easy to use, for several reasons:

1 - The size of the registry: it weighs more than 1 GB, so analyzing it requires some knowledge of and expertise in programming tools.
2 - Many typical data entry problems. Names are written in free-text fields, so the same person is recorded in several ways. For example, the current President appears written in 3 different ways and our former President in 5, which makes it impossible to cross-reference these records within a single source.

To cross-reference data on officials and companies we had to use different sources of information and tools. We use scrapers to obtain data from closed or web-only formats, we file requests for access to information, and we structure the data.

Sources that we use to feed the analysis and visualization platform:

1 - For justice data, the data of the IGJ (General Inspection of Justice).
2 - Public contracting data from the City of Buenos Aires and the National State, through the open data portal.
3 - The organizational chart of officials of the National State, obtained through a request for access to public information.
4 - The organizational chart of officials of the City of Buenos Aires, through the open data portal.
5 - Machine learning with a database of names to infer the sex of the people.
6 - Structured name data.
7 - A scraper we built for the Official Bulletin of Argentina, to relate companies to their public purchases and contracts.
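The free-text name problem described above can be illustrated with a small normalization step that maps spelling variants of the same name to one key. This is a minimal sketch, not the team's actual method: the function names `name_key` and `group_variants` and the sample names are hypothetical, and it only handles case, accents, punctuation and word order.

```python
import unicodedata
from collections import defaultdict


def name_key(raw):
    """Build an accent-insensitive, order-insensitive key for a personal name."""
    # Decompose accented characters and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", raw)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Replace anything that is not a letter or digit with a space.
    cleaned = "".join(c if c.isalnum() else " " for c in no_accents.lower())
    # Sort the tokens so "SURNAME, Name" and "Name Surname" share a key.
    return " ".join(sorted(cleaned.split()))


def group_variants(names):
    """Group raw name strings by their normalized key."""
    groups = defaultdict(list)
    for name in names:
        groups[name_key(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Made-up variants of a generic name, for illustration only.
    sample = ["GARCÍA, Juan", "Juan Garcia", "  juan   garcía ", "Ana Pérez"]
    for key, variants in group_variants(sample).items():
        print(key, "->", variants)
```

A key like this does not catch typos, missing middle names or two different people who share a name, so in practice it would need to be combined with other fields or with manual review.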

Technologies Used

We develop scrapers and the rest of the stack:

1 - A scraper for the Official Bulletin, because it does not offer a way to download data.
2 - A scraper to collect profile data of officials and politicians from Wikipedia.
3 - A scraper for the National Center of Argentina.
4 - Python to analyze, order, structure and unify the data.
5 - Neo4j as the database.
6 - Python with Flask as the backend.
7 - JavaScript with Vue.js for the front end.
8 - D3.js for the multiple dynamic visualizations.

We release the datasets for reuse to contribute to the open data culture. The code is also freely available so the project can be replicated.
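The parsing half of a scraper like the ones listed above can be sketched with Python's standard `html.parser` module. The markup in `SAMPLE_HTML`, the class `TableRowParser` and the helper `parse_rows` are hypothetical; the real Official Bulletin pages will be structured differently, and fetching the pages is left out on purpose.

```python
from html.parser import HTMLParser


class TableRowParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:  # skip rows without <td> cells, such as headers
                self.rows.append(self._row)
            self._row = None


def parse_rows(html):
    """Return the table rows found in an HTML string as lists of cell text."""
    parser = TableRowParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up markup and values, for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Company</th><th>Amount</th></tr>
  <tr><td>Example SA</td><td>1000</td></tr>
  <tr><td>Sample SRL</td><td> 250 </td></tr>
</table>
"""

if __name__ == "__main__":
    for row in parse_rows(SAMPLE_HTML):
        print(row)
```

Keeping parsing separate from downloading makes it possible to test the extraction logic against saved pages before running it on the live site.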

Project members

Franco Bellomo, Yas García, Alison Depsky, Nicolas Grossman

