We downloaded 26 million records from the property register. DR's Investigative Datadesk has collected all publicly available information about properties and land in the public register called Tinglysningen. The information was collected between 30 August 2016 and 20 October 2016. 48 virtual servers in the cloud were used to pull data on all properties in Denmark into a database at DR. The purpose of collecting data on all properties in Denmark is to enable the editorial staff to analyze and understand property values, ownership and borrowing in order to convey new knowledge to the Danes.

All of the information can already be viewed on tinglysning.dk by searching for an address or a cadastral (land parcel) number. What is extraordinary is that DR's Investigative Datadesk has gathered the information in a form that allows the editors to analyze property data across the whole data set rather than one property at a time.

The work has been awarded the Nordic Datajournalism Award for best Feature 2018, with the following words from the jury: "Danmarks Radio has analyzed and mapped the Danish housing market based on a spectacular data set. The data set was harvested by the editorial staff at Danmarks Radio. The stories are told in a down-to-earth language with impressive visualizations."
What makes this project innovative?
There were several obstacles in the process. The biggest obstacle is the limit on the number of requests that can be made per IP address (a unique address on the internet) per day. The other obstacles were long response times, crashes on specific queries and the capacity of the register's servers. To minimize the risk of overloading the register's servers, we chose to run two servers at a time per hour. Without the IP restriction, one server would have been ample. Instead, we created 48 virtual servers with a cloud provider, as well as a NoSQL database and a key-value store.

The database was configured and a complete list of all addresses in Denmark was loaded into it. Address data is available at aws.dk, which contains all addresses in Denmark. The data is maintained by Denmark's Address Register (DAR) under the Agency for Data Supply and Efficiency.

The copy of the register fills 26 million rows in an encrypted database. The database contains information about 3.5 million ownership entries, 4.7 million creditors, 3.5 million counters and 7.9 million waiters.
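As a rough illustration of the setup described in this section, the sketch below splits an address list across 48 servers and rotates two servers per hour. It is not DR's actual program: the function names, the round-robin split, and the reading of "two servers at a time per hour" as 24 hourly slots of two servers each are all assumptions made for the example.

```python
NUM_SERVERS = 48        # number of virtual servers, from the text
SERVERS_PER_HOUR = 2    # servers active at the same time, from the text


def hourly_schedule(num_servers=NUM_SERVERS, per_hour=SERVERS_PER_HOUR):
    """Map each hourly slot to the indices of the servers active in it.

    Assumption: servers take turns in fixed groups, so 48 servers at two
    per hour give 24 slots, one for each hour of the day.
    """
    slots = num_servers // per_hour
    return {
        slot: list(range(slot * per_hour, (slot + 1) * per_hour))
        for slot in range(slots)
    }


def partition(addresses, num_servers=NUM_SERVERS):
    """Split the address list round-robin so each server (and therefore
    each IP address) gets its own share of the queries."""
    return [addresses[i::num_servers] for i in range(num_servers)]


if __name__ == "__main__":
    schedule = hourly_schedule()
    shares = partition([f"address-{n}" for n in range(100)])
    print(schedule[0], schedule[23])
    print(len(shares), shares[0])
```

Spreading the addresses evenly keeps each server's request count, and so each IP address's daily total, roughly equal, which is the point of using many servers when the limit is applied per IP address.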
What was the impact of your project? How did you measure it?
We published more than 30 articles and infographics with the data as the source. The stories covered the properties of Denmark, the worth of all properties, the worth of lords and barons, the property tax system in Denmark and even a story that located the "castles" of outlaw bikers and gangs in Denmark.
Source and methodology
There were several obstacles in the process. The biggest obstacle is the limit on the number of requests that can be made per IP address (a unique address on the internet) per day. The other obstacles were long response times, crashes on specific queries and the capacity of the register's servers.

To minimize the risk of overloading the register's servers, we chose to run two servers at a time per hour.

Without the IP restriction, one server would have been ample. Instead, we created 48 virtual servers with a cloud provider, as well as a NoSQL database and a key-value store.

The database was configured and a complete list of all addresses in Denmark was loaded into it. Address data is available at aws.dk, which contains all addresses in Denmark. The data is maintained by Denmark's Address Register (DAR) under the Agency for Data Supply and Efficiency.

A program developed for the occasion was run on all 48 servers, querying Tinglysningen once per address. Each address query returns a number of entries in either a dossier or a member book. Each of these is retrieved and saved.

The start of the collection was communicated to the IT Information Provider. Likewise, for good measure, contact information was included in the requests we made.

Two months later, the whole list of addresses had been run and the data collected.

The data was downloaded to DR City, where it was put into a relational database to connect the data and create an overview. Items with obviously wrong values, e.g. an interest rate of 450,000 percent, were filtered out in the final analysis.

The database is encrypted, so only the editorial team's programmer has access to the information. All data is handled with due regard to the rules on processing sensitive data.
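The filtering of obviously wrong values mentioned above, such as the 450,000 percent interest rate, could be sketched roughly as follows. The field name `interest_rate_percent` and the 100 percent cut-off are hypothetical choices for this example; the text does not say which threshold or schema was actually used.

```python
# Assumed upper bound for a believable interest rate, in percent.
MAX_PLAUSIBLE_RATE_PERCENT = 100.0


def split_plausible(records, max_rate=MAX_PLAUSIBLE_RATE_PERCENT):
    """Separate records into (kept, rejected) by interest rate.

    Records without a rate are kept, since a missing value is not
    evidence of a wrong one. Rejected records are returned rather than
    discarded so they can be inspected by hand.
    """
    kept, rejected = [], []
    for record in records:
        rate = record.get("interest_rate_percent")  # hypothetical field
        if rate is None or 0 <= rate <= max_rate:
            kept.append(record)
        else:
            rejected.append(record)
    return kept, rejected


if __name__ == "__main__":
    sample = [
        {"id": 1, "interest_rate_percent": 4.5},
        {"id": 2, "interest_rate_percent": 450000.0},
        {"id": 3},
    ]
    kept, rejected = split_plausible(sample)
    print([r["id"] for r in kept], [r["id"] for r in rejected])
```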
MADS RAFTE HEIN (designer), JENS LYKKE BRANDT (developer), BO ELKJÆR (journalist), ALEXANDER HECKLEN (journalist), KRESTEN MORTEN MUNKSGAARD (journalist)