Jens Finnäs, founder and data journalist at Journalism++ Stockholm
Sascha Granberg, data journalist at Journalism++ Stockholm
Peter Grensund, data journalist at Journalism++ Stockholm
Journalism++ Stockholm (J++) is a freelance agency for data-driven journalism. We analyze data and tell stories with it, and we do consultancy and in-house training for newsrooms. As the first agency dedicated to data-driven journalism in Sweden, J++ has had a pivotal role in developing the Swedish data journalism scene, creating both tools and methodology for colleagues.

Among the things we have built and done for the benefit of the community (and ourselves):
- Statscraper: A standard and a library for creating scrapers. Statscraper provides a unified interface for both building and using scrapers, as well as standardised data formats. https://pypi.python.org/pypi/statscraper/1.0.6
- Thenmap: A repository for historical administrative borders that sees a lot of usage from countries such as Finland and Switzerland, where there are often multiple municipality mergers each year: http://jplusplus.org/en/work/thenmap/
- Our internal fact-checking method: https://github.com/jplusplus/check/blob/master/method.en.md
- Lots of other stuff: https://gitlab.com/jplusplus/

Some other things we did over the past 12 months:
- Newsworthy: A news service that automates story digging in statistical data: https://www.newsworthy.se/en/
- Teaching journalists to code in two five-week courses in Python for newsroom usage.
- Giving many, many in-house trainings to newsrooms.
- We used current and historical population grids to calculate the average distance to the nearest labour ward for citizens in each Swedish municipality, now and in the past. An ambitious piece of GIS research, resulting in a web app: http://bb-resan.ottar.se/
- We took Dagens Samhälle's ambitious municipality ranking online, and used a robot writer to expand on which figures stood out in each municipality, rather than just publishing the figures: http://superkommuner2017.dagenssamhalle.se/
- Running a scraper that collects all meeting minutes from the municipality of Tampere. The documents are scanned for certain topics, and alerts are sent to reporters at the Aamulehti newspaper when a topic they are following pops up: http://jplusplus.org/en/work/monitoring-tampere-municipality/
- We wrote weekly data-driven articles for Dagens Samhälle, using tools such as scrapers and GIS software to take their data reporting one step further: http://jplusplus.org/sv/work/dagens-samhalle/
- We helped Expressen map criminal gangs and deadly shootings in Stockholm: https://www.expressen.se/nyheter/qs/gangen-inifran/konfliktzon-stockholm/192/
- For the third consecutive year, we have been monitoring how well the 290 municipalities of Sweden are covered by news media outlets, keeping a database of media presence and ownership structures: http://mediestudier.se/kommunbevakning2017/
- We were part of the Security For Sale investigation: https://thecorrespondent.com/10221/security-for-sale-the-price-we-pay-to-protect-europeans
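
The labour ward analysis in the list above comes down to a population-weighted average of nearest-facility distances. The following is a minimal sketch of that idea, not the code behind the web app: it assumes straight-line (haversine) distances rather than travel distances, and the function names, field names, grid cells and ward coordinates are all made up for illustration.

```python
from collections import defaultdict
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def mean_distance_by_municipality(cells, wards):
    """Population-weighted mean distance to the nearest ward, per municipality.

    cells: dicts with the hypothetical keys "municipality", "lat", "lon"
           and "population" (one dict per population grid cell).
    wards: (lat, lon) tuples, one per labour ward.
    """
    weighted_km = defaultdict(float)
    population = defaultdict(float)
    for cell in cells:
        # Distance from this grid cell to its closest ward.
        nearest = min(
            haversine_km(cell["lat"], cell["lon"], w_lat, w_lon)
            for w_lat, w_lon in wards
        )
        weighted_km[cell["municipality"]] += nearest * cell["population"]
        population[cell["municipality"]] += cell["population"]
    # Skip municipalities without inhabitants to avoid dividing by zero.
    return {m: weighted_km[m] / population[m]
            for m in population if population[m] > 0}


# Invented example data: two grid cells in one municipality, one ward.
example_cells = [
    {"municipality": "A", "lat": 60.0, "lon": 15.0, "population": 100},
    {"municipality": "A", "lat": 60.5, "lon": 15.0, "population": 300},
]
example_wards = [(60.0, 15.0)]
example_result = mean_distance_by_municipality(example_cells, example_wards)
```

Running the same function on a historical population grid and a historical list of wards would give the "in the past" figures for comparison.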
What makes this project innovative?
We have been developing not only tools, but also methodology for doing data-driven journalism, such as the fact-checking method mentioned above and workshop concepts for news desks that want to start doing more data-driven work. As it turns out, it is harder to change newsroom culture, processes and workflows than it is to introduce new technology. We are therefore focusing more and more on helping newsrooms bring data-driven journalism into their workflow, rather than just teaching them the technical skills.

Whenever possible we combine training and research, working closely with journalists over, for example, a series of workshops and finishing with a publication.
What was the impact of your project? How did you measure it?
Media coverage database: The findings from the latest publication in December 2017 were picked up by a state public report on the state of the media industry (and gained traction in local news media).

The maternity clinic analysis has been picked up in the debate ahead of the upcoming general election (and gained traction in local news media).

Newsworthy: Since October 2017 we have offered local newsrooms across Sweden a trial version of the service. We have delivered thousands of news leads to reporters in most of the large local media groups, including public service. For each batch of leads delivered, we see stories that would probably not have been discovered and written without Newsworthy.

Statscraper: Since version 1.0.0 was released in August 2017, tens of scrapers have been written on top of Statscraper. That means tens of wheels that did not have to be reinvented.

Trainings: We always build our in-house trainings on real cases, with real publications as the ideal outcome. Most of our workshop series have seen at least one or two publications as a direct outcome, followed by more indirect (and harder to measure) follow-ups.
Source and methodology
Data gathering ranges from manually building our own databases in spreadsheets to setting up scraper parks on Amazon AWS.

Verification happens in multiple stages, the last one being a line-by-line check by the colleague least involved in the project. (Routines before that differ a bit depending on the needs of the project.)
Analysis is mostly done in Python and pandas. We have more or less abandoned Excel in favour of doing analysis programmatically. That helps us reduce errors and increase transparency and reproducibility.

Some parts of Newsworthy run in R. Web apps are mostly built in Node.js. Simple geodata analysis is mostly done in QGIS, more complex analysis in Python. When analysing data on a larger scale, we mostly use Amazon AWS for storing data, running code, etc.

We believe data journalism should have as much of an angle to its stories as other journalism, and that often means cutting back on exploratory graphics and using more static graphics. When interactivity is called for, it has often been done in d3, or in simple cases vanilla JS.
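
To illustrate the point above about doing analysis programmatically rather than in Excel, here is a minimal pandas sketch of a per-capita ranking of municipalities. It is an assumed example, not code from any of our projects: the function name, column names and figures are invented. Every step from raw counts to final rank is written down, so a colleague can rerun and check it.

```python
import pandas as pd


def rank_per_capita(df: pd.DataFrame) -> pd.DataFrame:
    """Add a per-1,000-inhabitants rate and a rank (1 = highest rate).

    Expects the hypothetical columns "municipality", "population", "count".
    """
    out = df.copy()  # leave the input untouched, so reruns start clean
    out["per_1000"] = out["count"] / out["population"] * 1000
    out["rank"] = out["per_1000"].rank(ascending=False, method="min").astype(int)
    return out.sort_values("rank").reset_index(drop=True)


# Invented figures, for illustration only.
data = pd.DataFrame({
    "municipality": ["A", "B", "C"],
    "population": [10_000, 50_000, 20_000],
    "count": [30, 100, 80],
})
ranked = rank_per_capita(data)
print(ranked)
```

Because the input frame is copied rather than modified, the same script gives the same result each time it runs on the same data, which is the reproducibility a spreadsheet edited by hand does not guarantee.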