Open Data missing layers
The Open Data movement is growing: there are more and more portals, more and more publishers, and more and more people using the data. As the movement grows, both the people publishing the data and the people using it evolve. Yet even though the Open Data stack is evolving too, some layers and some tools are still missing. And, as an ecosystem, we should care more about these Open Data missing layers!
Open Data weird diffusion
Open datasets are usually produced by people who know what they are doing. They may be a little rusty, they may use old-fashioned tools, they may be large teams with internal power struggles, but data producers know their data.
What they usually don’t master is who will reuse their open data, through which media it will reach people, and how to market and advertise it.
The main problem: open data is supposed to be useful to a lot of different people, and it won’t benefit the producer who opened it unless it reaches enough people across enough different segments.
Your open data may be used by:
- Developers in their free time,
- Developers at work,
- Data analysts,
- Excel experts,
- Excel novices,
- Curious citizens,
- Less curious citizens,
- Busy data journalists,
- Mobile readers,
- Low-bandwidth users,
- Search engine bots,
- Other data publishers.
If you are in charge of a data portal, do you know how each of these customer segments accesses, reuses, and makes the most of your data?
The existing strong layers
Since the early days of opening data around 2008, some pretty strong layers have emerged. Mostly because of the design of the first big Open Data platforms – think CKAN – those layers are: downloading data, choosing a data format, and defining useful metadata. And that’s pretty much it! That’s already awesome, and when we gathered a list of 1,600+ Open Data portals worldwide, we found that a lot of portals around the world are really neat, useful, and well designed around those layers.
If you add the growing presence of APIs, the growing number of linked data portals, and the wide sharing of good practices in data publishing, the stack is becoming more and more complete. However, I’m not sure that even half of the customer segments listed above can actually reuse the data.
The ‘Leave no man behind’ strategy of Open Data
The inherent nature of Open Data is to empower everybody through appropriate access to reusable information. And when I say everybody, I really mean everybody. Open Data publishers must adapt their media to a large spectrum of data customers. And only a small fraction of those people is willing to search for a portal, find a dataset, download it, filter some records, and plot them before emailing the result to a colleague.
Now consider that, in addition to the basic layers (bulk download of raw data), Open Data portals offer a ready-to-use, no-account-required API to access raw data. Even better, consider that they offer ready-to-use tools to analyse the data, to create charts on the go, or – better still – data that is already charted automatically. The same idea applies to geographic data, which should always be shown directly on a map. These layers allow people without technical skills to be independent in their use of the data, to understand a trend, and to share it easily in a few minutes without having to download the data onto their own device. There is absolutely no reason why data usage should be reserved for skilled people. Just as blogs and social media have given everybody the opportunity to say something to the world, Open Data should allow anyone to easily tell a story.
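To make the “no-account-required API” idea concrete, here is a minimal Python sketch that builds a records-search URL in the style of a typical open data records API. The portal domain, endpoint path, and dataset id are placeholders, not references to any specific portal:

```python
from urllib.parse import urlencode

def build_records_url(portal, dataset, query=None, rows=10):
    """Build a search URL for a hypothetical open data records API.

    No API key or account is required for public datasets; the portal
    domain, path, and dataset id used below are invented placeholders.
    """
    params = {"dataset": dataset, "rows": rows}
    if query:
        params["q"] = query
    return f"https://{portal}/api/records/1.0/search/?{urlencode(params)}"

# Example: query a hypothetical air-quality dataset for "ozone" records.
url = build_records_url("data.example.org", "air-quality", query="ozone", rows=5)
print(url)
```

Anyone can paste such a URL into a browser and get data back immediately, which is exactly the kind of low-friction access the basic download layer alone does not provide.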
If we want everyone to access data, we have to empower those without the skills! Via embeds, dashboards, emails, tweets, Facebook posts, snaps (I really can’t wait to be able to snap data to friends), Slack messages, or basic HTML tables, every medium should be an accessible data medium.
Another huge part of the stack that is still weak is data discovery. For now, there are three ways to discover data:
- Hours of scrolling through pages and pages of unrelated datasets,
- 2000s-era search experiences based on incomplete, human-curated lists of tags,
- Rich linked-data SPARQL queries, which require experience and skill.
That’s not okay in 2016! Every single online shopping site has a recommendation tool. Every single news website suggests other articles when you’ve finished one. When there were only a couple thousand open datasets available online, it was fine for everyone to publish static, Markdown-written lists of links. Now we need a Netflix/Medium/Spotify kind of experience for data discovery! I do believe the path to that kind of experience relies on much more semantics under the hood, but the experience itself must surface far less of it.
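A “related datasets” feature doesn’t have to be complicated. As a minimal sketch (all dataset ids and tags below are invented), a content-based recommender can start from simple tag overlap, scored with Jaccard similarity:

```python
def jaccard(a, b):
    """Similarity between two tag sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def recommend(catalog, dataset_id, k=3):
    """Return up to k datasets whose tags overlap most with dataset_id's."""
    tags = catalog[dataset_id]
    scored = [(other, jaccard(tags, t))
              for other, t in catalog.items() if other != dataset_id]
    scored.sort(key=lambda x: (-x[1], x[0]))  # best score first, ties by id
    return [d for d, score in scored[:k] if score > 0]

# Toy catalog mapping dataset ids to their tags (all names invented).
catalog = {
    "bike-counts":  {"transport", "cycling", "sensors"},
    "traffic-flow": {"transport", "cars", "sensors"},
    "tree-census":  {"environment", "trees"},
    "air-quality":  {"environment", "sensors"},
}
print(recommend(catalog, "bike-counts", k=2))
# → ['traffic-flow', 'air-quality']
```

This is the crudest possible version of the idea; the point is that even a few lines of tag arithmetic already beat a static list of links, and real portals can layer richer semantics underneath without exposing them to the user.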
Open Data missing layers
In a way, I’ve described an Open Data ecosystem where most of the layers already exist and are used every day. What’s missing is more integration between them, greater consistency, and much more fluidity in the overall experience of working with data.
“Only two ways to make money in business: one is to bundle; the other is to unbundle.” – Jim Barksdale
This article was first published on Medium.