Nope, HTML is not Open Data
In this column, our Chief Data Officer shares his view on why HTML is not Open Data. In need of an opinionated post? Let’s open the discussion!
This is a quick reminder, because I’m so sick of scrolling through hundreds of open datasets, looking for new data and new ideas, and finding almost exclusively HTML resources. Let’s be clear: HTML resources are a terrible excuse for Open Data, and they just make me want to leave your portal!
Several reasons explain why thousands of HTML links end up on Open Data portals:
1) Vanity metrics: Open Data portals like data.gov or data.gouv.fr like to see themselves as “platforms” and “huge catalogs”, allowing every single actor in their administration to pretend that they opened something. This is problematic because when your goal is driven by vanity (for example, favouring quantity over quality), you put anything on your portal. That ruins the overall experience of looking for data. It hurts the honest organizations that publish real, clean open data. And it ruins your own portal in the eyes of potential future data publishers.
2) Laziness: a lot of organizations have heard that Open Data is cool and/or were pressured by government officials to open some of their data. If not for their laziness or lack of knowledge, they could have:
- Fought for real Open Data, and made the data clean and usable,
- Fought against Open Data, and not released anything at all.
Instead of either solution, they chose to copy-paste a vague link to a page on their own website. And most of the time it links to data formatted as an HTML table or to a badly shaped PDF.
3) Data washing: just as we’ve seen a lot of green-washing and social-washing among organizations, there is a real tendency toward data-washing. And once again, it’s cheaper to share a couple of links on data.gov than to build a real data portal.
4) Scam: I don’t know about the US or the UK, but on the French Open Data portal, some people are trying to achieve better SEO rankings through HTML resources. Some of them are also trying to get free advertising, linking to pages selling products.
5) Lack of understanding about Open Data: I’ll take it easy on the subject.
If you are responsible for a data portal: I have no problem with links to open data hosted elsewhere, but please, the resource should link to an actual data file! You can even add a `linked portal` to each dataset. But an agency website on its own is not an open data resource.
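For portal maintainers who want to act on this, here is a minimal sketch of how HTML resources could be flagged before they are published. It is only an illustration: the function name `is_html_resource`, the list of extensions, and the example URLs are assumptions of mine, not the API of any real portal, and a heuristic like this will miss some cases.

```python
from urllib.parse import urlparse

# Hypothetical helper for illustration; not part of any portal's API.
HTML_MIME_TYPES = {"text/html", "application/xhtml+xml"}
HTML_EXTENSIONS = (".html", ".htm", ".php", ".aspx")


def is_html_resource(url, mime_type=None):
    """Return True when a resource looks like a web page rather than a data file."""
    if mime_type:
        # Trust the declared media type when there is one.
        # Drop parameters such as "; charset=utf-8" before comparing.
        base_type = mime_type.split(";")[0].strip().lower()
        return base_type in HTML_MIME_TYPES

    # Otherwise fall back to the shape of the URL path.
    path = urlparse(url).path.lower()
    # A bare domain or a trailing slash almost always serves a web page.
    if path == "" or path.endswith("/"):
        return True
    return path.endswith(HTML_EXTENSIONS)


# Made-up catalog entries: (resource URL, declared media type or None).
resources = [
    ("https://example.org/agency/", None),
    ("https://example.org/data/budget.csv", "text/csv"),
    ("https://example.org/reports/index.html", None),
]

flagged = [url for url, mime in resources if is_html_resource(url, mime)]
```

In this made-up catalog, the agency home page and the `index.html` report end up in `flagged`, while the CSV file passes. A stricter check could request each URL and inspect the `Content-Type` header the server actually returns, since declared media types are often wrong.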
If you are in charge of opening your organization’s data, there are tons of documentation online about how to benefit from Open Data. Doing it is not free, and it’s not totally trivial, but when it’s done well it’s game-changing!
If you are a scammer, remember the early days, before Amazon and eBay, when scammers could sell something online without shipping it? Posting HTML links as Open Data is pretty much the same thing. That phase of the internet ended very quickly. Posting HTML as open data needs to end as well.