Understanding and accelerating data usage with data lineage

Data portals need to demonstrate their impact and meet user needs by providing the right data assets to generate reuse. We explore how our customers are using the Opendatasoft data lineage feature to analyze portal performance and continually improve the experience they provide.

VP of Marketing, Opendatasoft

Whether your data portal is internal, aimed at partners or open to all, it is vital that you understand how your data is being used. This enables you to demonstrate the impact of your program, helps convince data owners to share their data, helps secure future funding and resources, and enables you to react more quickly to changing user needs.

That’s why Opendatasoft has launched its unique, innovative data lineage feature. Focused on usage, it allows organizations to better understand how their data is used internally and externally, across data ecosystems, while improving the ease and efficiency of data portal management.


Data lineage is vital to running successful data portals at both an operational and strategic level.

By understanding data usage through lineage you can focus your data roadmap and make better and more informed decisions around:

  • Publishing new datasets – Which new datasets should I publish first? Which datasets do my users want me to make available?
  • Updating and managing datasets – Which datasets should I update as a priority? Who is using my data within my ecosystem? How might they be impacted by any changes I make?
  • Sharing and demonstrating the impact of a data portal – How do I know if my data portal is used and is providing value? How can I know if particular pages or datasets depend on my own data?

Opendatasoft’s data lineage feature has been created to uncover relationships between data and how it is used, enabling better operational and strategic decision-making by providing understandable insights to data portal managers. It delivers value at both the dataset level, through data mapping, and at a portal level, providing strategic and operational insights through an intuitive dashboard.


It is vital to understand the journey of individual datasets, from the moment they are generated or added to your portal, through their different uses inside it, to their reuse within other applications. Data lineage mapping provides insights into where datasets are being used (such as within pages, visualizations and even by other portals within the Opendatasoft ecosystem) and who is reusing them internally or externally.

This means that if you need to make changes to a dataset, or even remove it from the portal, you can identify who will be affected. This enables you to contact the relevant users and keep them informed.
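To make the idea of impact analysis concrete, here is a minimal sketch of how "who will be affected" can be worked out from lineage relationships. It is not Opendatasoft's implementation or API: the edge list, the identifiers and the `affected_by` helper are all hypothetical and exist only for illustration. The sketch assumes lineage can be modelled as a directed graph in which each edge points from a source to something that consumes it.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges (source, consumer). Every identifier below is
# invented for illustration and does not come from a real portal.
EDGES = [
    ("dataset:air-quality", "page:environment-dashboard"),
    ("dataset:air-quality", "chart:pm10-trend"),
    ("dataset:air-quality", "portal:partner-city"),
    ("portal:partner-city", "page:partner-air-report"),
    ("dataset:bus-stops", "page:mobility-map"),
]


def build_graph(edges):
    """Index the edges so each source maps to the list of its direct consumers."""
    graph = defaultdict(list)
    for source, consumer in edges:
        graph[source].append(consumer)
    return graph


def affected_by(graph, start):
    """Return every direct or indirect consumer of `start`, in breadth-first order."""
    seen = {start}  # guards against cycles and keeps `start` out of the result
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for consumer in graph.get(node, []):
            if consumer not in seen:
                seen.add(consumer)
                order.append(consumer)
                queue.append(consumer)
    return order


graph = build_graph(EDGES)
impacted = affected_by(graph, "dataset:air-quality")
for item in impacted:
    print(item)
```

In this sketch, removing the air quality dataset would touch two direct reuses, a partner portal, and a page on that partner portal that depends on it indirectly. The indirect dependencies are the ones easiest to miss without a lineage view.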

Digital Wallonia: Manage data deprecation with intelligence

Digital Wallonia, which supports digitization across the Belgian region of Wallonia, uses data lineage insights to manage data deprecation. If the team is looking to delete a dataset, they go through data lineage to see if it’s used, and if so where and by whom. This helps with daily decision making and enables them to minimize any risks to reuses when deleting datasets.

Screenshot: Digital Wallonia data lineage use case
This feature helps us verify if a dataset is used before deleting it, identify data sources for regional insights, and track data flow performance through KPI monitoring. It's a great asset for efficient data management and informed decision-making.
Marie-Bénédicte Laridant
Open Data Analyst, Digital Wallonia

UK Power Networks: Uncovering new data reuse insights

In the energy industry, the comprehensive open data portal of UK Power Networks (UKPN) is used by a wide community of users, ranging from local authorities and developers to consumers. While UKPN was able to see which of its datasets were most popular, it couldn’t automatically understand what those datasets were being reused for, or who was using them. Now, thanks to data lineage, it has a clearer view of the types of reuse, all while preserving user anonymity. This helps it better plan its strategy and roadmap for releasing new datasets.

UKPN data lineage use case
From a data publisher perspective, one of the problems experienced with open data is understanding what data users are doing with the data. Whilst Opendatasoft facilitates the submission of reuses already, this new data lineage feature provides additional insight into the maps and charts that users have built whilst maintaining user anonymity, that we would not have known about. This adds to the value of open data.
Yiu-Shing Pang
Open Data Manager, UK Power Networks

As well as understanding how individual datasets are being reused, administrators need to be able to demonstrate the overall value and impact of their data portal. That requires a higher-level view of which datasets are most valued, and who the top data consumers are. Administrators also need to be able to manage their portal efficiently, identifying any issues (such as invalid relationships or underused datasets).

Opendatasoft data lineage delivers this overview through an intuitive, interactive dashboard that shows portal administrators where data is being used (internally and externally) and which datasets are most popular, and that flags operational issues around data quality. These insights demonstrate the impact that a portal is having, and any areas where it can be improved. For example, if a dataset is underused, administrators could choose to feature it more visibly on the portal home page to increase engagement, or alternatively deprecate it. The dashboard therefore aids better, more informed decision-making around the direction and strategy for your portal.
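As a rough illustration of the kind of aggregation that sits behind such an overview, the sketch below ranks datasets and data consumers by number of reuses and flags published datasets that fall under a reuse threshold. It is a simplified sketch under stated assumptions, not a description of how the Opendatasoft dashboard is built: the usage records, the dataset and consumer names, the `summarize` helper and the `min_reuses` threshold are all hypothetical.

```python
from collections import Counter

# Hypothetical reuse records (dataset, consumer), invented for illustration.
USAGE = [
    ("timetables", "city-portal"),
    ("timetables", "mobility-app"),
    ("timetables", "city-portal"),
    ("level-crossings", "city-portal"),
    ("customer-satisfaction", "energy-co"),
]

# Hypothetical list of everything published on the portal.
PUBLISHED = ["timetables", "level-crossings", "customer-satisfaction", "bridges"]


def summarize(usage, published, min_reuses=1):
    """Rank datasets and consumers by reuse count and list underused datasets."""
    by_dataset = Counter(dataset for dataset, _ in usage)
    by_consumer = Counter(consumer for _, consumer in usage)
    # A published dataset with fewer than `min_reuses` recorded reuses is flagged.
    underused = [d for d in published if by_dataset[d] < min_reuses]
    return by_dataset.most_common(), by_consumer.most_common(), underused


top_datasets, top_consumers, underused = summarize(USAGE, PUBLISHED)
print("Most reused dataset:", top_datasets[0])
print("Top data consumer:", top_consumers[0])
print("Underused datasets:", underused)
```

A flagged dataset is only a prompt for a decision: as described above, an administrator might promote it on the home page rather than deprecate it.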

SNCF: Building new connections and uncovering new insights

French railway operator SNCF publishes an enormous range of data on its portal, from timetables and customer satisfaction metrics to network information such as the location of bridges and level crossings. This data can be reused by a wide variety of individuals and organizations, from mobility players to energy companies and municipalities, in apps, websites and data visualizations. SNCF therefore uses the data lineage dashboard to build new connections, encourage collaboration and start conversations around particular data needs with its data ecosystem. It can now identify external stakeholders that are using particular datasets and reach out to them to understand their data needs and how SNCF can best meet them.

Screenshot of SNCF data lineage use case
The use of data lineage allows us to easily visualize the relationships between different stakeholders. In an open data portal context, it is particularly useful in encouraging collaboration with other players in the Opendatasoft ecosystem. It elevates the open data approach to the next level, giving it more meaning and value. Overall, it creates new opportunities for SNCF across its ecosystem thanks to the increased visibility that the data lineage feature provides.
Bertrand Billoud
Head of open data and content platforms, SNCF

OFGL: Identifying dependencies and improving data quality

OFGL is the French government body responsible for collecting, analyzing and sharing information on the finances of local government agencies, from municipalities and departments to entire regions. While it believed it knew who was using its data, data lineage has provided definitive evidence of who is consuming its data, enabling it to better demonstrate the value of its data portal.

Ranking of data consumers using OFGL datasets (snapshot of the data lineage dashboard)
We discovered that local communities, as well as other sector stakeholders, were using our data by integrating them into their own open data portal. At this stage, we had assumed such direct uses, but without concrete knowledge. The lineage statistics page provides easy access to the list of these users. This prompts us to be even more vigilant about the quality of the datasets being used and the regularity of their updates.
Nicolas Laroche
Project Manager, OFGL

Data lineage is an essential feature to demonstrate the impact and ROI of your data portal to all stakeholders, improve data portal maintenance and strengthen your data sharing strategy. To learn more about Opendatasoft’s data lineage feature, watch our new webinar on the subject.
