When I joined ODS last October, one of the first things I did was flip through the company's employee directory (extremely useful for making sure you don’t screw up a first name when running into a coworker next to the coffee machine). Product team, Sales team, CSM team, HR...the directory seemed to cover all the usual bases. But then I came across something unusual: “data hunter.” I thought, “OK, I'm not really sure what that means, but it sounds pretty awesome!” And lo and behold (spoiler alert), I was right!
If you’re anything like me, then you’re probably dying to know more about this profession! In this article, I'll introduce two of our data hunters: Audrey and Cécile. They’ll tell us all about their jobs and even describe one of their current tasks: hunting for data on Covid-19. Here we go!
As part of the R&D team, Audrey and Cécile are mainly in charge of hunting for cross-sectional data to expand Opendatasoft's Data Network. The Data Network contains all the public data of our customers, as well as the data that is added by our hunters. Customers can use the Data Network to cross-reference the data sets that interest them and to enhance their own content. (By the way, the Data Network is a veritable goldmine of information. It can even help you make the most of your summer...)
The world of data is vast... How do our hunters choose their prey? “We choose according to the needs of our customers and the ODS teams,” explains Audrey. Customers are the first to request their services. They ask the team to search for data that can help them solve specific problems.
But sometimes it’s Audrey and Cécile's coworkers who need the data: “We often receive requests from the Sales and Production teams. These requests are, of course, designed to help the company achieve its strategic goals,” adds Audrey.
But where could data possibly be hiding? “Everywhere!” exclaims Cécile, with a big smile on her face. According to the hunters, the sources are quite varied, and searches sometimes feel like police investigations. “The web is vast. It's all about finding the right keywords and checking the accuracy of the source,” Cécile explains. “The most reliable platforms are government websites. Data Gouv, the official platform of the French government, is one of our main resources. We also get data from organizations that supply their own platforms, such as INSEE (France's National Institute of Statistics and Economic Studies).”
But working with reliable sources doesn't necessarily make our hunters immune to pitfalls… First obstacle: “Publication habits vary from country to country,” explains Cécile. “Data is often extremely local by nature,” adds Audrey. “There’s a cultural aspect that must not be overlooked.” Mexicans, for example, use a different format for their documents than the French, while Americans tend to work more with APIs than data sets.
Second obstacle: “The data we are hunting is incredibly diverse!” Audrey explains: “On the same day, we can cover topics ranging from demographics and mobility to global shark attacks. It’s not always easy, as we have to delve into subjects which we often know very little about. But this is what makes our job so exciting!”
Once the hunt is over, the team is in possession of several interesting and reliable data sets. The data is then imported into the Data Network. What happens next? The team then considers the best way to clean and promote its data. “At this stage, our goal is to present the data in a way that will maximize its chances of being reused,” explains Audrey. When data is presented in the right way, it is easy to add filters, cross-reference with other data, convert into graphs...or simply use. “Merely displaying data is not enough. Our job serves little purpose if no one makes use of our tamed data.”
Our hunters are thrilled when they receive questions about data sets from the Support team. “This proves that the data sets are alive and have made an impression!” exclaims Cécile.
The job of our data hunters doesn’t end with the publication of a new data set. Each data set must be carefully monitored and updated over time. In fact, Audrey and Cécile work hard to continuously improve the quality of the data sets in the Network. “We strive to share high-quality data, even if that means sharing less,” explains Audrey.
Cécile and Audrey just completed their biggest hunt to date: data linked to the Covid-19 crisis.
Last March, ODS decided to create Covid-19 Observatories for France, Belgium, Switzerland, Canada, and the United States. The goal was to provide and present the data in a simple way so our customers could quickly use it to their advantage by incorporating the data into their portals and communication. Audrey and Cécile played a crucial role in implementing this large-scale project. Let’s hear how it went!
Can you tell us about this project?
Audrey: As usual, we went hunting...but it was rough going because we couldn't get a grasp on our sources. New indicators were emerging every day, while other sources were appearing and then quickly disappearing again. It was intense in the beginning: we were constantly redoing what we had already done the day before. Usually, the data we work with is stable. But in this case, the hunt lacked structure, as everyone else was working at the same time.
How did you decide to present the data as an observatory?
Cécile: We went with an observatory because this format is more effective than a table. An observatory provides an overview of the current situation. It also displays clear and decisive diagrams that keep misinterpretations to a minimum. It’s important to remember that even “objective” data can be interpreted in many different ways. A field labeled “number of patients” does not have the same meaning everywhere. Are we talking about the total number of patients? The number of tested or self-diagnosed patients? Or the number of patients at the hospital? We wanted these nuances to be clear.
Audrey: We also didn't want to create widespread worry. We did a lot of research to find out how to present Covid-related data in a non-sensationalized manner. After all, our objective was clear: to allow our customers to make use of our data to provide a rapid response to their citizens. Nothing more, nothing less.
How has the feedback been from your customers?
Audrey: Extremely positive! I'm happy about that, because after all, this hunt was for them. We were able to anticipate their demands.
What did you take away from this experience?
Audrey: A certain amount of pride, because this project allowed us to make a difference. We may not have produced any masks (that wouldn't have made much sense), but by setting up our observatories and our pro-bono service, we were able to use our know-how to serve the greater good.
These observatories also prove that open data can be useful to everyone. By working with an open data platform, you can quickly create tables to highlight a topic that affects us all.
Cécile: This project also allowed us to raise awareness of the data hunter profession within the company. Since several teams were involved in the creation of the observatories, we were able to explain exactly what we do and the problems we encounter on a daily basis. Our coworkers discovered the challenges that are specific to “data culture.” Data is alive, and sometimes a bit hectic!
This work on Covid-19 data also reminded me of the need to instill better practices in the data sector. How do we manage our licenses? How do we manage our metadata? How do we make our data usable? We can now provide some answers to these questions.
As you can see, our data hunters are extremely busy. They are required to work on a variety of different subjects and handle many types of requests. Their strongest weapon? Insatiable curiosity combined with relentless determination!
See you back here soon for more datadventures!