The amount of data available in the world is mind-boggling. While most of us are not often responsible for “zettabytes,” choosing which data to share out of all the possible data your organization possesses may seem overwhelming. For data sharing initiatives, however, that choice can mean the difference between success and failure.
Having a strategy for choosing which data gets shared is critical. There are many factors to consider and a variety of ways to prioritize data for sharing. Fortunately, there are numerous examples from the data community to help you get started. Ranging from simple guides used by the State of California and State of New York to detailed mathematical formulas like this one created by the City of Toronto, organizations have used a variety of methods to tackle data prioritization.
The steps below can help you organize your thoughts as you make key choices in your organization about what data to share. However, the considerations and examples below are just a small sample of potential techniques for prioritizing data. Keep in mind what will work best in your organizational context and experiment until you find what works for you.
No one wants to prioritize sharing low-value data. But what is high-value data? Determining how your organization would define high-value data is an important first step in prioritization efforts. Most efforts to define data value home in on one key concept: data use.
Data that is used is valuable. A few simple questions can help you kick off your efforts to define what is valuable and useful for your organization. A few I like are:
- What is important for people in our community and organization, regardless of the data available? Aligning shared data with what’s important to the community can widen potential audiences for data that’s shared.
- What are the key strategic priorities for our organization? Following general organizational priorities can help you sell your choices to internal stakeholders.
- What are our goals for data sharing? Detailing specific goals for sharing data can help focus efforts more granularly on themes and data types as you prioritize.
No matter what the specific answers are for your organization, these questions attempt to focus sharing on data that people will use. Some of the answers to these questions may already exist in open data policies or organizational strategy/budget documents. And some priorities arise unexpectedly, as in the current COVID-19 crisis and in past examples of data sharing during natural disasters. In any case, focusing on what is important for people in your community is the first step to defining which data is most valuable for sharing.
In addition to variations in value, not all data is created equal when it comes to readiness for sharing. Organizational data might be messy, contain private or sensitive information, or not exist in formats that can be easily shared. Getting data ready for sharing takes time and effort, and that work should always be factored in when prioritizing data.
In order to prioritize data based on readiness, you need to know your data inside and out. Data inventories are a great way to start determining what data you have in your organization as well as how ready your data is for sharing. In addition, there are several factors to consider when assessing readiness for sharing:
- Does this data contain private or sensitive information? While this is not necessarily a barrier to sharing, it is critical to identify private and sensitive data as you prioritize.
- Is this data high quality? There are many lists of common data quality issues (like this one from Quartz) and prioritizing data quality can help focus sharing efforts.
- Is there a standard that our data can align to? If your data aligns to a standard (as in mobility, for example), it becomes easier to prepare for sharing.
Getting into the details of your data is critical to understanding readiness. Sharing data that isn’t ready can have negative effects for your efforts as you progress. Prioritizing data that is ready to publish while getting other data up to a high quality can help ensure that people can actually use your data once it’s shared.
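To make the readiness questions above concrete, here is a minimal sketch of how a data inventory could record them. The `InventoryEntry` fields, the dataset names, and the `readiness_notes` helper are all hypothetical and chosen for illustration; a real inventory would likely track more detail.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InventoryEntry:
    """One row of a hypothetical data inventory."""
    name: str
    has_sensitive_info: bool  # contains private or sensitive information
    quality_issues: int       # number of known data quality issues
    follows_standard: bool    # already aligns to a published data standard


def readiness_notes(entry: InventoryEntry) -> List[str]:
    """List the work still needed before this dataset is ready to share."""
    notes = []
    if entry.has_sensitive_info:
        notes.append("review and protect private or sensitive fields")
    if entry.quality_issues > 0:
        notes.append(f"resolve {entry.quality_issues} known quality issue(s)")
    if not entry.follows_standard:
        notes.append("check whether a relevant data standard exists")
    return notes


# Illustrative entries only; these are not real datasets.
inventory = [
    InventoryEntry("bike_counts", False, 0, True),
    InventoryEntry("service_requests", True, 3, False),
]

# Datasets with no outstanding notes are candidates to publish first.
ready_now = [e.name for e in inventory if not readiness_notes(e)]
```

Note that sensitive information produces a note rather than an exclusion, in keeping with the point that it is not necessarily a barrier to sharing.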
Once you have prioritized data for value and readiness, the most important factor for determining what data to share is who will use it. As noted in Step 1, value is closely related to use and therefore users are critical to consider in determining what data to share.
There are many potential users of data and clarifying target audiences can help you plan who you share data with, how widely data is available, and in what format data should be shared. Two questions to consider are:
- Who is the primary intended user of the data? Defining intended users helps sharpen your focus on what people will do with the data that’s shared.
- Are intended users mainly internal or external to our organization? External users have different needs and access requirements than internal colleagues, and they call for different engagement strategies.
While data may be used by a variety of people, defining target users and audiences can help prioritize how broadly to share the data you have. In addition, considering users may also lead you to revisit choices around value and readiness and identify additional data. Overall, users have a big impact on your data sharing choices as a whole and keeping them in mind as you prioritize will ensure that data you choose to share will be high value.
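One simple way to combine the three considerations is a weighted score. The sketch below is an assumption for illustration only: the weights, the 1 to 5 rating scale, and the dataset names are hypothetical, and this is not the formula used by the City of Toronto or any other organization mentioned above.

```python
# Hypothetical weights; adjust them to reflect your organization's priorities.
WEIGHTS = {"value": 0.5, "readiness": 0.3, "users": 0.2}


def priority_score(ratings):
    """Weighted sum of 1-5 ratings for value, readiness, and users."""
    return sum(WEIGHTS[key] * ratings[key] for key in WEIGHTS)


# Illustrative ratings only; in practice these come from your own assessment.
candidates = {
    "bike_counts": {"value": 3, "readiness": 5, "users": 4},
    "service_requests": {"value": 5, "readiness": 2, "users": 5},
    "internal_fleet_logs": {"value": 2, "readiness": 4, "users": 1},
}

# Highest-scoring datasets come first.
ranked = sorted(
    candidates,
    key=lambda name: priority_score(candidates[name]),
    reverse=True,
)
```

A score like this is a starting point for discussion rather than a decision rule. Changing the weights and seeing how the ranking shifts can be a useful way to test your assumptions.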