Open Data and Privacy – Most Definitely Not Mutually Exclusive
As governments open more data to the public, watchdog groups and think tanks are closely eyeing privacy protections to ensure that released data does not violate citizens’ privacy. That’s a good thing, because the focus on privacy has spawned innovations in the way data is protected and released. And as the 2020 Census rolls out, innovations in differential privacy by the Census Bureau are expected to offer other governments opportunities to learn new best practices. In short – there’s a lot happening in the privacy space in terms of data sharing.
New Practices in the 2020 US Census
“We’ve seen differential privacy on a limited basis, but seeing it roll out for the entirety of the 2020 Census will be really exciting,” says Kelsey Finch, senior counsel for the Future of Privacy Forum. “Lots of lessons to be learned in terms of how to operationalize these kinds of tools and how we communicate with the public about these kinds of tools.”
danah boyd (who styles her name in lower case), founder and president of the Data & Society Research Institute and partner researcher at Microsoft Research, also says the 2020 Census will offer innovations.
“We should be celebrating the Census Bureau for recognizing that it must innovate in order to protect the confidentiality of the data it collects,” says boyd. “Their innovation is going to change the future of data production, dissemination, and use – not just for the Census Bureau, but for all groups invested in open data.”
What Exactly Is Differential Privacy?
Under differential privacy, data is released publicly, but with mathematically calibrated statistical noise added so that the published figures reveal almost nothing about any single individual. That rigor matters because simply stripping names is not enough: supposedly anonymous records can often be re-identified when overlaid against other datasets. Local governments are aware of this possibility and are careful to imagine and work through possible data overlays before release. The 2020 Census is doing the same, and given the resources of the census, its innovations could be transformative.
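The core idea can be sketched in a few lines. Below is a minimal illustration of the Laplace mechanism, a classic way to make a count query differentially private by adding noise scaled to the privacy parameter epsilon. The function names and the example epsilon value are illustrative; the Census Bureau's production system is far more elaborate than this sketch.

```python
import random

def laplace_noise(scale):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # is Laplace-distributed with that scale parameter.
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_count(true_count, epsilon):
    # A count query has sensitivity 1: adding or removing one person
    # changes the answer by at most 1, so Laplace noise with scale
    # 1/epsilon provides epsilon-differential privacy.
    return true_count + laplace_noise(1.0 / epsilon)

# Smaller epsilon -> more noise -> stronger privacy, lower accuracy.
noisy_population = dp_count(1284, epsilon=0.5)
```

Any single noisy answer may be off by a few counts, which is exactly the point: the noise masks whether any one individual is in the data, while aggregate statistics remain useful.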
Innovation in Data Privacy Across the Board
But it’s not just the 2020 Census that is driving privacy innovations. Local governments across the country are developing risk assessment processes, tools, and standards to ensure their open data protects privacy, and some have established data privacy officers.
Finch says establishing guidelines and processes is critical, as is communicating policies to the public. She worked with the city of Seattle to do just that.
“They take privacy very seriously in that city,” Finch says of Seattle. “They recognize the need to develop real robust policy safeguards and to consider more holistically the potential impact of making data available…We tried to develop tools and standards for more transparent decision-making.”
Transparency with the public about data privacy policies and decision-making frameworks is crucial, experts say.
“(Governments) need to be really clear and transparent with the public about why and how particular data sets are being released,” adds Finch. “Make sure people understand that when they provide info to the government – 311 calls, writing an email, filling out a form – make sure people know and understand if that info will later be available on a public portal.”
Finch adds: “There might be situations where data is so sensitive and so identifiable that it shouldn’t be released through open data programs. On the other hand, there might be information that has such compelling public interest that, notwithstanding privacy risks, it should be put out there. For example, salaries, to look for inequities.”
Why Not Releasing Certain Data Can Also Carry a Risk
But some experts say not releasing data also carries risk.
“There are also risks involved in not opening data,” says Stefaan Verhulst, co-founder and chief research and development officer of the Governance Laboratory at New York University. “What are the opportunity costs to society if you don’t open the data?”
Verhulst stresses that governments need to examine privacy and data quality issues across the entire data chain – not just the release. They need to take an “end-to-end approach,” he says.
“It’s important for local governments to be aware that indeed risks exist at the moment you open the data, but also across the value chain,” he says.
“Data lineage” – or where the data originated – is important to understand. His team at the Governance Lab advocates for labeling of data – he calls it a kind of “nutritional” labeling – that discloses where the data originated and what its uses will be. He advocates for a new profession as well, that of data steward. The data steward would be charged with assessing data quality across the value chain and communicating with key constituencies, including the public.
“That should be the next stage of the open data movement,” Verhulst says.
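Verhulst’s “nutritional” label can be pictured as a small, machine-readable block of metadata published alongside a dataset. The field names below are purely illustrative (no standard schema is implied), but they show the kind of lineage and intended-use disclosure a data steward might verify before publication.

```python
# A hypothetical "nutrition label" for a released dataset.
# Field names are illustrative, not a published standard.
dataset_label = {
    "title": "311 Service Requests, 2019",
    "lineage": "City service-request intake system",
    "collected_for": "Routing and tracking resident service requests",
    "intended_uses": ["service-level analysis", "neighborhood equity studies"],
    "privacy_treatment": "names removed; addresses generalized to block level",
    "steward_contact": "datasteward@example.gov",
}

def missing_label_fields(label):
    # A data steward might require these disclosures before release.
    required = {"title", "lineage", "privacy_treatment", "steward_contact"}
    return sorted(required - label.keys())

# An empty result means the label carries the minimum required disclosures.
gaps = missing_label_fields(dataset_label)
```

A check like this could run in a publishing pipeline, blocking release of any dataset whose label omits its lineage or privacy treatment.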
Finch says she sees tremendous gains being made at the local level in terms of data privacy safeguards.
“More cities are passing privacy policies and ordinances that are professionalizing the protection of privacy and making it more systematic,” Finch says. “There is momentum growing.”
Finch adds that people need to see that open data and privacy “are not mutually exclusive.”
“Having real safeguards in place actually often helps the innovation of the data-driven work go faster,” she says, “because you’ve built processes and infrastructure.”