January 10, 2017
Reading time: 8 min
Let's take a step back and have a look at what the differences between data platforms are, notably to describe where the Opendatasoft platform stands out.
When attending Smart City conferences these days, one can't help but notice that everyone seems to be making some sort of data processing platform, whether for IoT, data science, data analytics, big data, or open data purposes. There are platforms that turn Excel files into graphs and dashboards, and other platforms geared towards encouraging citizen participation. We can go so far as to say that there are too many platforms.
Back in November, Opendatasoft had the chance to attend the Smart City Expo World Congress in Barcelona. The event was enormous and brought together a wide range of actors in the Smart City ecosystem, from giants like Philips to startup partners working with Microsoft, and even those who came to join the pavilions for New York or Finland.
While we were there, one type of product kept popping up: with all the data there is to be collected in the Smart City, everyone seems to be building some sort of data platform. The question we constantly got at the Opendatasoft stand was:
"So what kind of data do you process?"
This was followed closely by:
"How are you different from those other guys (*points in any direction around the conference hall*) over there?"
Here is our explanation of the differences between other data platforms and Opendatasoft.
Let's start off by answering the first, very common (and very simple) question: what kind of data do we process? The quick answer: Opendatasoft is a platform for publishing any kind of structured data to make them easier to visualize, reuse, and, more importantly, share. Our platform processes your data and lets you add facets, so that users can more easily search through and filter the data and create appropriate visualizations based on the information. We don't stop there, however. With each dataset published, an API is generated with querying, aggregating, and filtering capabilities. This API allows your data to be easily connected to applications and reused far and wide.
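To give a rough idea of what querying a published dataset can look like, here is a minimal sketch that builds a search URL with a full-text query, a facet, and a facet filter. The path and parameter names (`/api/records/1.0/search/`, `dataset`, `q`, `facet`, `refine.<facet>`, `rows`) are assumptions modeled on the records search API and should be checked against the current API documentation. The domain `example.opendatasoft.com`, the dataset `bus-positions`, and the `line` facet are hypothetical.

```python
from urllib.parse import urlencode


def build_search_url(domain, dataset, query=None, facets=(), refine=None, rows=10):
    """Build a records-search URL (assumed v1-style path and parameter names)."""
    params = [("dataset", dataset), ("rows", rows)]
    if query:
        # Full-text search across the dataset's records.
        params.append(("q", query))
    # Ask the API to return value counts for each requested facet.
    for facet in facets:
        params.append(("facet", facet))
    # Keep only records whose facet matches the given value.
    for facet, value in (refine or {}).items():
        params.append((f"refine.{facet}", value))
    return f"https://{domain}/api/records/1.0/search/?{urlencode(params)}"


# Hypothetical portal, dataset, and facet names, for illustration only.
url = build_search_url(
    "example.opendatasoft.com",
    "bus-positions",
    query="late",
    facets=["line"],
    refine={"line": "C1"},
)
print(url)
```

The function only assembles the URL; any HTTP client can then fetch it and read the JSON response.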
There's our quick pitch. But it gets a lot more interesting than just that.
Opendatasoft customers are doing all sorts of projects, ranging from traditional open data portals to advanced Smart City initiatives, with some forays into the world of IoT. The Parisian suburb of Issy-les-Moulineaux has built an interactive budget application around its open data portal, while the western French city of Rennes publishes real-time open transportation data, such as the location of city buses and how early or late a given bus is running.
These real-time capabilities come from the platform's ability to collect data from different systems and publish them automatically. The platform can be scheduled to fetch data from another source at a set time interval, then publish them and update the dataset, all without any human intervention.
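The scheduled fetch-and-publish cycle described above can be sketched in a few lines. This is an illustration of the idea only, not the platform's actual implementation: `run_harvester`, `fetch`, and `publish` are hypothetical names, and the source and destination are passed in as plain functions.

```python
import time
from typing import Callable, Dict, Iterable, List

Record = Dict[str, object]


def run_harvester(
    fetch: Callable[[], Iterable[Record]],
    publish: Callable[[List[Record]], None],
    interval_seconds: float,
    max_runs: int,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Fetch from a source and republish, max_runs times, pausing between runs."""
    completed = 0
    for run in range(max_runs):
        records = list(fetch())   # pull the latest data from the source system
        publish(records)          # replace the published dataset with the new data
        completed += 1
        if run < max_runs - 1:
            sleep(interval_seconds)  # wait for the next scheduled run
    return completed


# Hypothetical source and destination, for illustration only.
published: List[List[Record]] = []
runs = run_harvester(
    fetch=lambda: [{"bus": "C1", "delay_min": 2}],
    publish=published.append,
    interval_seconds=0,
    max_runs=2,
)
print(runs, len(published))
```

Passing `sleep` as a parameter keeps the loop easy to test without waiting for real time to pass.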
Why bother with Opendatasoft if there are all these other platforms that, for example, already process the data coming from these buses? Many of these systems make it far too complicated to visualize the data, especially for non-technical users. In addition, these proprietary systems are not designed to make the data easy to share, nor are their APIs easy for other applications to consume. In these cases, Opendatasoft can serve as an intermediary platform between the complex systems, making data easier to use and, most importantly, more interoperable. Opendatasoft is a common data hub for the Smart City that allows for the simple reuse of data. The API is the key difference between our platform and others, as it is the key to reusing data. And compared with other platforms that also come with APIs, ours stands out for its power and simplicity.
So, just to recap what Opendatasoft does: we offer a data-publishing platform that allows users to publish their data to make them easy to visualize, share, and reuse. We are a lightweight, easy-to-deploy solution. Our cloud partners like working with us because we do one thing very well: we deploy robust, highly extensible data publishing platforms quickly and at scale. There are many solutions that stop at turning Excel files into data visualizations and pretty dashboards. Our platform goes beyond that. As a data hub for Smart City and IoT data, the platform makes data interoperable and consumable by applications, while still allowing for the creation of entire suites of tools for government transparency that go far beyond PDF files and simple websites.
The API-centric approach is critical in order to allow for all kinds of data to be published, and push open data's potential into other verticals, including transportation, utilities, and beyond. We chose this path rather than just staying in the realm of government data in order to truly work at the center of the data ecosystem to create a network of data. With this greater diversity of data from all kinds of actors, there is a greater capacity to build a true open data and Smart City ecosystem in communities everywhere.
Can we really publish any kind of data? Yes. This statement seems quite broad! But if you have structured data, there's a way for the platform to process them. What we don't do, however, is take raw data from the sensors themselves; the Opendatasoft platform is not meant to receive data from a connected object and directly transform a reading into a piece of data. This is the role of the aforementioned proprietary platforms, such as Predix, part of GE's Smart Cities systems, or Sigfox. Where Opendatasoft comes in is the simple visualization and sharing of that data so that it can be useful to a wide audience, not just within a single city department. Think of Opendatasoft as a middleware piece that ingests data: the data is produced elsewhere and then reaches our platform through an API, FTPS, or a simple upload utility. We're helping to make Smart City data actually empowering to the Smart Citizens of the future.
For example, in Paris' Place de la Nation, Cisco has filled the square with sensors to measure noise levels, traffic flow, and air quality. At the Smart City Expo World Congress in Barcelona, we were directly in front of the Cisco stand, looking at a dashboard showing these sensors and the data they were collecting. The Cisco platform is there to gather the information in the first place, but whereas its platform collects and structures the data, our APIs permit that data to be visualized and reused in any third-party system (the data these sensors are collecting are all available on the city of Paris' open data portal, powered by Opendatasoft). It's all about working together as an ecosystem with our different technologies in order to get the largest amount of data out there in the most comprehensible manner.
So yes, we can publish just about anything. We're not lying when we say it! You structure it, we'll help you get it opened up. For the techies out there, the technology that allows this is our Swagger-based API framework. We really can harvest just about anything with an endpoint.
If you're reading this article, it's almost certain that you've heard of or are very familiar with Big Data, Data Science, and Data Analytics. These are some of the most popular buzz topics these days. Since we open up, among other things, Smart City and IoT data, where do we fit within these subjects?
Data Science is made up of two important aspects: the data, produced by machines, and the science, which can only be carried out by humans. Computers are not scientists; they compute. People can be scientists; they ask questions about the data. Data science is a long process. We can sum it up with the help of Wikipedia by saying it is a field working to extract knowledge or insights from data. It includes four principal parts: data architecture, data acquisition, data analysis, and data archiving.1 Opendatasoft fits easily within a few of these subdomains. Our platform is an architecture that allows data to be processed, and it can acquire data from a variety of sources to enable data analysis.
Beyond that, we don't do the advanced analysis ourselves. Instead, our platform democratizes a small portion of data science by allowing anyone to perform basic analytics and publish data. The platform makes it easy for anyone to build interactive and meaningful visualizations, from the citizen activist looking into budget data to another active citizen working to analyze data about energy consumption. We don't run complex algorithms, but we allow data to be analyzed with a basic toolset. Even if the data is being used internally, anyone within an organization can do data analysis with our visualization tools.
Big Data, as described by Joel Gurin in Open Data Now, involves, as the name implies, "processing very large datasets to identify patterns and connections in the data."2 These generally come from passive data sources, such as mobile phone GPS systems sending location data, credit card purchase records, Google searches, and more. Generally, these data are kept private, as they are often used for business and security reasons.
Big data are used to feed algorithms; often, humans won't look closely at the data themselves, but rather at the outcomes of the algorithms.3 Open data, by contrast, is largely about visualizing the data. Looking at snapshots of air pollution during a specific time frame, cross-referenced with traffic or weather information, helps people understand how their individual actions add up to collective impact. They can use this information to change behaviors, and those changes can then be measured in big data at a larger, systemic level.
Are we a big data platform? Yes and no. Our platform does have a Big Data side to it: it is capable of processing, for example, a public dataset with over 24 million records, and much larger datasets for private customers. This is nothing, however, compared with how massive big datasets can get. Still, in theory, if there were no other big data tools capable of easily filtering, searching, and sharing data, we could fill that void.
We hope this article not only helps clear up where we, as Opendatasoft, fit within the myriad of data platform solutions, but also helps clear up some of the main terms and concepts floating around these days. What's next? Why not go set up a free account with us and see what data you can publish? It can't hurt now that you know exactly what we do. Happy data publishing!
1. Jeffrey Stanton, Introduction to Data Science, 4-5, 2012, accessed December 13, 2016, https://ischool.syr.edu/media/documents/2012/3/DataScienceBook1_1.pdf.
2. Joel Gurin, Open Data Now: The Secret to Hot Startups, Smart Investing, Savvy Marketing, and Fast Innovation (New York: McGraw-Hill Education, 2014), 12-14.
3. Viktor Mayer-Schönberger and Kenneth Cukier, Big Data (New York: First Mariner Books, 2013), 11-12.