“Data-informed policy-making” can be a daunting, complex phrase for local governments. However, behind these words lies a much simpler truth. Here are some best practices from the CitizenLab team to help local governments of all sizes harness the power of data.

Data-informed policy-making represents the efforts to increase policies’ efficiency and impact by grounding them in facts and data. It aims to minimize policy failures caused by a mismatch between government expectations and actual, on-the-ground conditions. In that sense, this process can be applied to small and large administrations alike.

Data-informed policy-making by local governments is a rapidly growing trend fuelled by 3 coinciding forces:

  • Exponential data growth: 90% of the world’s data was created in the last two years, and each day 2.5 billion gigabytes of data are created. Cities are no exception to that exponential rule. They collect air pollution data through sensors, parking data via smart meters, performance data via the different departments… And when it comes to online engagement, they also collect contributions shared by residents on online platforms.
  • Emerging technologies: Data on its own is worthless. 80% of all data is unstructured and unformatted, mostly written text, images, audio, or video. Technologies are needed to source, clean, process, and analyze this data in order to extract relevant insights. AI (particularly NLP, a subset of AI) and machine learning techniques have proven to be invaluable tools when it comes to making sense of huge amounts of unstructured data and turning them into exploitable results.
  • Citizen experience: Government leaders are increasingly focusing on citizen experience as a core function of government. Using citizen data to inform decisions is key for large public systems to deliver services that meet the needs of individuals.

It’s therefore no surprise that the European Innovation Partnership for Smart Cities & Communities (EIP-SCC) predicts that the global market for smart cities will be worth $1.2 trillion in the early 2020s.

As with every new trend, data-informed decision-making also comes with its own pitfalls. Here are 3 best practices to avoid them.

1. Transparency 2.0

More and more governments are opening up (part of) the datasets they use internally. Some even open the source code of the tools and algorithms they apply to these datasets; Amsterdam (NL) is just one example. And in France, the Digital Republic Act (or “open data by default” law) requires all government agencies in the country to release all datasets into the public domain unless they contain sensitive information.

‘Transparency 2.0’ goes one step further. Open data and open source may be sufficient for data scientists and engineers to really understand what’s happening, but they are definitely not enough for the majority of citizens to find clear information. Transparency 2.0 requires an active translation step to make the data truly accessible and informative. Algorithms and code are translated into accessible text and visuals that explain in layman’s terms what’s going on under the hood. Citizens can, for instance, understand how the data is being processed, what underlying assumptions the code relies on, or what input went into the results and conclusions.

Transparency builds a relationship of trust with your citizens, and it can also enable collaborative initiatives to emerge from civil society. This type of initiative can be a major driving force when governments are trying to tackle complex challenges that cover multiple domains.

2. Good data in, good data out

The quality of your input data strongly affects the quality of any automated output. Most of that input quality is determined by the way you set up data collection in the first place. Make sure the collected data is structured in a way that reflects what you want to do with that data or what you want to learn from it. Collecting data to combine with an existing dataset? Make sure there’s a structural link between both, e.g. the date and time of the measurement. In need of a spatial analysis of your data? Make sure you also collect the location of each data point, in a format that’s easily readable by your geographic information system (GIS).
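To make the idea of a “structural link” concrete, here is a minimal sketch in Python. It assumes two small, made-up datasets (air quality readings and traffic counts) that both record the date and time of each measurement; the field names and values are hypothetical and only serve as an illustration.

```python
from datetime import datetime

# Hypothetical sample data: both datasets record when each measurement was taken.
air_quality = [
    {"measured_at": "2021-03-01T08:00", "no2_ug_m3": 41.0},
    {"measured_at": "2021-03-01T09:00", "no2_ug_m3": 37.5},
]
traffic = [
    {"measured_at": "2021-03-01T08:00", "vehicles": 820},
    {"measured_at": "2021-03-01T10:00", "vehicles": 640},
]


def join_on_timestamp(left, right, key="measured_at"):
    """Merge rows from both datasets that share the same timestamp."""
    # Index the second dataset by its parsed timestamp for quick lookups.
    right_by_time = {datetime.fromisoformat(row[key]): row for row in right}
    joined = []
    for row in left:
        match = right_by_time.get(datetime.fromisoformat(row[key]))
        if match is not None:
            # Only rows with a counterpart in both datasets can be combined.
            joined.append({**row, **match})
    return joined


combined = join_on_timestamp(air_quality, traffic)
```

In this sketch only the 08:00 measurement appears in both datasets, so it is the only row that can be combined. Measurements without a shared timestamp are lost for the combined analysis, which is why the link needs to be planned before collection starts.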

This rule is even more true when working with unstructured data, such as text, images, video, or audio. Most data collected through citizen participation is unstructured data. You can improve the analysis by setting up your participation process in such a way that you are in fact collecting semi-structured data. Some examples:

  • Open up the participation and allow people to interact with other participants’ input (this means enabling participants to vote on an existing idea, rather than entering a similar idea). That way, you’ll avoid having many duplicate inputs to process. The interactions themselves are also a valuable new source of insights (e.g. comments bring nuance and different perspectives to a specific idea).
  • Ask specific questions. For instance, the question ‘How did you like the conference and the food?’ should actually be split into 2 separate questions. By doing so, you make sure you’re collecting relevant and trustworthy information for every question.
  • Ask participants to locate their input or to add topics or tags, if you want to use these as a way of analyzing their input later on. We recommend defining a set of tags yourself: it’s easier to work with a group of comments all tagged as “mobility” than to work with comments tagged as “car”, “bike”, “traffic”, and “red light”.
  • Is an open question really the answer? For the reasons we highlight above, it can be challenging and time-consuming to process the results of an open question if you’re not working with powerful NLP software. Whenever you can, replace open questions with multiple-choice ones that include an ‘other’ option.
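The tagging advice above can be illustrated with a short Python sketch. It assumes you already have free-form tags and want to group them under a predefined set; the mapping, the extra “green space” category, and the sample tags are hypothetical. Defining the tags up front, as recommended, makes this clean-up step unnecessary.

```python
from collections import Counter

# Hypothetical mapping from free-form tags to a predefined tag set.
TAG_MAP = {
    "car": "mobility",
    "bike": "mobility",
    "traffic": "mobility",
    "red light": "mobility",
    "park": "green space",
    "trees": "green space",
}


def normalize_tag(raw_tag):
    """Map a free-form tag to a predefined tag, or 'other' if it is unknown."""
    cleaned = raw_tag.strip().lower()
    return TAG_MAP.get(cleaned, "other")


def count_by_tag(raw_tags):
    """Count contributions per predefined tag."""
    return Counter(normalize_tag(tag) for tag in raw_tags)


counts = count_by_tag(["Car", "bike ", "Red light", "trees", "noise"])
```

Here three of the five contributions end up under “mobility”, one under “green space”, and one under “other”, which is much easier to analyze than five different labels.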

3. Don’t forget about humans

While data might be king these days, it’s good to have some checks and balances in place. In other words, machines are not enough: we still need the humans. 

A data-based conclusion can be very convincing, even more so when it is presented visually. When something is expressed as numbers or graphs, it easily gives the impression that it needs no further explanation. However, the dataset could be incomplete, the algorithms could be based on false assumptions, or the data could have been converted using the wrong units of measurement. Human interpretation, expertise and critical thinking therefore remain indispensable.

Bring the data-based conclusion back to your stakeholders, show them how the conclusion was reached and ask for their feedback. When applied to citizen participation, this means linking the original citizen input with the conclusion while highlighting the steps in between. That way, participants can indicate their agreement or disagreement with the process and assumptions, and corrections can be made.

Simply put, it’s all about understanding the strengths and weaknesses of both data technology on the one hand and human expertise on the other. Data technology is great at finding patterns and logical connections, thereby reducing human biases such as the ‘recency bias’. Humans are masters in bringing nuance and context to the table. Where humans used to have a monopoly on answering the ‘Why’ question, data technology is now also swiftly moving beyond its former boundaries of the ‘What’ and ‘How’ questions.

When analyzing unstructured data such as written citizen input, one also has to recognize that the applied technologies such as NLP are not (yet) 100% accurate. Text can be misinterpreted, misclassified, or wrongly translated. If the decision at stake requires 100% accuracy, it is recommended to check the data-based conclusion and correct where necessary.

Forget about the big numbers and the fancy apps – in local government, data-informed policy-making starts with small, reliable datasets and human processes. It can have a hugely positive impact on cities of all sizes. At CitizenLab, our ‘Insights’ team works closely together with several governments to make data-informed policy-making a reality for all governments. If you want to learn more about how we help governments crowdsource actionable insights to improve decision-making, don’t hesitate to get in touch!