

Government Information Quarterly xxx (2015) xxx–xxx

Contents lists available at ScienceDirect

Government Information Quarterly

journal homepage: www.elsevier.com/locate/govinf

A systematic review of open government data initiatives


Judie Attard ⁎, Fabrizio Orlandi, Simon Scerri, Sören Auer
University of Bonn, Regina-Pacis-Weg 3, 53113 Bonn, Germany

Article info

Article history:
Received 2 March 2015
Received in revised form 16 July 2015
Accepted 25 July 2015
Available online xxxx

Keywords:
Open data
Government data
Data portals
Publishing
Consuming
OGD life-cycle
Openness

Abstract

We conduct a systematic survey with the aim of assessing open government data initiatives, that is, any attempt, by a government or otherwise, to open data that is produced by a governmental entity. We describe the open government data life-cycle and we focus our discussion on publishing and consuming processes required within open government data initiatives. We cover current approaches undertaken for such initiatives, and classify them. A number of evaluations found within related literature are discussed, and from them we extract challenges and issues that hinder open government initiatives from reaching their full potential. In a bid to overcome these challenges, we also extract guidelines for publishing data and provide an integrated overview. This will enable stakeholders to start with a firm foot in a new open government data initiative. We also identify the impacts on the stakeholders involved in such initiatives.

© 2015 Elsevier Inc. All rights reserved.

⁎ Corresponding author.
E-mail addresses: attard@iai.uni-bonn.de (J. Attard), orlandi@iai.uni-bonn.de (F. Orlandi), scerri@iai.uni-bonn.de (S. Scerri), auer@cs.uni-bonn.de (S. Auer).

http://dx.doi.org/10.1016/j.giq.2015.07.006
0740-624X/© 2015 Elsevier Inc. All rights reserved.

Please cite this article as: Attard, J., et al., A systematic review of open government data initiatives, Government Information Quarterly (2015), http://dx.doi.org/10.1016/j.giq.2015.07.006

1. Introduction

In recent years, a number of open data movements have sprung up around the world, with transparency and data reuse as two of the major aims. To mention a few, there is the Public Sector Information (PSI) Directive¹ in 2003 in Europe, U.S. President Obama's open data initiative in 2009,² the Open Government Partnership³ in 2011, and the G8 Open Data Charter⁴ in 2013. Open government data portals resulting from such movements, such as data.gov.uk, data.gov, and data.gov.sg, provide means for citizens and stakeholders to obtain government information about the locality or country in question.

¹ http://ec.europa.eu/digital-agenda/en/european-legislation-reuse-public-sector-information.
² http://www.whitehouse.gov/open/documents/open-government-directive.
³ http://www.opengovpartnership.org/.
⁴ https://www.gov.uk/government/publications/open-data-charter.

While not being the only motivation, initially corruption was one of the main issues that prompted the founding of open government data initiatives such as the above. Corruption is a global issue that seriously harms the economy and society as a whole, affecting people's lives and often infringing fundamental human rights. The democracy of many countries around the world is undermined by deep-rooted corruption, which also affects economic development. While the total economic costs of corruption cannot be easily calculated, the 2014 European Commission Anti-Corruption Report⁵ states that corruption can be estimated to cost the European Union economy 120 billion Euros per year. In places where there is widespread belief that corruption prevails, the people end up losing faith and trust in those entrusted with power. As the Global Corruption Barometer 2013⁶ shows, corruption can be identified running through the democratic and legal process in many countries. This results in people losing trust in key institutions such as political parties, the judiciary and the police. While transparency cannot be regarded as an end (Zuiderwijk et al., 2014), it can be regarded as a means to act as a disincentive to corruption.

⁵ http://ec.europa.eu/dgs/home-affairs/what-we-do/policies/organized-crime-and-human-trafficking/corruption/anti-corruption-report/index_en.htm.
⁶ http://www.transparency.org/gcb2013.

Collectively, there are three main reasons for opening government data⁷:

1. Transparency: in order to have a well-functioning, democratic society, citizens and other stakeholders need to be able to monitor government initiatives and their legitimacy. Transparency also means that stakeholders not only can access the data, but they also should be enabled to use, reuse and distribute it. Success in achieving transparency results in a considerable increase in citizen social control;
2. Releasing social and commercial value: governments are one of the largest producers and collectors of data in many different domains (Alexopoulos et al., 2014). All data, whether addresses of schools, geospatial data, environmental data, transport and planning data, or budget data, has social and commercial value, and can be used for a number of purposes that differ from the ones originally envisaged. By publishing such data the government encourages stakeholders to innovate upon it, and create new services; and
3. Participatory Governance: through the publishing of government data citizens are given the opportunity to actively participate in governance processes, such as decision-taking and policy-making, rather than sporadically voting in an election every number of years. Through open government data initiatives such as portals, stakeholders can also be more informed and be able to make better decisions (Rojas et al., 2014).

⁷ http://opengovernmentdata.org/.

The above motivations, while not being the sole ones, are the foundations for most open government data initiatives. The latter act as a preventive policy and give stakeholders the opportunity to scrutinise and reuse the available information in a number of ways, including identifying patterns in the data and creating new services. This results in an increased accountability that in turn hinders corruption. Besides, by the creation of new services based on open government data, users add value to the latter, which can also be commercialised. The participation of citizens in decision-making processes is also a very important aspect of opening governmental data, as it empowers citizens and thus enables governments to be more citizen-centred. However, citizen participation is not limited to the decision-making process. Open government initiatives may also allow stakeholders to provide feedback on government actions or collaborate in policy-making.

Although the number of public entities seeking to publicly disclose their data has seen a drastic increase, it is still a major challenge to achieve the full potential of open government data and support all interested parties with the publication and consumption of this data. A number of barriers, including technical, policy and legal, economic and financial, organisational, and cultural barriers, also contribute to this challenge (Conradie & Choenni, 2012), (Zuiderwijk & Janssen, 2014a). Yet, a major stumbling block for the full exploitation of open government initiatives remains the heterogeneous nature of data formats used by public administrations, which include anything from images, PDF and CSV files and Excel sheets, to more highly structured XML files and database records. This heterogeneity is a technical barrier to both data providers and data consumers, and hinders society from realising government data transparency. Open government data portals also suffer from the large number of diverse data structures that make the comparison and aggregate analysis of government data practically impossible. The range of tools to present, search, download and visualise this government data is also nearly as diverse as the number of existing portals. Past efforts have sought to overcome this situation by creating comprehensive and connected European transparency portals such as publicdata.eu. However, the diversity of transparency standards across Europe, which proved to be a bottleneck, highlighted that platforms beyond the state of the art need to be more than just direct entry points to government data analysis. They also need to provide a platform for advocacy towards common transparency standards at the highest level across several jurisdictions.

Government data portals also experience a number of cultural obstacles which hinder them from reaching their full potential. For example, public entities might be unwilling to publish their data. This may be so for a number of reasons, including the perception that it requires a lot of resources and effort, or that the release of government data might backfire. This disposition is, however, slowly changing world-wide, mostly due to advocacy of civil society initiatives.

In this paper we aim to explore the state of open government data initiatives, as well as existing tools and approaches. For this aim, we conduct a systematic survey on the literature related to publishing and consuming open data through government portals, data catalogues, or otherwise. We discuss different aspects of such open government initiatives, including the above-mentioned barriers, their impact upon all the different processes within the open government data life cycle, any proposed standards, and challenges impeding the efforts. In this paper, our aim is not to provide new interpretations of existing literature. Thus we here give an overview of existing interpretations, and provide an integrated and unified model that covers the relevant concepts, terminology, initiatives, challenges, and guidelines, therefore portraying how each element forms part of the bigger picture. We start by explaining the process of the systematic survey in Section 2, followed by short definitions of key terms used in this paper in Section 3. In Section 4 we describe the open government data life-cycle as we envision it, followed by an overview of open government data initiatives, including assessment frameworks, evaluations and challenges in Section 5. We proceed with discussing different aspects of publishing and consuming open government data in Section 6, whereas in Section 7 we identify the various impacts that publishing government data has on different stakeholders. We provide our concluding remarks in Section 8.

2. Research method

In this survey paper we followed a systematic literature approach. By following this formal method with explicit inclusion and exclusion criteria, we intend to provide a replicable research review with minimal bias arising from the review process itself. Our approach is based on the guidelines proposed in (Dyba et al., 2007) and (Kitchenham, 2004). The procedure we undertake is as follows:

1. Define search terms;
2. Select sources (digital libraries) on which to perform the search;
3. Apply the search terms to the sources; and
4. Select primary studies by applying inclusion and exclusion criteria to the search results.

2.1. Research questions

Identifying the research questions is essentially what distinguishes a systematic review from a traditional review. Asking pre-defined questions is not only required for determining the content and structure of the review, but it also aids in guiding the review process. This includes the techniques used for identifying studies, the critical reviewing of studies, and the ensuing analysis of the results.

The goal of this survey is to analyse existing open government data initiatives, tools, and approaches, for publishing and consuming open government data. We therefore define the following as a generic research question:

    What are existing approaches that enable the publishing and consumption of government data?

This generic question can be further divided into more specific sub-questions as follows:

1. What are existing approaches for publishing or consuming open government data, and how can they be classified?
2. What are the supported technical aspects, features and functions in existing approaches?
3. Are there any defined guidelines for the publishing or consumption of open government data?
4. What are existing challenges with publishing or consuming open government data?
5. What are possible impacts of open government initiatives on relevant stakeholders?
2.2. Search strategy

In order to cover the largest spectrum of relevant publications possible, we identified and used the most extensively used electronic libraries, namely:

– ACM Digital Library
– IEEE Xplore Digital Library
– Science Direct
– Springer Link
– ISI Web of Knowledge

Although we considered using Google Scholar for this systematic review, we decided against including it, since its content is indirectly obtained through the listed electronic libraries, thus making the use of Google Scholar redundant.

Based on the research questions, we carried out some pilot searches and consulted with experts in the field in order to obtain a list of pilot studies. The latter were then used as a basis for the systematic review in order to find the search terms which would best answer our research questions. The following are the search terms used in this systematic review:

1. “government data portal”;
2. “government public portal”;
3. “government open data”;
4. “government open data portal”;
5. “government open data publishing”;
6. “government data publishing”;
7. “public government data”;
8. “consuming open government data”;
9. “consuming open data”;
10. “public open data”;
11. “open data consumption”;
12. “open data publication”;
13. “open data portal”; and
14. “consuming public data”.

To construct the search string, all the search terms were combined by using the “OR” Boolean operator. This method was chosen in order to keep the query as simple as possible, with as few Boolean operators as possible. This made the query more flexible to use in different electronic library search tools.

The next step in defining the search strategy was to find suitable metadata fields on which to apply the search string. Searching in the publication title field alone does not always provide the relevant publications, mostly due to low precision and low recall rates. While the search on the title retrieves a potentially larger number of results, the results might not all be relevant. Thus by adding the search on the abstract, irrelevant results would be reduced, while other relevant publications which do not have the search terms in the title are also retrieved. We therefore decided to conduct the search on both the title and abstract fields of publications.

2.3. Study selection

Some of the results obtained using the above method might still be irrelevant for our research questions, even if the search terms appear in either the title, abstract, or both. Therefore, a manual study selection has to be performed, retaining only those results which are relevant to answering our research question. Thus we defined inclusion and exclusion criteria. Publications that satisfy any of the following inclusion criteria are selected as primary studies:

I1. a study that focuses on open government portals, open government data, or its publishing or consumption; and
I2. a study that describes open government data initiatives.

Publications that meet any of the following criteria are excluded from the review:

E1. a study that only mentions some of the search terms, but does not focus on government data or its publishing or consumption;
E2. a study that focuses on open data in general (not limited to government data); and
E3. a study that describes portals that exploit only non-governmental data.

The procedure for selecting the primary studies for this review was conducted in October 2014. Consequently, this review only includes studies that were either published or indexed before that date. We also limited our search to publications that were written in English and published after 2002. This year was selected as a delimiter since the preliminary search indicated that there were no relevant results before that date. As shown in Fig. 1, we started by applying the search string in each data source separately. Since the results included a couple of proceedings, we resolved them by including all publications within the proceedings, resulting in 368 publications. Subsequently, the results were merged, and duplicate studies were removed. This left us with 338 publications. We then proceeded to manually go through the titles of the remaining studies, removing those entries whose title indicated that they were not relevant to our review. This reduced the number of potential primary studies to 159. The following step was to manually scan the abstracts. Yet again, the number of studies was reduced, this time to 103. Finally we went through the full text of the studies, whilst applying the inclusion and exclusion criteria defined above. This resulted in 75 studies, which represent our final set of primary studies.

Fig. 1. Procedure for identifying primary studies.

2.4. Overview of included studies

The goal of this publication is to execute a systematic analysis of existing literature within the field of open government data. We here discuss some statistics of the relevant literature resulting from the conducted systematic analysis. As shown in Fig. 2, the period between 2002 and 2009 did not yield any relevant literature; however, the results increase significantly in the subsequent years. Even though a number of major open data initiatives were already established, such as the ones indicated in the figure, the surge in open government data literature may potentially be linked to U.S. President Obama's Open Government Directive at the end of 2009. As shown in the figure, the year 2014 yielded the largest number of related publications, indicating that the awareness of open government initiatives is increasing at a fast pace.
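The search string construction described in Section 2.2 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `build_search_string` and `matches`, the dictionary-based publication record, and the case-insensitive phrase matching are hypothetical and are not taken from the review; each digital library has its own query syntax and field names, which are not reproduced here.

```python
# The 14 search terms listed in Section 2.2.
SEARCH_TERMS = [
    "government data portal",
    "government public portal",
    "government open data",
    "government open data portal",
    "government open data publishing",
    "government data publishing",
    "public government data",
    "consuming open government data",
    "consuming open data",
    "public open data",
    "open data consumption",
    "open data publication",
    "open data portal",
    "consuming public data",
]


def build_search_string(terms):
    """Combine quoted phrases using only the OR Boolean operator."""
    return " OR ".join(f'"{term}"' for term in terms)


def matches(record, terms):
    """Return True if any phrase occurs in the title or in the abstract.

    Assumption: `record` is a dict with optional 'title' and 'abstract'
    keys, and matching is a case-insensitive substring test.
    """
    fields = (record.get("title", ""), record.get("abstract", ""))
    return any(term in field.lower() for field in fields for term in terms)


if __name__ == "__main__":
    print(build_search_string(SEARCH_TERMS))
    sample = {"title": "Evaluating an Open Data Portal", "abstract": ""}
    print(matches(sample, SEARCH_TERMS))
```

Using a single operator keeps the query portable: the same list of phrases can be re-joined with whatever OR syntax a given library expects, and the title and abstract fields are checked separately, mirroring the decision to search both.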

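The study selection procedure in Section 2.3 reports the number of publications remaining after each screening step. As a small worked example, the sketch below derives how many records each step removed. The counts are the ones reported in the text; the stage labels, the `FUNNEL` list and the `stage_removals` helper are hypothetical names introduced only for illustration.

```python
# Publications remaining after each step, as reported in Section 2.3.
FUNNEL = [
    ("search results, proceedings expanded", 368),
    ("sources merged, duplicates removed", 338),
    ("title screening", 159),
    ("abstract screening", 103),
    ("full-text screening (primary studies)", 75),
]


def stage_removals(funnel):
    """Return (label, remaining, removed) for every stage of the funnel."""
    rows = []
    previous = None
    for label, remaining in funnel:
        # The first stage has no predecessor, so nothing was removed yet.
        removed = 0 if previous is None else previous - remaining
        rows.append((label, remaining, removed))
        previous = remaining
    return rows


if __name__ == "__main__":
    for label, remaining, removed in stage_removals(FUNNEL):
        print(f"{remaining:>4}  removed {removed:>3}  {label}")
```

With these figures, duplicate removal discards 30 records, title screening 179, abstract screening 56 and full-text screening 28, so 75 of the initial 368 publications are retained.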
Fig. 2. Resulting primary studies by year.

3. Terminology

In order to give some context to our discussion, we here define the most important concepts used within this paper. Fig. 3 visually represents the relationships between open data, government data, and linked data.

Fig. 3. Relationship between open, government and linked data.

3.1. Open data

The ‘Open’ definition⁸ sets out eleven requirements that open data should conform to. These requirements essentially indicate how to enable the free use, reuse, and redistribution of data. Moreover, open data should not discriminate against any person and must not restrict the use of the data to a specific field or venture. Thus, data published in an open data format would be “platform independent, machine readable, and made available to the public without restrictions that would impede the re-use of that information”.⁹ Hence open data only refers to data that is available free of charge for the general public without any limitations (Reiche & Höfig, 2013). Open data is considered to be a key enabler of open government (Kučera et al., 2013).

⁸ http://opendefinition.org/od/.
⁹ http://www.whitehouse.gov/open/documents/open-government-directive.

3.2. Public data

It is important to note the distinction between public data and open data. While public data is made freely available to the general public, it is not necessarily open. An extreme example of public data which is not open is an archive of legal documents. While they are freely accessible, imagine the effort required to identify and locate a specific document. On the other hand, if such data is digitised and made available online in a standardised format (also indexed), then this public data is also open.

3.3. Open government data

Open government data is a subset of open data, and is simply government-related data that is made open to the public (Kučera et al., 2013). Government data might contain multiple datasets, including budget and spending, population, census, geographical, parliament minutes, etc. It also includes data that is indirectly ‘owned’ by public administrations (e.g. through subsidiaries or agencies), such as data related to climate/pollution, public transportation, congestion/traffic, child care/education. Several countries have already demonstrated their commitment to opening government data by joining the Open Government Partnership (OGP).¹⁰

¹⁰ http://www.opengovpartnership.org/countries.

3.4. E-government

While many different definitions of e-government exist in the literature, we here adopt the definition of e-government as the government's use of technology to enhance the services it offers to other entities, including citizens, business partners, employees, and other government agencies (Layne & Lee, 2001). Technologies used for this purpose are most often web applications. Thus, by aiding the interaction between citizens and their government, e-government has the potential of building better relationships and also delivering information and services more efficiently. While initially e-government referred simply to the presence of government on the Internet, mostly in the form of an informative website, the concept has since evolved. With the introduction of the ‘open government’ concept, we now consider open government data initiatives to be a subset or an extension of e-government (Jetzek et al., 2014).

3.5. Linked data

Linking data is the process of following a set of best practices for publishing and connecting structured data on the Web (Bizer et al., 2009). It is the final step in the five star deployment scheme¹¹ for open data. The term ‘linked data’ thus refers to data which is published on the Web and, apart from being machine readable, is also linked to other external datasets. The increased rate of adoption of linked data best practices has led the Web to evolve into a global information space containing billions of assertions, where both documents and data are linked. The evolution of the Web enables the exploration of new relationships between data and the ensuing development of new applications.

¹¹ http://5stardata.info/.

3.6. Data portal

The open data movement aims at opening public sector information with the purpose of maximising its reuse. A typical implementation is to collect and publish datasets into central data portals or data catalogues in order to provide a “one-stop-shop” for data consumers. While a data catalogue would most commonly act as a registry of data sources (Alexopoulos et al., 2013), providing links, a portal is more commonly a single entry point hosting the actual data, where end users can search and access the published data and explore or interact with it in some manner. A key function of a data portal is the management of metadata for the datasets, possibly including metadata harmonisation. Various
tools are provided on government data portals, such as data format conversion, visualisations, query endpoints, etc.

3.7. Publishing

Publishing data on the Web enables data providers to add their data to the global data space. This allows data consumers to discover and use this data in various applications. By following linked data best practices, published data is made more accessible and its reuse is eased. A large number of linked data publishing tools exist; they either serve the content of RDF stores as linked data on the Web or otherwise provide linked data views over non-RDF data sources (Bizer et al., 2009). The majority of these tools allow publishers to avoid dealing with the technical details behind data publishing.

3.8. Consuming

The aim of publishing data on the Web is to enable its use, reuse, and distribution. Such data is made more discoverable and accessible if the data publishers follow linked data best practices. For example, if published data has good quality metadata (Reiche & Höfig, 2013), then consumers would more easily discover the contents of the published dataset, and decide whether it is fit for the intended use. While the roles of data consumers and data publishers are distinct, they are also interchangeable in that a publisher can also be a consumer and vice versa. To describe this, the authors of (Alexopoulos et al., 2014) coin the term prosumers. Data consumption can be either data exploration, where a user visualises or scrutinises open data, or data exploitation, where a user adds value to the open data by creating mashups, conducting analysis, or innovating upon the data itself. This is also known as knowledge economy.

3.9. Data quality

Since the concept of quality is cross-disciplinary, there is no single agreed-upon definition of quality (Kučera et al., 2013). However, data quality is commonly perceived to be fitness for use (Juran, 1974). Fitness for use is, however, a multi-dimensional concept that has both subjective perceptions and objective measurements based on the dataset in question (Pipino et al., 2002). Subjective data quality assessments reflect the requirements and experiences of the consumers of the data. Let us take an example using restaurant reviews. What one person might consider to be a tasty dish, another might find bland. These different perceptions result in varying reviews of the same dish. Objective assessments can be task-dependent or task-independent. Task-independent quality assessment metrics reflect the properties of the data without contextual knowledge of how it will be consumed. Continuing on the previous example, if a restaurant uses fresh ingredients in its food, then it is considered to be a good restaurant. Task-dependent metrics, on the other hand, reflect the requirements of the application at hand. For example, if a person who does not like fish is served a fish dish, then of course they will not like it. Thus, even if a public entity publishes governmental data, if this data does not meet good quality standards with regard to its consumers, then the data will not be exploited to its full potential.

4. Open government data life-cycle

In this section we propose and explain the open government data life-cycle. Although a number of open data life-cycles exist,¹² none of them are tailored to the specific needs of open government data. Moreover, a number of vital steps are omitted, and only the most common procedures for opening data are discussed. Therefore, using the existing data life-cycles as a basis, we here attempt to cover all the processes in the life-cycle of open government data, in order to provide a standard process which government open data stakeholders can follow.

¹² http://www.w3.org/2011/gld/wiki/GLD_Life_cycle.

The proposed life-cycle, shown in Fig. 4, is made up of three sections, namely a pre-processing section (rectangle), an exploitation section (oval), and a maintenance section (hexagon). These sections, in order, take care of: preparing the data to be published, using the published data, and maintaining the published data in order to be sustainable. We proceed to give a short overview of each interdependent step in the life-cycle. This description is not meant to be extensive, as each step can also require a number of different processes, and such an extensive description is beyond the scope of this paper.

Fig. 4. Open government data life-cycle.

– Data Creation: the open government data life-cycle typically starts with the creation of data. In public or governmental entities, the creation of data is usually part of daily procedures; however, it is also possible to collect data for the specific purpose of publishing it.
– Data Selection: this is the process of selecting the data to be published. This requires removing any private data or personal data, as well as identifying the conditions under which this data will be published (potentially through the specification of open government data policies) (Zuiderwijk et al., 2014).
– Data Harmonisation: this step involves preparing the data to be published in order to conform to publishing standards, such as the Eight Open Government Data Principles (explained further in Section 6.1.2).
– Data Publishing: this is the actual act of opening up the data by publishing it on government portals.
– Data Interlinking: data interlinking is the final step in the Five Star Scheme for Linked Open Data. This allows published data to have additional value, as the linking of data gives context to its interpretation.
– Data Discovery: the publishing of data is not enough to enable its reuse. Data consumers must discover the existence of open data in order to be able to consume it. Data discovery can be enhanced by actively raising awareness of its existence (e.g. through organising hackathons).
– Data Exploration: this step is the most trivial way of consuming data. Here, a user passively examines open data by visualising or scrutinising it.
– Data Exploitation: this step is a more advanced way of consuming data. Data Exploitation enables a user to pro-actively use, reuse or distribute the open data by carrying out analysis, creating mashups, or innovating upon the open data.


‫־‬Data Curation — while not necessarily occurring at a fixed stage, data A typical implementation to opening government data is to collect
curation is vital in ensuring the published data is sustainable. This relevant datasets and their respective metadata and publish them on a
involves a number of processes, including updating stale data, data open government data portal. Open government data portals can have dif-
and metadata enrichment, data cleansing, etc. ferent operators, i.e. either an official government entity or a citizen ini-
tiative. Another difference between open government implementations
is the scope, where a portal or catalogue may publish data relevant to a
Since the complete open government data life-cycle is out of the specific administrative region, for example, a city or a country. A large
scope of this paper, we here proceed to focus on the essential aspects number of countries have created local or national government data por-
of opening government data through portals, namely the processes of tals in order to provide access to open government datasets (Martin,
publishing and consuming open data. Of course, while the aim of Foulonneau & Turki, 2013). Four major sites to date are in the US (data.
opening up government data is achieved, enabling the full exploitation gov), the UK (data.gov.uk), France (data.gouv.fr), and Singapore (data.
of the data is hardly possible in this manner. gov.sg) (Hendler et al., 2012). Such portals act as one-stop-shops and
facilitate consumers' access to government data, saving the trouble of
collecting data from various authorities, offices, or websites.
5. Open government initiatives While the main implementations of open government data initia-
tives are data portals, there exist a number of different implementations
The open government movement aims to achieve a government that with various characteristics. Government Data Catalogues or Metadata
enables cooperation between public administrations and the general Portals/Repositories are indexes which store structured descriptions
public, in order to become more transparent and democratic (Mutuku (metadata) about the actual data (e.g. PublicData.eu). Such tools have
& Colaco, 2012). Open government data does not only enhance the the potential of improving the discoverability of published datasets, as
transparency and accountability of a government, but can result in the discoverability of data is directly dependent on the quality of the
economic benefits, innovative solutions for community advancement, metadata (Reiche & Höfig, 2013). An open government catalogue
as well as supporting public administrations' functions (Bakıcı et al., would contain a collection of metadata records which describe open
2013), (Bertot et al., 2014), (DulongdeRosnay & Janssen, 09, 2014), government datasets and also have the corresponding links to the
(Foulonneau et al., 2014), (Fuentes-Enriquez & Rojas-Romero, 2013), online resources (Kučera et al., 2013), (Marienfeld et al., 2013). The im-
(Kalampokis, Tambouris & Tarabanis, 2011), (Kučera et al., 2013), (Lin plementation of a catalogue, however, raises an important question:
& Yang, 2014), (Maali et al., 2010), (Matheus, Ribeiro, Vaz, et al., What metadata should be stored and how should it be represented? This
2012), (Mercado-Lara & Gil-Garcia, 2014), (Parycek et al., 05, 2014), question is especially significant when automatic importing of metadata
(Solar et al., 2012), (Styrin et al., 2013). Furthermore, these benefits records (also known as harvesting) is performed, as metadata structure
can be achieved simply by publishing and reusing data which has and meaning are not usually consistent or self-explanatory (Marienfeld
already been produced in the day-to-day administration of a governing et al., 2013). Open data portal software such as CKAN14 or vocabularies
entity. We can thus assume that the two major motivations which such as DCAT (Maali et al., 2010) provide solutions for this problem.
prompt governments to jump on the open data bandwagon are the Furthermore, the authors of (Reiche & Höfig, 2013) propose the
(i) spirit of democracy, and (ii) economics (Chan, 2013). Regarding implementation of metadata quality metrics on CKAN-powered govern-
the first motivation, governments exploit open data initiatives in order ment data catalogues with the aim of determining the metadata's
to lift the veil of secrecy and become more transparent. The second mo- adequacy for a user's specific need.
tivation, on the other hand, enables the growth of the information Fig. 5 shows the 2014 Global Open Data Index15 of a number of places
marked by sharing government data. Whilst sensitive or personal data (some places might not be officially recognised as countries). This index
cannot be shared, other data can have economic value to businesses or tracks whether published data is actually released in a way which is ac-
individuals if exploited, and new uses for the particular data can also cessible to all stakeholders, and measures the openness level of data
be discovered. The publishing of data, such as traffic, meteorological, globally. The index represents the percentage of dataset entries that
budgetary, geo-spatial, and geographical data, provides consumers are deemed to be open, based on the Open Definition.16 The technical
with opportunities to create new services, which, apart from being and legal dimensions of each dataset available from the various places
profitable, also can benefit the common good (Bertot et al., 2014). is assessed using the following nine questions:
This, in turn, can potentially contribute to economic growth. Other
important benefits resulting from open government initiatives include 1. Does the data exist?
crowdsourcing for error reporting, increased public service employee 2. Is the data in digital form?
motivation due to the reuse of published data, more informed citizens, 3. Is the data available online?
enhanced citizen participation, and job creation (Parycek et al., 05, 4. Is the data machine-readable?
2014). 5. Is it available in bulk?
To date, 64 countries have joined the Open Government 6. Is the data provided on a timely and up to date basis?
Partnership13 (OGP) to demonstrate their commitment to making 7. Is the data publicly available?
data free to use, reuse and redistribute according to Open Data princi- 8. Is the data available for free?
ples. The OGP initiative aspires to guarantee commitments from 9. Is the data openly licensed?
governments to promote transparency, accountability, empower
citizens, and exploit technologies to strengthen governance. In order 5.1. Assessment frameworks
to be eligible to participate in the OGP, countries (and their respective
governments) should meet the eligibility criteria and demonstrate It is undeniable that with all the current open government initiatives
their commitment to open government principles in four key areas: a large amount of data has been released to the public. This, however,
does not mean that the targeted aims of promoting transparency and
1. Fiscal transparency; facilitating accountability have been achieved yet. For example, the
2. Access to information; authors of (Arcelus, 2012) point out that after interacting with
3. Income and asset disclosures; and
4. Citizen engagement. 14
http://ckan.org.
15
http://index.okfn.org/place/.
13 16
http://www.opengovpartnership.org/. http://opendefinition.org/od/.


transparency websites (such as data portals), consumers do not consider that transparency and access to information have been achieved. In such cases, while these portals would be complying with the law and following the requirements for the publishing of information, they would not be promoting transparency in itself. Unfortunately, while being aware of such deficiencies, several governments do not tackle them. This is because government initiatives are evaluated according to whether they are complying with the law or not, and not based on the usefulness of the information provided. A very apt example in this case is the publishing of data in PDF format, which makes it rather inconvenient for any intended use, reuse and redistribution.

Fig. 5. Global open data index by place (Source: http://index.okfn.org).

Another contributor to the aforementioned deficiencies is the lack of an agreed-upon framework to evaluate and assess the content provided on such data portals (Arcelus, 2012). Whilst various authors propose and discuss recommendations and requirements for evaluating data portals or catalogues, the contribution varies from publication to publication. On the other hand, the majority of the proposed frameworks and recommendations reflect the Five Star Scheme for Linked Open Data by Tim Berners-Lee,¹⁷ such as (Höchtl & Reichstädter, 2011), as well as the eight open government data principles,¹⁸ such as in (Lourenço, 2013). Table 1 gives an overview of the different aspects on which the following literature focuses within the discussed assessment frameworks. In this table, by ‘Nature of Data’ we mean the assessment of various data aspects according to the Five Star Scheme for Linked Open Data, and the eight open government data principles.

The authors of (Bogdanović-Dinić et al., 2014) propose a model for assessing data openness by relying on the Eight Open Government Data Principles. With the aim of automatically evaluating openness, the model was implemented as a web tool, and it also aids in the process of building openness principles. The authors applied the model to seven data portals with the aim of demonstrating its capabilities and results.

The authors of (Lourenço, 2013) also propose an assessment framework. They focus on accountability, aiming to propose a set of requirements intended to assess whether portals are actually contributing to a higher degree of transparency. The authors raise two essential questions regarding the effectiveness of portals in making data available for accountability purposes, and how this can be evaluated. They analyse the related literature on internet-based transparency and consider two dimensions, namely the type of public entities studied and the information types that consumers looked for. The authors proceed to extract a set of requirements from the studies they review, and propose them as part of an assessment framework for open government dataset portals. They conclude that while dataset portals were created with the intention of meeting open government strategies, as yet no evidence was found in open government literature that publishing a large amount of datasets actually contributes to promoting transparency and facilitating accountability.

Another assessment model is proposed by (Sandoval-Almazan & Gil-Garcia, 2014), whereby the authors analyse existing assessment models and then proceed to propose a new model catering for previous discrepancies in the older models. The authors base their assessment on four conceptual pillars, focusing on collaboration, co-production, institutional arrangements, legal obligations, and data openness. They proceed to test and analyse the proposed assessment model using actual open government data portals.

In (Veljković et al., 2014) the authors propose a benchmark for open government data initiatives. The benchmark is based on data openness, transparency, participation, and collaboration. It assesses both the openness index as well as the maturity of relevant initiatives.

¹⁷ http://www.w3.org/DesignIssues/LinkedData.html.
¹⁸ https://public.resource.org/8_principles.html.

5.2. Open government initiative evaluations

The evaluations carried out on existing portals, catalogues, and other open data initiatives are nearly as varied as the initiatives themselves. Furthermore, since there is no agreed-upon evaluation framework as yet (Bogdanović-Dinić et al., 2014), the authors of such literature employ different approaches. Table 2 shows an overview of the approaches undertaken within literature covered in the rest of this section. One should note that while all authors assess various aspects of an initiative, they base their evaluation on one or two main aspects. Most of the evaluations assess the published data properties of the initiative in question using the Five Star Scheme for Linked Open Data or the eight open government data principles; however, others also assess data availability, data content, and data accessibility. Two other popular assessment approaches consider the features and functions of an initiative (usually in the form of a portal). In contrast to the above approaches, some authors assess the maturity of an initiative as a whole, rather than based on specific aspects such as data, functions or features. In the latter cases, the maturity is assessed based on other aspects such as the amount of fulfilled objectives, compliance with existing laws and regulations, and the usability from a stakeholder's point of view. A number of approaches in literature also consider stakeholder participation in the initiative in question, as well as their feedback.

Within the results of our systematic survey, one of the most popular approaches was to evaluate the functionality of portals or catalogues. The authors of (Marienfeld et al., 2013), (Matheus, Ribeiro, Vaz, et al., 2012), (van der Waal et al., 2014) follow this approach, for GovData.de, Brazilian anti-corruption and transparency portals, and PublicData.eu respectively. Through the three publications, the authors assess these portals by identifying the functions, limits, and challenges of the evaluated portals, and also give recommendations towards avoiding or solving the challenges.

Another popular approach was to evaluate the features provided in data portals and their usability, such as the number of data formats available and multilinguality. The authors of (Liu et al., 2011) and (González et al., 2014) both evaluate, for different use-cases, how portals and catalogues actually enable the use, reuse, and distribution of data. They identify shortcomings such as consumers' difficulty in identifying the required datasets and the use of different formats. The authors of (Fuentes-Enriquez & Rojas-Romero, 2013) and (Sandoval-Almazan et al., 2012), in a similar manner to the previous authors, evaluate the use of open data through mobile applications.

In (Sayogo et al., 2014), the authors carry out a preliminary exploration of the worldwide status of open government. The authors analyse the open government data portals from 35 countries, reviewing the


published data, the provided features, and the level of stakeholder participation. They also provide a framework for assessing open government data initiatives. The authors of (Prieto et al., 2012) and (Rojas et al., 2014) also evaluate the status of open government initiatives; however, they focus directly on the Colombian government initiative as a whole, rather than on specific portals. Similarly, the authors of (dos Santos Brito, da Silva Costa, et al., 2014) and (Matheus, Ribeiro & Vaz, 2012) both discuss the current state of Brazilian open data initiatives. The authors of (Lin & Yang, 2014) and (Yang et al., 2013) both carry out a study on the Taiwanese open data platforms with the aim of identifying their status. The Greek open data movement is analysed in (Alexopoulos et al., 2013), where the authors analyse the current state of open data from three different perspectives, namely the functionality, the semantics of the data, and the provided features.

The authors of (Egger-Peitler & Polzer, 2014) and (Palmirani et al., 2014) also evaluate the status of open government data initiatives; however, they have more specific aims. The former attempt to analyse the relationship between the open data ambitions at the European level and those at the Austrian federal level (focusing mostly on Vienna), both from the data consumer and producer side. The latter, on the other hand, strive to understand the link between the Italian open government data legislation and a newly enacted Transparency Act. The authors of (Martin, Foulonneau & Turki, 2013) also have a specific aim, that is, to assess the data openness level through metadata quality.

Publications (Parycek et al., 05, 2014), (dos Santos Brito, dos Santos Neto, et al., 2014), (Vasa & Tamilselvam, 2014) and (Jetzek et al., 2014) all assess stakeholders' opinion to a certain degree. Through interviews and online polls, the authors of (Parycek et al., 05, 2014) identify a number of factors that enabled the success of the open government data strategy in Vienna. The authors of (dos Santos Brito, dos Santos Neto, et al., 2014) and (Vasa & Tamilselvam, 2014) discuss issues and challenges of developing applications that implement open government data. While the former extract these challenges from evaluating and analysing an application developed during an organised hackathon, the latter describe the challenges they faced during the development of their own application. Finally, the authors of (Jetzek et al., 2014) analyse the interpretations and perspectives of stakeholders with regards to opening municipalities' data in Sweden, and strive to identify how the stakeholders' opinion contributes to the implementation success of open data initiatives.

Table 1
Overview of aspects evaluated by assessment frameworks proposed in literature.

Column groups: Data; Portal; External factors; Public engagement
Columns: Nature of Data; Accountability; Transparency; Access to Information; Openness; Legal Obligations; Institutional Arrangements; Participation; Collaboration

(Arcelus, 2012) ✓ ✓ ✓
(Bogdanović-Dinić et al., 2014) ✓ ✓ ✓ ✓ ✓
(Höchtl & Reichstädter, 2011) ✓
(Lourenço, 2013) ✓ ✓ ✓ ✓
(Sandoval-Almazan & Gil-Garcia, 2014) ✓ ✓ ✓ ✓ ✓
(Veljković et al., 2014) ✓ ✓ ✓ ✓ ✓

Table 2
Overview of evaluated aspects in open government initiative evaluations.

Columns: Data; Functionality; Features; Stakeholder Participation; Initiative Maturity; Stakeholder Feedback; Evaluated Initiatives; Geographic Coverage

(Alexopoulos et al., 2013) | ✓ ✓ ✓ | Various portals | Greece
(Arcelus, 2012) | ✓ ✓ | n/a | Mexico
(Egger-Peitler & Polzer, 2014) | ✓ | data.wien.gv.at | Vienna, Austria
(Fuentes-Enriquez & Rojas-Romero, 2013) | ✓ ✓ | Various mobile applications | Mexico
(González et al., 2014) | ✓ ✓ | datosabiertos.df.gob.mx, labplc.mx/hackdf-2 | Mexico
(Jetzek et al., 2014) | ✓ ✓ | n/a | Stockholm and Skellefteå, Sweden
(Lin & Yang, 2014) | ✓ ✓ | Various portals | Taiwan
(Liu et al., 2011) | ✓ | Various portals and agencies | Australia
(Marienfeld et al., 2013) | ✓ ✓ | GovData.de | Germany
(Martin, Foulonneau & Turki, 2013) | ✓ | PublicData.eu | Europe
(Matheus, Ribeiro & Vaz, 2012) | ✓ ✓ ✓ | Various entities and portals | Brazil
(Matheus, Ribeiro, Vaz, et al., 2012) | ✓ | Mato Grosso, Paraíba, Piauí and Paraná | Brazil
(Palmirani et al., 2014) | ✓ ✓ | Various portals | Italy
(Parycek et al., 05, 2014) | ✓ | n/a | Vienna, Austria
(Petychakis et al., 05, 2014) | ✓ ✓ ✓ | Various portals | European Union
(Prieto et al., 2012) | ✓ | www.datos.gov.co | Colombia
(Rojas et al., 2014) | ✓ | datosabiertoscolombia.cloudapp.net | Colombia
(Sanabria et al., 2014) | ✓ ✓ ✓ | n/a | Colombia, Chile, Brazil
(Sandoval-Almazan et al., 2012) | ✓ | Various mobile applications | Various countries
(dos Santos Brito, dos Santos Neto, et al., 2014) | ✓ ✓ | Meu Congresso Nacional Application | Brazil
(dos Santos Brito, da Silva Costa, et al., 2014) | ✓ ✓ ✓ | Rio Inteligente and Cidadão Recifense | Brazil
(Sayogo et al., 2014) | ✓ ✓ ✓ | n/a | Worldwide
(Vasa & Tamilselvam, 2014) | ✓ ✓ ✓ | RASOI mobile application | India
(van der Waal et al., 2014) | ✓ ✓ | www.stat.gov.rs, PublicData.eu, INSIGOS | Europe, Serbia, Poland
(Yang et al., 2013) | ✓ | PublicData.eu, INSIGOS | Taiwan


The authors of (Petychakis et al., 05, 2014) do not focus on a single aspect for their evaluation. Rather, they carry out a comprehensive analysis of open government data initiatives in the European Union, focusing on functions, data semantics, and features. They collect and categorise a number of public data sources for each European Union member country, and they assess their characteristics and provided services. The authors identify the differences in content, licences, multilinguality, data accessibility, data provision, and data format. Finally, the authors point out that while the quality of open government infrastructures is improving, there are still great differences between national open data portals. The authors also identify two important challenges which are still not catered for, namely multilinguality and open licences. In a similar but downscaled manner to (Petychakis et al., 05, 2014), the authors of (Sanabria et al., 2014) compare three South American open government data initiatives (Brazil, Colombia, Chile); however, the authors focus instead on open government policies, citizen involvement and the use of new technologies. The authors of (Arcelus, 2012) also focus their evaluation on different aspects, namely accountability, content, and usability. They review the evaluation literature of Mexican e-transparency websites with the aim of defining a theoretical framework that could enable governments to improve their portals' contribution beyond standard transparency obligations.

5.3. Challenges

Even though there are numerous open government data initiatives, there still exist a number of setbacks which prevent them from reaching their full potential. Through the evaluations carried out in the literature mentioned in the previous section, and otherwise, we identified the most common challenges faced, and also propose possible solutions. While the following challenges vary in domain, they are mostly barriers of a technical nature. Public access to government data also remains challenging due to the heterogeneous and dispersed nature of the data. The lack of consumers exploiting existing open data portals indicates that there is the need to understand what factors influence participation in open data, and the requirement to engage stakeholders in participating and collaborating. If the projected consumers of the data do not use it, then open government initiatives fail to meet their objective. For a portal to be successful, consumers (including citizens, end users and beneficiaries) must be made aware of the published data, and its relevance and usefulness (Mutuku & Colaco, 2012). Considered to be a core pillar of democratic society, the collaboration between a government and its citizens has the potential to influence open data consumption, policy making, service delivery, and also political opinions and decisions (Veljković et al., 2012). This interaction would allow the government to provide more citizen-centred services and data.

In literature such as (Yang & Kankanhalli, 2013) the authors attempt to identify what influences the participation of stakeholders in consuming open data, with the aim of mitigating the barriers they face. Furthermore, the authors of (Chan, 2013) establish strategies to ensure that open data initiatives reach the desired participation rate. Similarly, in (Solar et al., 2012), the authors tackle the question of what kind of services governmental entities should provide in order to increase stakeholder participation. In contrast, the authors of (Bertot et al., 2014) focus on issues that smaller communities face when attempting to consume open data. The authors analyse these issues with the aim of enhancing public participation with the purpose of creating local data infrastructures. In (Sheffer Correa et al., 2014) the authors attempt to give structure to unstructured documents (such as PDF) and store them in repositories compliant with open government data principles, with the aim of providing stakeholders with analysis functionality and unrestricted data access.

5.3.1. Data formats

The whole point of opening and publishing data in portals is to enable its use, reuse, and redistribution. Two of the Eight Open Government Data Principles, in fact, regard the format in which data is published, and state that such data should be made open to the public in a machine-processable data format which is non-proprietary. Unfortunately, while this is a guideline, it is not legally required by many open government initiatives (which only require the publishing of data). Many governmental entities still publish data in a large variety of data formats which can also be proprietary. This has resulted in a number of data silos which appear to be available for use but which in reality require significant effort before being actually usable (Davies & Frank, 2013), (Hendler et al., 2012), (Jiříček & Di Massimo, 2011), (Liu et al., 2011), (Maali et al., 2010), (Martin, Foulonneau & Turki, 2013), (Shadbolt et al., 2011).

In an ideal world, in order to achieve economic growth, governmental entities (data providers) should take into account the requirements of the data end-users (data consumers) (Zuiderwijk & Janssen, 2013). This should include the specific formats that are most convenient for the widest spectrum of consumers. W3C recommends the use of established open standards and tools, such as XML and RDF, as a publishing format.¹⁹ A feasible solution would then be to require data providers to publish their data in machine-processable and non-proprietary formats through the open government initiatives in which they partake (Solar et al., 2013). Thus the portal's ‘success’ would not only be evaluated on the amount of data published, but also on the usability of this data.

¹⁹ http://www.w3.org/TR/gov-data/#formats.

5.3.2. Data ambiguity

While of course any machine-readable data format, such as CSV, is preferred over non-machine-readable ones, such as PDF, more expressive data formats are generally preferred, simply because they are more descriptive of the actual data they represent. This decreases the risks of ambiguity and misinterpretations (Martin, Foulonneau & Turki, 2013). Consider the example of the concept of a year. While a calendar year would be the most common in our everyday lives, some financial agencies within the public sector might use a financial year to describe their data (Liu et al., 2011). This leads to difficulties when attempting to find relationships between two datasets due to this difference in temporal representation. Semantic ambiguity therefore would require extra efforts in order to link and understand the data in question (Conradie & Choenni, 2012). Similar to (Davies & Frank, 2013) we can thus conclude that although data is available in a machine-readable format, such data is not really useful unless it is easily understandable, ideally requiring only minimal background knowledge on the subject.

A simple enough solution for this issue is to publish data with descriptive titles, or otherwise provide a key to code names, if the latter are used (OHara, 2014). This would help data consumers to clearly and easily understand what the data is about, and if it is actually useful for them. The use of RDF as a data format is also encouraged as it is a highly descriptive data format.

5.3.3. Data discoverability

Publishing data and making it accessible qualifies as ‘open data’; however, open data also needs to be discoverable. The discoverability of open data is bound to the quality of the metadata describing the data itself, which is not always complete or accurate (Conradie & Choenni, 2012), (Kučera et al., 2013), (Martin, Foulonneau & Turki, 2013), (Reiche & Höfig, 2013). In addition, other factors lead to difficulties in finding useful data quickly (Liu et al., 2011). For instance, some portals support only simple search functions which return not only relevant data, but also related policies and documents such as research papers (Alexopoulos et al., 2014). This may result in the user being overloaded with information (Zuiderwijk & Janssen, 2014b) and having to go through all the results to potentially identify the relevant datasets.
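Since discoverability depends on how complete the metadata is, a portal operator could screen records for missing fields before publishing them. The following is a minimal sketch in Python and not a metric taken from the cited literature: the field names in REQUIRED_FIELDS, the record layout, and the 0.8 threshold are all assumptions made for illustration.

```python
from typing import Iterable, Mapping

# Hypothetical set of fields a catalogue might expect for every dataset record.
REQUIRED_FIELDS = ("title", "description", "license", "format", "publisher")


def completeness(record: Mapping[str, object],
                 fields: Iterable[str] = REQUIRED_FIELDS) -> float:
    """Return the fraction of expected fields that hold a non-empty value."""
    fields = tuple(fields)
    if not fields:
        raise ValueError("at least one field is required")
    filled = 0
    for name in fields:
        value = record.get(name)
        # A field counts as filled only if it is present and not blank.
        if value is not None and str(value).strip() != "":
            filled += 1
    return filled / len(fields)


def flag_incomplete(records, threshold=0.8):
    """Return (identifier, score) pairs for records scoring below the threshold."""
    scored = ((r.get("id", "<no id>"), completeness(r)) for r in records)
    return [(rid, score) for rid, score in scored if score < threshold]


# Illustrative records; the values are made up for this sketch.
records = [
    {"id": "budget-2014", "title": "City budget 2014",
     "description": "Annual budget", "license": "CC-BY-4.0",
     "format": "CSV", "publisher": "Finance Office"},
    {"id": "traffic", "title": "Traffic counts", "description": "",
     "license": None, "format": "PDF", "publisher": "Transport Office"},
]

for rid, score in flag_incomplete(records):
    print(f"{rid}: completeness {score:.2f}")
```

Here the second record is flagged because its description and licence are missing. A score of this kind only indicates that fields are filled in, not that their content is accurate, so it would complement rather than replace a manual review.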


Moreover, most portals only allow users to simply download the available data, with no possibility of exploring it directly through the portal (for example through visualisation). These issues are particularly evident when the data consumers do not know the responsibilities of the government entity in question or the data structures that they implement, making it even harder to locate the relevant data they need. The fact that most of the datasets are spread over a number of decentralised data sources further aggravates the problem (Conradie & Choenni, 2012), (dos Santos Brito, Silva Costa, et al., 2014), (Zuiderwijk & Janssen, 2014a).

A number of efforts in the literature focus on metrics which assess metadata quality. The authors of (Reiche & Höfig, 2013), for example, tackle the problem of metadata quality by applying five quality metrics, namely: completeness, weighted completeness, accuracy, richness of information, and accessibility, to three public government data repositories. This evaluation is carried out with the aim of measuring the metadata's efficiency, identifying low-quality metadata records, and also understanding the reasons behind the origin of the low quality. Evaluated metadata is then assigned a quality score which enables the uniform comparison of the metadata quality across different repositories or catalogues. Evaluated metadata can consequently be improved in order to achieve better searchability, and subsequently better discoverability.

5.3.4. Data representation

The heterogeneity of the published datasets and their representation is quite an obvious setback for open government data initiatives. Data as varied as traffic, budget, geographical, and environmental data, etc., is published onto portals in a non-standardised manner, meaning that there exists a large heterogeneity in terms of semantics, standards, and, most importantly in this case, schema. This leads to interoperability issues and challenges in aggregating existing metadata in a way that would be useful for data consumers (Böhm et al., 2012), (Hendler et al., 2012), (Marienfeld et al., 2013), (Martin, Foulonneau & Turki, 2013). Additionally, such heterogeneous data would potentially even require the data to be mapped to a global schema. A further aspect to this issue is versioning. An ideal representation of a dataset would also capture how it evolves over time.

A number of efforts in the literature approach this challenge by proposing a generic schema. For example, in (Marienfeld et al., 2013), the authors propose a minimal schema that is compatible with the predominant data catalogue vocabulary and software. The schema supports the description of datasets as well as documents and applications, and most importantly includes a list of resources containing pointers to the actual data, documents, or applications. In contrast, the authors of (Maali et al., 2010) propose a standardised interchange format which enables machine-readable representations of data catalogues. Thus, for catalogues differing widely in scope, terminology, structure, and metadata fields, this contribution acts as an interoperability format. With regards to versioning, a solution to the issue is the use of Named Graphs (Carroll et al., 2005), where the metadata represents the temporal validity of the annotated RDF data. However, this solution is only available with the use of RDF.

(Marienfeld et al., 2013). Hence, provenance does not only regard the source of the data, but also how the data was modified or manipulated during the publishing process.

Here again, named graphs can be a solution to provenance issues, as different provenance metadata can be attached to datasets with varying provenance (Shadbolt et al., 2011, chap. 20). With a somewhat different approach, the authors of (Maali et al., 2010) propose a standard interchange format which enables federated search over catalogues or portals with overlapping scope, providing a way around this problem. Using a more concrete approach, the W3C Provenance Incubator group,²⁰ on the other hand, strives to provide a roadmap in the area of provenance for Semantic Web technologies.

5.3.6. Public participation

A very relevant challenge to achieving the full potential of published datasets in portals is their use, or lack thereof. The increasing number of open data initiatives, where government entities are opening up their data, ideally would result in increased transparency, participation, and innovation (Reiche & Höfig, 2013). Yet, as the authors of (Chan, 2013), (Edelmann et al., 2012), (Foulonneau et al., 2014), (Fuentes-Enriquez & Rojas-Romero, 2013), (Matheus, Ribeiro, Vaz, et al., 2012), (Yang & Kankanhalli, 2013), (Zuiderwijk et al., 05, 2014) point out, the full potential of consumer participation and collaboration for achieving innovation in government services has yet to be reached. Participation, as defined by (Sayogo et al., 2014), means the extent to which stakeholders can participate in the governance of an open government data portal, such as suggesting what data to publish, or rating datasets or features on the portal itself. Collaboration, an extension to participation, refers to features on a portal that enable cooperation and collaboration amongst different stakeholders.

6. Publishing and consuming open government data

The act of publishing data is the very basis of open government data initiatives. Government and public entities are sharing data on the Internet at an astonishing pace. Yet, there is a lack of agreed-upon standards for data publishing (dos Santos Brito, Silva Costa, et al., 2014), and as discussed in detail in Section 5.3, there are many challenges to be overcome in order for the published data to be exploited to its full potential. While not all challenges are directly related to publishing issues, tackling these issues at the root could prevent subsequent issues related to data consumption. For example, if data is published in a machine-readable format with good metadata descriptions, then usability issues will most probably be avoided when it is consumed.

The publishing of data enables it to be available for use by the public, in an attempt to achieve the main aim of open government data initiatives, namely to use, reuse and distribute the published data. This is only achievable through the consumption of the data by stakeholders. Data consumption is possible through a number of means. The most direct example is to obtain a copy of the actual published data, generally
with the aim of using it in a specific use-case. Certain portals might
5.3.5. Overlapping scope also provide exploration tools, where a data consumer can simply look
Provenance, whilst not a challenge in itself, is also an issue. through the published data. Other tools, such as analysis tools, enable
Provenance refers to details about the origins of data, or, in other a consumer to actually identify potential patterns in the published
words, who created or generated the data. The issue with provenance data. Usually analysis tools also provide for visualisations, which aid
occurs when there is the assumption that data strictly travels in a data consumers to view the data in a pictorial manner. An even more
vertical direction, for example from local, to regional, national, hands-on way of consuming the data is to create mashups, where
European and international level. There are numerous parallel enti- different datasets are merged in order to create new knowledge using
ties which collect data, and then pass it on to another relevant entity. existing data.
For example, budget datasets from a city can be published on the
city's portal, but also transferred to the entity taking care of cities
within a specific region. This results in an overlapping scope,
where data may have duplicates, but also new or modified data

20 http://www.w3.org/2005/Incubator/prov/wiki/W3C_Provenance_Incubator_Group_Wiki.


6.1. Publishing data

In this section we provide a classification of different data publishing approaches, and proceed to discuss guidelines and best practices for publishing data in any data publishing effort.

6.1.1. Data publishing approach classification
There are countless methods of publishing data. Following the contribution within (Kalampokis, Tambouris & Tarabanis, 2011), we here classify open government data publishing initiatives along two dimensions:

1. The technological approach: followed by the data publisher in the actual act of publishing data, i.e. making the data available on the Web. Publishing initiatives are classified within the first approach depending on the variation of technologies implemented for publishing the data. These include:
(a) the format of the published data (proprietary, machine readable, descriptive);
(b) the access method (RESTful APIs, custom APIs, search interfaces);
(c) the use of linked open data principles (HTTP, URIs, RDF); and
(d) the level of linkage to different datasets (LOD cloud 21).
As is evident, the above reflect most of the existing guidelines for publishing data, especially the Five Star Scheme for Linked Open Data.
2. The organisational approach: followed by the data provider, i.e. the manner in which the data is provided to the data consumers. The second dimension for open government data publishing initiatives focuses on the provision of data, rather than the actual act of publishing. The authors of (Kalampokis, Tambouris & Tarabanis, 2011) identify two different methods of providing linked open data, the epitome of an open government initiative, each with their own advantages and disadvantages.
(a) Direct Data Provision: this involves a portal aggregating all processed and value-added data provided by a public entity. In this case, the data publisher is not necessarily the same as the data provider (public entity). If the two are different entities, the maintainability is limited unless an effective data synchronisation process is in place. For example, if the original data from the public entity changes over time, this change must be reflected in the data published on the data portal, otherwise the data provided here will be obsolete (dos Santos Brito, Silva Costa, et al., 2014). An advantage of having direct data provision, however, is the consumers' direct access to data through a single entry point.
(b) Indirect Data Provision: data catalogues are a good example of indirect data provision, where the data cannot be directly accessed through the catalogue. Catalogues contain links (metadata) to the actual data provided by the public entity. Therefore, in this case, the data provider is usually also the data publisher. To access data, a consumer has to search for the relevant data through the catalogue, then follow the provided links to the public entity that provides the actual data. In contrast to direct data provision, indirect data provision has the advantage of being up to date and unique, since the actual data is published by the data producer itself. On the other hand, any processing or value-adding of the data has to be performed by the data consumer, as it cannot be provided by the data catalogue.

6.1.2. Publishing guidelines
In order to tackle the previously-mentioned issues in Section 5.3, and other publishing-related problems, a number of publications in the literature, such as (Höchtl & Reichstädter, 2011), (Liu et al., 2011), (Solar et al., 2012), propose guidelines for publishing data on the Web. The basis of most of these guidelines is the Eight Open Government Data Principles:

1. Complete: all available public data that is not subject to privacy, security or privilege limitations is made available.
2. Primary: data is made available as it is available at the source, and not aggregated or modified.
3. Timely: data is made available to the public as soon as possible after the actual data is created, in order to preserve the value of the data.
4. Accessible: data is made available to all possible consumers, and with no limitations on its use.
5. Machine Processable: data is published in a structured manner, to allow automated processing.
6. Non-Discriminatory: data is available for all to use, without requiring any registration.
7. Non-Proprietary: data is published in a format which is not controlled exclusively by a single entity.
8. Licence-Free: other than allowing for reasonable privacy, security and privilege restrictions, data is not subject to any limitations on its use due to copyright, patent, trademark or trade secret regulations.

The above principles provide a roadmap for the data publisher and help result in good open government data with the best potential for being consumed by the stakeholders. Further to these principles, the Five Star Scheme for Linked Open Data, listed below, provides a more technical guide towards publishing linked open data, the epitome of open government data initiatives:

1. Available on the Web in any format but with an open licence (Open Data);
2. Available as machine-readable structured data (e.g. Microsoft Excel table instead of image scan of a table);
3. Available as machine-readable structured data in a non-proprietary format (e.g. CSV instead of Microsoft Excel);
4. All of the above as well as using open standards from W3C (RDF and SPARQL) to identify things;
5. All of the above as well as linking the published data to other existing data to provide context.

In order to provide official guidelines, the W3C eGov Interest Group has also developed the following set of steps for publishing open government data,22 which emphasise standards and methodologies to encourage the publishing of government data, with the aim of enabling easier use by the public:

1. Identify: the use of permanent, patterned and/or discoverable URI/URLs enables processes and people to find and consume the data more easily.
2. Document: documentation helps the data to be more understandable and less ambiguous, as well as enabling easier data discovery. The use of formats such as XML/RDF would be self-documenting.
3. Link: linked data contains links to other data and documentation, providing context.
4. Preserve: the use of versioning of datasets enables data consumers to cite and link to present and past versions, where new and upgraded datasets can refer back to original datasets. Versioning also allows the documentation of changes between versions.
5. Expose interfaces: to make it easier for published data to be discovered and explored, published data should be both human-readable and machine-readable. Preferably, data should be published separate from the interface, and external parties should have direct access to raw data. This enables them to build their own interfaces if needed.
6. Create standard names/URIs for all government objects: the use of a unique identifier for each object is as important as having

21 http://lod-cloud.net/.
22 http://www.w3.org/TR/gov-data/.


information about the object itself. This aids in discoverability, improves metadata, and ensures authenticity.

Along with the above, the W3C eGov Interest Group also discusses the importance of choosing what data to publish, the right format to publish it in, and the restrictions on its use. Data which is to be shared with the public should be published in compliance with applicable laws and regulations, and only after addressing issues of security and privacy. Such data is usually already available in other formats, and may already have been shared with the public in other ways. The best format to publish this data is in its raw form serialised as XML and RDF, to allow for easy manipulation. The use of established open standards is also recommended. Finally, the published data should have clear documentation on any legal or regulatory restrictions on the use of that data.

The authors of (Liu et al., 2011) present some recommendations for data publishing and analysis based on a survey of the sustainability-related datasets published by the Australian government, with the aim of identifying underlying opportunities and issues. While not entirely reflecting the above-mentioned guidelines, the proposed recommendations complement the essential aspects. The authors tackle commonalities amongst data published by different public entities, the ideal formats for publishing data as linked data, its discoverability, and its re-usability.

Similarly, the authors of (Dulong de Rosnay & Janssen, 09, 2014) identify common issues and challenges to the accessibility and reusability aspects of public sector information. They point out that such obstacles can be of legal, institutional, technical or cognitive nature. They proceed by providing common solutions that can be implemented to overcome these issues.

In (Solar et al., 2012), the authors propose a maturity model for open data, with the aim of assessing the commitment and capabilities of public agencies in pursuing the principles and practices of open data. The authors extend the discussed guidelines and principles by considering other aspects towards publishing data, including an Establishment and Legal Perspective, a Technological Perspective, and finally a Citizen and Entrepreneurial Perspective.

Another maturity model was defined in (Lourenço & Serra, 2014). Here the authors aim towards identifying essential contextual aspects which affect the way data is published by public entities on their portals. The latter aspects are then organised into an online transparency for accountability maturity model, whose purpose is to assess the level of advancement of a governing region. In other words, researchers needing to assess an entity should start by analysing the context using the proposed maturity model, and then proceed to define the assessment model depending on the identified maturity level.

6.2. Consuming data

The provision of data not only enables stakeholders (whether individuals, businesses, NGOs, or otherwise) to scrutinise the published data, but also stimulates them to create, deliver, and use new services that are coupled with the published data (Edelmann et al., 2012). Services can be as simple as offering exploration of the published datasets, but may also include visualisation and data discovery services such as data mining and comparative analysis. The latter enable stakeholders to explore the data and identify patterns. Furthermore, if the published data is linked with other data on the Web, the services can be enhanced with mashups, and further increase the knowledge that can potentially be discovered through the available data. This opportunity can in turn lead to an improvement in e-government service provision, increased work opportunities, and ultimately economic growth (Chan, 2013), (Kalampokis, Tambouris & Tarabanis, 2011).

Unfortunately, few open government data portals provide consumption functionalities other than simple data downloads (Alexopoulos et al., 2014). In an attempt to enhance the consumption experience, the authors of (Janev et al., 2014) explore the challenges and issues related to the integration and analysis of open data. Amongst other challenges, the authors identified:

– the lack of standard procedures for querying government portals;
– the low quality of metadata;
– low reliability and non-completeness of public datasets; and
– the heterogeneity of formats used to publish open data.

They proceed to propose a linked open data approach to modelling, merging and analysing specific data; namely spatio-temporal and statistical data.

The authors of (Jetzek et al., 2014) tackle the question of how open data can encourage the creation of sustainable value. They discuss that new methods of generating value can be brought about by the sharing and reuse of open data. The authors proceed to propose a model describing how various processes within an open data system can generate sustainable value, based on a number of contextual factors that provide stakeholders with the motivation, the opportunity, and the ability to create it.

6.3. Data quality

As noted in Section 3, data quality has no agreed-upon definition, and apart from being cross-disciplinary, it is also subjective (OHara, 2014). Also, the publishing of data on portals does not guarantee that it is of good or high quality (DiFranzo et al., 2011), (Reiche & Höfig, 2013). For these reasons, we hereby do not define how published data can be of good quality, but we discuss the different aspects which influence the quality of the data, whether positively or negatively.

The authors of (Ochoa & Duval, 2006) propose a set of metrics to identify metadata quality, based on parameters used for human reviewing. The authors of (Reiche & Höfig, 2013) build upon these metrics, adapting them for assessing the quality of the actual data, rather than the metadata. Similarly, in (Kučera et al., 2013), (Lourenço, 2013), the authors discuss a number of quality dimensions, as found in the majority of related literature. We here establish the following criteria which are considered by most efforts in the literature for calculating data quality. The authors of (Kučera et al., 2013) identify two types of strategies for improving data quality; namely data-driven and process-driven. The first involves directly modifying the values of data, such as correcting invalid data values or normalising data. The second involves the redesign of the data creation and modification processes in order to identify and correct the cause of quality issues, such as implementing a data validation step in the data acquisition process.

Efforts in publications such as (Debattista et al., 2014), (Kučera et al., 2013), (Reiche & Höfig, 2013) take a number of quality dimensions and implement them, with the aim of assessing the quality of published data. The authors of (Debattista et al., 2014) evaluate and assess the datasets' quality in such a way that consumers can then identify the ideal quality for the intended use, attaching the results of the evaluation to the actual dataset graph. In (Kučera et al., 2013), the authors focus on the quality of catalogue records within initiatives in the Czech Republic. They proceed to propose some techniques and tools to improve the quality of the data catalogue records. Similarly, the authors of (Reiche & Höfig, 2013) propose quality assessment metrics and implement them in three public government data repositories.

6.3.1. Usability
This is the most “generic” quality criterion. By usability we mean how easily the published data can be used. It is the most generic as it depends on other quality dimensions whether the published data is usable or otherwise. For example, it is directly related to what degree the data is accessible, open, interoperable, complete, and discoverable


(Liu et al., 2011), (Martin, Foulonneau, Turki, et al., 2013). The more the data truly open. Here we briefly discuss relevant issues, including
published data is usable, the more potential data consumers are those pointed out by the authors of (Conradie & Choenni, 2012),
encouraged to reuse and exploit the data. (Dulong de Rosnay & Janssen, 09, 2014), (Eckartz et al., 2014), (Verma
& Gupta, 2013), (Zuiderwijk & Janssen, 2014b). These challenges vary
6.3.2. Accuracy between organisational, economic and financial, policy and legal, and
By accuracy we mean the extent to which a data/metadata record cultural barriers.
correctly describes the respective information (Kučera et al., 2013),
(Martin, Foulonneau & Turki, 2013), (Reiche & Höfig, 2013). With 6.4.1. Factors which discourage entities from joining an open government
respect to metadata, this quality dimension directly affects the data initiative
discoverability of datasets, as good quality metadata enables the dataset
to be easily discovered by data consumers. 6.4.1.1. Awareness. The concept of open data, while not new, might seem
a daunting task for people unfamiliar with the term and what it involves
6.3.3. Completeness (Dulong de Rosnay & Janssen, 09, 2014), (Verma & Gupta, 2013). Public
This quality dimension deals with the number of completed fields in a entities in the past would have only been concerned with delivering
data/metadata record (Reiche & Höfig, 2013), (Solar et al., 2012), (Ochoa reports formatted to given templates. Recent requests to provide data
& Duval, 2006). Thus, a record is considered complete only when the in its raw format might not be understood clearly (Conradie &
record contains all the information required to have the ideal represen- Choenni, 2012). For this reason, the value and potential use of raw
tation of the described data. The completeness of the metadata, like open data needs to be highlighted (Zuiderwijk & Janssen, 2014a).
accuracy, also directly affects the discoverability of datasets.
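
The completeness dimension of Section 6.3.3 can be illustrated with a small computation. The following is a minimal sketch only: it assumes that completeness is scored as the fraction of required metadata fields holding a non-empty value, which is one possible reading of the definition above and not necessarily the metric of the cited authors. The field names in `REQUIRED_FIELDS` and the sample record are hypothetical.

```python
# Hypothetical list of required metadata fields; a real catalogue
# would derive this from its own schema.
REQUIRED_FIELDS = ("title", "description", "publisher", "licence", "modified")


def _is_filled(value):
    """Treat None and blank strings as missing; anything else as filled."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    return True


def completeness(record, required=REQUIRED_FIELDS):
    """Return the fraction of required fields that are filled in the record."""
    if not required:
        raise ValueError("at least one required field is needed")
    filled = sum(1 for name in required if _is_filled(record.get(name)))
    return filled / len(required)


# Sample metadata record with an empty description and no licence.
record = {
    "title": "City budget 2014",
    "description": "",
    "publisher": "City Council",
    "licence": None,
    "modified": "2014-03-01",
}
score = completeness(record)  # three of the five required fields are filled
```

Under this reading, a record counts as complete only when the score reaches 1.0; lower scores point to the fields that hinder discoverability.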
6.4.1.2. Motivation. The provision of raw data can be considered to be
6.3.4. Consistency extra work without any purpose, especially to public entities such as
The consistency of record fields depends on whether they follow a those described above (Verma & Gupta, 2013). The value of the data
consistent syntactical format, without contradiction or discrepancy within generated during day-to-day administration needs to be pointed out.
the entire catalogue of metadata (Maali et al., 2010), (Kučera et al., The reuse of open datasets can be a great motivator in portraying the
2013). Apart from the syntactical format, a field is considered to be unexpected use of the generated data, and can also help the data
consistent if the respective values are selected from a fixed set of producers in understanding the true value of the data they create and
options. An example of inconsistency is if within two records the use publish (Parycek et al., 05, 2014).
of “U.S.” and “United States” is interchangeable. Another example is
the representation of dates, where the date, month and year follow an
6.4.1.3. Capacity. The use of open data should be targeted towards no-
arbitrary order.
body in particular. Having said that, it should be available for the use,
reuse, and distribution of all, whether machines or humans. Unfortu-
6.3.5. Timeliness
nately, many entities are not so open-minded about the application of
By this quality dimension we mean the extent to which the data or
open data, and rather focus on the simple publishing of data rather
metadata is up to date. As pointed out in Section 6.1, the organisational
than ensuring that it is of good quality in this aspect. Furthermore,
approach affects the timeliness of the published data, which depends
public entities might focus on publishing data with no value, rather
on whether the data is directly or indirectly provided by the data
than other, more relevant, data (Verma & Gupta, 2013), (Zuiderwijk &
provider.
Janssen, 2014b). There is the urgent need for the application of
standards and large-scale training in order to overcome these issues.
6.3.6. Accessibility
As identified by the authors of (Ochoa & Duval, 2006), the accessibil-
ity quality dimension has two measures. The cognitive accessibility de- 6.4.1.4. Budget provision. Being a relatively new concept, there might not
fines how easy it is for a data consumer to understand the published be any local budget allocation for open government data efforts (Verma
information. Several aspects of the data affect the cognitive accessibility, & Gupta, 2013). Considering the required processes for publishing data
such as the ambiguity of the data, discussed in Section 5.3. The second are “extra” tasks, requiring effort, resources, and time, there is the new
measure is the psychological or logical accessibility, which can be defined necessity of having a specific budget allocated for this purpose,
as the ease with which the relevant dataset is discovered through a data otherwise there is the risk that open government initiatives are not
catalogue or repository. As discussed in Section 5.3, this quality dimen- given the priority they deserve, moreover if public entities do not
sion is affected by the format in which the data is published, the search grasp the true value of open data.
tool used, and the discoverability of the dataset (Maali et al., 2010).
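
The consistency and timeliness dimensions (Sections 6.3.4 and 6.3.5) also lend themselves to simple checks. The sketch below is illustrative and rests on stated assumptions: the controlled vocabulary `COUNTRY_CODES`, the choice of a year-month-day date format, the field names, and the `max_age_days` threshold are all hypothetical and would be set by the catalogue in question.

```python
import re
from datetime import date, datetime

# Hypothetical fixed set of options for the "country" field.
COUNTRY_CODES = {"US", "DE", "MT"}
# Assumed agreed date format: four-digit year, two-digit month, two-digit day.
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def parse_iso_date(text):
    """Return a date if text is a valid YYYY-MM-DD string, otherwise None."""
    if not isinstance(text, str) or ISO_DATE.fullmatch(text) is None:
        return None
    try:
        return datetime.strptime(text, "%Y-%m-%d").date()
    except ValueError:  # e.g. month 13 or day 32
        return None


def is_consistent(record):
    """Check that the country comes from the fixed set and the date follows the agreed format."""
    return (
        record.get("country") in COUNTRY_CODES
        and parse_iso_date(record.get("modified")) is not None
    )


def is_timely(record, today, max_age_days):
    """Check that the record was modified at most max_age_days before today."""
    modified = parse_iso_date(record.get("modified"))
    if modified is None:
        return False
    return 0 <= (today - modified).days <= max_age_days


records = [
    {"country": "US", "modified": "2015-07-01"},
    {"country": "United States", "modified": "2015-07-01"},  # value outside the fixed set
    {"country": "DE", "modified": "01/07/2015"},  # arbitrary date order
]
consistent_flags = [is_consistent(r) for r in records]
```

The second and third records reproduce the two inconsistency examples given above: interchangeable use of a country name and its abbreviation, and dates that do not follow a single order.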
6.4.1.5. Technical Support. Most of the existing government data portals
6.3.7. Openness were not envisaged for large-scale open data publishing and consump-
The openness of a dataset directly influences the use, reuse, and tion. Thus, these public entities now require technical support to update
redistribution of data. Tim Berners-Lee's Five Star Scheme for Linked their websites or portals to enable their published data to achieve its
Open Data23 (Fig. 6) can be seen as a mix of the accessibility and usabil- highest reuse potential (Conradie & Choenni, 2012), (Dulong de
ity quality dimensions. As the authors of (Kučera et al., 2013) point out, Rosnay & Janssen, 09, 2014), (Eckartz et al., 2014), (Verma & Gupta,
open data can be technically defined to be open if it is available as a 2013), (Zuiderwijk & Janssen, 2014b).
complete set in an open, machine readable format, at a reasonable price
which is not more than the cost of reproduction. 6.4.1.6. Institutionalisation. Being a relatively new initiative, open data
tasks are usually assigned to employees whose job was already pre-
6.4. Challenges defined, with no institutional structure or public entity dedicated solely
to this task (Dulong de Rosnay & Janssen, 09, 2014), (Eckartz et al., 2014),
There are a number of issues and challenges which hinder govern- (Verma & Gupta, 2013), (Zuiderwijk & Janssen, 2014b). This issue
ments from jumping on the open data bandwagon and from making results in no regular monitoring of the open data initiative performance.
The establishment of open government initiative policies would help in
this challenge by clearly defining required responsibilities.

23 5stardata.info.


Fig. 6. Five star scheme for linked open data (Source: 5stardata.info).

6.4.2. Issues hindering data from being truly open

6.4.2.1. Conflicting regulations. Whilst there is a lack of open government data policies, many open government data initiatives still belong to existing legal frameworks concerning freedom of information, reuse of public sector information, and the exchange of data between public entities. The issue lies in the lack of clarity on how such initiatives can interact, resulting in uncertainty on the possible use of the relevant data. This issue does not only concern data consumers, but also data producers who end up being sceptical of fully opening up their institutions' data, even if it is covered by a clear legal framework (Dulong de Rosnay & Janssen, 09, 2014).

6.4.2.2. Privacy and data protection. There is a considerable conflict between open data and the aims of transparency and accountability, and data protection and the right to privacy (Dulong de Rosnay & Janssen, 09, 2014), (Meijer et al., 09, 2014), (Zuiderwijk & Janssen, 2014a), (Zuiderwijk & Janssen, 2014b). Even though data is anonymised before publishing, the merging of different datasets can still possibly result in the discovery of data of a personal nature (Zuiderwijk et al., 2014). For example, if garbage collecting routes are published, along with the personnel timetable, a data consumer would be able to identify the location of a particular employee. This issue requires more research in order to come up with guidelines that can provide a solution to this conflict; a plausible approach would be to employ access control mechanisms which regulate data access, although this restricts the openness level of such data.

6.4.2.3. Copyright and licensing. The licensing of published data is one of the Eight Open Government Data Principles. The first aspect of this issue is the incompatibility of licences (Dulong de Rosnay & Janssen, 09, 2014). As discussed in Section 6.1.2, data providers should make efforts towards publishing their data in an open format, allowing the free and unrestricted use, reuse and distribution of data. Since there are no agreed-upon standards, this can result in a number of incompatible open licences. While they all, to different degrees, allow the reuse of data, they might contain restrictions which prevent data with different licences from being merged for a specific use. The definition of clear data policies is a means to provide a solution to this challenge. The second aspect of this issue is copyright inconsistencies that arise from unclear dataset ownership resulting from data sharing, for example between public entities (Conradie & Choenni, 2012), (Eckartz et al., 2014), (Zuiderwijk & Janssen, 2014b). This hinders data from being published.

6.4.2.4. Competition. While open data can be considered as unfair competition for private entities, public entities might consider the commercial appropriation of public open data unfair (Dulong de Rosnay & Janssen, 09, 2014), (Eckartz et al., 2014). In the first case, consider companies who invested in creating their own data stores (e.g. database of streets and locations for navigation purposes). If the same data they created is made public through government open data initiatives, these companies will obviously deem it to be unfair competition, as there is the possibility of new competitors who did not need to invest anything but could get the freely available open data. Thus, management mechanisms need to be applied in order to ensure that private companies do not suffer financial consequences due to the opening up of such data. On the other hand, public entities might be reluctant to publish their data openly due to not wanting data belonging to the public (and paid for by taxes) to be used for commercial gain. A possible approach for the latter issue is to provide the data for a nominal fee. Yet, this limits the openness of the data in question.

6.4.2.5. Liability. This issue is limited to data providers. Public entities fear being held liable for damage caused by the use of the provided data, due to it being stale, incorrect, or wrongly interpreted (Dulong de Rosnay & Janssen, 09, 2014), (Eckartz et al., 2014). To cater for this fear, many public entities either do not publish their data or otherwise impose restrictions on its use, resulting in data which is not truly open. In the worst case, due to fears of data being used against the publishing entity, such data might not even be collected/generated any longer (Zuiderwijk & Janssen, 2014b). A possible solution for these issues is to enable social interaction with regards to the data in question. A community of stakeholders within the data platform where the data is published can aid data consumers to better interpret and exploit the published data.

Considering the above risks or negative impacts, it is vital to find a trade-off for open government initiatives. One must keep in mind the numerous benefits associated with open data, but also prepare for any risks, challenges and issues.
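
The re-identification risk raised in Section 6.4.2.2 can be made concrete with a short sketch. It is purely illustrative: the two datasets, their field names, and the identifiers are invented for this example, and the join shown is only one simple way in which separately published data could be merged.

```python
# Hypothetical published dataset 1: garbage collection routes.
routes = [
    {"route": "R1", "street": "Main Street", "start": "06:00"},
    {"route": "R2", "street": "Harbour Road", "start": "06:00"},
]

# Hypothetical published dataset 2: personnel timetable with pseudonymous IDs.
timetable = [
    {"employee_id": "E17", "route": "R1", "shift_start": "06:00"},
    {"employee_id": "E42", "route": "R2", "shift_start": "06:00"},
]


def join_on_route(routes, timetable):
    """Merge the two datasets on their shared route identifier."""
    by_route = {r["route"]: r for r in routes}
    merged = []
    for shift in timetable:
        route = by_route.get(shift["route"])
        if route is not None:
            merged.append(
                {
                    "employee_id": shift["employee_id"],
                    "street": route["street"],
                    "time": shift["shift_start"],
                }
            )
    return merged


# Neither dataset alone places an employee on a street at a given time;
# the merged result does.
merged = join_on_route(routes, timetable)
```

Neither input links a person to a location, yet the merged records do, which is why anonymising each dataset in isolation may not be sufficient.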


6.5. Publishing tools and standards

While there exist a huge number of government data portals that enable data producers to publish their data, there are not many tools aiding data publishers in this task. Yet, efforts are currently being focused on providing portals and other open government data initiatives which allow stakeholders to publish (and consume) datasets without requiring background knowledge on the open data life-cycle. An example of such efforts is the LinDA project.24 In this project a stakeholder is able to upload data in any format, which is then converted to RDF to enable easy linking with other open datasets.

24 http://linda-project.eu/.

The authors of (Hofman & Rajagopal, 2014) propose a technical framework for data sharing between data providers and consumers, based on an analysis of a number of data platforms. They aim to identify, from the relevant literature, the required functionality for data sharing, considering challenges such as different published formats, data ambiguity, and privacy issues.

In (Meijer et al., 2014) the authors present two case studies involving two different public sector entities, with the aim of demonstrating the use of pre-commitment to resolve conflicts during a data request procedure. Pre-commitment involves applying restrictions on the type and content of the data that is available for request, ensuring the data conforms to the legal requirements (e.g. removing privacy sensitive data), and deciding on whether to open the data publicly or restrict its access to specific user groups.

The authors of (Alexopoulos et al., 2014) propose a second generation platform, which offers not only the basic functionality of a government data portal, but also additional functionality (based on Web 2.0 technologies) aiming to stimulate and aid value generation from open government data. This additional functionality includes the capability of performing a number of processing techniques, information and knowledge exchange, and collaboration between stakeholders.

The authors of (Jiříček & Di Massimo, 2011) introduce the European Open Government Data Initiative, which is a free, open-source, cloud-based collection of datasets that public entities can exploit. In this case, public data can be uploaded and stored into the Microsoft Cloud through the Windows Azure Platform and environment. This tool is aimed at experts, and allows developers to use a variety of programming languages. This initiative strives to keep in line with the open government data principles and thus enables data to be openly published in a re-usable format, enables stakeholders to develop new applications based on the published data, allows developers to use the free and customisable source code, and has the aim of enhancing transparency through increased visibility of a government's services.

In contrast to the above, the authors of (Maali et al., 2010) propose a standardised interchange format, the dcat vocabulary, for machine-readable representations of government data catalogues, with the aim of bringing all published datasets into the Web of linked data, resulting in higher interoperability. The use of this interchange format results in a number of advantages:

1. the embedding of machine-readable metadata in Web pages increases discoverability;
2. the decentralised publishing by individual agencies could be aggregated into national or supra-national (e.g. EU-wide) catalogues;
3. catalogues with overlapping scope (e.g. Bonn, Germany and EU) can be searched in a federated manner;
4. one-click download and installation of data packages is available for application developers;
5. priority is given towards archiving and digital preservation of valuable government datasets through the use of manifest files with accurate metadata; and
6. software tools and applications, such as improved search and data visualisation interfaces, can be built to work with multiple, or even across, catalogues.

The dcat vocabulary has since been proposed as a W3C recommendation by the Government Linked Data Working Group.25

25 http://www.w3.org/TR/vocab-dcat/.

7. Impact on stakeholders

Open government data initiatives are based on transparency, citizen participation, and collaboration for strengthening democracy (Arcelus, 2012), (Chan, 2013), (Edelmann et al., 2012), (Matheus, Ribeiro, Vaz, et al., 2012), (Mutuku & Colaco, 2012), (Yang & Kankanhalli, 2013). Through these three pillars, the publishing of government datasets not only has the potential of improving accountability and decreasing corruption, but it also affects all the involved stakeholders in a number of ways. While there is an obvious gap in the literature with regards to frameworks which assess the impact achieved through open government data initiatives, a number of authors discuss the different impacts that can be obtained through such initiatives. The authors of (López-Ayllón & Arellano Gault, 2008) depict the different levels of impact that can be achieved by an open government data initiative. We adapt these levels in Fig. 7 and portray, in context, how each impact builds upon or supports the other impacts. While each impact does not strictly require the previous one, each impact supports the next one in achieving a higher level of impact on the relevant stakeholders.

As shown in Fig. 7, the most direct impact is access to information. Once data is published (made open to a given degree), this impact is immediately effective, since it provides the means for data to be reused. Of course, the data's reuse is conditional on how the data is published (its level of openness), and the consumer's willingness to participate in such an effort. Through providing access to relevant information, an open government data initiative can be more transparent.

Transparency, the second level of impact for publishing government data, can result in a considerable increase in social control by citizens by enabling them to scrutinise the data. Subsequently, if provided with the relevant means, they can also provide relevant feedback to the data provider, and monitor policies and government initiatives (dos Santos Brito, dos Santos Neto, et al., 2014), (Matheus, Ribeiro, Vaz, et al., 2012), (Yang et al., 2013). Consequently, stakeholders gain more responsibilities as they are able to interact with the government and other public entities more actively than in traditional governmental structures. For example, following the publishing of budget data, stakeholders such as citizens, NGOs and even other private entities can provide feedback on budget priorities and specific transactions.26

26 http://www.participatorybudgeting.org/about-participatory-budgeting/where-has-it-worked/.

Therefore, by easing social control, open government data initiatives allow citizens to further exercise their duty and right of participation. Moreover, this helps citizens establish a trusting relationship with the government, which is able to prove the legitimacy of the actions taken.

The increased transparency resulting from publishing data will also impact public administrations in that there will be enhanced accountability within public sectors. The authors of (Bovens, 2007) define accountability as the disclosure of data that provides stakeholders with the information required for assessing the propriety and effectiveness of the government's conduct, while the authors of (López-Ayllón & Arellano Gault, 2008) identify accountability as having a dimension of answerability. They separate the latter into two components, namely information and justification. The first implies that there should be an entity that is obliged to provide information to which the stakeholders should have access. Justification, on the other hand, is more challenging to achieve since it implies that the data-providing entity should justify their actions to the citizens. Yet, as the authors of (Lourenço, 2013) point out, even if the published data is usable and adheres to good
quality standards, the simple provision of data does not guarantee that the public entity or government is immediately enhancing transparency and/or accountability.

Fig. 7. Relationship between different impacts of open government data initiatives.

Through the long term interaction with an open government data platform, open data promotes not just transparency and accountability, but also democracy (Mutuku & Colaco, 2012). As mentioned in the example for the budget data, stakeholders can be enabled to provide feedback on the published data. Such feedback loops will not only inform the public entity of the public opinion, but can also improve service delivery through the repeated querying of the open data by all stakeholders, including citizens and government agencies. For example, the analysis of published budget data would enable the shift from a centralised government to a citizen-centric governance model.

While datasets are usually published in their raw form, and thus have little value on their own, public entities can leverage other stakeholders, such as the private sector, community groups, and citizens, to innovate upon the published data and strive to achieve the utmost potential of open government data initiatives (Edelmann et al., 2012), (Mutuku & Colaco, 2012), (Yang & Kankanhalli, 2013). Benefits are plenty, including exploiting user participation (crowdsourcing) in order to enhance data quality through feedback (OHara, 2014). Yet, active participation is not so simply achieved. While open data initiatives form the basis for citizen participation and collaboration, there is no guarantee that there is actually any resulting participation or collaboration (Alexopoulos et al., 2014), (Chan, 2013), (Edelmann et al., 2012), (Solar et al., 2013). Moreover, as the authors of (Mercado-Lara & Gil-Garcia, 2014) and (Mutuku & Colaco, 2012) point out, there is the need to bridge the gap between data providers and consumers by using data intermediaries. Thus, those who can make sense of the published data should interact with the software developers in such a way that the latter can develop innovative applications or services based on the published data. Even though this informal type of collaboration is facilitated by the existing technologies, it is not yet fully endorsed by public and governmental entities.

7.1. Challenges

The need to instigate data reuse through citizen participation is essential, as it promotes the innovative potential of developers and other stakeholders. This is, however, easier said than done. A number of barriers hinder public participation, and mostly include challenges related to the cultural domain.

The authors of (Solar et al., 2013) point out the need for an action plan for stimulating the consumption of open datasets between both the original data producers and the other consumers. User participation usually follows a “90-9-1 rule”27 where:

– 90% of users are lurkers who follow by reading or observing but do not actively contribute;
– 9% of users contribute from time to time, but other priorities dominate their time;
– 1% of users participate a lot and account for most contributions.

27 www.nngroup.com/articles/participation-inequality/.

‘Lurking’ tends to have a negative connotation; however, lurking is also valuable in a democratic society where an informed citizen can take effective decisions (Edelmann et al., 2012). In open government data initiatives, the aim is to achieve the highest possible number of active users, keeping in mind that collaboration is not done for the sake of doing it, but to enable all stakeholders to participate in efficient and effective decisions.

Anonymity can be seen as an advantage in online participation. It allows anyone to speak freely about his/her opinions and about any agendas they might be interested in, without the fear of being persecuted for them. This makes it easier for stakeholders to participate in efforts such as decision-making. Yet, anonymity also has its downside as it allows participants to contribute undesirable and useless information, as well as making participants more likely to insult or verbally attack others whilst hiding behind their anonymity (Edelmann et al., 2012). Furthermore, a single user can use multiple online identities to manipulate the discussion in progress.

The participation of third parties in processes such as policy-making or decision-taking does not only potentially increase citizen satisfaction, but it also increases the potential of more innovative solutions or approaches to problems. (von Lucke & Große, 2014) term this participation as open government collaboration, which involves the collaboration of different entities during the implementation, monitoring, and evaluation of policies. Entities such as unions and political party associations were traditionally included in the process of policy-making. Yet, these entities do not represent all members of society equally. By allowing all stakeholders to participate through eParticipation, a new collaboration approach that enables a many-to-many communication allows all individuals to participate in shaping the democracy they live in.

Although the benefits of open data outweigh the efforts required, it appears that there is a lack of public participation in open government data initiatives. In (Yang & Kankanhalli, 2013) the authors identify that the lack of research on the factors influencing external stakeholders' decision to participate and consume open data might be a factor in this problem. The authors of (Chan, 2013), on the other hand, point out that governmental agencies do not have effective strategies to encourage participation from external stakeholders. Such public entities must come to the realisation that successful open data initiatives are based on the actual usage of the data rather than simply the creation of an open data portal. In (Bertot et al., 2014), the authors carry out a case study with the aim of identifying how community data can be leveraged through public libraries. Amongst the authors' conclusions, they point out that stakeholders (i) not only need more data, but need it to be meaningful, (ii) need the identification of best practices for using the data, and (iii) request the collaboration of different stakeholder communities.

7.2. Motivating the use of open data

On the premise that the role of government agencies in open data initiatives is not only to publish the data, public agencies are starting to focus their efforts on motivating external stakeholders to use the published data. While there is no agreed-upon method to achieve public
participation, there are a number of popular methods. Challenge competitions are a commonly-used approach (Foulonneau et al., 2014), where the competition involves developing the best application, or finding an innovative use, based on the published data. Usually the winners are awarded a prize or recognition for their efforts. A disadvantage of such competitions is that most participants are usually novices rather than professionals (Chan, 2013). This is of course somewhat reflected in the submitted entries, which tend to be amateurish. Moreover, the entries do not usually contribute to the development of sustainable services (Foulonneau et al., 2014). Professionals are usually deterred from participating in such competitions due to the minimal (if any) prize money. In any case, it is evident that in such cases the governmental entity does not have any direct control on the output of the competition, and there is no assurance on the quality. For these reasons, challenge competitions are more suitable for just raising awareness about the open data initiative, and introducing stakeholders to public participation. Another approach towards encouraging participation is Calls for Collaboration, where companies are invited to submit proposals to create particular services. As opposed to challenge competitions, the governmental entity now has a say as to what will be developed as the output of the call, as well as the possibility to require the participants to meet specific requirements.

A number of publications in the literature attempt to identify the best method to achieve public participation. In (Mutuku & Colaco, 2012), (Yang & Kankanhalli, 2013), the authors set out their intention to research the best practices for increasing the consumption of open data. The authors of (Kalampokis, Hausenblas & Tarabanis, 2011) research the use of social media platforms in eParticipation and propose a two-phased approach for backing participatory decision-making, along with an architecture which supports its implementation. This approach is based on the integration of government and social data and attempts firstly to help the government identify public opinion and predict public reactions, and secondly to enable citizens and stakeholders to contribute to the decision-making process. With the similar aim of identifying what motivates stakeholders to participate and collaborate, the authors of (Chan, 2013) identify a set of considerations for motivating stakeholders to innovate upon the published datasets.

Table 3
Overview of challenges in open government data initiatives.

Nature of challenge | Challenge | Possible solution
Technical | Formats | Using a machine-processable, non-proprietary format
Technical | Ambiguity | Using a descriptive format; Adding documentation/metadata
Technical | Discoverability | Using good quality metadata; More advanced search tools on portals
Technical | Representation | Defining and using standardised representation; Using named graphs for versioning
Technical | Capacity | Applying standards; Large-scale training
Policy/Legal | Copyright/Licensing | Defining standard data policies
Policy/Legal | Conflicting Regulations | Defining open government data initiative policies and legal frameworks
Policy/Legal | Privacy/Data Protection | Defining privacy regulations; Implementing access control mechanisms (this limits the openness of the data)
Policy/Legal | Liability | Social interaction; Raising awareness; Defining legal frameworks
Economic/Financial | Budget Provision | Providing budget specifically for open data initiatives
Organisational | Institutionalisation | Re-organising the current organisational structure; Defining open government initiative policies
Organisational | Overlapping Scope | Using provenance metadata
Organisational | Technical Support | Providing support to public entities with the executing of an open data initiative
Cultural | Motivation | Raising awareness on the reuse of open data and its benefits
Cultural | Awareness | Highlighting the value and potential of open data
Cultural | Public Participation | Raising awareness; providing incentives
Cultural | Competition | Providing specific data at a nominal fee (this limits the openness of the data)

8. Conclusion

In this paper we give an overview of the open government data initiatives surveyed in our systematic research. The aim behind this research is to answer a set of questions, mainly concerning open government data initiatives and their impact on stakeholders, existing approaches for publishing and consuming open government data, existing guidelines, and challenges (see Table 3) for the discussed approaches. We identify corruption to be the major problem which triggered open government data initiatives, and we point out the various motivations for opening government data. One major motivation is transparency, which however should not be an end in itself. It should rather be a means to enhance an open government initiative. This perspective will prevent governments from publishing their data for the sake of it, rather than striving to provide useful data which stakeholders can use, reuse and distribute, and ideally even innovate upon.

Based on existing open data life-cycles and on existing open data initiatives, we define the open government data life-cycle, which is provided as the depiction of the processes and their ideal order required during the lifetime of open government data. The definition of this life-cycle is not meant to be an extensive description of the processes; rather we propose it to act as a guideline for stakeholders to follow during their participation in an open government data initiative.

One of our main contributions is the discussion about open government data initiatives. We first discuss different assessment frameworks for evaluating various aspects of open government initiatives. We follow by providing a summary of open government initiative evaluations found in our primary studies. The various publications covered evaluate different aspects of the initiatives, such as the features provided, the openness level of the available data, and the impact on relevant stakeholders. Many of them also evaluate the current status for specific administrative regions. Based on the results of our evaluations, we proceed to point out challenges and issues which hinder open government initiatives from reaching their full potential, and we also suggest possible solutions.

In this paper we focus on the publishing and consumption processes of open government data, which are the most essential processes within the life-cycle. We classify different publishing and consumption approaches, and identify different data quality aspects which influence or are influenced by the approaches undertaken for consuming or publishing the data. Based on the literature covered in the survey, the Eight Open Government Data Principles, and the Five Star Scheme for Linked Open Data, we extract and integrate various guidelines for publishing open government data. Adhering to these guidelines will improve the end usability of the data (for consumption), and the resulting success of the initiative in question. Unfortunately, while some solutions exist, there are still a number of factors which discourage public entities from jumping on the open data bandwagon in the first place, as well as other issues which hinder data from being truly open. Besides, even though efforts are being targeted towards producing publishing tools to aid data publishers in their task, there are no fixed standards to follow.

To conclude, we revisit the research questions posed in Section 2.1 and summarise the discussions in this paper with the following observations:

– What are existing approaches for publishing or consuming open government data, and how can they be classified?
Open government data initiatives vary in nature, and the implemented approaches reflect this heterogeneity. However, the most common approaches include data portals, data catalogues, and services.
– What are the supported technical aspects, features and functions in existing approaches?
The aim behind most open government data initiatives is to publish data in order to make it available for reuse. The most commonly available feature is therefore the availability of data. This basic feature is then complemented through other technical aspects, together with features and functions, such as multilinguality, different data formats, data accessibility, data content, and visualisation tools.
– Are there any defined guidelines for the publishing or consumption of open government data?
While a number of different guidelines are defined in literature, there are no agreed-upon standards for the publishing or consumption of open government data. Yet, by following the integrated overview of guidelines we propose, we attempt to provide a higher possibility for an open government data initiative to succeed.
– What are existing challenges with publishing or consuming open government data?
We identified and explored a number of challenges, including technical, policy and legal, economic and financial, organisational, and cultural barriers.
– What are possible impacts of open government initiatives on relevant stakeholders?
Transparency was identified to be one main aim of opening government data; however, it is not the only impact. There are varying impacts of open government data initiatives, including the direct impact of access to information that results in more informed citizens, as well as an increase in accountability and a greater opportunity for citizens to actively participate in governance processes.

References

Alexopoulos, C., Spiliotopoulou, L., & Charalabidis, Y. (2013). Open data movement in Greece: A case study on open government data sources. Proceedings of the 17th Panhellenic Conference on Informatics (pp. 279–286). New York, NY, USA: PCI '13, ACM. http://dx.doi.org/10.1145/2491845.2491876.
Alexopoulos, C., Zuiderwijk, A., Charalabidis, Y., Loukis, E., & Janssen, M. (2014). Designing a second generation of open data platforms: Integrating open data and social media. Electronic Government – 13th IFIP WG 8.5 International Conference, EGOV 2014, Dublin, Ireland, September 1-3, 2014. Proceedings (pp. 230–241). http://dx.doi.org/10.1007/978-3-662-44426-9_19.
Arcelus, J. (2012). Framework for useful transparency websites for citizens. Proceedings of the 6th International Conference on Theory and Practice of Electronic Governance (pp. 83–86). New York, NY, USA: ICEGOV '12, ACM. http://dx.doi.org/10.1145/2463728.2463749.
Bakıcı, T., Almirall, E., & Wareham, J. (2013). A smart city initiative: The case of Barcelona. Journal of the Knowledge Economy, 4(2), 135–148. http://dx.doi.org/10.1007/s13132-012-0084-9.
Bertot, J.C., Butler, B.S., & Travis, D. (2014). Local big data: The role of libraries in building community data infrastructures. 15th Annual International Conference on Digital Government Research, dg.o '14, Aguascalientes, Mexico, June 18-21, 2014 (pp. 17–23). http://dx.doi.org/10.1145/2612733.2612762.
Bizer, C., Heath, T., & Berners-Lee, T. (2009). Linked data – The story so far. International Journal Semantic Web Information Systems, 5(3), 1–22.
Bogdanović-Dinić, S., Veljković, N., & Stoimenov, L. (2014). How open are public government data? An assessment of seven open data portals. In M.P. Rodríguez-Bolívar (Ed.), Measuring E-government Efficiency, Public Administration and Information Technology, vol. 5. (pp. 25–44). New York: Springer. http://dx.doi.org/10.1007/978-1-4614-9982-4_3.
Böhm, C., Freitag, M., Heise, A., Lehmann, C., Mascher, A., Naumann, F., Ercegovac, V., Hernandez, M., Haase, P., & Schmidt, M. (2012). Govwild: Integrating open government data for transparency. Proceedings of the 21st International Conference Companion on World Wide Web (pp. 321–324). New York, NY, USA: WWW '12 Companion, ACM. http://dx.doi.org/10.1145/2187980.2188039.
Bovens, M. (2007). Analysing and assessing accountability: A conceptual framework. European Law Journal, 13(4), 447–468. http://dx.doi.org/10.1111/j.1468-0386.2007.00378.x.
Carroll, J.J., Bizer, C., Hayes, P., & Stickler, P. (2005). Named graphs, provenance and trust. Proceedings of the 14th International Conference on World Wide Web (pp. 613–622). New York, NY, USA: WWW '05, ACM. http://dx.doi.org/10.1145/1060745.1060835.
Chan, C.M. (2013). From open data to open innovation strategies: Creating e-services using open government data. 2014 47th Hawaii International Conference on System Sciences (pp. 1890–1899).
Conradie, P., & Choenni, S. (2012). Exploring process barriers to release public sector information in local government. Proceedings of the 6th International Conference on Theory and Practice of Electronic Governance (pp. 5–13). New York, NY, USA: ICEGOV '12, ACM. http://dx.doi.org/10.1145/2463728.2463731.
Davies, T., & Frank, M. (2013). 'There's no such thing as raw data': Exploring the socio-technical life of a government dataset. Proceedings of the 5th Annual ACM Web Science Conference (pp. 75–78). New York, NY, USA: WebSci '13, ACM. http://dx.doi.org/10.1145/2464464.2464472.
Debattista, J., Lange, C., & Auer, S. (2014). Representing dataset quality metadata using multi-dimensional views. Proceedings of the 10th International Conference on Semantic Systems (pp. 92–99). New York, NY, USA: SEM '14, ACM. http://dx.doi.org/10.1145/2660517.2660525.
DiFranzo, D., Graves, A., Erickson, J., Ding, L., Michaelis, J., Lebo, T., Patton, E., Williams, G., Li, X., Zheng, J., Flores, J., McGuinness, D., & Hendler, J. (2011). The web is my back-end: Creating mashups with linked open government data. In D. Wood (Ed.), Linking Government Data (pp. 205–219). New York: Springer. http://dx.doi.org/10.1007/978-1-4614-1767-5_10.
dos Santos Brito, K., Silva Costa, M., Cardoso Garcia, V., & Romero de Lemos Meira, S. (2014, July). Experiences integrating heterogeneous government open data sources to deliver services and promote transparency in Brazil. Computer Software and Applications Conference (COMPSAC), 2014 IEEE 38th Annual (pp. 606–607).
dos Santos Brito, K., dos Santos Neto, M., da Silva Costa, M. A., Garcia, V. C., & de Lemos Meira, S. R. (2014). Using parliamentary Brazilian open data to improve transparency and public participation in Brazil. Proceedings of the 15th Annual International Conference on Digital Government Research (pp. 171–177). New York, NY, USA: dg.o '14, ACM. http://dx.doi.org/10.1145/2612733.2612769.
dos Santos Brito, K., da Silva Costa, M. A., Garcia, V. C., & de Lemos Meira, S. R. (2014). Brazilian government open data: Implementation, challenges, and potential opportunities. Proceedings of the 15th Annual International Conference on Digital Government Research (pp. 11–16). New York, NY, USA: dg.o '14, ACM. http://dx.doi.org/10.1145/2612733.2612770.
Dulong de Rosnay, M., & Janssen, K. (2014). Legal and institutional challenges for opening data across public sectors: Towards common policy solutions. Journal of theoretical and applied electronic commerce research, 9, 1–14. http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000300002&nrm=iso.
Dyba, T., Dingsoyr, T., & Hanssen, G.K. (2007). Applying systematic reviews to diverse study types: An experience report. Proceedings of the First International Symposium on Empirical Software Engineering and Measurement (pp. 225–234). Washington, DC, USA: ESEM '07, IEEE Computer Society. http://dx.doi.org/10.1109/ESEM.2007.21.
Eckartz, S., Hofman, W., & Van Veenstra, A. (2014). A decision model for data sharing. In M. Janssen, H. Scholl, M. Wimmer, & F. Bannister (Eds.), Electronic Government, Lecture Notes in Computer Science, vol. 8653. (pp. 253–264). Berlin Heidelberg: Springer. http://dx.doi.org/10.1007/978-3-662-44426-9_21.
Edelmann, N., Höchtl, J., & Sachs, M. (2012). Collaboration for open innovation processes in public administrations. In Y. Charalabidis, & S. Koussouris (Eds.), Empowering Open and Collaborative Governance (pp. 21–37). Springer. http://dblp.uni-trier.de/db/books/daglib/0028914.html#EdelmannHS12.
Egger-Peitler, I., & Polzer, T. (2014). Open data: European ambitions and local efforts. Experiences from Austria. In M. Gascó-Hernández (Ed.), Open Government, Public Administration and Information Technology, vol. 4. (pp. 137–154). New York: Springer. http://dx.doi.org/10.1007/978-1-4614-9563-5_9.
Foulonneau, M., Martin, S., & Turki, S. (2014). How open data are turned into services? In M. Snene, & M. Leonard (Eds.), Exploring Services Science, Lecture Notes in Business Information Processing, vol. 169. (pp. 31–39). Springer International Publishing. http://dx.doi.org/10.1007/978-3-319-04810-9_3.
Fuentes-Enriquez, R., & Rojas-Romero, Y. (2013). Developing accountability, transparency and government efficiency through mobile apps: The case of Mexico. Proceedings of the 7th International Conference on Theory and Practice of Electronic Governance (pp. 313–316). New York, NY, USA: ICEGOV '13, ACM. http://dx.doi.org/10.1145/2591888.2591944.
González, J.C., Garcia, J., Cortés, F., & Carpy, D. (2014). Government 2.0: A conceptual framework and a case study using Mexican data for assessing the evolution towards open governments. Proceedings of the 15th Annual International Conference on Digital Government Research (pp. 124–136). New York, NY, USA: dg.o '14, ACM. http://dx.doi.org/10.1145/2612733.2612742.
Hendler, J., Holm, J., Musialek, C., & Thomas, G. (2012). US government linked open data: Semantic.data.gov. IEEE Intelligent Systems, 27(3), 25–31.
Hofman, W., & Rajagopal, M. (2014). A technical framework for data sharing. Journal of theoretical and applied electronic commerce research, 9, 45–58. http://www.scielo.cl/scielo.php?script=sci_arttext&pid=S0718-18762014000300005&nrm=iso.
Höchtl, J., & Reichstädter, P. (2011). Linked open data – A means for public sector information management. In K.N. Andersen, E. Francesconi, Å. Grönlund, & T.M. van Engers (Eds.), EGOVIS. Lecture Notes in Computer Science. 6866. (pp. 330–343). Springer. http://dblp.uni-trier.de/db/conf/egov/egovis2011.html#HochtlR11.
Please cite this article as: Attard, J., et al., A systematic review of open government data initiatives, Government Information Quarterly (2015),
http://dx.doi.org/10.1016/j.giq.2015.07.006
Janev, V., Mijović, V., Paunović, D., & Milošević, U. (2014). Modeling, fusion and Mutuku, L.N., & Colaco, J. (2012). Increasing kenyan open data consumption: A design
exploration of regional statistics and indicators with linked data tools. In A. Ko, & E. thinking approach. Proceedings of the 6th International Conference on Theory and Prac-
Francesconi (Eds.), Electronic Government and the Information Systems Perspective, tice of Electronic Governance (pp. 18–21). New York, NY, USA: ICEGOV '12, ACM.
Lecture Notes in Computer Science. 8650. (pp. 208–221). Springer International Pub- http://dx.doi.org/10.1145/2463728.2463733.
lishing. http://dx.doi.org/10.1007/978-3-319-10178-1_17. Ochoa, X., & Duval, E. (June 2006). Quality metrics for learning object metadata. In E.
Jetzek, T., Avital, M., & Bjørn-Andersen, N. (2014). Generating sustainable value from open Pearson, & P. Bohman (Eds.), Proceedings of World Conference on Educational Multime-
data in a sharing society. In B. Bergvall-Kåreborn, & P. Nielsen (Eds.), Creating Value dia, Hypermedia and Telecommunications 2006 (pp. 1004–1011). Chesapeake, VA:
for All Through IT, IFIP Advances in Information and Communication Technology. 429. AACE [http://www.editlib.org/p/23127].
(pp. 62–82). Berlin Heidelberg: Springer. http://dx.doi.org/10.1007/978-3-662- O'Hara, K. (2014). Enhancing the quality of open data. In L. Floridi, & P. Illari (Eds.), The
43459-8_5. Philosophy of Information Quality, Synthese Library. 358. (pp. 201–215). Springer
Jiříček, Z., & Di Massimo, F. (2011). Microsoft open government data initiative (ogdi), eye International Publishing. http://dx.doi.org/10.1007/978-3-319-07121-3_11.
on earth case study. In J. Hřebíček, G. Schimak, & R. Denzer (Eds.), Environmental Palmirani, M., Martoni, M., & Girardi, D. (2014). Open government data beyond transpar-
Software Systems. Frameworks of eEnvironment, IFIP Advances in Information and ency. In A. Ko, & E. Francesconi (Eds.), Electronic Government and the Information Sys-
Communication Technology. 359. (pp. 26–32). Berlin Heidelberg: Springer. http://dx. tems Perspective, Lecture Notes in Computer Science. 8650. (pp. 275–291). Springer
doi.org/10.1007/978-3-642-22285-6_3. International Publishing. http://dx.doi.org/10.1007/978-3-319-10178-1_22.
Juran, J.M. (1974). Juran's quality handbook (4th edn ). Mcgraw-Hill (Tx). Parycek, P., Hochtl, J., & Ginner, M. (05 2014). Open government data implementation eval-
Kalampokis, E., Hausenblas, M., & Tarabanis, K.A. (2011). Combining social and govern- uation. Journal of theoretical and applied electronic commerce research, 9, 80–99 [http://
ment open data for participatory decision-making. In E. Tambouris, A. Macintosh, & www.scielo.cl/scielo.php?script=sci_arttext pid=S0718-18762014000200007 nrm=
H. de Bruijn (Eds.), ePart. Lecture Notes in Computer Science. 6847. (pp. 36–47). iso.].
Springer [http://dblp.uni-trier.de/db/conf/epart/epart2011.html#KalampokisHT11]. Petychakis, M., Vasileiou, O., Georgis, C., Mouzakitis, S., & Psarras, J. (05 2014). A state-of-the-
Kalampokis, E., Tambouris, E., & Tarabanis, K. (Jun 2011). A classification scheme for open art analysis of the current public data landscape from a functional, semantic and techni-
government data: Towards linking decentralised data. International Journal of Web cal perspective. Journal of theoretical and applied electronic commerce research, 9, 34–47
Engineering and Technology, 6(3), 266–285. http://dx.doi.org/10.1504/IJWET.2011. [http://www.scielo.cl/scielo.php?script=sci_arttext pid=S0718-18762014000200004
040725. nrm=is].
Kitchenham, B. (2004). Procedures for performing systematic reviews. Tech. rep. Pipino, L.L., Lee, Y.W., & Wang, R.Y. (Apr 2002). Data quality assessment. Communications
Departament of Computer Science, Keele University. of the ACM, 45(4), 211–218. http://dx.doi.org/10.1145/505248.506010.
Kučera, J., Chlapek, D., & Nečaský, M. (2013). Open government data catalogs: Current Prieto, L.M., Rodrguez, A.C., & Pimiento, J. (2012). Implementation framework for open
approaches and quality perspective. Technology-Enabled Innovation for Democracy, Gov- data in colombia. Proceedings of the 6th International Conference on Theory and Practice
ernment and Governance, Lecture Notes in Computer Science. 8061. (pp. 152–166). Berlin of Electronic Governance (pp. 14–17). New York, NY, USA: ICEGOV '12, ACM. http://dx.
Heidelberg: Springer. http://dx.doi.org/10.1007/978-3-642-40160-2_13. doi.org/10.1145/2463728.2463732.
Layne, K., & Lee, J. (2001). Developing fully functional e-government: A four stage model. Reiche, K.J., & Höfig, E. (2013). Implementation of metadata quality metrics and applica-
Government Information Quarterly, 18(2), 122–136 [http://www.sciencedirect. tion on public government data. COMPSAC Workshops (pp. 236–241).
com/science/article/pii/S0740624X01000661.]. Rojas, L., Bermúdez, G., & Lovelle, J. (2014). Open data and big data: A perspective from
Lin, C., & Yang, H.C. (2014). Data quality assessment on taiwan's open data sites. In L.L. Wang, colombia. In L. Uden, D. Fuenzaliza Oshee, I.H. Ting, & D. Liberona (Eds.), Knowledge
J. June, C.H. Lee, K. Okuhara, & H.C. Yang (Eds.), Multidisciplinary Social Networks Research, Management in Organizations, Lecture Notes in Business Information Processing. 185.
Communications in Computer and Information Science. 473. (pp. 325–333). Berlin Heidel- (pp. 35–41). Springer International Publishing. http://dx.doi.org/10.1007/978-3-
berg: Springer. http://dx.doi.org/10.1007/978-3-662-45071-0_26. 319-08618-7_4.
Liu, Q., Bai, Q., Ding, L., Pho, H., Chen, Y., Kloppers, C., McGuinness, D., Lemon, D., de Souza, Sanabria, P., Pliscoff, C., & Gomes, R. (2014). E-government practices in south american
P., Fitch, P., & Fox, P. (2011). Linking australian government data for sustainability sci- countries: Echoing a global trend or really improving governance? The experiences
ence — A case study. In D. Wood (Ed.), Linking Government Data (pp. 181–204). New of Colombia, Chile, and Brazil. In M. Gascó-Hernández (Ed.), Open Government, Public
York: Springer. http://dx.doi.org/10.1007/978-1-4614-1767-5_9. Administration and Information Technology. 4. (pp. 17–36). New York: Springer. http://
López-Ayllón, S., & Arellano Gault, D. (2008). Estudio en materia de transparencia de otros dx.doi.org/10.1007/978-1-4614-9563-5_2.
sujetos obligados por la Ley Federal de Transparencia y Acceso a la Información Pública Sandoval-Almazan, R., Gil-Garcia, J.R., Luna-Reyes, L.F., Luna, D.E., & Rojas-Romero, Y.
Gubernamental. Centro de Investigación y Docencia Económicas: Instituto Federal (2012). Open government 2.0: Citizen empowerment through open data, web and
de Acceso a la Información: UNAM. Instituto de Investigaciones Jurdicas. mobile apps. Proceedings of the 6th International Conference on Theory and Practice of
Lourenço, R.P. (2013). Open government portals assessment: A transparency for account- Electronic Governance (pp. 30–33). New York, NY, USA: ICEGOV '12, ACM. http://dx.
ability perspective. In M. Wimmer, M. Janssen, & H.J. Scholl (Eds.), EGOV. Lecture Notes doi.org/10.1145/2463728.2463735.
in Computer Science. 8074. (pp. 62–74). Springer [http://dblp.uni-trier.de/db/conf/ Sandoval-Almazan, R., & Gil-Garcia, J. (2014). Towards an evaluation model for open gov-
egov/egov2013.html#Lourenco13]. ernment: A preliminary proposal. In M. Janssen, H. Scholl, M. Wimmer, & F. Bannister
Lourenço, R., & Serra, L. (2014). An online transparency for accountability maturity model. (Eds.), Electronic Government, Lecture Notes in Computer Science. 8653. (pp. 47–58).
In M. Janssen, H. Scholl, M. Wimmer, & F. Bannister (Eds.), Electronic Government, Lec- Berlin Heidelberg: Springer. http://dx.doi.org/10.1007/978-3-662-44426-9_4.
ture Notes in Computer Science. 8653. (pp. 35–46). Berlin Heidelberg: Springer. http:// Sayogo, D., Pardo, T., & Cook, M. (Jan 2014). A framework for benchmarking open govern-
dx.doi.org/10.1007/978-3-662-44426-9_3. ment data efforts. System Sciences (HICSS), 2014 47th Hawaii International Conference
von Lucke, J., & Große, K. (2014). Open government collaboration. In M. Gascó-Hernández on (pp. 1896–1905).
(Ed.), Open Government, Public Administration and Information Technology. 4. (pp. Shadbolt, N., O'Hara, K., Salvadores, M., & Alani, H. (2011). Egovernment. In John Domingue,
189–204). New York: Springer. http://dx.doi.org/10.1007/978-1-4614-9563-5_12. Dieter Fensel, & James Hendler (Eds.), Handbook of Semantic Web Technologies
Maali, F., Cyganiak, R., & Peristeras, V. (2010). Enabling interoperability of government (pp. 840–900). Springer-Verlag. http://dx.doi.org/10.1007/978-3-540-92913-0_20.
data catalogues. In M. Wimmer, J.L. Chappelet, M. Janssen, & H.J. Scholl (Eds.), Sheffer Correa, A., Correa, P., Silva, D., & Soares Correa da Silva, F. (June 2014). Really
EGOV. Lecture Notes in Computer Science. (pp. 339–350). Springer. opened government data: A collaborative transparency at sight. Big Data (BigData
Marienfeld, F., Schieferdecker, I., Lapi, E., & Tcholtchev, N. (2013). Metadata aggregation at Congress), 2014 IEEE International Congress on (pp. 806–807).
govdata.de: An experience report. Proceedings of the 9th International Symposium on Solar, M., Concha, G., & Meijueiro, L. (2012). A model to assess open government data in
Open Collaboration (pp. 21:1–21:5). New York, NY, USA: WikiSym '13, ACM. http:// public agencies. In H.J. Scholl, M. Janssen, M. Wimmer, C.E. Moe, & L.S. Flak (Eds.),
dx.doi.org/10.1145/2491055.2491077. EGOV. Lecture Notes in Computer Science. 7443. (pp. 210–221). Springer [http://dblp.
Martin, S., Foulonneau, M., Turki, S., & Ihadjadene, M. (2013). Open data: Barriers, risks, uni-trier.de/db/conf/egov/egov2012.html#SolarCM12].
and opportunities. European Conference on eGovernment, Como, Italy [June 13-14]. Solar, M., Meijueiro, L., & Daniels, F. (2013). A guide to implement open data in public
Martin, S., Foulonneau, M., & Turki, S. (2013). 1-5 stars: Metadata on the openness level of agencies. In M. Wimmer, M. Janssen, & H.J. Scholl (Eds.), EGOV. Lecture Notes in Com-
open data sets in europe. In E. Garoufallou, & J. Greenberg (Eds.), MTSR. Communica- puter Science. 8074. (pp. 75–86). Springer [http://dblp.uni-trier.de/db/conf/egov/
tions in Computer and Information Science. 390. (pp. 234–245). Springer http://dblp. egov2013.html#SolarMD13].
uni-trier.de/db/conf/mtsr/mtsr2013.html#MartinFT13. Styrin, E., Dmitrieva, N., & Zhulin, A. (2013). Openness evaluation framework for public
Matheus, R., Ribeiro, M.M., Vaz, J.C., & de Souza, C.A. (2012). Anti-corruption online mon- agencies. Proceedings of the 7th International Conference on Theory and Practice of
itoring systems in brazil. Proceedings of the 6th International Conference on Theory and Electronic Governance (pp. 370–371). New York, NY, USA: ICEGOV '13, ACM. http://
Practice of Electronic Governance (pp. 419–425). New York, NY, USA: ICEGOV '12, dx.doi.org/10.1145/2591888.2591964.
ACM. http://dx.doi.org/10.1145/2463728.2463809. Vasa, M., & Tamilselvam, S. (2014). Building apps with open data in india: An experience.
Matheus, R., Ribeiro, M.M., & Vaz, J.C. (2012). New perspectives for electronic government Proceedings of the 1st International Workshop on Inclusive Web Programming -
in brazil: The adoption of open government data in national and subnational govern- Programming on the Web with Open Data for Societal Applications (pp. 1–7). New
ments of brazil. Proceedings of the 6th International Conference on Theory and Practice York, NY, USA: IWP 2014, ACM. http://dx.doi.org/10.1145/2593761.2593763.
of Electronic Governance (pp. 22–29). New York, NY, USA: ICEGOV '12, ACM. http://dx. Veljković, N., Bogdanović-Dinić, S., & Stoimenov, L. (2012). Web 2.0 as a technological
doi.org/10.1145/2463728.2463734. driver of democratic, transparent, and participatory government. In C.G. Reddick, &
Meijer, R., Conradie, P., & Choenni, S. (09 2014). Reconciling contradictions of open data S.K. Aikins (Eds.), Web 2.0 Technologies and Democratic Governance, Public Administra-
regarding transparency, privacy, security and trust. Journal of theoretical and applied tion and Information Technology. 1. (pp. 137–151). New York: Springer. http://dx.doi.
electronic commerce research, 9, 32–44 [http://www.scielo.cl/scielo.php?script=sci_ org/10.1007/978-1-4614-1448-3_9.
arttext pid=S0718-18762014000300004 nrm=is]. Veljković, N., Bogdanović-Dinić, S., & Stoimenov, L. (2014). Benchmarking open govern-
Mercado-Lara, E., & Gil-Garcia, J.R. (2014). Open government and data intermediaries: ment: An open data perspective. Government Information Quarterly, 31(2), 278–290
The case of aiddata. Proceedings of the 15th Annual International Conference on Digital [http://www.sciencedirect.com/science/article/pii/S0740624X14000434.].
Government Research (pp. 335–336). New York, NY, USA: dg.o '14, ACM. http://dx.doi. Verma, N., & Gupta, M.P. (2013). Open government data: Beyond policy & portal, a study
org/10.1145/2612733.2612789. in indian context. Proceedings of the 7th International Conference on Theory and

Practice of Electronic Governance (pp. 338–341). New York, NY, USA: ICEGOV '13, Applied Electronic Commerce Research, 9(3), i–ix http://dl.acm.org/citation.cfm?id=
ACM. http://dx.doi.org/10.1145/2591888.2591949. 2661036.2661037.
van der Waal, S., Wecel, K., Ermilov, I., Janev, V., Milošević, U., & Wainwright, M. (2014). Zuiderwijk, A., Helbig, N., Gil-García, J.R.A., & Janssen, M. (05 2014). Special issue on
Lifting open data portals to the data web. In S. Auer, V. Bryl, & S. Tramp (Eds.), Linked innovation through open data: Guest editors' introduction. Journal of Theoretical
Open Data – Creating Knowledge Out of Interlinked Data. Lecture Notes in Computer and Applied Electronic Commerce Research, 9, i–xiii [http://www.scielo.cl/scielo.php?
Science. (pp. 175–195). Springer International Publishing. http://dx.doi.org/10.1007/ script=sci_arttext pid=S0718-18762014000200001 nrm=iso].
978-3-319-09846-3_9. Zuiderwijk, A., & Janssen, M. (2013). A coordination theory perspective to improve the
Yang, T.M., Lo, J., Wang, H.J., & Shiang, J. (2013). Open data development and value-added use of open data in policy-making. In M. Wimmer, M. Janssen, & H.J. Scholl (Eds.),
government information: Case studies of taiwan e-government. Proceedings of the 7th EGOV. Lecture Notes in Computer Science. 8074. (pp. 38–49). Springer [http://dblp.
International Conference on Theory and Practice of Electronic Governance uni-trier.de/db/conf/egov/egov2013.html#ZuiderwijkJ13].
(pp. 238–241). New York, NY, USA: ICEGOV '13, ACM. http://dx.doi.org/10.1145/ Zuiderwijk, A., & Janssen, M. (2014a). Barriers and development directions for the publi-
2591888.2591932. cation and usage of open data: A socio-technical view. In M. Gascó-Hernández (Ed.),
Yang, Z., & Kankanhalli, A. (2013). Innovation in government services: The case of open Open Government, Public Administration and Information Technology, vol. 4. (pp.
data. Grand Successes and Failures in IT. Public and Private Sectors – IFIP WG 8.6 115–135). New York: Springer. http://dx.doi.org/10.1007/978-1-4614-9563-5_8.
International Working Conference on Transfer and Diffusion of IT, TDIT 2013, Bangalore, Zuiderwijk, A., & Janssen, M. (2014b). The negative effects of open government data — In-
India, June 27–29, 2013. Proceedings. (pp. 644–651). http://dx.doi.org/10.1007/978-3- vestigating the dark side of open data. Proceedings of the 15th Annual International
642-38862-0_47. Conference on Digital Government Research (pp. 147–152). New York, NY, USA: dg.o
Zuiderwijk, A., Gascó, M., Parycek, P., & Janssen, M. (Sep 2014). Special issue on transpar- '14, ACM. http://dx.doi.org/10.1145/2612733.2612761.
ency and open data policies: Guest editors' introduction. Journal Theoretical and
