
Applications of Linked Data Technologies in Libraries: Technical and Ethical Considerations

Qurat Ul Ain Saleem
University of Central Punjab
Lahore, Pakistan
Quratulainsaleem266@gmail.com

Nosheen Fatima Warriach
Department of Information Management
University of the Punjab, Pakistan
Lahore, Pakistan
Nosheen.im@pu.edu.pk

Nadia Butt
Riphah International University
Lahore, Pakistan
Nadyabutt@gmail.com

Abstract: This study aims to identify and report the challenges, issues and barriers that libraries face in implementing Linked Data (LD) technologies. It also highlights some misconceptions about linked open data within the library community. It is a qualitative study based on a literature review. Literature was searched in scholarly databases and Google Scholar using keyword searches, and articles were selected on the basis of their relevance to the topic. The findings identify technical issues, such as ontology development and the use of multiple standards and languages, along with legal issues including data protection and privacy of data. Some conceptual issues, such as lack of awareness, scarce resources for training practitioners, and issues of publishing data on the web, are also highlighted. The study is limited to the challenges, issues and misconceptions about linked data reported in the literature; it did not gather primary data for empirical evidence. The study will help library practitioners and policy makers forecast potential issues when adopting linked data technologies. Identifying issues before starting a project helps stakeholders work on solutions that lessen the problems encountered later.

Keywords- Issues - Linked Data; Challenges - Linked Data; Linked Data Adoption - Libraries; Misconception - Linked Data.

I. INTRODUCTION
Linked Data (LD) is a set of best practices for publishing and connecting structured data on the web so that it is usable by both machines and human beings. LD is an expression used for exposing, sharing and connecting data on the web with the help of URIs (Uniform Resource Identifiers) [8]. Several related terms are used alongside LD, including linked open data and open data. Open data is defined as “non-privacy-restricted and non-confidential data which is produced with public money and is made available without any restrictions on its usage or distribution” [13]. For data to be linked, it must be open and structured so that it can link to other datasets and give consumers access to a wealth of data.
Linked data is part of the semantic web and is built upon semantic web technologies. For publishing data on the web in a way that it becomes part of a single data space, Reference [3] outlined a set of rules. The rules are:
1. “Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
4. Include links to other URIs, so that they can discover more things”.
In 2007, the W3C Linking Open Data (LOD) project began publishing existing datasets under open licenses based on linked data principles. Berners-Lee, the initiator of linked data, later proposed a five-star deployment scheme for publishing data. According to Berners-Lee, the five-star deployment scheme includes:
1. “Data is available on the Web,
2. Available as machine-readable structured data,
3. Available in a non-proprietary format,
4. Published using open standards from the W3C,
5. All of the above and links to other Linked Open Data”
Linking data on the web has its own benefits and challenges, and it is not as straightforward as it seems. The purpose of this study is to review the published literature to identify the challenges, issues and barriers of linked data. In Pakistan, linked data is an emerging concept, and this study will help those librarians and information professionals who are planning to link their datasets on the web and make them publicly accessible.
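The five-star deployment scheme quoted above is cumulative: a dataset earns a star only if every earlier criterion is also met. A minimal Python sketch of that reading follows. The `Dataset` class, its field names, and the `star_rating` function are hypothetical names introduced here for illustration; they are not part of any W3C tool or of the reviewed literature.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical description of a published dataset (illustrative only)."""
    on_web: bool                   # 1. data is available on the Web
    machine_readable: bool         # 2. machine-readable structured data
    non_proprietary_format: bool   # 3. non-proprietary format
    w3c_open_standards: bool       # 4. published using open W3C standards
    links_to_other_lod: bool       # 5. links to other Linked Open Data


def star_rating(ds: Dataset) -> int:
    """Count stars cumulatively, stopping at the first unmet criterion."""
    criteria = [
        ds.on_web,
        ds.machine_readable,
        ds.non_proprietary_format,
        ds.w3c_open_standards,
        ds.links_to_other_lod,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# Assumed example: a CSV file on a library website, no RDF, no outbound links.
csv_catalogue = Dataset(True, True, True, False, False)
print(star_rating(csv_catalogue))  # prints 3
```

Under this reading, a library that already publishes open, structured files would reach the fourth and fifth stars only by adopting W3C standards such as RDF and by linking to other datasets.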
II. OBJECTIVES OF THE STUDY
The major objective of this paper is to highlight the issues related to the adoption of linked data technologies. The following objectives guided the study:
• To identify the ethical and technical challenges for the application of LD technology in libraries
• To throw light on the misconceptions about LD technologies among the library community

III. RESEARCH QUESTIONS
The following research questions guided the study in the context of libraries:
A. RQ1: What are the misconceptions related to LD technologies?
B. RQ2: What are the technical issues related to LD technologies?
C. RQ3: What are the ethical and legal issues involved in adopting linked data technologies?
D. RQ4: What other issues, beyond the technical and legal ones, are highlighted in the literature?

IV. METHODOLOGY
This paper is based on published literature about linked data. Different databases and scholarly journals were consulted to find articles and studies for this purpose. Available databases such as LISTA, JSTOR, and ScienceDirect, along with Google and Google Scholar, were searched. The search terms were: linked data, issues to linked data, ethical considerations for linked data, linked data: associated issues, challenges of linked data, and barriers to linked data. Articles from 2009 to 2017 were selected after a careful review of their abstracts. Twenty-five articles were selected and discussed among the three researchers to derive the major themes.

V. ANALYSIS OF LITERATURE
Linked data has potential benefits for libraries. Reference [18] advocates LD adoption in libraries and emphasizes that libraries should discuss its potential with different stakeholders. The authors encourage librarians to take a leadership role in accelerating linked data initiatives. The barriers that libraries could face while adopting LD technologies are highlighted in this study, along with certain misconceptions about them.

A. MISCONCEPTION OF LINKED DATA
There are certain misconceptions about LD technologies in the library community. One perception is that LD has no value in itself and only becomes valuable when it serves the public. This study highlights a few of the misconceptions regarding linked open data.
Reference [13] identified five myths of linked data: “(a) the publicizing of data will automatically yield benefits; (b) all information should be unrestrictedly publicized; (c) it is a matter of simply publishing public data; (d) every constituent can make use of open data; (e) open data will result in open government”.
Another misconception about semantic web technologies is that Semantic Web applications (should) always reuse existing ontologies. If existing ontologies serve the purpose exhaustively, there is no need to develop a new one; if they do not, developing a new ontology is necessary.
A further misconception is that ontology design requires agreement from all stakeholders. There are two basic approaches to designing an ontology: bottom-up and top-down. The top-down approach is appealing but usually not practical. Its appeal is that if everybody agrees from the outset, everybody can reuse the same concepts, and the resulting data and software will work together. Semantic Web ontology standards (for example, RDFS and OWL) are, however, intended for use in a bottom-up approach.
It is also commonly assumed that data access is provided through federated search. This is an understandable assumption, since the semantic web integrates heterogeneous data from across the web to provide access. It does not hold, however, because the semantic web leaves open the choice of data technology and integration paradigm.

B. ISSUES TO LINKED DATA
There are many problems related to the adoption of LD technologies, including economic, political, ethical, legal, technological and institutional ones. The challenges do not stop there: there are also barriers relating to the quality of the information being linked on the web and to the use of linked data from the user's point of view. The literature also clusters the benefits of linked data into three categories: 1) political and social; 2) economic; and 3) operational and technical benefits [13].
This study highlights legal, technical, conceptual and quality issues, along with some other issues related to linking data on the web, as reported in the published literature. The barriers to LD adoption are arranged according to the research questions.
Technical issues:
Linking data on the web is not an easy task. Making data machine-readable and interpretable involves a great deal of technology, experience, standards, vocabularies, and languages. Reference [1] identified the issues and challenges of LD technologies with respect to libraries.
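To give a sense of what making catalogue data machine-readable involves, the following Python sketch turns one flat bibliographic record into subject-predicate-object statements and prints them in N-Triples syntax. It is only an illustration under stated assumptions: the record, the `http://example.org/` URIs and the helper names are invented here, and Dublin Core Terms is used merely as one familiar vocabulary, not as a vocabulary recommended by the reviewed studies.

```python
# Dublin Core Terms namespace (a real, widely used vocabulary).
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical base URI and catalogue record; both are assumptions.
BASE = "http://example.org/"
record = {
    "id": "book/42",
    "title": "Linked Data for Libraries",
    "creator": "http://example.org/person/7",  # a link to another resource
    "issued": "2012",
}


def escape_literal(text: str) -> str:
    """Escape backslashes and double quotes for an N-Triples string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_ntriples(rec: dict) -> list[str]:
    """Map each record field to one triple about the record's HTTP URI."""
    subject = f"<{BASE}{rec['id']}>"
    lines = []
    for field, value in rec.items():
        if field == "id":
            continue
        predicate = f"<{DCTERMS}{field}>"
        # Values that look like HTTP URIs become links; the rest are literals.
        if value.startswith(("http://", "https://")):
            obj = f"<{value}>"
        else:
            obj = f'"{escape_literal(value)}"'
        lines.append(f"{subject} {predicate} {obj} .")
    return lines


for line in to_ntriples(record):
    print(line)
```

The creator value is emitted as a URI rather than as a string, which is the kind of outbound link that the fourth publishing rule in the introduction asks for. A real conversion would also need to decide which vocabulary term fits each source field, which is where much of the effort described in the literature lies.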
Most libraries in the world use the MARC standard to assign metadata to information items. MARC has limitations in this context, as LD does not support it. Replacing MARC in libraries with linked data on the web also raises issues: converting data from MARC to another standard metadata format is a key concern, especially for large datasets [7].
Data Modelling and Linking:
Reference [25] identified two issues: data modelling and data linking. Data modelling is a form of entity-relationship modelling that describes how terms relate to each other. The authors studied the application of linked data technologies to the social sciences and found that the existing terminologies and vocabularies are not sufficient, because they do not describe and express the terms or their relationships fully. A separate vocabulary was therefore developed as an extension to obtain a complete vocabulary for the social sciences. This suggests that a separate vocabulary may need to be developed for every subject field in order to describe its terms. The second issue, data linking, was further divided into two problems: entity disambiguation, and the specification of links and their semantics.
Integration of sensor data is an interesting challenge, as sensor data is highly dynamic and temporal most of the time [14]. LD also faces challenges in the publishing and consumption of data. The literature establishes that there is still much data in the right format that is not linked on the web, probably because the data is held in hard-to-reach places [20]; this is especially true of libraries and cultural heritage institutions. Reference [11] discussed the infrastructure required for setting up LD services in libraries and archives and emphasized the importance of stable URIs.
Semantic Inconsistency:
Reference [10] observed, “Library community and Semantic Web community have different vocabularies for the same metadata causing a complication in the mapping process”. Even different libraries and scientific databases use different vocabularies. Reference [23] pointed out the issue of semantic inconsistency. While applying LD to scientific databases, the authors found differences in vocabularies caused by the lack of standards. Every institution works with its own standards, and when their data is linked together, problems arise such as homonyms, synonyms, and heterogeneity of data. Reference [22] described the issue of ontology matching in LD. Reference [20] also identified the issue of ontology alignment: when there are multiple ways to structure data, the datasets will be heterogeneous. In addition, LD has no recommended standard for controlling data access and authenticating users. The authors further recommended, “The application server must implement unification authentication and access control in the process of data access, which limits the interoperability between different systems”. Data selection and ontology selection are also two important considerations [11]. The development of ontologies, entities (classes), elements (properties) and values (instances) is significant for making a conceptual shift [1]. Reference [6] brought out an important issue of semantic publishing in their study. Reference [15] also highlighted the issue of relationship semantics in the web of linked data. Too much specificity in any particular schema will not be useful for others, especially outside the scholarly community, and it risks siloing scholarly resources away from more general applications.
Interoperability:
Reference [20] experienced the issue of low interoperability. The application of a standard mechanism for accessing the databases, classes, and attributes should be considered. Publishing scientific data with LD technologies is challenging because its attributes are dynamic and ever increasing, whereas RDF descriptions are comparatively static. This matters because a change to the RDF may crash an application that depends on it.
In an LD environment, accurate and meaningful representation of any work plays a vital role in implementing LD successfully. Reference [16] reported that the majority of searches completed online relate to aboutness. It is also reported in the literature that current vocabularies are not exhaustive and cannot benefit the library community at a wider scale. Terminological differences between libraries and web-based standards are not easy for the library community to understand. Moreover, there is a lack of tools and applications in libraries for adopting LD technologies.
Legal issues:
Legal issues include the debates over data protection, copyright, and privacy of data. Reference [13] identified legal issues related to linked data on the web. These issues are elaborated as:
• “Privacy violation and Security
• No license for using data
• Limited conditions for using data
• Dispute and litigations
• Prior written permission required to gain access to and reproduce data
• Reuse of contracts/agreements”
Licensing and Copyright Issues:
In terms of legal matters, two important issues highlighted in [11] are publication rights and the licensing of linked data. Gonzales [7] pointed out the legality and copyright of data to be shared on the web. The authors of [12] discussed the issue of licensing and waiving the rights to publicly available data. The absence of a copyright statement does not mean that data can be reused. They recommend that Linked Data published on the web should include explicit license or waiver statements. Licenses and waivers are two interrelated concepts. Licenses grant others rights to reuse something and generally attach conditions to this reuse, while waivers enable the owner to explicitly waive their rights to something. Reference [24] highlighted the complex factor of copyright in an article based on presentations given at the Talis Open Day at the British Library.
Data protection:
Reference [19] described personal information protection as an important concern of linked data technologies. Before adopting linked data technologies, a global conversation on
privacy and data reuse should take place so that the issue of copyright can be dealt with as well.

Ethical Issues:
Reference [14] explored the issue of personal privacy and discussed solutions for protecting it. Making sensitive data about people available on the internet through linked data technologies is very dangerous. Reference [5] studied personal privacy in the web of linked data and identified the related issues. The case study investigated the risks and uncertainties associated with user privacy in linked data environments. The authors further advised educating relevant stakeholders, including software developers, users, and ethics committees, about the potential risks to personal privacy.
Reference [10] described the variation of ownership rights from country to country as a complex problem.
Other issues:
Reference [14] described the issue of data quality: with so much data available on the internet, how to assure its truthfulness is an open question. Reference [2] identified quality issues related to data. Quality assessment allows users or applications to understand whether data is appropriate for a task. Linked data mainly suffers from quality problems such as inaccuracy, outdatedness, incompleteness, and inconsistency, which prevent the data from being fully exploited. Reference [4] noted that anyone can publish data to the web of linked data, so data quality issues are to be expected. To maintain quality, it is important to ensure the accuracy, completeness, consistency, readability and accessibility of data published on the web.
Another issue highlighted in Reference [11] is the general lack of experience reports on establishing linked data services. Best practices are available that can be used as a rule. Reference [7] also asserts that teaching a new bibliographic system to practitioners is not easy; it requires a great deal of time, effort and resources. Lack of knowledge, experience and success stories was highlighted as a challenge of linked data for libraries [9]. Reference [7] observed that knowledge and awareness are an issue and that institutional willingness to share data is very important.
Getting feedback from users, which can be used for multiple purposes, is also a challenge, as pointed out by Reference [21]. Reference [15] demonstrated the role of stakeholders in linking data on the web. The roles of the resource creator, the curator and other stakeholders need to be clearly defined. The authors also explained the role of the creator and the curator in linking data. In [17], problems of LD were identified, such as:
• Identity
• Concept
• Publishing data
• Consuming data
LD technologies also raise conceptual issues. In [11] the authors identified conceptual issues, along with technical and legal issues, in relation to cultural heritage. Some issues identified in [10] are:
• Lack of agreements to provide data, difficulty of migrating data to new models,
• Need to develop tools for Linked Data transformation,
• Lack of experts in different areas for the transformations,
• Lack of applications consuming Linked Data.

VI. CONCLUSION
Addressing the technical, social, ethical, legal and conceptual issues will not only help in the consumption of LD technologies but will also support their adoption and sustainability. The major barriers can be summarized as the need for infrastructure, terminological differences between the web and libraries, issues related to data linking and modelling, licensing, copyright and data protection. Lack of success stories, knowledge and experience is another hindrance. Despite all these challenges, LD is considered a standardized, practical, open access mechanism. The literature establishes that LD has affordances in the information environment that can turn these challenges into opportunities. The overall purpose of Linked Data is to facilitate the re-usability, cross-linking, integration and sharing of data. The adoption of Linked Data will not only provide an interactive system but also ensure that information is available, accessible and re-usable, with the possibility of serendipitous discovery of other resources.

REFERENCES
[1] Alemu, Getaneh, Brett Stevens, Penny Ross, and Jane Chandler. "Linked Data for libraries: Benefits of a conceptual shift from library-specific record structures to RDF-based data models." New Library World 113, no. 11/12 (2012): 549-570.
[2] Batini, Carlo, Cinzia Cappiello, Chiara Francalanci, and Andrea Maurino. "Methodologies for data quality assessment and improvement." ACM Computing Surveys (CSUR) 41, no. 3 (2009): 16.
[3] Berners-Lee, Tim, and Ralph Swick. Semantic Web Development. Massachusetts Inst of Tech Cambridge, 2006.
[4] Bizer, Christian. "The emerging web of linked data." IEEE Intelligent Systems 24, no. 5 (2009).
[5] Corsar, David, Peter Edwards, Chris Baillie, Milan Markovic, Konstantinos Papangelis, and John Nelson. "GetThere: a rural passenger information system utilising linked data & citizen sensing." In Proceedings of the 2013th International Conference on Posters & Demonstrations Track-Volume 1035, pp. 85-88. CEUR-WS.org, 2013.
[6] Dimou, Anastasia, Sahar Vahdati, Angelo Di Iorio, Christoph Lange, Ruben Verborgh, and Erik Mannens. "Challenges as enablers for high quality linked data: Insights from the semantic publishing challenge." PeerJ Computer Science 3 (2017): e105.
[7] Gonzales, Brighid M. "Linking libraries to the web: linked data and the future of the bibliographic record." Information Technology and Libraries (Online) 33, no. 4 (2014): 10.
[8] Guerrini, Mauro, and Tiziana Possemato. "Linked data: a new alphabet for the semantic web." JLIS.it 4, no. 1 (2013): 67.
[9] Halla, Michelle L. "Linked Data in Libraries: Library of Congress’ Bibliographic Framework Transition Initiative." (2013).
[10] Hallo, María, Sergio Luján-Mora, Alejandro Maté, and Juan Trujillo. "Current state of Linked Data in digital libraries." Journal of Information Science 42, no. 2 (2016): 117-127.
[11] Hannemann, Jan, and Jürgen Kett. "Linked data for libraries." In Proc of the world library and information congress of the Int’l Federation of Library Associations and Institutions (IFLA). 2010.
[12] Heath, Tom, and Christian Bizer. "Linked data: Evolving the web into a global data space." Synthesis lectures on the semantic web: theory and technology 1, no. 1 (2011): 1-136.
[13] Janssen, Marijn, Yannis Charalabidis, and Anneke Zuiderwijk. "Benefits, adoption barriers and myths of open data and open government." Information Systems Management 29, no. 4 (2012): 258-268.
[14] Li, B. "Linked data: overview, usage and application." n.d. Retrieved May 22, 2017 from: https://www.snet.tu-berlin.de/fileadmin/fg220/courses/WS1112/snet-project/linked-open-data_li.pdf
[15] Mayernik, M. S., J. Phillips, and E. Nienhouse. "Linking Publications and Data: Challenges, Trends, and Opportunities." D-Lib Magazine 22, no. 5/6 (2016).
[16] Méndez, Eva, and Jane Greenberg. "Linked data for open vocabularies and HIVE’s global framework." El profesional de la información 21, no. 3 (2012): 236-244.
[17] Milicic, V. "Problems of link data (1/4): identity." July 2011. Retrieved May 22, 2017 from: http://milicicvuk.com/blog/2011/07/26/problems-of-linked-data-14-identity/
[18] Miller, Eric, and Micheline Westfall. "Linked data and libraries." The Serials Librarian 60, no. 1-4 (2011): 17-22.
[19] Nichols, B. Nolan, Satrajit S. Ghosh, Tibor Auer, Thomas J. Grabowskith, Camille Maumet, David Keator, Kilian Pohl, and Jean-Baptiste Poline. "Building a Web of Linked Data Resources to Advance Neuroscience Research." bioRxiv (2016): 053934.
[20] Omitola, Temitope, Christos L. Koumenides, Igor O. Popov, Yang Yang, Manuel Salvadores, Gianluca Correndo, Wendy Hall, and Nigel Shadbolt. "Integrating public datasets using linked data: challenges and design principles." (2010).
[21] Paton, Norman W., Klitos Christodoulou, Alvaro AA Fernandes, Bijan Parsia, and Cornelia Hedeler. "Pay-as-you-go data integration for linked data: opportunities, challenges and architectures." In Proceedings of the 4th International Workshop on Semantic Web Information Management, p. 3. ACM, 2012.
[22] "Semantic web misconception." n.d. Retrieved May 22, 2017 from: https://www.cambridgesemantics.com/semantic-university/semantic-web-misconceptions#
[23] Shen, Zhihong, Jianhui Li, and Fang Han. "Opencsdb: Research on The Application of Linked Data in Scientific Databases." Data Science Journal 14 (2015).
[24] Williams, Helen KR. "Linked data and libraries." Catalogue and Index 160 (2010): 2-5.
[25] Zapilko, Benjamin, Johann Schaible, Timo Wandhöfer, and Peter Mutschke. "Applying Linked Data Technologies in the Social Sciences." KI-Künstliche Intelligenz 30, no. 2 (2016): 159-162.

AUTHORS’ BACKGROUND

Name                          Title                 Research Field                     Personal website
Qurat Ul Ain Saleem           PhD candidate         Library and Information Science
Dr Nosheen Fatima Warriach    Assistant Professor   Library and Information Science
Nadia Butt                    PhD candidate         Library and Information Science
