Journal of Information Warfare
Volume 15, Issue 4
Fall 2016

Contents

From the Editor
L Armistead

Authors

Rhizomatic Target Audiences of the Cyber Domain
M Sartonen, A-M Huhtinen and M Lehto

Exploring the Complexity of Cyberspace Governance: State Sovereignty, Multi-stakeholderism, and Power Politics
A Liaropoulos

Applying Principles of Reflexive Control in Information and Cyber Operations
ML Jaitner and MAJ H Kantola

Utilising Journey Mapping and Crime Scripting to Combat Cybercrime and Cyber Warfare Attacks
T Sõmer, B Hallaq and T Watson

Disinformation in Hybrid Warfare: The Rhizomatic Speed of Social Media in the Spamosphere
A-M Huhtinen and J Rantapelkonen

Security-Information Flow in the South African Public Sector
H Patrick, B van Niekerk and Z Fields

South Korea's Options in Responding to North Korean Cyberattacks
J Park, N Rowe and M Cisneros

Understanding the Trolling Phenomenon: The Automated Detection of Bots and Cyborgs in the Social Media
J Paavola, T Helo, H Jalonen, M Sartonen and A-M Huhtinen
Authors

Captain Maribel Cisneros, United States Army, is a Military Intelligence Officer assigned to the U.S. Army Cyber Command. She was commissioned as a Second Lieutenant in the Military Intelligence Branch in 2007, and has served as Assistant S1, Platoon Leader, Company Executive Officer, Battalion S2, and Company Commander. She deployed multiple times to USSOUTHCOM as a Mission Manager and Battle Captain, and to USCENTCOM as a Task Force OIC. She earned a master's degree in computer systems and operations from the Naval Postgraduate School and a master's degree in management and leadership from Webster University.

Dr. Ziska Fields is an associate professor and academic leader at the University of KwaZulu-Natal, South Africa. Her research interests focus on creativity, entrepreneurship, human resources, and higher education. She developed two theoretical models to measure creativity in South Africa, focusing on youth and tertiary education. She has published in internationally recognized journals and edited books. She is the editor of the book Incorporating business models and strategies into social entrepreneurship and has completed another book titled Collective creativity for responsible and sustainable business practice. She is a member of the South African Institute of Management, the Ethics Institute of South Africa, and the Institute of People Management.

Bil Hallaq is a cyber security researcher with more than 15 years of academic, commercial, and industrial experience. He previously spent several years handling and mitigating against various security threats and vulnerabilities within commercial environments. He is delivering on various projects, including the identification and application of novel techniques for OSINT and the EU E-CRIME Project, which comprises several European partners including Interpol, where he is working with partners on understanding criminal structures and mapping cybercriminal activities to produce and recommend effective countermeasures. His other applied research areas include identifying methods and techniques for cross-border cyber attack attribution, mitigation at scale of complex multi-jurisdictional cyber events, and maritime and rail cyber security. He holds several professional qualifications, including penetration testing, incident response, malware investigation, and digital forensics investigation, amongst others.

Tuomo Helo is a senior lecturer with Turku University of Applied Sciences, Turku, Finland. He earned a master's degree in information systems and a master's degree in economics from the University of Turku, Finland. His current research interests include text analytics and data mining in general. He has also completed research in the fields of health economics and the economics of education.

Dr. Aki-Mauri Huhtinen (LTC [GS]) is a military professor in the Department of Leadership and Military Pedagogy at the Finnish National Defence University, Helsinki, Finland. His areas of expertise are military leadership, command and control, the philosophy of science in military organisational research, and the philosophy of war. He has published peer-reviewed journal articles, a book chapter, and books on information warfare and non-kinetic influence in the battle space. He has also organised and led several research and development projects in the Finnish Defence Forces from 2005 to 2015.

Margarita Levin Jaitner is a researcher in the area of information warfare and cyberspace, with a particular focus on Russian operations, at the Swedish Defence University, Stockholm, Sweden. She is also a Fellow at the Blavatnik Interdisciplinary Cyber Research Center. She has previously conducted research at the Finnish National Defence University as well as at the Yuval Neeman Workshop for Security, Science and Technology in Tel Aviv. She earned a master's degree in Societal Risk Management and a bachelor's degree in political science.


Dr. Harri Jalonen is a principal lecturer and research group leader (AADI) at the Turku University of Applied Sciences, Turku, Finland. He also holds a position as an adjunct professor at the University of Vaasa. He has research experience dealing with knowledge and innovation management and digitalisation issues in different organisational contexts. He has published more than 100 articles in these fields. He is one of the most referenced researchers in the field of complexity thinking in Finland. He has managed or been involved in many international and national research projects. In addition, he has guided several thesis projects, including doctoral theses. He is a reviewer for many academic journals and a committee member on international conferences.

Major Harry Kantola teaches and conducts research at the Finnish National Defence University, Helsinki, Finland. He is also currently appointed to the Finnish Defence Command as a Cyber Defence planner in the C5 (J6) branch. He joined the Finnish Defence Forces in 1991 and served in various capacities (CSO, CIO) in the Finnish Navy, Armoured Signal Coy, and Armoured Brigade. From 2014 to 2016, he served an appointment as a researcher at the NATO Cooperative Cyber Defence Centre of Excellence (NATO CCD COE), Tallinn, Estonia.

Dr. Martti Lehto (Col., retired) works as a cyber security and cyber defence professor of practice in the Department of Mathematical Information Technology at the University of Jyväskylä, Jyväskylä, Finland. He has more than 30 years of experience as a developer and leader of C4ISR systems in the Finnish Defence Forces. He has more than 75 publications, research reports, and articles on the areas of C4ISR systems, cyber security and defence, information warfare, air power, and defence policy.

Dr. Andrew Liaropoulos is an assistant professor in the Department of International and European Studies at the University of Piraeus, Greece. He also teaches in the Joint Staff War College, the Joint Military Intelligence College, the National Security College, the Air War College, and the Naval Staff Command College. His research interests include international security, intelligence reform, strategy, military transformation, foreign policy analysis, cyber security, and Greek security policy. He also serves as a senior analyst in the Research Institute for European and American Studies (RIEAS) and as the assistant editor of the Journal of Mediterranean and Balkan Intelligence.

Dr. Jarkko Paavola is a research team leader and a principal lecturer with Turku University of Applied Sciences, Turku, Finland. He earned his doctoral degree in technology in the field of wireless communications from the University of Turku, Finland. His current research interests include information security and privacy, dynamic spectrum sharing, and information security architectures for systems utilising spectrum sharing.

Major Jimin Park, Republic of Korea Air Force, is a Cyber Intel-Ops Officer assigned to ROK Cyber Command. His previous assignments have included the 37th Air Intelligence Group, and serving as an Intel-Ops Officer and an Intel-Watch Officer at Osan AFB with the U.S. 7th Air Force. In 2007, he went to Ali Al Salem AFB in Kuwait with the U.S. Central Command and the 386th Expeditionary Wing as part of Operation Iraqi Freedom. He earned a master's degree in computer science from the U.S. Naval Postgraduate School.


Dr. Harold Patrick is a forensic investigation specialist at the University of KwaZulu-Natal, South Africa. He completed his doctorate at the University of KwaZulu-Natal in 2016. His dissertation focused on information security, collaboration, and the flow of security information. He earned a master's degree in information systems and technology and is a Certified Fraud Examiner.

Dr. Jari Rantapelkonen (LTC, retired) is a professor emeritus at the Finnish National Defence University, Helsinki, Finland. His areas of expertise include operational art and tactics, military leadership, information warfare, and the philosophy of war. He has served in Afghanistan, the Balkans, and the Middle East. He is the mayor of the Enontekiö municipality in Arctic Finland.

Dr. Neil C. Rowe is a professor of computer science at the U.S. Naval Postgraduate School (Monterey, CA, USA), where he has been since 1983. He earned a doctorate in computer science from Stanford University (1983). His main research interests are data mining, digital forensics, modelling of deception, and cyber warfare.

Miika Sartonen is a researcher at the Finnish Defence Research Agency and a doctoral student at the National Defence University.

Tiia Sõmer is an early-stage researcher at Tallinn University of Technology (TUT), Tallinn, Estonia. Her research focuses on cyber crime and cyber forensics, and she leads TUT's work on the EU E-CRIME project, a three-year European Union project researching the economic aspects of cyber crime. In addition, she has taught cyber security at the strategic level and prepared students for international policy-level cyber-defence competitions at TUT. Before starting an academic career, she served for more than 20 years in the Estonian defence forces, including teaching at the staff college; working in diplomatic positions at national, NATO, and EU levels; and, most recently, working at the EDF HQ cyber security branch. Her master's thesis, titled Educational Computer Game for Cyber Security: A Game Concept, focused on using games in the teaching of cyber security. She is currently completing Ph.D.-level studies, focusing on journey mapping and its application in understanding and solving cyber incidents.

Dr. Brett van Niekerk is a senior security analyst at Transnet and an Honorary Research Fellow at the University of KwaZulu-Natal, South Africa. He graduated from the University of KwaZulu-Natal with his doctorate in 2012 and has completed two years of postdoctoral research into information operations, information warfare, and critical infrastructure protection. He serves on the board of ISACA South Africa and as secretary for the International Federation of Information Processing's Working Group 9.10 on ICT in Peace and War. He has contributed to the ISO/IEC information security standards and has multiple presentations, papers, and book chapters in information security and information warfare to his name. He earned bachelor's and master's degrees in electronic engineering.


Professor Tim Watson is the Director of the Cyber Security Centre at WMG within the University of Warwick, Coventry, UK. He has more than 25 years' experience in the computing industry and in academia and has been involved with a wide range of computer systems on several high-profile projects. In addition, he has served as a consultant for some of the largest telecoms, power, and oil companies. He is an adviser to various parts of the UK government and to several professional and standards bodies. His current research includes EU-funded projects on combating cyber-crime; UK MoD research into automated defence, insider threat, and secure remote working; and EPSRC-funded research focusing on the protection of critical national infrastructure against cyber-attack. He is a regular media commentator on digital forensics and cyber security.


Understanding the Trolling Phenomenon: The Automated Detection of Bots
and Cyborgs in the Social Media

J Paavola¹, T Helo¹, H Jalonen¹, M Sartonen², A-M Huhtinen²

¹Turku University of Applied Sciences, Turku, Finland
E-mail: jarkko.paavola@turkuamk.fi; tuomo.helo@turkuamk.fi; harri.jalonen@turkuamk.fi

²Finnish National Defence University, Helsinki, Finland
E-mail: miika.sartonen@mil.fi; aki.huhtinen@mil.fi

Abstract: Social media has become a place for discussion and debate on controversial topics and, thus, provides an opportunity to influence public opinion. This possibility has given rise to a specific behaviour known as trolling, which can be found in almost every discussion that includes emotionally appealing topics. Trolling is a useful tool for any organisation willing to force a discussion off-track when it has no proper facts to back its arguments. Previous research has indicated that social media analytics tools can be utilised for the automated detection of trolling. This paper provides tools for detecting the message automation utilised in trolling.

Keywords: Social Media, Stakeholder, Trolling, Sentiment Analysis, Bot, Cyborg

Introduction
The current stage in the evolution of information is one in which the unpredictability of its effects is accelerating. The volume of information is growing, and its structure is becoming increasingly opaque. Information can no longer be seen as a system or as the extent of one's knowledge, but must rather be seen as an entity that has started to live a life of its own. Thus, information provides its own energy and is its own enemy. In most cases, information is also a source of beneficial development and can improve people's quality of life. It is essential, however, to understand that it can also unleash danger and adversity.

Due to the plethora of information available, people are not always able to determine whether information is valid and, consequently, tend to make hasty presumptions with the data they have. This tendency is exploited by trolling, which has come to be equated by the media in recent years with online harassment. Because of trolling, it is becoming increasingly difficult to pinpoint where information originates and where it leads (Malgin 2015).

In today's era of information overload, individuals and groups try to get their messages across by using forceful language, by engaging in dramatic (even violent) actions, or by posting video clips or pictures on social media (Nacos, Bloch-Elkon & Shapiro 2011, p. 48). The politically driven mass media is most probably behind this information overload on individuals. Aggressive behaviour is increasing in social media because of the technical ease with which trolling can be carried out. In social media, all kinds of values become interwoven with each other.

Information is essentially a product of engineering science. In order to expand the sphere of understanding to information as a part of human social life, one has to step outside the realm of the hard sciences. A social sciences viewpoint is especially called for when discussing possible threats and human fears connected with information. The rise in diverse Internet threats has opened up the discussion of the possibility of nation states extending their capacity to control information networks, including citizens' private communications.

Computer culture theorists have identified the richly interconnected, heterogeneous, and somewhat anarchic aspect of the Internet as a rhizomic social condition (Coyne 2014). During the past quarter of a century, the usefulness of the Internet has permeated all domains (individual, social, political, military, and business). Everyone worldwide can use the Internet without any specific education, as the skills needed for communicating in social media are easy to acquire. At the same time, work-related and official messages run parallel with private communications. Similarly, emotions and rational thinking may easily become intertwined due to the ease and immediacy of our communications. Deleuze and Guattari (1983) use the terms 'rhizome' and 'rhizomatic' to describe this non-hierarchical, nomadic, and easy environment, particularly in relation to how individuals behave in this kind of environment. The current status of the technological evolution of the Internet can also be said to be based on the rhizome concept. The rhizome resists the organizational structure of the root-tree system, which charts causality along chronological lines, looks for the original source of things, and looks toward the pinnacle or conclusion of those things. Any point on a rhizome can be connected with any other. A rhizome 'can be cracked and broken at any point; it starts off again following one or another of its lines, or even other lines' (Deleuze & Guattari 1983, p. 15).

Why is trolling so easy to implement in rhizomatic information networks? The intercontinental network of communication is not an organized structure: it has no central head or decision-maker; it has no central command or hierarchies to quell undesired behaviour. The rhizomatic network is simply too big and diffuse to be managed by a central command. By the same token, rhizomatic organizations are often highly creative and innovative. The rhizome presents history and culture as a map, or a wide array of attractions and influences with no specific origin or genesis, for a rhizome has no beginning or end; it is always becoming in the middle and between things (Deleuze & Guattari 1983). One example of the diversity of rhizome networks is fakeholder behaviour.

This paper continues the work done by Paavola and Jalonen (2015), who examined whether sentiment analysis could be utilised in detecting trolling behaviour. Sentiment analysis refers to the automatic classification of messages as positive, negative, or neutral within a discussion topic. In that work, Paavola and Jalonen concluded that sentiment analysis as such cannot detect trolls, but the results indicated that social media analytics tools can generally be utilised for this task. Here, the authors' goal is to investigate the trolling phenomenon further.

To facilitate analysis, a sentiment analysis tool (Paavola & Jalonen 2015) was further developed to detect message automation, which creates noise in social media and makes it difficult to observe behavioural changes among the human users of social media. Paavola and Jalonen's work followed studies performed by Chu et al. (2012), Dickerson, Kagan, and Subrahmanian (2014), and Clark et al. (2015), in which bot detection systems were devised. Components such as message timing behaviour, spam detection, account properties, and linguistic attributes were investigated. Variables were designed based on those components and were utilised to categorize message senders as humans, bots, or cyborgs. A bot refers to computer software that generates messages automatically, whereas a cyborg in this context refers either to a bot-assisted human or to a human-assisted bot.
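To make the variable-based approach concrete, the following Python sketch derives a few account-level variables of the kind such systems use. The specific features, names, and thresholds are illustrative assumptions, loosely in the spirit of the timing and account-property components of Chu et al. (2012), not the exact variables of any cited system.

```python
import math
from collections import Counter

def timing_entropy(timestamps):
    """Shannon entropy of inter-tweet intervals, bucketed to whole minutes.

    Automated senders often post at regular intervals (low entropy);
    human posting times tend to be more irregular (higher entropy).
    `timestamps` is a list of datetime objects.
    """
    timestamps = sorted(timestamps)
    intervals = [int((b - a).total_seconds() // 60)
                 for a, b in zip(timestamps, timestamps[1:])]
    if not intervals:
        return 0.0
    counts = Counter(intervals)
    total = len(intervals)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def account_variables(profile, timestamps):
    """Combine simple account-property variables with the timing feature.

    The profile keys used here are assumptions for illustration only.
    """
    return {
        'followers_per_friend': profile['followers'] / max(profile['friends'], 1),
        'tweets_per_day': profile['statuses'] / max(profile['account_age_days'], 1),
        'interval_entropy': timing_entropy(timestamps),
    }
```

On this view, low interval entropy, an extreme tweet rate, and a skewed follower-to-friend ratio would each raise suspicion of automation.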

This paper is organized as follows: First, the social media phenomenon is described, which gives context to the trolling behaviour. Then, trolling behaviour in social media is analysed and the mechanisms characteristic of trolling are discussed. Before algorithms can be trained to detect trolls, the definition of a troll has to be accurate, which is not the case in the current literature. Therefore, the trolling phenomenon is discussed before proceeding to automated detection. In the experimental part of the paper, automated bot and cyborg detection is applied to Twitter messages. Finally, the discussion section provides future directions for troll detection and for defending against the trolling phenomenon.

Social Media as a Public Sphere


The promise of social media is not confined to technology, but involves cultural, societal, and
economic consequences. Social media refers herein to a constellation of Internet-based
applications that derive their value from the participation of users through directly creating
original content, modifying existing material, contributing to a community dialogue, and
integrating various media together to create something unique (Kaplan & Haenlein 2010). Social
media has engendered three changes: 1) the locus of activity shifts from the desktop to the web;
2) the locus of power shifts from the organization to the collective; and 3) the locus of value
creation shifts from the organization to the consumer (Berthon et al. 2012).

Social media has become integrated into the lives of postmodern people. Globally, more than two billion people use social media on a daily basis. Whether it's a question of statesmen's comments, opposition leaders' criticism, or celebrities' publicity tricks, social media offers an authentic information source and an effective communication channel. Social media enables interaction between friends and strangers at the same time that it lowers the threshold of contact and personalizes communication. In a way, social media has made the world smaller.

Social media has brought with it 'media life', which Deuze (2011) calls the state where media has become so inseparable from us that 'we do not live with media, but in it' (Karppi 2014, p. 22). In a hyper-connected network society, posts on Twitter cause stock market crashes and overthrow governments (Pentland 2014). Unsurprisingly, life in social media is as messy as it is in the real world. Social media exposes people to new information and ideas, reflects their everyday highs and lows, allows for engagement in new friendships and the breaking up of old ones, makes other people delighted or jealous through posted holiday and party photos, hosts praise and complaints about brands, and idolises the achievements of descendants and pets. Stated a bit simply, users' behaviour in social media can be categorised into two types: rational/information-seeking and emotional/affective-seeking behaviours (Jansen et al. 2009). A desire to address a gap in information concerning events, organisations, or issues is an example of information-seeking behaviour in social media, whereas affective-seeking behaviour stands for the expression of opinion about events, organisations, or issues.

The penetration of the Internet and the growing number of social media platforms that allow user-generated content have not been without consequences: on the one hand, Internet-wide freedom of speech has created bloggers and others who attract substantial audiences; on the other hand, the Internet has democratised communication by enabling anyone (in theory) to say anything. Obviously, the consequences can be good or bad. A positive interpretation of freedom of speech is that it enables the emergence of the public spheres envisioned by Habermas (1989). Internet-based public spheres enable civic activities and political participation; that is, citizens can gather together virtually, irrespective of geographic location, and engage in information exchange and rational discussion (Robertson et al. 2013). On the other hand, many studies have pointed out that social media has become a place for venting negative experiences and expressing dissatisfaction (Lee & Cude 2012; Bae & Lee 2012). Due to the lack of gatekeepers (Lewin 1943), social media invites not only the sharing of rumours and the expression of conflicting views, but also bullying, harassment, and hate speech. In addition to providing a forum for sharing information, social media is also a channel for propagating misinformation. Terrorist organizations, such as ISIS, have quite effectively deployed social media in recruiting members, disseminating propaganda, and inciting fear. It has also been asked whether governments use social media to paint black as white. The question is justifiable, as the significance of the change entailed by the emergence of public spheres becomes concrete in countries which have prohibited or hampered the use of social media.

To connect public discussion theory and trolling, this work builds on the public discussion stakeholder classification made by Luoma-Aho (2015). The classification includes positively engaged faith-holders, negatively engaged hateholders, and fakeholders. Trolls can be considered either hateholders (humans) or fakeholders (bots or cyborgs). Luoma-Aho states that the influence of a fakeholder appears larger than it really is in practice, but tools for analysing the impact are not provided. In order to have a more thorough view of the discussion, it would be important to know the sources behind the fakeholders' arguments; but like the artists of black propaganda, they attempt to hide themselves. It can be hypothesized that the role of fakeholders increases with subjects whose legitimacy is questioned or challenged, and when the public is confused about the relevance and significance of the arguments presented in various social media platforms.

Studies have confirmed what every social media user already knows: virtual public spheres attract the users whom Luoma-Aho (2015) has named hateholders and fakeholders. Social media provides hateholders with continuously changing targets and stimuli. Hateholders' behaviour can be harsh, hurtful, and offensive, and it should therefore be condemned. Although fighting against hateholders is not an easy task, it is possible because hateholders' behaviour is visible. Hateholders do not typically try to hide; on the contrary, they pursue publicity. Fakeholders, in turn, act in the shadows. Although their behaviour can also be harsh, hurtful, and offensive, it is difficult to get hold of them. Acting through fake identities and using sophisticated persona-management software, fakeholders aim to harm their targets.


Trolling as a Phenomenon
During the experiment phase of this research, the authors found that the definitions used for trolling were not specific enough. Human communication, including that which is malevolent, is so diverse and contextual that a clear definition is needed in order to create automatic classification systems for trolling. Trolling, as a means of either distracting a conversation or simply provoking an emotional answer, can utilise multiple context-based ways of influence. A positive word or sentence can, in the right context, actually be an insult. To have any success in creating trolling identification systems, the authors first had to create a specific description of what they were looking for.

The online Cambridge Dictionary (2016) defines a troll as 'someone who leaves an intentionally annoying message on the Internet, in order to get attention or cause trouble' or as 'a message that someone leaves on the Internet that is intended to annoy people'. The Oxford English Dictionary Online (2016) defines a troll as 'a person who makes a deliberately offensive or provocative online post' or as 'a deliberately offensive or provocative online post'. These two definitions point to two specific characteristics of trolling: that trolling is something that happens online and that the intention of trolling is to offend someone.

Buckels, Trapnell and Paulhus (2014) studied the motivation behind trolling behaviour from a psychological viewpoint. They found that self-reported enjoyment of trolling was positively correlated with three components of the Dark Tetrad: sadism, psychopathy, and Machiavellianism. The fourth component, narcissism, had a negative correlation with trolling enjoyment. To include the different aspects of trolling more comprehensively, a new scale, the Global Assessment of Internet Trolling (GAIT), was introduced. Using GAIT scores, these researchers found sadism to have the most robust association with trolling behaviour, to the extent that 'sadists tend to troll because they enjoy it' (Buckels, Trapnell & Paulhus 2014, p. 101). This study points in the direction of psychological factors behind trolling behaviour.

Hardaker (2010) defines a troller as

a [computer-mediated communication] user who constructs the identity of sincerely wishing to be part of the group in question, including professing, or conveying pseudo-sincere intentions, but whose real intention(s) is/are to cause disruption and/or to trigger or exacerbate conflict for the purposes of their own amusement. (Hardaker 2010, p. 237)

From a dataset of 186,470 social network posts, she identified four interrelated conditions related to trolling behaviour: aggression, deception, disruption, and success (Hardaker 2010, pp. 225-36). This definition maintains the view of trolling as offensive and conducted for the achievement of personal goals, but adds deception to it. There may be many reasons for hiding the true intentions behind one's messages; but in trolling, there are two main reasons. First, hiding prevents the targets from reasoning against the trolling influence. A straightforward offensive message can be dismissed more easily than a subtle suggestion framed as a constructive argument. Secondly, most discussion forums are moderated, and openly offensive posts are typically removed quickly (although the rules and practices may vary), thus reducing their effectiveness.


The previously mentioned articles define trolling as having no apparent instrumental purpose. One effect of successful trolling, the replacement of a factual (or at least civilized) online discussion with a heated debate driven by strongly emotional arguments, can, however, be used as a tool. The motivation behind this type of trolling behaviour is different from that of those who have an emotional need for trolling. Spruds et al. (2015) identified two major types of trolls in their study of Latvia's three major online news portals: classic and hybrid. The definition of classic trolls is very close to those offered by Hardaker (2010) and Buckels, Trapnell and Paulhus (2014), whereas the hybrid troll is seen as a tool of information warfare. The hybrid troll is distinguished from the classic troll by behavioural factors: intensively reposted messages; repeated messages posted from different IP addresses and/or nicknames; and republished information and links (Spruds et al. 2015). The motivation behind this type of trolling is not the satisfaction of one's psychological needs but the propagation of a (typically political) agenda.

The various definitions of trolls do not fit together very well as patterns of behaviour to look for with automated classification systems. Trolls seem to have some of the characteristics of both the hateholders and the fakeholders introduced by Luoma-Aho (2015). In addition, there is some overlap with other behavioural patterns, such as cyberbullying or various means of psychological influence. Trolling, nevertheless, is a recognized phenomenon that needs to be given a clear definition in order to have a common ground for discussing the subject and building reliable tools for automatic identification.

What, then, is the essence of trolling? What are its main elements? First, for the purposes of the current discussion, trolling does not exist without interaction. An offensive post in somebody's personal diary or Internet site, not intended to be read by large audiences, is not trolling. Thus, trolling exists in the interactive communications of Internet users. Secondly, there are many ways of influencing other people's views, varying from the objective presentation of facts to emotional appeals. Emotional appeals, in turn, vary by the feelings they try to arouse. Trolling uses offensively charged emotional appeals in order to arouse an aggressive response from the audience. Third, trolling does not target a single individual, but has the intention of appealing to as many members of the discussion forum as possible. Mihaylov and Nakov (2016) categorize two types of opinion manipulation trolls: 'paid trolls', which have been revealed from leaked reputation management contracts, and 'mentioned trolls', which have been called such by several different people. This dichotomy indicates that the reaction and interaction of discussion participants provide evidence of trolling. Other options for recognizing trolls are to utilise intelligence or leaked information, which is rarely available to the general public, or to systematically compare information provided by suspected trolls with information provided by reliable sources.

The authors of the current paper suggest the following definition for trolling: trolling is a phenomenon that is experienced in the interactions between Internet users, with the aim of gaining a strong response from as many users as possible by using offensive, emotionally charged content. Therefore, identification of trolling requires the interaction to include 1) the Internet as platform, 2) offensive and emotional content, and 3) an intended strong response from the audience.


Deceptive practices are excluded from this definition on purpose. A troll may hide or fake his/her identity or present untruthful information, but the authors suggest that trolling is a pattern of interaction, regardless of the motivation. This definition thus allows for different types of trolls with variable motivations and reliability. In other words, trolling is a recognizable technique rather than an interaction with an offensive intention.

Tools for the Detection of Message Automation
The goals of the experimental case study were 1) to analyse how to detect fakeholders, and 2) to develop the sentiment analysis tool (Paavola & Jalonen 2015) to detect message automation. The essential issue of this study is to determine the properties indicating that a message was sent by a bot or by a cyborg. The first step is to tag these messages manually in order to use classification models on them. This procedure provides the ground truth: the set of user accounts reliably classified by a human as bot or cyborg. As the number of analysed user accounts has to be on the scale of several thousand, this part is time consuming.

Here, the case study data consist of Finnish-language Twitter messages discussing the Syrian refugee crisis. Twitter is a microblog service where, on average, 500 million messages called tweets (140 characters maximum) are posted daily. The openness of the Twitter platform allows, and actually promotes, the automatic sending of messages. It is increasingly common to send automated messages to human users in an attempt to influence them and to manipulate sentiment analyses (Clark et al. 2015). Social media analyses can be skewed by bots that try to dilute legitimate public opinion.

Simple bot detection mechanisms analyse account activity and the related user network properties. Chu et al. (2012) utilised tweeting timing behaviour, account properties, and spam detection. An example of a more advanced study is provided by Dickerson, Kagan, and Subrahmanian (2014). Their aim was to find the most influential Twitter users in a discussion about an election in India; making this kind of assessment requires the exclusion of bots from the analysis. The authors created a very complex model with tens of variables in order to decide whether any given user is a human or a bot. Nineteen of those variables were sentiment based. The authors' main findings were that bots flip-flop their sentiment less frequently than humans, that humans express stronger sentiments, and that humans tend to disagree more with the general sentiment of the discussion.
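As an illustration of what a sentiment-derived variable can look like, the sketch below computes a flip-flop rate over a user's chronological sentiment labels. This simple rate is an assumed stand-in for exposition; Dickerson, Kagan, and Subrahmanian (2014) define their sentiment variables differently.

```python
def sentiment_flip_rate(labels):
    """Fraction of consecutive tweet pairs whose sentiment polarity flips.

    `labels` is a chronological list of per-tweet sentiments coded as
    -1 (negative), 0 (neutral), or +1 (positive). Lower values would be
    expected for bots, which flip-flop their sentiment less than humans.
    """
    pairs = list(zip(labels, labels[1:]))
    if not pairs:
        return 0.0
    flips = sum(1 for a, b in pairs if a != 0 and b != 0 and a != b)
    return flips / len(pairs)

# Example: a user alternating between praise and outrage flips often.
print(sentiment_flip_rate([1, -1, 1, -1, 0, 1]))  # 0.6
```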

Sophisticated bot algorithms emulate human behaviour, so bot detection must then rely on linguistic attributes (Clark et al. 2015). Clark et al. used three linguistic variables to determine whether a user is a bot: the average URL count per tweet, the average pairwise lexical dissimilarity between a user's tweets, and the word introduction rate decay parameter of the user. With these parameters, the authors were able to classify users as humans, bots, cyborgs, or spammers. The authors concluded that for human users these three attributes are densely clustered, but that they can vary greatly for automated user accounts.
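The first two of these variables are straightforward to approximate, as in the hedged sketch below. Jaccard distance over word sets stands in for the lexical dissimilarity measure that Clark et al. (2015) actually define, and the word-introduction decay parameter is omitted.

```python
import itertools
import re

URL_RE = re.compile(r'https?://\S+')

def avg_url_count(tweets):
    """Average number of URLs per tweet; link-spreading bots score high."""
    return sum(len(URL_RE.findall(t)) for t in tweets) / max(len(tweets), 1)

def avg_pairwise_dissimilarity(tweets):
    """Mean Jaccard dissimilarity between the word sets of all tweet pairs.

    Near-duplicate automated messages give values close to 0, while
    varied human writing gives values closer to 1. (Jaccard distance is
    an illustrative substitute for the measure in Clark et al. 2015.)
    """
    word_sets = [set(t.lower().split()) for t in tweets]
    pairs = list(itertools.combinations(word_sets, 2))
    if not pairs:
        return 0.0
    return sum(1 - len(a & b) / max(len(a | b), 1) for a, b in pairs) / len(pairs)
```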

Case Study
The development of our automatic bot detection system was started with a cross-topic Twitter dataset collected from 17 September 2015 to 24 September 2015. It covered more than 977,000 tweets in Finnish, sent by more than 343,000 users.


To develop an automatic classification system, a ground truth dataset needed to be created. Among the collected data, 2,000 users who had each sent at least 10 tweets during that one-week period were randomly chosen. The sample set contained 83,937 tweets in total. The profile data of those 2,000 Twitter users was also extracted. Tweets in languages other than Finnish were excluded from the dataset.

For each sampled user, the following procedure was used. The text content of each tweet was carefully checked. Other properties, such as the tweeting application used, the number of friends and followers, and, in some cases, the user's homepage, were also checked and recorded. In short, the user was labelled as a human or a bot based on the text of the tweets, other information carried by the tweets, the information contained in the user profile, and, in some cases, external data. It took more than a minute on average to classify a user.

The automatic bot detection system was further developed and applied to refugee-related Twitter messages. The difference between the bot and cyborg classifications was the level of automation. If all messages sent by the user were interpreted as automated messages, the user was classified as a bot. If only part of the messages seemed to originate from automation, the user was classified as a cyborg. However, if less than one-quarter of the messages were automated, the user was classified as a human. This labelling rule is sketched below.
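Stated as code, the rule reads as follows. The text does not say how the exact one-quarter boundary was treated, so the `>=` comparison is an assumption.

```python
def label_account(n_automated: int, n_total: int) -> str:
    """Manual labelling rule for one account, as described above.

    All messages automated -> bot; a substantial part (here taken as at
    least one quarter, an assumption) automated -> cyborg; under one
    quarter -> human.
    """
    fraction = n_automated / n_total
    if fraction == 1.0:
        return 'bot'
    if fraction >= 0.25:
        return 'cyborg'
    return 'human'
```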

The collection of the refugee crisis tweets was based on Finnish keywords and abbreviations.
The free Twitter search API was used. The refugee dataset was collected from 6 December 2015
to 3 February 2016. The complete dataset contained 59,491 tweets from 15,504 users. The
dataset also contained tweets in other languages, but those were excluded from the dataset based
on the results of the Language Detection Library (Shuyo 2016). After that, Twitter users who had
fewer than 10 tweets in the dataset were also excluded.
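For reference, the same language-filtering step can be reproduced with langdetect, the Python port of Shuyo's Language Detection Library; using the port rather than the original Java library is an assumption of this sketch.

```python
from langdetect import DetectorFactory, detect
from langdetect.lang_detect_exception import LangDetectException

DetectorFactory.seed = 0  # make language detection deterministic across runs

def finnish_only(tweets):
    """Keep only tweets detected as Finnish ('fi'); drop undetectable ones."""
    kept = []
    for text in tweets:
        try:
            if detect(text) == 'fi':
                kept.append(text)
        except LangDetectException:  # raised for e.g. empty or emoji-only text
            continue
    return kept
```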

The final refugee dataset contained 31,092 tweets from 855 Twitter users. Visualisations were generated to support qualitative analysis; these included the most commonly appearing hashtags as a word cloud and the locations where tweets had been posted. The latter data was available only if users had allowed geolocation data to be included.

The system was used to classify each user as either an automated user or a human user; an automated user can be a cyborg or a bot. A weighted Random Forest algorithm was used. The refugee dataset was divided into a training set (684 users) and a test set (171 users) to allow the researchers to evaluate the performance of the automatic classifier.
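The paper does not give the weighting details, so the sketch below reads 'weighted Random Forest' as scikit-learn's RandomForestClassifier with class_weight='balanced', which is one plausible choice rather than the authors' documented one. The feature matrix X, the labels y, and feature_names are assumed inputs.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def train_detector(X, y, feature_names):
    """Train and inspect a weighted Random Forest bot/cyborg detector.

    X: one row of features per user; y: 1 = bot or cyborg, 0 = human.
    The 684/171 split mirrors the paper's training and test set sizes.
    """
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=171 / 855, stratify=y, random_state=0)
    clf = RandomForestClassifier(n_estimators=500,
                                 class_weight='balanced',
                                 random_state=0)
    clf.fit(X_train, y_train)
    # Rank the tested features by importance, as discussed below Table 1.
    ranked = sorted(zip(feature_names, clf.feature_importances_),
                    key=lambda pair: pair[1], reverse=True)
    return clf, (X_test, y_test), ranked
```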

The test set confusion matrix of 171 Twitter users is presented in Table 1, below. The recall is 80 percent: 24 of the 30 Twitter users that were manually classified as bots or cyborgs were found as automated users in the test set by the pilot system. The precision is 86 percent, which indicates that most of the users the system classified as automated were indeed manually classified as bots or cyborgs.


                                     Classified by humans
                                     Human    Bot or Cyborg
  Classified by      Human            137            6
  the pilot system   Bot or Cyborg      4           24

Table 1: Confusion matrix showing very good accuracy and a low number of false-positive automatic classifications
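As a worked check, the quoted recall and precision follow directly from the counts in Table 1, assuming the standard definitions:

```python
# Counts from Table 1 for the bot-or-cyborg (automated) class.
tp, fn = 24, 6    # automated accounts found / missed by the pilot system
fp, tn = 4, 137   # humans wrongly flagged / correctly classified

recall = tp / (tp + fn)      # 24 / 30 = 0.80
precision = tp / (tp + fp)   # 24 / 28 ~= 0.857
print(f'recall={recall:.0%}, precision={precision:.0%}')  # recall=80%, precision=86%
```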

Some of the features the Random Forest algorithm found most important were the average number of other users mentioned in the user's tweets, the average number of links in the user's messages, and the type category of the sending application (social media application, mobile application, or automation application). Some sentiment-related features were also near the top of the list of about 30 features tested.

Figure 1, below, visually depicts changes in the dataset after automated messages were excluded. A notable difference can be seen in the most active Twitter accounts: after the exclusion of automated messages, the active human participants in the discussion can be identified. Thus, it can be concluded that the algorithm was able to remove bot and cyborg accounts. No change was observed in tweet locations; automated accounts evidently do not reveal this information. Only minor changes were seen in the hashtag list and the sentiment results, and thus those figures are not presented here.

Figure 1: Word cloud visualisations of the most active user accounts of the collected dataset and reduced data after
excluding bots and cyborgs. (Translations: pakolaiset = refugees; turvapaikka = asylum; maahanmuutto =
immigration)


Discussion
This case study offers a solid base for further development. The main weakness of the pilot system is that it is quite heavily dependent on many Twitter-specific features. It may also be the case that some of the features used are effective only when the bots and cyborgs are not trying to conceal themselves. In future work, it will be important to extract and create new and more complex features, especially from the messages' text content, such as how much the text content changes from one message to another for the same user. This process is guided by published research (Chu et al. 2012; Clark et al. 2015; Dickerson, Kagan, & Subrahmanian 2014). Features based on the text content might have the advantage of being more easily applicable to other social media forums outside Twitter and of being more likely to identify concealed bots and cyborgs.

Future work will investigate how to defend against the trolling phenomenon once trolls can be automatically identified. One general psycho-sociological way to deal with trolls who systematically spam information is to limit reactions to them to reminding others not to respond. In a rhizome meshwork, the only course of action is not to feed the trolls. A troll can disrupt the discussion in a newsgroup, disseminate bad advice, and damage the feeling of trust in the community.

Focusing on the findings of actors who try to expose these trolling entities may be one way to detect trolling behaviour. According to Weiss (2016), these actors may be called 'elves'. One aim of a trolling information operation is to break people's will to defend their knowledge and beliefs. A lack of trust in one's information and knowledge creates a favourable environment for possible hostile information intervention. To keep one's own information environment coherent, organisations responsible for ground truth must participate in online groups and share experiences about how to fight against the trolls on social media. Civic activists (elves) must commit to knocking down disinformation and sharing their experiences in order to grow the number of new elf-participants. They must not try to be propagandists in reverse, but rather expose the disinformation by using humour against the trolls. They may post a link that all the members can go to and leave their comments and reactions (for example, liking or disliking something). When trying to figure out the identities of the trolls, elves may begin by at least locating the country or town from which they come (Weiss 2016). Automated troll detection is feasible. However, more research is required to understand trolling mechanisms in order to train algorithms accordingly.

References
Bae, Y & Lee, H 2012, 'Sentiment analysis of Twitter audiences: measuring the positive or negative influence of popular Twitterers', Journal of the American Society for Information Science & Technology, vol. 63, no. 12, pp. 2521-35.

Berthon, PR, Pitt, LF, Plangger, K & Shapiro, D 2012, 'Marketing meets Web 2.0, social media, and creative consumers: implications for international marketing strategy', Business Horizons, vol. 55, no. 3, pp. 261-71.

Buckels, EE, Trapnell, PD & Paulhus, DL 2014, 'Trolls just want to have fun', Personality and Individual Differences, vol. 67, pp. 97-102.

Chu, Z, Gianvecchio, S, Wang, H & Jajodia, S 2012, 'Detecting automation of Twitter accounts: are you a human, bot, or cyborg?', IEEE Transactions on Dependable and Secure Computing, vol. 9, no. 6, pp. 811-24.

Clark, EM, Williams, JR, Jones, CA, Galbraith, RA, Danforth, CM & Dodds, PS 2015, 'Sifting robotic from organic text: a natural language approach for detecting automation on Twitter', Journal of Computational Science, vol. 16, pp. 1-7.

Coyne, R 2014, The net effect: design, the rhizome, and complex philosophy, viewed 20 August 2016, <http://www.casa.ucl.ac.uk/cupumecid_site/download/Coyne.pdf>.

Deleuze, G & Guattari, F 1983, On the line, MIT Press, New York, NY, U.S.A.

Deuze, M 2011, 'Media life', Media, Culture and Society, vol. 33, no. 1, pp. 137-48.

Dickerson, JP, Kagan, V & Subrahmanian, VS 2014, 'Using sentiment to detect bots on Twitter: are humans more opinionated than bots?', Proceedings of the 2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014), Beijing, China, August.

Habermas, J 1989, The structural transformation of the public sphere: an inquiry into a category of bourgeois society, MIT Press, Cambridge, MA, U.S.A.

Hardaker, C 2010, 'Trolling in asynchronous computer-mediated communication: from user discussions to academic definitions', Journal of Politeness Research, vol. 6, pp. 215-42.

Jansen, BJ, Zhang, M, Sobel, K & Chowdury, A 2009, 'Twitter power: tweets as electronic word of mouth', Journal of the American Society for Information Science and Technology, vol. 60, no. 11, pp. 2169-88.

Kaplan, AM & Haenlein, M 2010, 'Users of the world, unite! The challenges and opportunities of social media', Business Horizons, vol. 53, pp. 59-68.

Karppi, T 2014, Disconnect me: user engagement and Facebook, Doctoral Dissertation, Annales Universitatis Turkuensis, Ser. B Tom. 376, Humaniora, University of Turku, Finland.

Lee, S & Cude, BJ 2012, 'Consumer complaint channel choice in online and off-line purchases', International Journal of Consumer Studies, vol. 36, pp. 90-6.

Lewin, K 1943, 'Forces behind food habits and methods of change', Bulletin of the National Research Council, vol. 108, pp. 35-65.

Luoma-Aho, V 2015, 'Understanding stakeholder engagement: faith-holders, hateholders & fakeholders', Research Journal of the Institute for Public Relations, vol. 2, no. 1.

Malgin, A 2015, 'Kremlin troll army shows Russia isn't Charlie Hebdo', The Moscow Times, viewed 20 August 2016, <http://www.themoscowtimes.com/opinion/article/russia-is-not-charlie/514369.html>.

Mihaylov, T & Nakov, P 2016, 'Hunting for troll comments in news community forums', The 54th Annual Meeting of the Association for Computational Linguistics: Proceedings of the Conference, vol. 2 (Short Papers), pp. 399-405.

Nacos, BL, Bloch-Elkon, Y & Shapiro, RY 2011, Selling fear: counterterrorism, the media, and public opinion, The University of Chicago Press, Chicago, IL, U.S.A.

Paavola, J & Jalonen, H 2015, 'An approach to detect and analyze the impact of biased information sources in the social media', Proceedings of the 14th European Conference on Cyber Warfare and Security ECCWS-2015, Academic Conferences and Publishing International Limited, London, UK, July.

Pentland, A 2014, Social physics: how good ideas spread: the lessons from a new science, The Penguin Press, New York, NY, U.S.A.

Robertson, SP, Douglas, S, Maruyama, M & Semaan, B 2013, 'Political discourse on social networking sites: sentiment, in-group/out-group orientation and rationality', Information Polity: The International Journal of Government & Democracy in the Information Age, vol. 18, no. 2, pp. 107-26.

Shuyo, N 2016, language-detection, viewed 20 August 2016, <https://github.com/shuyo>.

Sprūds, A, Rožukalne, A, Sedlenieks, K, Daugulis, M, Potjomkina, D, Tölgyesi, B & Bruģe, I 2015, Internet trolling as a hybrid warfare tool: the case of Latvia, NATO STRATCOM Centre of Excellence Publication.

'Troll' 2016, Cambridge Dictionary online, viewed 20 August 2016, <http://dictionary.cambridge.org/dictionary/english/troll>.

'Troll' 2016, Oxford English Dictionary online, viewed 20 August 2016, <http://www.oxforddictionaries.com/definition/english/troll>.

Weiss, M 2016, 'The Baltic elves taking on pro-Russian trolls', The Daily Beast, viewed 21 March 2016, <http://www.thedailybeast.com/articles/2016/03/20/the-baltic-elves-taking-on-pro-russian-trolls.html>.
