
User Identification on Social Networks Through Text Mining Techniques: A Systematic Literature Review

Kinza Zahra, Farooque Azam, Wasi Haider Butt and Fauqia Ilyas

Abstract Social network analysis studies the social connections among a set of
people. People maintain numerous identities on various online social sites. User-related
network data carries distinctive information that reveals user interests, behavioral
patterns, and political views. Using these behaviors individually and collectively is
of great help in recognizing users across social networks. A Systematic Literature
Review (SLR) has been performed to select 31 papers published during 2010–2018.
The aim is to determine the user identification categories that are used to classify
users and, furthermore, to identify the algorithms, models, methods, and tools that
have been suggested since 2010 for user characterization. We have identified 10
algorithms, 19 models, 5 methods and 8 tools that have been proposed for 5 user
identification categories. Finally, our empirical evaluation indicates that text mining
techniques are promising approaches for the identification of users on online social
networks.

Keywords User identification · Online social networks · Text classification · Text mining · Tools

K. Zahra (B) · F. Azam · W. H. Butt · F. Ilyas


Department of Computer Engineering, College of E&ME,
National University of Sciences and Technology (NUST), Islamabad 12, Pakistan
e-mail: kinza.zahra15@ce.ceme.edu.pk
F. Azam
e-mail: farooq@ceme.nust.edu.pk
W. H. Butt
e-mail: wasi@ceme.nust.edu.pk
F. Ilyas
e-mail: fauqia.ilyas85@ce.ceme.edu.pk

© Springer Nature Singapore Pte Ltd. 2019 485


K. J. Kim and N. Baek (eds.), Information Science and Applications 2018,
Lecture Notes in Electrical Engineering 514,
https://doi.org/10.1007/978-981-13-1056-0_49
486 K. Zahra et al.

1 Introduction

Online Social Networks (OSNs) such as Facebook, Twitter, and Reddit have
become extremely popular over the past decade and are now among the most common
communication tools [1]. To integrate these OSN sites, it is essential to discover
the identity of a user [2]. User identification based on text has attracted the
attention of many researchers. Identifying users from text in terms of behavioral
patterns [3] and demographic characteristics of authors such as age and gender [4]
is essential in forensics, security, and advertising. For instance, one may want to
learn about the behavior of the author of an aggressive or criminal text message,
or organizations may want to find out the demographic attributes of individuals
who like or dislike their products, given blogs and online product reviews.
The prevailing literature can be categorized into two groups: systematic and
traditional literature reviews. Traditional literature reviews mainly cover
state-of-the-art work and current research trends, while systematic literature
reviews focus on answering research questions, here those involving user
identification. Despite a large number of empirical studies on user identification
categories, models, and algorithms, there was a need to bring these frameworks of
the same domain together in one review so that they can be compared regarding
performance, accuracy, and validation against state-of-the-art machine learning
algorithms. As far as we know, no systematic literature review concentrates on
online user identification through text mining techniques, which motivates our
efforts in this study. The purpose of this SLR is to answer the following research
questions:
RQ1: Which user identification categories are mainly addressed in research on
online social networks through text classification?
RQ2: Which models and algorithms are used to identify users through text
classification?
RQ3: What are the current tools for user identification in the text classification
research area?
The paper is structured as follows. Section 2 describes the methodology followed
in this paper. Section 3 presents the results. Section 4 presents the discussion
and limitations. Conclusions and future work are given in Sect. 5.

2 Methodology

This paper was conducted as a systematic literature review based on the original
guidelines presented by Kitchenham [5]. It is intended to identify, assess, and
understand the available material regarding research contributions that are
relevant to user identification and commensurate with the research questions stated
above. To reduce the possibility of the results being biased by the writers'
preferences, the systematic literature review adheres to extensive, previously
established guidelines.

2.1 Review Protocol Development

Our review protocol follows the guidelines already defined by Kitchenham [5]. In
the first stage, five digital libraries (IEEE Xplore, ACM Digital Library, Springer,
Elsevier, and Taylor & Francis) were selected to identify relevant papers. Our review
protocol defines the inclusion and exclusion criteria, the search strategy, the
quality evaluation, and the extraction and synthesis of data.

2.1.1 Inclusion and Exclusion Criteria

Our inclusion criteria concentrated on high-level, comprehensive papers addressing
original work on user characterization. We established the following selection
requirements for inclusion:
• The study uses text mining techniques for user identification.
• The study uses machine learning techniques for preprocessing and feature selection.
• The study describes user identification categories, i.e., user behavior, attribute
identification, and spammers.
• Where a study is available in both journal and conference form, only the journal
version is included.
Papers having these features were excluded:
• Papers on user characterization based on data other than text.
• Duplicated studies (only one copy of each study was included).
2.1.2 Search Process

The search process of our literature review commenced with the selection of digital
libraries and research questions. This selection plays a key part in showing the
comprehensiveness and completeness of the accumulated papers, as illustrated in Table 1.
The following steps were carried out in the search process, as shown in Fig. 1.
• Five digital libraries (IEEE, Springer, Elsevier, ACM, and Taylor & Francis) were
searched.
• Journal and conference papers were filtered by title, keywords, and abstract using
the inclusion and exclusion criteria.
• The search terms were derived from the RQs.
• Boolean AND was applied to restrict the search.
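As a rough illustration of the last step, the snippet below builds Boolean query strings from the search terms listed in Table 1. It assumes that AND is placed between the individual words of each term; the exact query syntax submitted to each digital library is not given in the review, so `build_query` is a hypothetical helper.

```python
# Search terms as listed in Table 1.
SEARCH_TERMS = [
    "Online user characterization",
    "Online user categorization",
    "Social network analysis",
    "Online social networks",
    "Models for OSN's",
]


def build_query(term: str) -> str:
    """Join the words of a search term with Boolean AND (assumed syntax)."""
    return " AND ".join(term.split())


queries = [build_query(term) for term in SEARCH_TERMS]
for query in queries:
    print(query)
```

Each library's own search interface may require different quoting or field restrictions (title, keywords, abstract), which this sketch does not model.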

Table 1 Selection criteria

Sr. #  Search terms                  Operator  IEEE   Springer  ACM    Elsevier  Taylor and Francis
1      Online user characterization  AND       154    1915      14105  3043      64
2      Online user categorization    AND       131    1899      95     2840      1556
3      Social network analysis       AND       12853  32893     4225   22694     1348
4      Online social networks        AND       14780  15792     265    15106     2264
5      Models for OSN's              AND       5560   13        102    13463     1971

Table 2 Quality assessment checklist

Sr. #  Questions                                                                          Answer
1      Does the paper aim to identify user through text categorization?                   Yes
2      Does the research review any of the preceding papers?                              Yes
3      Do the findings address the original research questions?                           Yes
4      Is the paper biased towards one user identification algorithm, model or approach?  No

2.1.3 Quality Assessment

We adopted the quality assessment criteria described in the study by
Kitchenham [5]. A quality assessment checklist was developed to evaluate the quality
of the selected papers, as shown in Table 2.
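One way to apply the checklist of Table 2 programmatically is sketched below. The dictionary keys are hypothetical short names for the four questions, and requiring all four answers to match is an assumption; the review does not state how partial matches were treated.

```python
# Expected answers from the quality assessment checklist (Table 2).
EXPECTED = {
    "identifies_user_through_text": True,   # question 1: Yes
    "reviews_preceding_papers": True,       # question 2: Yes
    "findings_address_questions": True,     # question 3: Yes
    "biased_to_one_approach": False,        # question 4: No
}


def passes_quality_check(answers: dict) -> bool:
    """Return True when every answer matches the expected checklist answer."""
    return all(answers.get(q) == expected for q, expected in EXPECTED.items())


# Hypothetical paper that is biased towards a single approach.
biased_paper = dict(EXPECTED, biased_to_one_approach=True)
print(passes_quality_check(EXPECTED), passes_quality_check(biased_paper))
```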

2.1.4 Data Extraction and Synthesis

We defined data extraction and synthesis forms to precisely record and gather the
data obtained by studying the chosen publications. We prepared an Excel sheet to
record the data needed to answer the research questions. We included general data
about each paper, for example, title, author name, publication year, research type,
and overall summary. In data synthesis, each question was evaluated separately
against the results. The data extraction and synthesis items are given in tabular
form in Table 3.

Fig. 1 Search process

3 Results

The findings of the systematic literature review obtained with the defined review
protocol are explored in this section. Figure 2 shows a graphical depiction of the
selected papers from 2010 to 2018, broken down by year. The results are then
analyzed by quantitative and cross-examination techniques.

3.1 General View of Selected Studies

We identified 31 studies in the field of machine learning based user categorization.
The papers were published between 2010 and 2018. Of the selected studies,

Table 3 Data extraction and synthesis

Sr. #  Description                Details
1      Bibliographic information  Author, title, publication year, publisher details, and
                                  type of research (i.e., journal or conference)
Extraction of data
2      Overview                   Main objective of the selected paper
3      Results                    Results acquired from the selected paper
4      Data collection            Qualitative and quantitative method used
5      Assumptions                To validate the outcome
6      Validation                 State-of-the-art ML models are used to validate results
Synthesis of data
7      Assigning user categories  Categories for user identification based on text
                                  classification
8      Framework selection        Models, algorithms and methods for user
                                  identification based on text classification
9      Tool selection             Emphasis on tools used for user identification based
                                  on text classification

Fig. 2 Number of selected researches per year (chart title: Publication per year; x-axis: 2010 to 2018; series: Conference, Journal)

13 papers were published in journals, while 18 papers appeared in conference
proceedings. Regarding the types of studies, all the selected studies are
experimental research. Table 4 shows an overview of the selected work.
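The journal and conference totals can be cross-checked against the per-database counts in Table 4. The short tally below transcribes those counts into a dictionary (the variable names are ours) and sums them.

```python
# Counts per database and venue type, transcribed from Table 4.
SELECTED = {
    "IEEE": {"journal": 2, "conference": 5},
    "Springer": {"journal": 2, "conference": 5},
    "Elsevier": {"journal": 5, "conference": 2},
    "ACM": {"journal": 1, "conference": 6},
    "Taylor and Francis": {"journal": 3, "conference": 0},
}

journal_total = sum(counts["journal"] for counts in SELECTED.values())
conference_total = sum(counts["conference"] for counts in SELECTED.values())
print(journal_total, conference_total, journal_total + conference_total)
# prints: 13 18 31
```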

3.2 User Identification Categories (RQ 1)

All papers have been analyzed to find out to which user identification category
the authors contribute text-based research results. The user identification
categories, i.e., behavioral, attribute, topic, spam, and crime, are discussed in RQ1.
Among these categories, behavioral and attribute are the two most frequently used;
together they were mentioned by 65% of the selected papers, as illustrated
in Table 5. Compared to other identification categories, behavioral and attribute seem

Table 4 Selected work

Sr. #  Scientific database  Type        Selected research works  No. of researches
1      IEEE                 Journal     [6, 7]                   2
                            Conference  [8–12]                   5
2      Springer             Journal     [13, 14]                 2
                            Conference  [15–19]                  5
3      Elsevier             Journal     [3, 20–23]               5
                            Conference  [4, 24]                  2
4      ACM                  Journal     [25]                     1
                            Conference  [1, 26–30]               6
5      Taylor and Francis   Journal     [31–33]                  3
                            Conference  –                        –

Table 5 Identification of user categories

Sr. #  User identification categories  Percent (%)  References
1      Behavioral identification       45           [3, 6, 7, 11, 13, 20–25, 28, 30–32]
2      Attribute identification        32           [4, 8, 9, 11, 15, 16, 23, 29, 32, 33]
3      Topic identification            12           [10, 26, 27, 29]
4      Spam identification             16           [1, 12, 14, 18, 19]
5      Crime identification            12           [4, 16, 17, 23]

to have received considerable research attention over the years. Some studies
in the literature cover two or more user identification categories in one study.
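Because a study can fall into more than one category (for example, [23] is listed under behavioral, attribute, and crime identification in Table 5), the category shares need not add up to 100%. The small check below sums the percentages reported in Table 5; the variable names are ours, and rounding in the reported figures may also contribute to the excess.

```python
# Shares reported in Table 5, in percent of the selected studies.
CATEGORY_SHARE = {
    "behavioral": 45,
    "attribute": 32,
    "topic": 12,
    "spam": 16,
    "crime": 12,
}

total = sum(CATEGORY_SHARE.values())
excess = total - 100  # share attributable to multi-category studies and rounding
print(total, excess)
# prints: 117 17
```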

3.3 User Categorization Models and Algorithms (RQ 2)

Considering the data extracted to answer this research question, it emerges
that behavioral, attribute, spam, and crime identification models and algorithms are
cited by both journal and conference papers, while topic identification frameworks
are mentioned only in conference papers.
Different models with the goal of identifying users, their behavioral patterns,
and their attributes are listed under this research question. The studies reporting
the identification of user behaviors through short text were [11, 21]. In publications
[7, 13, 28, 32], users' interest and influence regarding responsiveness, along with
communication and exploration, were identified. Demographic features of users such as
age, gender, education, date, and email address were identified by the models cited
in [4, 16, 33]. Spam emails in [14] were detected using the anti-spam model, as shown
in Table 6.

Table 6 User categorization models

Sr. #  Category (count)  User categorization models                              References
1      Behavioral (12)   Comment tree model, UR, UC and UCR model, UCT,          [7, 8, 11, 13, 20, 21, 23–25, 28,
                         DSUN, DSM, pipe-lined system models, AS-LDA             30, 32]
                         and LDA multi-layer perception model, SVM
                         classification model, personalized recommendation
                         model, DTM, research models
2      Attribute (7)     Machine learning models, pipe-lined system models,      [4, 11, 15, 16, 23, 32, 33]
                         SVM classification model, classification models,
                         authorship identification model, research models,
                         theoretical models
3      Topic (2)         Hidden Markov model, vector space model                 [26, 27]
4      Spam (1)          Anti-spam model                                         [14]
5      Crime (2)         Pipe-lined system models, authorship identification     [16, 23]
                         model

Table 7 User categorization algorithms/methods

Sr. #  Category (count)  User categorization algorithms/methods                  References
1      Behavioral (6)    Measurement method, Gibbs sampling algorithm, IITP,     [6, 11, 13, 21, 23, 24, 28, 30]
                         improved semi-supervised algorithm, sentiment flow
                         algorithm, generation process algorithm
2      Attribute (5)     ReLU, IR and ML algorithm, IITP, stylometry             [4, 9, 11, 23, 29]
                         method, improved semi-supervised algorithm
3      Topic (3)         Feature term method, IR and ML algorithm, group         [10, 26, 29]
                         recommendation method
4      Spam (3)          Random forest, real-time (DeBOT) method, ML and         [12, 18, 19]
                         compression algorithm
5      Crime (2)         Grammar derivation, grammar combination and             [22, 23]
                         general EFG algorithm, IITP

After the detailed analysis of the 31 studies, we distinguished 10 algorithms
and 5 methods that have been trained through machine learning techniques to make
specific decisions, as shown in Table 7.

Table 8 User categorization tools

Sr. #  User identification tools  Purpose                           References
1      Third party tools          URLs recognition                  [1]
2      Knowledge based tools      Emotion recognition               [3]
3      ROUGE                      Automatic text summarization      [27]
4      LIWC tool                  Personality/behavior recognition  [9, 28]
5      ITAP                       Behavior recognition              [23]
6      Stanford POS tagger        Read text and assign POS          [8]
7      Automated tools            Spam recognition                  [12]
8      Word segmentation tool     Word recognition                  [15]

The Gibbs sampling algorithm, which is used for feature selection in [21,
24, 30], is the only algorithm used in multiple papers. Algorithms can be applied in
supervised, unsupervised, and semi-supervised learning, depending upon the dataset.
In [11], a semi-supervised learning algorithm for partially labeled data was used
to train on the data until it is labeled completely. Most of the algorithms in this
study were used in supervised learning. Random forest [12] and the machine learning
and compression algorithms [19] were used for classification and regression.
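To illustrate the general idea of training on partially labeled data until everything is labeled, the following sketch implements plain self-training with a nearest-centroid classifier. It is not the SVM-based method of [11]; the classifier, the function names, and the toy points are assumptions chosen to keep the example dependency-free.

```python
def centroid(points):
    """Mean vector of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_train(labeled, unlabeled):
    """Label all unlabeled vectors, one most-confident vector per round.

    labeled:   list of (vector, label) pairs
    unlabeled: list of vectors
    """
    labeled = list(labeled)
    pending = list(unlabeled)
    while pending:
        # Re-fit one centroid per class on everything labeled so far.
        by_class = {}
        for vec, lab in labeled:
            by_class.setdefault(lab, []).append(vec)
        centroids = {lab: centroid(vecs) for lab, vecs in by_class.items()}
        # The pending vector closest to any centroid counts as most confident.
        _, index, label = min(
            (sq_dist(vec, c), i, lab)
            for i, vec in enumerate(pending)
            for lab, c in centroids.items()
        )
        labeled.append((pending.pop(index), label))
    return labeled


# Toy feature vectors with made-up class names.
seed = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
result = dict(self_train(seed, [(1.0, 1.0), (9.0, 9.0), (4.0, 4.0)]))
```

In practice the feature vectors would come from the text (for example, term weights), and a confidence threshold is often added so that uncertain samples are not labeled too early.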

3.4 User Categorization Tools (RQ3)

This section presents the tools used for user identification in 9 research papers,
as shown in Table 8. The basic purpose of these tools is to reduce the ambiguity in
the text present on different social network platforms.
The tools covered by this research question come from both the research community
and the public sector. Multiple third-party tools [1] such as Browsing API, SURBL,
and Spamhaus, from both research and private sectors, and automated tools [12]
detect malicious URLs and spam tweets. Knowledge-based tools [3] were developed, in
contrast with statistical approaches, to analyze and extract knowledge from each
sentence in order to specify its sentiment status. Some tools evaluate documents
automatically: ROUGE [27] is used to compare generated summaries with summaries
created by experts. LIWC [9, 28] is the only tool in this review that is used by
two studies, to recognize behavioral patterns. The ITAP tool [23], built at the
University of Texas at Austin, describes the criminal process, vulnerabilities, and
resources that enable criminals to commit a crime. Two natural language processing
tools, the Stanford POS Tagger [8] and a word segmentation tool [15], were used to
extract features from tweets and to recognize words, respectively.
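The comparison of a generated summary with an expert summary can be illustrated with a simplified unigram-recall score in the spirit of ROUGE-1. This is a minimal sketch, not the official ROUGE toolkit: it uses whitespace tokenization, no stemming, and a single reference, and the function name and example sentences are ours.

```python
from collections import Counter


def rouge1_recall(candidate: str, reference: str) -> float:
    """Fraction of reference unigrams that also occur in the candidate."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    if not ref:
        return 0.0
    # Clip each word's match count by its count in the candidate.
    overlap = sum(min(count, cand[word]) for word, count in ref.items())
    return overlap / sum(ref.values())


# Made-up summaries: five of the six reference words are matched.
score = rouge1_recall("the cat lay on the mat", "the cat sat on the mat")
print(round(score, 3))
# prints: 0.833
```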

4 Discussion and Limitation

In this study, we identified and evaluated text mining techniques that help to
distinguish users, their demographic features, and their behavioral patterns while
they communicate on different social networking sites. The data used in the selected
studies is mostly gathered from Twitter: out of 31 studies, 13 used Twitter data for
their experiments. Other data sources include Facebook, blogs, documents, reports,
and instant messages.
In text classification, machine learning techniques vary across datasets.
Different texts require different sets of features and ML techniques. Preprocessing,
a data mining technique that transforms data into an understandable format, was used
in 14 papers, mostly where natural language processing is performed on the text to
attain optimal results. Correct feature selection increases the accuracy and
performance of the classifier. The most frequently used feature selection techniques
were POS tagging and TF-IDF; the use of these techniques improved the performance of
the machine learning models. The findings suggest that future studies adopt
semantic-based features and demographic features together to achieve higher performance.
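As a concrete illustration of TF-IDF weighting, the sketch below computes term weights for a handful of short documents. It assumes whitespace tokenization and the basic formula tf = count / document length and idf = log(N / document frequency); the reviewed studies may use other variants, and the sample documents are made up.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / document length
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


docs = ["free offer today", "meeting today", "free free link today"]
weights = tf_idf(docs)
# "today" occurs in every document, so its weight is 0 everywhere.
print(weights[0]["today"], round(weights[0]["offer"], 3))
```

Terms that appear in every document receive zero weight, which is why TF-IDF tends to highlight words that distinguish one author or message from the others.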
Classifiers are applied in supervised learning to validate the experimental
results. In this review, classifiers are used by 14 studies, and the support vector
machine (SVM) is used in 9 studies, either alone or combined with other classifiers.
The selected studies suggest that the performance of SVM in text classification is
much better than that of other classifiers such as random forest and naïve Bayes.
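For readers unfamiliar with the baseline classifiers mentioned above, the following is a minimal multinomial naïve Bayes text classifier with Laplace smoothing. It is a sketch of the baseline only, not of SVM and not of any specific reviewed study; the function names, the tokenization, and the toy training messages are assumptions.

```python
import math
from collections import Counter, defaultdict


def train_nb(samples):
    """samples: list of (text, label) pairs. Returns the fitted counts."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for token in text.lower().split():
            if token not in vocab:
                continue  # ignore words never seen in training
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        scores[label] = score
    return max(scores, key=scores.get)


# Made-up training messages.
model = train_nb([
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("project meeting today", "ham"),
    ("lunch meeting tomorrow", "ham"),
])
print(predict_nb(model, "money offer now"), predict_nb(model, "meeting today"))
# prints: spam ham
```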
For user categorization, we identified 19 models, 10 algorithms, 8 tools, and 5
methods. In some studies, such as [11, 13, 23, 26, 28, 30], they are used together to
improve performance. This review shows that frameworks perform differently on every
dataset, depending upon the size and type of text used. Therefore, before deciding
on models, algorithms, and ML techniques, professionals should not only know about
their performance but also understand the characteristics of the frameworks.
Table 9 compares the text mining techniques (models, algorithms/methods, and
tools) that have been proposed for user identification based on the type of dataset,
pre-processing, feature selection, classifier, and validation. It has been observed
that algorithms/methods are mostly validated against state-of-the-art machine
learning techniques, more so than the models and tools that have been identified.
The comparison based on pre-processing also shows that most of the
algorithms/methods used a pre-processing step, in contrast to tools and models.
In this review, only accuracy metrics are considered for assessing the
performance of text mining techniques. If a model or algorithm fails to reach the
minimum threshold in terms of accuracy, practitioners will reject it. Other
evaluation metrics such as propagation ability and accountability, which are ignored
in this review, should also be considered in addition to accuracy. Table 9
shows the discussion and comparison of all 31 papers selected in this review.
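The accuracy-based acceptance rule described above can be written down as follows. The threshold value of 0.8 is purely hypothetical, since the review does not state a specific minimum, and the function names and sample labels are ours.

```python
def accuracy(predicted, actual):
    """Share of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def accept(predicted, actual, min_accuracy=0.8):
    """Accept a technique only if it reaches the minimum accuracy.

    min_accuracy is a hypothetical threshold chosen for illustration.
    """
    return accuracy(predicted, actual) >= min_accuracy


predicted = ["a", "b", "a", "a"]
actual = ["a", "b", "b", "a"]
print(accuracy(predicted, actual), accept(predicted, actual))
# prints: 0.75 False
```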
Table 9 Comparison of text mining techniques
(√ marks: entries in the Pre-processing, Tools, Models, Algorithms/methods and Validation columns)

Research  Datasets                 Feature selection                         Classifier used           References  √ marks
R1        Facebook                 –                                         –                         [1]         √ √
R2        ISEAR                    BOW (Bag of words)                        Ensemble and Naïve Bayes  [3]         √ √
R3        Twitter                  POS tagging                               SVM and random forest     [4]         √ √ √
R4        Text document            TF and term weight                        SVM and Naïve Bayes       [26]        √ √ √ √
R5        Blogs and news reports   –                                         –                         [27]        √ √ √
R6        Reddit                   TF-IDF                                    –                         [28]        √ √ √ √
R7        Last.fm                  –                                         Ensemble and SVM          [25]        √ √
R8        Twitter and blogs        POS tagging                               SVM and Naïve Bayes       [29]        √ √
R9        Twitter                  –                                         –                         [30]        √ √
R10       Twitter                  –                                         –                         [20]        –
R11       Twitter                  TF-IDF                                    K-means                   [21]        √ √ √ √
R12       WITS                     N-grams                                   –                         [22]        √ √ √
R13       News and reports         POS tagging and named entity recognition  –                         [23]        √ √ √ √ √
R14       Corpus                   LDA                                       SVM                       [24]        √ √ √ √
R15       Twitter                  POS tagging                               K-means and EM            [8]         √ √ √ √
R16       Chat room conversations  Chi-square                                SVM and Naïve Bayes       [9]         √ √
R17       SinaWeibo                –                                         –                         [10]        –
R18       SinaWeibo                –                                         SVM                       [11]        √ √ √
R19       Blog data                SenticNet and POS tagging                 Ensemble                  [6]         √ √
R20       Yelp                     –                                         –                         [7]         –
R21       Twitter                  ReLF                                      Random forest             [12]        √ √ √
R22       Instant messages         POS tagging and feature based selection   SVM and neural networks   [15]        √ √
R23       SinaWeibo                Density based selection                   –                         [13]        √ √ √
R24       Email and blogs          TF-IDF                                    SVM                       [16]        –
R25       Twitter, email and IM    Stylometry                                –                         [17]        –
R26       Facebook and Twitter     –                                         –                         [18]        –
R27       Twitter                  BOW (Bag of words)                        SVM and Naïve Bayes       [19]        √ √ √
R28       Facebook and Twitter     Content based feature                     –                         [31]        –
R29       Facebook and Twitter     –                                         –                         [32]        √ √
R30       Facebook and Twitter     –                                         –                         [33]        –
R31       Email                    –                                         –                         [14]        –

5 Conclusion and Future Work

In this literature review, we have provided empirical evidence on the existence of a
mapping between identities of individuals across social media sites and studied
the possibility of identifying users across sites. Both link and content information
were used to identify users. A Systematic Literature Review (SLR) has been used to
select 31 studies published during 2010–2018. Identifying users across social
media sites opens the door to many interesting applications, such as analyzing usage
patterns across networks and studying user behavior. We have identified five different
user identification categories and the corresponding algorithms, methods, models, and
tools. Identifying different user behaviors has the potential to improve business and
resource management in OSNs.
As future work, we could investigate, for instance, recommendation systems
that exploit user behaviors to display more appropriate advertisements. We could
also exploit user behaviors to define different classes and develop more accurate
performance models for the service.

References

1. Gao H, Hu J, Wilson C, Li Z, Chen Y, Zhao BY (2010) Detecting and characterizing social spam
campaigns. In: Proceedings of the 10th ACM SIGCOMM conference on Internet measurement,
Nov 2010. ACM, pp 35–47
2. Tuna T, Akbas E, Aksoy A, Canbaz MA, Karabiyik U, Gonen B, Aygun R (2016) User
characterization for online social networks. Soc Netw Anal Mining 6(1):104
3. Perikos I, Hatzilygeroudis I (2016) Recognizing emotions in text using ensemble of classifiers.
Eng Appl Artif Intell 51:191–201
4. Sboev A, Litvinova T, Gudovskikh D, Rybka R, Moloshnikov I (2016) Machine learning
models of text categorization by author gender using topic-independent features. Proc Comput
Sci 101:135–142
5. Kitchenham B (2004) Procedures for performing systematic reviews, Keele, UK, Keele
University, vol 33, no 2004, pp 1–26
6. Poria S, Cambria E, Gelbukh A, Bisio F, Hussain A (2015) Sentiment data flow analysis by
means of dynamic linguistic patterns. IEEE Comput Intell Mag 10(4):26–36
7. Qian X, Feng H, Zhao G, Mei T (2014) Personalized recommendation combining user interest
and social circle. IEEE Trans Knowl Data Eng 26(7):1763–1777
8. Murkute AM, Gadge J (2015) Framework for user identification using writeprint approach.
In: 2015 international conference on technologies for sustainable development (ICTSD), Feb.
IEEE, pp 1–5
9. Amuchi F, Al-Nemrat A, Alazab M, Layton R (2012) Identifying cyber predators through
forensic authorship analysis of chat logs. In: 2012 third cybercrime and trustworthy computing
workshop (CTC), Oct. IEEE, pp 28–37
10. Wang J, Liu Z, Zhao H (2014) Group recommendation using topic identification in social
networks. In: 2014 sixth international conference on intelligent human-machine systems and
cybernetics (IHMSC), vol 1, Aug. IEEE, pp 355–358
11. Yin C, Xiang J, Zhang H, Wang J, Yin Z, Kim JU (2015) A new SVM method for short
text classification based on semi-supervised learning. In: 2015 4th international conference on
advanced information technology and sensor application (AITS), Aug. IEEE, pp 100–103

12. Meda C, Ragusa E, Gianoglio C, Zunino R, Ottaviano A, Scillia E, Surlinelli R (2016) Spam
detection of Twitter traffic: a framework based on random forests and non-uniform feature
sampling. In: 2016 IEEE/ACM international conference on advances in social networks analysis
and mining (ASONAM), Aug. IEEE, pp 811–817
13. Guo H, Chen Y (2016) User interest detecting by text mining technology for microblog
platform. Arab J Sci Eng 41(8):3177–3186
14. Zhang Y, He J, Xu J (2018) A new anti-spam model based on e-mail address concealment
technique. Wuhan Univ J Nat Sci 23(1):79–83
15. Ding Y, Meng X, Chai G, Tang Y (2011) User identification for instant messages. In: Neural
information processing. Springer Berlin/Heidelberg, pp 113–120
16. Ma J, Teng G, Chang S, Zhang X, Xiao K (2011) Social network analysis based on authorship
identification for cybercrime investigation. Intell Secur Inf 27–35
17. Frommholz I, Al-Khateeb HM, Potthast M, Ghasem Z, Shukla M, Short E (2016) On
textual analysis and machine learning for cyberstalking detection. Datenbank-Spektrum
16(2):127–135
18. Chavoshi N, Hamooni H, Mueen A (2016) Identifying correlated bots in twitter. In: International
Conference on Social Informatics, Nov. Springer International Publishing, pp 14–21
19. Santos I, Minambres-Marcos I, Laorden C, Galán-García P, Santamaría-Ibirika A, Bringas
PG (2014) Twitter content-based spam filtering. In: International joint conference SOCO’13-
CISIS’13-ICEUTE’13. Springer, Cham, pp 449–458
20. Zhou X, Wu B, Jin Q (2017) User role identification based on social behavior and networking
analysis for information dissemination. Future Gener Comput Syst
21. Qiu Z, Shen H (2017) User clustering in a dynamic social network topic model for short text
streams. Inf Sci 414:102–116
22. Sharef NM, Martin T (2015) Evolving fuzzy grammar for crime texts categorization. Appl Soft
Comput 28:175–187
23. Zaeem RN, Manoharan M, Yang Y, Barber KS (2017) Modeling and analysis of identity threat
behaviors through text mining of identity theft stories. Comput Secur 65:50–63
24. Liang J, Liu P, Tan J, Bai S (2014) Sentiment classification based on AS-LDA model. Proc
Comput Sci 31:511–516
25. Chelmis C, Prasanna VK (2013) Social link prediction in online social tagging systems. ACM
Trans Inf Syst (TOIS) 31(4):20
26. Manne S, Fatima SS (2012) An extensive empirical study of feature terms selection for text
summarization and categorization. In: Proceedings of the second international conference on
computational science, engineering and information technology, Oct. ACM, pp 606–613
27. Chakraborti S (2015) Multi-document text summarization for competitor intelligence: a
methodology based on topic identification and artificial bee colony optimization. In:
Proceedings of the 30th annual ACM symposium on applied computing, Apr. ACM, pp 1110–1111
28. Choi D, Han J, Chung T, Ahn YY, Chun BG, Kwon TT (2015) Characterizing conversation
patterns in Reddit: from the perspectives of content properties and user participation behaviors.
In: Proceedings of the 2015 ACM on conference on online social networks, Nov. ACM, pp
233–243
29. Inches G, Crestani F (2011) Online conversation mining for author characterization and topic
identification. In: Proceedings of the 4th workshop on workshop for Ph.D. students in
information & knowledge management, Oct. ACM, pp 19–26
30. Zhao Y, Liang S, Ren Z, Ma J, Yilmaz E, de Rijke M (2016) Explainable user clustering in short
text streams. In: Proceedings of the 39th international ACM SIGIR conference on research and
development in information retrieval, July. ACM, pp 155–164
31. O’Riordan S, Feller J, Nagle T (2016) A categorisation framework for a feature-level analysis
of social network sites. J Decis Syst 25(3):244–262
32. Son JE, Lee SH, Cho EY, Kim HW (2016) Examining online citizenship behaviours in social
network sites: a social capital perspective. Behav Inf Technol 35(9):730–747
33. Riedl C, Köbler F, Goswami S, Krcmar H (2013) Tweeting to feel connected: a model for social
connectedness in online social networks. Int J Hum-Comput Interact 29(10):670–687
