Credit Risk Handling in Telecommunication Sector
Monika Szczerba and Andrzej Ciemski
Warsaw University of Technology, Institute of Computer Science,
Nowowiejska 15/19, 00-665 Warsaw, Poland
M.Szczerba@stud.elka.pw.edu.pl,
A.Ciemski@ii.pw.edu.pl
Abstract. This article presents an application of data mining methods
in the telecommunication sector. This sector has become a new area of research
for solving particular problems, e.g. churn prediction, cross-selling and
up-selling marketing campaigns, fraud detection, customer segmentation and
profiling, data classification, association rule discovery, data clustering,
parameter importance analysis, etc. Credit risk prediction has become a new
research domain in the pattern recognition area, aimed at finding the most
risky customers. This article is devoted to assessing credit risk from the
moment of opening a customer account to the moment of closing the account
due to non-payment. Algorithms are used to identify the insolvency of
a debtor. Credit scoring is presented in the form of activation models,
which are used to predict customers' debt as well as to indicate clients
with the highest, medium and smallest credit risk. The practical part of the
article is based on a real customer database of a telecommunication
company.
Introduction
The research issue of risk and risk management is a growing scientific domain.
The term risk has been introduced in many areas of social, economic and
scientific life. First of all, the term can be found in finance, banking,
insurance and medicine [1][2][3]. Some authors are trying to build a common risk
theory, but risk theory is still placed in a specific context, e.g. insurance
or banking [4][5][6]. There is a need for risk research on business-related
issues concerning crucial business processes. Risk research has become
especially relevant for companies whose core business is the provision of
services, e.g. telecommunication services.
For the telecommunication sector, churn analysis and the proper use of data
analysis methods have become especially crucial [7][8]. In recent years, credit
risk analysis for individual and business customers in the activation process
has also become significant from the point of view of the operational processes
of telecommunication companies. The authors of this article see a possibility
of using machine learning and data mining methods in pattern recognition as a
new approach to finding the most risky customers. This article presents (based
on the authors' business practice) the outcomes of the research work and the
approach applied in commercial projects for telecommunication companies.

P. Perner (Ed.): ICDM 2009, LNAI 5633, pp. 117-130, 2009.
© Springer-Verlag Berlin Heidelberg 2009
Technological development within the telecommunications area has progressed
significantly over recent years. This is connected with the growth in
competitiveness within this sector. Telecommunication companies are trying to
win as many customers as they can by offering them many attractive deals and
products. Finding new customers becomes more difficult, though, as the market
gets more saturated. Customers who respond to those offers are very valuable,
because they generate profit for the company. Unfortunately, among them there
are some who fail to pay their bills and put the company at risk of
considerable losses. To minimize this risk, companies can take precautions by
using data mining methods.
Data mining helps to identify customers who may fail to pay their bills.
To measure a customer's credit risk level it is essential to analyze
the activation data. The results of this analysis are a fundamental part of the
process aiming to prevent the company from accumulating bad debt. Among the
tools used for data analysis we can distinguish predictive models, which are
the most important part of the analysis process. Predictive models help to
identify customers with higher, lower and the lowest credit risk.
This article discusses issues concerning credit risk as well as methods for
detecting customers who may default on their payments after the activation
process. Each of the chapters is described below.
Chapter two introduces credit risk. It begins with an explanation of the term
and how it is usually understood, and it ends with a description of the credit
risk types on which the predictive models were based. The issues discussed form
the fundamental knowledge needed to understand the problem of credit risk and
are a good introduction to the next chapters. Chapter three describes
classification trees as one of the data mining methods used to analyze the
credit risk problem. Classification trees are a very well-known method used in
commercial projects. Chapter four introduces the data population which was
prepared for credit risk analysis. Chapter five describes two of the seven
predictive models detecting customers with credit risk among the population.
The first one is an activation model predicting credit risk based on the whole
customer population and the second one predicts credit risk for individual
customers. Chapters six and seven present a summary of results and future
research plans.
Credit Risk
The term credit risk is used in both everyday and scientific language. It can
also be interpreted differently depending on the type of economic activity.
Usually the term means the risk that goals will not be attained. The meaning of
credit risk is not the same across different sectors such as banking,
insurance, telecommunication, energy, industry and the public sector. For the
telecommunication area, credit risk means a decrease in potential profits, a
lack of cash flow and financial difficulties which can lead the company to
bankruptcy.
The telecommunication sector is one of the most competitive markets, because it
changes very quickly by bringing in new technologies and creating new business
models. The more knowledge a company has about its customers, the more
attractive a deal it can offer, and in this way gain a competitive advantage.
Credit risk is sometimes defined as the possibility that a customer will
not keep to the contract conditions and will create a financial loss for the
company. As the sector changes and develops so quickly, it is important to find
methods to lower financial risk and protect the company's profits.
Data mining methods have become an effective solution to this problem.
Telecommunication companies have made several attempts to control risk, for
example by verifying customers' credit history before signing a contract with
them and by imposing a deposit on customers with lower credit reliability. The
deposit is a guarantee for the company in case the customer refuses to pay.
These risk management activities are called credit scoring, which assesses a
customer's capability to fulfil the financial obligations of a contract with
the company.
The term credit scoring defines a customer's credit risk. Credit scoring can
be divided into application scoring and behavioral scoring. Application scoring
is used when a customer is entering into a contract with a service provider.
This process includes filling in the application form (address, age, sex).
Credit scoring can also be used after signing the contract, when it is based on
behavioral customer data (for example the history of payments). This is called
behavioral scoring.
Special treatment of customers with a high credit risk allows the company to
minimize its financial losses. It is also very important to have all the
necessary information when setting a credit limit for a new customer. Proper
identification of the customer and scoring of his or her risk level (high,
medium, low) lets the company lower its credit risk. Classification trees are
used here to model credit risk.
Decision Trees
Decision trees are one of the data mining methods used to classify all
observations of a population into groups.
Decision trees, also known as classification trees, consist of nodes and edges
called branches. A node is called a predecessor if it has branches connected to
other nodes, called successors. If a successor has no outgoing branches, it is
known as a final node (leaf). Successors are created as a result of a decision
rule in a node, which splits observations into two groups and sends them to the
successors.
Construction of classification trees is based on a sample of data (population).
A decision rule divides observations from the sample and assigns them to new
nodes. Each node is characterized by the diversity of its observations. All
observations form a population, whose parts are also known as classes or
groups. This population consists of the observation vectors described below:
\[
\begin{aligned}
x_{11}, x_{12}, \ldots, x_{1n_1} &\quad \text{from class (population) } 1,\\
x_{21}, x_{22}, \ldots, x_{2n_2} &\quad \text{from class (population) } 2,\\
&\;\;\vdots\\
x_{g1}, x_{g2}, \ldots, x_{gn_g} &\quad \text{from class (population) } g,
\end{aligned}
\]

where $x_{ki} = (x^{1}_{ki}, x^{2}_{ki}, \ldots, x^{p}_{ki})$ is the i-th
observation of the k-th class. The values of an observation come from a
p-dimensional set $D$, i.e. $x_{ki} \in D$. In other words, the sample is a
sequence of n ordered random pairs, which can be written as
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, where
$n = n_1 + n_2 + \ldots + n_g$, $x_i$ denotes the i-th observation and $y_i$ is
its class label. The decision rule is based on this sample of data, which is
also known as the learning sample. The learning sample consists of g
subsamples, each of which consists of observations from one class (group).
Once observations have been assigned to a node, a split criterion is
established for that node. If all observations in a node belong to the same
class, the node becomes a final node carrying the class label and shows how
many observations it contains. The main goal of classification trees is to
predict the classification of new observations on the basis of the division
rules derived from the learning sample.
Each classification tree consists of subtrees, which are parts of the main
tree. A subtree of a tree T is a tree which is a part of T. The main goal of
the split rules is to divide the learning sample into two groups in such a way
that the observations assigned to each new group are as similar to each other
as possible.
The most popular rules used in the automatic creation of classification
trees are the misclassification rate, the Gini index, entropy reduction and the
Chi-square test [9][10].
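The split rules named above can be illustrated with a short sketch. It is only a minimal example under stated assumptions: the node counts are invented, the function names are hypothetical, and the formulas are the textbook definitions of the measures, not necessarily the exact variants used by the tool applied in this study.

```python
import math


def proportions(counts):
    """Class proportions of a node given its per-class counts."""
    total = sum(counts)
    return [c / total for c in counts]


def misclassification(counts):
    """Share of observations outside the majority class."""
    return 1.0 - max(proportions(counts))


def gini(counts):
    """Gini index: 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in proportions(counts))


def entropy(counts):
    """Shannon entropy (base 2) of the class distribution."""
    return -sum(p * math.log2(p) for p in proportions(counts) if p > 0)


def impurity_reduction(impurity, parent, children):
    """Parent impurity minus the size-weighted impurity of the children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * impurity(child) for child in children)
    return impurity(parent) - weighted


def chi_square(children):
    """Pearson chi-square statistic of the child-by-class contingency table.

    Assumes every child and every class has at least one observation.
    """
    row_totals = [sum(child) for child in children]
    col_totals = [sum(col) for col in zip(*children)]
    n = sum(row_totals)
    stat = 0.0
    for i, child in enumerate(children):
        for j, observed in enumerate(child):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical node with (good, bad) counts and one candidate binary split.
parent = [90, 10]
left, right = [60, 2], [30, 8]

for name, measure in [("misclassification", misclassification),
                      ("Gini", gini),
                      ("entropy", entropy)]:
    gain = impurity_reduction(measure, parent, [left, right])
    print(f"{name} reduction: {gain:.4f}")
print(f"chi-square statistic: {chi_square([left, right]):.4f}")
```

A tree-growing procedure would evaluate such a measure for every candidate split of a node and keep the split with the largest impurity reduction (or the most significant chi-square statistic).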
The Data and Variables
The database of the telecommunication company consists of individual and
business customers who signed a contract with the company between 1 January
2007 and 31 March 2008. During this period of 15 months the customers were
observed and their payment behaviour was examined.
The database includes 53433 observations, which make up the customer
population. This population consists of 48663 customers who pay their invoices
(good customers) and 4770 customers who had their contracts cancelled because
they failed to pay (bad customers). The customers were then divided into two
groups, individual and business, where the first one includes 35941
observations and the second one 17492 observations. Among the individual
customers there are 32486 good customers and 3455 bad ones. By analogy, in the
business group there are 16177 good payers and 1315 bad payers.
These two groups of customers were further divided by the type of service
they use, that is Internet and telephone. Table 1 presents the size of the
population of customers using each type of service. This division will be used
later to construct the predictive models.
Table 1. The size of population divided with regard to the customer type and service type

Population                        Amount  Good payers  Bad payers  Bad payers (%)
All population                     53433        48663        4770            8.93
Individual customers               35941        32486        3455            6.46
Individual customers - Internet    15595        14841         754            1.41
Individual customers - Telephone   20346        17645        2701            5.05
Business customers                 17492        16177        1315            2.46
Business customers - Internet       2631         2491         140            0.26
Business customers - Telephone     14861        13686        1175            2.20
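The arithmetic behind Table 1 can be checked with a few lines of code. The sketch below copies the counts from the table; the dictionary layout and function names are assumptions made for illustration. It computes two figures per segment: bad payers as a share of the whole population, which appears to be what the last column reports (small differences in the last digit may be due to rounding), and bad payers as a share of their own segment, which is the figure quoted later for the individual customer model.

```python
# (good payers, bad payers) per segment, copied from Table 1.
segments = {
    "All population": (48663, 4770),
    "Individual customers": (32486, 3455),
    "Individual customers - Internet": (14841, 754),
    "Individual customers - Telephone": (17645, 2701),
    "Business customers": (16177, 1315),
    "Business customers - Internet": (2491, 140),
    "Business customers - Telephone": (13686, 1175),
}

total = sum(segments["All population"])  # size of the whole population


def share_of_whole(name):
    """Bad payers of a segment as a percentage of the whole population."""
    return 100.0 * segments[name][1] / total


def rate_within(name):
    """Bad payers of a segment as a percentage of that segment."""
    good, bad = segments[name]
    return 100.0 * bad / (good + bad)


for name in segments:
    print(f"{name:34s} {share_of_whole(name):5.2f} % of all, "
          f"{rate_within(name):5.2f} % within segment")
```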
Credit Risk Models
Seven models were constructed to examine the customer population. Before
they are described in detail, their construction principles are presented.
The models were built on customers' activation data, recorded when signing a
contract with the telecommunication company; they are therefore called
activation models. The models include customers who signed a contract for 12,
24 or 36 months and chose telephone services, Internet services or both. The
contracts were signed between 1 January 2007 and 31 March 2008. The models were
built for individual and business customers, regardless of the number of
services on a customer's account.
Modeling includes customers who signed a contract between 1 January 2007
and 31 March 2008 and who failed to make their payments but whose contract was
not terminated. The main goal of the model was to predict the risk of payment
failure on the 75th day, which is when a customer's account is closed during
the debt collection process. Every model has a defined target function, called
reason of terminating a contract, which divides the customer population into
good payers (BWSFE value) and bad payers (WSFE value).
Customers who cancelled a contract within 10 days of the signing date (based
on special offer conditions), who did not receive the parcel with the telephone
or Internet device from a courier (which means that the activation process did
not start), or who were validated negatively were removed from the database
that was the basis of modeling. The contracts which were overload notes were
also eliminated from the database. The next principle concerns payments: not
all invoices had a corresponding payment during the period between 1 January
2007 and 31 March 2008, in spite of the invoices having a balance status in the
database and a corresponding balance identifier (which means there is a payment
connected to the invoice). Therefore, if as of 2008/03/31 there was no payment
recorded, it was assumed that the customer did not make a payment and had
become a debtor.
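The exclusion and labelling rules described above can be sketched in code. This is an illustration only: the record layout and every field name are hypothetical, the rule about overload notes is left out because the text does not define it, and the target label is simplified to whether a payment was recorded by the end of the observation window.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

WITHDRAWAL_DAYS = 10  # cancellation window mentioned in the text


@dataclass
class Contract:
    # All field names are hypothetical, chosen for this sketch.
    signed: date
    cancelled: Optional[date] = None   # customer-initiated cancellation date
    device_delivered: bool = True      # parcel received, activation started
    validated: bool = True             # False means negative validation
    payment_recorded: bool = True      # payment recorded as of 2008-03-31


def is_eligible(c: Contract) -> bool:
    """Apply the exclusion rules described in the text."""
    cancelled_early = (
        c.cancelled is not None
        and (c.cancelled - c.signed).days <= WITHDRAWAL_DAYS
    )
    return not cancelled_early and c.device_delivered and c.validated


def target(c: Contract) -> str:
    """Good payers are labelled BWSFE, bad payers WSFE (simplified rule)."""
    return "BWSFE" if c.payment_recorded else "WSFE"


contracts = [
    Contract(signed=date(2007, 1, 15)),
    Contract(signed=date(2007, 3, 1), cancelled=date(2007, 3, 5)),
    Contract(signed=date(2007, 6, 10), device_delivered=False),
    Contract(signed=date(2007, 9, 20), payment_recorded=False),
]

modelling_set = [(c, target(c)) for c in contracts if is_eligible(c)]
labels = [label for _, label in modelling_set]
print(labels)
```

In this toy input the second and third contracts are excluded (early cancellation, undelivered device), and the remaining two receive one label each.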
Seven activation models were constructed. The first one is based on the
whole customer population. Two activation models are based on the type of
customer (individual and business). Four activation models cover individual and
business customers divided further by service type. Two of the seven models are
presented below.
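The way the seven models arise from the customer and service types can be written down explicitly. The sketch below is purely illustrative; the labels are assumptions chosen for readability, not identifiers from the project.

```python
from itertools import product

CUSTOMER_TYPES = ["individual", "business"]
SERVICE_TYPES = ["Internet", "telephone"]

# One model for the whole population ...
models = [("all customers", "all services")]
# ... two models by customer type ...
models += [(customer, "all services") for customer in CUSTOMER_TYPES]
# ... and four models by customer type and service type.
models += list(product(CUSTOMER_TYPES, SERVICE_TYPES))

for customer, service in models:
    print(f"{customer} / {service}")
```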
5.1 Credit Scoring Application Model for Whole Population
The model analyses the whole population, in which bad payers account for
8.93 %. The percentage of bad payers is presented in figure 1. Good payers are
marked as BWSFE.
Fig. 1. Percentage of bad payers in relation to good payers in the whole population
Customers who are bad payers cause financial losses to the company. The
reason for these losses is the customers' failure to pay their bills. This
causes a lack of cash flow in the company and affects it negatively by creating
bad debt. Bad debt is very dangerous and might be the reason for the financial
collapse of the company. Therefore it is crucial for companies to detect such
customers and take preventive action.
Fig. 2. Amount of bad debt for individual and business customers
Bad debt occurrence is presented on the chart which describes the amount
of bad debt over the months (Fig. 2). The following lengths of time were taken
into consideration: 25, 45, 60, 75, 90, 120 and 180 days from the moment of
the customer's activation. The chart is based on the entire customer population.
The main aim of the model is to identify reasons for bad debt formation, which
will be used to predict the moment of deactivation caused by failure to pay on
the 75th day, which is when a customer's account is closed during the debt
collection process.
The whole population has been classified and arranged according to the
importance of the attributes shown in figure 3. Figure 3 presents a list of
variables with an importance measure calculated for each of them, based on the
training set (training column) and the validation set (validation column). The
last column, called Importance, shows a graphic representation of these
calculations, where the dark grey line is the variable estimation based on the
training set and the bright grey line is the estimation based on the validation
set.
Fig. 3. Classification of variable importance
According to the attributes described above, a classification tree was
constructed for the entire population. The classification tree is presented in
figure 4.
Fig. 4. Classification tree based on whole population
The colour of a node depends on the number of observations which affect the
target variable. The more observations determining non-payment there are, the
darker the node is. In addition, the thickness of a line can vary as well. It
depends on the number of observations in a branch in relation to the number of
observations in the root of the tree: the line is thicker when the number of
observations is bigger. The tariff, which made the first split in the root of
the tree, was chosen as the most important variable. The split method used here
was based on the Chi-square test.
The whole population has been divided into good and bad payers, which is
illustrated in the leaves of the classification tree. The leaves have been
analyzed and ordered from the most risky customers to the least risky
customers. The leaves in figure 5 are presented in order of the percentage of
the target variable they contain. The colours used in the figure describe which
type of set was used in the data analysis. A dark grey bar is the variable
estimation on the training set and a bright grey bar is the estimation based on
the validation set. For every decision rule the important results have been
presented, which in addition show in more detail the reality of the
telecommunication company's domain.
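Ordering leaves from the most to the least risky amounts to sorting them by their share of bad payers. A minimal sketch follows, with invented leaf names and counts that do not come from the study:

```python
# Hypothetical leaves: (leaf name, good payers, bad payers).
leaves = [
    ("leaf A", 900, 20),
    ("leaf B", 300, 120),
    ("leaf C", 500, 60),
]


def bad_share(leaf):
    """Share of bad payers among all observations in the leaf."""
    _, good, bad = leaf
    return bad / (good + bad)


# Most risky leaf first.
ranked = sorted(leaves, key=bad_share, reverse=True)

for name, good, bad in ranked:
    print(f"{name}: {100 * bad / (good + bad):.1f} % bad payers")
```

Computing the same ranking separately on the training and validation sets gives the two bars per leaf described above.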
The constructed model can be assessed with regard to how well it matches the
training data and the validation data. Figure 6 presents on the x-axis a model
assessment measure for different subtrees, where a subtree is a tree which was
built by pruning final nodes. The chart also presents the misclassification
rate of observations in the training and validation datasets. A subtree fits
the dataset better if its misclassification rate is closer to zero. If we look
closely at subtree 11 with 11 leaves, we see that it has the smallest
misclassification rate on the training and validation sets, which means that
this subtree fits the data best. Trees tend to fit the training data better
than the validation data, because decision rules are created to fit the
training data.
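The subtree comparison described above can be sketched as follows. The error values are invented for illustration and are not the values behind the chart; the function names are hypothetical. The tie-breaking rule in favour of fewer leaves is an assumption of this sketch, not something stated in the text.

```python
def misclassification_rate(actual, predicted):
    """Fraction of observations whose predicted class differs from the actual one."""
    wrong = sum(a != p for a, p in zip(actual, predicted))
    return wrong / len(actual)


def best_subtree(valid_error):
    """Subtree size with the smallest validation error (fewer leaves on ties)."""
    return min(valid_error, key=lambda leaves: (valid_error[leaves], leaves))


# Hypothetical misclassification rates per subtree, keyed by number of leaves.
train_error = {1: 0.090, 3: 0.085, 5: 0.082, 8: 0.080, 11: 0.078, 14: 0.077}
valid_error = {1: 0.091, 3: 0.087, 5: 0.084, 8: 0.083, 11: 0.081, 14: 0.083}

print("selected subtree:", best_subtree(valid_error), "leaves")
print(misclassification_rate(["WSFE", "BWSFE", "BWSFE", "BWSFE"],
                             ["BWSFE", "BWSFE", "BWSFE", "BWSFE"]))
```

In these made-up numbers the training error keeps falling as leaves are added while the validation error starts to rise, which is the usual reason for choosing the subtree on validation data.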
Fig. 5. The most risky customers
Figure 7 presents a model assessment which estimates how well the model fits
the data. The chart compares the set of observations in the validation dataset
with the estimated model results based on the training dataset. The y-axis
contains characteristics which depend on the observation frequency in various
groups.
Fig. 6. Misclassification rate chart based on training and validation data
The chart (figure 7) illustrates how the results (predictions) improve after
applying the model in comparison to the baseline, when no estimation is done.
A random model means making decisions at random. On the basis of this
chart it is easy to see how many times the probability of better results
increases when the model is applied. The model can also be compared with a base
model which does not use estimation. The baseline curve on the chart presents
results for a constant number of successes, which means the probability of
success on the validation dataset.
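The comparison with the baseline can be expressed as cumulative lift: the bad-payer rate among the customers scored as most risky, divided by the overall bad-payer rate. The sketch below uses invented scores and outcomes and a hypothetical function name; it assumes there are at least as many observations as bins and at least one bad payer.

```python
def cumulative_lift(scores, is_bad, n_bins=10):
    """Cumulative lift per bin, with customers ordered from highest to lowest score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    overall = sum(is_bad) / len(is_bad)  # baseline: rate without any model
    lifts = []
    for b in range(1, n_bins + 1):
        top = order[: round(b * len(order) / n_bins)]
        rate = sum(is_bad[i] for i in top) / len(top)
        lifts.append(rate / overall)
    return lifts


# Hypothetical model scores and observed outcomes (1 = bad payer).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
is_bad = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

lifts = cumulative_lift(scores, is_bad, n_bins=5)
print([round(value, 2) for value in lifts])
```

A lift above 1 in the first bins means the model concentrates bad payers among the highest-scored customers; the baseline corresponds to a constant lift of 1, and the cumulative lift always returns to 1 once the whole dataset is included.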
In addition, it has been tested how the node split criterion affects the
construction of a tree. Therefore two additional models were constructed, in
which the split criteria were the Gini index and entropy reduction. The results
are presented in figure 7.
Fig. 7. Results of comparison of credit scoring models with various node split criteria
The best result was achieved by the model with the Gini index as the split
criterion. It achieved the highest response in subsequent units. The second
best model is the one which used the Chi-square test as the split criterion.
The last one is the model which used entropy reduction.
5.2 Credit Scoring Application Model for Individual Customer Population
The model analyses the individual customer population, in which bad payers
account for 9.61 %. The percentage of bad payers is presented in figure 8. Good
payers are marked as BWSFE.
Fig. 8. Percentage of bad payers in relation to good payers in individual customers population
Customers who are bad payers cause financial losses to the company. The
reason for these losses is the customers' failure to pay their bills. This
causes a lack of cash flow in the company and affects it negatively by creating
bad debt.
Fig. 9. Classification of variable importance
The main aim of the model is to identify reasons for bad debt formation, which
will be used to predict the moment of deactivation caused by failure to pay on
the 75th day, which is when a customer's account is closed during the debt
collection process. The population has been classified and arranged according
to the importance of the attributes shown in figure 9.
Figure 9 presents a list of variables with an importance measure calculated for
each of them, based on the training set (training column) and the validation
set (validation column). The last column, called Importance, shows a graphic
representation of these calculations, where the dark grey line is the variable
estimation based on the training set and the bright grey line is the estimation
based on the validation set.
According to the attributes described above, a classification tree was
constructed for the individual customer population. The classification tree is
presented in figure 10.
Fig. 10. Classification tree based on individual customer population
The colour of a node depends on the number of observations which affect the
target variable. The more observations determining non-payment there are, the
darker the node is. In addition, the thickness of a line can vary as well. It
depends on the number of observations in a branch in relation to the number of
observations in the root of the tree: the line is thicker when the number of
observations is bigger. The tariff, which made the first split in the root of
the tree, was chosen as the most important variable. The split method used here
was based on the Chi-square test.
Fig. 11. The most risky customers
The population has been divided into good and bad payers, which is
illustrated in the leaves of the classification tree. The leaves have been
analyzed and ordered from the most risky customers to the least risky
customers. The leaves in figure 11 are presented in order of the percentage of
the target variable they contain. The colours used in the figure describe which
type of set was used in the data analysis. A dark grey bar is the variable
estimation on the training set and a bright grey bar is the estimation based on
the validation set. For every decision rule the important results have been
presented, which in addition show in more detail the reality of the
telecommunication company's domain.
The constructed model can be assessed with regard to how well it matches the
training data and the validation data. Figure 12 presents on the x-axis a model
assessment measure for different subtrees, where a subtree is a tree which was
built by pruning final nodes.
Fig. 12. Misclassification rate chart based on training and validation data
Fig. 13. Lift chart
The chart also presents the misclassification rate of observations in the
training and validation datasets. A subtree fits the dataset better if its
misclassification rate is closer to zero. If we look closely at subtree 15 with
15 leaves, we see that it has the smallest misclassification rate on the
training and validation sets, which means that this subtree fits the data best.
Trees tend to fit the training data better than the validation data, because
decision rules are created to fit the training data.
Figure 13 presents a model assessment which estimates how well the model fits
the data. The chart compares the set of observations in the validation dataset
with the estimated model results based on the training dataset. The y-axis
contains characteristics which depend on the observation frequency in various
groups.
The chart (figure 13) illustrates how the results (predictions) improve after
applying the model in comparison to the baseline, when no estimation is done.
A random model means making decisions at random. On the basis of this
chart it is easy to see how many times the probability of better results
increases when the model is applied. The model can also be compared with a base
model which does not use estimation. The baseline curve on the chart presents
results for a constant number of successes, which means the probability of
success on the validation dataset.
In addition, it has been tested how the node split criterion affects the
construction of a tree. Therefore two additional models were constructed, in
which the split criteria were the Gini index and entropy reduction. The results
are presented in figure 13.
The best result was achieved by the model with the Chi-square test as the split
criterion. It achieved the highest response in subsequent units. The second
best model is the one which used entropy reduction as the split criterion.
The last one is the model which used the Gini index.
Summary
The previous chapter described models and methods which support a company in
decision making. These models allow the profitability of a telecommunication
company to be calculated by measuring customer value and determining the level
of risk. Using these models we can divide customers into groups with high,
medium and low risk and examine their features as well as the choices they
make. In addition, these models can be used to prevent financial debt by, for
example, implementing a deposit policy. The deposit is usually imposed on
certain groups to which customers were assigned after data analysis.
The activation models presented here were used to predict customers' failure to
pay after the debt collection process had finished. The most significant
variable is the tariff chosen by an individual customer. This variable divides
the whole population of customers into two groups, where one is estimated to be
ninety percent good payers and the other ten percent bad payers.
The most risky customers share some common features. This group can be
divided into certain types of clients. Individual customers who may fail to pay
their bills activate the phone restriction service and do not apply for a phone
installation address. Existing business customers, on the other hand, do not
need to show a document of financial reliability when signing a new contract
with a telecommunication company. In addition, individual customers take
advantage of exceptions when signing up for a new contract. Exceptions usually
include an illegible copy of an identity document, an out-of-date national
judicial register entry or an illegible payment date on the bill. The
characteristic quality of a risky business customer is that they do not agree
to disclose full information about their financial obligations. Customers with
a high probability of failing to pay choose the highest Internet transfer and
the most expensive tariff with a special start price. The contract is usually
signed for 36 months. Furthermore, a credit limit of 300 Polish zloty increases
the financial risk for the company.
The lowest risk customers are individual customers who choose a low or medium
tariff or no specific tariff at all, who negotiate better offers, or who do not
get any credit limit at all.
Future Research
The aim of future research is to develop activation models for the 45th day
after the debt collection process has begun. If the customer fails to pay
within 45 days from the payment date, debt collection blocks the customer's
outgoing calls.
Therefore it is important to build models which would predict non-payment of
an invoice. The next topic for examination could be the analysis of other
activation models based on different split criteria in decision trees. The aim
of this project is to discover new rules which would be able to predict a
customer's ability to pay.
Another aim of future research might be further analysis of activation models
based on split methods other than those described earlier in the article. The
results of this new analysis could bring very useful information for
discovering new rules which identify customers creating financial risk for a
telecommunication company.
The telecommunication company which was used as a basis for the research in
this article is currently growing on the market and is aiming to increase the
number of its customers. Once it achieves a satisfactory number of clients, the
company will start to analyze customer data to prevent non-payment of invoices
and to improve the process of client verification.
References
1. Bluhm, C., Overbeck, L., Wagner, C.: An Introduction to Credit Risk Modeling.
Chapman & Hall/CRC (2002)
2. Keating, C.: Credit Risk Modelling. Palgrave (2003)
3. Gundlach, M., Lehrbass, F.: CreditRisk+ in the Banking Industry. Springer,
Heidelberg (2004)
4. Grandell, J.: Aspects of Risk Theory. Springer, Heidelberg (1991)
5. Lando, D.: Credit Risk Modeling: Theory and Applications. Princeton University
Press, Princeton (2004)
6. Bühlmann, H.: Mathematical Methods in Risk Theory. Springer, Heidelberg (1996)
7. Lu, J.: Predicting Customer Churn in the Telecommunications Industry: An
Application of Survival Analysis Modeling Using SAS. In: SUGI27 Proceedings,
Orlando, Florida (2002)
8. Hadden, J., Tiwari, A., Roy, R., Ruta, D.: Churn Prediction: Does Technology
Matter. International Journal of Intelligent Technology (2006)
9. Breiman, L., Friedman, J.H., Olshen, R.A., Stone, C.J.: Classification and
Regression Trees. Chapman & Hall/CRC (1984)
10. Hastie, T., Tibshirani, R., Friedman, J.H.: The Elements of Statistical Learning:
Data Mining, Inference, and Prediction. Springer, Heidelberg (2001)
