
DATA MINING IN DATA ANALYSIS FOR BUSINESS DECISION SUPPORT

IN WAREHOUSE MANAGEMENT WITH WEKA PROGRAM


by
C. Chanawuth
School of Business, University of the Thai Chamber of Commerce
126/1 Vibhavadee-Rangsit Road, Dindaeng,
Bangkok 10400, Thailand
Tel: (662) 697-6101-5,
E-mail: chanawuth_chi@utcc.ac.th
ABSTRACT
Data mining for business decision support in warehouse management can be carried out with Weka, an open source
program that analyzes data and supports various kinds of decision making. For example, data mining supports the
prediction of customer service cancellations and the analysis of product associations in transportation for warehouse
stock-keeping management. It is a tool that enables business decision making to gain an advantage over competitors.
Classification techniques, which can analyze both continuous and discrete data, are the most widely applied in data
mining. This research focuses on classification analysis and numerical prediction with Weka, which analyzes data by
developing a model to support decision making and business problem solving.
KEY WORDS
Data Mining, Classification, Weka program, Decision Support System
INTRODUCTION
In today's competitive business world, applying knowledge, experience and technology has become a significant
factor in gaining an advantageous position among competitors. Another important aspect that should not be overlooked
is data analysis. In analyzing data, knowledge and experience must be applied so that the results can support business
decision making. Because the data come in many kinds and in large amounts, data analysis requires a great deal of
time. If the analysts lack experience in choosing data, the analytic outcomes will be useless. Data mining is an aid
for analyzing large amounts of data (Power, 2008).
Data mining is a technique for finding significant data hidden in multiple data sources and for selecting only the
necessary data (Gargano and Raggad, 1999; Rafalski, 2002). It discovers both the associations hidden in the data
sources and new knowledge, through tools developed in the data mining process (Gargano and Raggad, 1999).
DATA ANALYSIS BY DATA MINING
Figure 1 presents the use of data mining techniques in data analysis. Data preparation is a time-consuming part of
this process. Sometimes, if the data are still unclear, the business understanding step has to be repeated. When the
data are clear, the next step is modeling with data mining techniques.
Business Understanding : analyzing the business processes, the parts where problems occur, and the parts that require
data to support decision making and analytical targeting.
Data Understanding : analyzing the data used in decision support and understanding the data formats, types, sources
and amounts to apply in the analysis.
Data Selection : selecting the data needed for the modeling process of decision support, which must be the data
relevant to the problem being solved.
Data Cleaning : checking data accuracy and correcting errors such as incomplete data or inaccurate numerical data.
Weka can display such erroneous data so that they can be corrected.
Data Transformation : adjusting all data into a format that is ready for analysis and modeling. Because the data may
come from various sources and have different data types, all data must be changed into the same format before a model
for decision support is developed.
Modeling : dividing the selected data into 2 categories, Training Data and Evaluation Data. The Training Data are used
to develop a model for decision support with data mining techniques. In this step, Weka applies the chosen algorithm
to process the data and presents the best results from the analysis.
Evaluation : after a model to support decision making has been developed, testing the model with the Evaluation Data
to check its accuracy.
Deployment Model : using the model for decision support with Unseen Data in Weka to aid in output evaluation.
FIGURE 1
DATA MINING WORKFLOW
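The cleaning, transformation and data division steps of the workflow can be sketched in a few lines of Python. This is only an illustration outside Weka: the function names, the 70/30 division and the sample records (including the deliberately incomplete one) are assumptions made for the example, not part of the study.

```python
import random

def clean(rows):
    """Data Cleaning: drop records that have a missing field."""
    return [r for r in rows if all(v not in ("", None) for v in r.values())]

def transform(rows, numeric_fields):
    """Data Transformation: turn strings such as '12,000' into numbers."""
    out = []
    for r in rows:
        r = dict(r)  # copy so the raw record is left unchanged
        for f in numeric_fields:
            r[f] = float(str(r[f]).replace(",", ""))
        out.append(r)
    return out

def split(rows, train_fraction=0.7, seed=0):
    """Modeling: divide the data into Training Data and Evaluation Data."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

# Hypothetical raw records; the third one is incomplete on purpose.
raw = [
    {"Age": "20", "Income": "12,000", "Type": "D"},
    {"Age": "18", "Income": "7,000", "Type": "A"},
    {"Age": "", "Income": "35,000", "Type": "D"},
    {"Age": "28", "Income": "6,000", "Type": "A"},
    {"Age": "32", "Income": "20,000", "Type": "D"},
]

prepared = transform(clean(raw), ["Age", "Income"])
training, evaluation = split(prepared)
print(len(prepared), len(training), len(evaluation))
```

The fixed seed only makes the division repeatable; any other division rule, such as cross-validation, could take its place.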
In data mining, Weka is a program that analyzes data to develop models that support decision making in business
warehouse management. There are many data mining techniques for model development. Among the most popular are
Classification, Clustering and Association Rule Discovery (Weka Machine Learning Project, 2010; Wass, 2007). For the
models supporting decision making in business warehouse management in this paper, classification is used for analyzing
nominal data and numeric prediction is used for numeric data. As shown in figure 2, model development with the
classification technique applies the training data to develop the model and tests the model with the evaluation data.
The model developed by Weka is then used in practice by applying it to unseen data, for which the answer class is not
yet known.
FIGURE 2
CLASSIFICATION WORKFLOW IN DATA MINING
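The three stages of this workflow (develop the model on training data, test it on evaluation data, apply it to unseen data) can be sketched as follows. The learner here is a deliberately simple one-attribute majority rule used as a stand-in for Weka's classifiers, and the records are hypothetical; only the shape of the workflow is the point.

```python
from collections import Counter, defaultdict

def train(rows, attribute, target):
    """Learn a one-attribute rule: the majority class for each attribute value."""
    counts = defaultdict(Counter)
    for r in rows:
        counts[r[attribute]][r[target]] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

def predict(model, attribute, row, default=None):
    """Look up the class for a record; unknown values fall back to default."""
    return model.get(row[attribute], default)

def accuracy(model, attribute, target, rows):
    """Share of evaluation records whose predicted class matches the known one."""
    hits = sum(predict(model, attribute, r) == r[target] for r in rows)
    return hits / len(rows)

# Hypothetical records: training and evaluation data carry the answer class.
training = [
    {"Product_A": 0, "Type": "D"}, {"Product_A": 1, "Type": "A"},
    {"Product_A": 0, "Type": "D"}, {"Product_A": 1, "Type": "A"},
]
evaluation = [{"Product_A": 0, "Type": "D"}, {"Product_A": 1, "Type": "A"}]
unseen = {"Product_A": 1}  # no answer class yet

model = train(training, "Product_A", "Type")
score = accuracy(model, "Product_A", "Type", evaluation)
label = predict(model, "Product_A", unseen)
print(score, label)
```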
CLASSIFICATION CASE STUDY: CUSTOMERS TYPE ANALYSIS
One significant problem in warehouse management is handling large product demand with limited space. If the types
of customers are known, ordering can be adjusted to suit the demand. For example, customers can be classified into 2
groups: A, representing premium customers, and D, representing general customers. Classification analysis processes
the customers' purchase records from previous visits to divide the customers into groups. The model supporting customer
classification is developed as a decision tree. The decision tree is built with the C4.5 algorithm, which is available
in Weka under the name J48; C4.4, a related variant of C4.5, is discussed by Jiang et al. (2009).
The customers' past purchase records are presented in table 1.
TABLE 1
THE DATA OF THE CUSTOMERS BUYING LISTS IN THE PAST
FOR CUSTOMER CLASSIFICATION MODELING
No. Sex Age Income Product_A Product_B Product_C Type
1001 Male 20 12,000 0 0 0 D
1002 Female 18 7,000 1 1 1 A
1003 Female 35 35,000 0 0 0 D
1004 Male 28 6,000 1 1 0 A
1005 Female 32 20,000 0 0 0 D
1006 Male 20 12,000 0 0 0 D
1007 Female 23 7,000 1 1 1 A
1008 Female 35 35,000 0 0 0 D
1009 Male 18 6,000 1 1 0 A
1010 Female 32 20,000 0 0 0 D
1011 Male 34 12,000 0 0 0 D
1012 Female 45 27,000 1 1 1 A
1013 Female 22 35,000 0 0 0 D
1014 Male 18 8,000 1 1 0 A
1015 Female 32 20,000 0 0 0 D
1016 Male 22 12,000 0 0 0 D
1017 Female 34 17,000 1 1 1 A
1018 Female 35 35,000 0 0 0 D
1019 Male 53 26,000 1 1 0 A
1020 Female 32 20,000 0 0 0 D
1021 Male 34 12,000 0 0 0 D
1022 Female 34 24,000 1 1 1 A
1023 Female 35 35,000 0 0 0 D
1024 Male 33 60,000 1 1 0 A
1025 Female 34 20,000 0 0 0 D
1026 Male 20 12,000 0 0 0 D
1027 Female 35 37,000 1 1 1 A
1028 Female 35 35,000 0 0 0 D
1029 Male 33 26,000 1 1 0 A
1030 Female 32 20,000 0 0 0 D
Remark: 1 represents a purchase and 0 represents no purchase of Product A, Product B and Product C.
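To load records like those in table 1 into Weka, they would typically be saved in a format Weka reads, such as ARFF. The sketch below renders the first three records as ARFF text. The relation name "customers", the helper to_arff and the choice to omit the No. column (an identifier, not a predictor) are assumptions made for the example; the paper does not state how the data file was prepared.

```python
def to_arff(relation, attributes, rows):
    """Render rows as ARFF text.

    attributes is a list of (name, kind) pairs, where kind is either the
    string 'numeric' or a list of the allowed nominal values.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        spec = kind if isinstance(kind, str) else "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {spec}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(v) for v in row))
    return "\n".join(lines)

attributes = [
    ("Sex", ["Male", "Female"]),
    ("Age", "numeric"),
    ("Income", "numeric"),
    ("Product_A", ["0", "1"]),
    ("Product_B", ["0", "1"]),
    ("Product_C", ["0", "1"]),
    ("Type", ["A", "D"]),  # the class attribute goes last
]

# Records 1001 to 1003 of table 1, without the No. column.
rows = [
    ("Male", 20, 12000, 0, 0, 0, "D"),
    ("Female", 18, 7000, 1, 1, 1, "A"),
    ("Female", 35, 35000, 0, 0, 0, "D"),
]

arff = to_arff("customers", attributes, rows)
print(arff)
```

Declaring the product columns and Type as nominal rather than numeric matters here, because a decision tree learner treats the two kinds of attribute differently.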
The unseen data, for which the model is to determine the customer type, are presented in table 2.
TABLE 2
THE DATA OF CUSTOMERS PURCHASING DECISION
FOR CUSTOMER CLASSIFICATION MODELING
No. Sex Age Income Product_A Product_B Product_C Type
1031 Male 17 5,000 1 0 1 ?
Remark: 1 represents a purchase and 0 represents no purchase of Product A, Product B and Product C.
The outcome of model development with the classification technique in Weka, following the workflow in figure 2, is
a decision tree built by the J48 algorithm, as shown in figure 3. Testing the data in table 2 against this tree gives
type A as the customer type.
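How a decision tree learner chooses its split on the data of table 1 can be illustrated with an information gain calculation. This is a simplified sketch: C4.5 uses the gain ratio rather than plain information gain, only the three product columns are examined, and the single-split tree in classify is an assumption chosen to agree with the stated result, since the tree of figure 3 is not reproduced in the text.

```python
from collections import Counter
from math import log2

# (Product_A, Product_B, Product_C, Type) for records 1001 to 1030 of table 1.
ROWS = [
    (0, 0, 0, "D"), (1, 1, 1, "A"), (0, 0, 0, "D"), (1, 1, 0, "A"), (0, 0, 0, "D"),
    (0, 0, 0, "D"), (1, 1, 1, "A"), (0, 0, 0, "D"), (1, 1, 0, "A"), (0, 0, 0, "D"),
    (0, 0, 0, "D"), (1, 1, 1, "A"), (0, 0, 0, "D"), (1, 1, 0, "A"), (0, 0, 0, "D"),
    (0, 0, 0, "D"), (1, 1, 1, "A"), (0, 0, 0, "D"), (1, 1, 0, "A"), (0, 0, 0, "D"),
    (0, 0, 0, "D"), (1, 1, 1, "A"), (0, 0, 0, "D"), (1, 1, 0, "A"), (0, 0, 0, "D"),
    (0, 0, 0, "D"), (1, 1, 1, "A"), (0, 0, 0, "D"), (1, 1, 0, "A"), (0, 0, 0, "D"),
]
NAMES = ["Product_A", "Product_B", "Product_C"]

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, col):
    """Reduction in class entropy obtained by splitting on column col."""
    labels = [r[-1] for r in rows]
    remainder = 0.0
    for value in set(r[col] for r in rows):
        subset = [r[-1] for r in rows if r[col] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder

gains = {name: information_gain(ROWS, i) for i, name in enumerate(NAMES)}
for name, g in gains.items():
    print(f"{name}: {g:.3f}")

def classify(record):
    """Assumed single-split tree on Product_A (not taken from figure 3)."""
    return "A" if record["Product_A"] == 1 else "D"

unseen = {"Product_A": 1, "Product_B": 0, "Product_C": 1}  # record 1031, table 2
print(classify(unseen))
```

In table 1, Product_A and Product_B each separate the two customer types completely, so they tie for the highest gain. A split on Product_A classifies record 1031 as type A, which matches the result stated above, whereas a split on Product_B would not.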
FIGURE 3
ANALYTICAL OUTCOME OF CLASSIFICATION: DECISION TREE
As a result, the number of customers of each type is known from their purchasing behavior, so product ordering and
storage can be managed in response to customer demand.
CLASSIFICATION: ANALYZING WAREHOUSE RENTAL
If warehouse rental cannot be managed or planned ahead, it will become a burden for the business. Warehouses
always need improvement, such as improving the air conditioners in the freezing rooms for frozen foods, increasing the
number of RFID units for product checking, improving the walls in the warehouses or resizing the storage rooms. All of
these can be used in data analysis for warehouse rental calculation. With numeric prediction, rental fees can be
predicted in advance by using linear regression in Weka to develop a model for decision support from the data in
table 3.
TABLE 3
WAREHOUSE RENTAL CALCULATION DATA
Area Size   Security Fee   RFID   Upgrades Partition   Upgrades Air   Rental (Baht)
3,529       9,191          6      0                    0              205,000
3,247       10,061         5      1                    1              224,900
4,032       10,150         5      0                    1              197,900
2,397       14,156         4      1                    0              189,900
2,200       9,600          4      0                    1              195,000
3,536       19,994         6      1                    1              325,000
2,983       9,365          5      0                    1              230,000
Remark: 1 represents an upgrade and 0 represents no upgrade for Upgrades Partition and Upgrades Air.
The outcome of the Weka analysis is presented in figure 4.
FIGURE 4
ANALYTICAL OUTCOME OF CLASSIFICATION: LINEAR REGRESSION
The equation produced by Weka is
Rental = (-26.6882 * Area Size) + (7.0551 * Security Fee) + (43166.0767 * RFID) + (42292.0901 * Upgrades Air)
         + (-21661.1208)    (1)
TABLE 4
THE DATA FOR MODEL TESTING
Area Size   Security Fee   RFID   Upgrades Partition   Upgrades Air   Rental (Baht)
3,529       9,191          6      0                    0              205,000
Rental
= (-26.6882 * 3,529) + (7.0551 * 9,191) + (43166.0767 * 6) + (42292.0901 * 0) + (-21661.1208)
= (-94182.6578) + (64843.4241) + (258996.4602) + (0) + (-21661.1208)
= 207,996.12 Baht
Applying the data of table 4 to equation (1) gives a predicted value of 207,996.12 Baht, while the actual value is
205,000.00 Baht. The difference between the 2 values is 2,996.12 Baht.
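The calculation can be checked with a short Python sketch that applies equation (1) to the record of table 4. The dictionary layout and the function name predict_rental are assumptions made for the example. The coefficients are the rounded values printed in the equation, so the result agrees with the reported figure only to within about 0.02 Baht.

```python
# Coefficients and intercept as printed in equation (1).
COEFFICIENTS = {
    "Area Size": -26.6882,
    "Security Fee": 7.0551,
    "RFID": 43166.0767,
    "Upgrades Air": 42292.0901,
}
INTERCEPT = -21661.1208

def predict_rental(record):
    """Apply the linear model: intercept plus the weighted sum of the inputs."""
    return INTERCEPT + sum(w * record[name] for name, w in COEFFICIENTS.items())

# The record of table 4; Upgrades Partition has no term in equation (1).
table4 = {
    "Area Size": 3529, "Security Fee": 9191, "RFID": 6,
    "Upgrades Partition": 0, "Upgrades Air": 0, "Rental": 205000,
}

predicted = predict_rental(table4)
difference = predicted - table4["Rental"]
print(f"predicted {predicted:,.2f} Baht, difference {difference:,.2f} Baht")
```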
CONCLUSION
The outcome of data analysis with data mining is a model that supports decisions in warehouse management and
helps to solve its problems. Apart from data, knowledge and experience, an efficient tool is necessary to support
ideas that move business processes ahead of competitors.
Weka, an open source program, is a starting point for applying data mining in any business that needs to discover
hidden information in large amounts of data, especially the warehouse business, which relies on managing limited
resources to gain the largest benefit.
REFERENCES
Gargano, M. L. and Raggad, B. G. (1999). Data mining: a powerful information creating tool. OCLC Systems & Services,
15(2), pp. 81-90.
Jiang, L., Li, C. and Cai, Z. (2009). Decision tree with better class probability estimation. International Journal
of Pattern Recognition & Artificial Intelligence, 23(4), pp. 745-763.
Lee, S. J. and Siau, K. (2001). A review of data mining techniques. Industrial Management & Data Systems, 101(1),
pp. 41-46.
Liu, J. I. C., Yun, D. Y. Y. and Klein, G. (1990). An Agent for Intelligent Model Management. Journal of Management
Information Systems, 7(1), pp. 101-122.
Petrușel, R. (2009). Collaborative Virtual Enterprise Environment and Decision Mining. Informatica Economica, 13(2),
pp. 59-67.
Power, D. J. (2008). Understanding Data-Driven Decision Support Systems. Information Systems Management, 25(2),
pp. 149-154.
Rafalski, E. (2002). Using data mining/data repository methods to identify marketing opportunities in health care.
Journal of Consumer Marketing, 19(7), pp. 607-613.
Wass, J. (2007). Weka Machine Learning Workbench. Scientific Computing, 24(3), pp. 21-47.
Weka Machine Learning Project (2010). Retrieved August 1, 2010, from
http://www.cs.waikato.ac.nz/~ml/weka/index.html
