
A Novel Data Mining Technique To Improve Business Intelligence

Muhammad Sheraz Arshad Malik1, Sadaf Safder2, Kiran Huma3, Bakhtawar Jabeen4


Department of Information Technology
Government College University
Faisalabad, Pakistan
sheraz_awan@gcuf.edu.pk1,sadafgcuf.24@gmail.com2

Ali Ur Rehman
Institute of Business & Management Sciences
University of Agriculture
Faisalabad, Pakistan
Ranaalirehman143@gmail.com

Muhammad Awais
Department of Software Engineering
Government College University
Faisalabad, Pakistan
awais_java@yahoo.com

Abstract— In the present era, data exploration in business intelligence has become a major problem. Information plays an important role in the business industry, and when data is not classified and segmented in some manner, data exploration becomes very difficult. The theme of this paper is a technique that extracts information and data patterns by using the clustering methodology of data mining. It differs from other tools, in which data are a combination of alike items and a suitable group must be provided for each cluster. In this method, a number of transactions are obtained from the internet by web mining. These transactions are passed through a cluster-based data mining algorithm, and a significant key is used to identify a priority combination of clusters from which information is extracted. This method is used in the business world to improve business productivity. The major concept of this paper is to extract multiple clusters with different clustering methods in order to improve business intelligence. In the future, we will try to improve the Red box approach by using a forecasting method.

Keywords—Business Intelligence (BI); Data Mining (DM); Analytic Service Model (ASM); Induction method; Association algorithm

I. INTRODUCTION

Business is growing rapidly through the World Wide Web. A large collection of data and information is exchanged through the internet [17]. This enables big organizations to complete their functions more rapidly and more accurately. All the data is stored in web documents and pages [16]. Web mining is a technique that directly extracts the important data from web documents and pages. This data is used to better understand marketing dynamics and to keep the organization up to date with existing trends. Red Box is a new technique used in business intelligence to make it more efficient. In the Red box technique, a clustering method is used to extract the information. Clustering is a data mining method that works on large databases. It is used to collect data of the same type or with the same attributes. The web mining method is used here to collect the data from web databases.

Business Intelligence (BI) enables an organization to act quickly on the basis of a large and dynamic database. Red Box plays a major role in improving business because it minimizes the time needed to reach a good decision. In the present era, organizations mostly use Business Intelligence based software. Such software makes the business more reliable and economical, as it minimizes expenditures, supports the promotion of the organization's products, and reduces the effort required of employees.

Red box uses the k-means model of the clustering method. The k-means model finds objects of the same priority. It partitions the data into groups whose members have the same characteristics. The aim of this paper is to store all the data according to date. Red box identifies the dates on which the organization's position was good or bad and also identifies the cause behind this. The Red box technique is used for decision making [6]. Decision making is a tree-like structure in which different factors are involved, such as storing data

in their own database (obtained by web mining), single domains, and different patterns.

II. RELATED WORK

TABLE I. COMPARISON OF DATA MINING TOOLS

Tool | Techniques | Function | Reference
SPSS Clementine | Decision trees, Neural Network, Bayesian Classification, Regression, Support Vector Machine | Classification, Estimation, Prediction, Affinity Grouping, Description, Time series data | [1]
DB-Miner | OLAP and attribute-oriented induction, Statistical Analysis, Progressive Deepening for mining multiple-level Knowledge, Meta-rule guided mining | Characterization, Comparison, Association, Classification, Prediction, Clustering | [2]
Data Mining Approach | Classification Method, NB, J48, MLP, SVM | Decision making, Classification, Comparison, Data cleaning, Noise Removal, Data scoping | [3]
WEKA | Data preprocessing and visualization, Attribute Selection, ONER, Decision tree, K-means, Cobweb, Association rule, Nearest Neighbor, Model Evaluation | Classification, Prediction, Comparison, Clustering, Association, Decision trees | [4]
XL-Miner | Association rule, Classification, Clustering, Prediction, Time series | Discriminant Analysis, Logistic Regression with best subset selection, Classification trees, Naïve Bayes Classifiers, Neural Network, K-Nearest Neighbor | [5]

III. MOTIVATION

The reasons for developing a new tool are as follows.

A. Data representation and storage
Data is found and organized in clusters. The cluster accesses the data and divides it into parts. All parts are organized in such a way that each part contains data of the same priority and characteristics. When a specific part of the data needs to be accessed, the data with those characteristics is presented. This makes data retrieval easy and simple.

B. Business intelligence
Red box improves business intelligence in such a way that the productivity of the organization increases and the rank of the organization rises. It improves business skills such as data recovery, data management, and data sharing, and it also manages the cost of data.

C. Data access from the web
The Red box technique accesses data directly from web pages by using web mining. Web mining refines the important data from web pages and documents. The data is accessed directly from the web and then stored in the form of clusters in the organization's own database by using the k-means model.

D. Cost controlling
It improves current trends in business technology by minimizing the cost of the product. It manages cost in a way that benefits the organization.

E. Data integration
Data is accessed from different sources, such as the organization's own database, web mining, data warehouses, and different websites. This makes it easy to analyze and obtain the relevant data from all of these sources.

F. Market Share
It improves market share by examining the data model: what the structure of the model is, how it works, and which basic components are necessary for business intelligence. It also examines how these components are used for data analysis to improve market share without any interference.

G. True Decision
It helps in making correct decisions. It focuses on module learning. Module learning develops good decision making, which is really needed for the improvement of business intelligence. For example, to understand a complicated programming problem, we must read it thoroughly; if the problem already exists in an organized module, we can easily understand it and solve it.

IV. APPLICATION OF REDBOX

Information is derived from the business, and its pattern is derived by applying data mining methods. These methods derive the structure of the database, which helps the user to understand it easily. Its usability increases because it is user-friendly. The Red box makes the structure of the database user-friendly in such a way that both the organization owner and the user fulfill their needs, which improves the profit of the organization. It also maintains the data structure of stored databases.

Association rule mining is the second application of Red box, in which it examines and evaluates the database. Each transaction comprises data on a client's purchases. This data consists of different items. Rule mining identifies the

relationships among these different items. It collects items of the same priority into a group whose support and confidence values are equal to or greater than the user-defined support and confidence. These user-defined minimum confidence and minimum support values are the two measures of interestingness in association rule mining. Rules that satisfy them are called interesting association rules. These rules denote a collection of different items. For example, suppose an organization wants to improve its market strategy [7]. It collects information about products that are very popular in the market. This information is obtained from interesting rules. The organization also gains information from non-interesting rules, such as which products are unpopular in the market, because this too is useful information for an organization owner who wants to increase profit.

Fig.2 is a data mining model in which different groups of products are identified by ID. The main concept of the Red box is to derive meaningful information from the existing database by using a data mining algorithm in which each transaction feeds the Business Intelligence.

This manuscript describes data mining, web mining, and active mining tools as an algorithm. Fig.2 shows different clusters of products from which the transacted items are derived; the Red box technique is then applied to find the date on which a transaction occurred. It also finds the reason why transactions occurred on that specific date. Fig.1 defines a design model of the information derived from overall transactions.

The basic aim of using a data mining tool is to improve the predefined business, which may be important for both the business owner and the client in making better decisions, so that they grow their business and gain more benefit. The Red box is a predefined prototype that represents both the Business Intelligence and the structure of the data mining algorithm. It is a client-oriented system. It helps users to find online products and buy them.

V. DATA MINING TECHNIQUE USED FOR BUSINESS INTELLIGENCE MODEL

The techniques of data mining are developed to work on existing information inside the Analytic Service Model (ASM). ASM is designed in such a way that it analyzes all types of information, and it is customer oriented. All kinds of data first pass through the Analytic Service Model; mining tools then find the sections of this data that serve as input to the algorithm of the data mining framework and also obtain the output. The data mining framework may be used only with standard dimension members (attribute dimensions and user-specific dimensions are not used). Only data associated with such members is presented as input to the Data Mining Framework. Thus, data required for analysis must be present in standard dimensions and is evaluated inside the cube. Next, examine which items a customer buys from the market and find the characteristics of these items to identify why they are bought together with other items. Data mining tools such as prediction, classification, affinity analysis, reduction, exploration, and visualization are all used to analyze the data and identify the characteristics needed to make clusters [8]. The database is introduced to maintain the data in a systematic way. Data in the database is organized in the form of tables. A table is a collection of rows and columns. Business Intelligence is a method used to access this data.

Data mining is an advanced technique required to get only the mandatory data from data warehouses and web mining. Data mining improves Business Intelligence through good decision-making methods [9].

This paper presents the application of Red box technology, which implements data mining tools in a business intelligence based model. The new idea of data mining technology is to design a good methodology by using data mining approaches. This methodology contains a collection of data transactions. It starts from identification of the problem and then carries out the transaction process shown in Fig.1.
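The affinity analysis mentioned above, together with the user-defined minimum support and minimum confidence from Section IV, can be illustrated with a minimal sketch. The sample transactions, the threshold values, and the function names (`support`, `confidence`, `interesting_rules`) are assumptions made for illustration and are not taken from the Red box implementation; only single-item rules are considered.

```python
from itertools import combinations

# Hypothetical sample transactions (not data from the paper); each is a set of items.
transactions = [
    {"milk", "tea", "sugar"},
    {"milk", "bread", "jam"},
    {"milk", "tea", "sugar", "bread"},
    {"tea", "sugar"},
    {"bread", "jam"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of antecedent and consequent together, divided by support of antecedent."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), transactions) / base


def interesting_rules(transactions, min_support, min_confidence):
    """Return single-item rules A -> B that meet both user-defined thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        for ante, cons in ((a, b), (b, a)):
            s = support({ante, cons}, transactions)
            c = confidence({ante}, {cons}, transactions)
            if s >= min_support and c >= min_confidence:
                rules.append((ante, cons, s, c))
    return rules


# Threshold values are illustrative assumptions.
rules = interesting_rules(transactions, min_support=0.4, min_confidence=0.8)
for ante, cons, s, c in rules:
    print(f"{ante} -> {cons}: support={s:.2f}, confidence={c:.2f}")
```

Rules that fall below either threshold are the "non-interesting" ones in the paper's terminology; lowering the thresholds in this sketch would surface them as well.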

[Figure: block diagram. Labels as they appear, top to bottom: Business Decision; Data Extraction, Red Box, Knowledge Building; BI Data Analysis Model; Data Mining Algorithm; Outlet Sales, Data Storage, OLTP; Source Systems; Data Extraction, Transaction & Loading, Staging.]

Fig. 1. Working methodology of the Red box [10]
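The paper states that Red box uses the k-means model in its data mining stage. The following is a minimal sketch of plain k-means on two-dimensional points, not the authors' implementation. The customer features (visits per month, average basket value), the sample values, and the initial centroids are hypothetical, and the function name `kmeans` is chosen here for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on 2-D points with caller-supplied initial centroids."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            distances = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for cluster, old in zip(clusters, centroids):
            if cluster:
                mean_x = sum(p[0] for p in cluster) / len(cluster)
                mean_y = sum(p[1] for p in cluster) / len(cluster)
                new_centroids.append((mean_x, mean_y))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical customers: (visits per month, average basket value).
customers = [(1, 10), (2, 12), (1, 11), (8, 50), (9, 52), (10, 49)]
centroids, clusters = kmeans(customers, centroids=[(1, 10), (10, 49)])
print(clusters)
```

Supplying the initial centroids explicitly keeps the sketch deterministic; a practical implementation would usually choose them at random or with a seeding heuristic.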

[Figure: product groups, each listing P-Id and item code, and a sample customer transaction.]

Grocery (P-Id): G001 RIC, G002 KDL, G003 SGR, G004 OIL, G005 SOA
Vegetable (P-Id): V001 POT, V002 ONI, V003 GIN, V004 LFN, V005 CAR
Milk Product (P-Id): MP001 GDL, MP002 NBT, MP003 CHS, MP004 KLF, MP005 ICC
Oil (P-Id): OL001 SFO, G002 MSO, G003 GNO, G004 OLV, G005 DPO

CUST-ID 74251
Trans-ID | P-ID | ITEMS
SD0213 | G002 | KDL
SD0213 | G003 | SGR
SD0213 | MP003 | CHS
SD0213 | MP004 | KLF
SD0213 | G004 | OLV

Tran-ID, Data of Transaction, Report and Analysis

Fig. 2. Example of a data mining model with different groups of clusters [11]
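As a rough illustration of how the transacted items in the figure above map back to their product groups, the sketch below stores the figure's groups in a lookup table and sorts the items of transaction SD0213 into clusters. The P-IDs and item codes are copied as they appear in the figure; the data layout and the names `CATALOG`, `GROUP_OF`, and `cluster_transaction` are assumptions made for this example. Because some P-IDs repeat across groups in the figure as extracted, the lookup is keyed on the (P-ID, item code) pair rather than on the P-ID alone.

```python
from collections import defaultdict

# Product groups as listed in the figure: group name -> (P-ID, item code) pairs.
CATALOG = {
    "Grocery": [("G001", "RIC"), ("G002", "KDL"), ("G003", "SGR"),
                ("G004", "OIL"), ("G005", "SOA")],
    "Vegetable": [("V001", "POT"), ("V002", "ONI"), ("V003", "GIN"),
                  ("V004", "LFN"), ("V005", "CAR")],
    "Milk product": [("MP001", "GDL"), ("MP002", "NBT"), ("MP003", "CHS"),
                     ("MP004", "KLF"), ("MP005", "ICC")],
    "Oil": [("OL001", "SFO"), ("G002", "MSO"), ("G003", "GNO"),
            ("G004", "OLV"), ("G005", "DPO")],
}

# Reverse lookup from a (P-ID, item code) pair to its group.
GROUP_OF = {entry: group for group, entries in CATALOG.items() for entry in entries}

# Transaction SD0213 of customer 74251, as listed in the figure.
TRANSACTION = {
    "cust_id": "74251",
    "trans_id": "SD0213",
    "items": [("G002", "KDL"), ("G003", "SGR"), ("MP003", "CHS"),
              ("MP004", "KLF"), ("G004", "OLV")],
}


def cluster_transaction(transaction):
    """Group the item codes of one transaction by product group."""
    clusters = defaultdict(list)
    for entry in transaction["items"]:
        clusters[GROUP_OF.get(entry, "Unknown")].append(entry[1])
    return dict(clusters)


report = cluster_transaction(TRANSACTION)
print(report)
```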

A. Web Mining
Web mining [16; 17] is a data mining technique used to mine important data patterns from the web. These data patterns represent the behavior and interests of users. It also reveals which data patterns are popular among users. It identifies the relationships between data items, which improves the structure of the strategy. This structure helps the owner to make better decisions. Web mining improves business intelligence: it mines important information from the web, uses this information as input, stores it in the database, and produces output when required.
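The flow described above (mine data from a web page, store it in a database, and query it when required) can be sketched with the Python standard library. The HTML fragment, the table layout, and the column names are hypothetical; real pages use different markup, and this is not the extraction method used by the authors.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page fragment standing in for a mined web page.
PAGE = """
<table id="sales">
  <tr><td>2016-03-01</td><td>milk</td><td>3</td></tr>
  <tr><td>2016-03-01</td><td>sugar</td><td>1</td></tr>
  <tr><td>2016-03-02</td><td>tea</td><td>2</td></tr>
</table>
"""


class RowExtractor(HTMLParser):
    """Collects the text of each <td> cell, one list per <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr" and self._row:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())


parser = RowExtractor()
parser.feed(PAGE)

# Store the mined rows in an in-memory database, keyed by sale date.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, item TEXT, quantity INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [(date, item, int(qty)) for date, item, qty in parser.rows],
)

# Retrieve output when required: total quantity sold per date.
per_date = conn.execute(
    "SELECT sale_date, SUM(quantity) FROM sales GROUP BY sale_date ORDER BY sale_date"
).fetchall()
print(per_date)
```

Storing the sale date alongside each item follows the paper's stated aim of organizing data by date, so that later analysis can compare dates.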
identifies the relationship between items of data which

B. Clustering Methodology
Clustering is the main technology of data mining. It gathers items into groups to form clusters, but it differs from classification in that it does not use predefined groups of data. Instead, it is useful for data extraction and finds the clusters that the data itself forms. For example, in a supermarket a cluster is made of the items that customers mostly purchase together with other items. As another example, consider the sales transactions of a superstore. When its clients are grouped by sale date, different clusters emerge: clients who purchase vegetables and oil frequently, clients who purchase meat every time, clients who buy baby food and milk, etc. [10].

C. Decision Trees
Decision trees [11; 12] are especially useful for decision making. They work as a flowchart for making a decision. These decisions create rules for grouping data. Decision trees help to generate an optimized path to a better decision with a minimum number of steps. An example is Classification and Regression Trees (CART). CART gives directions for evaluating which group of data in a new dataset will have an appropriate outcome. Decision trees also help a new user to find a particular target on the web.

Decision trees are well suited to decision making because they group customers and products on the basis of priority and shared attributes, which permits analysis of all kinds of datasets. The classification method is applied to characterize a person, object, transaction, or event and to divide it into groups. Suppose a supermarket owner wants to divide customers into three groups:

• Faithful
• Possible to leave
• Likely to leave

If the owner has stored data about customers' characteristics and buying patterns, then a classification model can determine which category each customer belongs to [14].

D. Rule Induction
Rule induction [15] is the derivation of important if-then rules from datasets on the basis of statistical significance. It explains the statistical correlation among the occurrences of items in a dataset. Association rule mining is an important type of data mining that uses the rule induction technique. It is used to derive data and identify the relationships between different items of the dataset. It helps in identifying the purchasing attitudes and trends of customers.

E. Association Algorithm
The association algorithm is used for recommendation engines, which depend on analysis of products and the market. This helps customers to select items on the basis of what they purchased earlier. The model is constructed from a database that contains identifiers both for individual cases and for the collections of items that the cases contain. Such a collection of items is called an itemset. The algorithm scans the data to find the items that appear in a case. MINIMUM-SUPPORT specifies the minimum number of cases that must contain an itemset. Market analysis is also called associative analysis. Popular items that are commonly used together are analyzed by using associative analysis. For example, in a supermarket a cluster is made of the items that customers mostly purchase together with other items. A supermarket owner would not be surprised to learn that a customer purchases milk, tea, sugar, bread, and jam together.

Associative techniques help the customer to find groups of items that are related to each other and to purchase these things together. They also help the owner to place related items in the right place. The data mining techniques described above are web mining, the clustering method, decision trees, rule induction, and the association algorithm [14]. The aim of describing all these elements is to cover all aspects of the Red box technique that may affect the trends and behaviors of business intelligence, for example elements like input and output forms.

VI. CONCLUSION AND FUTURE WORK

This paper defines the Red box technique, clustering methodology, and web mining, and also explores data mining tools to describe how to form clusters by using clustering technology. It explores how to recognize the data patterns of a dataset or among a collection of items. Applying the Red box approach improves business intelligence. It highlights current business trends, which helps the user to make better decisions that increase the profit of the business. It is suitable for situations in which a customer needs specific items from a collection of items, and it helps the customer to predict the items they want on the basis of previously purchased items. It is best suited to databases with a uniform data structure that perform multiple business tasks.

In the future, the forecasting technique [18] of data mining will be used to acquire more accurate data. When the possibility of predicting the data increases, the algorithm of the forecasting method will be used to suggest more accurate items.

ACKNOWLEDGMENT

The authors would like to acknowledge the Department of Information Technology of Government College University Faisalabad for its support.

REFERENCES
[1] Pushpalata Pujari, Jyoti Bala Gupta, “Exploiting Data Mining Techniques For Improving Efficiency Of Time Series Data Using SPSS-Clementine,” Journal of Arts, Science & Commerce, ISSN 2231-4172.
[2] Jiawei Han, Jenny Y. Chiang, Sonny Chee, Jianping Chen, Qing Chen, Shan Cheng, Wan Gong, Micheline Kamber, Krzysztof Koperski, Gang Liu, Yijun Lu, Nebojsa Stefanovic, Lara Winstone, Betty B. Xia, Osmar R. Zaiane, Shuhua Zhang, Hua Zhu, “DBMiner: A System for Data Mining in Relational Databases and Data Warehouses.”
[3] Shamini Raja Kumaran, Mohd Shahizan Othman, Lizawati Mi Yusuf, “Data Mining Approaches In Business Intelligence: Postgraduate Data Analytic,” Jurnal Teknologi (Sciences & Engineering) 78: 8-2 (2016) 75-79.
[4] Zdravko Markov, Ingrid Russell, “An Introduction to the WEKA Data Mining System.”
[5] B. Sangameshwari, P. Uma, “A Survey on Data Mining Techniques In Business Intelligence,” International Journal Of Engineering And Computer Science, Volume 3 Issue 10, October 2014, Page No. 8575-8582.

[6] Lior Rokach, Oded Maimon, Clustering Method, Department of Industrial Engineering, Tel-Aviv University.
[7] mari, (2008), Data Mining for Retail Website Design and Enhanced Marketing, PP-2.
[8] Galit Shmueli, Nitin R. Patel, Peter C. Bruce, (2010), Data Mining for Business Intelligence: Concepts, Techniques, and Applications in Microsoft Office Excel® with XLMiner®, 2nd Edition.
[9] Prabhu, N. Anbazhagan, (2014), A New Hybrid Algorithm for Business Intelligence Recommender System, International Journal of Network Security & Its Applications (IJNSA), Vol. 6, No. 2, pp. 43-52.
[10] Tapan Nayak, “RedBox-A Data Mining Approach for Improving Business Intelligence,” International Journal of Computer Science Trends and Technology (IJCST) – Volume 4 Issue 2, Mar-Apr 2016.
[11] Tapan Nayak, “RedBox-A Data Mining Approach for Improving Business Intelligence,” International Journal of Computer Science Trends and Technology (IJCST) – Volume 4 Issue 2, Mar-Apr 2016.
[12] Pat Langley and Herbert A. Simon, “Applications of Machine Learning and Rule Induction,” Communications of the ACM, 38(11):54–64, 1995.
[13] Agnar Aamodt and Enric Plaza, “Case-Based Reasoning: Foundational Issues, Methodological Variations, and System Approaches,” AICom Artificial Intelligence Communications, 7(1):39–59, 1994.
[14] Dan Sullivan, (2012), Next Generation Business Intelligence: Data Mining [online]
http://www.tomsitpro.com/articles/business_intelligence
data_mining_tools -data_analytics-spssolap_analysis
[15] Pat Langley and Herbert A. Simon, “Applications of Machine Learning and Rule Induction,” Communications of the ACM, 38(11):54–64, 1995.
[16] Jaideep Srivastava, Prasanna Desikan, Vipin Kumar, “Web Mining Concepts, Applications, and Research Directions.”
[17] R. Munilatha, K. Venkataramana, “A Study on Issues and Techniques of Web Mining,” IJCSMC, Vol. 3, Issue 5, May 2014, pg. 331–341.
[18] J. Scott Armstrong, Roderick J. Brodie, “Forecasting for Marketing,” reprinted from Quantitative Methods in Marketing, edited by Graham J. Hooley and Michael K. Hussey (London: International Thomson Business Press, 1999), pages 92-120.

