Pratama 1

Dionisius Pratama
Eric Hasmiraldi, S.Pd., M.Hum.
KU 201 – English II
April 18th 2018

“DATA MINING AND STATISTICS FOR DECISION MAKING”

Introduction

When people face a problem, they must decide how best to deal with it. The most important
resource for making that decision is data. With accurate and precise data, a person can decide
on the best way to solve the problem.

Suppose this person is the manager of a factory outlet that sells several types of clothes. His
store receives many visitors every day, and each visitor has a personal taste in clothes. One day,
he wants to know which type of clothes visitors buy most often, because he intends to order more
of that type for his store. This would be an easy job if the store had only 10 visitors each month:
he could simply make a list on paper and do the calculations there. The problem is that his store
is not visited by only 10 people; suppose it is visited by 10,000 people each day. How many sheets
of paper would he need for the calculations then? This is where data mining and statistics help
people deal with large amounts of data. Put simply, data mining and statistics provide people with
an accurate analysis of the data behind a problem.
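The manager's question can be illustrated with a few lines of Python. This is only a minimal sketch: the function name most_bought_type and the purchase list are hypothetical, and a real store would read its sales records from a database rather than from a short list.

```python
from collections import Counter


def most_bought_type(purchases):
    """Return (clothing_type, count) for the most frequently bought type."""
    counts = Counter(purchases)          # tally how often each type appears
    return counts.most_common(1)[0]      # the single most frequent entry


# Hypothetical purchase log: one entry per item sold (assumed data).
purchases = ["t-shirt", "jeans", "t-shirt", "jacket", "jeans", "t-shirt"]

best_type, best_count = most_bought_type(purchases)
print(best_type, best_count)  # prints: t-shirt 3
```

The same tally works whether the list holds 10 entries or 10,000, which is the point of letting a computer do the counting.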

Data Mining and Statistics: Definition

What is data mining? Data mining is the set of methods and techniques for exploring and
analysing data sets (which are often large), in an automatic or semi-automatic way, in order to find
among these data certain unknown or hidden rules, associations or tendencies; special systems output
the essentials of the useful information while reducing the quantity of data. Briefly, data mining is the
art of extracting information, that is, knowledge, from data.¹ Statistics is a method of collecting
and analysing data.² The size of the data is denoted by n. Together, these two methods give
precise and accurate information about a problem to the people who need it.

¹ Stéphane Tufféry, Data Mining and Statistics for Decision Making (Wiley, 2011), page 4.
² Jarkko Isotalo, Basics of Statistics (journal), page 1.

Combining Data Mining and Statistics

If data mining is so powerful, why should scientists combine the two methods? Consider one
simple case: the use of data mining in the business world. Today's business world is growing
stronger, bigger, and more varied. Because of this growth, developers and workers need more than
the quantitative data that statistics alone can give. The data they need are much more complex,
such as consumers' profiles, their interest in other products now and in the future, and even
their behavior.

To get precise and accurate information from a large amount of data with many types of
variables (human behavior, for example, is not a discrete variable), data mining and statistics must
be combined. Statistics alone cannot give people information from these kinds of data, which is
why data mining is used together with statistics. Data mining, which is built on neural networks,
decision trees, and machine learning algorithms, is of great help when dealing with the complex
types of variables mentioned above, such as categorical variables (variables that contain a finite
number of categories or distinct groups).

Statistics gives people quantitative data, and data mining methods give people qualitative
data. Statistics is also used because of its simplicity: people do not need complex data mining for
quantitative analysis, because statistics alone can do it. Later, business developers use the data
extracted with these two methods to decide how to run and manage their business, predict future
trends, come up with better business strategies, and more.
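To show how simple a purely quantitative analysis can be, here is a short sketch that uses Python's built-in statistics module. The daily sales figures and the function name summarize are assumptions made for illustration only; they do not come from any real store.

```python
import statistics


def summarize(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "n": len(values),                      # size of the data
        "mean": statistics.mean(values),       # average value
        "median": statistics.median(values),   # middle value
        "stdev": statistics.stdev(values),     # sample standard deviation
    }


# Hypothetical number of items sold on seven consecutive days (assumed data).
daily_sales = [120, 135, 128, 150, 142, 138, 131]

print(summarize(daily_sales))
```

No data mining algorithm is involved here; a few standard formulas already describe the quantity sold.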

Data Mining: Is It Helpful or Harmful for Decision Making?

Data mining can extract personal information about a person. Even when it is not used for
criminal purposes, some people still do not want it used. For example, according to a New York
Times article (2015), Timothy D. Cook, the chief executive of Apple, argued that people's email,
history, and family photos should not be data-mined and sold off for advertising purposes. In
another example, a survey by the Annenberg School for Communication at the University of
Pennsylvania (The Tradeoff Fallacy) shows that 55 percent of respondents disagreed or strongly
disagreed with the statement “it’s O.K. if a store where I shop uses information it has about me to
create a picture of me that improves the services they provide for me.” About 7 in 10 people also
disagreed that it was fair for a store to monitor their online activities in exchange for free Wi-Fi
while at the store, and 91 percent of respondents disagreed that it was fair for companies to collect
information about them without their knowledge in exchange for a discount.

Does data mining have a bad influence on people’s daily lives? The answer is no. Rather than
stealing personal data, data mining is used to provide personalized service. The data extracted by
mining methods and analysed by business developers can give the developers a good prediction of
their consumers’ behaviour. With these data, they can show a person an advertisement, for example,
that matches his or her personal interests when he or she visits a website (an online shop), or even
a physical store if the staff are told about that person’s interests. This is also called user behaviour
analysis.

The second reason is that almost everything people do today (business processes and
transactions, banking transactions, market basket analysis, and more) is stored in computer
systems, so the amount of data has now grown to the scale of yottabytes (10^24 bytes). Powerful
data analysis tools are needed to transform up to 10^24 bytes of data into valuable, organized
knowledge for decision making. Take, for example, the mining of search engine queries. Suppose
that an online shop has a built-in search engine that receives many queries each day. These queries
can be viewed as an “information exchange” in which people describe their information needs.
What is interesting is that, through data mining, the developers can read patterns in the queries
and gain valuable knowledge from them, patterns that cannot be seen if the developers only read
individual items. For example, suppose the online shop adds a new category of clothes. The data
mining algorithm uses specific search terms as indicators of search activity for the new type of
clothes. A close relationship can be found between the number of users who search for the new
clothes and the number of users who actually want to buy them. If all the search queries related to
the new type of clothes are aggregated, a pattern emerges. Using aggregated search data, the
developers can estimate consumer activity more quickly than traditional systems can.
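The idea of aggregating search queries can be sketched in Python as follows. Everything in this example is assumed for illustration: the query log, the indicator keyword "linen shirt", the purchase counts, and the function names. It is not the method of any real search engine; it only shows how per-day counts might be built and compared with purchases.

```python
from collections import Counter


def daily_search_counts(queries, keywords):
    """Count, per day, the queries that mention any indicator keyword.

    queries: iterable of (day, query_text) pairs.
    keywords: lower-case search terms used as indicators.
    """
    counts = Counter()
    for day, text in queries:
        lowered = text.lower()
        if any(keyword in lowered for keyword in keywords):
            counts[day] += 1
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical query log: (day, query text). All values are assumed.
queries = [
    (1, "linen shirt"), (1, "Linen Shirt sale"), (1, "blue jeans"),
    (2, "linen shirt size M"), (2, "linen shirt"), (2, "linen shirt men"),
    (3, "linen shirt"), (3, "socks"),
]
# Hypothetical number of linen shirts bought on each day.
purchases = {1: 1, 2: 2, 3: 0}

searches = daily_search_counts(queries, ["linen shirt"])
days = sorted(purchases)
r = pearson([searches[d] for d in days], [purchases[d] for d in days])
print(dict(searches), r)
```

A single query says little, but the per-day totals can be compared with sales; a correlation close to 1 would suggest that search activity is a useful early indicator of buying activity.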

Conclusion

Together with statistics, data mining can be used to extract detailed information about a
person’s behavior. Data mining and statistics are mainly used in business processes that involve
large amounts of data. If more business developers learn these two methods, their business
management may change as a result. After all, the ultimate purpose of using data mining in a
business process is to make more money from it.


REFERENCES

Han, Jiawei; Micheline Kamber and Jian Pei. 2012. Data Mining: Concepts and Techniques (3rd edition).
United States of America: Morgan Kaufmann Publishers.

Tufféry, Stéphane. 2011. Data Mining and Statistics for Decision Making. John Wiley & Sons, Ltd.

FROM GOOGLE SCHOLAR

Isotalo, Jarkko. Basics of Statistics. http://www.academia.edu/download/43653651/statistics.pdf.
Accessed 1 April 2018.

ADDITIONAL REFERENCES

Singer, Natasha. 2015. Sharing Data, but Not Happily.
https://www.nytimes.com/2015/06/05/technology/consumers-conflicted-over-data-mining-policies-report-finds.html.
Accessed 10 April 2018.

Turow, Joseph; Michael Hennessy and Nora Draper. 2015. The Tradeoff Fallacy. Report. United States:
Annenberg School for Communication, University of Pennsylvania.
https://www.asc.upenn.edu/sites/default/files/TradeoffFallacy.pdf.
