DATA MINING
Data mining is the process of extracting useful information and patterns from large volumes of data. It covers the collection, extraction, analysis and statistical summarisation of data, and is also known as the knowledge discovery process, knowledge mining from data, or data/pattern analysis. Once the information and patterns are found, they can be used to make decisions for developing the business. Data mining tools can answer business questions that were previously too difficult to resolve, and they can forecast future trends, letting business people make proactive decisions.
Exploration – In this step the data is cleaned and converted into another form. The nature of the data is also determined.
Pattern Identification – The next step is to choose the pattern that will make the best prediction.
Deployment – The identified patterns are used to obtain the desired outcome.
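As a concrete illustration, the three steps above can be sketched on a toy set of monthly sales records. Everything here, the data, the field names and the growth-rate "pattern", is invented for the example and not taken from any particular tool.

```python
def explore(records):
    # Exploration: clean the raw data -- drop incomplete rows and
    # convert the revenue field from strings to numbers.
    cleaned = []
    for rec in records:
        if rec.get("month") is None or rec.get("revenue") is None:
            continue
        cleaned.append({"month": rec["month"], "revenue": float(rec["revenue"])})
    return cleaned

def identify_pattern(cleaned):
    # Pattern identification: estimate an average month-over-month growth factor.
    revenues = [r["revenue"] for r in sorted(cleaned, key=lambda r: r["month"])]
    growths = [b / a for a, b in zip(revenues, revenues[1:])]
    return sum(growths) / len(growths)

def deploy(cleaned, growth):
    # Deployment: use the identified pattern to forecast the next month.
    last = max(cleaned, key=lambda r: r["month"])["revenue"]
    return last * growth

raw = [
    {"month": 1, "revenue": "100"},
    {"month": 2, "revenue": "110"},
    {"month": 3, "revenue": None},   # incomplete row, removed in exploration
    {"month": 3, "revenue": "121"},
]
cleaned = explore(raw)
growth = identify_pattern(cleaned)
forecast = deploy(cleaned, growth)   # roughly 133.1 for this toy data
```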
Some of the most important data mining techniques are:
Statistics
Clustering
Visualization
Decision Tree
Association Rules
Neural Networks
Classification
1. Statistical Techniques
Statistics is a branch of mathematics concerned with the collection and description of data. Many analysts do not consider statistics a data mining technique in its own right, yet it helps to discover patterns and build predictive models, so a data analyst should possess some knowledge of the different statistical techniques. People today have to deal with large amounts of data and derive important patterns from it, and statistics can go a long way toward answering questions about that data.
Statistics not only answers such questions; it also summarises and counts the data, and provides information about the data with ease. Through statistical reports, people can make smart decisions. Among the different forms of statistics, the most important and useful techniques concern the collection and counting of data, such as
Histogram
Mean
Median
Mode
Variance
Max
Min
Linear Regression
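Most of the summary statistics listed above are available directly in Python's standard-library `statistics` module, and linear regression can be done by hand with the least-squares formulas. The customer ages and spending figures below are made up for illustration.

```python
import statistics

ages = [23, 27, 27, 31, 35, 40, 44]

print(statistics.mean(ages))       # about 32.43
print(statistics.median(ages))     # 31
print(statistics.mode(ages))       # 27
print(statistics.pvariance(ages))  # population variance
print(min(ages), max(ages))

# Simple linear regression (least squares) of spending against age;
# the spending figures are invented for the example.
spending = [210, 240, 250, 300, 340, 390, 430]
mean_x, mean_y = statistics.mean(ages), statistics.mean(spending)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, spending)) \
        / sum((x - mean_x) ** 2 for x in ages)
intercept = mean_y - slope * mean_x
# Predicted spending for a hypothetical 50-year-old customer:
prediction = intercept + slope * 50
```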
2. Clustering Technique
Clustering is one of the oldest techniques used in data mining. Cluster analysis is the process of identifying data items that are similar to each other, which helps in understanding the differences and similarities between the data. Sometimes called segmentation, it helps users understand what is going on within the database. For example, an insurance company can group its customers based on their income, age, nature of policy and type of claims. There are different types of clustering methods:
Partitioning Methods
Hierarchical Agglomerative methods
Density Based Methods
Grid Based Methods
Model Based Methods
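As a minimal sketch of a partitioning method, the following pure-Python two-means clustering (k-means with k = 2) groups customers by income, echoing the insurance example above. The incomes are invented, and the code assumes the data really does contain two separated groups.

```python
def two_means(values, iters=20):
    # Partitioning method: start the two centroids at the extremes,
    # then alternate between assigning points and moving the centroids.
    c_low, c_high = min(values), max(values)
    for _ in range(iters):
        low = [v for v in values if abs(v - c_low) <= abs(v - c_high)]
        high = [v for v in values if abs(v - c_low) > abs(v - c_high)]
        # Move each centroid to the mean of its cluster (assumes neither
        # cluster ever becomes empty, which holds for two-sided data).
        c_low = sum(low) / len(low)
        c_high = sum(high) / len(high)
    return (c_low, low), (c_high, high)

# Made-up annual incomes (in thousands) for the insurance example.
incomes = [28, 30, 33, 35, 90, 95, 100, 110]
(c_low, low_group), (c_high, high_group) = two_means(incomes)
# low_group collects the low-income customers, high_group the rest.
```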
The most popular clustering-related algorithm is Nearest Neighbour. The nearest neighbour technique is very similar to clustering: it is a prediction technique in which, to predict an estimated value for one record, you look for records with similar values in the historical database and use the prediction value from the record nearest to the unclassified one. The technique rests on the assumption that objects close to each other have similar prediction values, so the values of nearby objects can be predicted easily. Nearest Neighbour is among the easiest techniques to use because it works the way people naturally think. It also automates well, supports complex ROI calculations with ease, and its level of accuracy is as good as that of other data mining techniques. In business, the nearest neighbour technique is most often used in text retrieval, to find documents that share important characteristics with a main document that has been marked as interesting.
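The prediction idea described above can be sketched in a few lines: find the k closest historical records and average their known values. The (age, income) to claim-amount data is entirely made up for illustration.

```python
def knn_predict(history, query, k=3):
    # history: list of (feature_vector, known_value) pairs.
    def dist(a, b):
        # Euclidean distance between two feature vectors.
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    nearest = sorted(history, key=lambda rec: dist(rec[0], query))[:k]
    # Average the values of the k closest historical records.
    return sum(value for _, value in nearest) / k

# Invented historical data: (age, income) -> insurance claim amount.
history = [
    ((25, 30), 400), ((27, 32), 420), ((26, 31), 410),
    ((55, 90), 900), ((60, 95), 950), ((58, 92), 930),
]
estimate = knn_predict(history, (26, 31))  # falls in the first group
```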
3. Visualization - Visualization is a very useful technique for discovering data patterns and is used at the beginning of the data mining process. Much current research aims at producing interesting projections of databases, an approach called Projection Pursuit. Many data mining techniques produce useful patterns from good data, but visualization is a technique that turns poor data into good data, allowing different kinds of data mining methods to be used to discover hidden patterns.
4. Induction Decision Tree Technique - A decision tree is a predictive model which, as the name implies, looks like a tree. Each branch of the tree is viewed as a classification question, and the leaves of the tree are partitions of the dataset related to that particular classification. The technique can be used for exploratory analysis, data pre-processing and prediction. A decision tree can be considered a segmentation of the original dataset, where the segmentation is done for a particular purpose and the records in each segment share similarities in the information being predicted. Decision trees provide results that are easily understood by the user, and the technique is often used by statisticians to find out which part of a database is most related to the business problem. The first and foremost step in this technique is growing the tree, which comes down to finding the best possible question to ask at each branch. The tree stops growing when a stopping condition is met, for example when a segment contains only one record or all of its records share the same prediction value.
CART, which stands for Classification and Regression Trees, is a data exploration and prediction algorithm that picks its questions in a more sophisticated way: it tries them all and then selects the best one, which is used to split the data into two or more segments. After deciding on the segments, it again asks questions of each new segment individually. Another popular decision tree technology is CHAID (Chi-Square Automatic Interaction Detector). It is similar to CART but differs in how it splits: CART searches for the single best question, whereas CHAID chooses its splits using a chi-square test.
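The tree-growing step, finding the best question to ask at a branch, can be sketched CART-style for a single numeric attribute: try every binary split and keep the one with the purest resulting segments, measured here by Gini impurity. The age/claim data is invented for the example.

```python
def gini(labels):
    # Gini impurity of a set of class labels: 1 - sum of squared class shares.
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    # Return the threshold for the question "is value <= t?" with the
    # lowest weighted impurity of the two resulting segments.
    best = (None, float("inf"))
    for t in sorted(set(values)):
        left = [l for v, l in zip(values, labels) if v <= t]
        right = [l for v, l in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Invented data: customer age vs. whether they filed a claim.
ages   = [22, 25, 28, 40, 45, 50]
claims = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_split(ages, claims)
# The best question is "age <= 28?", which separates the classes perfectly.
```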
5. Neural Network - The neural network is another important technique in use today, most often applied in the early stages of data mining technology. Artificial neural networks grew out of the artificial intelligence community. They are easy to use because they are automated to a large extent, so the user is not expected to have much knowledge about the work or the database. But to make a neural network work efficiently, you need to know:
How are the nodes connected?
How many processing units should be used?
When should the training process be stopped?
There are two main parts to this technique - the node and the link:
The node - which loosely corresponds to the neuron in the human brain
The link - which loosely corresponds to the connections between neurons in the human brain
A neural network is a collection of interconnected neurons, which may form a single layer or multiple layers. The arrangement of neurons and their interconnections is called the architecture of the network. There is a wide variety of neural network models, each with its own advantages and disadvantages, its own architecture and its own learning procedure. Neural networks are a very strong predictive modelling technique, but they are not easy to understand even for experts: they create very complex models that are impossible to interpret fully. To make the technique more approachable, companies are seeking new solutions, and two have already been suggested:
The first is to package the neural network into a complete solution that lets it be used for a single application.
The second is to bundle it with expert consulting services.
Neural networks have been used in various kinds of applications, for example to detect fraud taking place in a business.
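A minimal sketch of the node-and-link idea is a single artificial neuron trained with the classic perceptron rule. The fraud-detection features and labels below are entirely invented, and a real fraud model would be far larger; this only shows how the weights on the links are adjusted during training.

```python
def step(x):
    # Activation of the node: fire (1) if the weighted input is non-negative.
    return 1 if x >= 0 else 0

def train_perceptron(samples, epochs=20, lr=0.1):
    # One node with one weight per input link, plus a bias term.
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, target in samples:
            out = step(sum(w * f for w, f in zip(weights, features)) + bias)
            err = target - out
            # Perceptron rule: nudge each link's weight toward the target.
            weights = [w + lr * err * f for w, f in zip(weights, features)]
            bias += lr * err
    return weights, bias

# Features: (transaction amount in $1000s, transactions in the last hour).
# Label 1 = fraudulent. Entirely invented data for illustration.
samples = [
    ((0.1, 1), 0), ((0.3, 2), 0), ((0.2, 1), 0),
    ((5.0, 9), 1), ((6.0, 8), 1), ((7.0, 10), 1),
]
weights, bias = train_perceptron(samples)
flag = step(sum(w * f for w, f in zip(weights, (6.5, 9))) + bias)
```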
6. Association Rules
This technique helps to find associations between two or more items and reveals relations between different variables in databases. It discovers hidden patterns in data sets and identifies the variables that appear together most frequently. An association rule offers two major pieces of information: its support and its confidence.
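The two measures can be computed directly from a transaction list: support is the fraction of transactions containing an itemset, and confidence is, of the transactions containing the rule's left-hand side, the fraction that also contain its right-hand side. The shopping baskets below are made up.

```python
# Invented shopping transactions, each a set of purchased items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
    {"bread", "milk"},
]

def support(itemset):
    # Fraction of transactions containing every item in the itemset.
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    # Of the transactions containing lhs, the fraction also containing rhs.
    return support(lhs | rhs) / support(lhs)

s = support({"bread", "milk"})       # 3 of 5 baskets -> 0.6
c = confidence({"bread"}, {"milk"})  # 3 of the 4 bread baskets -> 0.75
```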
7. Classification
Classification is the most commonly used data mining technique; it employs a set of pre-classified samples to create a model which can then classify a large set of data. The technique helps in deriving important information about data and metadata (data about data). It is closely related to cluster analysis and typically uses a decision tree or neural network system. There are two main processes involved: learning, in which the model is built from the pre-classified training data, and classification, in which the model assigns classes to new data.
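The two processes can be sketched with a deliberately simple "model": the per-class average of the pre-classified samples (a nearest-centroid classifier). The risk labels and feature values are invented for illustration.

```python
def learn(samples):
    # Step 1 (learning): average the feature vectors of each class.
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        sums[label] = [a + f for a, f in zip(acc, features)]
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in vec]
            for label, vec in sums.items()}

def classify(model, features):
    # Step 2 (classification): pick the class whose centroid is closest.
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(model[label], features))
    return min(model, key=dist)

# Invented pre-classified samples: (feature vector, class label).
samples = [((1.0, 1.0), "low-risk"), ((1.2, 0.8), "low-risk"),
           ((8.0, 9.0), "high-risk"), ((9.0, 8.5), "high-risk")]
model = learn(samples)
label = classify(model, (8.5, 9.2))  # near the high-risk centroid
```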
Market basket analysis (MBA) only uses transactions with more than one item, as no associations can be made from single purchases. Item association does not necessarily suggest cause and effect, but simply a measure of co-occurrence: the fact that energy drinks and video games are frequently bought together does not mean that one causes the purchase of the other, but it can be inferred that such a purchase is most probably made by (or for) a gamer. Such hypotheses must be tested and should not be taken as truth unless the item sales data supports them. There are two main types of MBA:
1. Predictive MBA is used to classify cliques of item purchases, events and services that
largely occur in sequence.
2. Differential MBA removes a high volume of insignificant results and can lead to very in-depth results. It compares information between different stores, demographics, seasons of the year, days of the week and other factors.
MBA is commonly used by online retailers to make purchase suggestions to consumers.
For example, when a person buys a particular model of smartphone, the retailer may
suggest other products such as phone cases, screen protectors, memory cards or other
accessories for that particular phone. This is due to the frequency with which other
consumers bought these items in the same transaction as the phone. MBA is also used in
physical retail locations. Due to the increasing sophistication of point of sale systems
coupled with big data analytics, stores are using purchase data and MBA to help improve
store layouts so that consumers can more easily find items that are frequently purchased
together.
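The purchase-suggestion idea can be sketched by counting how often other items share a transaction with a chosen item and suggesting the most frequent companions. The transactions below are invented.

```python
from collections import Counter

# Invented point-of-sale transactions, each a set of purchased items.
transactions = [
    {"phone", "case", "screen protector"},
    {"phone", "case"},
    {"phone", "memory card"},
    {"case", "screen protector"},
    {"phone", "case", "memory card"},
]

def suggest(item, top=2):
    # Count co-purchases of every other item with the chosen one,
    # then return the most frequent companions.
    companions = Counter()
    for t in transactions:
        if item in t:
            companions.update(t - {item})
    return [name for name, _ in companions.most_common(top)]

print(suggest("phone"))  # the case co-occurs with the phone most often
```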
Financial Data Analysis
Data mining is widely applied in the banking and finance industry, for example in −
Design and construction of data warehouses for multidimensional data analysis and data mining.
Loan payment prediction and customer credit policy analysis.
Classification and clustering of customers for targeted marketing.
Detection of money laundering and other financial crimes.
Retail Industry
Data mining has great application in the retail industry because it collects large amounts of data on sales, customer purchasing history, goods transportation, consumption and services. The quantity of data collected will naturally continue to expand rapidly because of the increasing ease, availability and popularity of the web. Data mining in the retail industry helps identify customer buying patterns and trends, which leads to improved quality of customer service and better customer retention and satisfaction. Here is a list of examples of data mining in the retail industry −
Design and Construction of data warehouses based on the benefits of data mining.
Multidimensional analysis of sales, customers, products, time and region.
Analysis of effectiveness of sales campaigns.
Customer Retention.
Product recommendation and cross-referencing of items.
Telecommunication Industry
Today the telecommunication industry is one of the fastest-growing industries, providing various services such as fax, pager, cellular phone, internet messenger, images, e-mail and web data transmission. Owing to the development of new computer and communication technologies, the industry is expanding rapidly, which is why data mining has become very important for understanding the business. Data mining in the telecommunication industry helps identify telecommunication patterns, catch fraudulent activities, make better use of resources and improve quality of service. Here is a list of examples for which data mining improves telecommunication services −
Intrusion Detection
Intrusion refers to any kind of action that threatens the integrity, confidentiality or availability of network resources. In today's world of connectivity, security has become a major issue, and the increased usage of the internet, together with the availability of tools and tricks for intruding on and attacking networks, has made intrusion detection a critical component of network administration. Here is a list of areas in which data mining technology may be applied for intrusion detection −
Development of data mining algorithm for intrusion detection.
Association and correlation analysis, aggregation to help select and build discriminating
attributes.
Analysis of Stream data.
Distributed data mining.
Visualization and query tools.
Choosing a Data Mining System
Data Types− The data mining system may handle formatted text, record-based data and relational data. The data could also be ASCII text, relational database data or data warehouse data. Therefore, we should check what exact formats the data mining system can handle.
System Issues− We must consider the compatibility of a data mining system with different
operating systems. One data mining system may run on only one operating system or on
several. There are also data mining systems that provide web-based user interfaces and
allow XML data as input.
Data Sources− Data sources refer to the data formats on which the data mining system will operate. Some systems may work only on ASCII text files while others work on multiple relational sources. The data mining system should also support ODBC or OLE DB connections.
Data Mining functions and methodologies− Some data mining systems provide only one data mining function, such as classification, while others provide multiple functions such as concept description, discovery-driven OLAP analysis, association mining, linkage analysis, statistical analysis, classification, prediction, clustering, outlier analysis and similarity search.
Coupling data mining with databases or data warehouse systems− Data mining systems
need to be coupled with a database or a data warehouse system. The coupled components
are integrated into a uniform information processing environment. Here are the types of
coupling listed below -
o No coupling
o Loose Coupling
o Semi tight Coupling
o Tight Coupling
Data Mining query language and graphical user interface− An easy-to-use graphical user interface is important to promote user-guided, interactive data mining. Unlike relational database systems, which share the standard query language SQL, data mining systems do not share a common underlying query language.
Trends in Data Mining
Application Exploration.
Scalable and interactive data mining methods.
Integration of data mining with database systems, data warehouse systems and web
database systems.
Standardization of data mining query language.
Visual data mining.
New methods for mining complex types of data.
Biological data mining.
Data mining and software engineering.
Web mining.
Distributed data mining.
Real time data mining.
Multi database data mining.
Privacy protection and information security in data mining.
UNIT-4
Types of knowledge
Knowledge management is an activity practiced by enterprises all over the world. In the process of knowledge management, these enterprises comprehensively gather information using many methods and tools. The gathered information is then organized, stored, shared and analysed using defined techniques. The analysis of such information is based on resources, documents, people and their skills. Properly analysed information is then stored as the ‘knowledge’ of the enterprise, and this knowledge is later used for activities such as organizational decision making and training new staff members.
There have been many approaches to knowledge management since the early days, most of them manual storing and analysis of information. With the introduction of computers, most organizational knowledge and management processes have been automated, so information storing, retrieval and sharing have become convenient. Nowadays, most enterprises have their own knowledge management framework in place.
The framework defines the knowledge gathering points, gathering techniques, tools
used, data storing tools and techniques and analyzing mechanism.
1. A Priori -- A priori and a posteriori are two of the original terms in epistemology (the study of
knowledge). A priori literally means “from before” or “from earlier.” This is because a
priori knowledge depends upon what a person can derive from the world without needing to
experience it. This is better known as reasoning. Of course, a degree of experience is necessary
upon which a priori knowledge can take shape. Let’s look at an example. If you were in a
closed room with no windows and someone asked you what the weather was like, you would
not be able to answer them with any degree of truth. If you did, then you certainly would not be
in possession of a priori knowledge. It would simply be impossible to use reasoning to produce
a knowledgeable answer. On the other hand, if there were a chalkboard in the room and
someone wrote the equation 4 + 6 = ? on the board, then you could find the answer without
physically finding four objects and adding six more objects to them and then counting them.
You would know the answer is 10 without needing a real world experience to understand it. In
fact, mathematical equations are one of the most popular examples of a priori knowledge.
2. A Posteriori -- Naturally, then, a posteriori literally means “from what comes later” or “from
what comes after.” This is a reference to experience and using a different kind of reasoning
(inductive) to gain knowledge. This kind of knowledge is gained by first having an experience
(and the important idea in philosophy is that it is acquired through the five senses) and then
using logic and reflection to derive understanding from it. In philosophy, this term is sometimes
used interchangeably with empirical knowledge, which is knowledge based on observation. It is believed that a priori knowledge is more reliable than a posteriori knowledge. This might seem counter-intuitive, since in the former case someone can just sit inside a room and derive knowledge through reasoning alone, while in the latter case someone is having real experiences in the world. But the problem lies in this very fact: everyone's experiences are subjective and open to interpretation. A mathematical equation, on the other hand, is law.
3. Explicit Knowledge --Now we are entering the realm of explicit and tacit knowledge. As you
have noticed by now, types of knowledge tend to come in pairs and are often antitheses of each
other. Explicit knowledge is similar to a priori knowledge in that it is more formal or perhaps
more reliable. Explicit knowledge is knowledge that is recorded and communicated through
mediums. It is our libraries and databases. The specifics of what is contained are less important than how it is contained. Anything from the sciences to the arts can have elements that can be expressed in explicit knowledge. The defining feature of explicit knowledge is that it can be easily and quickly transmitted from one individual to another, or to another ten thousand or ten billion. It also tends to be organized systematically. For example, a history textbook on the founding of America would take a chronological approach, as this allows knowledge to build upon itself through a progressive system; in this case, time.
4. Tacit Knowledge
It should be noted that tacit knowledge is a relatively new theory, introduced only as recently as the 1950s. Whereas explicit knowledge is very easy to communicate and transfer from one individual to another, tacit knowledge is precisely the opposite: it is extremely difficult, if not impossible, to communicate through any medium. For example, a textbook on the founding of America can teach facts (or things we believe to be facts), but an expert musician cannot truly communicate their knowledge; in other words, they cannot simply tell someone how to play the instrument so that the person immediately possesses that knowledge. Such knowledge must be acquired to a degree that goes far beyond theory. In this sense, tacit knowledge most closely resembles a posteriori knowledge, as it can only be achieved through experience.
The biggest difficulty with tacit knowledge is knowing when it is useful and figuring out how to make it usable. Tacit knowledge can only be communicated through consistent and extensive relationships or contact (such as taking lessons from a professional musician). But even in these cases there will not be a true transfer of knowledge; usually two forms of knowledge are born, as each person must fill in certain blanks (such as skills, short-cuts, rhythms, etc.).
Institutionalized knowledge
To date, four models have been selected based on their ability to meet the growing demands: the Zack (from Meyer and Zack, 1996), the Bukowitz and Williams (2000), the McElroy (2003), and the Wiig (1993) KM cycles.
1. Knowledge Creation – The actual process of conducting research and producing new knowledge.
Five modes of knowledge generation
• Acquisition
• Dedicated resources
• Fusion
• Adaptation
• Knowledge networking
2. Knowledge codification
The aim of knowledge codification is to put organizational knowledge into a form that makes it
accessible to those who need it.
• Documented knowledge
• Mapped knowledge
• Modeled knowledge
3. Knowledge transfer –
Knowledge transfer is the process of passing available knowledge to specified
audiences
Functionalities
1. Channel identification and choice,
2. Scheduling, and
3. Sending
Aspects of knowledge transfer
Hard aspects – focus on improved access to knowledge (information), electronic
communication, document repositories, and so forth;
Soft aspects – focus on human face-to-face communication (meetings, talk rooms etc.).
The Nonaka and Takeuchi model of KM is based on a universal model of knowledge creation and the management of coincidence. There are four modes of knowledge conversion in the Nonaka and Takeuchi model:
Socialization (tacit to tacit) i.e. Indirect way,
Externalization (tacit to explicit) i.e. Indirect to Direct way,
Combination (explicit to explicit) i.e. Direct way, and
Internalization (explicit to tacit) i.e. Direct to indirect way.
1. Socialization is the technique of sharing tacit knowledge through observation, imitation, practice, and participation in formal and informal communities and groups. This process is predicated on the creation of a physical or virtual space where a given community can interact on a social level.
2. Externalization is the technique of expressing tacit knowledge into explicit concepts. As
tacit knowledge is highly internalized, this process is the key to knowledge sharing and
creation.
3. Combination is the technique of integrating concepts into a knowledge system. Some
examples or cases would be a synthesis in the form of a review report, a trend analysis, a brief
executive summary, or a new database to organize content.
4. Internalization is the technique of embodying explicit knowledge into tacit knowledge.
The key technologies are web-based communication and collaboration technologies for internet and intranet usage, as well as mobile technologies such as PDAs, PCs, telephone and videoconferencing. New technologies are rapidly emerging that act as intelligent agents and assistants to search, summarise, conceptualise and recognise patterns of information and knowledge.
For an effective KM initiative across the organisation, there needs to be in place, at least:
▪ Knowledge Portal
There is often confusion between the terms ‘information portal’ and ‘knowledge portal’.
An information portal is often described as a gateway to information to enable the user to have
one, more simplified way of navigating towards the desired information.
However, a ‘knowledge portal’ is far more than an information portal: as well as information navigation and access, it contains software technologies to support the processes of virtual team communication and collaboration, and to support the 9 step process of managing knowledge. Furthermore, it contains intelligent agent software to identify and automatically distribute information and knowledge effectively to knowledge workers based on knowledge profiling.
▪ Knowledge Profiles
Within the knowledge portal, each knowledge worker can update and maintain a personal
‘knowledge profile’ which identifies his/her specific knowledge needs, areas of interest and
frequency of distribution.
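Profile-based distribution of the kind described above can be sketched as a simple matching of document topic tags against each worker's declared interests; real portals use far richer profiles and intelligent agents, and all names here are hypothetical.

```python
# Each knowledge worker's profile lists areas of interest (invented data).
profiles = {
    "alice": {"data mining", "statistics"},
    "bob": {"neural networks"},
}

def distribute(doc_tags):
    # Route a new document to every worker whose profile shares
    # at least one topic tag with it.
    return sorted(name for name, interests in profiles.items()
                  if interests & doc_tags)

recipients = distribute({"statistics", "forecasting"})  # matches alice only
```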
▪ Collaborative workspaces
Within the knowledge portal, shared work spaces can be set up for each new team or project.
These will become knowledge repositories from which new knowledge will be distilled
regularly and systematically and shared across other teams in the organisation. Within the
shared and collaborative workspace, at least, the following communication and collaboration
functions could be performed:
▪ Shared vision and mission
▪ Specific team objectives
▪ Knowledge Plan
▪ Team members' roles and responsibilities
▪ Team contract
▪ Best Knowledge Bases or Banks
▪ Expert locator
▪ Task management
▪ Shared Calendar management
▪ Meeting management
▪ Document libraries
▪ Discussion forums
▪ Centralised email
▪ Capturing of new learnings and ideas
▪ Peer reviews, learning reviews, after action reviews
▪ New knowledge nominations
▪ Urgent requests
Within the knowledge portal, it is very useful to have a facility and underlying process to enter any ‘Urgent Request’ into the portal and receive back responses from across the organisation. Rather than needing to know ‘who might know’, the request is entered blindly, and responses will be made if the answer is known in the organisation and people are willing to support and respond. This is a very effective way of better leveraging the knowledge across the organisation.
▪ Document Libraries
The document library is typically the location where all documents are stored. The library should be context-relative and allow easy control over any document type. Many organisations now employ an Electronic Document and Records Management System (EDRMS) for this requirement, but integration of the EDRMS with all other relevant information and knowledge sources is imperative.
In order to foster knowledge networking across the entire organisation and support knowledge processes for creating, retaining, leveraging, reusing, measuring and optimising the use of the organisational knowledge assets, a centralised knowledge server is required that will provide:
▪ a knowledge portal interface designed around a knowledge asset schema (see KM consulting section) as a gateway to user access, security and applications
▪ Knowledge banks
▪ Advanced search capabilities
▪ Collaboration services
▪ Search and discovery services
▪ Publishing services based on user knowledge needs and knowledge profiling
▪ A knowledge map (taxonomy)
▪ A knowledge repository for information and process management
▪ Text summarising and conceptualising
▪ Intelligent agentware
▪ An Intranet infrastructure for integrated email, file servers, internet/intranet services
For each key knowledge area identified, there needs to be a Knowledge Base.