
Information Science -

Centre for Knowledge Dynamics and Decision-making


Inligtingwetenskap -

Sentrum vir Kennisdinamika en Besluitneming

www.informatics.sun.ac.za/mikm

Assignment 5
Lecturers: Mrr DN Blaauw and AT Nthurubele
Assignment 5A: Artificial Intelligence
Assignment 5B: Data Processing and Mining
Target Date to submit: 30 November 2010
Academic Nature of Assignment 5

Assignment 5 is essentially technical and technological. The essential infrastructure of the Knowledge Economy is the global network of computer-driven systems. For the management of knowledge this implies that a sophisticated use of computation is indispensable. As Becerra-Fernandez and Sabherwal indicate, the two basic competencies in this regard for KM are Artificial Intelligence and Data Mining. It is not the purpose of the MIKM to grow students into experts in these fields. However, it is crucial that they have a solid and basic understanding of both topics, at least at such a level that they will be able to hold meaningful discussions with experts, should that need arise in a work context. Assignment 5 is designed to assess MIKM students' basic understanding of the above two fields.

Description and instructions

1. Assignment 5 is based on the lecture materials, podcasts, supplementary readings and relevant parts of Becerra-Fernandez et al.
2. The purpose of the assignment is to:
   a. determine whether the student has engaged with the prescribed material at a level expected from a postgraduate student
   b. assess the student's ability to formulate complicated ideas in a reasonable and logical way and to apply them plausibly
3. The assignment will be assessed for the categories C, L and P.
4. The completed assignments must be sent to: mikmas@sun.ac.za
5. SUBMISSION:
   - Create a Word document file for each part (A and B) separately.
   - Format each Word document using the thesis template on the MIKM website (but remove the first page).
   - Do not forget your name and student number on each document.
   - Please send the parts separately.
   - In the e-mail, write your surname clearly in the subject line.

PART A ARTIFICIAL INTELLIGENCE


This is an application assignment consisting of three questions based on the IF and THEN probability theory. The student is required to design a written rule or set of rules to find a solution. Answer the following questions:

1. Give an example of a fuzzy rule and determine the total strength of belief or disbelief in a hypothesis. Use the certainty factor (CF) with a value range of 0 to 1 to calculate your answer. [100 words excluding the calculation of the CF]
2. Can the antecedent of a fuzzy rule have multiple parts? Explain with examples and generate as many subsets as needed to solve a problem. Use linguistic variables for your explanation and examples. [300 words]
3. How do we evaluate multiple antecedents of fuzzy rules? Use linguistic variables for your evaluation. [300 words]
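As orientation for the Part A questions, the following is a minimal Python sketch, not a model answer. It assumes the certainty-factor conventions commonly taught for rule-based expert systems: the CF of a conclusion is the CF of the evidence multiplied by the CF of the rule, and two rules that support the same hypothesis are combined as cf1 + cf2 * (1 - cf1) when both values lie between 0 and 1. It also assumes the common min/max operators for fuzzy AND and OR. The function names, rules and numeric values are invented for illustration and do not come from the prescribed material.

```python
def rule_cf(evidence_cf: float, rule_strength: float) -> float:
    """CF of a conclusion: CF of the evidence times the CF attached to the rule."""
    return evidence_cf * rule_strength


def combine_cf(cf1: float, cf2: float) -> float:
    """Combine two CFs in [0, 1] that support the same hypothesis."""
    return cf1 + cf2 * (1.0 - cf1)


def fuzzy_and(*degrees: float) -> float:
    """Conjunction of antecedent parts: take the minimum membership degree."""
    return min(degrees)


def fuzzy_or(*degrees: float) -> float:
    """Disjunction of antecedent parts: take the maximum membership degree."""
    return max(degrees)


# Hypothetical rule 1: IF sky is overcast (CF 0.8) THEN rain (rule CF 0.5)
cf_rule1 = rule_cf(0.8, 0.5)
# Hypothetical rule 2: IF forecast is rain (CF 0.6) THEN rain (rule CF 0.5)
cf_rule2 = rule_cf(0.6, 0.5)
# Total belief in the hypothesis "rain" from both rules
combined = combine_cf(cf_rule1, cf_rule2)

# Hypothetical linguistic variables with assumed membership degrees:
# project_funding IS adequate (0.7), project_staffing IS small (0.4)
and_degree = fuzzy_and(0.7, 0.4)  # IF funding is adequate AND staffing is small
or_degree = fuzzy_or(0.7, 0.4)    # IF funding is adequate OR staffing is small

print(round(combined, 2), and_degree, or_degree)
```

Other operators exist (for example, the product for AND), so check which conventions the lecture material prescribes before using these in an answer.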

PART B DATA MINING


Below are a total of 8 questions. You have to do the first 4 and then choose 2 more from the remaining questions. Each question should be answered in no less than 100 and no more than 150 words.

Compulsory Questions

1. What is data mining? In your answer address the following:
   - Influence from other disciplines.
   - Explain how the evolution of database technology led to data mining.
   - Describe the steps involved in data mining when viewed as a process of knowledge discovery.
2. Define each of the following data mining functionalities: Characterization, Discrimination, Association and Correlation analysis, Classification, Prediction, Clustering, and Evolution analysis. Give examples of each data mining functionality.
3. Briefly compare the following concepts. You may use an example to explain your point(s):
   - Snowflake schema, fact constellation, star-net query model
   - Data cleaning, data transformation, refresh
   - Enterprise warehouse, data mart, virtual warehouse
4. Why is naïve Bayesian classification called naïve? Briefly outline the major ideas of naïve Bayesian classification.
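For readers who want to see the idea behind question 4 in concrete form, here is a small Python sketch of a categorical naïve Bayesian classifier. It is an illustration under stated assumptions, not a model answer: the toy weather data, the function names `train` and `predict`, and the use of add-one (Laplace) smoothing are all choices made here and are not taken from the prescribed material. The sketch multiplies the class prior by one conditional probability per attribute, which is only valid if the attributes are treated as independent given the class.

```python
from collections import Counter, defaultdict


def train(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen


def predict(model, row):
    """Score each class as prior times the product of per-attribute likelihoods."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    scores = {}
    for label, n in class_counts.items():
        score = n / total  # prior P(class)
        for i, value in enumerate(row):
            # Add-one smoothing (an assumption of this sketch) avoids zero probabilities.
            count = value_counts[(i, label)][value]
            score *= (count + 1) / (n + len(values_seen[i]))
        scores[label] = score
    return max(scores, key=scores.get), scores


# Invented toy data: (outlook, temperature) -> play outside?
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
        ("rainy", "cool"), ("overcast", "hot")]
labels = ["no", "no", "yes", "yes", "yes"]

model = train(rows, labels)
label, scores = predict(model, ("rainy", "mild"))
print(label, scores)
```

The scores are unnormalised; dividing each by their sum would give posterior probabilities, but the ranking of the classes is the same either way.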

Choose TWO of the following:

5. Suppose that you are in the market to purchase a data mining system.
   - Regarding the coupling of a data mining system with a database and/or data warehouse system, what are the differences between no coupling, loose coupling, semi-tight coupling, and tight coupling?
   - What is the difference between row scalability and column scalability?
   - Which feature(s) from those listed above would you look for when selecting a data mining system?
6. Briefly outline the major steps of decision tree classification.
7. State why, for the integration of multiple heterogeneous information sources, many companies in industry prefer the update-driven approach (which constructs and uses data warehouses), rather than the query-driven approach (which applies wrappers and integrators). Describe situations where the query-driven approach is preferable to the update-driven approach.
8. What are the differences between the three main types of data warehouse usage: information processing, analytical processing, and data mining? Discuss the motivation behind OLAP mining (OLAM).
