
1. Consider the following rules:
percent(A, "70, 71, ..., 80") => placement(A, "Infosys")
percent(A, "70, 71, ..., 80") => placement(A, "Microsoft")
percent(A, "70, 71, ..., 80") => placement(A, "Dell")
percent(A, "70, 71, ..., 80") => placement(A, "IBM")
This set of rules clearly refers to a _______ rule.
1. Boolean association 2. Quantitative association 3. Single-dimensional association 4. Multi-dimensional association

2. The rule Age(A, "20, 21, ..., 27") ∧ percent(A, "60, 61, ..., 80") ∧ test(A, "B, B+, ..., A+") => placement(A, "MNCs") is an example of a _______ rule.
1. Boolean association 2. Quantitative association 3. Single-dimensional association 4. Multi-dimensional association

3. A set of items is referred to as a(n) _______.
1. itemset 2. item set 3. set entity 4. entity set

4. car => financial from bank [loan = 80%, insurance = 20%]
Association rules are satisfied if they have _______.
1. maximum car shops, maximum bank branches 2. maximum banks, maximum loan threshold 3. maximum loan threshold, minimum insurance threshold 4. maximum bank threshold, maximum loan threshold, minimum insurance threshold

5. If X => Y [a = 50%, b = 2%], then a(X => Y) = _______ and b(X => Y) = _______.
1. P(A∪B), P(A|B) 2. P(A∪B), P(B|A) 3. P(A∩B), P(A|B) 4. P(A∩B), P(B|A)

6. car => financial from bank [loan = 80%, insurance = 20%]
_______ and _______ are the two measures of association rules.
1. car, bank 2. bank, loan 3. loan, insurance 4. bank, loan, insurance

7. The set X => Y [a = 50%, b = 2%] is a _______-itemset.
1. 1 2. 2 3. 3 4. 4

8. If a rule concerns associations between the presence or absence of items, it is a _______ rule.
1. Boolean association 2. Quantitative association 3. Frequent association 4. Transaction association
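Questions 5 and 6 turn on the two standard rule measures: support, the fraction of transactions containing both sides of the rule, and confidence, the conditional probability of the right side given the left. A minimal Python sketch; the five-transaction dataset and the rule {milk} => {bread} are made-up illustrations, not taken from the questions:

```python
# support(X => Y) = P(X and Y together): fraction of transactions containing X ∪ Y.
# confidence(X => Y) = P(Y | X): of transactions containing X, the fraction also containing Y.

transactions = [
    {"milk", "bread"},
    {"milk", "bread", "butter"},
    {"bread"},
    {"milk", "butter"},
    {"milk", "bread", "jam"},
]

def support(itemset):
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(lhs, rhs):
    return support(lhs | rhs) / support(lhs)

X, Y = {"milk"}, {"bread"}
print(support(X | Y))    # 0.6  -> support of {milk} => {bread}
print(confidence(X, Y))  # 0.75 -> confidence of {milk} => {bread}
```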

9. TID stands for _______.
1. Transaction is associated with an identifier 2. Transaction is differentiable 3. Transaction identifier 4. Transaction is dissociated with an identifier

10. If a rule describes associations between quantitative attributes, it is a _______ rule.
1. Boolean association 2. Quantitative association 3. Frequent association 4. Transaction association

11. If the transactional data is ..., the minimum support count is _______.
1. 1 2. 2 3. 3 4. 4

12. The Apriori algorithm employs a level-wise search, where k-itemsets are used to explore _______-itemsets.
1. k 2. (k-1) 3. (k+1)

13. Which step could involve huge computations?
1. join step 2. prune step 3. calculation step 4. logical step

14. The anti-monotone property states that _______.
1. if a set cannot pass a test, all of its supersets also cannot pass the same test 2. if a set cannot pass a test, all of its supersets pass the test 3. if a set passes a test, all of its supersets cannot pass the same test 4. if a set passes a test, all of its subsets cannot pass the same test

15. Transaction reduction implies _______.
1. reducing the number of transactions scanned in previous iterations 2. reducing the number of transactions in the current iteration 3. reducing the number of transactions in future iterations 4. reducing the number of iterations in a transaction

16. _______ are used to improve the efficiency of the Apriori algorithm.
1. Berg queries 2. Iceberg queries 3. Ice Burg queries 4. Ice Cube queries

17. Which threshold can be set up for passing down relatively frequent items to lower levels?
1. level-class threshold 2. level-shift threshold 3. level-passage threshold 4. level-jump threshold
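Questions 12-15 all describe the same machinery: Apriori's join step self-joins the frequent (k-1)-itemsets, and its prune step discards any candidate with an infrequent (k-1)-subset, which is exactly the anti-monotone property of question 14. A hedged sketch of one candidate-generation level; the itemset representation and the toy L2 are my own assumptions:

```python
from itertools import combinations

def apriori_gen(L_prev, k):
    """Generate candidate k-itemsets from frequent (k-1)-itemsets (join + prune)."""
    L_prev = set(L_prev)
    candidates = set()
    for a in L_prev:
        for b in L_prev:
            union = a | b
            if len(union) == k:  # join step: merge two (k-1)-itemsets
                # prune step (anti-monotone): every (k-1)-subset of a
                # frequent k-itemset must itself be frequent.
                if all(frozenset(s) in L_prev for s in combinations(union, k - 1)):
                    candidates.add(union)
    return candidates

L2 = [frozenset({"milk", "bread"}), frozenset({"milk", "butter"}),
      frozenset({"bread", "butter"})]
print(apriori_gen(L2, 3))  # {frozenset({'milk', 'bread', 'butter'})}
```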

18. When multi-level association rules are mined, some of the rules found will be redundant due to _______ relationships between them.
1. hierarchical 2. multi-level 3. single-dimension 4. ancestor

19. Consider the following rule:
Age(A, "18, 19, ..., 29") ∧ placement(A, "Infosys, IBM, ...") ∧ purchases(A, "mobile") => purchases(A, "high memory card") ∧ purchases(A, "card reader")
This rule is notable in that it has _______.
1. multiple predicates 2. a single predicate 3. repetitive predicates 4. dependent predicates

20. The correlation between the occurrence of A and B can be measured by computing _______.
1. corr(A,B) = P(A) / P(B) 2. corr(A,B) = (P(A)P(B)) / P(A∪B) 3. corr(A,B) = P(A∪B) / (P(A)P(B)) 4. corr(A,B) = P(A∩B) / (P(A)P(B))

21. Which association rules overcome a disadvantage of quantitative association rules?
1. quantitative association rules 2. distance-based association rules 3. single-dimensional association rules 4. multi-dimensional association rules

22. Let A[X] be the set of n tuples t1, t2, ..., tn projected on the attribute set X. Which measure of A[X] is the average pairwise distance between the tuples projected on X?
1. radius 2. diameter 3. density 4. frequency

23. If a single distinct predicate occurs in a single-dimensional association rule, it is also called a(n) _______.
1. intra-dimension association rule 2. inter-dimension association rule 3. extra-dimension association rule 4. quantitative association rule

24. A multi-dimensional association rule with no repeated predicates is also called a(n) _______.
1. intra-dimension association rule 2. inter-dimension association rule 3. extra-dimension association rule 4. quantitative association rule

25. A multi-dimensional association rule with repeated predicates, which contains multiple occurrences of some predicate, is called a _______.
1. repetitive association rule 2. recursive association rule 3. mixed association rule 4. hybrid association rule

26. The partitioning process is referred to as binning, and the intervals are considered as _______.
1. bins 2. modules 3. segments 4. time lapses

27. Which algorithm involves a series of walks through the itemset space?
1. Apriori algorithm 2. Ancestor algorithm 3. random walk-through algorithm 4. sequential walk-through algorithm

28. Which categories can be used during association mining to guide the process, leading to more efficient and effective mining?
1. antimonotone, monotone, succinct, convertible 2. antimonotone, monotone, succinct, inconvertible 3. antimonotone, monotone, convertible, inconvertible 4. antimonotone, succinct, convertible, inconvertible

29. Consider the following rule: if an engineering student in Warangal bought "speech recognition CD", "MS Office" and "jdk 1.7", it is likely (with a probability of 58%) that the student also bought "SQL Server" and "My SQL Server", and 6.5% of all the students bought all five. The metarule can be expressed as the association rule _______.
1. lives(S, _, "Warangal") ∧ sales(S, "speech recognition", _) ∧ sales(S, "MS Office", _) ∧ sales(S, "jdk 1.7", _) ∧ sales(S, "SQL Server", _) => sales(S, "My SQL Server", _) [6.5%, 58%]
2. lives(S, _, "Warangal") ∧ sales(S, "speech recognition", _) ∧ sales(S, "MS Office", _) ∧ sales(S, "jdk 1.7", _) => sales(S, "SQL Server", _) ∧ sales(S, "My SQL Server", _) [6.5%, 58%]
3. lives(S, _, "Warangal") ∧ sales(S, "speech recognition", _) ∧ sales(S, "MS Office", _) ∧ sales(S, "jdk 1.7", _) ∧ sales(S, "SQL Server", _) => sales(S, "My SQL Server", _) [58%, 6.5%]
4. lives(S, _, "Warangal") ∧ sales(S, "speech recognition", _) ∧ sales(S, "MS Office", _) ∧ sales(S, "jdk 1.7", _) => sales(S, "SQL Server", _) ∧ sales(S, "My SQL Server", _) [58%, 6.5%]

30. A constraint such as "avg(I.marks) <= 70" is not a(n) _______ constraint.
1. anti-monotone 2. monotone 3. succinct 4. convertible

31. The constraint "max(I.marks) >= 600" is acceptable for the _______ and _______ categories.
1. antimonotone, monotone 2. monotone, succinct 3. antimonotone, succinct 4. succinct, convertible

32. The constraint "max(I.marks) <= 600" is acceptable for the _______ and _______ categories.
1. antimonotone, monotone 2. monotone, succinct 3. antimonotone, succinct 4. succinct, convertible

33. Which constraints specify the set of task-relevant data?
1. knowledge type constraints 2. data constraints 3. task-oriented constraints 4. rule constraints

34. Which constraints may be expressed as metarules?
1. knowledge type constraints 2. data constraints 3. interestingness constraints 4. rule constraints

35. Anti-monotone, monotone, succinct, convertible and inconvertible are five different categories of _______ constraints.
1. knowledge type 2. data 3. interestingness 4. rule

36. Which constraints are applied before mining?
1. knowledge type and data constraints 2. data and dimension constraints 3. dimension and rule constraints 4. knowledge and rule constraints


37. The constraint on support(s) is acceptable by the _______ category.
1. antimonotone 2. monotone 3. succinct 4. anti-succinct

38. Preprocessing of data in preparation for classification and prediction can involve _______ for normalizing the data.
1. data cleaning 2. relevance analysis 3. data transformation 4. data redundancy

39. While _______ predicts class labels, _______ models continuous-valued functions.
1. prediction, classification 2. classification, prediction 3. speed, scalability 4. scalability, speed

40. _______ and _______ are the two major types of prediction problems.
1. classification, regression 2. classification, data cleaning 3. data cleaning, predictive accuracy 4. regression, scalability

41. Which of the following criteria is not used for the comparison of classification and prediction methods?
1. speed 2. predictive accuracy 3. data cleaning 4. interpretability

42. Normalization scales values to fall within a range of _______.
1. -2.0 to +2.0 2. -1.0 to +1.0 3. +1.0 to +2.0 4. -2.0 to -1.0

43. _______ is a two-step process.
1. data classification 2. data prediction 3. data hiding 4. data abstraction

44. The data tuples analyzed to build the model collectively form the _______.
1. samples 2. training data set 3. untrained data set 4. supervised data set

45. In _______, the class label of each training sample is not known, and the number or set of classes to be learned may not be known in advance.
1. supervised learning 2. unsupervised learning 3. authorized learning 4. unauthorized learning

46. The _______ is a simple technique that uses a test set of class-labeled samples.
1. clustering method 2. holdout method 3. data classification 4. data learning

47. _______ refers to the preprocessing of data in order to remove noise.
1. predictive accuracy 2. robustness 3. data cleaning 4. interpretability

48. Early decision tree algorithms typically assume that the data is from _______.
1. memory 2. user input from keyboard 3. dynamic user input 4. mouse click

49. Decision trees can easily be converted to _______ rules.
1. IF 2. Nested IF 3. IF-THEN 4. GROUP BY

50. During decision tree induction, the tree starts as a _______.
1. single node 2. dual child node 3. binary tree nodes 4. multi-valued nodes
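Question 42 above refers to range normalization as part of the data transformation in question 38; for instance, min-max normalization can linearly rescale an attribute to [-1.0, +1.0]. A small sketch with made-up sample values:

```python
def min_max(values, new_min=-1.0, new_max=1.0):
    # Min-max normalization: linearly map [min(v), max(v)] onto [new_min, new_max].
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]

print(min_max([40, 55, 70, 85, 100]))  # [-1.0, -0.5, 0.0, 0.5, 1.0]
```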

51. _______ uses concept hierarchies to generalize the data by replacing lower-level data with higher-level concepts.
1. analysis-oriented induction 2. algorithm-oriented induction 3. attribute-oriented induction 4. approach-oriented induction

52. In a decision tree, _______ represents an outcome of the test.
1. internal node 2. branch 3. leaf nodes 4. root
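Questions 49 and 52 cover the anatomy of a decision tree: internal nodes test an attribute, branches are test outcomes, and each root-to-leaf path becomes one IF-THEN rule. A hedged illustration on an invented two-attribute tree (the attributes, outcomes and class labels are made up):

```python
# A toy decision tree as nested tuples: (attribute, {outcome: subtree_or_class_label}).
tree = ("age", {
    "<=30": ("student", {"yes": "buys", "no": "does_not_buy"}),
    ">30": "buys",
})

def to_rules(node, conditions=()):
    """Convert each root-to-leaf path into one IF-THEN rule."""
    if isinstance(node, str):  # leaf: emit the accumulated rule
        print("IF", " AND ".join(conditions) or "TRUE", "THEN class =", node)
        return
    attr, branches = node
    for outcome, child in branches.items():  # each branch is one test outcome
        to_rules(child, conditions + (f"{attr} {outcome}",))

to_rules(tree)
# IF age <=30 AND student yes THEN class = buys
# IF age <=30 AND student no THEN class = does_not_buy
# IF age >30 THEN class = buys
```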


53. The basic algorithm for decision tree induction is a _______.
1. knapsack algorithm 2. greedy algorithm 3. travelling salesperson algorithm 4. 0/1 knapsack algorithm

54. _______ methods use statistical measures to remove the least reliable branches.
1. tree pruning 2. fragmentation 3. segmentation 4. classification

55. In how many approaches does tree pruning work?
1. 1 2. 2 3. 3 4. 4

56. The classification threshold is also called the _______.
1. exception threshold 2. SLIQ threshold 3. precision threshold 4. threading threshold

57. If an arc is drawn from node A to node B, then A is _______ of B.
i) parent ii) immediate predecessor iii) descendant iv) immediate successor
1. only i 2. i and ii 3. only iii 4. iii and iv

58. Bayes' theorem provides a way of calculating which probability?
1. posterior 2. prior 3. stable 4. ideal
59. In the Gaussian density function, the parameters μ and σ stand for _______ and _______.
1. standard deviation (SD) and mean 2. mean and SD 3. mode and SD 4. SD and median

60. P(H|X) is a _______ probability.
1. posterior 2. prior 3. stable 4. ideal

61. The Naïve Bayesian classifier is also called the _______ classifier.
1. computational Bayesian 2. simple Bayesian 3. non-computational Bayesian 4. complex Bayesian

62. CPT stands for _______.
1. Computer preparation test 2. Computational probability test 3. Conditional probability table 4. Computational probability table

63. In Bayesian networks, incomplete data is referred to as _______.
1. input data 2. hidden data 3. output data 4. recursive data

64. The algorithm for training Bayesian belief networks involves which sequence of steps?
i) compute the gradients ii) renormalize the weights iii) update the weights
1. i, ii, iii 2. ii, i, iii 3. i, iii, ii 4. ii, iii, i
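Question 60's P(H|X) comes straight from Bayes' theorem, P(H|X) = P(X|H)P(H)/P(X), and question 59's Gaussian density is the usual likelihood for continuous attributes in a naïve Bayesian classifier. A minimal sketch; the class names, priors and per-class (μ, σ) parameters are invented for illustration:

```python
import math

def gaussian(x, mu, sigma):
    # Gaussian density g(x; mu, sigma), used as a likelihood for a continuous attribute.
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

# Hypothetical classes H with prior P(H) and per-class (mu, sigma) for one attribute.
classes = {"placed": (0.4, 75.0, 8.0), "not_placed": (0.6, 55.0, 10.0)}

x = 70.0  # observed attribute value, e.g. a percentage
evidence = sum(p * gaussian(x, mu, sd) for p, mu, sd in classes.values())  # P(X)
for label, (prior, mu, sd) in classes.items():
    posterior = prior * gaussian(x, mu, sd) / evidence  # Bayes: P(H|X)
    print(label, round(posterior, 3))  # posteriors sum to 1 across classes
```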

65. A neural network containing N hidden layers is called a(n) _______-layered neural network.
1. (N-1) 2. N 3. (N+1) 4. 2N

66. _______ are modified so as to minimize the mean squared error between the network's prediction and the actual class.
1. number of hidden layers 2. weights of the nodes 3. number of nodes 4. number of inputs to a node

67. Backpropagation is a neural network _______ algorithm.
1. updation 2. learning 3. comparison 4. data mining

68. Neural network learning is also referred to as _______ learning.
1. sequential 2. connectionist 3. dependent 4. random

69. In a multilayer feed-forward neural network, the weighted outputs of a hidden layer are inputs to the _______.
1. input layer 2. next hidden layer 3. output layer 4. input to another neural network
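Questions 66-67 above and question 70 below describe one loop: backpropagation adjusts the weights after each training sample ("case updating") so as to reduce the squared error between prediction and target. A hedged single-unit sketch; the sigmoid activation, learning rate and toy samples are my own choices, not the book's:

```python
import math

# One sigmoid unit trained by gradient descent with case updating:
# weights and bias are adjusted after *each* sample to reduce squared error.
w, b, lr = [0.1, -0.2], 0.0, 0.5
samples = [([0.0, 1.0], 1.0), ([1.0, 0.0], 0.0), ([0.9, 0.2], 0.0), ([0.1, 0.9], 1.0)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

for epoch in range(200):
    for x, target in samples:
        out = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
        delta = (target - out) * out * (1 - out)            # gradient of squared error
        w = [wi + lr * delta * xi for wi, xi in zip(w, x)]  # case updating of weights
        b += lr * delta                                     # and of the bias

# After training, the outputs move toward the targets [1, 0, 0, 1].
print([round(sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b), 2) for x, _ in samples])
```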
70. The technique of updating the weights and biases after the presentation of each sample is referred to as _______.
1. node updating 2. case updating 3. epoch updating 4. layer updating

71. If the itemset {percent <= "59", placement = "no"} has a support that increases from 0.7% in C1 to 92.6% in C2, the growth rate is _______.
1. 0.7%/92.6% 2. 92.6%/0.7% 3. (92.6% - 0.7%)/0.7% 4. 92.6%/(92.6% - 0.7%)

72. The statement "If any student's age is above 20 and their percentage is 70 or more, they are approved for placement" can be written as a rough set theory rule as _______.
1. if(x, age > 20) ∧ if(x, percentage >= 70) then placement = "approved"
2. if(x, age > 20) ∧ if(percentage >= 70) then placement = "approved"
3. if(age > 20) ∧ if(percentage >= 70) => placement = "approved"
4. if(x, age > 20) ∧ if(x, percentage >= 70) then placement = approved

73. _______ is defined in terms of Euclidean distance.
1. minimum distance between two points 2. maximum distance between two points 3. closeness between two points 4. mean between two points

74. The rule "IF NOT A1 AND A2 THEN NOT C1" is encoded as _______.
1. 101 2. 010 3. 001 4. 110

75. If a set of rules has the same consequent, the rule with the highest confidence is selected as the _______ to represent the set.
1. probability rule 2. possible rule 3. posterior rule 4. proportionality rule

76. EP stands for _______.
1. emerging point 2. evolving point 3. emerging patterns 4. evolving patterns

77. JEP is a special case of EP, where J stands for _______.
1. joint 2. jazzing 3. jagging 4. jumping


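For question 71, the growth rate of an emerging pattern is simply the ratio of its supports in the two datasets:

GrowthRate = support_C2 / support_C1 = 92.6% / 0.7% ≈ 132.3

so the itemset's support grows by roughly two orders of magnitude from C1 to C2, which is what makes it an emerging pattern in the sense of question 76.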

78. _______ are used to incorporate ideas of natural evolution.
1. case-based reasoning 2. genetic algorithms 3. rough set theory 4. fuzzy set approach

79. Rough set theory is based on _______ classes within the training data.
1. symmetric 2. transitive 3. equivalence 4. trichotomy

80. In which operation are substrings from pairs of rules swapped to form new pairs of rules?
1. equivalence 2. CBR 3. crossover 4. mutation
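Questions 74, 78 and 80 sketch the genetic-algorithm view of rule learning: rules become bit strings (question 74 encodes one literal per bit), crossover swaps substrings between pairs of strings, and mutation flips individual bits. A minimal sketch; the encoding convention and the fixed crossover point are illustrative assumptions:

```python
import random

def crossover(rule_a, rule_b, point=None):
    """Single-point crossover: swap the tails of two bit-string rules."""
    point = point if point is not None else random.randint(1, len(rule_a) - 1)
    return rule_a[:point] + rule_b[point:], rule_b[:point] + rule_a[point:]

def mutate(rule, pos):
    """Mutation: flip the bit at a given position."""
    flipped = "1" if rule[pos] == "0" else "0"
    return rule[:pos] + flipped + rule[pos + 1:]

print(crossover("010", "100", point=1))  # ('000', '110')
print(mutate("010", 0))                  # '110'
```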
81. In the linear regression equation Y = α + βX, the coefficient α = _______.
1. ȳ − β x̄ 2. ȳ − x̄ 3. y′ − x̄ 4. y′

82. Apart from prediction, the log-linear model is also useful for _______.
1. image patterning 2. data compression 3. voice recognition 4. speech recognition

83. In Y = α + βX, α and β are _______.
1. constants 2. regression coefficients 3. variable coefficients 4. averages of X and Y
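Questions 81 and 83 concern the straight-line regression Y = α + βX. By the method of least squares, β = Σ(xᵢ − x̄)(yᵢ − ȳ) / Σ(xᵢ − x̄)² and α = ȳ − βx̄. A small sketch on made-up (x, y) pairs:

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
beta = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) \
     / sum((x - x_bar) ** 2 for x in xs)
alpha = y_bar - beta * x_bar              # the coefficient asked about in question 81
print(round(alpha, 3), round(beta, 3))    # 0.15 1.94 -> Y = 0.15 + 1.94*X
```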

84. _______ can be modeled by adding polynomial terms to the basic linear model.
1. linear regression 2. multiple regression 3. polynomial regression 4. Poisson regression

85. _______ regression can be modeled by adding polynomial terms to the basic linear model.
1. linear 2. multiple 3. polynomial 4. binomial

86. Which regression is useful for modeling count data (frequencies)?
1. linear regression 2. logistic regression 3. Poisson regression 4. multiple regression

87. Accuracy is given as _______.
1. specificity × (pos/(pos + neg)) + sensitivity × (neg/(pos + neg)) 2. specificity × (neg/(pos + neg)) + sensitivity × (pos/(pos + neg)) 3. sensitivity × (pos/(pos + neg)) + specificity × (neg/(pos + neg))
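Question 87's identity weights sensitivity (the true-positive rate) by the positive fraction of the data and specificity (the true-negative rate) by the negative fraction. A worked check on an invented confusion matrix:

```python
tp, fn, tn, fp = 40, 10, 45, 5   # hypothetical confusion-matrix counts
pos, neg = tp + fn, tn + fp      # 50 positives, 50 negatives

sensitivity = tp / pos           # 0.80
specificity = tn / neg           # 0.90
accuracy = sensitivity * pos / (pos + neg) + specificity * neg / (pos + neg)

# The weighted form agrees with the plain definition (tp + tn) / all, i.e. 0.85.
assert abs(accuracy - (tp + tn) / (pos + neg)) < 1e-12
```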
88. A _______ resembles a nominal variable.
1. normal variable 2. discrete ordinal variable 3. continuous ordinal variable 4. Poisson ordinal variable

89. The process of grouping a set of physical objects into classes of similar objects is called _______.
1. moduling 2. segmenting 3. clustering 4. machine learning

90. Clustering is a form of _______.
1. learning by practice 2. learning by examples 3. learning by observation 4. learning by testing

91. Interval-scaled variables are _______ measurements of a linear scale.
1. discrete 2. continuous 3. differentiable 4. non-continuous

92. The data matrix is often called a _______.
1. one-mode matrix 2. two-mode matrix 3. zero-mode matrix 4. poly-mode matrix

93. The object-by-object structure is also known as the _______.
1. difference matrix 2. data matrix 3. dissimilarity matrix 4. identity matrix

94. The computational complexity of CLARANS is _______.
1. O(n) 2. O(n²) 3. O(log n) 4. O(n log n)

95. _______ can be used to find the most "natural" number of clusters using a silhouette coefficient.
1. PAM 2. CLAPP 3. CLARA 4. CLARANS

96. The _______ algorithm is the one where each cluster is represented by one of the objects located near the center of the cluster.
1. k-means 2. k-medians 3. k-medoids 4. k-modes

97. The agglomerative approach is also called the _______ approach.
1. top-down 2. bottom-up 3. sequential 4. random

98. Most partitioning methods cluster objects based on _______.
1. the number of clusters 2. the distance between objects 3. the number of objects in each class 4. the learning rate

99. _______ methods quantize the object space into a finite number of cells that form a grid structure.
1. hierarchical methods 2. density-based methods 3. grid-based methods 4. model-based methods

100. _______ is a density-based method that computes an augmented cluster ordering.
1. DBSCAN 2. OPTICS 3. STING 4. CLIQUE

101. EM is expanded as _______.
1. entity maximization 2. exception maximization 3. expectation maximization 4. earning maximization

102. "Clustering LARge Applications" can be shortened to _______.
1. CLA 2. CLAPP 3. CLARA 4. CLULA

103. The absolute closeness between two clusters, normalized with respect to the internal closeness of the two clusters, is the _______.
1. relative distance 2. relative interconnectivity 3. relative density 4. relative closeness
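Questions 94-96 concern the k-medoids family (PAM, CLARA, CLARANS), where each cluster is represented by an actual object near its center rather than by a computed mean. A hedged sketch of one assignment-and-update round; the points and the initial medoids are invented, and a full PAM implementation would also iterate swap-based refinement:

```python
points = [(1, 1), (2, 1), (1, 2), (8, 8), (9, 8), (8, 9)]
medoids = [(1, 1), (8, 8)]  # initial representative objects (actual data points)

def d(p, q):  # squared Euclidean distance
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

# Assignment step: attach each object to its nearest medoid.
clusters = {m: [p for p in points
                if m == min(medoids, key=lambda mm: d(p, mm))] for m in medoids}

# Update step: within each cluster, pick the object minimizing total distance.
new_medoids = [min(c, key=lambda cand: sum(d(cand, p) for p in c))
               for c in clusters.values()]
print(new_medoids)  # [(1, 1), (8, 8)] -- both medoids are already central
```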

104. _______ is a triplet summarizing information about subclusters of objects.
1. clustering feature 2. clustering feature tree 3. Chameleon 4. density feature
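Question 104 refers to BIRCH's clustering feature, the triplet CF = (N, LS, SS): the number of points, their linear sum, and their sum of squares. CFs are additive, which is what lets BIRCH merge subclusters cheaply. A tiny sketch on made-up one-dimensional points:

```python
def cf(points):
    # Clustering feature of a subcluster: (count, linear sum, sum of squares).
    return (len(points), sum(points), sum(p * p for p in points))

def merge(cf1, cf2):
    # CF additivity: the CF of a merged subcluster is the component-wise sum.
    return tuple(a + b for a, b in zip(cf1, cf2))

a, b = cf([1.0, 2.0]), cf([3.0])
assert merge(a, b) == cf([1.0, 2.0, 3.0])  # (3, 6.0, 14.0)
```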

105. Which method overcame the problem of favoring clusters with spherical shape and similar sizes?
1. BIRCH 2. CURE 3. ROCK 4. STING

106. For n objects, the complexity of CURE is _______.
1. O(n) 2. O(n²) 3. O(log n) 4. O(n log n)
