Gun-Woo Kim, Seung Hoon Lee, Jae Hyung Kim, Jin Hyun Son
Advanced Data Base
Spring Semester 2012

Agenda
  - Introduction
  - Related Work
  - Process Mining
  - FP-tree
  - Direct Causal Matrix
  - Design and construction of a modified FP-tree
  - Process Discovery using modified FP-tree
  - Conclusion

Introduction
  - Business Process Management
  - Communication Gap
  - Business Process Reengineering
  - Process mining
  - Petri net

Process mining
  - Relatively young research discipline
  - Sits between machine learning and data mining on the one hand and process modeling and analysis on the other
  - Three types:
      - Process Discovery
      - Conformance Checking
      - Extension or Enhancement

FP-tree (Frequent Pattern Tree)
  - Compressed representation of the input data
  - Has an extended prefix-tree structure and stores crucial, quantitative information about frequent patterns
  - The tree nodes are arranged so that more frequently occurring nodes have a better chance of sharing nodes than less frequently occurring ones

Direct causal matrix
  - Defines the causal relations between the tasks and the start/end node
  - Serves several purposes in the modified FP-tree algorithm:
      - It is used to evaluate the combinations of gateways in the process model
      - It is the basis for implementing the merge algorithm

Design and construction of a modified FP-tree
  - Essential prerequisites for constructing a modified FP-tree:
      - All cases of the event log must have the start/end node
      - The created process model must not contain duplicated tasks (tasks with the same task name)
      - All cases of the event log should be part of a well-structured process
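
The direct causal matrix introduced above can be illustrated with a small sketch. The slides do not give its exact definition, so this assumes the simplest reading: entry (a, b) counts how often task b directly follows task a in a case of the event log. The labels START and END, the function name direct_causal_matrix and the sample log are hypothetical, chosen only for illustration. The check at the top of the loop mirrors the first prerequisite (every case has the start/end node).

```python
from collections import defaultdict

# Hypothetical labels for the start/end node; the slides do not name them.
START, END = "START", "END"


def direct_causal_matrix(event_log):
    """Count direct successions a -> b over all cases (assumed definition)."""
    matrix = defaultdict(lambda: defaultdict(int))
    for case in event_log:
        # Prerequisite from the slides: every case has the start/end node.
        if len(case) < 2 or case[0] != START or case[-1] != END:
            raise ValueError(f"case must begin with {START} and finish with {END}: {case}")
        # Pair every task with its immediate successor in the case.
        for a, b in zip(case, case[1:]):
            matrix[a][b] += 1
    # Convert to plain dicts so the result is easy to inspect and compare.
    return {a: dict(row) for a, row in matrix.items()}


# Made-up event log: B and C occur in both orders after A.
event_log = [
    [START, "A", "B", "C", END],
    [START, "A", "C", "B", END],
]
matrix = direct_causal_matrix(event_log)
```

In this sample, A is directly followed by both B and C, and B and C directly follow each other in both orders. A pattern of this kind is one a gateway evaluation could inspect, although how the original algorithm reads the matrix is not stated in the slides.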
Design of a modified FP-tree
  - The modified FP-tree follows the same design and construction strategy as a normal FP-tree
  - Three differences:
      - The modified FP-tree has a start/end node
      - Input sequences are non-ordered sequences
      - Internal node links are composed of a doubly linked list

Definition of a modified FP-tree

Algorithm for data insertion in the modified FP-tree

Construction of a modified FP-tree

Process discovery using a modified FP-tree
  - Purpose of the modified FP-tree algorithm
  - Many duplicate tasks exist, each represented by a node in the modified FP-tree
  - The duplicated tasks are present because the process model has a split/merge type of gateway

Mapping modified FP-tree to BPMN process model

Conclusion
  - An adaptive and compact data structure called the modified FP-tree is proposed, together with an algorithm for process discovery based on this structure
  - However, if loop patterns exist in the event log, the results of process discovery are not good
  - If the original event log model has many parallel nodes, the modified FP-tree will have n x n node branches
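
The three design differences and the duplicate-task observation can be made concrete with a minimal sketch. It is not the insertion algorithm from the slides, which is not reproduced in the text. It assumes that the start node is the tree root, that cases are inserted in their logged order without re-sorting by frequency, and that nodes carrying the same task name are chained in a doubly linked list. The names Node, ModifiedFPTree, insert and nodes_for are hypothetical.

```python
class Node:
    """One tree node; prev_link/next_link chain nodes with the same task name."""

    def __init__(self, task, parent=None):
        self.task = task
        self.count = 0
        self.parent = parent
        self.children = {}
        self.prev_link = None  # previous node with the same task (assumed layout)
        self.next_link = None  # next node with the same task (assumed layout)


class ModifiedFPTree:
    def __init__(self, start="START", end="END"):
        self.start, self.end = start, end
        self.root = Node(start)  # assumption: the start node is the root
        self.first_link = {}     # task -> head of its doubly linked node list
        self.last_link = {}      # task -> tail of its doubly linked node list

    def insert(self, case):
        """Insert one case in logged order (no frequency re-sorting)."""
        if len(case) < 2 or case[0] != self.start or case[-1] != self.end:
            raise ValueError("case must contain the start/end node")
        node = self.root
        node.count += 1
        for task in case[1:]:
            child = node.children.get(task)
            if child is None:
                # New branch: create the node and append it to its task's list.
                child = Node(task, parent=node)
                node.children[task] = child
                self._append_link(child)
            child.count += 1
            node = child

    def _append_link(self, node):
        tail = self.last_link.get(node.task)
        if tail is None:
            self.first_link[node.task] = node
        else:
            tail.next_link = node
            node.prev_link = tail
        self.last_link[node.task] = node

    def nodes_for(self, task):
        """Walk the linked list of all nodes that represent the given task."""
        node = self.first_link.get(task)
        while node is not None:
            yield node
            node = node.next_link


# Two cases in which A and B run in parallel, so both orders are logged.
tree = ModifiedFPTree()
tree.insert(["START", "A", "B", "END"])
tree.insert(["START", "B", "A", "END"])
duplicates = {t: len(list(tree.nodes_for(t))) for t in ("A", "B", "END")}
```

With these two cases, A, B and END each end up on two separate nodes. This is the kind of duplication that the slides attribute to split/merge gateways, and the linked lists give a merge step a direct way to find all nodes for one task.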