Evaluation¶
In PM4Py, it is possible to compare the behavior contained in the log and the behavior contained in the model, in order to see if and how they match. Four different dimensions exist in process mining, including the measurement of replay fitness, the measurement of precision, the measurement of generalization, the measurement of simplicity.[6]
Replay Fitness
The quality dimension of replay fitness describes the fraction of the behavior in the event log that can be replayed by the process model. Several different measures exist for this quality dimension. Some measures consider traces of behavior as whole, checking if the whole trace can be replayed by the process model. Other measures consider the more detailed level of events within a trace and try to get a more fine-grained idea of where the deviations are. Another important difference between existing measures is that some enforce a process model to be in an accepted end state when the whole trace is replayed. Others ignore this and allow the process model to remain in an active state when the trace ends. The most recent and robust technique uses a cost-based alignment between the traces in the event log and the most optimal execution of the process model. This allows for more flexibility and distinction between more and less important activities by changing the costs [7].
Precision
Precision is estimated by confronting model and log behavior: imprecisions between the model and the log (i.e., situations where the model allows more behavior than the one reflected in the log) are detected by juxtaposing behavior allowed by the log and the one allowed by the model. This juxtaposition is done in terms of an automaton: first, an automaton is built from the alignments. Then, the automaton is enhanced with behavioral information of the model. Finally, the enhanced automaton is used to compute the precision [8].
Generalization
Replay fitness and precision only consider the relationship between the event log and the process model. However, the event log only contains a part of all the possible behavior that is allowed by the system. Generalization therefore should indicate if the process model is not “overfit-ting” to the behavior seen in the event log and describes the actual system. Another explanation for generalization is the likelihood that the process model is able to describe yet unseen behavior of the observed system. To date only few measures for generalization exist [7].
Simplicity
The simplicity dimension evaluates how simple the process model is to understand for a human. This dimension is therefore not directly related to the observed behavior but can consider the process model solitarily. Since there are different ways to describe the same behavior using different process models, choosing the simplest one is obviously best. This is also expressed by Occam’s razor: “one should not increase, beyond what is necessary, the number of entities required to explain anything”. However, sometimes a complex process model can only be simplified by changing the behavior, hence influencing the other quality dimensions. Several measures exist to measure how simple a process model is, for an overview we refer to. However, research has also shown that size is the main complexity indicator [7].