A Gentle Introduction To Process Mining With Advanced Data Analytics (Part 1)
As business organizations continue to stabilize the digital assets that streamline their business processes, optimization plays a key role. Championing automation in a tech-savvy industrial landscape might strike many of us as an obvious “why not”. Even so, surprises invariably surface once the automation goes into development. Anomalies are ever-present, they curtail progress, and they can capsize a project unless handled delicately. “Way to think outside but pressed right up against the box”. These words may sound annoyingly familiar if you have ever run an automation project.
What then qualifies as successful business process automation? Worry not, a neighboring discipline has all the right visual diagnostic tools for us. Advanced data analytics together with machine learning and predictive modeling can do wonders here.
Now with every topic that we wrap our minds around, there comes a set of sub-topics that help us dissect the main topic. Today, let’s look at what these sub-topics are and see how we can work our way up with the series of articles that I’m planning to take you through.
Process mining: Process mining is a set of methodologies for zeroing in on operational processes by analyzing event logs sourced from databases, information systems, or business management software such as enterprise resource planning (ERP), customer relationship management (CRM), and electronic health record (EHR) systems. Essentially, it aims to understand how processes are executed in practice, identifying issues and areas for enhancement.
Process mining can also be viewed as an integral part of business process management (BPM). It applies data science techniques, including data mining and machine learning, to a company’s software records, which gives a thorough understanding of process performance and supports optimization efforts.
The various perspectives within process mining include:
Control-flow perspective: Focuses on the sequence of activities, aiming to identify the most efficient path for executing a process.
Organizational perspective: Targets the resources involved in a process, such as roles and departments, with the objective of optimizing the organizational structure.
Time perspective: Concentrates on the timing and frequency of events within a process.
Case perspective: Examines inherent properties in different cases or types of processes. Further analysis reveals relationships and hidden dependencies, providing a deeper understanding of problems and their causes.
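To make the four perspectives a little more concrete, here is a minimal sketch in plain Python. The event log, its column layout, and the `CASE_ATTRIBUTES` table are all made up for illustration, and real logs from an ERP or CRM system will look messier. The point is only that the same log can be read in four different ways.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, resource, timestamp).
EVENT_LOG = [
    ("order-1", "Receive order", "sales", "2024-01-01 09:00"),
    ("order-1", "Check stock", "warehouse", "2024-01-01 09:30"),
    ("order-1", "Ship order", "warehouse", "2024-01-01 11:00"),
    ("order-2", "Receive order", "sales", "2024-01-01 10:00"),
    ("order-2", "Check stock", "warehouse", "2024-01-01 10:20"),
    ("order-2", "Reorder item", "purchasing", "2024-01-01 12:00"),
    ("order-2", "Ship order", "warehouse", "2024-01-03 12:00"),
]

# Hypothetical case-level properties (not part of the events themselves).
CASE_ATTRIBUTES = {"order-1": {"quantity": 5}, "order-2": {"quantity": 500}}


def group_by_case(log):
    """Return {case_id: [(activity, resource, datetime), ...]} ordered by time."""
    cases = defaultdict(list)
    for case_id, activity, resource, ts in log:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M")
        cases[case_id].append((activity, resource, when))
    for events in cases.values():
        events.sort(key=lambda event: event[2])
    return dict(cases)


cases = group_by_case(EVENT_LOG)

# Control-flow perspective: which activity sequences occur, and how often.
variants = Counter(tuple(a for a, _, _ in events) for events in cases.values())

# Organizational perspective: how many events each resource handled.
workload = Counter(resource for _, _, resource, _ in EVENT_LOG)

# Time perspective: elapsed hours from the first to the last event of a case.
durations = {
    case_id: (events[-1][2] - events[0][2]).total_seconds() / 3600
    for case_id, events in cases.items()
}

# Case perspective: put a case property next to an outcome to look for links.
quantity_vs_duration = {
    case_id: (CASE_ATTRIBUTES[case_id]["quantity"], durations[case_id])
    for case_id in cases
}

print(variants)
print(workload)
print(durations)
print(quantity_vs_duration)
```

In this toy log the large order is also the slow one, which is the kind of hint the case perspective is meant to surface. Two cases prove nothing, of course.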
Heuristic mining: Since its inception, the field of process mining has branched out in various directions, and process discovery has emerged as one of its most formidable challenges, as the multitude of techniques available today shows. The complexity of process discovery stems from the need for discovered process models to do well across four key quality dimensions: fitness (the model can replay the traces recorded in the event log), precision (the model does not allow much behavior beyond what the log shows), generalization (the model extends to plausible behavior not explicitly present in the log), and simplicity (in the spirit of Occam’s razor). Achieving all four is hard because event logs may contain noise and erroneous behavior, so the discovery algorithm has to be robust enough to handle them.
Furthermore, users often want to impose their own criteria on the layout or quality focus of the discovered models (for instance, emphasizing precision over generalization). As a result, flexible configuration is a desirable yet difficult-to-attain characteristic in process discovery. In practice, heuristic mining refers to the family of so-called “heuristic” miners. These algorithms weigh how often one activity directly follows another, which lets them set aside infrequent behavior and cope with noisy event logs reasonably well.
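As a rough illustration of how a heuristic miner deals with noise, the sketch below counts how often one activity directly follows another and turns those counts into a dependency score, (|a>b| - |b>a|) / (|a>b| + |b>a| + 1), which is the measure commonly associated with the Heuristics Miner. The traces and the 0.7 threshold are assumptions picked for the example, and self-loops and parallel branches are deliberately left out.

```python
from collections import Counter

# Hypothetical traces: nine well-behaved cases and one noisy, out-of-order case.
TRACES = [["A", "B", "C", "D"]] * 9 + [["A", "C", "B", "D"]]


def directly_follows(traces):
    """Count how often activity b comes immediately after activity a."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency score in (-1, 1); values near 1 suggest that a really leads to b."""
    ab, ba = counts[(a, b)], counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


def discover_edges(traces, threshold=0.7):
    """Keep only the directly-follows pairs whose score reaches the threshold.

    The threshold is an illustrative choice, not a recommended default.
    """
    counts = directly_follows(traces)
    return {
        pair: round(dependency(counts, *pair), 3)
        for pair in counts
        if dependency(counts, *pair) >= threshold
    }


edges = discover_edges(TRACES)
print(edges)
```

The single noisy trace contributes the pairs A>C, C>B and B>D, but each of them scores 0.5 or lower and falls away, so only the main path A, B, C, D survives. Lowering the threshold would let that rarer behavior back into the model, which is the precision versus generalization trade-off in miniature.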
Decision mining: Decision mining enriches process models with the rules that govern the decisions made during a process, drawing on historical process execution data. These rules, derived from process data, explain how the choice between alternative activities is made. Current decision mining methods primarily focus on discovering mutually exclusive rules, where exactly one activity out of several options can be executed. These approaches assume that decision-making is entirely deterministic and that every factor influencing a decision is recorded. However, when decision rules overlap because of nondeterminism or incomplete information, the rules produced by current methods may not fit the recorded data well.
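Here is a small sketch of the idea, assuming a single decision point and a single numeric attribute. Decision trees are a common choice for this step. A one-threshold rule stands in for them here to keep things short. The observations, the activity names, and the `learn_threshold_rule` helper are hypothetical.

```python
# Hypothetical observations at one decision point:
# (order amount, activity that was chosen next).
OBSERVATIONS = [
    (120, "Auto-approve"),
    (300, "Auto-approve"),
    (450, "Auto-approve"),
    (800, "Manual review"),
    (950, "Manual review"),
    (1500, "Manual review"),
    (200, "Manual review"),  # overlaps with the auto-approved range
]


def learn_threshold_rule(observations, low_label):
    """Find the cut-off t for the rule 'amount <= t -> low_label, else the other
    activity' that agrees with as many recorded decisions as possible.

    Returns (threshold, share of observations the rule explains).
    """
    best_threshold, best_hits = None, -1
    for t in sorted({amount for amount, _ in observations}):
        hits = sum(
            1
            for amount, label in observations
            if (label == low_label) == (amount <= t)
        )
        if hits > best_hits:
            best_threshold, best_hits = t, hits
    return best_threshold, best_hits / len(observations)


threshold, agreement = learn_threshold_rule(OBSERVATIONS, "Auto-approve")
print(f"amount <= {threshold} -> Auto-approve (explains {agreement:.0%} of cases)")
```

The learned rule cannot explain the order of 200 that went to manual review. Whatever drove that decision is not in the data, so a mutually exclusive rule will always miss it, which is exactly the overlap problem described above.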
Bottleneck analysis: Bottlenecks are the points in a business process where work piles up and slows everything downstream, and process mining offers a way to find them. The starting point is event log data, a collection of recorded events or activities that have occurred within a system or process. For example, let’s take the steps involved in a manufacturing process. When this event log data is analyzed, it allows for a detailed “replay” of the material flow through each step of the manufacturing process. Think of it as a playback of the events that took place during the production of a particular item or product. This analysis provides a chronological and comprehensive view of how each part or component has moved through the different stages of production.
The term “replay” suggests that manufacturers can not only observe the current status and location of each part in real time but can also review and analyze the historical journey of these parts. This retrospective view is valuable for several reasons. It allows manufacturers to identify bottlenecks, inefficiencies, or issues that occurred during the manufacturing process. It also provides insights into the timeline and sequence of events, aiding quality control, process optimization, and troubleshooting.
In essence, event log analysis gives manufacturers a powerful tool to monitor and improve their manufacturing processes, because it shows in detail how materials flow through each step, both in real time and in hindsight.
The primary benefit of process mining, compared to conventional methods, is its dynamic functionality, enabling the identification of bottlenecks that may change from one stage of the process to another over defined time periods.
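A minimal sketch of such a replay, under a few assumptions: the log below is invented, each timestamp marks when a step finished, and the time between two consecutive steps is treated as one number without separating waiting from processing. Grouping by day is just one possible choice of time period.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical manufacturing log: (part id, step, completion timestamp).
EVENT_LOG = [
    ("part-1", "Cutting", "2024-03-04 08:00"),
    ("part-1", "Welding", "2024-03-04 09:00"),
    ("part-1", "Painting", "2024-03-04 13:00"),
    ("part-2", "Cutting", "2024-03-04 08:30"),
    ("part-2", "Welding", "2024-03-04 10:00"),
    ("part-2", "Painting", "2024-03-04 15:00"),
    ("part-3", "Cutting", "2024-03-11 08:00"),
    ("part-3", "Welding", "2024-03-11 14:00"),
    ("part-3", "Painting", "2024-03-11 15:00"),
]


def elapsed_between_steps(log):
    """Yield (day, (from_step, to_step), hours) for consecutive steps of each part."""
    by_part = defaultdict(list)
    for part, step, ts in log:
        by_part[part].append((datetime.strptime(ts, "%Y-%m-%d %H:%M"), step))
    for events in by_part.values():
        events.sort()  # replay each part in chronological order
        for (t1, s1), (t2, s2) in zip(events, events[1:]):
            yield t1.date().isoformat(), (s1, s2), (t2 - t1).total_seconds() / 3600


def bottleneck_per_day(log):
    """Return {day: (slowest transition, average hours)} for each day in the log."""
    samples = defaultdict(lambda: defaultdict(list))
    for day, transition, hours in elapsed_between_steps(log):
        samples[day][transition].append(hours)
    result = {}
    for day, transitions in samples.items():
        averages = {tr: sum(h) / len(h) for tr, h in transitions.items()}
        slowest = max(averages, key=averages.get)
        result[day] = (slowest, averages[slowest])
    return result


bottlenecks = bottleneck_per_day(EVENT_LOG)
for day, (transition, hours) in sorted(bottlenecks.items()):
    print(day, transition, f"{hours:.2f} h")
```

In this made-up log the slowest handover is Welding to Painting on the first day and Cutting to Welding a week later. A single average over the whole log would blur that shift.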
Together, these four sub-topics outline how process mining can serve as a diagnostic tool for building end-to-end business process automations and for addressing the challenges they present. This is only a general breakdown of process mining, and applying it can get tricky and a little elusive at times. In my next article, I’ll discuss the libraries we can use with Python and how machine learning finds its way into this topic.