
2.3 Measurement of alarm system performance

2.3.3 Diagnostic metrics

The diagnostic metrics of an alarm system aim to identify specific problems with specific alarms. These metrics are usually monitored by the owner of the alarm system or by the personnel responsible for its maintenance or for improving its efficiency. Various combinations of the performance and diagnostic metrics form the basis of the metrics most often monitored in industrial plants, and alarm management reports, dashboards and KPIs are usually built on them.

Some commonly used metrics are the following:

• Listing and quantity of the most frequent alarms (Top N bad actors)

• Listing and quantity of chattering alarms

• Listing of stale alarms

• Listing of shelved alarms, possibly with shelving duration

• Listing of out-of-service alarms, possibly with durations

• Listing of potentially redundant alarms shown by analysis

Top N bad actors

A majority of alarms are usually generated by a small number of process variables known as bad actors. According to the literature, a bad actor is "an alarm that is suspect and cannot be relied upon to deliver accurate information to the operator, such as stale, chattering, duplicate or suppressed alarms" [34]. In practice, the term refers to the most frequent alarms in the Alarm & Event log database, which are responsible for the bulk of the alarm load. According to the ISA-18.2 standard [13], "Relatively few individual alarms (e.g., 10 to 20 alarms) often produce a large percentage of the total alarm system load (e.g., 20 % to 80 %). The most frequent alarms should be reviewed at regular intervals (e.g., daily, weekly, or monthly). Substantial performance improvement can be made by addressing the most frequent alarms". This Pareto-like 80-20 distribution, in which 20 % or less of the alarm variables are responsible for 80 % or more of the alarms, is common, and extreme cases are known in which a single variable contributes more than 50 % of the total alarm load [35]. In other cases, the top 5 bad actors contributed 87 % [36], or the top 10 bad actors contributed more than 75 % of the total number of annunciated alarm messages [37]. In our experience, figures like these are not outliers. Therefore, alarm standards strongly recommend monitoring the top 10 most frequent alarms, which together should not contribute more than 5 % of the total alarm load (that is, no bad actors are acceptable) [35]. Usually, the top 5 to 10 alarms are monitored regularly on bar plots showing their counts, with a line plot showing their cumulative proportional contribution to the overall number of alarms [37]; comparing them over time is common as well [27]. Owing to their practical importance and problem-solving efficiency, several frameworks incorporate them [35, 38, 21].
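
The counting behind such a top N listing can be illustrated with a minimal sketch. It assumes the Alarm & Event log has already been reduced to one tag per annunciated alarm; the function name `top_n_bad_actors` and the tag names are hypothetical and are not taken from any of the cited works.

```python
from collections import Counter


def top_n_bad_actors(alarm_tags, n=10):
    """Return (tag, count, cumulative_share) for the n most frequent alarm tags.

    alarm_tags: iterable with one entry per annunciated alarm (assumed format).
    cumulative_share is the fraction of the total alarm load covered by this
    tag and all more frequent tags, i.e. the line in a Pareto chart.
    """
    counts = Counter(alarm_tags)
    total = sum(counts.values())
    rows = []
    cumulative = 0
    for tag, count in counts.most_common(n):
        cumulative += count
        rows.append((tag, count, cumulative / total))
    return rows


# Hypothetical log: 10 annunciated alarms from three tags.
log = ["FI101.HI"] * 6 + ["TI205.HI"] * 3 + ["LI330.LO"] * 1
for tag, count, share in top_n_bad_actors(log, n=2):
    print(f"{tag}: {count} alarms, cumulative share {share:.0%}")
```

The counts correspond to the bars and the cumulative share to the line of the plots described above. Comparing the last cumulative share with a target such as the 5 % limit quoted from [35] shows at a glance whether the top N alarms dominate the alarm load.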

Chatter index

Less informative alarms, namely nuisance or constantly sounding ones, significantly increase the operator workload through non-actionable distractions. The most common form of nuisance alarm is the chattering alarm, which does not remain active long enough for the operators to take corrective action and which, in critical plant conditions, can significantly hinder their work. In a properly rationalized and designed alarm system, after the elimination of nuisance (chattering) alarms, the resultant alarm rate reflects the ability of the control system to keep the operation in the normal operating zone without operator interactions [14]. Chattering alarms are fundamentally in conflict with the philosophy that each alarm should be actionable. Hollifield et al. claim that chattering alarms are the most common type of alarm, constituting about 70 % of all alarms [2]. Similarly, constantly sounding alarms also harm the quality of the alarm data. In an industrial environment, these long-standing alarms are ignored by the operators either because they are uninformative or because they are hidden from the operators as shelved or forbidden alarms (such alarms are usually still present in historical datasets). Several approaches exist for detecting chattering alarms, for example the balance between the actions taken by the operator and the occurrences of alarms [39], but the best-known and most commonly used is the chatter index introduced by Kondaveeti et al. [40], [41]. Various studies applying the chatter index in alarm management are summarized in the following list:

• Kondaveeti et al. introduced the chatter index for the quantification of alarm chatter [40]

• Wang and Chen developed an online method for the detection and reduction of chattering alarms due to oscillations [42]

• The improvement of the alarm system of an industrial power plant case study [36]

• Plotting the chatter index over a one-week period for the top 50 alarms [43]

• Sun et al. reduced the number of chattering alarms via median filters [44]

• The application of the chatter index for the alarm system improvement in a Combined-Cycle Gas Turbine Power Plant [45]

• The design of alarm deadbands for the reduction of false and missed alarms and alarm chattering [46]

• In our previous work, we applied the chatter index to prefilter the alarm and event log database before sequence mining [28]

• Naghoosi et al. developed a method to estimate the chatter index based on statistical properties of the process variable as well as alarm parameters [47]

The calculation of the chatter index is discussed in Section 9.3.
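
For orientation only, the idea of a run-length-based index can be sketched as follows. This is a minimal illustration under stated assumptions, not the definition used in this work: it assumes the formulation usually attributed to [40], in which the run length r is the whole number of seconds between two consecutive activations of the same alarm, P_r is the relative frequency of runs of length r, and the index is the sum of P_r / r. The function name and the input format (a list of activation timestamps in seconds) are hypothetical; the definition in Section 9.3 takes precedence wherever it differs.

```python
from collections import Counter


def chatter_index(alarm_times_s):
    """Run-length-based chatter index for a single alarm tag (illustrative).

    alarm_times_s: activation timestamps in seconds (assumed input format).
    Returns the sum over r of P_r / r, where P_r is the share of runs of
    length r; values close to 1 mean the alarm re-activates almost every second.
    """
    times = sorted(alarm_times_s)
    # Run length: whole seconds between consecutive activations, at least 1.
    runs = [max(1, round(b - a)) for a, b in zip(times, times[1:])]
    if not runs:
        return 0.0  # fewer than two activations: no runs to evaluate
    histogram = Counter(runs)
    total = len(runs)
    return sum((n_r / total) / r for r, n_r in histogram.items())


# Hypothetical tag: three short runs of 2 s followed by one long run of 100 s.
print(f"{chatter_index([0, 2, 4, 6, 106]):.4f}")  # 0.3775
```

Because each run contributes the reciprocal of its length, short runs dominate the result. This makes the index sensitive to rapid re-activation while staying largely insensitive to the total number of alarms on the tag.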
