Taming the Wild Alarm System, Part 7
Beyond Alarm Management – Doing More with a Powerful Tool
In this blog series, we’ve mostly discussed the alarm system. It is a small but important part of the overall control system. To accomplish alarm management, and comply with standards, we need to continuously analyze alarm system performance and monitor alarm settings for inappropriate changes. We need to document all our alarms in terms of causes, consequences and corrective actions, and make that valuable information available to the operator.
Doing these things is not complicated and can be highly automated. They involve some good software, a connection to the control system to collect alarm data for analysis, and a Master Alarm Database for detecting and managing alarm change and providing useful information to the operator. Now, this is a powerful toolset! And there is nothing that engineers (and business managers) love more than getting more capability and performance out of a tool that they already have.
Our Hexagon software capabilities have always been driven by requests and advice from customers. Customers came up with other control system problems to be solved, and we built on this powerful infrastructure to do just that. Our PAS PlantState Integrity™ software now contains a lot more capability than just alarm management! And when you are ready for these other features, the additional modules are easily added.
From Alarm Management to Operational Risk Management for Automation Systems
The energy, process, power and similar industries use operators, control systems and independent safety systems to manage operations risk. Alarm management is an important component of this overall strategy. Other than that? Managers typically use a hodgepodge of disconnected and sometimes problematic methods to address other equally important operational challenges.
Let’s look at a structured approach to managing our process risks, in a way where every layer reinforces the other layers, and it all builds on our existing infrastructure. This blog is an overview because we have more detailed resources on each aspect, including papers and webinars.
Watch this webinar that discusses all the contents of Automation System Operational Risk Management.
Control Loop Performance Monitoring
The control system is at the center. We depend on it to keep our process within designed boundaries to make product safely, efficiently and profitably! But does it? Throughout industry, control loops have numerous problems with the basics, and they develop more over the years: loops not working as designed, loops that must be run suboptimally in manual, high loop variability, tuning problems, valve hysteresis and stiction, and poor control strategies. On top of that, there is an industry-wide shortage of capable engineers to deal with all these issues! The real question is not: “How do I justify improving control loop performance?” It is: “Where is the justification for poor control loop performance?”
That problem is solved. On-line, fully automated control loop performance monitoring software (PAS ControlWizard™ from Hexagon) is available for all modern control systems. It incorporates the knowledge of experts and analyzes loop performance in many categories. Regular automated reports with detailed metrics are provided to engineers. The tool greatly leverages your control and operations engineers’ ability to diagnose and solve control loop problems.
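To make "analyzes loop performance in many categories" a little more concrete, here is a minimal sketch of three simple loop health metrics. It is not how PAS ControlWizard works internally. The `LoopSample` layout, the `"AUTO"`/`"MAN"` mode labels and the choice of metrics are assumptions made for illustration only.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class LoopSample:
    pv: float   # process variable
    sp: float   # setpoint
    mode: str   # "AUTO" or "MAN" (hypothetical mode labels)


def loop_metrics(samples):
    """Return simple health metrics for one loop from evenly spaced samples."""
    if not samples:
        raise ValueError("no samples supplied")
    errors = [s.pv - s.sp for s in samples]
    manual_count = sum(1 for s in samples if s.mode == "MAN")
    return {
        # Share of the period the loop was not controlling automatically.
        "pct_time_manual": 100.0 * manual_count / len(samples),
        # Average distance from setpoint, regardless of direction.
        "mean_abs_error": mean(abs(e) for e in errors),
        # Spread of the control error, a rough measure of variability.
        "error_std_dev": pstdev(errors),
    }


samples = [
    LoopSample(pv=50.0, sp=50.0, mode="AUTO"),
    LoopSample(pv=52.0, sp=50.0, mode="AUTO"),
    LoopSample(pv=48.0, sp=50.0, mode="MAN"),
    LoopSample(pv=50.0, sp=50.0, mode="MAN"),
]
metrics = loop_metrics(samples)
print(metrics)
```

Run regularly against every loop, even metrics this simple would let an engineer sort hundreds of loops by time spent in manual or by variability and start with the worst ones.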
View this webinar for more about control loop monitoring.
When loops do not function properly (and for other reasons) we have operators to run the process. This blog series has been mostly about alarm management so there’s no need to repeat more here.
This is a good webinar summarizing all the steps of alarm management.
High Performance HMI
The alarm system, while important, is only a small part of the overall control system’s Human-Machine Interface (HMI). The HMI is the collection of screens, graphics, and controls the operator uses to monitor and interact with the process. A poor HMI makes situation awareness more difficult. Burgeoning process upsets may be missed, and actual upsets may become more severe or last longer than necessary. Upsets usually affect profitability and may grow to affect safety or environmental performance. Poor HMIs have been cited as significant contributing factors to major industrial accidents.
Process control graphics originated in the late 1980s to early 1990s, as part of the changeover to digital control systems such as SCADA and DCS technologies. There were no guidelines for designing a “good” HMI. Several poor practices were put into effect and became paradigms, such as in this example.
A typical graphic from the 1990s, basically a P&ID covered in numbers.
But in 2009, PAS (now a part of Hexagon) released The High Performance HMI Handbook (https://resources.pas.com/handbooks). We showed methods to take human factors into account in the design of control graphics so that the operator’s overall situation awareness is maximized. There was major, positive reaction to the book, and many large corporations began HMI redesign based on its concepts.
High Performance HMI is about changing process graphics from the typical busy, brightly colored P&IDs covered with dozens of numbers, into functional graphics displaying logically-arranged indicators that show normal and abnormal ranges and alarm conditions. Valuable context is given to the hundreds or thousands of sensors that an operator is expected to monitor.
A High Performance Level 2 Process Control Graphic of a Reactor
We followed up the HMI book with a major two-part white paper (free) that updates it, containing dozens of additional figures, elements, examples and case studies proving the tangible benefits!
Beyond the Operator!
We’ve now provided the operator with the tools they need. But major industrial accidents still occur far too frequently. Most such accidents involve operation of some part of the process outside of the prescribed and designed safe or acceptable boundaries. In many companies, management is highly concerned with verifying, at all times, whether the processes are within such boundaries, including those of safety, environmental, quality, efficiency and profitability. The obvious answer is the automated, continuous monitoring of a plant’s conditions relative to such boundaries. The infrastructure already built for alarm management makes this easy!
The important boundaries of a process are typically stored in a hodgepodge of different procedures, design documents, and reports. These are often contradictory, out-of-date, or even lost. Control system interlock settings, for instance, have been found not to match the correct values in design documents. This is often a management of change (MOC) issue that we will get to shortly…
Visualization of boundaries is essential to safe and profitable operations. When optimum ranges are depicted, they can be achieved. Safety boundaries that are visible become avoidable. Without such visualization, boundary excursions away from the optimum may not be noticed by operators or managers, and processes may be run for considerable periods of time outside of desirable ranges. This may be safe, but maximum efficiency and profitability cannot be achieved under such circumstances.
Accomplishing Boundary Management
A one-time research effort identifies and agrees upon all relevant process boundary information. Sources of the data include process design documents, P&IDs, equipment specifications, process hazard analyses, operating procedures and similar documentation. The correct data is placed into a new section of the existing, secure and controlled Master Alarm Database (MADB), becoming the “best single version of the truth.”
Then we use the data connection to continuously monitor the process. This feeds real-time analysis, depiction and automated reporting of key boundary information. Useful, automated reports include:
- Most frequent boundary excursions
- Time duration of each excursion
- Excursions per processing unit and boundary type
- Excursions ranked by importance and time
- Automatically calculated financial opportunity cost or losses per excursion!
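The reports listed above can be derived from time-stamped process data and a table of boundaries. The following is a minimal sketch of that idea, not the PAS InBound implementation. The `Boundary` fields, the flat `cost_per_hour` figure and the rule that an excursion ends at the first in-range reading are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Boundary:
    tag: str
    low: float
    high: float
    cost_per_hour: float  # hypothetical opportunity cost while outside the range


def _record(boundary, start, end):
    """Build one excursion record with its duration and estimated cost."""
    duration = end - start
    return {
        "tag": boundary.tag,
        "start": start,
        "end": end,
        "duration_hours": duration,
        "cost": duration * boundary.cost_per_hour,
    }


def find_excursions(boundary, readings):
    """Find excursions in readings, a time-sorted list of (hours, value) pairs."""
    excursions = []
    start = None
    for t, value in readings:
        outside = value < boundary.low or value > boundary.high
        if outside and start is None:
            start = t  # excursion begins
        elif not outside and start is not None:
            excursions.append(_record(boundary, start, t))  # excursion ends
            start = None
    if start is not None:
        # Still outside the boundary when the data ends.
        excursions.append(_record(boundary, start, readings[-1][0]))
    return excursions


reactor_temp = Boundary(tag="TI-101", low=180.0, high=200.0, cost_per_hour=500.0)
readings = [(0.0, 190.0), (1.0, 205.0), (2.0, 207.0), (3.0, 195.0),
            (4.0, 198.0), (5.0, 175.0), (6.0, 185.0)]
excursions = find_excursions(reactor_temp, readings)
total_cost = sum(e["cost"] for e in excursions)
print(len(excursions), total_cost)
```

Sorting or grouping these records by tag, duration or cost gives the "most frequent", "time duration" and "financial opportunity cost" views in the list above.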
An Example of a Boundary Tracking Display
Using PAS InBound® to accomplish boundary management, you can monitor conditions as the process nears or leaves the optimum zones. Proximity of the process to quality, productivity, safety or environmental boundaries can be depicted in real time and can even generate alerts to people such as operations engineers and managers. The system can feed corporate performance dashboards. It has never been easier for staff to know exactly what the plant is doing!
The previously mentioned webinar video on High Performance HMI also includes boundary management. For a short but very sweet case study, see here.
Management of Safety Systems and Risk
Our plants are protected by Safety Instrumented Systems (SISs). Their design involves a complex body of knowledge that supports several international standards, such as IEC 61511, Functional Safety – Safety Instrumented Systems for the Process Industry Sector. Much of the work and knowledge in this area revolves around system design, not operation. But operation is where accidents occur.
These systems are often designed by functional safety experts, and then turned over to an operating facility without such expertise. There are many tasks and checks that should be made on safety systems during operation. Safety systems can be subject to the same lack of coherent documentation as was found in documenting process boundaries: there is no “single source of the truth.” An operating procedure may mention one setpoint for safety function activation, a design document specifies another, and a maintenance test procedure has yet another. It is common for disparate systems to accumulate errors this way. Safety interlocks are often bypassed (such as for testing), and bypass may be controlled by procedures susceptible to human error.
These factors mean that the effectiveness of the operating SIS can differ significantly from the design. Risks may be well hidden and not obvious.
A safety system requires performance monitoring, maintenance, bypass management, MOC, periodic proof testing and ongoing suitability verification. Safety function demand rates and response times assumed in the design phase are supposed to be verified by actual performance numbers in operation. This task is often overlooked. If an assumption is wrong, the function may be under-designed, over-designed, overly complex or tested more often than needed, wasting money. These tasks require the attention, time, and effort of engineers and technicians – and are often accomplished using a variety of inconsistent, error-prone, and unreliable methods, such as uncontrolled spreadsheets, notes, and manually marked-up drawings and sketches. Management of safety systems often varies by site. Automation can greatly improve execution of these tasks!
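As one small example of such automation, the sketch below compares a demand rate assumed at design time with the rate actually observed in operation. It is a deliberately simple illustration: the function name and inputs are hypothetical, and a real verification would also consider statistical confidence, which is ignored here.

```python
def demand_rate_check(assumed_per_year, demand_count, years_observed):
    """Compare the observed demand rate of a safety function with its design assumption."""
    if years_observed <= 0:
        raise ValueError("years_observed must be positive")
    observed = demand_count / years_observed
    return {
        "assumed_per_year": assumed_per_year,
        "observed_per_year": observed,
        # True means the function is called on more often than the design assumed.
        "exceeds_assumption": observed > assumed_per_year,
    }


# Design assumed one demand every ten years; operations logged three in five years.
result = demand_rate_check(assumed_per_year=0.1, demand_count=3, years_observed=5)
print(result)
```

A result that exceeds the assumption is a prompt to revisit the design basis of that function rather than a conclusion in itself.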
As in Boundary Management, a one-time documentation research effort consolidates the correct settings for all safety instrumented functions into a new section of the MADB.
These settings are then automatically monitored. Changes are automatically reported to ensure proper MOC. Needed calculations are done automatically. Every safety function activation is analyzed and tracked, and a summary report is produced automatically. Bypasses are displayed, tracked, and controlled. Testing costs can be minimized. Key performance indicators for the safety system are now under control, with far less staff effort than was needed in the past.
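The core of automatic settings monitoring is a comparison between the approved values and what is currently configured. Here is a minimal sketch of that comparison. The dictionary layout, the function tags and the attribute names are hypothetical, and the approved values would come from the MADB rather than a literal in code.

```python
def detect_setting_changes(master, live):
    """Return (key, expected, actual) for every setting that differs from the master record."""
    findings = []
    for key, expected in master.items():
        if key not in live:
            findings.append((key, expected, None))       # setting missing from the live system
        elif live[key] != expected:
            # Exact comparison; a real check might allow a tolerance for analog values.
            findings.append((key, expected, live[key]))
    for key in live.keys() - master.keys():
        findings.append((key, None, live[key]))          # setting with no approved record
    return sorted(findings, key=lambda finding: finding[0])


# Keys are (safety function, attribute); all names and values are made up.
master = {
    ("SIF-101", "trip_setpoint"): 250.0,
    ("SIF-101", "bypassed"): False,
    ("SIF-102", "trip_setpoint"): 15.0,
}
live = {
    ("SIF-101", "trip_setpoint"): 265.0,   # changed without matching the master
    ("SIF-101", "bypassed"): False,
    ("SIF-102", "trip_setpoint"): 15.0,
    ("SIF-103", "trip_setpoint"): 80.0,    # not documented in the master
}
findings = detect_setting_changes(master, live)
for key, expected, actual in findings:
    print(key, "expected", expected, "found", actual)
```

Each finding is either an authorized change that the master record should be updated to reflect, or an unauthorized one that needs to be reversed.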
Operational risk is increased when a safety function is bypassed, when testing is occurring or overdue, and when other control system performance factors cause problems. In the past, it has been impractical or even impossible to determine the current real-time risk level of a process resulting from combinations of these factors. The plant may be operating at a higher risk level than was ever intended. The familiar safety model showing that “the holes in the slices of Swiss cheese are lining up” applies, and accidents are more likely to occur.
By monitoring all these factors, the current risk profile can be displayed on a dashboard, appear in automatic reports and trigger immediate notifications. Management can make operational decisions taking risk into account.
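One way to picture a combined risk indicator is a weighted count of the factors just described. The sketch below is only that, a picture. The weights, thresholds, level names and `UnitStatus` fields are invented for illustration and do not represent any Hexagon product logic or a validated risk model.

```python
from dataclasses import dataclass


@dataclass
class UnitStatus:
    unit: str
    bypassed_functions: int    # safety functions currently bypassed
    overdue_proof_tests: int   # proof tests past their due date
    loops_in_manual: int       # control loops not in their normal mode


# Hypothetical weights: a bypass counts more than an overdue test or a loop in manual.
WEIGHTS = {"bypassed_functions": 5, "overdue_proof_tests": 3, "loops_in_manual": 1}


def risk_score(status):
    """Weighted sum of the active risk factors for one processing unit."""
    return sum(weight * getattr(status, field) for field, weight in WEIGHTS.items())


def risk_level(score):
    """Map a score to a dashboard level using hypothetical thresholds."""
    if score >= 10:
        return "HIGH"
    if score >= 4:
        return "ELEVATED"
    return "NORMAL"


crude = UnitStatus("Crude Unit", bypassed_functions=1, overdue_proof_tests=2, loops_in_manual=1)
utilities = UnitStatus("Utilities", bypassed_functions=0, overdue_proof_tests=0, loops_in_manual=2)
for status in (crude, utilities):
    score = risk_score(status)
    print(status.unit, score, risk_level(score))
```

The point of combining factors is that each one alone may look tolerable, while together they show the "holes lining up."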
A much more detailed webinar on Safety Systems Management is here.
Automation System Management of Change
MOC of automation systems is essential. It is easy to dismiss MOC as simply filling out forms and obtaining signatures, so that only authorized people make changes. But that ignores a significant underlying need of the “authorized” people who work with these systems. And that is to know this:
“With my best intentions, is this change I am about to make going to mess something up?”
The answer is not simple. Modern automation systems are exceedingly complex, and their inner workings are not easily examined. A small change in the control system can have major, hidden ramifications. As an example, a control engineer who simply changes a tagname can cause function loss in:
- Several different process control graphics and trends
- Compensation calculations for a flowmeter used for billing purposes
- Logic points or interlock points used for protective purposes
- Process historization and tracking, plus efficiency calculations and reports in other applications, which become inaccurate
The engineer needs to learn all of this in advance of a “simple” change, rather than by picking up the pieces afterwards. The problem is compounded when a single site has control systems of several different types, each with their own idiosyncrasies.
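Learning the ramifications in advance amounts to walking the references between configuration entities. The sketch below shows the idea with a reverse index and a breadth-first search. The entity names and the `references` layout are hypothetical, and a real system would import this data from the control system configuration rather than hard-code it.

```python
from collections import defaultdict, deque


def build_reverse_index(references):
    """Turn {consumer: [entities it reads]} into {entity: {consumers that read it}}."""
    index = defaultdict(set)
    for consumer, sources in references.items():
        for source in sources:
            index[source].add(consumer)
    return index


def impact_of_change(index, entity):
    """Return every entity that depends, directly or indirectly, on the given one."""
    affected = set()
    queue = deque([entity])
    while queue:
        current = queue.popleft()
        for consumer in index.get(current, ()):
            if consumer not in affected:
                affected.add(consumer)
                queue.append(consumer)   # follow indirect references too
    return affected


# Hypothetical configuration: which entities read which tags or calculations.
references = {
    "graphic:ReactorOverview": ["FI-100"],
    "calc:FI-100-COMP": ["FI-100", "TI-101"],
    "interlock:I-7": ["calc:FI-100-COMP"],
    "report:Efficiency": ["calc:FI-100-COMP"],
    "trend:Utilities": ["PI-300"],
}
index = build_reverse_index(references)
affected = impact_of_change(index, "FI-100")
print(sorted(affected))
```

In this made-up example, renaming FI-100 touches a graphic and a compensation calculation directly, and an interlock and a report indirectly, which is exactly the kind of hidden reach the engineer needs to see before making the change.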
A “fairly simple” control map, showing how changing one control system entity can affect many others, in ways that might not be anticipated.
The Solution: Multi-Platform Automation System Configuration Management
Hexagon’s PAS Automation Integrity® automatically and regularly imports the configuration from over 75 different kinds of control systems and connected devices into a single engineering tool. The details are aggregated, given context and made visual, such as in the control map shown.
Customer, seeing control maps for the first time: "That’s way too complicated! Too much is going on. How am I supposed to understand that?!"
Hexagon: "It is showing you how your system is currently configured. You created that complexity, not us! And now you can see what potential changes will affect before you make them."
All of the connections and references for each component or data entity are revealed. Automated MOC reports can be generated, and the MOC history of any component or entity is tracked. Unauthorized changes are detected automatically. Engineers can ensure that a planned change handles all of the ramifications that will ensue and include those in the design. Engineering productivity has also been proven to increase significantly with such a tool.
The architecture of Automation Integrity and the other capabilities discussed.
View this brochure for more information on PAS Automation Integrity.
Engineering and Managerial Visualization
Many risk-related factors can be combined into a single near-real-time display. KPIs of loop and alarm performance, process boundary excursions, production rates, unauthorized change, and safety system status can be combined into displays for engineers and managers. Multiple sites can be compared.
It has never been easier for engineering and corporate management to understand what is really happening in the plant. This can have a profound, positive effect on safety, production and profitability.
An example of multi-site visualization of multiple operational KPIs
Risks associated with safety, emissions, quality, and profitability are inherent in the processing industries. Managing those risks is a top priority of operations managers. New technologies and approaches are enabling the visualization and monitoring of operations KPIs and risk in real time.
The infrastructure developed for alarm management has lent itself well to addressing most other automation system-related aspects of process safety and production. Based on alarm management tools, a convergence of technologies is enabling plant operators, engineers, and managers to know and understand, at all times, exactly how their plants are performing, and where their current risks lie.
Feel free to contact me at email@example.com with questions!
Review other Taming the Wild Alarm System topics in this blog series:
- How Did We Get In This Mess?
- The Most Important Alarm Improvement Technique in Existence
- SHUT UP! Fixing Chattering and Fleeting Alarms
- Just How Bad is Your Alarm System?
- Horrible Things We Find During Alarm Rationalization
- Why did they have to call it “Philosophy?”
- Beyond Alarm Management – Doing More with a Powerful Tool