The impact of safety instrumented system isolation on current and future plant operations

By Nidhal Jamal and Uduak J. Daniels
Connectivity & Cybersecurity

Summary

Advanced persistent threat (APT) attacks are on the rise against critical infrastructure.
Most of the controls identified to reduce the likelihood of APTs are straightforward, except SIS segmentation.
Until the term “isolation” is defined and understood, asset owners may misinterpret their compliance with regulatory mandates.

‘Isolation’ is a challenge to creating a secure ICS architecture

Advanced persistent threat (APT) attacks are on the rise against critical infrastructure. APT attackers use continuous, sophisticated, and secretive hacking techniques to gain access to a system and remain inside for a prolonged period of time—with potentially devastating consequences. The resilience and response to these incidents by asset owners have been commendable, reducing the potential impacts on safety, integrity, and reliability. Industry has responded to the most recent cyberattacks targeting safety instrumented systems (SISs) by recommending and legislating changes to the architecture of SIS networks and their interactions with other systems.

These new design requirements may mitigate the exposure of safety systems to APTs, but they also introduce design and integration challenges. These challenges will require significant innovation and thought leadership from those asset owners, automation suppliers, and integrators willing to proactively address them. In addition, regulatory mandates are driving the accelerated adoption of some form of partial or full SIS isolation, leaving many asset owners scrambling to state these requirements clearly in procurement language and to incorporate them into ongoing projects. This article will define the threat scenarios and industry response options, and provide an approach for addressing requirements.

History of SIS cybersecurity efforts

Safety instrumented systems shut down an industrial process to a safe state in the event of unreliable system or process functionality. This definition, although simplistic, is concise and makes it explicit to the nonindustrial observer that the term “safety” is the primary emphasis.

Asset owners and system integrators employ various design approaches to connect their plant’s distributed control systems (DCSs) with the SIS. The traditional approach relies on the principles of segregation for both communication infrastructures and control strategies. The past decade has seen a trend toward integrating DCS and SIS designs for various reasons, including lower cost, ease of use, and benefits achieved from exchanging information between the DCS and SIS.

Until the 1980s, the codes of practice for designing and using trip and alarm systems were set down by major chemical and petrochemical companies. These codes established most of the ground rules used today. Over the past three decades, the International Electrotechnical Commission (IEC) and ISA can be credited with providing global leadership around the issues facing SIS by releasing standards. The current ones are ISA/IEC 61511-2018 and the technical report ISA-TR84.00.09-2017.

Evolving ICS-targeted cyberattacks

The journey to SIS isolation began in 1998 when the newly published ISA/IEC 61511 standard recommended the separation of management systems (figure 1). As is typical across most industry verticals, however, cyberincidents tend to drive regulatory mandates; so, it is interesting to note that the introduction of the majority of SIS cyber-related regulation followed the Triton attack in 2017. Such an incident-driven approach to improved security, although beneficial, could itself be improved. Stakeholders will need a more proactive and forthright approach to stay ahead of the ever-burgeoning industrial control system (ICS) cyberthreats.

A list of the ICS cybersecurity incidents that have affected and changed the industry is shown in table 1. In 2000, the Maroochy Water cyberattack caused the release of more than 265,000 gallons of untreated sewage. The Triton/Trisis/HatMan malware attack in 2017 involved the first publicly known malware, on a very short list of ICS-specific malware, designed to target safety instrumented systems. From an architecture perspective, and not focusing on the other security vulnerabilities, the Triton attackers took advantage of a lack of clearly defined standards mandating network boundary segmentation and enforcement. The industry has since focused on the “Trisis” attribution, since this was the first publicly disclosed SIS cyberattack.

Table 1. ICS Cyberincident timeline
Year	Type	Name	Description
2000	Attack	Maroochy Water	A cyberattack caused the release of more than 265,000 gallons of untreated sewage.
2010	Malware	Stuxnet	The world’s first publicly known digital weapon.
2010	Malware	Night Dragon	Attackers used sophisticated malware to target global oil, energy, and petrochemical companies.
2011	Malware	Duqu/Flame/Gauss	Advanced and complex malware used to target specific organizations, including ICS manufacturers.
2012	Campaign	Gas Pipeline Cyber Intrusion Campaign	ICS-CERT identified an active series of cyberintrusions targeting the natural gas pipeline sector.
2014	Attack	German Steel Mill	A steel mill in Germany experienced a cyberattack resulting in massive damage to the system.
2014	Malware	Black Energy	Malware that targeted human-machine interfaces in ICSs.
2014	Campaign	Dragonfly/Energetic Bear No. 1	Ongoing cyber-espionage campaign primarily targeting the energy sector.
2015	Attack	Ukraine Power Grid Attack No. 1	The first known successful cyberattack on a country’s power grid.
2016	Attack	Kemuri Water Company	Attackers gained access to hundreds of the programmable logic controllers (PLCs) used to manipulate control applications and altered water treatment chemicals.
2016	Attack	Ukraine Power Grid Attack No. 2	Cyberattackers tripped breakers in 30 substations, turning off electricity to 225,000 customers in a second attack.
2017	Malware	CRASHOVERRIDE	The malware used to cause the Ukraine power outage was finally identified.
2017	Attack	Triton/Trisis/HatMan	Industrial safety systems in the Middle East targeted by sophisticated malware.

Industry response to Triton

Vendors, asset owners, cybersecurity first responders, and legislators have all provided various guidelines, best practices, advisories, and directives to reduce the exposure to threat actors targeting safety instrumented systems. Some of these guidelines include:

Safety systems must always be deployed on isolated networks.
All engineering workstations should be secured and never be connected to any network other than the safety network.
All methods of mobile data exchange with the isolated safety network, such as CDs, USB drives, and DVDs, should be scanned before use in the engineering workstations, or in any node connected to this network.
Laptops and PCs should always be properly verified to be virus and malware free, before connecting to the safety network or any safety controller.
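
The guideline above on keeping engineering workstations off every network except the safety network lends itself to a simple automated check. The following is a minimal Python sketch, not a vendor tool: the subnet 10.10.50.0/24 and the function names are hypothetical, and the list of interface addresses is assumed to come from whatever asset inventory the site already maintains.

```python
from ipaddress import ip_address, ip_network

# Hypothetical safety-network subnet (an assumption for illustration);
# substitute the site's real addressing plan.
SAFETY_NETWORK = ip_network("10.10.50.0/24")


def addresses_outside_safety_network(interface_addresses, safety_network=SAFETY_NETWORK):
    """Return the interface addresses that do not belong to the safety network."""
    return [
        addr for addr in interface_addresses
        if ip_address(addr) not in safety_network
    ]


def is_dedicated_to_safety_network(interface_addresses, safety_network=SAFETY_NETWORK):
    """True only if the workstation has at least one address and all of them
    fall inside the safety network."""
    return bool(interface_addresses) and not addresses_outside_safety_network(
        interface_addresses, safety_network
    )
```

A workstation reporting both 10.10.50.21 and 192.168.1.7 would be flagged under these assumptions, because the second address sits outside the safety subnet. A check like this only covers IP addressing; it does not detect removable media or temporary connections.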

Most of the controls identified to reduce the likelihood of this type of attack are straightforward, except the segmentation of the safety instrumented system network. We have analyzed the various phrases used by the industry to represent this control objective, and found the following variations:

Safety systems must always be deployed on isolated networks (automation vendor).
Networks used for industrial control systems should always be segregated from enterprise and/or public networks (automation vendor).
Locate control and safety system networks and remote devices behind firewalls and isolate them from the business network (government security agency).
Isolate safety instrumental systems (national regulation).

Depending on the industry stakeholder, this control objective, and the varied wording used to express its intent, is open to different interpretations.

Security architecture risk

Asset owners who focus on breach statistics as the main driver to address the murky waters of SIS security architecture clearly demonstrate a high risk tolerance. Threat models, which identify and prioritize potential threats specific to plant operations, need to be developed with a focus on the SIS architecture. Typically, asset owners with a mature risk program may have already mapped out these models during the risk framing and assessment phases.

ICS cybersecurity breach headlines and regulatory mandates will not replace a mature and well-thought-out risk management program with verified capabilities.

Safety system architectures and challenges

Separation versus isolation? The term “separation” has been used in the automation industry to mean the restriction of management functionality from the process control network to the SIS network. Separation will not impact information flows, such as combining the sequence of events (SOE) for SIS and process control systems (PCSs) for rapid trip response, but isolation will. Comparing SIS and PCS instrumentation measurements during operations is difficult when the systems are isolated.

In this section we attempt to define the commonly represented SIS and PCS network connections currently found across various asset owner facilities. The architecture design categories include isolated, interfaced, integrated: restricted, and integrated: open. This representation is high level and attempts to show the various system interactions.

In the isolated architectural approach, the PCS and SIS are completely segregated from each other. There is no interaction between the two systems. Asset owners use dedicated human-machine interfaces (HMIs) for PCS and SIS. This is a truly isolated implementation, both physically and logically (figure 2).

Pros:

Simpler hardening and lockdown: Since the SIS is completely isolated from the PCS, it should be a lot simpler to harden and lock down the systems, without worry of dependencies or architectural complexity.
Less concern that changes or modifications on the PCS will have consequences for the SIS.
Confidence that the SIS will act predictably if the PCS is compromised.

Cons:

External media dependency: Users eventually will require external access to the system for tasks, e.g., extracting event records for sequence of event analysis, bypasses, overrides, proof test records, or performing configuration changes and applying security updates. USB drives, which are often used to implement these updates, are not easy to protect.
Proper system hardening mandates leave asset owners managing two separate sets of defense-in-depth architectures. This creates a high potential for more work hours, longer downtimes, and additional areas where oversights might leave holes in the protection layers.
Increased operational load: With full segregation, the console operator will now have to monitor/utilize a SIS-dedicated HMI, in addition to the PCS HMIs. This is more problematic when operators must divide their attention when it comes to alarm reaction.
Isolation promotes a tendency to ignore or downplay cybersecurity hygiene on the SIS over time.

In interfaced architecture, PCSs and SISs interface with each other using point-to-point interfaces, where individual PCS controllers connect to individual SIS controllers. Through these interfaces, the SIS sends information, such as trip events, pre-alarm triggers, bypasses, and SIS instrumentation values. It can also receive interlock reset requests and bypass requests (figure 3).
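
Because the interface described above carries a small, known set of message categories in each direction, it can be reasoned about as a default-deny allowlist. The following is a minimal Python sketch of that idea; the identifiers (Direction, ALLOWED_MESSAGES, is_permitted, and the message type strings) are illustrative assumptions, not part of any vendor protocol or standard.

```python
from enum import Enum


class Direction(Enum):
    SIS_TO_PCS = "sis_to_pcs"
    PCS_TO_SIS = "pcs_to_sis"


# Message categories taken from the description of the interfaced architecture;
# the string identifiers themselves are hypothetical.
ALLOWED_MESSAGES = {
    Direction.SIS_TO_PCS: frozenset(
        {"trip_event", "pre_alarm_trigger", "bypass_status", "instrument_value"}
    ),
    Direction.PCS_TO_SIS: frozenset(
        {"interlock_reset_request", "bypass_request"}
    ),
}


def is_permitted(direction, message_type):
    """Default deny: only listed message types may cross the interface."""
    return message_type in ALLOWED_MESSAGES.get(direction, frozenset())
```

Under these assumptions, a trip event flowing from the SIS to the PCS is permitted, while anything unlisted in the PCS-to-SIS direction (for example, a hypothetical "logic_download" message) is rejected.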

Pros:

Optimizing HMI utilization, where the PCS HMI will be used to view some SIS information. This includes generic SIS alarms and events, SIS instrumentation readings, and override commands as transferred from the PCS to the SIS.
Simplified troubleshooting during process events with unified views. Users do not need to check the SIS immediately, as summarized SIS events and instrument information are passed to the PCS.

Cons:

Increased engineering complexity and cost due to dependency on point-to-point connections. Careful design considerations need to be taken to ensure this connection is capable of handling the data exchange from performance, safety, and security perspectives.
Inability to get complete information from the SIS unless the SIS stations are accessed directly, as the point-to-point connection shares limited information by design. This forces technicians and engineers to access the SIS for diagnostic information and a detailed sequence of events, which may delay the identification of root causes.

In the integrated: restricted architecture, both the PCS and the SIS are fully integrated (figure 4). Measures are added to restrict and control access between the PCS and SIS.

Figure 4. Integrated: restricted architecture

Pros:

The PCS HMIs (and in some circumstances, the combined engineering stations) have full visibility to the SIS information, including the integrated/combined sequence of events and diagnostic data.
Implementing an integrated SIS/PCS architecture is less complicated and likely costs significantly less, as vendors typically provide unified development environments with built-in feature integration.

Cons:

Implemented network restriction measures between PCS and SIS may not be fully effective, depending on the PCS/SIS technology used. Some protocols used may be proprietary, “closed spec,” or encrypted, making deep packet inspection technologies harder to implement.
More effort is required to secure this architecture due to the increased potential attack surface with direct access to the SIS.
Such measures (e.g., firewalls) may also be compromised, introducing some exposure.

The integrated: open architecture approach has both the SIS and PCS fully integrated, with no segregation or restrictions (figure 5).

Pros:

Simple to manage because it is centralized.
It may be possible to use the same instrument resource management system to manage both SIS and PCS instruments.
The PCS HMIs (and in some circumstances, the combined engineering stations) have full visibility to the SIS information, including the integrated/combined sequence of events and diagnostic data.
Implementing an integrated SIS/PCS architecture is less complicated, and likely costs significantly less, as vendors typically provide unified development environments with built-in feature integration.

Cons:

It is challenging to provide adequate protection profiles, and there is significant exposure with major consequences.
The system is not in compliance with regulatory and organizational policies and standards, such as NCA ECC-1 2018 and IEC 61511-2017.
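
The four architecture categories described in this section can be told apart by three yes/no questions about the PCS-SIS connection. The following is a minimal Python sketch of that decision logic, intended as a discussion aid rather than a compliance test; the class and field names are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Architecture(Enum):
    ISOLATED = "isolated"
    INTERFACED = "interfaced"
    INTEGRATED_RESTRICTED = "integrated: restricted"
    INTEGRATED_OPEN = "integrated: open"


@dataclass(frozen=True)
class SisPcsLink:
    # Hypothetical attributes describing a site's PCS-SIS connection.
    connected: bool            # any physical or logical path between PCS and SIS
    point_to_point_only: bool  # only controller-to-controller interfaces exist
    access_restricted: bool    # measures (e.g., firewalls) control PCS-SIS access


def classify(link):
    """Map a PCS-SIS connection description to one of the four categories."""
    if not link.connected:
        return Architecture.ISOLATED
    if link.point_to_point_only:
        return Architecture.INTERFACED
    if link.access_restricted:
        return Architecture.INTEGRATED_RESTRICTED
    return Architecture.INTEGRATED_OPEN
```

For example, classify(SisPcsLink(connected=True, point_to_point_only=False, access_restricted=True)) yields the integrated: restricted category. Real plants may mix categories across units, so a site would likely need to apply such a classification per connection rather than once per facility.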

Industry challenges

Until the term “isolation” is explicitly defined and clearly understood, asset owners may misinterpret their compliance with regulatory mandates, with the potential for fines and, in extreme cases, a suspension of operational licenses. In addition, asset owners may have a false sense of security, assuming that their exposure to cyberthreats has been mitigated by a secure design architecture for their SIS, when in reality it is lacking.

Some might disagree with the conclusion that the term “isolation” is ambiguous and argue there is an agreed-upon common definition used by the automation industry. Systems can be physically or logically isolated and would meet the literal meaning of the word. Others have said that the “black channel” principle—which is the exchange of safety-related data and diagnostic information using the existing network connections—meets the “isolation” control objective.

We believe the automation industry, and especially asset owners, should consider the number and magnitude of cybersecurity incidents as a key decision driver, as there will not be a reduction in their occurrence. In addition, the development and conformance efforts of industry consortiums such as ISA, through the development of standards and focused reports, will play a central role.

Asset owners should place significant effort during the procurement and design stages on a secure architecture, especially considering the long life cycles of ICS components and systems. Whatever is built will likely stay unchanged for many years.

Reader Feedback

We want to hear from you! Please send us your comments and questions about this topic to InTechmagazine@isa.org.

About The Authors

Nidhal Jamal has 18 years of experience in OT, covering manufacturing operations management, MESs, control systems, and OT cybersecurity. He has participated in multiple cybersecurity initiatives comprising security architecture, incident response and security design, assessments, and reviews. He is currently the head of system control technical support at Petro Rabigh. He has a BE in computer engineering from Vanderbilt University and is a certified Global Industrial Cyber Security Professional.

Uduak J. Daniels has more than 20 years of experience, 15 of which are in cybersecurity, and is currently an ICS cybersecurity specialist with Saudi Aramco (SA). He has participated in various information and operational technology and infrastructure cybersecurity assessments, consultancy, designs, and deployments. He is an ISA member, vice chair of the SA ICS cybersecurity standards committee, and a technical member representative for SA at ISCI ISASecure. Daniels has a BS in computer science and is a Certified Information Systems Security Professional and Certified Information Security Manager.