In February, Toyota halted production in all 14 of its Japanese plants after a significant parts supplier fell victim to a cyber attack. As manufacturers continue their journey towards digital transformation, Christina Hoefer, VP, Global Industrial Enterprise, Forescout Technologies, explains how they can improve the security of their connected environments.
The target of the Toyota attack was Kojima Press Industry Co., which manufactures metal, plastic and electronic components for vehicles, but it indirectly impacted Toyota’s just-in-time production control system. To prevent the infection spreading to other network components, the car manufacturer made the decision to halt production, which resulted in a five percent drop in car production and significant financial losses for the company.
The attack also demonstrated the true impact of supply chain attacks on manufacturers. As connectivity in their operational environments grows and interdependency chains with suppliers become more embedded in their networks, devastating, production-halting cyber attacks are becoming a greater risk.
Hackers have discovered that by compromising production at key suppliers they can also shut down operations for those suppliers’ customers. The convergence of IT, Internet of Things (IoT) and operational technology (OT) systems, including industrial control systems (ICS), often plays a major role in supply chain attacks and, more commonly, in internal, non-malicious cyber risks.
Given this heightened risk, how can manufacturers improve the security of their connected environments?
The digitalisation of manufacturing
For decades, IT and OT were seen as separate entities within organisations. In keeping with practices first defined by the Purdue Enterprise Reference Architecture, the two were kept entirely air-gapped so that neither could affect the other. This separation kept OT networks more protected. Today, digital transformation efforts have merged the networks to improve efficiency, cut costs and improve safety for plant employees, but doing so has also raised the cyber stakes.
Digital transformation is underpinned by the convergence of OT and IT systems. Convergence doesn’t mean IT and OT systems and processes are collapsed into a single, flat system, but information is shared to allow them to interoperate. For manufacturers, the challenge is how to securely connect IT and OT systems that need to communicate, while preventing those that don’t from doing so. Oftentimes, unwanted communication links go unchecked and vulnerabilities hide in plain sight based on the assumption that OT and IT are separated when they are not. Such assumptions increase the chance that malware on one network may spread and impact other networks.
When thinking about manufacturing cybersecurity challenges, the issues most frequently faced include:
Because OT assets were never connected, they were not built with security or even integrity in mind. Adding security later can be exceptionally difficult because many assets cannot accommodate an agent. Some leading manufacturers are finally applying ‘secure by design’ principles to newer technology, but that is still the exception.
It’s not uncommon for IT organisations to refresh technology every few years as new hardware, operating systems and applications evolve. In contrast, OT systems are built for reliability; they remain relatively static and have long lifecycles. Some OT assets may not get a refresh for up to 30 years.
Many OT systems are built for continuous production and are intended never to go offline. As the Toyota example illustrates, even an hour of downtime can mean staggering revenue loss. Moreover, attempting a security patch usually causes more problems than it solves. Even if safe patches exist, there may be no maintenance window in which to shut down production, install them and restart. These systems also feature decades-old technology that lacks processing power, which makes it difficult to install tools such as endpoint protection.
Cyber attacks like the one that crippled Toyota make headlines, but daily issues like network or process misconfigurations, operational errors, resource usage spikes and other anomalies are ten times more likely to threaten productivity. Until it is investigated, an anomaly could indicate a process problem or a malicious attack. Either way, manufacturers must be able to detect intrusions, unwanted behaviour or equipment failure and respond quickly to avoid downtime.
The use of Internet of Things (IoT) devices in manufacturing environments is also exploding, for the same reasons as OT: to further reduce costs and deliver more value to customers. IoT devices are used to collect real-time data on production processes. This data flows into IT or even cloud services to enable better scheduling, forecasting and overall performance against metrics. They’re also used to manage facility systems such as building access control, HVAC, lighting and fire safety systems.
Despite their pivotal role, IoT device communications are often not tracked or monitored. Because it’s not clear what they communicate with, it can be difficult to maintain a secure perimeter. And like OT devices, IoT systems use simple operating systems and off-the-shelf software components. Their firmware is rarely updated, so vulnerabilities abound, making them an easy target for hackers.
Manufacturing sites can be huge, with several production plants on a campus or geographically dispersed over several regions and countries. Each one of those environments may rely on thousands of systems and devices from different generations, built by different vendors on different architectures. Maintaining an accurate asset inventory with pen and paper is no longer possible. You need automation to continuously identify and assess all connected assets, from decade-old process controllers to dormant IT systems and new IoT devices.
OT engineers, as opposed to IT security staff, often work with OT systems. Tensions may arise when stakeholders primarily concerned with safety and productivity must now balance operational and cybersecurity risk, especially if it means shutting down operations. Add to that the global shortage of skilled cybersecurity resources and the unclear ownership of IoT devices, which may fall through the cracks.
Cybersecurity best practices for manufacturers
When rolling out new digitalisation projects, organisations can prepare by following best practices such as the NIST Cybersecurity Framework, which outlines how to identify, protect against, detect, respond to and recover from threats. The following recommendations align with this framework and are based on more than a decade of industrial threat research and experience:
Complete security starts with an accurate inventory of all assets, where they are and what they’re communicating with. Detecting assets and their properties as soon as they connect helps engineers locate them in case of malfunction, misbehaviour or other cyber issues. The challenge is that the discovery approaches that work for IT and IoT might not work for sensitive OT devices given safety rules, vendor interoperability issues, industrial process requirements and other considerations. To avoid downtime or service disruption, OT devices require agentless techniques or non-intrusive network monitoring such as deep packet inspection (DPI). OT networks also include many IT assets, so hybrid techniques are necessary.
A combination of passive and active techniques can be used to discover devices and processes of various ages from various vendors. Tools that continuously monitor the network infrastructure can locate assets as soon as they connect without being intrusive. A system built specifically for OT/ICS networks must understand dozens of industrial protocols and be able to prioritise detected threats. Monitoring the process communications makes it possible to discover network misconfigurations and operational errors early on so OT engineers can diagnose behaviour and resolve issues more quickly.
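To make the idea of passive discovery concrete, here is a minimal Python sketch that folds observed traffic into an asset inventory. It assumes some monitoring tool already supplies observations as simple tuples; the tuple layout, the `Asset` fields and the sample addresses are hypothetical and are not taken from any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """One inventory entry, keyed by MAC address (illustrative fields only)."""
    mac: str
    ip: str
    first_seen: int
    last_seen: int
    protocols: set = field(default_factory=set)
    peers: set = field(default_factory=set)


def update_inventory(inventory, observations):
    """Fold passively observed traffic into the inventory.

    Each observation is a hypothetical tuple:
    (timestamp, src_mac, src_ip, dst_ip, protocol).
    Returns the MAC addresses that were seen for the first time.
    """
    new_assets = []
    for ts, mac, ip, dst_ip, protocol in observations:
        asset = inventory.get(mac)
        if asset is None:
            asset = Asset(mac=mac, ip=ip, first_seen=ts, last_seen=ts)
            inventory[mac] = asset
            new_assets.append(mac)
        asset.ip = ip
        asset.last_seen = max(asset.last_seen, ts)
        asset.protocols.add(protocol)
        asset.peers.add(dst_ip)
    return new_assets


inventory = {}
observations = [
    (100, "00:1d:9c:aa:01:02", "10.0.5.20", "10.0.5.10", "modbus"),
    (105, "00:1d:9c:aa:01:02", "10.0.5.20", "10.0.5.11", "modbus"),
    (110, "b8:27:eb:10:20:30", "10.0.7.44", "203.0.113.9", "https"),
]
newly_seen = update_inventory(inventory, observations)
```

Because nothing is sent to the devices themselves, this kind of bookkeeping is non-intrusive. The list of newly seen assets and the recorded peers are the raw material for spotting unexpected devices and unwanted communication links.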
OT engineers must understand both the cybersecurity and operational risks of each asset. Operational risks include process criticality, device behaviour and age relative to its lifecycle, while security risks should consider vulnerabilities and internet connectivity, as well as proximity to potentially infected assets and use of weak security standards. As with discovery, there are several ways to non-intrusively determine the vulnerabilities of OT assets, while most traditional IT assets can be actively scanned.
Risk assessment should also be automated and continuous, comparing each asset against a database of OT/ICS-specific Indicators of Compromise (IOCs) and Common Vulnerabilities and Exposures (CVEs). As a rule, assume that no system is vulnerability free. With the increase in supply chain vulnerabilities, weaknesses are only going to become more difficult to track. Manufacturers should strengthen the security perimeter so that it allows only the necessary access, and monitor the connections that remain. Within the factories, they should enforce network segmentation to separate critical process systems from vulnerable devices and apply the principle of least-privilege access. This will also help build the foundation for a zero-trust architecture.
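One way to picture a combined operational and security risk score is the following Python sketch. The fields mirror the factors named above (process criticality, lifecycle age, vulnerabilities, internet connectivity, weak security standards), but the weights, the 0 to 10 scales and the multiplication of the two scores are illustrative assumptions, not an established scoring standard.

```python
from dataclasses import dataclass


@dataclass
class AssetProfile:
    name: str
    criticality: int          # 0 (non-essential) to 4 (stops production)
    past_end_of_life: bool
    open_cves: int
    internet_reachable: bool
    weak_protocols: bool


def operational_score(a):
    """Operational risk on a 0-10 scale (assumed weights)."""
    return min(10, 2 * a.criticality + (2 if a.past_end_of_life else 0))


def security_score(a):
    """Security risk on a 0-10 scale (assumed weights, CVE count capped at 3)."""
    return min(10, 2 * min(a.open_cves, 3)
               + (2 if a.internet_reachable else 0)
               + (2 if a.weak_protocols else 0))


def combined_risk(a):
    """Impact times likelihood, giving a 0-100 score."""
    return operational_score(a) * security_score(a)


def rank_by_risk(assets):
    """Highest combined risk first."""
    return sorted(assets, key=combined_risk, reverse=True)


assets = [
    AssetProfile("HMI workstation", 3, True, 1, False, False),
    AssetProfile("dock camera", 1, False, 3, True, True),
    AssetProfile("press-line PLC", 4, True, 2, False, True),
]
ranking = [a.name for a in rank_by_risk(assets)]
```

With these made-up inputs the PLC ranks first even though the camera has more known vulnerabilities, because its operational impact is far higher. Rerunning the ranking whenever the inventory or the vulnerability data changes is what makes the assessment continuous.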
To avoid costly downtime, threats to operational continuity must be detected and investigated as early as possible. Asset discovery and risk assessment produce a flood of information about potential threats and vulnerabilities, not all of it urgent. To cut through the noise, OT engineers and security teams need a monitoring and detection system that prioritises critical alerts based on both operational and cybersecurity risk and potential impact, with drill-down into details that help them make informed decisions about how to respond. Your security operations centre (SOC) should be able to handle security-related events in OT and IoT environments and divert operational events to a process automation team. To avoid overloading the OT team with too many escalations, the teams can define a handful of cases to start with and increase them over time.
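The triage described above can be sketched in a few lines of Python. The alert fields, the queue names and the contents of `ESCALATION_CASES` are hypothetical; in practice the agreed escalation cases would be defined jointly by the SOC and the OT team.

```python
# Hypothetical starting set of cases that justify escalating to OT engineers.
ESCALATION_CASES = {"unauthorised_plc_write", "new_remote_access"}


def route(alert):
    """Pick a destination queue for one alert."""
    if alert["category"] == "operational":
        return "process_automation"
    if alert["type"] in ESCALATION_CASES:
        return "ot_escalation"
    return "soc_queue"


def triage(alerts):
    """Order alerts by severity x asset criticality, then route each one."""
    ranked = sorted(
        alerts,
        key=lambda a: a["severity"] * a["criticality"],
        reverse=True,
    )
    return [(a["id"], route(a)) for a in ranked]


alerts = [
    {"id": "A1", "category": "security", "type": "port_scan",
     "severity": 2, "criticality": 2},
    {"id": "A2", "category": "security", "type": "unauthorised_plc_write",
     "severity": 5, "criticality": 5},
    {"id": "A3", "category": "operational", "type": "device_restart",
     "severity": 3, "criticality": 4},
]
queue = triage(alerts)
```

Growing the escalation set over time then only means adding entries to `ESCALATION_CASES`, which keeps the number of interruptions for OT engineers under explicit control.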
Any risks and vulnerabilities identified above must be mitigated and, ideally, remediated, using the right technique based on all available information. While in IT the common approach is to patch, this may not be possible for OT. In manufacturing environments response actions range from automated initiation of remediation activities, such as creating a service ticket for an engineer to check a malfunctioning device or to tighten a firewall rule, to more drastic measures, such as access control and segmentation.
Vulnerable and critical systems, including unsupported legacy systems, should be segmented from the rest of the operations, and logical segments should be implemented where possible. For example, a security camera doesn’t need to connect to the process control server or data historian, and a robot arm doesn’t need direct internet access.
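The camera and robot arm examples amount to a default-deny allowlist between network zones. The Python sketch below illustrates that logic only; the zone names, the asset-to-zone mapping and the permitted flows are assumptions for illustration, and real enforcement would live in firewalls or switches rather than application code.

```python
# Hypothetical mapping of assets to logical network zones.
ZONE_OF = {
    "security_camera": "facility_iot",
    "video_recorder": "facility_iot",
    "robot_arm": "production_cell",
    "process_control_server": "process_control",
    "data_historian": "process_control",
    "internet": "internet",
}

# Only these zone-to-zone flows are permitted; everything else is denied.
ALLOWED_FLOWS = {
    ("facility_iot", "facility_iot"),
    ("production_cell", "process_control"),
    ("process_control", "production_cell"),
    ("process_control", "process_control"),
}


def is_allowed(src, dst):
    """Default deny: unknown assets and unlisted zone pairs are blocked."""
    src_zone = ZONE_OF.get(src)
    dst_zone = ZONE_OF.get(dst)
    if src_zone is None or dst_zone is None:
        return False
    return (src_zone, dst_zone) in ALLOWED_FLOWS


camera_to_historian = is_allowed("security_camera", "data_historian")
robot_to_internet = is_allowed("robot_arm", "internet")
robot_to_controller = is_allowed("robot_arm", "process_control_server")
```

Listing what is permitted, rather than what is forbidden, means a newly connected or unclassified device gets no access until someone assigns it to a zone.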
While automated mitigation and remediation can free scarce resources to focus on other priorities, it may not always be desirable in manufacturing environments. Enforcement policies can specify what actions should be taken manually, with human assistance or automatically based on all available information. A zero-trust policy engine can enforce flexible mitigation actions on networks and endpoints, from modest (apply updates) to stringent (quarantine device). This strategy will help protect vulnerable, high-risk and compromised devices while keeping mission-critical assets online. When proven safe and effective, manually initiated actions can be automated to reduce mean time to respond (MTTR) where it makes sense.
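A table-driven policy is one simple way to express which actions run manually, with human assistance or automatically. The following Python sketch is illustrative only: the device states, action names and mode assignments are assumptions, and it does not represent any vendor's policy engine.

```python
from enum import Enum


class Mode(Enum):
    MANUAL = 1      # an engineer initiates the action
    ASSISTED = 2    # proposed automatically, approved by a human
    AUTOMATIC = 3   # executed without human intervention


# Keys are (device state, mission critical?); values are (action, mode).
POLICY = {
    ("vulnerable", False): ("apply_updates", Mode.AUTOMATIC),
    ("vulnerable", True): ("schedule_update_in_maintenance_window", Mode.MANUAL),
    ("high_risk", False): ("restrict_network_access", Mode.ASSISTED),
    ("high_risk", True): ("restrict_network_access", Mode.MANUAL),
    ("compromised", False): ("quarantine_device", Mode.AUTOMATIC),
    ("compromised", True): ("limit_to_process_traffic", Mode.ASSISTED),
}

DEFAULT = ("open_service_ticket", Mode.MANUAL)


def decide(policy, state, mission_critical):
    """Look up the action and execution mode; fall back to a manual ticket."""
    return policy.get((state, mission_critical), DEFAULT)


def promote(policy, state, mission_critical):
    """Move a proven action one step closer to full automation."""
    action, mode = policy[(state, mission_critical)]
    next_mode = Mode(min(mode.value + 1, Mode.AUTOMATIC.value))
    policy[(state, mission_critical)] = (action, next_mode)
```

Calling `decide(POLICY, "compromised", False)` returns the quarantine action in automatic mode, while the same state on a mission-critical asset yields a gentler action that waits for human approval. `promote` reflects the point about automating manually initiated actions once they have proven safe and effective.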
Security policies, from assessment and alerts to mitigation actions, should naturally involve communication between IT and OT teams. For example, how can the SOC security analyst inform the right OT engineer at the site? Many actions can be initiated automatically without risk to OT systems, such as tightening firewall rules that don’t touch process control communication and assessing the security posture of contractor laptops before granting access to a production network. In reality, policies often result from actual breaches. Suppose malware on the contractor’s laptop infected the network. How did you recover from the incident and restore operations, and what could you have done to prevent it from happening? Make sure to document incidents and determine better ways to protect, detect and respond, so you can recover more quickly if a similar incident occurs.
As manufacturing plants become increasingly connected, the importance of cybersecurity has never been greater. Manufacturers must gain a closer understanding of exactly what is on their network and how each device is interlinked, and then take steps to secure all assets. Digital transformation offers significant benefits to manufacturers, but it also means it is no longer possible to keep critical OT disconnected. To reap the benefits of this digitalisation, security needs to run in tandem with innovation.