Enhancing Security Monitoring and Logging for Operational Technology in Data Centers (Part 2)

Introduction
In data centers (DC), security monitoring and logging play a critical role in maintaining the operational integrity of the facility. This part of the series delves into the role of Information Technology (IT) and Operational Technology (OT) in supporting data center activities, highlighting the importance of monitoring for effective DC operations. Additionally, we explore the Purdue Enterprise Reference Architecture, a model that provides a structured approach to securing industrial control systems (ICS) used in data centers.
I. The Role of IT and OT in Data Center Operations
A. Intersection of IT and OT in Data Centers
Data centers operate at the convergence of network, compute, and storage systems, all of which are supported by a complex array of industrial control systems (ICS). These systems bridge the gap between the physical and digital realms, enabling the seamless operation of data centers.
B. Increasing Threat Vectors
As the number of ICS components in data centers grows, so do the potential threat vectors. Unlike traditional IT environments, where the primary concern is data confidentiality, the main risk in OT environments is the loss of service availability. This makes operational metrics a critical input to security detection and monitoring systems.
II. Importance of Monitoring in Data Center Operations
A. Criticality of Availability
In the context of industrial control systems and building management systems, availability is paramount. Any disruption can lead to significant operational failures, making continuous monitoring essential.
B. Influence on Security Systems
Operational metrics from OT systems can directly enhance security detection and monitoring systems. By understanding the operational status of critical components, security systems can be more effectively tuned to detect and respond to threats.
III. The Purdue Model: A Framework for Securing Data Center ICS

Overview of the Purdue Enterprise Reference Architecture
The Purdue Model is a six-layer framework designed to secure ICS environments by defining different levels of systems and their functions. It is comparable to the OSI model used in networking systems, but tailored for industrial environments.
Breakdown of the Purdue Model
Level 4/5 – Enterprise Layer
This layer includes traditional IT systems like email, Enterprise Resource Planning (ERP), and other business-specific applications. Here, confidentiality and integrity of information are the primary security concerns.
Level 3.5 – Demilitarized Zone (DMZ)
The DMZ serves as a buffer between the IT and OT networks, reducing the attack surface. It hosts remote access services, external connections, and patch management systems, ensuring that only validated resources connect to the ICS network.
Level 3 – Facilities/Process Control Network
This layer supports the ICS infrastructure, including Active Directory services, data historians, and network infrastructure. It is crucial for data center operations and forms the backbone of the control network.
Level 2 – Supervisory Control
Level 2 handles real-time monitoring, operations supervision, and control. It includes engineering workstations, Human-Machine Interfaces (HMI), and systems like the Building Management System (BMS) and Electrical Power Management System (EPMS).
Level 1/0 – Intelligent Devices & Physical Processes
The lowest levels of the Purdue Model encompass sensors, actuators, Programmable Logic Controllers (PLCs), and other devices that directly control physical processes within the data center.
IV. Common OT Systems in Data Centers
A. Mechanical Systems
Mechanical infrastructure includes the cooling stack, water storage, and treatment systems, which are vital for maintaining the temperature and environment within the data center. Key components include cooling towers, chillers, pumps, and air handling units.
B. Electrical Systems
Electrical infrastructure involves power distribution and monitoring systems, including substations, transformers, power distribution centers, and backup power sources. These systems are essential for ensuring a stable and reliable power supply.
C. Other Building Support Systems
These systems include life safety, access control, and other building support mechanisms, such as fire panels, smoke detectors, leak detection, and fire smoke dampers. They are critical for ensuring the safety and security of the data center environment.
Defining Critical Assets in Data Centers
Importance of Speed in Design and Operations
Data center designers and operators must move swiftly to meet capacity demands. This requires a robust risk management strategy to address cybersecurity challenges and secure the necessary capacity.
System Classification and Risk Management
Classifying systems within the Purdue Model framework allows for a structured approach to aligning security controls and prioritizing risk management activities. Systems lower in the model (Levels 0-2) have the greatest impact on human safety and operational availability.
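The classification described above can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method: the level numbers follow the breakdown in this article (Level 1/0 is collapsed to 1, Level 4/5 to 4), while the `Asset` class, the helper names, and the example assets are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    """A hypothetical inventory entry tagged with its Purdue level."""
    name: str
    level: float  # 1 = Level 1/0, 2, 3, 3.5 (DMZ), 4 = Level 4/5


def is_safety_critical(asset: Asset) -> bool:
    """Levels 0-2 have the greatest impact on safety and availability."""
    return asset.level <= 2


def review_order(assets: list[Asset]) -> list[Asset]:
    """Order assets so the lowest Purdue levels are reviewed first."""
    return sorted(assets, key=lambda a: (a.level, a.name))


if __name__ == "__main__":
    # Example assets are illustrative only.
    inventory = [
        Asset("ERP server", 4),
        Asset("Chiller PLC", 1),
        Asset("BMS workstation", 2),
        Asset("Patch management server", 3.5),
    ]
    for asset in review_order(inventory):
        flag = "safety-critical" if is_safety_critical(asset) else "standard"
        print(f"Level {asset.level}: {asset.name} ({flag})")
```

Sorting by level is only one possible prioritization; a real program would likely weigh other factors, such as the domain size discussed in the next section.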
Domain Classification for IT Operations
Data center operations can be further classified by the physical and logical domain sizes influenced by the control systems. This classification helps prioritize cybersecurity efforts based on the potential impact of damage or outages, from regional levels down to individual machines.
Event and Monitoring Requirements for Data Center Equipment
Devices responsible for monitoring, protecting, and controlling equipment in a data center are often in operation for decades. As cybersecurity threats continue to evolve, it is essential that these devices provide information that supports the timely investigation and resolution of security incidents.
The following tables outline the types of information these systems should provide to enhance our ability to monitor their security posture and supply data that drives effective analytics. These tables are organized into three key areas and should serve as initial guidance rather than a comprehensive set of recommendations:
1. Security-Relevant Features
This area highlights capabilities that enhance security measures and are crucial for detecting, preventing, and mitigating cyber threats. Examples include:
- Real-Time Alerts: Automated notifications for unauthorized access or suspicious activity.
- Access Logs: Detailed records of system access for security audits.
- Encryption Protocols: Data encryption in transit and at rest to protect sensitive information.
Security Feature | Purpose | Notes |
Authentication (unique ID) and authorization (ACL) | Unique identifier for access logs, and restricted access to resources | Not all devices/protocols have authentication. ● A Modbus device, for example, will reply to any request. |
Certificates, encryption keys | Trusted users | TLS certificates as an example. ● Have they expired, or when do they expire? ● When do they need to be renewed? ● Log a message or warning when they are about to expire. |
Heartbeat | Signal uninterrupted asset identity | Capture loss of connectivity to client. ● Signal if unresponsive to a heartbeat message. |
Default credentials alerts | Provide a mechanism to detect whether default credentials have been changed | Credential management: ● Passwords are not preferred. ● A device that will not do anything until default passwords or trust lists are changed. |
Logs | Provide history of access and actions on assets | Separate dedicated log for security events. ● Access logs ● Maintenance logs (changes to system or user configuration) ● Centrally managed security log (protections to detect alterations) |
Secure communication protocols | Mitigate man-in-the-middle vulnerabilities | On startup, the device may report the protocols it is configured to use, and which ones are enabled or disabled. |
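Two rows of the table above, certificate expiry and heartbeat, lend themselves to a short sketch. The following Python is an illustration under stated assumptions: the function names, the 30-day warning window, the 10-second heartbeat interval, and the tolerance of three missed intervals are all hypothetical values chosen for the example, not recommendations from the source.

```python
from datetime import datetime, timedelta, timezone


def certificate_status(not_after: datetime, now: datetime,
                       warn_window: timedelta = timedelta(days=30)) -> str:
    """Classify a certificate by its expiry date (assumed 30-day warning)."""
    if now >= not_after:
        return "expired"
    if not_after - now <= warn_window:
        return "expiring_soon"
    return "ok"


def missed_heartbeat(last_seen: datetime, now: datetime,
                     interval: timedelta = timedelta(seconds=10),
                     tolerance: int = 3) -> bool:
    """Return True when more than `tolerance` intervals passed silently."""
    return now - last_seen > interval * tolerance


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Hypothetical device state, as it might be read from an asset inventory.
    cert_expiry = now + timedelta(days=12)
    last_heartbeat = now - timedelta(seconds=45)
    print("certificate:", certificate_status(cert_expiry, now))
    print("heartbeat missed:", missed_heartbeat(last_heartbeat, now))
```

Passing `now` explicitly keeps both checks deterministic and easy to test; in practice the results would be written to the dedicated security log described in the table.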
2. Configuration Data
This area emphasizes the data assets should provide for investigations, especially after a security incident. Key aspects include:
- Firmware and Software Versions: Identifying outdated or vulnerable components.
- Network Configurations: Tracing unauthorized access routes.
- User Permissions: Ensuring only authorized personnel access sensitive system areas.
In general, any configuration change should generate a risk signal.
Configuration Item | Purpose | Notes |
Acceptable system parameters (valid range) | Create a baseline of system-wide parameters once configured. Detect changes to this baseline. | A change to system parameters could be a security event. ● Baseline configurations should likely be stored upstream of assets, in an asset or configuration management system. ● Include settings at the control system level and device level. |
Firmware control | Verification (i.e., signed) that the firmware is of the correct version. Device specific. | Facilitate firmware management. ● Provide a mechanism to know if new firmware is available ● Verify firmware version ● Ideally, update devices in place without triggering an outage |
Network links (source and destination IPs, ports, and protocols) | Network monitoring | Identification of resources to communicate with. ● Mapping of source and destination identifiers, and inclusion in logs ● A change could be a security event. |
Device identifier (MAC and IP address) | Network identity | Name, type, IP and MAC address, serial number, certificate, category ● Example: “I am an HVAC controller or power meter” |
Device classification | Verify the correct asset (or asset class) is physically/logically connected to the correct location. | Similar to device identifier. ● Assist with detection of unauthorized assets connected to various locations within the system hierarchy |
Session state | Track connection details | Provides a mechanism to know whether a device is still alive (or disconnected, reconnected, or replaced) |
Sensor/device units (temperature, voltage) | Units associated with a measurement, and often a scaling factor as well | Useful to know if a unit of measurement or scaling factor ever changes. Distinguishing process-level configuration from system-level configuration is a challenge. |
Application in controller | Programming for specific applications. Potentially monitoring for changes in code. | Like firmware control. Track application version to help triage known security alerts. |
Documentation that describes all possible security events for a given device | List of possible events and what each one means, like an error code lookup | Exposing these in documentation is a potential risk. |
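The idea of storing a baseline and treating any change as a risk signal can be sketched as a simple comparison. The Python below is a minimal example under assumptions: configurations are modeled as flat dictionaries, and the key names (`firmware_version`, `modbus_enabled`, `telnet_enabled`) and the `diff_config` helper are hypothetical.

```python
# Marker used when a setting exists on one side only (assumed convention).
MISSING = "<absent>"


def diff_config(baseline: dict, current: dict) -> list[dict]:
    """List every setting that differs between baseline and current."""
    changes = []
    for key in sorted(baseline.keys() | current.keys()):
        old = baseline.get(key, MISSING)
        new = current.get(key, MISSING)
        if old != new:
            changes.append({"item": key, "baseline": old, "current": new})
    return changes


if __name__ == "__main__":
    # Hypothetical baseline captured after commissioning.
    baseline = {"firmware_version": "1.2.0", "modbus_enabled": False}
    # Hypothetical configuration read back from the device later.
    current = {
        "firmware_version": "1.2.0",
        "modbus_enabled": True,
        "telnet_enabled": True,
    }
    for change in diff_config(baseline, current):
        # Each difference is a candidate risk signal for the security log.
        print(f"config change: {change}")
```

Keeping the baseline upstream of the device, as the table suggests, means the comparison still works if the device itself has been tampered with.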
3. Operational Data
Operational data, though not directly tied to security, is crucial for detecting anomalies that may signal security issues and for system maintenance. Examples include:
- System Uptime: Monitoring uptime and downtime to spot irregular patterns suggesting security breaches.
- Performance Metrics: Tracking CPU and memory usage to identify unusual behavior.
- Event Logs: Keeping detailed logs of system events to correlate with security incidents.
Item | Purpose | Notes |
Sensor/device valid range (temperature, voltage, network traffic) | Provides the ability to detect an operational anomaly as a potential signal of an attack | Operational anomalies would be used as a proxy for a potential security event. |
Sensor/device functional range (temperature, voltage, network traffic) | Provides the ability to detect an operational anomaly as a potential signal of an attack | Like a valid range, except with additional customer context to restrict the operational range to the asset's function. For example, a temperature sensor may have a valid range of 0-255 °C but a functional range of 15-65 °C. |
Exceeded thresholds/parameters | Alert/alarm on user-defined conditions for a given parameter | Built-in alerts/alarms tied to an asset's functional range keep the complexity of thresholding within the asset. |
System/application alerts | Flag different conditions of the system/application with different priorities. | System/application alerts may combine knowledge from various sensors across multiple assets to flag system conditions. |
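The distinction between a valid range and a functional range can be illustrated with the temperature example from the table. This is a sketch only: the `RangeSpec` class, the `classify_reading` function, and the returned labels are hypothetical names, and the numbers are the 0-255 °C and 15-65 °C figures quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RangeSpec:
    """Valid range (what the sensor can report) and functional range
    (what is expected for this asset's role)."""
    valid_min: float
    valid_max: float
    functional_min: float
    functional_max: float


def classify_reading(value: float, spec: RangeSpec) -> str:
    """Classify a reading against both ranges, most severe case first."""
    if not (spec.valid_min <= value <= spec.valid_max):
        return "invalid"  # outside what the sensor can physically report
    if not (spec.functional_min <= value <= spec.functional_max):
        return "out_of_functional_range"  # plausible but unexpected here
    return "normal"


if __name__ == "__main__":
    # Temperature sensor from the table: valid 0-255 C, functional 15-65 C.
    temperature = RangeSpec(0, 255, 15, 65)
    for reading in (40, 80, 300):
        print(reading, classify_reading(reading, temperature))
```

A reading outside the functional range is an operational anomaly worth correlating with security events, while a reading outside the valid range may point to a faulty or spoofed sensor.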
Conclusion
Enhancing security monitoring and logging for Operational Technology (OT) in data centers is not just a best practice but a critical necessity for maintaining both operational availability and safety. As data centers become increasingly integral to the functioning of modern society, the importance of robust security measures cannot be overstated. By leveraging established frameworks like the Purdue Model, which provides a structured approach to segmenting and securing different layers of industrial control systems, data center operators can gain a comprehensive understanding of their network architecture and the role of critical systems within it.
Understanding the intricacies of these critical systems allows operators