
Safety System/Emergency Shutdown System (ESD)

The Need for Safety Instrumentation


Managing and equipping industrial plant with the right components and sub-systems for optimal operational efficiency and safety is a complex task. Safety Systems Engineering (SSE) is a disciplined, systematic approach that encompasses hazard identification, safety requirements specification, safety systems design and build, and systems operation and maintenance over the entire lifetime of the plant. These activities form what has become known as the safety life-cycle model, which is at the core of current and emerging safety-related system standards.

Risk and Risk Reduction Methods


Safety methods employed to protect against or mitigate harm or damage to personnel, plant and the environment, and to reduce risk, include: changing the process or engineering design; increasing the mechanical integrity of the system; improving the Basic Process Control System (BPCS); developing detailed training and operational procedures; increasing the frequency of testing of critical system components; using a Safety Instrumented System (SIS); installing mitigating equipment.

Other terms used for safety systems are:


Safety Instrumented System (SIS), Emergency Shutdown System (ESD), Safety Related System (SRS), or E/E/PE Safety Related System (E/E/PE = Electrical/Electronic/Programmable Electronic)

Objectives of a shutdown control system


1- Protection of life
2- Protection of plant equipment
3- Avoidance of environmental pollution
4- Maximizing plant production, i.e. avoiding unnecessary shutdowns

Safety, Reliability, and Availability


Safety
Safety means sufficient protection from danger. Safety-related controls are needed, for example, for trains, lifts, escalators and burners. Safe controls must be designed so that no component fault or other conceivable influence can cause a dangerous state in the plant.

The safe state


The safe state is the state into which a system can be put from its current operational state and which has a system-specific lower hazard potential than the operational state. The absolutely safe state is the one with the lowest amount of energy involved. Quite often it is not possible to reach the safe state without any danger simply by switching the device off (e.g. a plane). The plane in the air, taken as a system, has no safe state. Here the risk can only be reduced by redundant equipment (e.g. for propulsion and navigation systems).

Safety
Safety is measured primarily by a parameter called Average Probability of Failure on Demand (PFDavg). This indicates the chance that a SIS will not perform its preprogrammed action during a specified interval of time (usually the time between periodic inspections).

Reliability
Reliability is the ability of a technical device to fulfill its function during its operation time. This is often no longer possible if one component has a failure, so the MTBF (Mean Time Between Failures) is often taken as a measure of reliability. It can be calculated either statistically from systems in operation or from the failure rates of the components used. Reliability does not say anything about the safety of a system! Unreliable systems are safe if each individual failure puts the plant into the safe state.

Availability
Availability is the probability that a system is functioning. It is expressed in per cent and is determined by the mean operating time between two failures (MTBF) and the mean down time (MDT), according to the following formula:

Availability = MTBF / (MTBF + MDT) x 100 %

The mean down time (MDT) consists of the fault detection time and, in modular systems, the time it takes to replace defective modules. The availability of a system is greatly increased by a short fault detection time. Fast fault detection in modern electronic systems is obtained via automatic test routines and a detailed diagnostic display.
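The effect of a shorter fault detection time can be illustrated with a small calculation. This is only a sketch, assuming the usual steady-state relation Availability = MTBF / (MTBF + MDT); the function name and the example figures are hypothetical.

```python
def availability_percent(mtbf_hours: float, mdt_hours: float) -> float:
    """Availability in per cent, assuming MTBF / (MTBF + MDT) x 100."""
    if mtbf_hours <= 0 or mdt_hours < 0:
        raise ValueError("MTBF must be positive and MDT non-negative")
    return 100.0 * mtbf_hours / (mtbf_hours + mdt_hours)


# Hypothetical figures: same MTBF, slow versus fast fault detection and repair.
slow_repair = availability_percent(mtbf_hours=10_000, mdt_hours=24)
fast_repair = availability_percent(mtbf_hours=10_000, mdt_hours=2)
print(f"MDT 24 h: {slow_repair:.3f} %   MDT 2 h: {fast_repair:.3f} %")
```

With the MTBF unchanged, the shorter down time gives the higher availability, which is the point made above about fast fault detection.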

The availability can be increased through redundancy, e.g. central devices working in parallel, redundant I/O modules, or multiple sensors on the same measuring point. The redundant components are arranged so that the function of the system is not affected by the failure of one component. Here as well, a detailed diagnostic display is an important element of availability.

Measures designed to increase availability have no effect on safety. The safety of redundant systems is, however, only guaranteed if there are automatic test routines during operation or if, for example, non-safety-related sensor circuits in a 2oo3 arrangement are regularly checked. If one component fails, it must be possible to switch off the defective part in a safe way.

A related measure is called Safety Availability. It is defined as the probability that a SIS will perform its preprogrammed action when the process is operating. It can be calculated as follows:

Safety Availability = 1 - PFDavg


Another parameter is called the Risk Reduction Factor (RRF). It represents the ratio of risk without a SIS divided by the risk with a SIS. It can be calculated as follows:

RRF = 1/PFDavg
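Both measures follow directly from PFDavg, as the short sketch below shows. The function names and the example value of PFDavg are illustrative assumptions, not part of the original text.

```python
def safety_availability(pfd_avg: float) -> float:
    """Safety Availability = 1 - PFDavg."""
    return 1.0 - pfd_avg


def risk_reduction_factor(pfd_avg: float) -> float:
    """RRF = 1 / PFDavg."""
    if not 0.0 < pfd_avg <= 1.0:
        raise ValueError("PFDavg must lie in (0, 1]")
    return 1.0 / pfd_avg


# Hypothetical SIS with an average probability of failure on demand of 0.01.
example_pfd = 0.01
print(safety_availability(example_pfd), risk_reduction_factor(example_pfd))
```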

What is hazard and what is risk?


A hazard is an inherent physical or chemical characteristic that has the potential for causing harm to people, property, or the environment. In chemical processes, it is the combination of a hazardous material, an operating environment, and certain unplanned events that could result in an accident.

Hazards Analysis
Generally, the first step in determining the levels of protective layers required involves conducting a detailed hazard and risk analysis. In the process industries a Process Hazards Analysis (PHA) is generally undertaken, which may range from a screening analysis through to a complex Hazard and Operability (HAZOP) study, depending on the complexity of operations and severity of the risks involved. The latter involves a rigorous detailed process examination by a multidisciplinary team comprising process, instrument, electrical and mechanical engineers, as well as safety specialists and management representatives.

Risk
Risk is usually defined as the combination of the severity and probability of an event. In other words, how often can it happen and how bad is it when it does happen? Risk can be evaluated qualitatively or quantitatively. Roughly, risk = consequence x frequency.
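As a rough numerical illustration, the sketch below treats risk as consequence times frequency and takes the frequency of the hazardous event as the demand rate times the probability that the safety function fails on demand. The monetary consequence, the demand rate and the PFD are hypothetical values chosen only for the example.

```python
def hazard_frequency(demand_rate_per_year: float, pfd: float) -> float:
    """Frequency of the hazardous event = demand rate x probability of failure on demand."""
    return demand_rate_per_year * pfd


def risk(consequence: float, frequency_per_year: float) -> float:
    """Risk = consequence x frequency (here in cost units per year)."""
    return consequence * frequency_per_year


# Hypothetical case: a demand every two years, consequence of 1,000,000 cost units.
unprotected = risk(1_000_000, 0.5)
protected = risk(1_000_000, hazard_frequency(0.5, pfd=0.01))
print(unprotected, protected)
```

In this example the safety function lowers the risk by a factor of 100, which is simply 1/PFD.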

Risk reduction
Risk reduction can be achieved by reducing either the frequency of a hazardous event or its consequences, or by reducing both of them. Generally, the most desirable approach is to reduce the frequency first, since all events are likely to have cost implications, even without dire consequences. Safety systems are all about risk reduction. If we cannot take away the hazard, we have to reduce the risk. This means: reduce the frequency and/or reduce the consequence.

The basic definitions of the safety-related terminology will be studied in this course. There are three main examples of the required safety actions, as follows:

Emergency Shutdown (ESD)


Typical actions from ESD systems are: Shutdown of part systems and equipment; Isolate hydrocarbon inventories; Isolate electrical equipment; Prevent escalation of events; Stop hydrocarbon flow; Depressurize / Blow down; Emergency ventilation control; Close watertight doors and fire doors.

Process Shutdown (PSD)


A process shutdown is defined as the automatic isolation and de-activation of all or part of a process. During a PSD the process remains pressurized. Basically, PSD consists of field-mounted sensors, valves and trip relays, a system logic unit for processing of incoming signals, and alarm and HMI units. The system is able to process all input signals and activate outputs in accordance with the applicable Cause and Effect charts. Typical actions from PSD systems are: Shutdown the whole process; Shutdown parts of the process; Depressurize / Blowdown parts of the process.

Fire and Gas Control (F&G)


This is denoted as the Fire Detection and Protection system (FDP) in some other definitions. FDP provides early and reliable detection of fire or gas, wherever such events are likely to occur, alerts personnel and initiates protective actions automatically or manually upon operator activation. Basically the system consists of field-mounted detection equipment and manual alarm stations, a system logic unit for processing of incoming signals, and alarm and HMI units. The system shall be able to process all input signals in accordance with the applicable Fire Protection Data Sheets or Cause & Effect charts. Depending on the risk analysis, FDP SIL requirements are typically SIL 2 or SIL 1, or the system is defined as having no SIL requirement.

Typical actions from FDP systems are:


Alert personnel; Release fire fighting systems; Emergency ventilation control; Stop flow of minor hydrocarbon sources such as diesel distribution to consumers; Isolate local electrical equipment (may be done by ESD); Initiating ESD and PSD actions; Isolate electrical equipment; Close watertight doors and fire doors.

Emergency Shutdown (ESD)


The Emergency Shutdown System (ESD) shall minimize the consequences of emergency situations, typically related to uncontrolled flooding, escape of hydrocarbons, or outbreak of fire in hydrocarbon-carrying areas or areas which may otherwise be hazardous. Traditionally, risk analyses have concluded that the ESD system needs a high Safety Integrity Level, typically SIL 2 or 3. Basically the system consists of field-mounted sensors, valves and trip relays, system logic for processing of incoming signals, and alarm and HMI units. The system is able to process input signals and activate outputs in accordance with the Cause & Effect charts defined for the installation.
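The role of a Cause & Effect chart can be sketched as a lookup from active causes (tripped inputs) to the outputs that must be activated. The tag names and the chart contents below are entirely hypothetical; a real chart is defined per installation and implemented in a certified logic solver, not in a script like this.

```python
# Hypothetical Cause & Effect chart: each cause maps to the effects it demands.
CAUSE_AND_EFFECT = {
    "PAHH-101": {"close ESV-101", "stop pump P-101"},
    "GAS-DET-3": {"close ESV-101", "start emergency ventilation"},
}


def required_effects(active_causes):
    """Return the union of effects demanded by all active causes."""
    effects = set()
    for cause in active_causes:
        # A cause missing from the chart raises KeyError rather than being ignored.
        effects |= CAUSE_AND_EFFECT[cause]
    return effects


print(sorted(required_effects(["PAHH-101", "GAS-DET-3"])))
```

Raising an error for an unknown cause is a deliberate choice in this sketch: silently ignoring an input that is not in the chart would hide a configuration fault.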

Typical actions from ESD systems are:


Shutdown of part systems and equipment; Isolate hydrocarbon inventories; Isolate electrical equipment (*); Prevent escalation of events; Stop hydrocarbon flow; Depressurize / Blowdown; Emergency ventilation control (*); Close watertight doors and fire doors (*).

Process Shutdown (PSD)


The Process Shutdown system ensures rapid detection and safe handling of process upsets. Traditionally, risk analyses have concluded that the PSD system needs a low to medium Safety Integrity Level. The reason for the low to medium requirement is that PSD systems built in accordance with API RP 14C have requirements for both primary (the computerized system) and secondary (mechanical devices) protection. Basically the system consists of field-mounted sensors, valves and trip relays, a system logic unit for processing of incoming signals, and alarm and HMI units. The system is able to process all input signals and activate outputs in accordance with the applicable Cause & Effect charts.

Typical actions from PSD systems are:


Shutdown the whole process; Shutdown parts of the process; Depressurize / Blowdown parts of the process.

Fire / gas Detection and Protection (FDP)


Typical actions from FDP systems are: Alert personnel; Release fire fighting systems; Emergency ventilation control (*); Stop flow of minor hydrocarbon sources such as diesel distribution to consumers (*); Isolate local electrical equipment (may be done by ESD); Initiating ESD and PSD actions; Isolate electrical equipment (*); Close watertight doors and fire doors (*).

(*) May alternatively form a part of the Emergency Shutdown system.

Safety Process General Overview


Safety by definition is the absence of risk. There is risk in everything we do, so the safety process model is designed to effectively identify & reduce risk. This includes:
Physical plant risk; Human factor-related risk; Attitudinal Risk.

Sustained improvements in accident prevention can only come from changes to the overall mix of the above factors. The model defines workplace risk as a formula such that:
Risk = Employee Exposure x Probability of the Accident Sequence Taking Place x Potential Consequence of the Accident
Noting that
Risk = Consequence x Frequency
and
Frequency = Demand rate x Probability of failure of the safety function
we can define the Five-Step Safety Process Model as follows:

Five-Step Safety Process Model

Step 1: Identification of risks that are producing accidents and injuries.

Step 2: Perform accident / incident problem-solving on each identified risk. The process includes:
1. Definition of problem
2. Contributing factors
3. Root causes

Step 3: Develop a schedule for implementation of each preventive action. Preventive actions should all have:
1. Responsible party
2. Resources to support actions
3. Timetable for completion

Step 4: Continuously measure to ensure preventive actions are working as expected. Measure the timetable to ensure each action is enabled.

Step 5: Employees involved in the work environment must be given feedback on a continuous basis (i.e. positive reinforcement).

The process for managing risk


Risk Evaluation
There is no such thing as zero risk. This is because no physical item has a zero failure rate, no human being makes zero errors and no piece of software design can foresee every possibility.

Key Questions to Ask


A process control engineer implementing a Safety Instrumented System must answer several questions:
1. What level of risk is acceptable?
2. How many layers of protection are needed?
3. When is a Safety Instrumented System required?
4. Which architecture should be chosen?

Risk assessment
The measurement of risk

Quantitative scale:
Minor: Injury to one person involving less than 3 days absence from work
Major: Injury to one person involving more than 3 days absence from work
Fatal: Fatal consequences for one person
Catastrophic: Multiple fatalities and injuries

Qualitative scale:
Unlikely; Possible; Occasionally; Frequently; Regularly

Alternatively
One hazardous event occurring on average once every 10 years has an event frequency of 0.1 per year. A rate of 10^-4 events per year means that an average interval of 10 000 years can be expected between events.
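The conversion between event frequency and mean interval is just a reciprocal, as this small sketch shows (the function name is an illustrative choice):

```python
def mean_interval_years(events_per_year: float) -> float:
    """Average number of years between events for a given event frequency."""
    if events_per_year <= 0:
        raise ValueError("event frequency must be positive")
    return 1.0 / events_per_year


print(mean_interval_years(0.1))   # once every 10 years on average
print(mean_interval_years(1e-4))  # once every 10 000 years on average
```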

Another alternative is to use a semi-quantitative scale or band of frequencies to match up words to frequencies. For example:
Possible = Less than once in 30 years
Occasionally = More than once in 30 years but less than once in 3 years
Frequently = More than once in 3 years
Regularly = Several times per year

Once we have these types of scales agreed, the assessment of risk requires that for each hazard we are able to estimate both the likelihood and the consequence. For example:
Risk item no. 1: Major injury likely to occur Occasionally
Risk item no. 2: Minor injury likely to occur Frequently
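The frequency bands above could be applied mechanically along the following lines. This is a sketch only: the text does not say how many events per year count as "several", so the threshold of 2 per year is an assumption, as is the treatment of values that fall exactly on a band boundary.

```python
def frequency_band(events_per_year: float, regularly_from: float = 2.0) -> str:
    """Map an event frequency to one of the example bands.

    regularly_from is an assumed threshold for "several times per year".
    """
    if events_per_year >= regularly_from:
        return "Regularly"
    if events_per_year > 1 / 3:
        return "Frequently"      # more than once in 3 years
    if events_per_year > 1 / 30:
        return "Occasionally"    # between once in 30 years and once in 3 years
    return "Possible"            # less than once in 30 years


for rate in (0.01, 0.1, 1.0, 4.0):
    print(rate, frequency_band(rate))
```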

Risk matrix example 1

Risk matrix example 2

Scales of consequence

Risk classification of accidents

Concepts of Alarp and tolerable risk

The Alarp (as low as reasonably practicable) principle recognizes that there are three broad categories of risks:

Negligible risk: Broadly accepted by most people as they go about their everyday lives. These would include the risk of being struck by lightning or of having brake failure in a car.

Tolerable risk: We would rather not have the risk, but it is tolerable in view of the benefits obtained by accepting it. The cost in inconvenience or in money is balanced against the scale of risk, and a compromise is accepted.

Unacceptable risk: The risk level is so high that we are not prepared to tolerate it. The losses far outweigh any possible benefits in the situation.

Alarp diagram

Step 1
The estimated level of risk must first be reduced to below the maximum level of the Alarp region at all costs.

This assumes that the maximum acceptable risk line has been set as the maximum tolerable risk for the society or industry concerned. This line is hard to find, as we shall see in a moment.

Step 2
Further reduction of risk in the Alarp region requires cost benefit analysis to see if it is justified. This step is a bit easier and many companies define cost benefit formulae to support cost justification decisions on risk-reduction projects.
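One simple form such a cost benefit check could take is sketched below: the yearly benefit of a measure is the cost of the unwanted event multiplied by the reduction in its frequency, and the measure is justified when that benefit exceeds its yearly cost. This formulation, the function names and all figures are assumptions for illustration; companies define their own formulae.

```python
def annual_benefit(cost_per_event: float, freq_before: float, freq_after: float) -> float:
    """Expected yearly loss avoided by reducing the event frequency."""
    return cost_per_event * (freq_before - freq_after)


def is_justified(cost_per_event: float, freq_before: float,
                 freq_after: float, annual_cost_of_measure: float) -> bool:
    """True when the avoided loss per year exceeds the yearly cost of the measure."""
    return annual_benefit(cost_per_event, freq_before, freq_after) > annual_cost_of_measure


# Hypothetical event costing 2,000,000; frequency reduced from 0.01 to 0.001 per year.
print(is_justified(2_000_000, 0.01, 0.001, annual_cost_of_measure=10_000))
print(is_justified(2_000_000, 0.01, 0.001, annual_cost_of_measure=25_000))
```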

The principle is simple: if the cost of the unwanted scenario is more than the cost of improvement, the risk reduction measure is justified. The tolerable risk region remains the problem for us. How do we work out what is tolerable in terms of harm to people, property and environment?

Establishing tolerable risk criteria


Examples are:
Probable Loss of Life (PLL): Number of fatalities x frequency of event
Fatal accident rate (FAR): Number of fatalities per 10^8 h worked at the site where the hazard is present.
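Both criteria reduce to simple arithmetic, sketched below with hypothetical numbers. The function names are illustrative; PLL is taken as fatalities per event times events per year, and FAR as fatalities normalised to 10^8 hours worked.

```python
def probable_loss_of_life(fatalities_per_event: float, events_per_year: float) -> float:
    """PLL = number of fatalities x frequency of event (fatalities per year)."""
    return fatalities_per_event * events_per_year


def fatal_accident_rate(fatalities: float, hours_worked: float) -> float:
    """FAR = fatalities per 10^8 hours worked."""
    if hours_worked <= 0:
        raise ValueError("hours worked must be positive")
    return fatalities / hours_worked * 1e8


# Hypothetical event: 2 fatalities expected, once in 1000 years.
print(probable_loss_of_life(2, 1e-3))
# Hypothetical site record: 1 fatality in 50 million hours worked.
print(fatal_accident_rate(1, 5e7))
```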

Fatal accident rate

Tolerable risk conclusion


The indications are that many companies determine tolerable risk targets using consensus from the types of statistics we have been looking at. Marzal concluded that the range of PLL values in industry is still a wide one, from 10^-3 to 10^-6 for the upper level. We must also remember to allow for the effect of multiple hazard sources. It appears that financial cost benefit analysis often justifies greater risk reduction factors than the personal or environmental risk criteria. We shall revisit this issue when we come to safety integrity level (SIL) determination practices later in this course.

Practical exercise
Now is a good time to try practical Exercise No. 1, which is set out towards the back of the manual in module 12. This exercise demonstrates the calculation of individual risk and FAR, and uses these parameters to determine the minimum risk reduction requirements.

Hazard analysis techniques


In the European Standard EN 1050 Annex B there are descriptions of several techniques for hazard analysis. The notes there make an important distinction between two basic approaches. These are called deductive and inductive. This is how the standard describes them: In the deductive method the final event is assumed and the events that could cause this final event are then sought.

Summary of hazard-identification methods


Here is a summary of the hazard-identification methods. It is useful to have this list because many companies will have preferences for certain methods or will present situations that require a particular approach. We need to have a choice of tools for the job and to be aware of their pros and cons. It is also apparent that similar methods go by a variety of names. All guides agree that Hazop provides the most comprehensive and auditable method for identification of hazards in process plants, but that some types of equipment will be better served by the alternatives listed here.

Deductive method
A good example of a deductive method is fault tree analysis (FTA). The technique begins with a top event that would normally be a hazardous event. Then all combinations of individual failures or actions that can lead to the event are mapped out in a fault tree. This provides a valuable method of showing all possibilities in one diagram and allows the probability of the event to be estimated. Deductive methods are useful for identifying hazards at earlier stages of a design project, where major hazards such as fire or explosion can be tested for feasibility at each section of plant. It is like a cause and effect diagram where you start with the effect and search for causes.
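How a fault tree allows the probability of the top event to be estimated can be sketched with AND and OR gates. The sketch assumes the basic events are statistically independent, which real analyses must check; the tree and its probabilities are hypothetical.

```python
from math import prod


def and_gate(probabilities):
    """All inputs must occur: product of the probabilities (independence assumed)."""
    return prod(probabilities)


def or_gate(probabilities):
    """At least one input occurs: 1 - product of (1 - p) (independence assumed)."""
    return 1.0 - prod(1.0 - p for p in probabilities)


# Hypothetical tree: the top event occurs if both the duty pump and the standby
# pump fail, or if an operator error occurs.
both_pumps_fail = and_gate([0.1, 0.2])
top_event = or_gate([both_pumps_fail, 0.01])
print(both_pumps_fail, top_event)
```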

Inductive method
So-called "what if" methods are inductive because the questions are formulated and answered to evaluate the effects of component failures or procedural errors on the operability and safety of the plant or a machine. For example: "What if the flow in the pipe stops?" This category includes: Failure Mode and Effects Analysis (FMEA); Hazop studies; Machinery concept hazard analysis (MCHA).

Rating for Safety


The following expression defines the relationship between safety Availability and PFD:

Safety Availability = 1 - PFD

It is often desirable to express the SIL in terms of the hazard reduction factor (HRF), defined as:

HRF = 1 / PFD

Safety Integrity Levels and different safety standards

AK Class & SIL

Linking Risks to SIL


To determine the application of a SIS for an actual installation, the control engineer should use a qualitative classification of risk assessment. A qualitative evaluation of safety integrity level weighs the severity and likelihood of the hazardous event. It also considers the number of independent protection layers addressing the same cause of a hazardous event.

Safety Integrity Level (SIL)


During the 1990s the concept of safety-integrity levels (known as SILs) evolved and is used in the majority of documents in this area. The concept is to divide the spectrum of integrity into a number of discrete levels (usually four) and then to lay down requirements for each level.

Clearly, the higher the SIL then the more stringent become the requirements.

Safety-Integrity Levels (SILs)

To further understand these important terms, let us ask a fundamental question: how frequently will failures of either type of function lead to accidents? The answer is different for the two types.

For functions with a low demand rate, the accident rate is a combination of two parameters: i) the frequency of demands, and ii) the probability that the function fails on demand (PFD). In this case, therefore, the appropriate measure of performance of the function is PFD, or its reciprocal, the Risk Reduction Factor (RRF).

For functions which have a high demand rate or operate continuously, the accident rate is the failure rate, λ, which is the appropriate measure of performance. An alternative measure is the mean time to failure (MTTF) of the function. Provided failures are exponentially distributed, MTTF is the reciprocal of λ. These performance measures are, of course, related. At its simplest, provided the function can be proof-tested at a frequency which is greater than the demand rate, the relationship can be expressed as:
PFD = λT/2 = T/(2 x MTTF), or RRF = 2/(λT) = (2 x MTTF)/T
where T is the proof-test interval.
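A short worked example of this relationship follows, with T taken as the proof-test interval and λ as the failure rate of the function. The approximation is only reasonable when λT is much smaller than 1; the MTTF and test interval used are hypothetical.

```python
def pfd_from_failure_rate(failure_rate: float, proof_test_interval: float) -> float:
    """PFD = lambda x T / 2 (both arguments in consistent time units)."""
    return failure_rate * proof_test_interval / 2.0


def rrf_from_mttf(mttf: float, proof_test_interval: float) -> float:
    """RRF = (2 x MTTF) / T, the reciprocal of the PFD above."""
    return 2.0 * mttf / proof_test_interval


# Hypothetical function: MTTF of 100 years, proof-tested once a year.
example_pfd = pfd_from_failure_rate(failure_rate=1 / 100, proof_test_interval=1.0)
example_rrf = rrf_from_mttf(mttf=100.0, proof_test_interval=1.0)
print(example_pfd, example_rrf)
```

Halving the proof-test interval halves the PFD in this simple model, which is why test frequency appears among the risk reduction methods.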

Definitions of SILs for Low Demand Mode from BS EN 61508

Definitions of SILs for High Demand / Continuous Mode from BS EN 61508
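The low demand mode bands can be expressed as a simple lookup from PFDavg to SIL. The band limits below are the commonly quoted IEC 61508 values for low demand mode (for example, SIL 2 covers PFDavg from 10^-3 up to but not including 10^-2); they are supplied here as an assumption because the tables themselves are not reproduced in this text.

```python
# (SIL, lower limit inclusive, upper limit exclusive) for PFDavg, low demand mode.
LOW_DEMAND_BANDS = [
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
]


def sil_low_demand(pfd_avg: float):
    """Return the SIL band containing pfd_avg, or None if it lies outside all bands."""
    for sil, lower, upper in LOW_DEMAND_BANDS:
        if lower <= pfd_avg < upper:
            return sil
    return None


for pfd in (0.05, 0.005, 5e-4, 0.5):
    print(pfd, sil_low_demand(pfd))
```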

So what is the SIL achieved by the function? Clearly it is not unique, but depends on the hazard and in particular whether the demand rate for the hazard implies low or high demand mode. SIL is a measure of the SIS performance related only to the devices that comprise the SIS. This measure is limited to device integrity, architecture, testing, diagnostics, and common mode faults inherent to the specific SIS design. It is not explicitly related to a cause-and-effect matrix, but it is related to the devices used to prevent a specific incident.
Further, SIL is not a property of a specific device. It is a system property; input devices through logic solver to output devices. Finally, SIL is not a measure of incident frequency. It is defined as the probability (of the SIS) to fail on demand (PFD). A demand occurs whenever the process reaches the trip condition and causes the SIS to take action.

The new ANSI/ISA S84.01 standard requires that a target safety integrity level (SIL) be assigned for all safety instrumented system (SIS) applications. The assignment of the target SIL is a decision requiring the extension of the process hazards analysis (PHA) process to include the balance of risk likelihood and severity with risk tolerance.

Since SIL 4 is rarely used, SIL 3 is typically the highest specified safety level. Of the three commonly used levels, SIL 3 has the greatest required safety availability (RSA), and therefore the lowest average probability of failure on demand (PFD). Required Safety Availability (RSA) is the fraction of time that a safety system is able to perform its designated safety function when the process is operating.

A determination of the target safety integrity level requires:


1. An identification of the hazards involved.
2. An assessment of the risk of each of the identified hazards. In other words, how bad is each hazard and how often is it expected to occur?
3. An assessment of other Independent Protection Layers (IPLs) that may be in place.

Risk Level Factors Based On Frequency

Risk Level Factors Based On Severity

Safety Architectures
Several system architectures are applied in process safety applications, ranging from single-channel systems to triple redundant configurations. Control engineers must match the architecture to the safety requirements of the operating process, accounting for failures in the safety system itself.
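The difference between a single-channel and a triple redundant arrangement can be sketched with voting logic. Two-out-of-three (2oo3) voting is used here as one common form of triple redundancy; the text does not say which voting scheme it has in mind, so this choice is an assumption, and the function names are illustrative.

```python
def vote_1oo1(a: bool) -> bool:
    """Single channel: the trip decision follows the one sensor."""
    return a


def vote_2oo3(a: bool, b: bool, c: bool) -> bool:
    """Triple redundant: trip when at least two of the three channels demand it."""
    return (a and b) or (a and c) or (b and c)


# One channel stuck at "no trip" does not block a genuine demand in 2oo3 ...
print(vote_2oo3(True, True, False))
# ... and one channel spuriously demanding a trip does not shut the plant down.
print(vote_2oo3(True, False, False))
```

The single-channel vote has neither property: a failed sensor either blocks the trip or causes a nuisance trip, depending on how it fails.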

One concern is that many safety systems in operation, or under construction, do not follow basic protection principles. Unsafe practices include: performing the safety shutdown within the basic process control system (BPCS) or distributed control system (DCS); using conventional programmable logic controllers (PLCs) in safety-critical applications (safety PLCs, by contrast, are certified for safety-critical applications to SIL 2 and SIL 3); implementing single-element (non-redundant) microprocessor-based systems on critical processes.

The conventional PLC architecture provides only a single electrical path. Sensors send process signals to the input modules. The logic solver evaluates these inputs, determines if a potentially hazardous condition exists, and energizes or de-energizes the solid-state output. (Fire and gas detection systems, for example, use the energize-to-trip philosophy.) Suppose the safety system de-energizes the output to move the process to a safe state. Suppose also that one of the components in the single path fails so that the output cannot be de-energized. Then the conventional PLC will not provide its desired safety protection function.

A special class of programmable logic controllers, called safety PLCs, represents an alternative. Safety PLCs provide high reliability and high safety via special electronics, special software, pre-engineered redundancy, and independent certification.
The safety PLC has input/output circuits designed to be fail-safe, using built-in diagnostics. The central processing unit (CPU) of a safety PLC has built-in diagnostics for memory, CPU operation, watchdog timer, and communication systems.

Accurately evaluating the safety level for a specific control device in the context of a potentially hazardous event poses a major and difficult problem for many control engineers. Associations and agencies worldwide have made considerable progress toward establishing standards and implementation guidelines for safety instrumented systems. These standards attempt to match the risk inherent in a given situation to the required integrity level of the safety system. Unfortunately, many of these guidelines and standards are not specific to a particular type of process and deal only with a qualitative level of risk. Control engineers must use considerable judgment in evaluating risk and in applying instrumentation that satisfies established design procedures within budget constraints.

Typical Applications
A fault-tolerant control system identifies and compensates for failed control system elements and allows repair while continuing its assigned task without process interruption. A high-integrity control system is used in critical process applications that require a significant degree of safety and availability. Some typical applications are:
1- Emergency Shutdown
2- Boiler Flame Safety
3- Turbine Control Systems
4- Offshore Fire and Gas Protection

1- Emergency Shutdown
A safety instrumented system provides continuous protection for safety-critical units in refineries, petrochemical/chemical plants and other industrial processes. For example, in reactor and compressor units, plant trip signals for pressure, product feed rates, expander pressure equalization and temperature are monitored, and shutdown actions are taken if an upset condition occurs.

Traditional shutdown systems implemented with mechanical or electronic relays provide shutdown protection but can also cause dangerous nuisance trips. Safety instruments provide automatic detection and verification of field sensor integrity, integrated shutdown and control functionality, and direct connection to the supervisory data highway for continuous monitoring of safety critical functions.

2- Boiler Flame Safety


Process steam boilers function as a critical component in most refinery applications. Protection of the boiler from upset conditions, safety interlocks for normal startup and shutdown, and flame-safety applications are combined in one integrated safety instrumented system. In traditional applications, these functions had to be provided by separate, non-integrated components. With a fault-tolerant, fail-safe integrated controller, the boiler operations staff can use a critical resource more productively while maintaining safety at or above the level of electromechanical protection systems.

3- Turbine Control Systems


The control and protection of gas or steam turbines requires high integrity as well as safety. The continuous operation of the fault tolerant integrated controller provides the turbine operator with maximum availability while maintaining equivalent levels of safety. Speed control as well as start-up and shutdown sequencing are implemented in a single integrated system. Unscheduled outages are avoided by using hot spares for the I/O modules. If a fault occurs in a module, a replacement module is automatically activated without operator intervention.

4- Offshore Fire and Gas Protection


The protection of offshore platforms from fire and gas threats requires continuous availability as well as reliability. The safety instrument system provides this availability through online replacement of faulty modules; field wiring and sensors are managed automatically by built-in diagnostics. Analog fire and gas detectors are connected directly to the controller, eliminating the need for trip amps. An operator interface monitors fire and gas systems as well as diagnostics for the controller and its attached sensors. Traditional fire and gas panels can be replaced with a single integrated system, saving costly floor space while maintaining high levels of safety and availability.
