TH!NK Magazine - When 99% Is Not Enough

06/01/2010
David Bester & Michael Zerbs

EXPLORING THE GAP BETWEEN CONFIDENCE AND CERTAINTY

In the world of financial engineering, risk measures are used to estimate the probabilities of unexpected outcomes. Value at Risk (VaR) is commonly used to estimate the worst loss an institution can expect within a given time frame at a stated confidence level, often 99%. For some businesses, however, a measure that covers only 99% of outcomes is simply unacceptable. TH!NK takes a closer look at the demolition and nuclear power industries to understand how risk is managed when anything less than a perfect outcome can be a catastrophe.

A CONTROLLED DEMOLITION

When a financial institution is destroyed, something terrible has happened. When Controlled Demolition, Inc. (CDI) destroys something, it means everything has gone according to plan. CDI is an American company that demolishes structures with the kind of precision and planning usually associated with their creation. For over 60 years, three generations of the Loizeaux family have built a commercial explosives demolition industry through innovation, expertise, leadership and a methodology designed to guarantee complete predictability. This history includes the implosion of the Kingdome in Seattle, which holds the Guinness World Record for the largest structure by volume ever felled with explosives.

“Once we decide that we can safely perform a project from a technical standpoint, the first risk we manage is that of our client,” says CDI’s President, Mark Loizeaux. “We need to understand their perspective, their position in the industry and what they have to lose. Our solution to their problem needs to embrace those points, as well as those related to CDI’s scope of work and task at hand.”

CDI begins with engineering to see if the numbers match the company’s intuitive analysis for the likelihood of success, based on their experience in the demolition of thousands of structures. Not everything CDI does can be proven mathematically, because even with a set of original, as-built blueprints, a finite structural analysis would need to be performed to fully trust the data plugged into structural engineering formulas. “In order to achieve this level of trust, we would have to de-build the structure. Once this is done, there would be no need to implode it!” Mark points out.

The next step is to break the demolition down into a series of sequential tasks, with critical path management at each level, to ensure absolute control over the project’s successful outcome. This requires tremendous experience, extraordinary observation and management skills, and the ability to communicate clearly with not only the client, but also every single team member involved.

“We become the core clearing house for decision making, communication and performance. If we aren’t permitted this role, then we aren’t interested in the project,” states Mark. CDI takes this position because under common law in the United States, explosives-handling operations fall under strict or absolute liability. This means that operators are not considered innocent until proven guilty in the court’s eyes. Rather, they are guilty until they can demonstrate their innocence. This puts CDI at extraordinary risk with regard to litigation, as the company’s insurance and reputation are first in line for claims.

EXPLOSIVE RELATIONSHIPS

Historically, the relationship between regulators and the demolition industry has been a tenuous courtship. “Regulators are accustomed to industries that rely on mathematical analysis and computer programs memorialized in technical publications and books,” observes Mark. While construction disciplines are taught in universities and there is a well-documented history of how to control a design/build process when a structure is erected, the same can’t be said for taking a structure apart. This is largely because the data on what actually exists in older, fatigued structures is too uncertain, and there is no large body of data or industry-sponsored groups to vouch for the data, as is the case in the construction industry. As a result, the National Demolition Association spends a great deal of money to educate regulators and maintain clear lines of communication.

“Regulatory agencies in our field don’t know as much about what we are doing as we do, particularly with regard to new and cutting-edge concepts. As a result, regulators tend to be more reactive than proactive, and show up once a problem has been identified at a demolition project. Prior to an accident, regulators tend to rely more on the individual reputations and track records of companies requesting permits. This is why CDI treasures and protects its reputation so aggressively,” says Mark.

Defense of this reputation extends to the company’s work as a consultant. “The problem with many consultants is that they prognosticate from lofty ivory towers with either no hands-on experience or, if the consultant is long retired from the demolition industry, outdated data. A consulting opinion without a commitment to stand behind same is almost useless.” To maintain its reputation and support its mandate to understand client needs, CDI offers to perform the work at a future date, up to five years out, for every consulting project it handles. With this cost certainty in place, a developer wanting to borrow $100 million from a major U.S. bank for a new development can move forward knowing it can rely on a fixed price from a solvent, well-insured, internationally recognized demolition company.

A POWERFUL REACTION

Gee Sham is a Senior Engineer in the Canadian nuclear industry, where the idea that “99% is not enough” is often used in seminars and presentations. “A safety system in a nuclear power plant is in place to prevent a core meltdown. Having a safety system is insufficient; it must be able to actuate in response to a large coolant loss,” says Gee. In the nuclear industry, the target for safety system unavailability is generally between six and eight hours per year. By comparison, 99% availability would mean roughly 88 hours of unavailability per year, and a facility that achieved only that level would have its license taken away. These standards are in place to prevent an incident like the 1979 Three Mile Island accident, where a partial core meltdown resulted in the release of a substantial volume of radioactive gases.
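The gap between the six-to-eight-hour target and a 99% availability figure can be checked with simple arithmetic. The following is a minimal sketch, assuming a 365-day year; the function names are illustrative and do not come from any industry tool.

```python
# Sketch: convert between an availability fraction and hours of
# unavailability per year. Assumes a 365-day year (8,760 hours).
HOURS_PER_YEAR = 365 * 24


def unavailable_hours(availability: float) -> float:
    """Hours per year a system is unavailable at the given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


def availability_from_hours(hours_unavailable: float) -> float:
    """Availability fraction implied by a yearly unavailability in hours."""
    return 1.0 - hours_unavailable / HOURS_PER_YEAR


if __name__ == "__main__":
    print(f"99% availability: {unavailable_hours(0.99):.1f} hours/year unavailable")
    for hours in (6, 8):
        print(f"{hours} hours/year unavailable: {availability_from_hours(hours):.4%} availability")
```

Under these assumptions, the quoted target corresponds to an availability of more than 99.9%, about ten times stricter than 99%.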

Probabilistic risk assessment and project risk management are the two main approaches used by the nuclear industry. “A fault tree analysis is a requirement for regulators. For any operational system, we are required to model every component down to the level of every single switch and relay, or even a spring within a switch, that could contribute to a system failure,” states Gee. He offers the example of a pump, which is vital to coolant injections. “The pump can fail, so we need to identify what could lead to this failure. One way to mitigate a pump fail is to install a secondary pump. But this might also fail. So we install a third pump. Every piece of equipment that has the potential to fail must be modeled probabilistically. We then assign a probability of failure to each individual component, which forms the fault tree.”
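The pump example can be sketched as a very small fault tree. This is an illustration only: the failure probabilities are hypothetical, the shared valve is an invented component added to show how gates combine, and the components are assumed to fail independently, which ignores the common-cause failures a real assessment would have to model.

```python
from math import prod

# Hypothetical per-demand failure probabilities, chosen for illustration only.
PUMP_FAILS = 0.01
SHARED_VALVE_FAILS = 0.001  # invented single component that all pumps feed through


def and_gate(probs):
    """Output fails only if every input fails (redundant components)."""
    return prod(probs)


def or_gate(probs):
    """Output fails if any input fails: 1 - P(no input fails)."""
    return 1.0 - prod(1.0 - p for p in probs)


def injection_failure(n_pumps: int) -> float:
    """Top event: coolant injection fails, given n redundant pumps and one shared valve."""
    all_pumps_fail = and_gate([PUMP_FAILS] * n_pumps)
    return or_gate([all_pumps_fail, SHARED_VALVE_FAILS])


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"{n} pump(s): P(injection fails) = {injection_failure(n):.6f}")
```

With these made-up numbers, each added pump shrinks the pump branch by a factor of 100, but the top-event probability levels off near that of the single shared valve. That is one reason every component, not just the obvious ones, has to appear in the tree.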

Within the tree framework, engineers know whether an individual component needs to be tested on a daily, weekly or monthly basis. These elements form the industry’s probabilistic risk model, which dictates how testing, maintenance and replacements are performed.

Channelization and duplicate redundancies are used to further improve system reliability. “We use sensors to detect low pressure in the Primary Heat Transport system. The system requires one sensor, but what happens in the case that this one sensor is not operating properly? We address this variable by using three sensors. Instead of one pump, we will have two or three in operation. Through this process we have protected ourselves against potential channel rejections. Channelization also reduces the possibility of generating a spurious signal because it requires two out of three channel logic to make it a true signal,” says Gee.
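The two-out-of-three logic Gee describes can be sketched in a few lines. The per-channel probability used below is hypothetical, and the channels are assumed to behave independently; the function names are illustrative.

```python
def two_out_of_three(a: bool, b: bool, c: bool) -> bool:
    """Voted signal: true only when at least two of the three channels agree."""
    return (a and b) or (a and c) or (b and c)


def prob_at_least_two(p: float) -> float:
    """Probability that at least two of three independent channels are in a
    state that each channel enters with probability p."""
    return 3 * p**2 * (1 - p) + p**3


if __name__ == "__main__":
    # A single channel raising a false low-pressure signal does not trip the system.
    print(two_out_of_three(True, False, False))
    # Hypothetical 1% chance that any one channel misbehaves on a given demand.
    p = 0.01
    print(f"single channel: {p:.6f}")
    print(f"2-out-of-3 vote: {prob_at_least_two(p):.6f}")
```

The same formula covers both failure modes: two channels must misfire to produce a spurious signal, and two must fail to respond for a real signal to be missed. Under the independence assumption, either outcome is far less likely than with a single sensor.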

Taking a reactor offline is expensive, so maintenance is scheduled when demand is low. All risks of overrunning the schedule or exceeding the maintenance budget are logged in a risk register, which ties realistic numbers to specific components based on operating experience. To minimize the risk that a project overruns, repair and maintenance variables are addressed in great detail. If a maintenance or refurbishment project involves opening up a major piece of equipment, the potential to encounter unforeseen problems must also be taken into account. Additional parts and engineering support are secured on the basis of a cost/benefit analysis, because acquiring spares and keeping engineering resources on standby for every conceivable situation is not feasible.

SHARED BORDER, DIFFERENT APPROACHES

There is a marked difference between nuclear oversight in the U.S. and Canada. In the U.S., the Nuclear Regulatory Commission specifies how each utility must operate in each power plant. The risk of each component is dictated by the agency and compliance is enforced. Canadian regulators, who oversee a smaller number of plants that operate in different fashions, provide directives and it is up to the Canadian plants to demonstrate compliance. “Let’s say there is one type of valve that is known to be problematic,” explains Gee. “The manufacturer sends out a bulletin and the U.S. regulator would specify what should be done. In Canada the regulator would ask, ‘What are you going to do to demonstrate that this is under control?’”

Operators on both sides of the border share one goal: a target of Zero Defect and Zero Tolerance. This extends to accident rates as well. Gee relates that at his plant, “We have achieved 6.5 million hours without any lost time accidents. We would normally complete between two and three million hours without an incident so we have achieved well above 99% in this area. But the target remains zero, and this applies to cut fingers, back injuries and anything that is an outcome of work-related injuries. Most nuclear and utility-related industries are on this track, and considering the volume of people working 40+ hours per week, our results are quite an achievement.”

PARALLEL LINES

How can regulators acquire higher-quality information to make better assessments of the true risks facing institutions? One analogy can be drawn between the concept of “guilty until proven innocent” cited by CDI and the growth in investor suitability testing in financial services. Prior to the credit crisis, the onus was generally on corporate or individual investors to ensure that they understood the risks of financial products, subject to fairly basic risk disclosure requirements. Today investor suitability testing is demanded in much more depth, an indication that the industry has moved from assuming that things are fine until there is a problem to a more proactive stance.

The issue of regulators in bordering countries bringing a different approach to industry oversight was referenced for the nuclear industry. The Markets in Financial Instruments Directive (MiFID) and the latest update to Undertakings for Collective Investment in Transferable Securities (UCITS IV) attempt to provide harmonized regulations for investment services across various borders of the European Union. These principles seek to deal with the European problem of connecting common passport rules with different perspectives on the nature and objectives of regulation that can vary from country to country.

Under common passport rules, Iceland’s Icesave bank was allowed access to the British market without much oversight by the FSA. More than 400,000 depositors from the UK and the Netherlands deposited funds into Icesave’s high interest accounts, but the credit crunch exposed the failings in the Icelandic banking model. It became clear that oversight by the Icelandic regulator was insufficient. Icesave is a good example of where innocent until proven guilty would have been the standard approach at one time, while today the host regulator would probably insist on its supervision complementing the home regulator’s supervision to a much greater extent.

One outcome is clear from all three industries: if regulators become aware of a problem only once something has blown up, they have become involved too late.

TOP TO BOTTOM, BOTTOM TO TOP

Within risk management and finance there is a debate about whether to work top down or bottom up: is it good enough to have some high-level assumptions about how the system behaves in the aggregate, or do you need to look at each individual position and model it?

One luxury the physical sciences have is that they are typically modeling processes with stable, well-defined properties. In social sciences, models are built to approximate how the human mind works, but we don’t understand human behavior to the same degree as we understand the physical properties of steel, or the pressure required to cause concrete columns to collapse. Trying to build a set of models with stable statistical properties over a baseline of assumptions is a more complex proposition.

The concept of redundancy in nuclear engineering raises a parallel question for finance: how can an organization know how much capital is enough when it cannot be certain that its risk model captures the relevant range of outcomes? To make matters more challenging, what is enough during normal market conditions often isn’t enough during periods of turmoil. Financial institutions need enough liquidity and risk capital to withstand large losses, and must have additional liquidity and loss-bearing capacity to carry on with their business afterward. With this additional liquidity in place, firms can maintain full market confidence, even when they can’t raise more capital and their balance sheets hold large positions in illiquid and complex assets.

Many banks faced this problem during 2008 and 2009, when they had enough capital to make up for the initial wave of losses but were perceived to be left with insufficient funds to run a business or protect the bank in the event of a second wave. So how much redundancy is needed? One approach is to increase the availability and quality of capital and liquidity in case there is a large problem. A complementary approach is to utilize multiple redundant measures to better understand where problems could develop early on even if any specific model fails. A rigorous and disciplined approach to the implementation and interpretation of several redundant measures is one way in which the financial risk community could learn from other disciplines.

The financial industry accepts a certain risk tolerance. What practitioners can perhaps learn from how industries governed by physical properties define their own tolerance is that operating with a 1 in 1,000 chance that unemployment doubles and the budget deficit goes up to 12% is not good enough. On a relative basis, we should rethink what an acceptable risk tolerance is and try to move toward that line.

NORMALIZATION OF DEVIANCE

About a year before the credit crisis a small episode took place in the CDO market. There was a worry that the debt ratings on Ford and GM would be dropped below investment grade, causing a massive realignment of correlations. Some instruments were subsequently priced far outside of what the models had predicted. For a few weeks the market was quite concerned about these implications, but, ultimately, the expected downgrade did not occur and the market moved on.

Periodically situations like this occur where a small market disturbance suggests an underlying issue of model failure. Since no bank went under in these transactions and markets returned to normal, the models were not examined further. One possibly harmful outcome from the continued use of these models was that a false sense of security took hold, a behavior that practitioners in social and physical sciences must seek to avoid.

In the nuclear industry this phenomenon is referred to as normalization of deviance. Gee cites an example from the aerospace industry that underscores the potential damage a false sense of security brings. “NASA was using foam insulation to protect space shuttles from the heat caused by reentry into the earth’s atmosphere. The foam was known to come loose and damage the craft’s thermal protection system. This had happened four or five times and NASA management was aware of it, but it did not lead to a change in behavior.” The deviance became accepted as normal until the catastrophe of 2003, when the shuttle Columbia disintegrated during reentry.

A similar normalization of deviance preceded the Three Mile Island accident, where a control panel light was known to malfunction. The operators got used to it and thought it was acceptable, but treating the fault as normal ultimately became a contributing factor in the reactor’s partial meltdown, as that light was designed to convey important information. “When I go into a reactor I take a gamma meter and if that meter goes off I back out even if everything appears perfectly normal. It’s easy to accept small modifications in protocol, but the stakes are too high to allow any deviance to go unchallenged. No matter how small the problem appears to be, it must be reported and it must be reviewed,” says Gee.

As a final question, Gee was asked what the financial industry could learn from his work. “Identify deviations. Challenge them. If you give high-risk mortgages and accept that they should be treated the same as other mortgages because everyone else is doing it, or because you believe the housing market is on a continual upswing, you have accepted deviance as normal. Once we start believing that this deviance will lead to predictable behavior it is the same thing as waiting for an accident to happen.”


The same question was asked of Mark at CDI, who had a slightly different take. “I think it is more appropriate to say that what I learned from the financial industry is what I apply to the demolition industry: never risk more than you’re prepared to lose.”