‘Hurricane Katrina Syndrome’:
Lessons from the NHS Hacking (May 2017)

‘Most man-made disasters and violent conflicts are preceded by incubation periods during which policy makers misinterpret, are ignorant of, or flat-out ignore repeated indications of impending danger’ (Boin & ’t Hart, 2003:547)

Along with major organisations in over 100 countries world-wide, in recent days the UK National Health Service (NHS) suffered a series of ransomware cyber-attacks that either locked down IT systems, with a threat of total destruction of those systems unless a ransom was paid, or forced other parts of the service to shut down their IT networks in order to prevent further spread and infection.

As is usual in such cases, this has been described as a dastardly attack from outside – this time, at least, for criminal rather than terrorist purposes – but it became clear almost immediately that the attack follows a classic pattern: a known vulnerability ignored by people in power, even though they were fully aware of the potentially catastrophic consequences of inaction and had been given multiple small-scale warnings of what a full-scale attack would do.

‘Hurricane Katrina Syndrome’
‘Despite the understanding of the Gulf Coast’s particular vulnerability to hurricane devastation, officials braced for Katrina with full awareness of critical deficiencies in their plans and gaping holes in their resources’ (US Congress, 2006:5)

What might be called ‘Hurricane Katrina Syndrome’ is not confined to the NHS, or to any of the other countries involved; it is something from which all risk managers can (and should) learn, whatever the nature of the organisation they work in.

This latest attack cannot be claimed to be unexpected. In fact, given the changing nature of cyber-attacks, it is clear that the combination of a high level of organisational criticality, a diffuse and de-centralised network that allowed any of its multiple points of entry to be used to reach every other area of the system, and chronic underfunding of appropriate security measures meant that the NHS’s IT systems were little different from the Bank of England leaving the doors to its main gold vaults open to anyone passing by.

The study of the failure of what are supposedly ‘High Reliability Organisations’ is a central part of the Deltar Level 4 ‘Advanced Risk and Crisis Management’ programmes that we now run for senior risk managers all over the world. I think it is worth repeating some of those lessons here, as highlighted in the official reports into major global events that were themselves the result of ‘Hurricane Katrina Syndrome’. It is not difficult to see that inserting the word ‘NHS’ into each of these reports would make them exact descriptions of the chain of organisational, managerial and policy failures that directly created the vulnerabilities which allowed the attacks to be carried out so easily and so successfully.

From the Space Shuttle Challenger Report (3)
There was ‘pressure throughout the agency that directly contributed to unsafe launch operations. The committee feels that the underlying problem that led to the Challenger accident was not poor communications or inadequate procedures…. [T]he fundamental problem was poor technical decision-making over a period of several years by top NASA and contractor personnel…. Information on the flaws in the joint design…. was widely available, and had been presented to all levels of shuttle management…. [T]here was no sense of urgency on their part to correct the design flaws in the SRB.’

From the Space Shuttle Columbia Report (4)
‘[The Board] considered it unlikely that the accident was a random event; rather, it was related in some degree to NASA’s budget, history and programme culture, as well as to the politics, compromises and changing priorities of the democratic process. We are convinced that the management practices overseeing the Space Shuttle Programme were as much a cause of the accident as the foam that struck the left wing’.

From the Deepwater Horizon Report
‘These failures (to contain, control, mitigate, plan and clean up) appear to be deeply rooted in a multi-decade history of organizational malfunction and short-sightedness. There were multiple opportunities to properly assess the likelihood and consequences of organizational decisions (i.e. Risk Assessment and Management)…. [A]s a result of a cascade of deeply flawed failure and signal analysis, decision-making, communication and organizational-managerial processes, safety was compromised to the point that the blowout occurred with catastrophic effect’.

It becomes clear from reading these reports that the events that they describe are not random, unexpected or without cause. They are in fact the inevitable result of people in management positions consciously deciding to ignore problems which they are aware of, but which they have no intention of dealing with. The question that should be asked is not ‘Why did this happen?’, but ‘Why did we not do something about it?’.

‘Pathway to Disaster’

The organisational weaknesses that are the precursor to almost all disasters of this nature were identified by Charles Perrow in ‘Normal Accidents: Living with High-Risk Technologies’ (6), one of the most influential books on understanding and managing disasters. Perrow described a ‘Pathway to Disaster’ that can act as a quick test for identifying the inbuilt vulnerabilities that are almost certain to lead to a high-impact (and possibly catastrophic) event; a sketch of how such a checklist might be applied follows the list below.

  • The crisis is the result of weaknesses within our own systems, not the result of an outside event
  • There are a series of low-level ‘Normal Accidents’ that highlight those weaknesses – but they are ignored
  • When the crisis is triggered, it is not recognised as a crisis because people think that it is the same as the previous low-level ‘accidents’
  • When you start to react to the disaster, there are three shortages:
    • Equipment
    • Manpower
    • Management skills
  • When you do react to the crisis, it does not respond as predicted (‘Law of Unintended Consequences’)
  • Once the disaster is over, lessons are not learned
  • Once the disaster is over, it can be clearly seen that it was an inevitable consequence of systemic weaknesses that were known, and ignored
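
To make that quick test concrete, here is a minimal illustrative sketch (in Python) of how an organisation might turn these indicators into a rough self-assessment. The indicator wording, the scoring bands and the `pathway_score` function are assumptions introduced purely for illustration; they are not part of Perrow’s text or of any NHS assessment.

```python
# Illustrative sketch only: a hypothetical self-assessment built around the
# 'Pathway to Disaster' indicators listed above. The indicator wording, the
# scoring bands and the function name are assumptions made for illustration.

PATHWAY_INDICATORS = [
    "Known weaknesses exist within our own systems, not just in outside threats",
    "Low-level 'normal accidents' have already highlighted those weaknesses",
    "Early signs of a real crisis would probably be read as routine incidents",
    "A major event would expose a shortage of equipment",
    "A major event would expose a shortage of manpower",
    "A major event would expose a shortage of crisis-management skills",
    "Lessons from previous incidents have not been acted on",
]


def pathway_score(answers):
    """Count how many indicators an honest 'yes' applies to and band the result."""
    if len(answers) != len(PATHWAY_INDICATORS):
        raise ValueError("one True/False answer is required per indicator")
    hits = sum(bool(a) for a in answers)
    if hits >= 5:
        band = "high inbuilt vulnerability"
    elif hits >= 3:
        band = "significant vulnerability"
    else:
        band = "lower, but not zero, vulnerability"
    return f"{hits}/{len(PATHWAY_INDICATORS)} indicators present: {band}"


if __name__ == "__main__":
    # Example: an organisation answering 'yes' to most of the indicators.
    print(pathway_score([True, True, True, True, True, False, True]))
```

The point of such a sketch is not the scoring itself but the discipline of answering each question honestly before, rather than after, the event.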

What Happens Next…?

The question now is not ‘How do we fix the NHS?’, but ‘How can we keep our critical national infrastructure safe from similar attacks – especially at a time when there is chronic underfunding, when a lack of rational management structures means that no-one is actually responsible for ensuring the safety and security of the systems, and when cyber-threats evolve so quickly that solutions that are effective today will undoubtedly be outmoded in three months’ time?’

This one was on the NHS. What if the next one is on the global banking system… nuclear power stations… national transport… air traffic control… global communications?

It would be nice to think that somebody, somewhere, is actually thinking about these questions in a serious manner.
