Operational resilience in financial services
Paul Williams, Head of Division, Operational Risk and Resilience at the Prudential Regulation Authority, and Anne Wetherilt, Senior Adviser at the Bank of England, consider the impact of the new Operational Resilience policy on the financial services sector.
In March 2021, the Bank of England (the Bank), the Prudential Regulation Authority (PRA) and the Financial Conduct Authority (FCA) published a new approach to regulating operational resilience in the UK finance sector. This paper describes the policy, its main objectives and foundational elements. It starts with a brief overview of the UK financial system and the role of the financial authorities.
The UK financial system
The financial system can be defined in broad terms as the financial institutions, financial markets and financial market infrastructure firms that together provide the vital services which support the functioning of the real economy. This includes banks and building societies, investment banks, insurance companies, asset managers, pension funds and recognised payment systems, to name but a few. The system’s role is to: (i) provide the main mechanisms for paying for goods, services and financial assets; (ii) act as intermediary between savers and borrowers, and channel savings into investments; and (iii) insure against and help disperse risk.
In the UK, supervision of individual institutions is delivered by three separate entities, namely the Bank, the PRA and the FCA, collectively known as ‘the financial authorities’. This setup has been in existence since 2013 and reflects different underlying legislative regimes for banks, building societies, major investment firms and insurance firms (regulated by the PRA), financial market infrastructure firms (supervised by a separate directorate in the Bank) and other financial services firms and financial markets (regulated by the FCA). In practice, there is both overlap (some large firms are regulated by both the PRA and the FCA) and a clear separation of roles.
Specifically, the Bank is responsible for the overall stability of the UK financial system (‘financial stability’). This includes: (i) ensuring that individual entities manage risks in an appropriate manner by setting clear baseline expectations, (ii) ensuring that the level of resilience adapts to the risks the system faces, and (iii) enabling the system to absorb any shocks so it can continue serving the real economy. The Bank is expected to deliver the desired level of resilience in an efficient manner without hampering the ability of the system to serve the real economy.
The Bank’s broad financial stability objective is delivered in two main ways. First, through the regulation and supervision of individual institutions (‘microprudential’) and, secondly, through interventions aimed at the wider sector (‘macroprudential’). The latter is the explicit responsibility of the Financial Policy Committee (FPC). In practice, the two approaches work closely together; for example, banks’ capital holdings reflect both microprudential regulatory requirements and a macroprudential view on the desired level of capital in the system.
The FCA has a unique objective aimed at ensuring that financial markets are fair and competitive and at protecting consumers from harm. The PRA is tasked with promoting the safety and soundness of the firms it regulates, and has an additional objective to protect the interests of insurance policyholders. The Bank, as supervisor of financial market infrastructure firms (such as payment systems), ensures that these entities are managed in a way that is consistent with their unique role in maintaining financial stability.
Traditionally, the UK financial authorities, in common with most financial authorities around the globe, focused primarily on financial resilience. For example, as part of the UK’s financial stress-testing programme, large UK banks are asked to demonstrate that they hold sufficient capital to absorb a range of severe but plausible shocks such as a sharp fall in house prices or a steep rise in oil prices. In recent years, however, the financial authorities have increased their focus on operational resilience, defined as the ability of the finance sector to absorb a severe but plausible operational shock such as a cyber incident or loss of physical infrastructure.
Regulating operational resilience: The authorities’ objectives
In early 2021, the financial authorities issued a new microprudential framework for operational resilience. This is the main focus of the present paper. In parallel, the FPC has led on a number of macroprudential initiatives relating to strengthening system-wide resilience. For the time being, these initiatives focus mainly on cyber resilience.
A first in the world of central banking, the new operational resilience framework describes the authorities’ expectations of the financial institutions they regulate. Importantly, the new policy was a common endeavour between the different institutions, thus ensuring a consistent approach. But the sector also depends on non-financial institutions, some regulated (such as telecoms and power), others unregulated (such as software and hardware companies). The paper briefly considers these regulatory perimeter issues.
Operational resilience is commonly viewed as an area where the incentives of public authorities and private companies are closely aligned. While the provision of financial services is clearly of interest to the public at large, individual firms are aware that their own reputation and profitability (their ‘safety and soundness’) depend critically on the trust that the public has in their ability to do so. Hence, there is an expectation, held by the public as well as the authorities, that firms will manage their risks prudently and make appropriate investments in their own operational resilience. But the actions of one individual firm affect all others and this is particularly relevant in the case of cyber: a firm with poor cyber resilience would be not only less able to recover from an incident but might also affect trust in the broader sector. As such, the financial authorities have reached the view that there is indeed a role for regulation in this area, both to make sure the ‘weakest links’ are identified and suitable actions taken and to encourage institutions to prepare for severe but plausible disturbance scenarios that they might be less inclined to pursue themselves.
At the same time, the financial sector enjoys an unusually high degree of public-private co-operation with the Cross Market Operational Resilience Group (CMORG) taking a lead role. CMORG, which brings together senior representatives from across the finance industry, oversees the UK’s exercising programme and delivers across a range of collective workstreams, some aimed at improving communication and co-ordination during an incident, others at developing new technical recovery capabilities that no firm on its own could undertake.
In the first instance, the authorities’ new policy framework outlines baseline expectations for firms’ operational resilience. In turn, supervisors will be tasked with delivering a supervisory programme to get assurance that firms are making the right decisions to build and/or maintain their operational resilience so they can deal with periods of severe stress. In common with the authorities’ existing approach to financial resilience, supervision is judgement based: we assess firms against not just current risks but also those that could plausibly arise further ahead. It is risk based, with enhanced supervisory activities for those issues and firms that are likely to pose the greatest risk to the objectives of the financial authorities. It is proportionate to ensure our interventions do not go beyond what is necessary in order to achieve our objectives.
The next section describes the key concepts that underpin the financial authorities’ approach to operational resilience.
Regulating operational resilience: key concepts
The financial authorities define operational resilience as the ability of firms and the financial sector as a whole to prevent, adapt, respond to, recover from, and learn from operational disruptions. The authorities’ approach to operational resilience is based on the assumption that, from time to time, disruptions will occur which will prevent firms from operating as usual and see them unable to provide their services for a period. Therefore, the new policy requires firms to set and meet clear standards for the services they provide and test their own ability to meet those standards. They should be able to: prevent disruption occurring to the extent practicable; adapt systems and processes to continue to provide services and functions in the event of an incident; return to normal running promptly when a disruption is over; and learn and evolve from both incidents and near misses.
Specifically, firms will be expected to identify important business services and set impact tolerances for these services. They must take action to ensure they are able to deliver their important business services within their impact tolerances. Furthermore, they must undertake regular testing against severe but plausible operationally disruptive scenarios so they can identify vulnerabilities and take mitigating action.
These concepts are explained in more detail below. The concepts are based on principles and aim to drive resilience outcomes without prescribing precisely how these outcomes should be achieved. Therefore, their relevance is not limited to financial services firms: they have broad applicability to any firm or sector with operations which need to be resilient.
Important business services
A business service is a service that a firm provides to an external end-user. This is the first foundational concept of the new operational resilience policy. Users range from other financial institutions (such as foreign exchange transactions between two banks) and corporate users (such as a financial institution lending to a corporate or facilitating mergers and acquisitions), to households (such as mortgage transactions) and the government (such as pension and benefit transfers).
Modern financial institutions provide hundreds of business services to thousands of external users. From an operational resilience perspective, not all are equal. While it is clearly preferable for no disruption to occur, the concept of ‘important’ business services captures a smaller number of business services that require a high degree of resilience. This could be because significant disruption could: (i) pose a risk to the firm (in terms of loss of revenue and/or reputation), (ii) pose a risk to financial stability (due to widespread disruption of services), (iii) cause harm to consumers or affect the integrity of markets, or (iv) in the case of insurance firms, pose a threat to policyholders.
Firms are expected to review their important business services at least annually or sooner if a significant change occurs. For most firms, this will be a relatively short list of externally facing services for which the firm has chosen to build high levels of operational resilience in anticipation of operational disruption. Moreover, company boards and senior management are expected to make judgements in the selection of their important business services. This, the financial authorities consider, will facilitate better decision-making as firms build their own operational resilience.
Impact tolerances
The second foundational concept of the new operational resilience approach specifies the expected standard of resilience. An impact tolerance is defined as the maximum tolerable level of disruption to an important business service. This is likely to vary across firms and services. For example, high-value financial transactions typically require timely settlement so any disruption should be resolved within the business day. If such payments are disrupted, it would not only affect the direct counterparties but could also have spill-over effects on the wider system. Likewise, disruption at a credit-card provider could affect a very large part of the population, disrupting everyday economic activities and potentially leading to wider confidence effects. Equally, there are some financial services which are typically settled over a longer time frame so a relatively longer period of disruption can be tolerated in those instances.
Hence, when firms define their impact tolerances, they need to take into account the impact of failure, both on their own services and, for some firms, on the system as a whole. For example, the PRA will require its regulated firms to ask whether disruption that lasts beyond their tolerance would pose a risk to: (i) the firm’s safety and soundness; (ii) the financial stability of the UK; and (iii) in the case of insurers, the policyholders’ protection. The FCA will ask firms to consider whether disruption could cause intolerable harm to consumers or risk to market integrity. While firms are expected to define ‘intolerable harm,’ the FCA has specified that this could include financial losses to the firm or its consumers, reputational damage, impact on market confidence, as well as loss of confidentiality, integrity or availability of data.
Although impact tolerances are defined with respect to a specific important business service, firms may also need to take a broader perspective and consider the impact of disruption on other related important business services. They may be related because, for example, they share common resources which support the delivery of multiple important business services or because simultaneous disruption could have compounding impacts on numerous external end-users.
Mapping
Many services provided by financial institutions are highly complex, depending on a multitude of processes, people, data and technology. In order to achieve the desired level of operational resilience, firms must have a thorough understanding of these dependencies and their weaknesses. For this reason, the financial authorities expect firms to identify and document the necessary people, processes, technology, facilities and information required to deliver each of their important business services. This identification process is referred to as ‘mapping.’
Adequate mapping should enable firms to meet the following outcomes:
(i) The identification of vulnerabilities. Mapping an important business service should allow a firm to identify the resources that are critical to delivering an important business service, ascertain whether they are fit for purpose and consider what would happen if resources were to become unavailable.
(ii) Testing ability to remain within impact tolerances. Mapping should facilitate the testing of a firm’s ability to deliver important business services within their impact tolerances.
In common with other industries, many financial institutions depend on external service providers through outsourcing and other third-party relationships. As part of their mapping efforts, firms are expected to identify these dependencies so they understand how they support important business services.
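As an illustration only, the outcome of a mapping exercise could be recorded in a simple structure such as the one sketched below. The service names, resource names and the `third_party:` prefix are hypothetical and are not taken from the policy. The sketch shows two questions that a map might help answer: which resources support more than one important business service, and where third-party dependencies sit.

```python
from collections import defaultdict

# Hypothetical mapping: important business service -> resources it depends on.
# All names, and the "third_party:" prefix convention, are illustrative assumptions.
SERVICE_MAP = {
    "retail_payments": {"core_ledger", "payments_team", "third_party:cloud_host"},
    "mortgage_servicing": {"core_ledger", "mortgage_ops_team", "document_store"},
    "fx_settlement": {"settlement_engine", "third_party:cloud_host"},
}


def shared_resources(service_map):
    """Return resources that support more than one important business service."""
    users = defaultdict(set)
    for service, resources in service_map.items():
        for resource in resources:
            users[resource].add(service)
    return {r: sorted(s) for r, s in users.items() if len(s) > 1}


def third_party_dependencies(service_map):
    """Return, per service, the resources flagged as third-party."""
    return {
        service: sorted(r for r in resources if r.startswith("third_party:"))
        for service, resources in service_map.items()
    }


if __name__ == "__main__":
    print(shared_resources(SERVICE_MAP))
    print(third_party_dependencies(SERVICE_MAP))
```

In this toy map, the shared ledger and the shared cloud host would be natural candidates for closer scrutiny, since disruption to either would affect two important business services at once.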
Scenario testing
Finally, the policy requires firms to test regularly their ability to remain within impact tolerances in severe but plausible disruption scenarios. Impact tolerances assume a disruption has occurred. Therefore, testing the ability to remain within impact tolerances should not focus on preventing incidents from occurring but instead on recovery and response arrangements.
Firms should consider a range of severe but plausible scenarios. This could include previous incidents or near misses within the organisation, across the financial sector and in other sectors and jurisdictions. A testing plan should include realistic assumptions and evolve as the firm learns from previous testing rounds.
As an example, the PRA expects firms to develop a testing plan that details how they will gain assurance that they can remain within impact tolerances for their important business services. The nature and frequency of a firm’s testing should be proportionate to the potential impact that disruption could cause and whether the operational resources supporting an important business service have materially changed. When developing a testing plan, firms should consider the following:
(i) The type of scenario testing – which may include paper-based assessments, simulations or live-systems testing.
(ii) The frequency of the scenario testing – firms that implement changes to their operations more frequently should undertake more frequent scenario testing.
(iii) The number of important business services tested – firms that have identified more important business services should undertake more scenario testing to reflect this.
(iv) The availability and integrity of resources – impact tolerances are concerned with the continued provision of important business services. An important business service that can continue to be provided but has insufficient integrity is not within the impact tolerance. Firms should test their recovery plans for both availability and integrity scenarios proportionate to their size and complexity.
(v) How their environment is changing and whether this will give rise to different vulnerabilities.
The severity of scenarios could be varied by increasing the number or type of resources unavailable for delivering the important business service or extending the period for which a particular resource is unavailable. The mapping work that firms will undertake is likely to be useful in deciding how scenarios could be made more difficult.
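The idea of varying severity can be sketched in a few lines of code. This is a deliberately simplified paper-based model, not a method prescribed by the policy: the `Service` class, the `failover_hours` figures and the rule that a service resumes once each affected resource is either restored or replaced by a workaround are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    impact_tolerance_hours: float
    # resource -> assumed hours to resume the service via a workaround or failover
    failover_hours: dict


def disruption_hours(service, outage):
    """Estimate how long the service is disrupted under a scenario.

    `outage` maps each unavailable resource to the hours it stays unavailable.
    Assumption: the service resumes once every affected resource is either
    restored or replaced by its failover, whichever comes first.
    """
    affected = [r for r in outage if r in service.failover_hours]
    return max(
        (min(outage[r], service.failover_hours[r]) for r in affected),
        default=0.0,
    )


def within_tolerance(service, outage):
    """True if the estimated disruption does not exceed the impact tolerance."""
    return disruption_hours(service, outage) <= service.impact_tolerance_hours


# Hypothetical service and figures.
payments = Service(
    name="retail_payments",
    impact_tolerance_hours=8,
    failover_hours={"data_centre": 4, "payments_team": 12},
)

# Severity is escalated by extending an outage and by adding unavailable resources.
scenarios = {
    "short data centre outage": {"data_centre": 2},
    "long data centre outage": {"data_centre": 48},
    "data centre and staff unavailable": {"data_centre": 48, "payments_team": 24},
}
results = {name: within_tolerance(payments, s) for name, s in scenarios.items()}
```

Under these assumed figures the service stays within tolerance in the first two scenarios but not in the third, where the loss of staff takes longer to work around than the tolerance allows.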
The policy requires firms to ensure they are able to deliver their important business services within impact tolerances in severe but plausible scenarios. Mapping and testing the delivery of important business services will equip firms to establish whether and how they can remain within impact tolerances.
Firms are expected to develop and implement effective remediation plans for the important business services that testing shows would not be able to remain within their impact tolerance. Firms should take prompt action where they cannot remain within the impact tolerance, so these plans should include appropriate timing for the necessary improvements. In developing these plans to improve resilience and prioritising their work, firms should also consider:
(i) The nature and scale of the risk that disruption to the important business service could pose to financial stability (if applicable) and the firm’s own safety and soundness. Firms should prioritise those services that pose the greatest risk.
(ii) Time criticality of the important business service, which is high when the impact tolerance is set for a short amount of time. The policy expects firms to have undertaken planning and set up recovery and response arrangements in advance to be able to respond quickly to disruptions when they occur.
(iii) The scale of improvement necessary to remain within the impact tolerance. An important business service that is far from remaining within the impact tolerance may need to be prioritised over a business service that could nearly remain within its impact tolerance in a severe but plausible disruption.
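The three considerations above could be combined into a simple ranking, sketched below. The policy does not prescribe any scoring formula: the `risk_score` scale, the ordering of the sort keys and all figures are assumptions made purely for illustration, and in practice the prioritisation is a matter of judgement for boards and senior management.

```python
from dataclasses import dataclass


@dataclass
class ScenarioOutcome:
    service: str
    risk_score: int  # hypothetical judgement of risk, 1 (low) to 5 (high)
    impact_tolerance_hours: float
    tested_recovery_hours: float

    @property
    def gap_ratio(self):
        """Tested recovery time relative to tolerance; above 1 means a breach."""
        return self.tested_recovery_hours / self.impact_tolerance_hours


def remediation_order(outcomes):
    """Rank services that breached tolerance in testing.

    Assumed ordering: highest risk first, then shortest impact tolerance
    (most time critical), then largest gap to tolerance.
    """
    breaches = [o for o in outcomes if o.gap_ratio > 1]
    return sorted(
        breaches,
        key=lambda o: (-o.risk_score, o.impact_tolerance_hours, -o.gap_ratio),
    )


# Hypothetical test results.
outcomes = [
    ScenarioOutcome("retail_payments", 5, 8, 12),
    ScenarioOutcome("fx_settlement", 5, 4, 6),
    ScenarioOutcome("mortgage_servicing", 2, 72, 240),
    ScenarioOutcome("pension_transfers", 3, 48, 24),  # within tolerance
]
ranked = [o.service for o in remediation_order(outcomes)]
```

A different firm might reasonably weight the scale of improvement ahead of time criticality; the point of the sketch is only that the inputs to the decision can be made explicit.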
Importantly, the policy expects firms to be able to remain within impact tolerances for important business services, irrespective of whether or not they use third parties in the delivery of these services. This means that firms should effectively manage their use of third parties to ensure they can meet the required standard of operational resilience. Although firms may assume that an arrangement is inherently less risky where the service provider is part of their own group, this is often not the case. The policy expects firms to manage risk and make appropriate arrangements to be able to remain within impact tolerance, whether using third parties that are other entities within their group or external providers.
In light of the increased reliance of financial institutions on third-party providers (such as Cloud Service Providers), the FPC has recently stated that additional policy measures to mitigate financial stability risks in this area are needed. While the FPC welcomes the engagement between the Bank, FCA and HM Treasury on how to tackle these risks, the FPC also recognises that without a cross‐sectoral regulatory framework (and cross‐border co‐operation where appropriate) there are limits to the extent to which financial regulators alone can mitigate these risks effectively.
Amongst the early lessons from the Covid-19 pandemic, the importance of careful planning for severe disruption events features prominently. There is anecdotal evidence that those financial institutions that were most advanced in their operational resilience planning (e.g. mapping their important business services) transitioned more smoothly to home-based working and other adjustments necessitated by the UK lockdowns. Likewise, the UK industry is well versed in running simulation exercises based on hypothetical disruption scenarios. As such, the new operational resilience policy issued by the UK financial authorities builds on existing capabilities but ensures that firms across the sector undertake adequate planning for events that may affect their own services and the industry as a whole.
Implementation of the new policy will be gradual, with the first assessments of important business services and impact tolerances due in early 2022, and mitigation of vulnerabilities identified by mapping and scenario testing over a slightly longer period. At the same time, many of the risks are dynamic and as such the authorities expect firms to keep their plans under regular review.
 Footnote to FSR
 Financial Policy Committee Record and Policy Statement, July 2021.