Fortifying cybersecurity: Strategies for secure and resilient systems
Eve Goode
The recent CrowdStrike IT outage, in which a faulty security update caused around 8.5 million devices to experience the dreaded ‘Blue Screen of Death’ (BSOD) error, underscores the critical need for a comprehensive, multi-layered approach to risk management and security architecture.
In this exclusive, ISJ hears from PA Consulting’s Caroline Field, Mark Ioannou and Dan Taylor.
In the classic ‘CIA triad’ of confidentiality, integrity and availability, it is availability that often gets overlooked.
In other words, companies sometimes forget just how important timely and reliable access to information is and spend more time focusing on confidentiality and integrity.
So how can organisations effectively enhance their cybersecurity measures and bolster the availability of their services?
Know your critical services
First, organisations should map out which systems are integral to delivering their most critical services.
In a crisis, this will ensure they don’t miss any critical systems or waste time focusing on non-essential systems or services.
Key questions to ask are: ‘What is the impact of minimal, partial and total loss?’ and ‘What can the service and business tolerate?’
Once this is understood, companies can then conduct thorough risk assessments on those services.
This involves evaluating assets, identifying potential threats and assessing vulnerabilities, for example, through threat modelling or fault-tree analysis.
By prioritising risks based on their potential impact, rather than their likelihood, organisations can better allocate resources to mitigate significant threats.
Nonetheless, it is important not to discount unlikely threats or fail to plan for them, as the CrowdStrike incident demonstrated.
Asking ‘what if?’ and continuously monitoring and reviewing risk helps your organisation remain vigilant and prepared for any eventuality.
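As an illustration of prioritising by impact rather than likelihood, the following is a minimal sketch in Python. The `Risk` class, the 1 to 5 scoring scales and the register entries are all assumptions made for the example; a real risk register would use your organisation's own scales and data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    impact: int      # 1 (minor) to 5 (severe); hypothetical scale
    likelihood: int  # 1 (rare) to 5 (almost certain); hypothetical scale


def prioritise(risks):
    """Order risks by impact first; likelihood is only used to break ties."""
    return sorted(risks, key=lambda r: (r.impact, r.likelihood), reverse=True)


# Illustrative entries only; not taken from any real risk register.
register = [
    Risk("Phishing against staff", impact=2, likelihood=5),
    Risk("Faulty vendor update halts endpoints", impact=5, likelihood=1),
    Risk("Single data centre power loss", impact=4, likelihood=2),
]

ranked = prioritise(register)
for risk in ranked:
    print(f"{risk.name}: impact={risk.impact}, likelihood={risk.likelihood}")
```

Ranked this way, the unlikely but severe vendor-update failure sits at the top of the list, whereas a ranking by likelihood alone would have placed it last.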
Implement rigorous testing and release cycles
To manage the rollout of new software, it is important to implement sandbox testing and ensure any changes can be rolled back.
Where software is provided by a third party, make sure the supplier both can and does test and roll back in this way: you need your cyber and IT supply chain to share your mindset towards cybersecurity.
Developing and maintaining robust policies and procedures for suppliers plays a key role in achieving this, as does ensuring that third-party suppliers actually adhere to these policies to mitigate external risks.
An organisation’s policies and procedures need to reflect how it consumes services as part of a shared responsibility model.
However, all the above won’t mean much if an organisation’s third-party supply chain can push changes straight to their production environment through auto-updates.
To prevent issues like those seen in the CrowdStrike outage, organisations must implement rigorous testing throughout their environments and release cycles.
In collaboration with business and service leads, define a structured testing life cycle that mitigates vulnerabilities by ensuring test scopes and templates are reviewed according to each system’s intensity of use and impact of failure.
The detail will vary from business to business and service to service, but this will significantly reduce your organisation’s chance of exposure.
Combined with a formal approval process and an effective rollback plan, this gives you confidence that every software release has been thoroughly vetted, guarding against disruptions and maintaining the integrity of your IT environment.
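One way to picture a staged release with rollback is the sketch below. It is a simplified illustration, not a description of any vendor's tooling: the ring names, the `staged_rollout` function and the in-memory `state` dictionary are hypothetical stand-ins for real deployment and health-check systems.

```python
def staged_rollout(rings, deploy, healthy, rollback):
    """Deploy ring by ring; on a failed health check, roll back every ring touched.

    Returns (succeeded, list of rings left on the new release).
    """
    updated = []
    for ring in rings:
        deploy(ring)
        updated.append(ring)
        if not healthy(ring):
            # Undo in reverse order so the most recent change is reverted first.
            for done in reversed(updated):
                rollback(done)
            return False, []
    return True, updated


# Hypothetical usage with in-memory stand-ins for real deployment tooling.
state = {}
rings = ["sandbox", "pilot", "production"]

ok, live = staged_rollout(
    rings,
    deploy=lambda ring: state.__setitem__(ring, "v2"),
    healthy=lambda ring: ring != "pilot",  # simulate a fault found in the pilot ring
    rollback=lambda ring: state.__setitem__(ring, "v1"),
)
print(ok, state)
```

In this run the simulated fault is caught in the pilot ring, both touched rings are returned to the previous version and production is never updated. The same principle applies to third-party updates: nothing should reach production without first passing through earlier rings.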
Design operational resilience into your organisation
Operational resilience should be front of mind for many organisations in the wake of the IT outage, as it exposed just how interdependent and fragile our systems really are.
There are a few priority actions companies can take to bake resilience into their operating model and resources.
As a starting point, work with all teams responsible for ensuring resilience to encourage everyone to adopt an operational resilience mindset, with a shared idea of what ‘success’ looks like.
Resilience needs to be proactively woven into firm-wide processes.
For example, using multiple vendors spreads risk across suppliers and reduces your company’s exposure to any single supply chain failure.
For extra support and collaboration, organisations should build trusted partner relationships that share risks, through the creation of shared vendor strategy blueprints aligned to resilience.
Implementing redundant systems can also help maintain continuity (and thereby resilience) in key operations during a disruption.
Measuring the impact of the investment made in resilience is useful in articulating why further investment in resilience may be needed.
Calculating the potential value at risk from an incident, whether that is an organisation’s revenue or its reputation, helps to measure the tangible impact of resilience measures.
For long-lasting success, companies need to continuously monitor and improve in this area rather than treating it as a tick-box, one-off activity.
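A value-at-risk estimate can be as simple as the sketch below. The formula (lost revenue plus recovery and reputational costs) and every figure are assumptions chosen for illustration; organisations will have their own models and inputs.

```python
def value_at_risk(hourly_revenue, outage_hours, recovery_cost, reputational_cost=0):
    """Estimated loss from a single incident (all inputs are estimates)."""
    return hourly_revenue * outage_hours + recovery_cost + reputational_cost


# Made-up figures for illustration only.
# 'before' assumes a 12-hour outage; 'after' assumes resilience investment
# (for example, redundant systems) cuts the outage to 2 hours.
before = value_at_risk(hourly_revenue=20_000, outage_hours=12, recovery_cost=50_000)
after = value_at_risk(hourly_revenue=20_000, outage_hours=2, recovery_cost=50_000)
reduction = before - after
print(before, after, reduction)
```

Comparing the estimate before and after a resilience measure gives a figure that can be set against the cost of that measure when making the case for further investment.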
Speculative cyber-attackers
The global IT outage also showed us that threat actors are ready at the push of a button to capitalise on opportunities.
Criminals use the sense of urgency and panic that follows an incident to conduct further attacks in an attempt to access data or money.
Adopting industry-recognised security architecture frameworks, such as the NIST Cybersecurity Framework (CSF), provides a holistic approach to cybersecurity.
Applying controls appropriate to your organisation, such as well-structured access management, the principle of least privilege and a robust DevSecOps framework for your software release cycle, combined with user training and awareness, can drastically reduce the chance of a threat actor exploiting such an opportunity.
Additionally, adopting a zero-trust architecture can significantly mitigate threats by ensuring that every access request is verified, whether from inside or outside the network.
This proactive approach ensures that threat actors, both internal and external, face multiple barriers to unauthorised access.
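The idea of verifying every request and granting least privilege can be sketched as follows. This is a toy model under stated assumptions: the `AccessRequest` fields, the role names and the `PERMISSIONS` map are invented for the example and do not represent any particular product or framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    user: str
    role: str
    action: str
    authenticated: bool
    device_trusted: bool
    from_internal_network: bool


# Hypothetical role-to-permission map: each role gets only what it needs.
PERMISSIONS = {
    "release-engineer": {"deploy:staging"},
    "release-manager": {"deploy:staging", "deploy:production"},
}


def is_allowed(request):
    """Deny by default; network location is deliberately ignored."""
    if not (request.authenticated and request.device_trusted):
        return False
    return request.action in PERMISSIONS.get(request.role, set())


# Illustrative requests; names and roles are invented.
insider = AccessRequest("sam", "release-engineer", "deploy:production",
                        authenticated=True, device_trusted=True,
                        from_internal_network=True)
manager = AccessRequest("alex", "release-manager", "deploy:production",
                        authenticated=True, device_trusted=True,
                        from_internal_network=False)

print(is_allowed(insider))  # False
print(is_allowed(manager))  # True
```

Being inside the network does not help the first request, because its role lacks the permission, while the second succeeds from outside because identity, device and role all check out.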
Plan and exercise for when it goes wrong
Whether an incident is a malicious attack or an unintentional outage, the required response is much the same.
A tried and tested incident response plan, regularly refined through drills and simulations, ensures that the company is prepared for any eventuality.
Likewise, a well-defined governance structure, with clear roles and responsibilities, can make sure there is no single point of failure in recovery plans and can further enhance an organisation’s ability to respond swiftly and effectively to incidents.
Ultimately, to fail is not an option, so companies need to create a shared understanding and blueprint that guides all design and build decisions, while adopting “fail-safe” measures that limit the damage or propagation of a potential outage.
Measures such as wargaming and scenario planning help companies navigate failure and practise recovery processes.
In today’s digital age, with the ever-growing need to outsource services to third-party suppliers, the importance of a multi-layered approach to cybersecurity risk management, security architecture and resilience cannot be overstated.
Comprehensive risk assessments, robust policies and procedures that feed into operational resilience strategies, meticulous testing and release cycles and recognised security frameworks are the hallmarks of a strong security posture and resilience.
These measures ensure that even when unexpected issues arise, such as those seen with CrowdStrike, organisations can respond swiftly and effectively, minimising disruption and maintaining trust.