Richard has been with the firm since 1992 and was one of the founding partners
Disaster recovery is a critical component of business continuity planning that encompasses various aspects of a
company's operations. This article will explore the key elements of disaster recovery, including physical
recovery, server recovery, process recovery, cyber security and testing.
Physical Recovery
In any disaster, the most important element is the physical one. Is your building compromised by flood, fire,
theft or vandalism? Any number of disasters can take out your base of operations, so
having a backup plan that allows for temporary relocation of staff and stock is an important component. This area,
more than any other, is very specific to the individual business. Some companies that
deal in digital goods can easily transition to remote working, whereas a manufacturing company cannot. The key to
assessing this is to first write down all the possible risks, including
flood, fire, theft, explosion, gas leak, etc., and then for each one detail a plan for recovery. A flood, for example,
would likely involve vacating the lower floor, moving stock to other floors and securing pumping
equipment to reduce the damage. That's just a guide, though: you need to visualise the risk and how to deal with it.
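The "write down every risk, then detail a plan for each" exercise can be kept honest with a very simple risk register. What follows is only a minimal sketch: the `Risk` class and the `risks_without_plan` helper are hypothetical names invented for illustration, the flood steps are taken from the example above, and the other risks are deliberately left without a plan to show what the check reports.

```python
from dataclasses import dataclass, field


@dataclass
class Risk:
    """One entry in the risk register (hypothetical structure)."""
    name: str
    recovery_steps: list[str] = field(default_factory=list)


def risks_without_plan(register: list[Risk]) -> list[str]:
    """Return the names of risks that have no recovery steps written down yet."""
    return [risk.name for risk in register if not risk.recovery_steps]


register = [
    Risk("flood", [
        "vacate the lower floor",
        "move stock to other floors",
        "secure pumping equipment",
    ]),
    # Listed in the article, but no plan recorded in this illustration.
    Risk("fire"),
    Risk("theft"),
    Risk("explosion"),
    Risk("gas leak"),
]

# Any risk printed here still needs a plan.
print(risks_without_plan(register))
```

A spreadsheet does the same job just as well; the point is that a risk with an empty plan should be impossible to overlook.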
Utilities
How would you deal with a power cut that lasted a week, or with no water or no internet? Uncommon and unlikely, you may
think, but it does happen, and having a plan to deal with it is important. You could, for example,
find a company who can provide a generator on rent or lease, a company who can tank water in, and a company (like
GEN) who can provide temporary wireless or satellite internet on demand. These are simple solutions that should be
considered and documented. You should record the providers' contact information, the equipment required and the
ballpark costs. For things like power and water, provision should be made to connect the temporary supply into the building.
Systems
Your computer systems probably drive the business in some way, and recovery from a system failure, even if it's
isolated, must be planned out. You may, for example, take daily backups of data, and that's great - you would
be surprised how many customers come to GEN because they don't have a backup and something went very wrong.
Backups alone, however, are insufficient. Let's consider a few possible scenarios and then work through how we'd
protect the company:
Server Failure
A surprisingly common one: the server is dead and it's not coming back, no matter how many times you switch it off and
on. Whether or not you have a maintenance agreement, contacting your
service provider should be the first step. Your provider will despatch engineers to site to repair the server and
restore the software, but let's complicate it a little by assuming that there's a hardware failure
and parts aren't immediately available, as happens. What if the parts are 3 weeks out? This is where cold spares
show their value. A cold spare is as it sounds: a spare server, set up with
all the software, that sits on a shelf at your provider. GEN currently have about 450 servers on the shelf for our
maintenance customers as part of their disaster recovery programmes, each able to be delivered to site and set up
the same day.
Data Loss
Whether it's a software failure, drive failure, or a ransomware/virus attack, losing data is another surprisingly
common occurrence. Backups are the obvious solution, but a 'backup' can come in many flavours, some better
than others. For example, a database backup can be run daily and taken off-site, as is recommended, but what if you
don't realise there's a problem for a week? Now that backup and the other daily backups taken since the problem began
are useless, meaning that you'll need to restore a weekly or monthly backup. For a busy business this can mean
thousands of records lost, and that's unacceptable. The fact is that many software and system failures aren't evident
immediately, so we need to plan for that. Replication is one way to circumvent this, as are transactions and versioning. GEN
frequently use off-site replication as a way of ensuring that data can be restored and rolled back as needed.
Whatever the plan,
backups should be many and manifest. You can never have too many copies, only too few.
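To see why daily backups alone can let you down, it helps to work the dates through. The sketch below is illustrative only: the dates are arbitrary, the retention periods (seven dailies, four weeklies) are assumptions rather than a recommendation, and `last_good_backup` is a hypothetical helper that simply picks the newest backup taken before the problem began.

```python
from datetime import date, timedelta


def last_good_backup(backup_dates, problem_started):
    """Return the most recent backup taken strictly before the problem began, or None."""
    candidates = [d for d in backup_dates if d < problem_started]
    return max(candidates, default=None)


today = date(2024, 7, 12)  # arbitrary date for the illustration

# Assumed retention: the last 7 daily backups and the last 4 weekly backups.
dailies = [today - timedelta(days=n) for n in range(1, 8)]
weeklies = [today - timedelta(weeks=n) for n in range(1, 5)]

# The problem began 10 days ago but was only noticed today.
problem_started = today - timedelta(days=10)

# Every daily backup was taken after the problem began, so none is usable.
print(last_good_backup(dailies, problem_started))

# With weeklies retained as well, there is an older clean copy to fall back on.
best = last_good_backup(dailies + weeklies, problem_started)
print(best, "-", (today - best).days, "days of records to re-enter or recover another way")
```

With these assumed numbers the dailies offer nothing usable and the fallback is two weeks old, which is exactly the gap that replication, transactions and versioning are meant to close.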
Process Recovery
Software is fantastic: it takes data in, processes it and spits data out. It can automate all areas of a business,
from quotations, sales and procurement to delivery, accountancy and more, but this truly
invaluable software is only invaluable until it breaks. In a ranking of disaster recovery scenarios, software or
process failure ranks highly. This can be anything from a missing sales order to a complete
collapse of automation, and whilst a restore 'might' fix it temporarily, a process failure is a software failure and
it will recur again and again. It could be that a counter has exceeded its maximum size, or
that a log file is full, or that a partition is out of space; the list is almost endless. You must have a
way to recover from process failure, and this may mean having a manual process: a way to
generate paperwork manually and to process business functions without the aid of the computer in the short term.
The excuse "We're sorry we didn't ship it, we had a computer failure" is used far too often and earns little
sympathy from your customers. Think about how you would survive if you just went in and switched it off. With all
those blank screens, how would the staff react? In most companies they would just sit there and wait to be
told what to do, and
your disaster recovery plan needs to include what to do.
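Some of the causes mentioned above, such as a partition running out of space, can be spotted before they take a process down. As a rough sketch only (the 90% threshold and the function names are assumptions for illustration, not a GEN tool), Python's standard `shutil.disk_usage` is enough for a basic check:

```python
import shutil


def percent_used(total: int, used: int) -> float:
    """Percentage of a partition in use; 0.0 if the size is reported as zero."""
    if total == 0:
        return 0.0
    return 100.0 * used / total


def partition_nearly_full(path: str, threshold_percent: float = 90.0) -> bool:
    """True when the partition holding `path` is at or above the threshold."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return percent_used(usage.total, usage.used) >= threshold_percent


if __name__ == "__main__":
    # "." is just the current directory; point this at the partitions that matter.
    if partition_nearly_full("."):
        print("Warning: partition is nearly full")
    else:
        print("Partition has space to spare")
```

A check like this only buys warning time. It does not replace the manual fallback process described above.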
Espionage & Cyber Security
Sadly, we do get involved with cases of internal or external data theft from companies, and we carry out forensic
analysis of data breaches and data theft on a regular basis. In most cases it's avoidable, and only occurs
due to bad or poorly maintained systems, security and processes. I know it's unfamiliar, but from a disaster recovery
point of view you need to treat every employee as a potential threat. Assess each one, and ask yourself what
the maximum amount of damage is that the employee could inflict. The guy who packs boxes, probably not that much,
but senior management and IT are another story. For each potential risk, find a solution, perhaps restricting
access to sensitive systems and reducing the number of records available in any report. You would be surprised how
many companies have zero access control, where everyone from the director to the storeman has full access to
everything - don't be like this.
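The "what could each person damage?" assessment can start as nothing more than a comparison between what each role has been granted and what it actually needs. The sketch below assumes made-up role and system names purely for illustration; `excess_access` is a hypothetical helper, not part of any real access control product.

```python
# Hypothetical roles and systems, invented for this illustration.
needs = {
    "director": {"accounts", "customer_records", "payroll", "stock"},
    "storeman": {"stock"},
    "packer": {"stock"},
}

granted = {
    "director": {"accounts", "customer_records", "payroll", "stock"},
    "storeman": {"accounts", "customer_records", "payroll", "stock"},
    "packer": {"stock"},
}


def excess_access(granted, needs):
    """Map each role to the systems it can reach but has no stated need for."""
    report = {}
    for role, systems in granted.items():
        extra = systems - needs.get(role, set())
        if extra:
            report[role] = sorted(extra)
    return report


# Roles listed here have more access than their job requires.
print(excess_access(granted, needs))
```

In this made-up example the storeman turns out to have the same reach as the director, which is the "zero access control" situation described above.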
Training
Your staff are the weakest link in many aspects of disaster prevention and recovery. You can plan for most things, but not for people,
because they are inherently unpredictable. GEN, as part of our cyber security services, provide training to users
on how to protect the company from email, phone and physical threats, and how to effectively handle a crisis. We perform training on average 4 times a year,
and then we 'test' the training twice a year. Why do we have to 'test', you may ask? Because people fail. Even
when they know not to click a link, give up passwords over the phone or leave visitors unattended, they do, and
they do it again and again. Training alone is insufficient: you need to test, identify the people who fail, and then
hammer it home until they don't.
Policy
Have a rigid and regularly reviewed network security policy that ensures all networked devices are secure and that
gateways are secure (we have a free checklist in the Downloads Section). If you have *any* cloud based services, they are a target and a risk, and must be properly secured. Ensure email is properly protected with antivirus and antispam, and
limit external email to only those who must have it. If you're using Windows
then proper endpoint protection is a must, and make sure it's regularly monitored and audited. A third party provider
can help with much of this, and GEN have a range of cyber security services to manage most of it, but
even when it's fully outsourced, it is still a vital component of disaster recovery that YOU must ultimately be responsible for.
Attribution
For every scenario, assign a team with a team leader. This team will be responsible for that part of the recovery
plan, and it's vital that everyone in the team knows their role and responsibilities. You will likely
have to adjust team membership and responsibility during the first few tests, but this is just part of the process.
Measure the performance of each team as a whole, and then of each member, to identify weak points, and always ensure
there is redundancy.
If you have no redundancy, then when the server is on fire, the team leader will be in Ibiza, I guarantee it.
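Checking for redundancy is easy to do mechanically once the teams are written down. This is a minimal sketch with invented scenarios and names; the `teams` structure and the `teams_lacking_cover` helper are assumptions made for illustration only.

```python
# Invented scenarios and people, purely for illustration.
teams = {
    "server failure": {"leader": "Alice", "deputies": ["Bob"]},
    "flood": {"leader": "Carol", "deputies": []},
    "data loss": {"leader": None, "deputies": ["Dave"]},
}


def teams_lacking_cover(teams):
    """Map each scenario to its gaps: a missing leader, a missing deputy, or both."""
    problems = {}
    for scenario, team in teams.items():
        issues = []
        if not team.get("leader"):
            issues.append("no team leader")
        if not team.get("deputies"):
            issues.append("no deputy")
        if issues:
            problems[scenario] = issues
    return problems


# Anything printed here is a scenario where one holiday breaks the plan.
print(teams_lacking_cover(teams))
```

Run a check like this whenever team membership changes, since that is when gaps tend to creep in.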
Test, Test and Test again
Having spent weeks developing a comprehensive disaster recovery and business continuity plan, you must test it, test
it all, and then test it regularly and be creative.
I cannot stress enough how important testing is. Even if you think you've thought of everything, I can guarantee you haven't.
Engaging an external provider to assist with the testing is one way to
generate failure scenarios that you might not have considered, enabling these to be included in future. You need to
be 100% confident that no matter what, there's a plan.
GEN
GEN have been helping customers with business continuity and disaster recovery for more than 30 years, and we've seen
it, done it and fixed it all. If you need help, we're here, even if it's just to review the plan and point out
any possible weaknesses. Remember, the first hour is always free with GEN.
Comments (3)
Martin C
· 2024-07-12 17:19 UTC
Yeah we really need to do something like this, never had a problem but just reading some of these things it makes me wonder. We should at the very least have some sort of plan if the system goes tits up and now Im thinking about some of the staff who could do real damage if they wanted to, not that they would but people are unpredictable as you say.
Andrew M
· 2024-07-11 11:10 UTC
People are the weakest link, that is so true!! Nice article btw.
Ronny A
· 2024-07-10 23:27 UTC
Well, not just another AI generated bullshit, actually made me think about aspects of continuity that had completely passed me by, so thank u.
--- This content is not legal or financial advice & Solely the opinions of the author ---