In virtualisation, there are numerous strategies for backing up and ensuring a safe disaster recovery process, but there are also numerous caveats and considerations. In this article we'll discuss many of the common ones.
A virtual machine is a full system running in a virtual environment, as if it were running on a dedicated server. A container is an isolated (usually) subset of a system running on top of a base system.
Not all virtualised services are equal, and it's important to consider what is running in order to correctly select the most applicable and safe backup strategy. Incorrect strategies can in most cases still provide protection, but the reliability and efficiency vary widely.
Taking the time to carefully select a strategy will improve restore effectiveness and reduce the time to recovery in almost all cases; ultimately, a backup is only as good as its ability to restore from a failure.
This includes web servers and any server that simply serves files in response to requests. Think of a basic web server, a wiki like DokuWiki, or an SMB/AFP/NFS file server; these are basic file servers with nothing dynamic or time sensitive. This does not include anything database driven.
This includes MSSQL, MySQL, MariaDB, Oracle etc., providing a database service either locally or remotely. Most database servers have mechanisms to prevent data loss, but these rely on the stability of the disk subsystem.
These often run applications that include a database, some dynamic code or applications, and associated engines/runtimes. They can be complex and involve many integrated components such as databases, keyservers, CAs, communication layers, and AI.
The first decision is whether to back up the content of a virtual machine, or the image of the virtual machine. Essentially, the image involves a replication of the complete virtual machine including everything, whereas a content backup can be everything, or just a selection of things within the virtual machine.
As an example, when considering a content backup, we could back up just the data directories from a file server, or run a local database backup and then back up the resulting backup file, or back up the entire machine, but there are caveats with this.
A database, for example, stores its data in files in the filesystem, but you cannot simply copy those files to a backup while the database is running and expect them to restore. This is because the database, when in use, is constantly updating these files. Instead, for a database, you need to use its 'backup' function to create a backup, then back that up instead.
Content backups are generally much smaller and quicker, and provide a faster restore as well as selective restore. That is, from a content backup we can restore just one file, or a directory, or all of it, whereas with an image backup, we can only restore the entire image.
When creating an image backup of your virtual machine, you have several options, each with their specific advantages and disadvantages. Image backups are slower but complete, and low effort, so we'll discuss them below.
This is the safest and most robust backup for a virtual machine, and a successful restore is guaranteed. This backup shuts down the virtual machine, then creates an image of the configuration and disks. This is probably the most common form of backup in companies who need a reliable image, and it's what we use internally unless requested otherwise. There will be a small amount of downtime, but usually only a few minutes a day.
This creates a snapshot of the virtual machine, and then backs that up. It is very commonly used by providers offering included backups, and it's fairly reliable, but never 100%. A snapshot relies on the OS being quiescent at the time the snapshot is taken; for many systems that's fine, but for heavily used machines it can fail to capture a fully functional backup (think databases, where we can't just copy the files), so it should always be supplemented with a content backup for at-risk components like databases.
Suspend freezes the virtual machine in time and then takes an image. This sometimes provides a cleaner image than a snapshot, and can produce a very restorable backup, but some virtual machines can crash when the suspend operation finishes. This is because realtime applications lose time: when an application expects to start a loop 5ms after the last, ten minutes have suddenly passed. For a file server it will make no difference, but for something like an automation system it does matter, and there's no real advantage over STOP.
Some systems allow images to be incremental; that is, during the backup process the current image is compared to the last backed up image, and only the changes are backed up. This is perfectly fine as a system, but it's important to understand that the last backup is not all that's needed for a restore. You need the master backup, plus one or more incrementals, to effect a full restore.
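The point that the last incremental alone is not enough can be illustrated with GNU tar's listed-incremental mode. This is a sketch of the principle only, not any vendor's image format; the function names are invented, and GNU tar on Linux is assumed:

```shell
# take_backup DATA_DIR BACKUP_DIR NAME -- a level-0 (master) backup on the
# first run, then incrementals, tracked via tar's snapshot metadata file
take_backup() {
    data="$1"; bdir="$2"; name="$3"
    mkdir -p "$bdir"
    tar --listed-incremental="$bdir/state.snar" -C "$data" -cf "$bdir/$name.tar" .
}

# restore_chain RESTORE_DIR ARCHIVE... -- a full restore needs the master AND
# every incremental after it, extracted oldest first
restore_chain() {
    rdir="$1"; shift
    mkdir -p "$rdir"
    for archive in "$@"; do
        tar --listed-incremental=/dev/null -C "$rdir" -xf "$archive"
    done
}
```

Lose the master archive, or skip an incremental in the chain, and the restore is incomplete; this is exactly the dependency to keep in mind when retiring old backup media.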
When looking at content backups, it's important to understand what's changing on any specific server. Sure, the OS will change over time, but not significantly, so look at what applications you're using.
For a file server, you want to be backing up the files, and that can be done incrementally, leading to one master backup and a series of very small incrementals.
For a database server, you'll need to use the tools included with that database engine to perform a backup, then back up that backup file. If you're using MySQL/MariaDB, for example, it's mysqldump/mariadb-dump to create a .sql file; then why not compress that with something like 7z, and back that up. Here's a simple bash script to back up a MariaDB server as an example:
#!/bin/bash
_now=$(date +"%Y-%m-%d")
_d=$(date '+%Y%m%d')
_path="/home/backup/"
_file="mariadb_$_now"
_pass="YourPasswordHere"
# List user databases, excluding the header row and the system schemas.
# Credentials are assumed to come from ~/.my.cnf or socket authentication.
_databases=$(mariadb -e "SHOW DATABASES;" | grep -Ev "(Database|information_schema|performance_schema|mysql|sys)")
echo "Begin Database DUMP"
for db in $_databases; do
    _filename="${_path}${_file}_${db}.sql"
    echo "${db}"
    mariadb-dump --databases "$db" > "$_filename"
done
echo "Compressing ${_path}${_file}*.sql"
# -sdel deletes the .sql files once they are safely in the archive
7z a -p"${_pass}" -sdel -y "${_path}${_file}" "${_path}${_file}*.sql"
echo "Adding /root/JIGSAW_${_d}.zip"
7z a -p"${_pass}" -y "${_path}${_file}" "/root/JIGSAW_${_d}.zip"
echo "Compression Complete, now transferring"
mount -t nfs 10.1.1.1:/volume1/BACKUP /backup
if [ $? -eq 0 ]; then
    rsync -avz --delete "${_path}"*.7z /backup
    umount /backup
    echo "Transfer Complete"
else
    echo "Transfer Failed because Mount failed"
fi
echo "Housekeep Files"
find "${_path}" -name '*.7z' -type f -mtime +2 -delete
rm -fv "${_path}"*.sql
echo "Backup Ends"
In this example, we perform a backup of all user databases, compress it with 7z, and then upload it via NFS to a remote server, finally purging backups older than 2 days from the local server. This keeps the two previous days' backups locally, and unlimited backups on the remote server. This is just an example; please don't use it as-is, but understand the method and thinking behind it. In this case, the virtual machine is only a database server, so all we need is a periodic backup of the virtual machine, which is taken monthly, and these daily backups to give us a 100% restore position.
Using Windows Volume Shadow Copy (VSS) snapshots is often the method used by content backup software to create a point-in-time backup, but beware: this can create backups that will not correctly restore, especially where database servers are in use.
There are numerous open source projects for backup and restore, and I'll list them here in no particular order. Be aware that these are not the only projects and I have no affiliation with them, so do your own research and choose wisely.
Open source software has many advantages over closed source proprietary software, the most obvious one being price. Open source provides full source code and allows freedom in its use, but support is limited. Having said that, GEN supports customers using open source backup solutions, as do many others.
There are many vendors offering backup software in this space, and that's a great option if you're not able to write scripts or understand at a detailed level what changes, and don't want to use open source, but there are considerations here.
Proprietary software will create backups in an internal format, which requires that software to restore. All too often we see a customer who has created a backup with XXX software, but the XXX software was on the machine that was lost, and they can't find their licence keys, or worse, the version of the software is no longer available and the updated version they've been forced to buy won't accept the older backups. I'm not going to name any specific vendor, but you know who you are.
The other issue with proprietary software is handling corruption, especially when using tapes. If you have a tar or 7z archive, it can be reconstructed or repaired even with corruption, but a proprietary system is most likely to simply refuse the restore, and to cater for this scenario you need to ensure you're taking more backups: double, or even triple.
Common proprietary backup software, in no specific order and with no affiliations (and definitely no affiliate links). The licensing structures for some of these are incredibly complicated, and very few have pricing on their websites for this reason.
There are several 'accepted' policies for backups, the oldest of which is Grandfather, Father, Son (GFS), which nowadays is probably called Grandperson, Parent, Child.
We take the grandfather monthly, the father weekly, and the son daily. This means at any point in time you will have at least 3 backups to choose from, and depending on how far you want to go back, you can keep grandfather backups for an extended period. The grandfather, for example, will have backups from this month, last month, the month before, and the month before that. The father backup will have this week, last week, and the week before, up to 4 weeks, and the son backup will have yesterday, the day before, and so on up to 6 days.
In forever forward incremental, we take an initial complete image of the system and then store changes on an incremental basis. This saves the need to keep multiple full images, but it can significantly increase the time and resources needed to complete a full restore to any point in time, since we need to restore the full image and then restore every incremental change after that. Apple's Time Machine uses a forever forward policy.
It is always important to secure the backups in more than one place, and these policies (3-2-1 and its variants) effectively define this. The first digit is the number of backups, the second is the number of storage mediums, the third is the number of backups stored off-site, and the fourth, if any, signifies an air-gapped or physical off-site copy such as a portable hard drive or tape. The zero sometimes found on the end signifies that backups should be verified and tested, but this is very obviously the case for all backup strategies.
When taking full images only, you can specify a retention policy for each backup, for example keeping all backups for at most 90 days, or keeping all backups for 7 days except for one weekly that is kept for 4 weeks and one monthly that is kept for 6 months. Backup software usually allows for configuration of this, or you can write your own scripts to achieve the same.
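A tiered retention script of that kind can be sketched as follows. This is a hedged example, not a production tool: the tiers (7 days of dailies, Sunday weeklies for 4 weeks, first-of-month monthlies for ~6 months), the *.7z pattern, and the function name are all illustrative, and GNU date/stat are assumed:

```shell
# prune_backups DIR -- tiered retention on *.7z files in DIR, judged by mtime
prune_backups() {
    dir="$1"
    now=$(date +%s)
    for f in "$dir"/*.7z; do
        [ -e "$f" ] || continue
        mtime=$(stat -c %Y "$f")
        age_days=$(( (now - mtime) / 86400 ))
        dom=$(date -d "@$mtime" +%d)          # day of month, 01..31
        dow=$(date -d "@$mtime" +%u)          # day of week, 1..7 (7 = Sunday)
        if [ "$age_days" -gt 180 ]; then
            rm -f "$f"                        # nothing kept beyond ~6 months
        elif [ "$age_days" -gt 28 ] && [ "$dom" != "01" ]; then
            rm -f "$f"                        # beyond 4 weeks, keep monthlies only
        elif [ "$age_days" -gt 7 ] && [ "$dow" != "7" ] && [ "$dom" != "01" ]; then
            rm -f "$f"                        # beyond 7 days, keep weeklies + monthlies
        fi
    done
}
```

Run from cron after each backup completes; because it keys off file modification times, it degrades gracefully if a backup run is missed.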
What is vitally important is that the backups are stored disparately. That is, you do not keep the backup on the same physical site as the source. You would be surprised how often we come across a HelpDesk ticket where a customer has a backup, but it's on the same physical server that is now dead. Keep a copy locally for sure, but always off-site it, either to another site, to the cloud, or via physical media.
Despite what anyone may tell you, a backup policy is unique to each scenario. Don't just blindly pick one of the above, but consider the restore requirements. Will there ever be a need for a partial restore? Do you really need 30 copies going back 10 years? Maybe there are regulatory reasons that need to be considered, or maybe the contents of the backup can easily be rebuilt from other sources. Take time to consider the requirements and then select a policy that best suits the use case.
Some people may feel that having a single backup per period is insufficient, at which point you can take two backups per period, storing those in two different places, OR take one backup and replicate/copy that to a second location.
This avoids the risks associated with a single point of failure. If, for example, you're storing your backups on your NAS or SAN, and you have a fire...
Cloud backup (and you know we offer a wide range of such services) is a great place for a second copy; it's slower for sure, but it's safe and off-site. You always need to have a local copy, and you should always have an off-site copy. If you don't want to use cloud, then use portable hard disks or tapes and have someone alternate them, ensuring one is always off-site (or three are always off-site).
Be very aware of security, since your backups contain sensitive information. Locally, make sure backups are stored in folders with restrictive permissions; when using external media, make sure they are in a lockable fire-safe when not in the server; and finally, for cloud backups, make sure you are using strong encryption that requires you to enter a password or key for every backup. You cannot trust a cloud provider, even GEN, to be 100% absolutely secure. We are, and most are, but never just rely on that.
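Encrypting before upload means the cloud provider only ever holds ciphertext. A minimal sketch using OpenSSL with a PBKDF2-derived AES-256 key; the function names are invented, and in real use the passphrase should come from a protected key file or a prompt, never hard-coded in a script:

```shell
# encrypt_backup FILE PASSPHRASE -- writes FILE.enc, ready for upload
encrypt_backup() {
    # -salt + -pbkdf2: a random salt and a properly stretched key
    openssl enc -aes-256-cbc -pbkdf2 -salt -pass "pass:$2" -in "$1" -out "$1.enc"
}

# decrypt_backup FILE.enc PASSPHRASE -- writes the recovered plaintext to FILE.enc.dec
decrypt_backup() {
    openssl enc -d -aes-256-cbc -pbkdf2 -pass "pass:$2" -in "$1" -out "$1.dec"
}
```

Test the decrypt side regularly from a machine that is not the backup source; an encrypted backup whose passphrase is lost, or whose decryption has never been rehearsed, is no backup at all.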
GDPR: yes, I know everyone hates GDPR, but it does have a role to play here. Backing up data to a cloud provider CAN mean your data is leaving the UK, at which point you have a responsibility to ensure your data subjects are aware and have consented to this. Ultimately, just use a British cloud provider to avoid any additional administrative workload, but if you cannot, make sure your privacy notice has provisions and ensure all your data subjects are notified.
Choosing what and how you back up is VITAL to having a strong recovery position, and many companies don't invest enough time into getting this right, and unfortunately arrive at the HelpDesk in a crisis. Please make the time to get this right, always TEST your solution regularly, and audit its effectiveness and security.
If you need advice, remember the first hour is always free at GEN, and many backup software providers also offer free consultancy.
--- This content is not legal or financial advice & Solely the opinions of the author ---
Copyright © 2025 GEN Partnership. All Rights Reserved, E&OE.