There is no doubt about the importance of database backup, it can be said that it is the last line of defense for data security. In view of this, for backup, we usually do the following requirements:
Deploy in multiple locations
For the core database, we usually have the deployment requirements of two locations and three centers. The same goes for backups. A backup should have multiple copies, each stored in a different region.
Multimedia deployment
Multiple copies of a backup should be stored on different media, such as disk and tape, to prevent failure of a single medium.
Regularly check the validity of backups
Backups are just doing the right thing, and whether things are done right or not depends on the validity check of the backup.
The first two items above are recommended to be done when conditions permit. The third must be done.
Next, this article will continue to introduce the related topics of backup, mainly including the following aspects :
- Common classification of backups.
- How to check the validity of backups.
- RTO and RPO
Common classifications of backups
Physical backup
As the name implies, it is a backup of physical files. Its advantages and disadvantages are as follows:
The advantage is that the backup and recovery speed is fast. In particular, the recovery speed is directly related to the RTO of the database service. No instance is required to be online. When the instance is shut down, the files can be copied directly without worrying about the consistency of the backup. Shutting down an instance for backup, also known as “cold backup”.
The disadvantage is that the backup file is large , and when restoring, there are requirements for the platform, operating system, and MySQL version, which must be consistent or compatible. And backups can only be initiated locally. Because the physical file is copied, even if there are many “holes” in the file (caused by a large number of DELETEs), it cannot be shrunk by recovery. There are requirements on the storage engine of the table, and the MEMORY table cannot be backed up.
Logical backup
Logical records for the backup table. Its advantages and disadvantages are as follows:
Its advantage is strong portability. When restoring, there are no requirements for the platform, operating system, or MySQL version. More flexibility , especially when restoring, only one library or one table can be restored. There is no requirement for the storage engine of the table, and any type of table can be backed up. And the backup file is small , and the backup can be initiated remotely . After recovery, the space can be effectively reduced.
The disadvantage is that the backup and restore speed is slow. In fact, in terms of backup speed alone, multi-threaded backup is actually not slow. But the recovery speed, even multi-threaded recovery, is very slow. Backup will “pollute” the Buffer Pool , and business hot data will be expelled from the Buffer Pool by the backup data.
Offline Backup vs Online Backup
Offline backup, also known as “cold backup”, is the backup performed when the instance is shut down. At this time, only physical backup can be performed, that is, full copy of physical files.
Online backup, also known as “hot backup”, refers to the backup performed during the running of the instance. At this point, both physical and logical backups can be performed.
Due to less business intrusion, online backup is generally used online.
Full backup vs incremental backup
Full backup means backing up the full data of the entire instance.
Incremental backup, that is, backing up only the data that has “changed” since the last backup.
Generally speaking, it is relatively simple to implement incremental backup based on physical backup. Taking MySQL as an example, it is only necessary to determine whether the LSN of the data page has changed . For logical backup, it is difficult to implement, such as the common incremental backup based on a certain time field, but in fact, it is difficult to ensure that the data before a certain period of time is not modified or deleted.
RTOs and RPOs
When measuring the disaster recovery capability of a data center, there are two commonly used indicators:
RTO: Recovery Time Objective, which means that data must be recovered within this time after a disaster occurs.
During this period of data recovery, the service is unavailable, so the RTO is also the maximum allowable unavailability time for the service. If we ask that the maximum unavailable time of the service is 30 minutes, then the RTO is 30 minutes. The smaller the RTO, the stronger the resilience of the disaster recovery system.
RPO: Recovery Point Objective, which refers to the point in time to which data can be recovered after a disaster occurs.
For example, I have a system that does a full backup every day at 0:00. When the system fails, it will be restored based on the last backup. If the system fails at 3am, we lose 3 hours of data. In extreme cases, the system fails at 23:59 and we lose 24 hours of data. The 24 hours here is the RPO of this system. The smaller the RPO, the better the system can guarantee the integrity of the data.
Ideally, both RTO and RPO are 0, which means that in the event of a disaster, the system will recover immediately and no data will be lost. Of course, the smaller the RTO and RPO, the higher the investment cost.
Specifically in MySQL, in order to reduce RTO and RPO, we can start from the following aspects:
RTO
- Increase the backup frequency and shorten the backup cycle.
- Choose physical backups instead of logical backups.
- Add delayed slave library.
- Automation of recovery processes.
RPO
- Increase the backup frequency and shorten the backup cycle.
- Build Binlog Server to backup Binlog. When a failure occurs, we can do point-in-time recovery based on backup and Binlog.
- Add delayed slave library.
Summarize
From an RTO perspective, physical backups should be selected as far as possible over logical backups. If you want to use logical backup, you should try to choose a multi-threaded backup tool and a multi-threaded recovery tool. From the perspective of RPO, the backup frequency should be increased as much as possible and the backup cycle should be shortened.
But every coin has two sides, using physical backup or increasing the backup frequency will undoubtedly increase the storage cost. Therefore, when determining a backup strategy and selecting a backup tool, the RTO and RPO of the business should be taken into consideration, combined with the storage cost.
Most companies will adopt a unified backup strategy, such as one full backup per day. In addition, for the online core business, if there is only backup, it is still difficult to effectively reduce the RTO and RPO of the database service. It is recommended to deploy the delayed slave library.