Online backup strategies overview:
So what is an incremental backup solution,
and why should one consider using it over a regular full online backup?
· “The goal of an incremental backup is to back up only those data blocks that have changed since a previous backup” (taken from Oracle Documentation)
· Setting up and running incremental backups is extremely efficient when the volume of data modified daily is significantly smaller than the size of the whole database.
Incremental backup is a complementary feature to a base backup implementation, and it relies on common online backup functionality, such as switching the database into backup mode so that changes made during the backup are captured in the WAL.
The first part extends online backup functionality with the ability to calculate and track checksums on data blocks in the data files.
Another important part of an incremental backup tool is a function, or set of functions, responsible for identifying and storing the delta for every binary change in the database files made since the last full hot backup.
The last part is identical to a traditional hot backup: storing the corresponding archive logs for the period during which the database was in backup mode.
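The checksum tracking and delta identification described above can be illustrated with a minimal Python sketch. It is not how any particular backup tool implements the feature: the 8192-byte block size, the use of SHA-256, and the function names block_checksums and changed_blocks are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 8192  # assumed block size in bytes; real systems vary


def block_checksums(data: bytes, block_size: int = BLOCK_SIZE) -> list[str]:
    """Return one SHA-256 digest per fixed-size block of a data file image."""
    return [
        hashlib.sha256(data[offset:offset + block_size]).hexdigest()
        for offset in range(0, len(data), block_size)
    ]


def changed_blocks(old_sums: list[str], data: bytes,
                   block_size: int = BLOCK_SIZE) -> dict[int, bytes]:
    """Return {block_number: block_bytes} for every block whose checksum
    differs from the one recorded at the last full backup, including
    blocks appended to the file since then."""
    delta = {}
    for number, digest in enumerate(block_checksums(data, block_size)):
        if number >= len(old_sums) or old_sums[number] != digest:
            offset = number * block_size
            delta[number] = data[offset:offset + block_size]
    return delta
```

In this sketch the checksum list would be stored alongside the full backup, and the returned dictionary is the binary delta that the incremental backup keeps.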
Now let’s bring up a real-life example of a typical point-in-time recovery (PITR) process
and compare recovery from a regular full online database backup
to PITR from an incremental backup.
Usually, in order to perform a point-in-time restore, the DBA first copies the data files from the backup location, then starts a recovery process that reads the archived log files (WAL) and applies the changes sequentially on top of the data files, up to the desired point in time defined by the DBA.
Sounds good. But let’s imagine a situation where we run a full hot backup once a week on Sundays and then, for some reason, need to restore from backup on Thursday or Friday.
Copying both the data files and the WAL files from the latest backup will not be a big deal.
But the second part of the recovery process, applying the changes for all transactions on the database from the log files, may significantly delay completion of the PITR.
Depending on I/O subsystem performance, daily system load and the number of transactions committed per hour, the ratio of WAL generation time to recovery time may be as high as ‘one to one’. In other words, the production system may have to remain down for up to 3 to 4 days before the recovery process completes.
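The arithmetic behind that estimate can be written down as a small Python sketch. The function name replay_downtime_hours and the replay_ratio parameter are hypothetical; a ratio of 1.0 models the ‘one to one’ worst case, where replaying an hour of WAL takes an hour.

```python
def replay_downtime_hours(days_since_full_backup: float,
                          replay_ratio: float = 1.0) -> float:
    """Estimate the hours needed to replay all WAL generated since the
    last full backup. replay_ratio is the replay time per hour of WAL
    generation (1.0 means 'one to one')."""
    if days_since_full_backup < 0 or replay_ratio < 0:
        raise ValueError("inputs must be non-negative")
    return days_since_full_backup * 24 * replay_ratio


# Worst case for a restore 3 or 4 days after the weekly full backup.
print(replay_downtime_hours(3))  # 72.0
print(replay_downtime_hours(4))  # 96.0
```

A faster I/O subsystem or a lighter transaction load would correspond to a ratio below 1.0 and a proportionally shorter replay.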
And now let’s enable incremental backups and look at the recovery process:
· The initial file copy stays in place.
· The binary delta is applied only once, from the closest nightly incremental backup.
· A PIT recovery process runs for the rest of the WAL files (from the last incremental backup up to the desired point in time).
For the given example, applying the binary delta means shortcutting (or skipping) up to 4 days of WAL file recovery
and going straight to the point in time when the last incremental backup was executed.
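The second step, applying the binary delta on top of the restored base copy, might look roughly like the Python sketch below. It assumes the delta is stored as a mapping from block number to block contents and that blocks are 8192 bytes; both the storage format and the function name apply_delta are hypothetical.

```python
BLOCK_SIZE = 8192  # assumed block size in bytes


def apply_delta(base: bytes, delta: dict[int, bytes],
                block_size: int = BLOCK_SIZE) -> bytes:
    """Overlay changed blocks from an incremental backup onto a data
    file image restored from the full backup."""
    restored = bytearray(base)
    for number in sorted(delta):
        offset = number * block_size
        block = delta[number]
        if offset > len(restored):
            # The file grew past the old end: pad the gap with zeros.
            restored.extend(bytes(offset - len(restored)))
        restored[offset:offset + len(block)] = block
    return bytes(restored)
```

After this single pass the data files reflect the state at the last incremental backup, so only the WAL generated after that point still has to be replayed.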
Incremental backup runtime is directly proportional to the size of the delta in the data files. The backup is even more efficient if the delta comes from ‘update’ statements
rather than from ‘insert’ and ‘delete’ SQL statements (true for most OLTP system designs).
However, incremental backup may be less efficient for OLAP and data-warehouse data stores, in cases where the whole database is refreshed on a regular basis.
With the incremental backup feature enabled,
PITR becomes a matter of hours and minutes instead of days
for large, heavily loaded OLTP systems.
Use-case: How to choose an optimal backup strategy for a highly-loaded ERP Production database:
Size : 1-2 TB Database,
Load : 100-500 concurrent users, 24x7 Global Live Production system
Daily Changes: approximately 10-15% of data in tables.
· A full database hot backup takes longer than 8 hours.
· Running a full online backup nightly causes severe performance degradation noticed by users, and affects scheduled database jobs.
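To put rough numbers on this use case, the short Python sketch below estimates the daily change volume and a nightly incremental runtime. It assumes that backup runtime scales linearly with the volume of data copied and that the changed fraction of table data translates directly into changed blocks; the function names and the 1024 GB per TB conversion are illustrative choices.

```python
GB_PER_TB = 1024  # assumed conversion factor


def daily_change_gb(db_size_tb: float, change_fraction: float) -> float:
    """Volume of data modified per day, in GB."""
    return db_size_tb * GB_PER_TB * change_fraction


def incremental_hours(full_backup_hours: float,
                      change_fraction: float) -> float:
    """Incremental runtime, assuming runtime is proportional to the
    volume of data copied."""
    return full_backup_hours * change_fraction


# Smallest and largest cases for the figures given above.
low = daily_change_gb(1, 0.10)     # about 102 GB per day
high = daily_change_gb(2, 0.15)    # about 307 GB per day
fast = incremental_hours(8, 0.10)  # about 0.8 hours
slow = incremental_hours(8, 0.15)  # about 1.2 hours
```

Under these assumptions a nightly incremental run would take on the order of one hour, compared with more than 8 hours for a full backup.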
Solution (high level action plan):
Implement and enable an incremental backup schedule on the database server by:
· running weekly full online hot backups (~8 hours),
· scheduling incremental backups to run nightly (~1 hour),
· scheduling hourly WAL backups to archive and purge older log files.
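The schedule above can be expressed as a small dispatcher, sketched here in Python. The Sunday slot for the full backup follows the earlier example; the 01:00 start hour, the task names, and the function backup_tasks are assumptions, and a real deployment would more likely rely on cron or the scheduler of the backup tool in use.

```python
from datetime import datetime

NIGHTLY_HOUR = 1  # assumed start hour for the nightly backup
SUNDAY = 6        # datetime.weekday() returns 6 for Sunday


def backup_tasks(now: datetime) -> list[str]:
    """Return the backup tasks due at the top of the given hour."""
    tasks = []
    if now.hour == NIGHTLY_HOUR:
        # Weekly full backup on Sunday, incremental on the other nights.
        tasks.append("full" if now.weekday() == SUNDAY else "incremental")
    # WAL archiving and purging runs every hour.
    tasks.append("archive_wal")
    return tasks
```

For example, calling backup_tasks(datetime(2024, 1, 7, 1)), a Sunday at 01:00, returns the full backup followed by WAL archiving, while any afternoon hour returns WAL archiving only.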