Tuesday, October 28, 2008

Unique backup names

Linux provides rich tools for automating many tasks, one of which is backing up important system files or databases. Rich as these tools are, using them efficiently is left to the system administrator's creativity. One problem junior system administrators run into when backing up is what file name to use. If you name your backup files manually, this post will not be of much help. However, if you are thinking of automating the task, continue reading.

Let me give a simple scenario showing how this problem occurs. Consider, for example, the command for backing up a database in PostgreSQL:

/usr/bin/pg_dump -h localhost -p 5432 -U postgre -F c -b -f ~/db_name.backup db_name

The name that follows ~/ is the name of the backup file. If you add this command to anacron (or any scheduling software you like), db_name.backup gets overwritten every time the job runs. Thus, if you want to study the past contents of the database, you can't, because all you have is the latest contents. If you want to study the progress of your database, it is better to have a different filename for each backup. But the question is 'How?' If you are a senior system administrator, you might already know the trick. But here's my way of approaching the problem.

The tools that we will need are: date, xargs, a pipe (|), and some simple shell scripting. For more information about these utilities, you can use google.com. :D

The date utility will be used here as an id generator. For me, the date tool provides enough information to tell us the date and time (hour, minute and second) when the backup file was created. The date command that we will use here is:

date +"%m%d%y%H%M%S"

which gives the output:

102808212438

The 10 represents the month (10 for October), 28 the day of the month, 08 the year, 21 the hour (9 pm in 24-hour format), 24 the minute and 38 the seconds. So, with this simple command we can generate an id that is unique down to the second.
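To make the layout of the id concrete, here is a small sketch that generates a fresh id and then slices the sample id from above back into its fields. The variable names (id, sample, month and so on) are mine, chosen only for illustration.

```shell
# Generate an id in the same MMDDYYHHMMSS layout used above.
id=$(date +"%m%d%y%H%M%S")
echo "generated id: $id"

# Decode the sample id from the post by cutting fixed-width fields.
sample=102808212438
month=$(echo "$sample" | cut -c1-2)
day=$(echo "$sample" | cut -c3-4)
year=$(echo "$sample" | cut -c5-6)
hour=$(echo "$sample" | cut -c7-8)
minute=$(echo "$sample" | cut -c9-10)
second=$(echo "$sample" | cut -c11-12)

echo "month=$month day=$day year=$year hour=$hour minute=$minute second=$second"
# prints: month=10 day=28 year=08 hour=21 minute=24 second=38
```

Every field is two digits wide, so the id is always twelve characters long.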

With this at hand, we can now build a filename that contains this id. For example,

db_name102808212438.backup

Every time anacron runs, we'll have a different filename and no file gets overwritten (of course, we expect that the backup is not scheduled to run more than once per second; otherwise two backups would get the same id and the problem that we outlined would occur again. But who would do such a thing? Backing up a database every few microseconds???)
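If you are worried about that corner case anyway, a script could check whether the name is already taken before writing to it. This is only a sketch: name_is_free is a hypothetical helper of mine, and I point it at a temporary directory (TMPDIR, falling back to /tmp) so that trying it out touches nothing important.

```shell
# Hypothetical helper: succeeds only if the candidate file does not exist yet.
name_is_free() {
    [ ! -e "$1" ]
}

# Build a candidate name in a scratch directory using the same id format.
candidate="${TMPDIR:-/tmp}/db_name$(date +"%m%d%y%H%M%S").backup"

if name_is_free "$candidate"; then
    echo "safe to write $candidate"
else
    echo "$candidate already exists; wait a second and try again" >&2
fi
```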

Ok, so now we have a means of getting a unique filename. What's next? What we need to do is create a simple shell script that will accept our unique id as its first argument ($1). The contents of the script might look like this:

/usr/bin/pg_dump -h localhost -p 5432 -U postgre -F c -b -f ~/db_name$1.backup db_name

And since we need to put this in anacron, we need to create another script that will pass our id to the script file (if you know how to do this with only a single script, your comment will be appreciated). For the purpose of illustration, suppose that the script containing the instruction above is named backupdb.sh. We then need to create another shell script that will pass the id to backupdb.sh. That shell script must contain:

date +"%m%d%y%H%M%S" | xargs sh backupdb.sh

And that's it. Saving this shell script and adding it as an entry in anacron will solve the stated problem.
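On the single-script question: one way that should work is shell command substitution, $(...), which puts the output of date directly into the filename, so neither xargs nor a second script is needed. The sketch below is an assumption-laden variant, not the two-script setup described above: backup_path and RUN_BACKUP are hypothetical names of mine, and the dump only runs when RUN_BACKUP=1 is set so that the script can be tried safely first.

```shell
#!/bin/sh
# Single-script variant (a sketch): build the unique name with $(...).

# Hypothetical helper: print the backup path for a database name and an id.
backup_path() {
    # $1 = database name, $2 = timestamp id
    printf '%s/%s%s.backup\n' "$HOME" "$1" "$2"
}

id=$(date +"%m%d%y%H%M%S")
target=$(backup_path db_name "$id")
echo "backup target: $target"

# RUN_BACKUP is a hypothetical switch; without it the script only prints the name.
if [ "${RUN_BACKUP:-0}" = "1" ]; then
    # Same pg_dump options as the command shown earlier in the post.
    /usr/bin/pg_dump -h localhost -p 5432 -U postgre -F c -b -f "$target" db_name
fi
```

Running it as is only prints the target name. Running it with RUN_BACKUP=1 in the environment performs the dump, and that is the form you would put in the anacron entry.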