SnapBack

A Python script for creating and maintaining a set of backups of Amazon Web Services Elastic Block Store volumes on the Simple Storage Service

Amazon Web Services (AWS) offers a convenient facility for taking snapshots of Elastic Block Store (EBS) volumes; the snapshots are saved to its Simple Storage Service (S3). AWS does not natively offer any organization of snapshots into directories, nor is it possible to name snapshots. To use the snapshots in a typical backup scheme, it is desirable to group them by frequency (e.g., yearly, quarterly, hourly) and to be able to specify how long to retain snapshots of each frequency.

This script maintains files of snapshot metadata, grouping the metadata by frequency, and deletes snapshots that have aged beyond their retention specifications.

Snapshot metadata are kept in any of the following frequency files in the directory ss_path: yearly, quarterly, weekly, daily, hourly, minutely. Each frequency file may contain metadata for more than one volume_id.

When this script is called, an EBS snapshot is taken and the metadata returned by ec2-create-snapshot are captured.

If no frequency files are present in the directory, no other action is taken (i.e., if there are no frequency files, no snapshots will be deleted as new ones accumulate). If the script continues to be called in this state, the EBS snapshot quota will eventually be reached.

If there are frequency files, the time of the current snapshot is compared to the time of the most recent snapshot of volume_id in each of the frequency files present. The current metadata are appended to the frequency file with the longest period whose interval has elapsed. For example, if it has been 14 weeks since the most recent snapshot in quarterly, 6 months since the most recent snapshot in yearly and 30 minutes since the most recent snapshot in hourly, the current snapshot will be appended to quarterly, since more than 13 weeks have passed between the most recent quarterly snapshot and the current time. It is not appended to yearly, since it has not been a year since the most recent yearly snapshot.
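The selection rule described above can be sketched in a few lines of Python. This is only an illustration, not the code in SnapBack.py: the function name choose_frequency, the PERIODS table, the 365-day year, and the treatment of a frequency file with no earlier snapshot as "due" are all assumptions made for the sketch. The fallback to the highest-frequency file present follows the behavior described for calls made more often than any period.

```python
from datetime import datetime, timedelta

# Frequency names from this README, longest period first.
# 13 weeks per quarter follows the example above; the 365-day year
# is an assumption made for this sketch.
PERIODS = [
    ("yearly", timedelta(days=365)),
    ("quarterly", timedelta(weeks=13)),
    ("weekly", timedelta(weeks=1)),
    ("daily", timedelta(days=1)),
    ("hourly", timedelta(hours=1)),
    ("minutely", timedelta(minutes=1)),
]


def choose_frequency(now, last_snapshot):
    """Pick the frequency file that should receive the current snapshot.

    last_snapshot maps the name of each frequency file that is present to
    the time of its most recent snapshot for the volume, or to None if the
    file holds no snapshot for that volume yet (hypothetical convention).
    Returns None when no frequency files are present.
    """
    present = [(name, period) for name, period in PERIODS
               if name in last_snapshot]
    if not present:
        return None
    # Walk from the longest period to the shortest and stop at the first
    # file whose period has elapsed since its most recent snapshot.
    for name, period in present:
        last = last_snapshot[name]
        if last is None or now - last >= period:
            return name
    # Nothing has elapsed: fall back to the highest-frequency file present.
    return present[-1][0]


now = datetime(2020, 6, 1, 12, 0)
example = {
    "yearly": now - timedelta(days=182),     # about 6 months ago
    "quarterly": now - timedelta(weeks=14),
    "hourly": now - timedelta(minutes=30),
}
print(choose_frequency(now, example))
```

With the times from the example above, the sketch selects quarterly.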

After the current snapshot is appended to a frequency file, the metadata in all frequency files are checked and any which are older than the retention periods specified in the command line arguments (or defaults) are deleted  (using ec2-delete-snapshot) and their metadata are deleted from the frequency files. A retention option of 0 means that no additional snapshot metadata will be stored in the corresponding frequency file and none of the snapshots in that file will be deleted.
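As a rough illustration of the retention check, the sketch below splits the entries of one frequency file into those to keep and those that have aged out. It is not the script's actual code: the function name split_expired, the representation of metadata as (snapshot_id, timestamp) pairs, and the interpretation of a retention value as that many periods of age are assumptions. The sketch only computes the split; the deletion itself (ec2-delete-snapshot in the script) is left out.

```python
from datetime import datetime, timedelta


def split_expired(entries, now, retention, period):
    """Split metadata entries into (kept, expired) lists.

    entries   -- list of (snapshot_id, timestamp) pairs (hypothetical layout)
    retention -- number of periods to retain; 0 disables deletion
    period    -- length of one period for this frequency file
    """
    if retention == 0:
        # A retention of 0 means nothing in this file is deleted.
        return list(entries), []
    cutoff = now - retention * period
    kept = [e for e in entries if e[1] >= cutoff]
    expired = [e for e in entries if e[1] < cutoff]
    return kept, expired


now = datetime(2020, 6, 1, 12, 0)
daily = [
    ("snap-aaaa", now - timedelta(days=15)),  # older than 14 days
    ("snap-bbbb", now - timedelta(days=3)),
]
kept, expired = split_expired(daily, now, retention=14, period=timedelta(days=1))
print([e[0] for e in kept], [e[0] for e in expired])
```

The snapshot IDs here are placeholders. In this usage, with a retention of 14 days, the 15-day-old entry lands in the expired list and the 3-day-old entry is kept.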

The default retention times are 5 years, 2 quarters, 26 weeks, 14 days, 120 hours and no minutes, for a maximum total of 167 snapshots (5 + 2 + 26 + 14 + 120).

In typical use, this script would be called by cron at the frequency of the highest-frequency file present. It could, however, be called more frequently than that with no ill effects. In that case, the snapshot metadata are saved to the highest-frequency file. For example, if it is called every 5 minutes and there is an hourly file but no minutely file, metadata from the snapshots will be saved in the hourly file every 5 minutes. This behavior subverts the notion that each frequency file corresponds to snapshots taken at that frequency, but it is better than the alternative of not taking a snapshot when the user believes that one is being taken.

To set up metadata storage files that save backups at the default frequencies, run the following from the appropriate directory:

$ touch yearly quarterly weekly daily hourly

 SnapBack.py