Restore of a few small files from a large backup takes a long time

Issue Symptoms

Restoring a small file or directory (e.g., only a few megabytes) from a large filesystem backup (e.g., a directory containing 1 terabyte of data) takes a long time.

Issue Description

Amanda stores filesystem backups in one of two general image formats:

  • UNIX/Linux filesystem, NFS, and CIFS backups are tar archives, usually GNU tar.
  • Windows NTFS backups are based on the ZIP64 format.

Both formats restore by starting at the beginning of the image and streaming through it serially, whether you are restoring the entire image or only selected files/directories. Therefore, the time to restore files can be similar to the time it took to create the backup image.

Even when restoring only selected files, tar or the Zmanda Windows Client must still stream through the entire image from beginning to end, restoring your files as it finds them.
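The serial behavior described above can be illustrated with a minimal sketch. It uses Python's standard tarfile module in stream mode ("r|") as a stand-in for reading a tar image; it is not Amanda's or GNU tar's actual code. The function names build_archive and restore_one and the member names are hypothetical, chosen only for illustration.

```python
import io
import tarfile


def build_archive(names, payload=b"x" * 1024):
    """Create an in-memory tar archive with one small file per name."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in names:
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf


def restore_one(archive, wanted):
    """Stream through the whole archive; return (data, members_scanned).

    Mode "r|" treats the archive as a non-seekable stream, so every
    member is read in order, whether or not it is the one requested.
    """
    data = None
    scanned = 0
    with tarfile.open(fileobj=archive, mode="r|") as tar:
        for member in tar:
            scanned += 1
            if member.name == wanted:
                # In stream mode only the current member can be extracted.
                data = tar.extractfile(member).read()
    return data, scanned
```

Calling restore_one on an archive with many members returns the one requested file, but the scanned count equals the total number of members in the archive. In this sketch the work done depends on the size of the image, not on the size of the file being restored.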

Resolution

To reduce the restore time for selected files, reduce the size of the backup image that must be read. One way of doing this is to split the large backup object into multiple, smaller objects that back up different parts of the original.

Consider the following example:

$ ls /big/data
dir1/
dir2/
largedir1/
largedir2/
file1
file2

  1. Create a backup object for “/big/data” and exclude the large subdirectories (“./largedir1” and “./largedir2”). Please see the following documentation for details about exclude syntax:

    http://docs.zmanda.com/Project:Amanda_Enterprise_3.3/ZMC_Users_Manual/Backup_What#Exclude_Specifications
     
  2. Back up each large subdirectory as a separate object:
    • /big/data/largedir1
    • /big/data/largedir2
  3. Or (UNIX/Linux, NFS, and CIFS only) create multiple /big/data backup objects with different aliases (Backup | What > Advanced Options > Alias), and use exclude and include properties. The exclude property can be set in Backup | What, but the include property must be manually added to the “/etc/amanda/<BackupSet>/disklist.conf” file. Please see the following article for more information:

    http://wiki.zmanda.com/index.php/How_To:Split_DLEs_With_Exclude_Lists

By splitting the large object into smaller ones and restoring from the relevant smaller object, Amanda streams through less data during the restore process.
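To decide which subdirectories are worth splitting into their own backup objects, it can help to measure them first. The following is a minimal sketch, not part of Amanda or ZMC. The function names subdir_sizes and split_candidates and the threshold value are assumptions made for illustration.

```python
import os


def subdir_sizes(root):
    """Return {entry_name: total_bytes} for each top-level entry under root."""
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            for dirpath, _dirs, files in os.walk(entry.path):
                for name in files:
                    path = os.path.join(dirpath, name)
                    # Skip symlinks so linked data is not counted twice.
                    if not os.path.islink(path):
                        total += os.path.getsize(path)
            sizes[entry.name] = total
        elif entry.is_file(follow_symlinks=False):
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes


def split_candidates(root, threshold_bytes):
    """List top-level subdirectories whose contents exceed threshold_bytes."""
    return sorted(
        name
        for name, size in subdir_sizes(root).items()
        if size > threshold_bytes and os.path.isdir(os.path.join(root, name))
    )
```

For the example layout above, a call such as split_candidates("/big/data", 100 * 1024**3) would list only the subdirectories holding more than 100 GiB. Those are the ones to exclude from the “/big/data” object and back up separately.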