VMware ESX Backup fails with fault detail “FileFaultFault”

This article applies to Amanda Enterprise (AE).

Issue Symptom

FAILURE DUMP SUMMARY:
127.0.0.1 "\\esx-hostname\datastore\vm-name" lev 0 FAILED [missing size line from sendbackup]

Issue Description

  • You see “missing size line from sendbackup” in the FAILURE DUMP SUMMARY for VMware ESX backups.
  • In the FAILED DUMP DETAILS, you see “FileFaultFault”, as in the example below:
FAILED DUMP DETAILS:
/-- 127.0.0.1 "\\esx-hostname\datastore\vm-name" lev 0 FAILED [missing size line from sendbackup]
[...snip...]
? SOAP Fault:
? -----------
? Fault string: Error caused by file /vmfs/volumes/7d6d855e-78934c23-9e89-4b8d90bd2bd0/vm-name/vm-name.vmdk
? Fault detail: FileFaultFault
[...snip...]
? Exit status <255> while running </usr/lib/amanda/application/amvmware-helper --operation get-changedAreas --server esx-hostname --username esx-backup-user --password ********** --vmname vm-name --snapshot-name zmanda-YYYY-MM-DD-hh-mm-ss --change-id * --disk-device-key 2000 --metadata-file /tmp/amanda//BackupSet-datastore-vm-name-backup/vm-name.meta>
? Application (14876) amvmware returned 1
? dumper: strange [missing size line from sendbackup]
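
To check a saved Amanda report for this signature without reading it by hand, the following is a minimal sketch. It assumes the report text has already been loaded into a string; the function name `diagnose_report` and the returned dictionary keys are hypothetical and are not part of Amanda. The patterns are based only on the lines shown in the excerpt above, so they may need adjusting for reports that are formatted differently.

```python
import re

# Marker strings taken from the report excerpt above.
SUMMARY_MARKER = "missing size line from sendbackup"
FAULT_DETAIL_RE = re.compile(r"^\?\s*Fault detail:\s*(.+)$", re.MULTILINE)
FAULT_FILE_RE = re.compile(
    r"^\?\s*Fault string: Error caused by file\s+(.+)$", re.MULTILINE
)


def diagnose_report(report_text):
    """Report whether the text shows the FileFaultFault symptom.

    Hypothetical helper: returns a dict with
      "matches"    - True if both the summary marker and a
                     "Fault detail: FileFaultFault" line are present
      "vmdk_files" - files named in "Fault string" lines (empty if no match)
    """
    has_summary = SUMMARY_MARKER in report_text
    details = [d.strip() for d in FAULT_DETAIL_RE.findall(report_text)]
    has_file_fault = "FileFaultFault" in details
    matches = has_summary and has_file_fault
    files = []
    if matches:
        files = [f.strip() for f in FAULT_FILE_RE.findall(report_text)]
    return {"matches": matches, "vmdk_files": files}
```

If `matches` is true, the paths in `vmdk_files` identify the VMDK files that ESX reported, which is useful to note before starting the procedure under Resolution.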

Resolution

VMware ESX returns the “FileFaultFault” error when Changed Block Tracking (CBT) attempts to analyze the virtual machine’s VMDK file and encounters an error. To resolve this issue, follow the Reset Procedure in the article “Enabling Changed Block Tracking (CBT) for VMWare Guests”.

If these steps do not resolve the issue and the datastore is on a non-local filesystem (e.g., NFS), a more exhaustive procedure is described in the “NFS_2_local_datastore_move.pdf” file attached to this article. As with the Reset Procedure, it is critical to follow the steps exactly as written. Even if you have access to VMware vMotion, do not use it during the process.

CBT on NFS

The FileFaultFault error can occur on VMFS datastores, but it occurs more frequently on NFS datastores. VMware does not officially support CBT on NFS datastores when the “*” query is used, as in “--change-id *”. This is explained in VMware’s VDDK API documentation:

http://pubs.vmware.com/vsphere-50/topic/com.vmware.ICbase/PDF/vddk_prog_guide.pdf

The following restrictions are imposed on the “*” query when determining allocated areas of a virtual disk:

  • The disk must be located on a VMFS volume (backing does not matter).
  • The virtual machine must have had no (zero) snapshots when changed block tracking was enabled.
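
The two restrictions above can be expressed as a simple pre-check. This is only a sketch: the function name `star_query_supported` and its parameters are hypothetical, and it assumes you already know the datastore type as a string such as "VMFS" or "NFS" and how many snapshots the virtual machine had when CBT was enabled. It does not query ESX itself.

```python
def star_query_supported(datastore_type, snapshots_when_cbt_enabled):
    """Return True if both documented "*" query restrictions are met.

    datastore_type: assumed to be a string such as "VMFS" or "NFS".
    snapshots_when_cbt_enabled: number of snapshots the virtual machine
        had at the moment changed block tracking was turned on.
    """
    on_vmfs = datastore_type.strip().upper() == "VMFS"
    no_snapshots = snapshots_when_cbt_enabled == 0
    return on_vmfs and no_snapshots


# Example: a VM on an NFS datastore falls outside the supported case.
print(star_query_supported("NFS", 0))
```

A result of False does not mean the backup will fail; it only means the configuration is outside what VMware documents as supported for the “*” query.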

Despite this limitation, we have seen many customers successfully use CBT on NFS, and the FileFaultFault error is almost always resolved by either the procedure described in this article or by the longer procedure in “NFS_2_local_datastore_move.pdf”.