Operational Defect Database

BugZero found this defect 4252 days ago.

Veeam | kb1680

VSS Timeout when backing up Exchange VM

Last update date:

12/28/2021

Affected products:

Veeam Backup & Replication

Affected releases:

ALL

Fixed releases:

No fixed releases provided.

Description:

Challenge

The backup of an Exchange server VM fails with the following error:

Unfreeze error:[Backup job failed] Cannot create shadow copy of the volumes containing writer’s data A VSS critical writer has failed. Writer name: [Microsoft Exchange Writer]. Class ID: [{76fe1ac4-15f7-4bcd-987e-8e1acb462fb7}]. Instance ID: [{0db23250-4d1e-42c1-8d14-2be32f448184}]. Writer's state: [VSS_WS_FAILED_AT_FREEZE]. Error code: [0x800423f2].]

If you run the command ‘vssadmin list writers’ on the Exchange server after the job fails, you will typically see that an Exchange writer has failed because of a timeout error (error code 9).
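As a rough illustration of checking the writer state after a failed job, the sketch below scans text in the layout that ‘vssadmin list writers’ normally prints and reports any writer that is not stable. The function name failed_writers and the SAMPLE_OUTPUT text are hypothetical and hand-written for this example, and the parser assumes English-language output with "Writer name:", "State:" and "Last error:" lines.

```python
import re

# Hand-written sample in the usual 'vssadmin list writers' layout.
# It is illustrative only and was not captured from a real server.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'Microsoft Exchange Writer'
   State: [9] Failed
   Last error: Timed out
"""


def failed_writers(vssadmin_output: str) -> list[dict]:
    """Return every writer whose state is not [1] Stable or whose last error is set."""
    writers = []
    current = None
    for raw_line in vssadmin_output.splitlines():
        line = raw_line.strip()
        name = re.match(r"Writer name: '(.*)'", line)
        if name:
            # A new writer block starts here.
            current = {"name": name.group(1), "state_code": None,
                       "state": None, "last_error": None}
            writers.append(current)
            continue
        if current is None:
            continue  # header lines before the first writer
        state = re.match(r"State: \[(\d+)\]\s*(.*)", line)
        if state:
            current["state_code"] = int(state.group(1))
            current["state"] = state.group(2)
            continue
        error = re.match(r"Last error: (.*)", line)
        if error:
            current["last_error"] = error.group(1)
    # Writers with missing fields are also reported, since they cannot be confirmed stable.
    return [w for w in writers
            if w["state_code"] != 1 or w["last_error"] != "No error"]


if __name__ == "__main__":
    for writer in failed_writers(SAMPLE_OUTPUT):
        print(writer["name"], writer["state_code"], writer["last_error"])
```

On the Exchange server itself, the input text would come from running the command and capturing its output rather than from the sample string.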

Cause

Starting in Veeam Backup & Replication v8, to overcome the VSS freeze time limit described below, Veeam Backup & Replication uses the Microsoft VSS persistent snapshots technology for backup of Microsoft Exchange VMs. If Microsoft Exchange fails to be frozen within the allowed period of time, Veeam Backup & Replication automatically fails over to the persistent snapshot mechanism. To learn more about this feature, please read: https://helpcenter.veeam.com/docs/backup/vsphere/persistent_snapshots.html

The message "VSSControl: Failed to freeze guest, wait timeout" refers to the limit imposed by Microsoft VSS writers on the duration of a freeze. This timeout is not configurable.

Veeam uses VSS to freeze applications immediately prior to creating the VMware snapshot, and then sends the thaw command as soon as snapshot creation is complete. VSS will only hold a freeze on the writers for up to 60 seconds (20 for Exchange), so the following steps must all fit within this timeframe, in order:

- Verification of freeze state [1]
- Snapshot creation request via VIM API [2]
- Snapshot creation on the ESXi host
- Return of snapshot information via VIM API [2]
- Thaw request to Microsoft VSS [1]
- Thawing of VSS writers’ I/O

[1] If a network connection to the guest OS is not available, VIX API will be used, which introduces additional latency.
[2] These steps should usually be near-instantaneous, but if the vCenter is heavily loaded or has a high latency to the ESXi hosts, the delay may be significant.
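To make the timing constraint concrete, here is a minimal sketch that adds up the duration of each step in the freeze window and compares the total with the limits quoted above (60 seconds in general, 20 for Exchange). The function name freeze_budget and all per-step timings are hypothetical values chosen for illustration, not measurements from any real environment.

```python
# Freeze limits quoted in this article, in seconds.
FREEZE_LIMIT_SECONDS = {"default": 60.0, "exchange": 20.0}


def freeze_budget(step_seconds: dict[str, float],
                  writer: str = "exchange") -> tuple[float, float]:
    """Return (total time spent, time remaining before the writer's freeze limit)."""
    total = sum(step_seconds.values())
    remaining = FREEZE_LIMIT_SECONDS[writer] - total
    return total, remaining


# Hypothetical timings for the six steps that must fit inside the freeze window.
HEALTHY = {
    "verify freeze state": 1.0,
    "snapshot creation request (VIM API)": 2.0,
    "snapshot creation on ESXi host": 12.0,
    "return of snapshot information (VIM API)": 2.0,
    "thaw request to Microsoft VSS": 1.0,
    "thaw of VSS writers' I/O": 1.0,
}

# Same steps, but with a loaded vCenter adding latency to the VIM API request.
LOADED_VCENTER = dict(HEALTHY, **{"snapshot creation request (VIM API)": 6.0})

if __name__ == "__main__":
    for label, steps in (("healthy", HEALTHY), ("loaded vCenter", LOADED_VCENTER)):
        total, remaining = freeze_budget(steps)
        verdict = "fits" if remaining >= 0 else "exceeds the Exchange freeze limit"
        print(f"{label}: {total:.1f}s used, {remaining:.1f}s remaining ({verdict})")
```

In this made-up scenario, a few extra seconds of latency on a single VIM API call is enough to push the sequence past the 20-second Exchange limit, even though the same sequence would fit comfortably within the general 60-second limit.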

Solution

This is an infrastructure issue that can be difficult to narrow down. The following is a list of resolutions that customers have used to resolve it:

- First, make sure that you can create a Windows backup of the VM using VSS. If this succeeds, the issue is not specifically VSS-related, but is instead a combination of VSS and VMware snapshot technology.
- Ensure that you have no other backup vendor agents on the server you are backing up; if you do, uninstall them. VSS operations on a guest OS should be performed by only one backup product. Note that Veeam uses Microsoft VSS, whereas other software vendors may use their own VSS providers/writers, so successful backups made by those solutions are not a valid comparison.
- Reboot the Exchange server.
- The ESX(i) host may not have enough resources.
- The VMware snapshot may take longer than 20 seconds (the hardcoded Exchange VSS Writer timeout).
- The Exchange freeze may be too I/O intensive for the back-end storage, in which case the backup time and/or the datastore on which the Exchange server is located may need to be changed.
- The COM+ Event System service may need to be restarted. The root cause is unknown. In some cases, customers have scripted a restart of this service prior to backup.
- Latency between the vCenter (VC) and the hosts can mean that backing up through the host directly produces successful VSS backups, whereas going through the VC causes freeze issues.
- If Veeam does not have direct network communication to Exchange, as a test, put Veeam on a network that does have network connectivity to Exchange and see if that resolves the issue. Direct network communication is not required; however, if there are underlying issues with VIX, Veeam will try to communicate over IP, and in some cases this does not work properly because of the network architecture.
- If you are attempting to use "connectionless-mode" for VSS (i.e. there is a firewall and Veeam therefore relies on the VIX API to communicate), it is extremely important that the account used for Application-Aware Processing is either the "built-in" local administrator or the "built-in" domain administrator (i.e. it must have a "well-known" SID ending in 500). Other local or domain administrator accounts will not work. (See: KB1788)
- Ensure there is no snapshot present on the Exchange VM before the backup starts, as that could cause additional storage I/O.
- The Exchange server may need additional resources (CPU/RAM) if it is taxed during the unfreeze.
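For the connectionless-mode requirement that the account's SID ends in 500, a small check along the following lines may help when auditing the account used for Application-Aware Processing. This is only a sketch: the function name is_builtin_administrator and the example SIDs are hypothetical, and it assumes the usual S-1-5-21-<three sub-authorities>-<RID> layout of local and domain account SIDs. The SID itself has to be obtained separately, for example from the output of `whoami /user` when logged on as that account.

```python
import re

# Local and domain account SIDs: S-1-5-21, three machine/domain sub-authorities, then the RID.
_ACCOUNT_SID = re.compile(r"^S-1-5-21(-\d+){3}-(\d+)$")


def is_builtin_administrator(sid: str) -> bool:
    """Return True if the SID has the well-known RID 500 of the built-in Administrator."""
    match = _ACCOUNT_SID.match(sid.strip().upper())
    return bool(match) and match.group(2) == "500"


if __name__ == "__main__":
    # Made-up SIDs for illustration only.
    examples = [
        "S-1-5-21-1111111111-2222222222-3333333333-500",   # built-in Administrator
        "S-1-5-21-1111111111-2222222222-3333333333-1104",  # ordinary account
    ]
    for sid in examples:
        print(sid, is_builtin_administrator(sid))
```

An account that is merely a member of the Administrators group has a different RID and would be reported as False here, which matches the restriction described above.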

Status

Solved
