June 13, 2018  —  Acronis

VMware: "To Quiesce or not to Quiesce?"

A large number of articles have been written about taking virtual machine snapshots, but they all fall short, as they are very theoretical and complex. In this entry, we are going to cover the practical side of taking virtual machine snapshots on a VMware vSphere platform to use as backups.

We will answer the following questions:

  • What is a snapshot in idle mode?
  • Why use it?
  • What problems can I run into when using idle mode?

[Screenshot: virtual machine snapshot]

Use snapshots for backups

In the VMware vSphere environment, you can create a snapshot in two different ways:

  • A snapshot that includes the memory state of the virtual machine
  • A snapshot taken with the guest file system in idle mode

When backing up a virtual machine through the VMware vStorage APIs for Data Protection, never use the first option, which includes the memory state of the virtual machine. If the virtual machine has 8 to 16 GB of RAM or more, creating an incremental backup will take a long time, because the snapshot is too large (it also includes the contents of the RAM). In addition, you may run into other technical difficulties.

The alternative is idle mode, also known as quiescing. This is a more viable option because it prepares the guest operating system (primarily the file system) for a backup.

What is idle mode?

A VMware Knowledge Base article defines it as follows: "Idle mode is the process by which data on a disk is prepared for an appropriate state in which to backup. This process can include operations such as flushing the operating system cache buffer to the hard drive or other tasks specific to higher-level applications."

Unfortunately, this description does not explain what happens to the virtual machine during this process. This is what we want to investigate.

First, when you use the VMware Snapshot Provider Service in VMware Tools, you initiate the creation of a new Volume Shadow Copy Service (VSS) snapshot within the guest operating system (OS). All registered VSS writers, which you can view using the "vssadmin list writers" command, receive the request and prepare their applications for backup by writing transactions from memory to disk. After the VSS writers complete this task, they inform the VMware Tools Service, through the VMware Snapshot Provider, that the task has been completed and that the system is ready for a snapshot.
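
The writer check mentioned above can be scripted. The following is a minimal sketch, assuming the usual "Writer name / State / Last error" layout of the "vssadmin list writers" output; the sample text and the helper names are illustrative, not taken from a real system.

```python
import re

# Illustrative sample only; real output also contains writer IDs and a header.
SAMPLE_OUTPUT = """\
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   State: [8] Failed
   Last error: Non-retryable error
"""


def parse_writers(text):
    """Collect name, state and last error for each writer block."""
    writers = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        match = re.match(r"Writer name: '(.*)'$", line)
        if match:
            current = {"name": match.group(1), "state": None, "last_error": None}
            writers.append(current)
        elif current is not None and line.startswith("State:"):
            # "State: [1] Stable" -> "Stable"
            current["state"] = line.split("]", 1)[-1].strip()
        elif current is not None and line.startswith("Last error:"):
            current["last_error"] = line.split(":", 1)[1].strip()
    return writers


def unhealthy_writers(writers):
    """Names of writers that are not stable or that report an error."""
    return [
        w["name"]
        for w in writers
        if w["state"] != "Stable" or w["last_error"] != "No error"
    ]


print(unhealthy_writers(parse_writers(SAMPLE_OUTPUT)))
```

On a real guest, the text would come from running the command inside the virtual machine; any writer listed by unhealthy_writers is worth looking at before blaming the hypervisor.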

The software for VMware vSphere uses one of the following combinations of settings when preparing for a VMware snapshot:

  • Quiesced = ON, Memory = OFF
  • Quiesced = OFF, Memory = OFF
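
As a rough illustration of these two combinations, here is a minimal sketch of requesting such a snapshot from a script. It assumes a virtual machine object that exposes a CreateSnapshot_Task method with name, description, memory and quiesce arguments, as the pyVmomi library does; FakeVM and create_backup_snapshot are hypothetical names used only to keep the example self-contained.

```python
def create_backup_snapshot(vm, name, quiesce=True):
    """Request a snapshot with Memory = OFF and the given Quiesced setting.

    `vm` is assumed to offer a pyVmomi-style CreateSnapshot_Task method.
    """
    return vm.CreateSnapshot_Task(
        name=name,
        description="Backup snapshot (quiesce=%s, memory=False)" % quiesce,
        memory=False,      # never include the memory state for backups
        quiesce=quiesce,   # True = idle mode of the guest file system
    )


class FakeVM:
    """Stand-in for a real VM object; it only records the arguments."""

    def __init__(self):
        self.calls = []

    def CreateSnapshot_Task(self, name, description, memory, quiesce):
        self.calls.append({"name": name, "memory": memory, "quiesce": quiesce})
        return "task-1"


vm = FakeVM()
create_backup_snapshot(vm, "pre-backup")                 # Quiesced = ON
create_backup_snapshot(vm, "pre-backup-crash", False)    # Quiesced = OFF
print(vm.calls)
```

With a real connection, the returned task would still need to be waited on and checked, and a successful task does not prove that VSS was actually used.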

Note that VMware completely controls the snapshot creation process. Let's review the first option, that is, when Quiesced is set to ON.  

Why do we need to go into idle mode?

There are many reasons to set idle mode to ON. For example, you avoid the Update Sequence Number (USN) rollback problem when restoring Active Directory: once the domain controller recovers from a VSS-enabled backup, the InvocationID is successfully reset, and a healthy entry appears in the Event Log:

Event ID 1109: Active Directory has been restored from backup media or configured to host an application partition. The invocationID attribute for the controller of this domain has been changed.

It will also avoid problems in the recovery of SQL Server or other applications.

Acronis backup software, such as Acronis Backup 12.5, performs these operations correctly on all types of operating systems, servers, and applications running inside a virtual machine.

How can you verify that the snapshot was created successfully with VSS?

There are several ways to determine if a snapshot was created successfully. It can be checked down to the application level.

First, check the Event Viewer. When a snapshot is created with the options quiesced = ON, memory = OFF (see the screenshot at the beginning of this entry), the application log shows the following events from the VSS writers:

[Screenshot: quiescing]

Note: The VSS error with event ID 12289 that can be seen in the screenshot is not a problem. It is related to the 3.5-inch floppy drive. To get rid of it, just remove the floppy drive from the VM configuration:

[Screenshot: the VSS error with event ID 12289]

Alternatively, you can use the Datastore Browser component in the vSphere client to check if the snapshot was created successfully. Once the idle mode snapshot has been created, you should be able to see a vss_manifests*.zip file in the VM folder on the datastore.

Inside the archive, there is a backup.xml file containing the description of each of the VSS writers found on the guest system, along with writerX.xml files holding the metadata of each writer.

[Screenshot: Datastore Browser]

It is important to note that if the vss_manifests.zip file contains only a backup.xml file, this usually means that the snapshot was created using VSS, but it is still a troublesome snapshot. Failed snapshots are easy to spot; what matters is recognizing the cases where VMware reports a successful snapshot when in fact it was not.
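
Once the archive has been downloaded from the datastore, this check can be automated. The following is a minimal sketch under the assumptions stated above: the archive holds a backup.xml file plus writerX.xml files, where X is taken to be a number. The function names and result labels are illustrative.

```python
import io
import re
import zipfile


def classify_manifest(zip_bytes):
    """Classify a vss_manifests archive by the files it contains."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Keep only the file name, ignoring any folder prefix and case.
        names = [n.rsplit("/", 1)[-1].lower() for n in archive.namelist()]
    has_backup = "backup.xml" in names
    # Assumption: writer metadata files are named writer<number>.xml.
    has_writers = any(re.fullmatch(r"writer\d*\.xml", n) for n in names)
    if has_backup and has_writers:
        return "vss-with-writer-metadata"
    if has_backup:
        return "vss-but-troublesome"
    return "no-vss-metadata"


def make_zip(file_names):
    """Build a small in-memory archive, used here only for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for name in file_names:
            archive.writestr(name, "<xml/>")
    return buffer.getvalue()


print(classify_manifest(make_zip(["backup.xml", "writer0.xml"])))
print(classify_manifest(make_zip(["backup.xml"])))
print(classify_manifest(make_zip([])))
```

A result other than the first one is a signal to inspect the guest, even if vSphere reported the snapshot as successful.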

In the following sections, we will discuss what causes a snapshot to fail.

Environment requirements

Obviously, there are benefits to using the idle mode option, but problems can often arise from improper initial environment settings. The necessary official requirements can be found here.

Let's see what to look for to determine if you are having these kinds of issues.

First, make sure your system supports application-consistent snapshots.  

[Screenshot: vSphere]

Second, for idle mode to work, you have to install the VSS components of VMware Tools and update VMware Tools to the latest version.

[Screenshot: VMware Tools]

Versions 3.5 and earlier of vSphere used the Legato Sync Driver for idle mode. It guaranteed consistency at the file level, but not at the application level, which is precisely what we need the VSS components for. Legato has been replaced by the VMware Snapshot Provider. To verify that it is installed, find the VMware Snapshot Provider Service and the appropriate COM+ components on the virtual machine.

[Screenshot: VMware Snapshot Provider Service]

What problems can appear in this phase?

If VMware Snapshot Provider Service is offline or not installed, VMware will still report that the snapshot was successful with the option quiescing = ON, memory = OFF. However, the snapshot will be taken without VSS and the Legato Sync driver will be used instead.

[Screenshot: quiescing = ON]

On Windows 2008 and later versions, the behavior is different: no events appear in the log. Instead, the VSS service starts and stops again.

Third, one of the typical problems in setting up idle mode is the disk.EnableUUID = true parameter in the .vmx file of the virtual machine.

Setting this parameter only makes sense for guest systems based on Windows 2008 and later (this option is not supported in Windows 2003). The parameter exists only in vSphere 4.1 and later versions. In other words, if you migrate a virtual machine from an older environment to a newer one, it may not have this setting.
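
To check a migrated machine for this setting, you can read its .vmx file. The following is a minimal sketch, assuming the common key = "value" layout of .vmx files and treating keys and values as case-insensitive; the sample content and function names are illustrative.

```python
SAMPLE_VMX = """\
.encoding = "UTF-8"
displayName = "dc01"
disk.EnableUUID = "TRUE"
"""


def parse_vmx(text):
    """Turn key = "value" lines into a dictionary with lower-case keys."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip().lower()] = value.strip().strip('"')
    return settings


def uuid_enabled(vmx_text):
    """True only if disk.EnableUUID is present and set to true."""
    return parse_vmx(vmx_text).get("disk.enableuuid", "").lower() == "true"


print(uuid_enabled(SAMPLE_VMX))
print(uuid_enabled('displayName = "old-vm"\n'))
```

A missing key is reported the same way as an explicit false value, because in both cases the snapshot would be taken without VSS.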

[Screenshot: configuration]

When this parameter is missing or set incorrectly, the snapshot will be created successfully, but without VSS. This can lead to an inconsistent backup. If the backup.xml file in the vss_manifests.zip file is empty (usually it contains a log of VSS activity), this indicates that the parameter is not enabled.

Fourth, make sure there are no dynamic disks in the virtual machine. VSS will not work with dynamic disks, whether they are system disks or data disks. A snapshot will be created, but the vss_manifests.zip file will be empty, and so will the event logs within the guest operating system. This occurs in Windows 2008 and later versions, and the same happens with IDE disks (with the exception of IDE CD-ROM drives, which do not affect snapshots). Also make sure that the number of free SCSI slots on a SCSI controller is at least equal to the number of disks. For example, if there are 8 SCSI disks on SCSI1, you will not have enough slots.
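
The disk checks from the fourth point can be expressed as a small script. The following is a minimal sketch, not an official validation: it assumes 15 usable device slots per virtual SCSI controller and applies the slot rule per controller, which is our reading of the example with 8 disks. The Disk type and the function name are hypothetical.

```python
from dataclasses import dataclass

# Assumption: a virtual SCSI controller offers 15 usable device slots.
SLOTS_PER_SCSI_CONTROLLER = 15


@dataclass
class Disk:
    label: str
    bus: str              # "scsi" or "ide"
    controller: int       # controller number, for example 1 for SCSI1
    dynamic: bool = False


def layout_problems(disks):
    """Return a list of findings that can prevent a VSS-based snapshot."""
    problems = []
    for disk in disks:
        if disk.dynamic:
            problems.append(f"{disk.label}: dynamic disk")
        if disk.bus == "ide":
            problems.append(f"{disk.label}: IDE disk")
    # Count SCSI disks per controller and compare with the free slots.
    used_per_controller = {}
    for disk in disks:
        if disk.bus == "scsi":
            used = used_per_controller.get(disk.controller, 0)
            used_per_controller[disk.controller] = used + 1
    for controller, used in sorted(used_per_controller.items()):
        free = SLOTS_PER_SCSI_CONTROLLER - used
        if free < used:
            problems.append(
                f"SCSI{controller}: {used} disks but only {free} free slots"
            )
    return problems


eight_disks = [Disk(f"disk{i}", "scsi", 1) for i in range(8)]
print(layout_problems(eight_disks))
```

With 8 disks on one controller, only 7 slots remain free, which matches the example above; with 7 disks, the check passes.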

Fifth, one reason many users complain to VMware support is a broken VSS inside a guest machine. These users assume that the snapshot failure was caused by VMware; however, the problem is at the guest operating system level. Here, you can see a screenshot of what happens when trying to create a snapshot in idle mode after a new SQL Server database could not be installed successfully. The .iso virtual drive was unmounted during installation, and the installer did not like this.

[Screenshot]

This particular problem is solved by simply restarting the virtual machine.

Although restarting can help in other cases as well, sometimes the damage done to VSS is irreparable and restarting will do nothing. To test VSS, run Windows Backup and try to back up the System State. If it fails, the problem is in VSS. If it works, the problem is on the hypervisor side.

VMware has published several articles addressing this topic in its Knowledge Base, including "Troubleshooting Volume Shadow Copy (VSS) Quiesce-Related Issues" and "Failed to Quiesce Snapshot of the Windows 2008 R2 Virtual Machine". In fact, one of these articles proposes setting disk.EnableUUID to false. This effectively rules out the use of VSS when taking a snapshot in idle mode. Although this is not the ideal solution, it can be applied as a temporary measure. Either way, be careful, because it can cause problems, such as USN rollback, when restoring systems that require application consistency.

Let’s recap

Issues 2, 3, and 5 cause most snapshot consistency problems, and sometimes no snapshot is created at all. Regardless of the difficulties involved in snapshots, one thing should not be forgotten: it is not enough to create a backup; you have to verify that it can actually be recovered. There are several posts on our blog that explain best practices and why creating backups is important for your business.

Take a look at your servers today. Make sure your virtual machines have backups. Review your disaster recovery plan. If you need help, call us or try Acronis Backup 12.5. Today, there is no faster, more comprehensive, and easier-to-use backup solution on the market. With Acronis Backup 12.5, "To quiesce or not to quiesce" will no longer be the question.

About Acronis

A Swiss company founded in Singapore in 2003, Acronis has 15 offices worldwide and employees in 50+ countries. Acronis Cyber Protect Cloud is available in 26 languages in 150 countries and is used by over 20,000 service providers to protect over 750,000 businesses.