That… could be a problem…

11 Apr 2012

vDR – causing problems…

For those new to vSphere 5's GUI, there's a new column that's been added to the Virtual Machine view by the name of "Needs Consolidation".
Needs Consolidation

This column was added because snapshots occasionally fail to delete properly: the delta files are left behind in the VM's folder while the Snapshot Manager shows no snapshots at all.

Along with the new column, take note of the new entry in each VM's Snapshot menu, which lets a user run the "Consolidate" function.
Consolidate Snapshot

As you can see in the first screenshot, we had a couple of systems that needed consolidation. So another admin went through, hit the Consolidate button, and got hit with an "Unable to access file since it is locked" error. Normally you can figure out which file is locked with some command line work or by rebooting the host (via: VMware KB: 10051). However, our VM is running, so there's something else going on.
Locked File

I still decided to dive into the CLI and check it out. I was stunned...
Deltas!

18 deltas... 18! And despite the vmsn file sitting in there, there was no record of any snapshots.
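If you would rather count the leftovers than eyeball them, here is a minimal Python sketch. It assumes you have copied the file names out of the VM's folder listing into a list; the names below are made up for illustration, and the only thing it relies on is the usual "-delta.vmdk" suffix on snapshot delta disks.

```python
from fnmatch import fnmatch


def find_delta_files(file_names):
    """Return the names that look like snapshot delta disks (*-delta.vmdk)."""
    return sorted(name for name in file_names if fnmatch(name, "*-delta.vmdk"))


# Hypothetical folder listing, NOT taken from the system in this post.
listing = [
    "examplevm.vmx",
    "examplevm.vmdk",
    "examplevm-flat.vmdk",
    "examplevm-000001.vmdk",
    "examplevm-000001-delta.vmdk",
    "examplevm-000002.vmdk",
    "examplevm-000002-delta.vmdk",
    "examplevm-Snapshot1.vmsn",
]

deltas = find_delta_files(listing)
print(len(deltas), "delta file(s):", deltas)
```

If the count is above zero while the Snapshot Manager shows nothing, the VM is a likely candidate for consolidation.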

In this case, that system probably hasn't even been rebooted 18 times, much less been snapshotted that many times... Except vDR (VMware Data Recovery) is set up on it to take daily snapshots. So I checked the vDR appliance's settings and found 8 extra disks attached to it.
Locked File

After removing all of those extra hard disks, the consolidations would succeed. Note, it took a while, but they did succeed.
Locked File

Just another reminder that while vDR is a great tool to have on hand, it should definitely not be your one and only method of backup.

17 Feb 2012

Snapshot stuck “In Progress” Workaround

Came in to work today to find a VM stuck at the "In Progress" status while taking a snapshot. We use vDR as a complementary subset to our backup plans, and vDR had the unfortunate task of calling the snapshot which is now hung.

The official error read "An error occurred while quiescing the virtual machine. See the virtual machine's event log for details." One problem with that: I couldn't log into the system. The snapshot had gotten far enough along to freeze the I/O, so I had to jump into the CLI and kill the task.

To kill the task for a VM, jump into the CLI on the host (in this case through the iDRAC and the local terminal... bad, I know) and run: ps | grep vmx to list the running processes filtered down to the vmx entries.
Login & Command

Locate the Parent Process ID (the second column) for the hung VM, and run: kill *parent process ID* to end the process. In this case, it was: kill 465724
***DISCLAIMER: Be very careful doing this. If you kill the wrong process, you can do harm to your ESXi host.***
Kill Parent Process
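To make the "second column" step less error-prone, here is a small Python sketch that pulls the distinct second-column IDs out of pasted ps | grep vmx output. The sample lines are invented for illustration and were not captured from a real host; the only thing taken from the steps above is that the parent process ID sits in the second column. Treat the result as a list of candidates to check by eye, not as something to feed straight into kill.

```python
def second_column_ids(ps_output, needle="vmx"):
    """Collect the distinct second-column values from lines containing `needle`."""
    ids = []
    for line in ps_output.splitlines():
        if needle not in line:
            continue
        fields = line.split()
        # Keep only numeric IDs, and keep each one once, in order of appearance.
        if len(fields) >= 2 and fields[1].isdigit() and fields[1] not in ids:
            ids.append(fields[1])
    return ids


# Hypothetical output: the column layout and names are assumptions.
sample = """\
465725 465724 vmx-vthread-5 /bin/vmx
465726 465724 vmx-mks /bin/vmx
471200 471199 vmx-vcpu-0 /bin/vmx
"""

candidates = second_column_ids(sample)
print(candidates)  # ['465724', '471199']
```

You still have to work out which of the candidates belongs to the hung VM before killing anything.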

The task that was stuck in progress should immediately change to a status of "The attempted operation cannot be performed in the current state (Powered off)."
New Status

Now, it's time to check out that error. In this case I received a "Volume Shadow Copy Service error: Unexpected error DeviceIoControl" with the rest of the error seemingly pointing at the generic floppy drive. I know this is the error because it's pointing dead at the VMware Snapshot Provider and in a state of "DoSnapshotSet"
Windows Error

That's incredibly weird. So I uninstalled both the floppy drive and its controller, and also removed the drive from the VM's settings while it was powered off. I booted the system back up, and the floppy drive had reinstalled itself. Very odd, so I just uninstalled the drive and then disabled the controller.
Device Manager

So it's snapshot time again, right? Well, not really. I retried the snapshot and it froze again. Time for some googling, after killing the parent process for the VM once more.

What I came up with is that there is apparently an issue with Windows Server 2008 R2 systems running SQL 2008. This came up frequently in VMware Communities posts, but no one ever really had an answer as to what was going on inside the VM to cause the problem. We have 4 or 5 SQL servers running, and this is the first system where we've run into it.

Anyway, the best workaround I found was to disable the disk.EnableUUID parameter on the VM. Please note that by disabling this, you effectively disable VSS for the snapshot (i.e. no quiescing). So I maintain that this is only a workaround and not yet a true solution.

To do this, shut down the VM. Right click, Edit Settings, hit the Options tab, and click on "Configuration Parameters"
Config Parameters

In the Configuration Parameter pop-up screen, look for the "disk.EnableUUID" setting and change the value to read "false"
Enable UUID False
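For reference, Configuration Parameters end up as key = "value" lines in the VM's .vmx file. The sketch below shows that change as plain text manipulation in Python. It is only an illustration: I made the change through the GUI as described above, the sample text and VM name are made up, and if you ever edit a .vmx directly, do it with the VM powered off and a backup copy of the file.

```python
import re


def set_vmx_option(vmx_text, key, value):
    """Return vmx_text with `key` set to `value`, adding the line if it is missing."""
    new_line = '{} = "{}"'.format(key, value)
    pattern = re.compile(
        r"^[ \t]*" + re.escape(key) + r"[ \t]*=.*$",
        re.IGNORECASE | re.MULTILINE,
    )
    if pattern.search(vmx_text):
        # Replace the existing line; the lambda avoids escape handling in new_line.
        return pattern.sub(lambda match: new_line, vmx_text)
    separator = "" if vmx_text == "" or vmx_text.endswith("\n") else "\n"
    return vmx_text + separator + new_line + "\n"


# Hypothetical .vmx fragment, not taken from the VM in this post.
sample = 'displayName = "ExampleVM"\ndisk.EnableUUID = "true"\n'

updated = set_vmx_option(sample, "disk.EnableUUID", "false")
print(updated)
```

Setting the value back to "true" the same way reverses the workaround once a real fix turns up.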

Click OK a couple of times and boot the system up. Once it's up, try taking a snapshot with the "Quiesce guest file system" option checked. This time, everything was successful. I ran the test myself and also had the vDR appliance run another snapshot to bring its backup up to date.

Hopefully I can do some more research and turn up some better answers. At worst, I'll create a support ticket and see if VMware Support can point me in a better direction.

6 Jul 2011

Installing the VMware vDR Appliance…

Go to VMware's website, go to the "Downloads" section, click on "vSphere 4" and scroll down until you see "VMware Data Recovery", click on the "Download" button.
Accept the EULA and download the ISO. Once downloaded, extract the ISO (I use 7zip).
Remote Setup Wizard

Go into your vCenter and go to "File" then "Deploy OVF Template..."
Deploy OVF

Browse out to where the ISO was extracted to and select the ovf file in the VMwareDataRecovery-ovf-i386 folder, then click "Next"
Remote Setup Wizard

Verify the information, then click "Next"
OVF Details

Select the name and add it to where it should go (the cluster in this case)
Name

Select the cluster and then the individual host
Host Config

Select the Datastore to store the files and the format of the disk
Disk Setup

Select the Network it should be on
Network Config

Select the Timezone
Time Config

Verify the information and click "Finish"
Verify Config

Wait for the system to be successfully deployed
Installing
Completed Install

With the appliance installed, the next thing needed is the plugin for vCenter. To install it, go back to the extracted ISO folder and run "VMwareDataRecoveryPlugin.msi"
Plugin Installer

Click "Next", "Next", "Next", choose "I Agree" and "Next", "Next", "Close"
Plugin Installer

Once it's installed, click on "Plug-ins" in vCenter and then "Manage Plug-ins". The Plug-in Manager should look similar to this:
Plugin Installed