Small but Full-fat VMWare install.

ddrueding

Fixture
Joined
Feb 4, 2002
Messages
19,728
Location
Horsens, Denmark
R2 wasn't explicitly on the OK list for vCenter, so I stuck with 2008 Standard.

VMware is currently offering a 40% off deal on their VSA as part of a bundle with some Essentials Plus licenses. The whole thing (1 vCenter, 1 VSA, 3 Essentials Plus, all with 1 year support) was about $11k.
 

Mercutio

Fatwah on Western Digital
Joined
Jan 17, 2002
Messages
22,269
Location
I am omnipresent
Unrelated to anything going on in this thread but... Doug, do you happen to have any idea why I'd be seeing six hour+ boot times on a Server 2003 machine after a vCenter Converter live migration?

When I've done conversions in the past with VMware Workstation or P2V converter, it's taken a while but I've wound up with a working VM. Here... six hours and I just got a login prompt. And now it's been "logging in" for 10 minutes.
 

Handruin

Administrator
Joined
Jan 13, 2002
Messages
13,920
Location
USA
I've seen long and erratic boot times in Server 2003 when the physical machine was short on RAM and ESX was swapping memory for that specific VM. The requested VM RAM size didn't fit into the envelope on the ESX host and nasty swapping occurred. Do you see any red, yellow, or blue bars (or values) in the VI client resource summary tab for that VM (red: swap, yellow: balloon, blue: compression)? Is the ESX server under any other duress (high disk latency, high CPU ready % times)?

Other possibilities with a VMware Converter migration: the virtual machine version was changed, or the underlying drivers are confused because the VMware Tools on the destination differ from those on the source.

If none of the above are causing the problem, can you stop the booting and try resetting the VM configuration back to a basic 1 vCPU and 1-2GB of RAM to see if that allows it to boot?
 

Mercutio

No and no. My host is a 16GB Xeon E1230. The only guest running on it right now is the converted 32-bit Windows 2003 machine configured with the same 2GB RAM the source system had. The guest has a physical drive all to itself.
 

ddrueding

Wow, I haven't encountered that problem either. Can you try exporting it to another ESX server or other machine running VMWare Player just to rule out the hardware?
 

Mercutio

I only have the one ESXi server on site.
Though, just to prove a point, I downloaded the .ISOs, installed SBS2003 and verified that Exchange functions in another guest on that machine since my last post. The converted guest hasn't changed from "Applying Computer Settings" in all that time, though it doesn't seem to be locked up either.

I'm wondering if disk access is just ridiculously slow for some reason. The source machine is using an LSI SCSI controller to run two drives in RAID1.

The conversion took 22 hours to make a ~88GB VM on my host. I haven't done a conversion in a couple years, but I remember making a 30GB guest in about four hours using VMware Workstation on a Core 2 Duo system, and that was over a 100Mbit link.
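
A rough sketch of the arithmetic behind that comparison, assuming the "~88GB" and "about four hours" figures are close enough to use as exact numbers and treating 1 GB as 1024 MB. The function name is only illustrative.

```python
def throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average transfer rate in MB/s, treating 1 GB as 1024 MB."""
    return size_gb * 1024 / (hours * 3600)


recent = throughput_mb_per_s(88, 22)  # the 22-hour conversion (~88GB)
older = throughput_mb_per_s(30, 4)    # the remembered 30GB Workstation conversion
link_100mbit = 100 / 8                # a 100Mbit link tops out at 12.5 MB/s

print(f"88 GB in 22 h:    {recent:.2f} MB/s")
print(f"30 GB in 4 h:     {older:.2f} MB/s")
print(f"100Mbit ceiling:  {link_100mbit:.1f} MB/s")
```

Under those assumptions the new conversion averaged roughly half the rate of the old one, and both sit well below what even a 100Mbit link can carry.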
 

Mercutio

> That is catastrophically slow. I'm able to convert machines of that size in well under 4 hours.

OK, but where's the problem? I have no counters on the ESXi machine that relate to disk activity; the CPU and RAM are where they're supposed to be, and this is still the initial start-up after conversion. I haven't even seen a desktop yet.

The source machine is still running and it performs as well as can be expected for eight-year-old hardware.

The second VM is perfectly fine and responsive. Starts up in ~2 minutes, talks to the internet, does normal Server 2003 stuff.

I'd just transfer data from the source machine if I could get away with it, but I'm dealing with the fact that the source server has a line of business app that isn't available any more, which is why I converted in the first place. It's not time critical, just... weird.
 

Handruin

If you want to capture the esxtop output to a CSV and send it to me (or put it on your website to download or something), I'd be glad to review it and see if I can find any issues with performance on any components.

I've also done conversions in the past and they haven't taken anywhere near that long to convert and then power up. I would be able to rule out latency issues, CPU ready time, or even aborts if you were to take a sample of the esxtop output for 10-20 minutes while this thing is performing badly. You will just need to identify the name of the datastore and the actual VM name as it appears in the tree so that the data can be matched to the esxtop output more easily.
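
A minimal sketch of how such a capture could be skimmed without opening it in a spreadsheet. It assumes the CSV is esxtop batch output with perfmon-style column headers of the form `\\host\Group(instance)\Counter`; the host, adapter, and VM names and the sample values below are made up for illustration, and the `summarize` helper is hypothetical, not part of any VMware tool.

```python
import csv
import io
from statistics import mean


def summarize(lines, name_filter, counter_suffix):
    """Return {column header: (mean, max)} for matching columns.

    A column matches when its header contains name_filter and ends
    with counter_suffix.
    """
    reader = csv.reader(lines)
    header = next(reader)
    wanted = [i for i, h in enumerate(header)
              if name_filter in h and h.endswith(counter_suffix)]
    samples = {i: [] for i in wanted}
    for row in reader:
        for i in wanted:
            try:
                samples[i].append(float(row[i]))
            except (ValueError, IndexError):
                continue  # skip blank or truncated cells
    return {header[i]: (mean(v), max(v)) for i, v in samples.items() if v}


# Synthetic three-sample capture (assumed header layout, invented names).
LATENCY_COL = r"\\esx1\Physical Disk(vmhba1)\Average Guest MilliSec/Command"
buf = io.StringIO()
writer = csv.writer(buf)
writer.writerow(["(PDH-CSV 4.0) (UTC)(0)",
                 r"\\esx1\Group Cpu(1234:MyVM)\% Ready",
                 LATENCY_COL])
writer.writerow(["12/06/2011 14:00:00", "1.5", "4.0"])
writer.writerow(["12/06/2011 14:00:05", "2.5", "62.0"])
writer.writerow(["12/06/2011 14:00:10", "2.0", "9.0"])
buf.seek(0)

result = summarize(buf, "Physical Disk", "MilliSec/Command")
for column, (avg, peak) in result.items():
    print(f"{column}: mean {avg:.1f} ms, max {peak:.1f} ms")
```

For a real capture, pass an open file (`open("capture.csv", newline="")`) instead of the in-memory buffer and filter on the datastore or VM name you identified in the tree.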
 

Mercutio

At this point I'll probably have to wait until another weekend to play with it any more.

I don't actually think there's an issue with the hard disk itself. I'm wondering if there's an issue with controller emulation or a driver conflict or something. I was able to create a native install of the same version of Windows (SBS2003r2) and it's as responsive as I would expect it to be. It's just the converted system that's weird.
 

Handruin

At least consider taking an esxtop snapshot if the problem persists. There may be more useful information in there than you might expect. It records a ton of info, including things like bus resets. So much that I had never seen Excel open a spreadsheet with 16,000+ columns before opening one of these. :)
 

ddrueding

The new version of Veeam has a neat field: when performing a backup, right below the backup speed in MB/s, it states what the bottleneck is (Source Disk, Target Disk, Client, Proxy, etc).
 

ddrueding

I now need to reduce the size of a VM's disk. Inside the Win2003 VM, the partition only consumes the first 140GB of the 220GB disk (the other partition has been deleted). Now I would like to make the disk itself only 140GB. Suggestions?
 

Mercutio

Boot the VM with something like a Hiren's BootCD .iso. Run something under the "partition resizer" category, then resize the disk with VMware's editor?
 

ddrueding

The partition resizing is done; it had a partition that is no longer needed. The "resize the disk with VMWare's editor" is the part I'm stuck on.

I shut down the VM, right-click on it and choose "edit". Selecting the disk shows everything greyed out. I know having active snapshots can cause this, but I don't think I have any?
 

ddrueding

With all snapshots removed it will let me increase the size of the virtual disk, but not decrease it. Next I'll try using the vSphere vCenter Converter Standalone Client (damn that is a long name) to migrate it to my desktop and then back.
 

Handruin

The VMware converter is the easiest way to reduce the size of a disk and even then it can be a pain. I researched other ways of reducing the size and it usually came back to using the VMware converter tool.

Increasing the size is much easier as you've probably seen. You could also consider using the thin disk option, but that leaves the risk of overcommitment.
 

Handruin

Any error messages in the vCenter console? I would not have resized the partition before using the vmware converter tool. It will take care of the repartitioning from what I've seen.
 

ddrueding

Didn't resize the partition, simply deleted one that wasn't needed (from when it was a physical machine years ago). I'll have the exact error code when I get into work.
 

ddrueding

I suspect the disk is failing? What would be the best practice for recovering the VM files to another drive if a clone fails? Is there a more robust method?
 

Handruin

Does the VM still power on even though you can't clone it to another location? Does it pass a filesystem scan to make sure it isn't corrupted? I'm just trying to see if there were any problems with the actual VMDK file that may have caused the clone to fail.

Can you right click on the datastore inside vCenter and "Browse Datastore.."? You can usually find the list of datastores by first clicking on the ESXi server and then looking at the summary tab. Assuming you have enough room on the workstation where you are running vCenter, you can click on the icon that looks like a drive with a down arrow and actually download the entire contents of the VM locally. From here you could then possibly use the VMware converter tool to "convert" it back to your ESXi environment as a new VM. Once you have confirmed that the new converted VM works, remove the existing one with "Delete from Disk" to remove all the contents. Obviously only do this once you're 100% confident the converted VM is operating correctly.


===========
The crude way to copy a VM is to connect to the remote tech support console and use the CLI. This feature can be enabled in vCenter by going to Configuration > Security Profile. Click on "edit", select the "Remote Tech Support (SSH)" option, and click the options button. From there you can start that service.

You should then be able to SSH into the host and then change to the VMFS directory by doing:

cd /vmfs/volumes/

In there you can list the volumes (your datastores). You will see both a non-human readable name of something like "16ec387a-e7473c5e-3c1b-e052e6a120b2" and then the name you actually used to create a datastore (like HDDBDrive).

You can then change into that directory to view the folders containing each of your VMs. Yours looks to be located in a folder named "HDDB".

From here you can also scp (or WinSCP) the folder and files to another location.
 

ddrueding

Thanks for the detailed assistance Handruin, I appreciate you taking the time.

> Does the VM still power on even though you can't clone it to another location?

The VM is powered on, and is functioning funkily (occasional 60ms+ latencies and the resulting queue length pile-up, rarely bits of information not being saved). That is one of my worries; backups are not succeeding and new data is entering the VM. I really need to be able to recover the machine at this point.

> Does it pass a filesystem scan to make sure it isn't corrupted? I'm just trying to see if there were any problems with the actual VMDK file that may have caused the clone to fail.

How do I run a scan?

> Can you right click on the datastore inside vCenter and "Browse Datastore.."?

Yup, and it shows all the files for the VM. One thing to note is that this VM almost completely fills the disk it's on (4.73GB free of 227.75GB). Not sure if this would be a factor.

> Assuming you have enough room on the workstation where you are running vCenter, you can click on the icon that looks like a drive with a down arrow and actually download the entire contents of the VM locally. From here you could then possibly use the VMware converter tool to "convert" it back to your ESXi environment as a new VM. Once you have confirmed that the new converted VM works, remove the existing one with "Delete from Disk" to remove all the contents. Obviously only do this once you're 100% confident the converted VM is operating correctly.

I don't have enough locally, but I tried "converting" to a network drive and it failed after 90 minutes at 51%. I also tried cloning the VM to another drive in the same machine, and that failed as well.

> The crude way to copy a VM is to connect to the remote tech support console and use the CLI. This feature can be enabled in vCenter by going to Configuration > Security Profile. Click on "edit", select the "Remote Tech Support (SSH)" option, and click the options button. From there you can start that service.
>
> You should then be able to SSH into the host and then change to the VMFS directory by doing:
>
> cd /vmfs/volumes/
>
> In there you can list the volumes (your datastores). You will see both a non-human readable name of something like "16ec387a-e7473c5e-3c1b-e052e6a120b2" and then the name you actually used to create a datastore (like HDDBDrive).
>
> You can then change into that directory to view the folders containing each of your VMs. Yours looks to be located in a folder named "HDDB".
>
> From here you can also scp (or WinSCP) the folder and files to another location.

I might give that a shot. Shame I can't just mount the drive on my workstation and copy the files that way, can I?
 

ddrueding

Just tried a Veeam backup, and these are the errors it throws:

12/6/2011 2:02:09 PM :: Queued for processing at 12/6/2011 2:02:09 PM
12/6/2011 2:04:55 PM :: Required resources have been assigned
12/6/2011 2:04:57 PM :: VM processing started at 12/6/2011 2:04:57 PM
12/6/2011 2:04:57 PM :: VM size: 220.0 GB
12/6/2011 2:04:57 PM :: Using source proxy VMware Backup Proxy [nbd]
12/6/2011 2:05:08 PM :: Production datastore 'SSD_OS_HDDB' is getting low on free space (4.9 GB left), and may run out of free disk space completely due to open snapshots.
12/6/2011 2:05:08 PM :: Creating snapshot
12/6/2011 2:05:15 PM :: Saving '[SSD_OS_HDDB] HDDB/HDDB.vmx'
12/6/2011 2:05:21 PM :: Saving '[SSD_OS_HDDB] HDDB/HDDB.vmxf'
12/6/2011 2:05:26 PM :: Saving '[SSD_OS_HDDB] HDDB/HDDB.nvram'
12/6/2011 2:05:30 PM :: Hard Disk 1 (220.0 GB)
12/6/2011 2:57:45 PM :: Removing snapshot
12/6/2011 3:02:49 PM :: Timed out waiting for all VDDK disks to close.
12/6/2011 3:04:16 PM :: Error: Client error: VDDK error: 1.Unknown error
Unable to retrieve next block transmission command. Number of already processed blocks: [172900].

12/6/2011 3:04:16 PM :: Busy: Source 99% > Proxy 65% > Network 0% > Target 13%
12/6/2011 3:04:16 PM :: Primary bottleneck: Source
12/6/2011 3:04:16 PM :: Processing finished with errors at 12/6/2011 3:04:16 PM
 

Handruin

> Thanks for the detailed assistance Handruin, I appreciate you taking the time.

> The VM is powered on, and is functioning funkily (occasional 60ms+ latencies and the resulting queue length pile-up, rarely bits of information not being saved). That is one of my worries; backups are not succeeding and new data is entering the VM. I really need to be able to recover the machine at this point.

> How do I run a scan?

I meant a basic Windows (assuming Windows) disk scan to ensure the NTFS filesystem wasn't corrupt. If you can't, don't sweat it; it was just another data point for me.

> Yup, and it shows all the files for the VM. One thing to note is that this VM almost completely fills the disk it's on (4.73GB free of 227.75GB). Not sure if this would be a factor.

Yes, this can absolutely be a factor. Depending on how yours is configured, the VM swap is often stored in the same location as the VM. You would see a file with an extension of vswp. That file will be the same size as the RAM allocation and in some cases can be overlooked when migrating, and you won't be able to power on your VM. This could also be a way for you to temporarily get some extra space by moving the swap file to another datastore. Depending on the size of it, we could use this in the future as a way to circumvent the current issue.

In this case, it may just be that you don't have enough free space. I cannot remember if you had any snapshots enabled on this VM. If you do, they grow over time and if left on for any length of time, it can take a considerable amount of time to remove a snapshot, especially if there are other VMs on the same datastore also containing snapshots. VMware will try to consolidate the snapshot changes into a new vmdk file which means extra space is needed when consolidating.

I don't think that's the case here, but can you check? Even if the GUI doesn't show as having snapshots in the snapshot manager, can you use the browsing tool on the datastore and see if there are any files that look like this: "<hostname>-Snapshot4.vmsn" or anything with the ".vmsn" extension? If so, then it could be a case where Veeam has put your VM into a snapshot in order to perform its backup, but was not able to remove the snapshot due to not having enough free space. I've seen problems with removing snapshots when there are space issues. There are more details in this KB article.

> I don't have enough locally, but I tried "converting" to a network drive and it failed after 90 minutes at 51%. I also tried cloning the VM to another drive in the same machine, and that failed as well.

> I might give that a shot. Shame I can't just mount the drive on my workstation and copy the files that way, can I?

If the network drive is failing, it could be a timeout issue. Instead of converting to a network drive, can you use the browsing method to download the VM to the same network drive? You'll need to have it powered down to get a complete backup. Another potential way to solve this may be to "clone" the VM rather than migrate or convert, but it doesn't sound like you have enough space to do this.

Yes, if you have the ability to mount the ESX datastore on your workstation, I believe you can copy the files, but be very careful not to let windows (assuming it's windows) put a signature or ID on the drive when it detects it. It's been a long time since I've tried this, so I'd need to review some info before recommending you try this.
 

ddrueding

> I meant a basic Windows (assuming Windows) disk scan to ensure the NTFS filesystem wasn't corrupt. If you can't, don't sweat it; it was just another data point for me.

Ah, I wasn't sure if vCenter or the ESXi machine itself had something of the sort. I'll be able to run this scan tonight when the machine is offline.

> Yes, this can absolutely be a factor. Depending on how yours is configured, the VM swap is often stored in the same location as the VM. You would see a file with an extension of vswp. That file will be the same size as the RAM allocation and in some cases can be overlooked when migrating, and you won't be able to power on your VM. This could also be a way for you to temporarily get some extra space by moving the swap file to another datastore. Depending on the size of it, we could use this in the future as a way to circumvent the current issue.

Yup. 2GB .vswp file, I'll look into relocating it.

> I don't think that's the case here, but can you check? Even if the GUI doesn't show as having snapshots in the snapshot manager, can you use the browsing tool on the datastore and see if there are any files that look like this: "<hostname>-Snapshot4.vmsn" or anything with the ".vmsn" extension? If so, then it could be a case where Veeam has put your VM into a snapshot in order to perform its backup, but was not able to remove the snapshot due to not having enough free space. I've seen problems with removing snapshots when there are space issues. There are more details in this KB article.

There is a .vmsn file, but it is 28.25KB. Should I be worried about this?

> If the network drive is failing, it could be a timeout issue. Instead of converting to a network drive, can you use the browsing method to download the VM to the same network drive? You'll need to have it powered down to get a complete backup. Another potential way to solve this may be to "clone" the VM rather than migrate or convert, but it doesn't sound like you have enough space to do this.

I tried the "clone" to another drive on the same ESX server and it failed as well. I'll try copying manually from the datastore browser; perhaps I can get all the files one at a time.

> Yes, if you have the ability to mount the ESX datastore on your workstation, I believe you can copy the files, but be very careful not to let windows (assuming it's windows) put a signature or ID on the drive when it detects it. It's been a long time since I've tried this, so I'd need to review some info before recommending you try this.

I'll leave this as a worst-case solution...don't sweat it at the moment.


Thanks again for all the help. I've attached a list of all the files in the datastore for reference.
 

Handruin

Don't worry about moving the swap file just yet. Yes, you have snapshots. :) If you're telling me that the OS sees roughly a single 230GB drive (give or take a few) and no other drive, then that means you have about 1GB+ in snapshot delta data.

Your HDDB.vmdk is your original drive. The HDDB-Snapshot4948.vmsn is the memory state at the time of the snapshot (no need to worry about it). All the other *.vmdk files are the child/redo/delta log disks that record every change that happens on the virtual disk while in snapshot mode.
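
A small sketch of that sorting rule applied to a datastore file listing. The delta-disk pattern (`<name>-000001.vmdk` and `<name>-000001-delta.vmdk`) is an assumption about how the child disks are named, and the listing below is hypothetical apart from the two filenames mentioned above; the attached file list itself is not reproduced here.

```python
import re

# Assumed naming for snapshot child disks: a six-digit sequence number
# before the extension, optionally followed by "-delta".
DELTA_RE = re.compile(r"-\d{6}(-delta)?\.vmdk$")


def classify(filename: str) -> str:
    """Label a VM file by its role, based only on its name."""
    if filename.endswith(".vmsn"):
        return "snapshot state"
    if filename.endswith(".vswp"):
        return "swap"
    if DELTA_RE.search(filename):
        return "snapshot delta disk"
    if filename.endswith(".vmdk"):
        return "base disk"
    return "other"


# Hypothetical listing for illustration.
listing = [
    "HDDB.vmdk",
    "HDDB-flat.vmdk",
    "HDDB-000001.vmdk",
    "HDDB-000001-delta.vmdk",
    "HDDB-Snapshot4948.vmsn",
    "HDDB.vswp",
    "HDDB.vmx",
]

for name in listing:
    print(f"{name:28} {classify(name)}")
```

Any file that comes back as a snapshot delta disk while the snapshot manager shows nothing is worth a closer look before trying another clone or backup.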

Is there anything listed in vCenter if you right click on that VM and go to the snapshot manager? If you're OK with removing the snapshots, this may help get things in order. You would typically delete all snapshots and this will tell ESXi to merge all those changes from all the other vmdk files into the main HDDB.vmdk file. This could take some time to merge, but once it's done, you should be able to browse the directory store and see that those files have been removed. This all assumes you do not have any snapshots you need or want to keep for specific reasons. Cleaning these up should also fix your error with Veeam.

Edit: removing snapshots while the VM is powered down is faster than removing them while it is running live. If you have the option to power off the machine to perform this task, I recommend it. It won't hurt anything to leave it powered on, but it could take longer.
 

ddrueding

Now that I was able to kick everyone out, I've got the machine shut down. Turns out that file list was taken while a backup was being attempted. After the backup was stopped, the snapshots were removed. Now that I'm in the Datastore Browser, I'm just trying a file-by-file move from the 250GB Intel SSD to a 600GB Raptor in the same machine. The main file is 230GB and it says it will take 250 minutes; that ain't right. If past attempts are any indication it won't even show a fail until more than halfway through, so this is going to be another very long night.
 

ddrueding

Copy failed again. Now trying to download to my local machine, but it shows 760 minutes to complete. Not sure what to try next, I'll do a scandisk in the OS and then try moving the drive to one of the other ESXi boxes and copying from there.
 

ddrueding

Scandisk shows nothing. Moved some spare files off the drive to free disk space and checked for updates. Now it won't start, saying it's out of disk space. Great.
 

ddrueding

Bought DiskInternals VMFS recovery program, mounted the drive and a 2TB WD to my machine, and got a copy to a FAT32 formatted disk. Now uploading the files to another disk on another running ESX server. It currently says it will take 2100 minutes to finish the copy of this 220GB file. When I started 3 hours ago it said 3500 minutes, and one (of 10?) progress bars has lit up. I'd cancel it but I don't know what else to do at this point. Maybe taking a nap and giving it a chance to finish?
 

Howell

Storage? I am Storage!
Joined
Feb 24, 2003
Messages
4,740
Location
Chattanooga, TN
Sorry, I haven't been keeping track of the troubleshooting but you might try to reduce the size of the VM by defragging and zeroing the free space. I do this twice a year in support of my backups.
 

Mercutio

Live conversion time on my 8 year old server: 17 hours and 52 minutes.
Using different physical drive than the first attempt. Transfer rate 1.7MB/sec.

Sob.
 

Mercutio

Conversion attempt #2: Heartened by ddrueding's now-transparent lies about performance, I take another shot at using the vCenter converter. 11 hours later, the conversion is at 40% with an estimated 15.5 hours to go. This machine has otherwise been completely idle the whole time according to my performance counters, sitting at 2 - 5% CPU utilization, 20% RAM and < 2% of 1Gbps LAN (I even disabled the nightly backups to spare the drives that work during the process). The data transfer rate for the conversion appears to be about 850kbps.
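
A quick sanity check on those numbers, as a sketch only. It assumes the conversion rate has been steady, and it tries both readings of "850kbps" (kilobytes per second versus kilobits per second) since the unit is ambiguous as written; all variable names are illustrative.

```python
ELAPSED_HOURS = 11
FRACTION_DONE = 0.40
RATE_K_PER_S = 850  # "850kbps": unit is ambiguous, so try both readings

# Linear projection: if 40% took 11 hours, the whole job takes 11 / 0.40.
total_hours = ELAPSED_HOURS / FRACTION_DONE
remaining_hours = total_hours - ELAPSED_HOURS

# Data moved so far under each reading of the rate, in GB (1 GB = 1024**2 KB).
seconds = ELAPSED_HOURS * 3600
done_gb_if_kilobytes = RATE_K_PER_S * seconds / 1024**2
done_gb_if_kilobits = done_gb_if_kilobytes / 8

size_if_kilobytes = done_gb_if_kilobytes / FRACTION_DONE
size_if_kilobits = done_gb_if_kilobits / FRACTION_DONE

print(f"projected total: {total_hours:.1f} h, remaining: {remaining_hours:.1f} h")
print(f"implied VM size at 850 KB/s:   {size_if_kilobytes:.1f} GB")
print(f"implied VM size at 850 kbit/s: {size_if_kilobits:.1f} GB")
```

The linear projection lands within about an hour of the converter's own 15.5-hour estimate, and only the kilobytes-per-second reading implies a total size in the neighborhood of the ~88GB VM from the first attempt. Either way the rate is under 1% of what a 1Gbps link can carry.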

Near as I can tell, the ESXi machine on the other side is around the same, basically idle.

My source's drives are year old 10krpm Cheetah SAS drives and my RAID card says they're fine and I ran a chkdsk on them Tuesday night. They're only connected at 1.5Gbps SAS, but that's more than fast enough.

Converter.exe is running at normal priority on the source. I'm transferring to a different physical drive (destination is a 1TB 7200rpm Barracuda ES) than the one I used for the last attempt. The two machines are connected to a dedicated (cheap, but still, there's nothing else plugged into it) 1Gbps switch with a total of 6 feet of cat6 between them.

Where the hell is the bottleneck?!?
 