
Don’t over-complicate!

Having recently relocated my home office and my home lab within my house, I have set about rebuilding my lab from scratch. As the lab evolves or my needs change, a rebuild is a good way to purge the remnants of the various experiments and tests that I’ve done. However, I will sometimes fall into the trap of trying to be too clever.

Take last night as an example. I happened to read about a piece of software called Cobbler. To save anyone having to read what is quite a lengthy man page, Cobbler manages the provisioning of operating systems from a single server. I thought it would be great if I could automate and control the complete rebuild of my entire lab from bare metal to fully functional at the touch of a few buttons with my QNAP NAS acting as the Cobbler server.

After a little more research, I grabbed the source code and tried to shoe-horn it onto my NAS. Partway through, having run into problems, I realized that I was vastly over-complicating this rebuild. Let’s face it, how many times do I actually need to reinstall everything from the ground up? Once or maybe twice per major release at most.

Thankfully I only wasted an evening on it, although it was fun. I might still try to work it out in the future, but there are more important things to do in the meantime.


Fixing “DRS Invocation Not Completed”

I ran into an error today that I hadn’t seen before. My vSphere 5.0 cluster displayed the message “DRS invocation not completed” on the Summary tab, and I noticed that it had stopped moving VMs around automatically too.

screenshot305

I tried changing some of the DRS settings and running DRS from the DRS tab of the cluster just to try and see if that would get things going but to no avail.

Interestingly, I couldn’t find any mention of the message on the VMware KB site or anything useful on Google. I was tempted to turn DRS off completely and then re-enable it, but that would have removed my resource pools.

It was then that I noticed that some of the hosts weren’t reporting any memory or CPU utilisation even though I knew them to be hosting VMs.

screenshot307

As an experiment, I tried disconnecting and then reconnecting these hosts in turn. Once they were reconnected, I started seeing DRS-initiated vMotions rebalance the cluster, and the message disappeared from the cluster’s Summary tab.

So, I’m not sure why it happened but a simple, non-disruptive solution fixed it.

Just thought I’d share…


Do my ESXi hosts have the same VLANs?

In a small vSphere environment that I’ve recently been working on, I started to notice that some of my VMs were disappearing off the network from time to time. Reboots of the VM didn’t seem to fix the issue but a quick vMotion of the VM to another host did.

If you haven’t figured it out yet, one of my hosts was missing a VLAN and VMs connected to a certain portgroup were affected whenever they ran on the host.

vSphere will warn you if a host that you’re trying to migrate a VM to doesn’t have the right portgroup, and host profiles (if you’re using Enterprise Plus licensing) will alert you when a portgroup isn’t configured with the right VLAN ID. Nowhere in vSphere, however, will you get an alert if a required VLAN is not being presented to a host, so you have to use other means to check this.

You could manually examine the properties of each physical NIC in turn but that could take some time. The method that I used on this occasion was a PowerCLI script. I could have written one myself but a quick Google search led me to a script written by Luc Dekens that did what I wanted already (and a little more besides). I modified it to suit my needs (demonstrating to the person in the remote datacenter that there was a network misconfiguration) and ran it. The output is below:

[ps]Host:  esx1.mydomain.com
vmnic0  VLAN224 VLAN227
vmnic1  VLAN224 VLAN227
vmnic2  VLAN250 VLAN252 VLAN251
vmnic3  VLAN250 VLAN252 VLAN251

Host:  esx2.mydomain.com
vmnic0  VLAN227 VLAN226 VLAN224
vmnic1  VLAN227 VLAN226 VLAN224
vmnic2  VLAN251 VLAN252 VLAN250
vmnic3  VLAN251 VLAN252 VLAN250

Host:  esx3.mydomain.com
vmnic0  VLAN224 VLAN227 VLAN226
vmnic1  VLAN224 VLAN227 VLAN226
vmnic2  VLAN250 VLAN252 VLAN251
vmnic3  VLAN250 VLAN252 VLAN251

Host:  esx4.mydomain.com
vmnic0  VLAN224 VLAN226
vmnic1  VLAN224 VLAN226
vmnic2  VLAN250 VLAN251
vmnic3  VLAN250 VLAN251 VLAN252[/ps]

As you can see, the VLANs presented to the four hosts that I ran it against are not consistent, and vmnic2 on esx4, which is missing VLAN252, was the one causing my problems. The vmnics are supposed to be paired (vmnic0/vmnic1 in one pair and vmnic2/vmnic3 in another), with identical configuration across the hosts.
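With only four hosts the gaps can be spotted by eye, but the same comparison is easy to automate. What follows is a minimal Python sketch and not part of the PowerCLI script: it parses text laid out like the output above and reports which VLANs each vmnic lacks. The function names (`parse_report`, `find_missing`) are my own, and it assumes that the "expected" set for a vmnic is simply every VLAN seen on that vmnic name on any host. In a real environment the expected list should come from the network design instead.

```python
from collections import defaultdict

# Report text in the same layout as the output above.
REPORT = """\
Host:  esx1.mydomain.com
vmnic0  VLAN224 VLAN227
vmnic1  VLAN224 VLAN227
vmnic2  VLAN250 VLAN252 VLAN251
vmnic3  VLAN250 VLAN252 VLAN251
Host:  esx2.mydomain.com
vmnic0  VLAN227 VLAN226 VLAN224
vmnic1  VLAN227 VLAN226 VLAN224
vmnic2  VLAN251 VLAN252 VLAN250
vmnic3  VLAN251 VLAN252 VLAN250
Host:  esx3.mydomain.com
vmnic0  VLAN224 VLAN227 VLAN226
vmnic1  VLAN224 VLAN227 VLAN226
vmnic2  VLAN250 VLAN252 VLAN251
vmnic3  VLAN250 VLAN252 VLAN251
Host:  esx4.mydomain.com
vmnic0  VLAN224 VLAN226
vmnic1  VLAN224 VLAN226
vmnic2  VLAN250 VLAN251
vmnic3  VLAN250 VLAN251 VLAN252
"""


def parse_report(text):
    """Return {host: {vmnic: set of VLAN names}} from the report text."""
    hosts = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("Host:"):
            current = line.split(":", 1)[1].strip()
            hosts[current] = {}
        elif current is not None:
            nic, *vlans = line.split()
            hosts[current][nic] = set(vlans)
    return hosts


def find_missing(hosts):
    """Return {(host, vmnic): VLANs seen on that vmnic elsewhere but not here}."""
    # Assumption: the union across all hosts is the intended VLAN set per vmnic.
    expected = defaultdict(set)
    for nics in hosts.values():
        for nic, vlans in nics.items():
            expected[nic] |= vlans
    missing = {}
    for host, nics in hosts.items():
        for nic, vlans in nics.items():
            gap = expected[nic] - vlans
            if gap:
                missing[(host, nic)] = gap
    return missing


hosts = parse_report(REPORT)
missing = find_missing(hosts)
for (host, nic), vlans in sorted(missing.items()):
    print(f"{host} {nic} is missing {' '.join(sorted(vlans))}")
```

Run against the output above, this flags vmnic2 on esx4 as missing VLAN252, along with the vmnic0/vmnic1 differences on esx1 (no VLAN226) and esx4 (no VLAN227).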

The modified script that I used is attached below. Many thanks, as always, LucD.

Show-PNICVLANs.ps1


Reset VM Stuck at 95%

I’m not convinced that this is supported, but it did work. As with anything on a blog, use at your own risk.

I was working on rebuilding my home lab and wanted to clear down the host that my vCenter VM was sitting on. Before doing that I wanted to rescue some files from it (long story). For some reason it hung on me and wouldn’t respond so I tried to reset it. This process got as far as 95% and then got stuck 🙁

One way to unstick such a VM is to SSH onto the host that it’s running on and use the vm-support command. How?

Run “vm-support -x” to show the world IDs of the running VMs on the host:

The one that I wanted was 9190. Using “vm-support -X 9190” and answering “y” to the three questions that follow will, eventually, give you control of the VM back without affecting anything else. Just remember, try it at your own risk 🙂