Home Lab, Part 2…

Teaching an Old Dog…

Well, I have to say I am not happy about the ‘old’ part, but with the shoulder aches and the knee injury, not to mention the extra poundage… let’s just say I am feeling it these days!

In a previous post I discussed the home lab I am putting together; at that point I had the gear assembled but no software installed.  Since then, I have managed (during the evenings and stolen moments on the weekend) to get ESX installed and a couple of virtual machines built, running SQL and VMware vCenter.  I also pressed an old PC into service under the desk to serve as a management station, domain controller, and DNS server.  I am using a Thecus N4100PRO NAS device I picked up some years ago (yes, I know – it’s not EMC… more on that later), and that houses all my ISO files and installation binaries.  At first, everything went swimmingly…  I built the vCenter and SQL VMs on an NFS mount on the N4100PRO, and initially ran both virtual machines on the same ESX server.  Once I had vCenter built and had added the ESX hosts to my inventory, I configured the vMotion networks and tried to move one of the VMs to another server…

Well, that’s when my bliss ended.  The vMotion started, but timed out.  Then my datastore dropped out of the inventory, and of course once that happened the virtual machines read ‘inaccessible.’  I waited a little bit, but soon I lost the connection to my vCenter (because the virtual machines had lost connection to their datastore)… you get the idea.  I connected my vSphere client to the ESX hosts one at a time and restarted my SQL and vCenter servers, but my mind was already churning…  I have dedicated NICs for NFS storage, connected to a Dell PowerConnect Gigabit switch, and dedicated NICs for the COS, VM traffic, and vMotion on a couple of little NetGear GbE switches I picked up a while back.  More than sufficient for a lab, right?

But, VMware just works, doesn’t it?

I won’t take you through the whole troubleshooting process, primarily because I didn’t get anywhere myself.  Finally, I called my partner in crime, Larry Whitlock, for a second opinion.  Larry and I cover the mid-Atlantic area for EMC Enterprise customers.  Larry has been a VCP about as long as I have, and an EMC’er for much longer.  After walking him through it all on the phone, and talking it out for a while, he said, “You know, I had a problem just like this a while back…”  He suggested I check the vmkernel log for dropped packets – maybe the NICs and ESX are having an argument over MTU size or something?

There were no dropped packets in the vmkernel log, but I found evidence of a handful here and there when I ran ifconfig at the command line.  Now, I had done all my homework – I was running Intel GbE NICs that support Jumbo frames, as did the Dell switch and the NAS device.  I had followed all the correct procedures to enable Jumbo frames, and all three systems (NAS, switch, ESX) reported an MTU of 9000, as is appropriate.  Yet there were my dropped packets!  VMware was certainly trying to do what I was asking it to, but it simply couldn’t.
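A reported MTU of 9000 on every box does not prove that a 9000-byte frame survives the whole path.  One common way to check is a ping with the don’t-fragment bit set and a payload sized to fill the MTU exactly.  Below is a minimal sketch in Python, under some loud assumptions: it runs on a Linux machine on the storage network, it uses the iputils ping (that is where the -M do, -s, and -c flags come from), and the function names are my own invention.  The arithmetic is the solid part: take the MTU and subtract the 20-byte IPv4 header and the 8-byte ICMP header.

```python
import subprocess

IPV4_HEADER = 20  # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    payload = mtu - IPV4_HEADER - ICMP_HEADER
    if payload < 0:
        raise ValueError("MTU too small for an ICMP echo: %d" % mtu)
    return payload


def build_ping_command(host, mtu, count=3):
    """Build a don't-fragment ping command (Linux iputils syntax assumed)."""
    return ["ping", "-M", "do",                  # forbid fragmentation
            "-s", str(icmp_payload_for_mtu(mtu)),  # payload that fills the MTU
            "-c", str(count), host]


def path_carries_mtu(host, mtu):
    """Return True if the don't-fragment ping succeeds at this MTU."""
    result = subprocess.run(build_ping_command(host, mtu),
                            stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0
```

For an MTU of 9000 the payload works out to 8972 bytes, and for the standard 1500 it is 1472.  Calling path_carries_mtu with the NAS address and 9000 would then tell you whether a full Jumbo frame really makes the round trip, provided the sending NIC is itself set to 9000.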

Yes, I am sure someone is going to point out that the MTU setting clearly reads 1280, but that is only because I took the screenshot after running some additional tests…  I’m a new blogger – forgive me.  I was caught up in troubleshooting and didn’t have the presence of mind to capture the evidence in its pristine state.  Anyway, at Larry’s suggestion I used the ESX console to reset the vmknic and the storage vSwitch to a lower MTU – that’s right, 1280.  My performance immediately improved, and while still not stellar, it is serviceable.  vMotion actually works, and I can deploy from templates across my systems just fine… though it takes a while (but it’s just a lab!).
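For anyone who would rather find the actual ceiling than settle on a safe number like 1280, a bisection over MTU sizes is one way to go about it.  This is only a sketch: probe is a hypothetical callback that returns True when a packet of the given MTU gets through (a don’t-fragment ping, for instance), the floor of 576 is just a conservative starting point I picked, and the whole thing assumes that if one size works, every smaller size works too.

```python
def largest_working_mtu(probe, low=576, high=9000):
    """Bisect for the largest MTU in [low, high] for which probe(mtu) is True.

    Assumes that if one size gets through, every smaller size does too.
    Returns None if even the smallest size fails.
    """
    if not probe(low):
        return None
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always makes progress
        if probe(mid):
            low = mid        # mid works, so the answer is mid or larger
        else:
            high = mid - 1   # mid fails, so the answer is smaller
    return low
```

With a stand-in probe such as lambda mtu: mtu <= 1500, the function returns 1500 after about a dozen probes instead of thousands, which matters when each probe is a real ping across a flaky storage network.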

Learning the New Tricks

My lesson in all this, of course, is to change my thinking.  I have always been a software guy… Yes, I have had to deal with troublesome hardware before, but based on my background and training (Microsoft, Citrix, VMware), wondering if the hardware is doing what it is supposed to is always last on my list.  If it’s plugged in and the lights are blinking, of course it’s working!  Isn’t it?

My thanks to Larry for reminding this old dog that there is always something new to learn, and pointing out to me that I am going to have to stretch some to get comfortable in this new role.  It’s good to be reminded – even if it is somewhat humbling in the process.
