VMware E1000 virtual NIC issue
While deploying a new virtual ISA server this week I hit a rather unfortunate error that caused VMware Virtual Center to crash. Luckily the problem was fairly easy to reverse and is easy to avoid in future, but I think it is worth a post because at first glance the error message looks fairly catastrophic.
In VMware ESX 3 you can have a maximum of 4 NICs per virtual machine. In most scenarios this is more than adequate, but in some situations you want more network cards than this, e.g. in an ISA server where you want to separate out multiple DMZ networks.
One way around this limitation is to use the Intel e1000 emulation when creating the virtual network cards. The network card that appears in Windows is then an emulated version of the Intel e1000 chipset. Intel provide a Windows driver for this chip that supports VLAN tagging. Windows sees the original NIC, and in the driver properties you can specify extra VLANs that the card will operate on. Each extra VLAN creates a second “fake” adaptor that Windows sees as an entirely separate NIC, able to communicate only on your chosen VLAN. You can carry on adding multiple “virtual-virtual” NICs in this way. Each “virtual-virtual” NIC then behaves in Windows exactly as if it were a separate physical network card, with its own protocol settings and so on.
So in the diagram above Windows would see 6 NICs attached, but as far as VMware is concerned there is only one NIC installed in the VM.
ESX: 3.5.0 158874
Virtual Center: 2.5 147633
Windows Server 2003 X86 Standard
After creating my extra NICs in Windows I went back into Virtual Center and everything suddenly disappeared! Some ping testing showed that the ESX hosts were still functioning correctly but that the Virtual Center server had crashed. When I logged into the Virtual Center server I found that the VMware services had stopped. Restarting the service immediately caused it to fall over again. After looking around in the VMware logs I found the following error:
An unrecoverable problem has occurred, stopping the VMware VirtualCenter service. Check database connectivity before restarting. Error: Error[VdbODBCError] (-1) “ODBC error: (23000) – [Microsoft][SQL Native Client][SQL Server]Violation of PRIMARY KEY constraint ‘PK_VPX_IP_ADDRESS’. Cannot insert duplicate key in object ‘dbo.VPX_IP_ADDRESS’.” is returned when executing SQL statement “INSERT INTO VPX_IP_ADDRESS (ENTITY_ID, DEVICE_ID, IP_ADDRESS) VALUES (?, ?, ?)
The error seems to occur because Virtual Center keeps a record of the IP addresses associated with the guest virtual machines. When there are multiple “virtual-virtual” NICs in Windows, a single VMware NIC can report several identical IP addresses (namely 0.0.0.0 if the adaptors are on DHCP and not connected yet). When Virtual Center wrote these IP addresses into its internal database it hit a primary key collision and fell over. The solution was to remove the VLAN tagged network cards in Windows, after which the Virtual Center services could be restarted.
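To illustrate the kind of collision described above, here is a minimal Python sketch that uses an in-memory SQLite table as a stand-in for the Virtual Center database. The composite primary key over all three columns is my assumption (the error message only names the constraint, not its definition), and the function name and the entity and device IDs are made up for the example.

```python
import sqlite3


def insert_guest_ips(rows):
    """Insert (entity_id, device_id, ip_address) rows into a stand-in table.

    Returns None on success, or the database error text on a key collision.
    """
    conn = sqlite3.connect(":memory:")
    # Assumed schema: the real VPX_IP_ADDRESS definition is not shown in the
    # error message, so a key over all three inserted columns is a guess.
    conn.execute(
        "CREATE TABLE VPX_IP_ADDRESS ("
        " ENTITY_ID INTEGER, DEVICE_ID INTEGER, IP_ADDRESS TEXT,"
        " PRIMARY KEY (ENTITY_ID, DEVICE_ID, IP_ADDRESS))"
    )
    try:
        conn.executemany(
            "INSERT INTO VPX_IP_ADDRESS (ENTITY_ID, DEVICE_ID, IP_ADDRESS) "
            "VALUES (?, ?, ?)",
            rows,
        )
        return None
    except sqlite3.IntegrityError as exc:
        # Two identical rows for the same VM and device violate the key.
        return str(exc)
    finally:
        conn.close()


# Hypothetical VM 42 with one VMware NIC (device 4000) reporting two
# unconfigured "virtual-virtual" adaptors, both at 0.0.0.0.
print(insert_guest_ips([(42, 4000, "0.0.0.0"), (42, 4000, "0.0.0.0")]))
```

With distinct addresses on the same device the inserts go through, which matches the behaviour I saw once each adaptor had its own IP address.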
To successfully add the VLAN tagged NICs in Windows you need to add them one at a time and make sure that you assign an IP address to each one before adding the next. This makes sure there are no IP collisions in the Virtual Center database!
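The rule above can be written down as a small check to run before adding the next adaptor. This is only a sketch: the function names are my own, and it assumes you have already collected the current adaptor addresses by hand (for example from ipconfig) into a list of strings.

```python
from collections import Counter

UNCONFIGURED = "0.0.0.0"


def duplicate_addresses(addresses):
    """Return the sorted list of IP addresses that appear more than once."""
    counts = Counter(addresses)
    return sorted(ip for ip, n in counts.items() if n > 1)


def safe_to_add_next(addresses):
    """True only if every existing adaptor has its own configured address.

    An adaptor still at 0.0.0.0 counts as unsafe, because the next new
    adaptor would also start at 0.0.0.0 and produce a duplicate.
    """
    return not duplicate_addresses(addresses) and UNCONFIGURED not in addresses


# Example addresses are made up for illustration.
print(safe_to_add_next(["10.0.0.5", "10.0.1.5"]))
print(safe_to_add_next(["10.0.0.5", "0.0.0.0"]))
```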