Hi!
If you were hit by the "sudden loss of network connectivity" and want to discuss how to solve it by switching the host's NIC or doing some other magic, please go here: http://communities.vmware.com/thread/91454?tstart=0
If you are able to do a more in-depth analysis and can check whether your networking issues show the same symptoms I'm seeing, or if you are a VMware support engineer, then this thread is for you!
I'm reporting my in-depth analysis here because I think it contains valuable information.
Personally, I assume it's a VMware bug.
repost:
I hadn't seen the network issue for a long time, but a customer had it today and I had a chance to analyze it.
One VM he was using for admin purposes (Cisco tools) lost its BRIDGED network connection.
Host is Windows 2003 Standard SP2, guest is Windows XP Professional.
I could not make it work again; rebooting the VM didn't help.
I switched the VM from bridged networking to host-only networking and set a different IP inside the VM, but that didn't help either.
I also changed the VM's NIC from AMD PCnet to e1000, but that didn't help either.
Here are some more details:
ping
from host (192.168.109.1)
to VM (192.168.109.10)
connected via vmnet1
ping request times out
C:\Programme\VMware\VMware Server>vnetsniffer /e vmnet1
len 74 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 IP src 192.168.109.1 dst 192.168.109.10 ICMP ping request
len 74 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 IP src 192.168.109.1 dst 192.168.109.10 ICMP ping request
len 74 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 IP src 192.168.109.1 dst 192.168.109.10 ICMP ping request
As we can see, the packets from the host appear on vmnet1, but there is no response from the VM.
Now the other way round:
ping
from VM (192.168.109.10)
to host (192.168.109.1)
Ping inside the VM reports "destination host unreachable".
Let's take a look:
C:\Programme\VMware\VMware Server>vnetsniffer /e vmnet1
len 42 src 00:0c:29:07:db:b4 dst ff:ff:ff:ff:ff:ff ARP sender 00:0c:29:07:db:b4 192.168.109.10 target 00:00:00:00:00:00 192.168.109.1 ARP request
len 42 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 ARP sender 00:50:56:c0:00:01 192.168.109.1 target 00:0c:29:07:db:b4 192.168.109.10 ARP reply
len 42 src 00:0c:29:07:db:b4 dst ff:ff:ff:ff:ff:ff ARP sender 00:0c:29:07:db:b4 192.168.109.10 target 00:00:00:00:00:00 192.168.109.1 ARP request
len 42 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 ARP sender 00:50:56:c0:00:01 192.168.109.1 target 00:0c:29:07:db:b4 192.168.109.10 ARP reply
len 42 src 00:0c:29:07:db:b4 dst ff:ff:ff:ff:ff:ff ARP sender 00:0c:29:07:db:b4 192.168.109.10 target 00:00:00:00:00:00 192.168.109.1 ARP request
len 42 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 ARP sender 00:50:56:c0:00:01 192.168.109.1 target 00:0c:29:07:db:b4 192.168.109.10 ARP reply
len 42 src 00:0c:29:07:db:b4 dst ff:ff:ff:ff:ff:ff ARP sender 00:0c:29:07:db:b4 192.168.109.10 target 00:00:00:00:00:00 192.168.109.1 ARP request
len 42 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 ARP sender 00:50:56:c0:00:01 192.168.109.1 target 00:0c:29:07:db:b4 192.168.109.10 ARP reply
As we can see, the VM sends ARP requests onto vmnet1, the packets pass through vmnet1 and reach the vmnet1 virtual host interface, and the host sends ARP replies.
I can see that the host has learned the correct MAC/IP of the VM (arp -a), but it seems that the VM never receives those ARP replies.
According to its interface statistics, the VM doesn't receive ANY packets.
That would explain why the VM keeps sending the same ARP request again and again.
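If you want to check your own captures for the same pattern, here is a minimal Python sketch. It assumes the ARP lines look exactly like the vnetsniffer output quoted above; the function names and the threshold of 3 are my own choices, not anything from VMware. It flags IP pairs where the same ARP request keeps repeating even though a reply was visible on the vmnet.

```python
import re
from collections import Counter

# Matches the ARP lines as they appear in the vnetsniffer output quoted in this post.
ARP_LINE = re.compile(
    r"src (?P<src>[0-9a-f:]{17}) dst (?P<dst>[0-9a-f:]{17}) ARP "
    r"sender \S+ (?P<sender_ip>\S+) target \S+ (?P<target_ip>\S+) "
    r"ARP (?P<kind>request|reply)"
)

# Two lines copied from the capture above, repeated to mimic the loop.
SAMPLE = [
    "len 42 src 00:0c:29:07:db:b4 dst ff:ff:ff:ff:ff:ff ARP sender 00:0c:29:07:db:b4 "
    "192.168.109.10 target 00:00:00:00:00:00 192.168.109.1 ARP request",
    "len 42 src 00:50:56:c0:00:01 dst 00:0c:29:07:db:b4 ARP sender 00:50:56:c0:00:01 "
    "192.168.109.1 target 00:0c:29:07:db:b4 192.168.109.10 ARP reply",
] * 4


def summarize_arp(lines):
    """Count ARP requests and replies per (asking IP, asked-for IP) pair."""
    requests, replies = Counter(), Counter()
    for line in lines:
        m = ARP_LINE.search(line)
        if not m:
            continue  # not an ARP line (e.g. ICMP or the command prompt)
        if m["kind"] == "request":
            requests[(m["sender_ip"], m["target_ip"])] += 1
        else:
            # A reply travels the other way, so swap the IPs to match the request key.
            replies[(m["target_ip"], m["sender_ip"])] += 1
    return requests, replies


def unheard_replies(lines, threshold=3):
    """Pairs whose request repeats at least `threshold` times although replies were seen."""
    requests, replies = summarize_arp(lines)
    return [pair for pair, n in requests.items()
            if n >= threshold and replies[pair] > 0]


for asker, wanted in unheard_replies(SAMPLE):
    print(f"{asker} keeps asking for {wanted} although replies are on the wire")
```

A hit only means the replies were on the vmnet and the requester still asked again; it does not prove where the replies got lost.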
further info:
If I assign the VM's network identity (ethernet0.generatedAddress) to a different VM on the same host (e.g. a Linux VM instead of the Windows one), the problem remains and the other VM inherits the network issue.
The Linux VM now shows the same symptom as the Windows VM.
BUT if I change the VM's MAC address to a different one, the problem goes away.
If I revert the MAC, the problem reappears.
It seems that the VMware virtual networking/switch has got "stuck" on that specific MAC address and doesn't forward any packets into a VM with that MAC anymore.
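For anyone who wants to try the MAC change as a workaround: one way to do it is a static MAC in the .vmx file. The sketch below is only an illustration and rests on assumptions. As far as I know, the keys ethernet0.addressType and ethernet0.address and the manual range 00:50:56:00:00:00 to 00:50:56:3F:FF:FF come from VMware's documentation on manually assigned MACs, but please verify this for your product version. The helper name is made up. Only edit the .vmx while the VM is powered off, and keep a backup.

```python
import re

# Assumed manual MAC range: 00:50:56:00:00:00 .. 00:50:56:3F:FF:FF
STATIC_MAC = re.compile(r"^00:50:56:[0-3][0-9a-f](:[0-9a-f]{2}){2}$", re.IGNORECASE)


def with_static_mac(vmx_text, mac, nic="ethernet0"):
    """Return the .vmx text with the generated MAC settings replaced by a static MAC."""
    if not STATIC_MAC.match(mac):
        raise ValueError(f"{mac} is outside the assumed static range 00:50:56:00-3F:xx:xx")
    # Settings that describe the old (generated) address and must not survive.
    drop = {
        f"{nic}.addresstype",
        f"{nic}.address",
        f"{nic}.generatedaddress",
        f"{nic}.generatedaddressoffset",
    }
    kept = [line for line in vmx_text.splitlines()
            if line.split("=")[0].strip().lower() not in drop]
    kept.append(f'{nic}.addressType = "static"')
    kept.append(f'{nic}.address = "{mac}"')
    return "\n".join(kept) + "\n"
```

Typical use would be to read the .vmx file, pass its text and a new address such as "00:50:56:00:10:01" to with_static_mac(), and write the result back.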
so we have:
packet from guest-os -> vmnic -> vmnet1 -> vmnet1-host-nic -> host-os --->OK!
packet from host-os -> vmnet1-host-nic -> vmnet1 --|||here must be a problem||| -> vmnic ->guest-os --->NotOK!!!
To me, this really looks like an issue with VMware virtual networking, i.e. the virtual hub/switch implementation.
Add-on information, found some days later:
Since I could move the problem from a Windows VM to a Linux VM (by assigning the same MAC address to that VM) and reproduce it there immediately, here is a really crazy observation:
If I put the virtual Ethernet interface inside the guest into promiscuous mode ("tcpdump -i eth0" or "ifconfig eth0 promisc up"), the problem is gone immediately!
Here is some more debug output:
Ping from host to VM (which doesn't work):
C:\Programme\VMware\VMware Server>vnetstats /lines:1 /interval:1000 vmnet1
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22278 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22278 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22278 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22278 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22279 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22279 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22279 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22279 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22279 26104 0 0 0 0 0 0 0 0 0 0
As we can see, vmnet1 is receiving packet(s) but doesn't transmit any.
Now, after putting eth0 in the VM into promiscuous mode:
C:\Programme\VMware\VMware Server>vnetstats /lines:1 /interval:1000 vmnet1
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22302 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22303 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22303 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22303 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22303 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22303 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22303 26104 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22305 26106 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22307 26108 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22309 26110 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22311 26112 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22313 26114 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22317 26118 0 0 0 0 0 0 0 0 0 0
Ports Rcv Xmt BrRcv BrXmt Err Dr NoP NoB Err Dr NoP NoB
4 3 22319 26120 0 0 0 0 0 0 0 0 0 0
As we can see, packets are now being received and transmitted, and all is well.
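To compare runs like the two above without reading the numbers by eye, here is a small Python sketch that turns vnetstats output into per-interval deltas. One loud assumption: the header line in my output has fewer names than the rows have values, so I don't rely on the column names at all. I simply take the third and fourth number of each row (index 2 and 3), which are the two counters I read above as received and transmitted. The function names are my own.

```python
def counter_rows(lines):
    """Yield the numeric rows of vnetstats output, skipping header and prompt lines."""
    for line in lines:
        fields = line.split()
        if fields and all(f.isdigit() for f in fields):
            yield [int(f) for f in fields]


def deltas(lines, rcv_col=2, xmt_col=3):
    """Per-interval change of the two counters (column positions are an assumption)."""
    rows = list(counter_rows(lines))
    return [(b[rcv_col] - a[rcv_col], b[xmt_col] - a[xmt_col])
            for a, b in zip(rows, rows[1:])]


def one_way(lines):
    """True if the first counter grew while the second one never moved."""
    d = deltas(lines)
    return sum(r for r, _ in d) > 0 and sum(x for _, x in d) == 0
```

Fed with the first capture, one_way() should come back True (one counter creeps up, the other stays at 26104); fed with the capture taken in promiscuous mode, it should be False because both counters move.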
So it seems the virtual switch implementation has a problem delivering packets
with a specific MAC address to specific virtual switch ports, or to the VM with that MAC address.
Who else has seen the same issue?
How can we debug this problem further?
How can I reset the "virtual switch/hub" on Windows?