I have VMware-server-1.0.4-56528. Both guest and host are Mandriva 2008. The guest doesn't have many services running, basically just Apache. The load isn't high (0.1 - 0.4), and the guest has 256 MB of RAM (more than 80 MB free). The problem is that the guest system sometimes hangs. strace on the vmware-vmx process doesn't show anything useful: it's not blocked, I get many pages of output scrolling by, just as if the system were running normally, and I didn't notice any errors. On the guest all the logs stop. I can't access the guest on the console (the server console overall still works), nor over ssh. There is plenty of available disk space, the guest OS is almost a fresh install (1 month old), and I didn't find any huge files that could bring the system to a stop.
In the VMware log I noticed that when the guest hangs I start getting these messages:
Mar 19 12:26:25: vmx| VLANCE: Ethernet1 skipped 9 time(s)
Mar 19 12:26:25: vmx| VLANCE: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:25: vmx| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8
Mar 19 12:26:25: vcpu-1| VLANCE: Ethernet0 skipped 8 time(s)
Mar 19 12:26:25: vcpu-1| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:25: vcpu-1| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8
Mar 19 12:26:25: vcpu-0| VLANCE: Ethernet1 skipped 10 time(s)
Mar 19 12:26:25: vcpu-0| VLANCE: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:25: vcpu-0| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9
Mar 19 12:26:26: vcpu-0| VLANCE: Ethernet1 skipped 11 time(s)
Mar 19 12:26:26: vcpu-0| VLANCE: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:26: vcpu-0| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10
Mar 19 12:26:26: vmx| VLANCE: Ethernet0 skipped 9 time(s)
Mar 19 12:26:26: vmx| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:26: vmx| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 9
Mar 19 12:26:26: vcpu-1| VLANCE: Ethernet0 skipped 10 time(s)
Mar 19 12:26:26: vcpu-1| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:26: vcpu-1| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 10
Mar 19 12:26:26: vcpu-1| VLANCE: Ethernet1 skipped 12 time(s)
Mar 19 12:26:26: vcpu-1| VLANCE: 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:26: vcpu-1| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11
Mar 19 12:26:26: vcpu-0| VLANCE: Ethernet0 skipped 11 time(s)
Mar 19 12:26:26: vcpu-0| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
Mar 19 12:26:26: vcpu-0| VLANCE: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 11
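To see exactly when the skip counters start climbing relative to the hang, something like the following could be run over the log. This is only a rough sketch in Python: the function name summarize_skips and the regular expression are my own assumptions based on the lines pasted above, not anything VMware provides, and the log path has to be passed as the first argument.

```python
import re
import sys

# Matches lines such as:
#   Mar 19 12:26:25: vmx| VLANCE: Ethernet1 skipped 9 time(s)
# The pattern is an assumption derived from the log excerpt above.
SKIP_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}):\s+(?P<thread>[\w-]+)\|\s+"
    r"VLANCE:\s+(?P<nic>Ethernet\d+)\s+skipped\s+(?P<count>\d+)\s+time\(s\)"
)


def summarize_skips(lines):
    """Return {nic: (first_stamp, last_stamp, highest_count)} for skip lines."""
    summary = {}
    for line in lines:
        m = SKIP_RE.match(line.strip())
        if not m:
            continue  # counter dumps and unrelated lines are ignored
        nic, stamp, count = m["nic"], m["stamp"], int(m["count"])
        if nic not in summary:
            summary[nic] = (stamp, stamp, count)
        else:
            first, _, highest = summary[nic]
            summary[nic] = (first, stamp, max(highest, count))
    return summary


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python vlance_skips.py /path/to/vmware.log
    with open(sys.argv[1], errors="replace") as fh:
        for nic, (first, last, highest) in sorted(summarize_skips(fh).items()):
            print(f"{nic}: first {first}, last {last}, max skipped {highest}")
```

The first timestamp per interface can then be compared with the time the guest logs stop.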
This is the last view of top when it froze:
top - 12:27:33 up 1:00, 3 users, load average: 0.11, 0.14, 0.16
Tasks: 88 total, 1 running, 87 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.5%sy, 0.0%ni, 82.7%id, 10.9%wa, 5.9%hi, 0.0%si, 0.0%st
Mem: 256040k total, 175336k used, 80704k free, 72092k buffers
Swap: 1044184k total, 0k used, 1044184k free, 55052k cached
So the load is not high, and there is still free memory available.
The worst part is that I can't reproduce the problem. I pushed the load up to 2 with disk_write, but still nothing.
top - 15:24:52 up 10 min, 4 users, load average: 1.96, 1.20, 0.58
Tasks: 91 total, 4 running, 87 sleeping, 0 stopped, 0 zombie
Cpu(s): 40.2%us, 17.6%sy, 0.0%ni, 41.7%id, 0.0%wa, 0.5%hi, 0.0%si, 0.0%st
Mem: 256040k total, 252136k used, 3904k free, 2804k buffers
Swap: 1044184k total, 100k used, 1044084k free, 205608k cached
-
total-cpu-usage---- -dsk/total- -net/total- -paging-system
usr sys idl wai hiq siq| read writ| recv send| in out | int csw
15 12 61 9 1 1|1069k 464k| 0 0 | 0 136B| 60 220
38 19 40 0 2 1|2304k 0 |4326B 11k| 0 0 | 65 243
46 23 29 0 2 1|1952k 0 |2261B 5642B| 0 0 | 89 246
45 20 33 0 2 2|1192k 0 |3300B 13k| 0 0 | 76 291
Any help would be appreciated.
Thanks