I hate when things don’t go my way. One server in our NLB (Network Load Balancing) cluster refused to join the cluster anymore. When I issued a wlbs start, it tried for a couple of seconds to join the cluster, but then stayed stuck in the ‘Converging’ state. A couple of times I saw an entry in the System Log: " ... does not have the same number or type of port rules ...".
I tried:
- Reboot: worked the first time, but not after that
- Compare rules 1: compared output of wlbs display: changed all load parameters to ‘Equal’ (I normally give the servers weights that take into account the # of processors and #MB RAM)
- Compare rules 2: compared output of wlbs display: identical
- Compare rules 3: compared output of regedit -e wlbs.reg.%COMPUTERNAME%.reg HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WLBS: no significant differences
- Recreate rules 1: recreate all 18 port rules on rogue server (so much fun :-/ ): did not help
- Recreate rules 2: change ALL servers to use only 1 rule: did not help
- Curse: did not help
- Restart service: disable NLB on network adapter, press OK, re-enable NLB => Bingo! Server back in the cluster without a blink.
Now I only have to re-create the 18 rules on all servers and I’m done!
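Comparing the rule dumps by eye across servers gets old fast, so here is a minimal Python sketch of how that comparison could be automated. It assumes you have already saved each server's wlbs display output (or registry export) as text; the function name diff_dumps, the ignore parameter and the sample lines are my own placeholders and do not reflect the real wlbs output format.

```python
import difflib


def diff_dumps(text_a, text_b, label_a="server-a", label_b="server-b", ignore=()):
    """Return unified-diff lines between two text dumps.

    Lines containing any substring in `ignore` are dropped first, so that
    values expected to differ per host (names, addresses) do not show up
    as noise. Blank lines and trailing whitespace are ignored as well.
    """
    def clean(text):
        return [
            line.rstrip()
            for line in text.splitlines()
            if line.strip() and not any(token in line for token in ignore)
        ]

    return list(
        difflib.unified_diff(
            clean(text_a), clean(text_b),
            fromfile=label_a, tofile=label_b, lineterm="",
        )
    )


if __name__ == "__main__":
    # Placeholder dumps: hypothetical lines, NOT real wlbs output.
    good = "host=web1\nrule 80 equal\nrule 443 equal\n"
    rogue = "host=web2\nrule 80 equal\nrule 443 weight 50\n"
    for line in diff_dumps(good, rogue, "web1", "web2", ignore=("host=",)):
        print(line)
```

An empty result means the two dumps match once the ignored lines are filtered out. In my case that would only have confirmed what I already saw: the rules were identical and the problem was elsewhere.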
Mental note to self: check out if I can build a dedicated load balancing device in Linux, one that (1) takes into account server load (gives work to the least busy server), (2) watches response time on individual ports (and automatically disables non-responsive ports), and (3) has a web interface, so I can configure it from any server in the subnet.
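To make points (1) and (2) a bit more concrete, here is a rough Python sketch of the selection logic only: skip backends whose port does not answer, then hand the work to the one with the lowest load. Everything here is an assumption for illustration: the Backend class, the load field (how that number would be collected is left open) and the pick_backend function are hypothetical names, and a real balancer would need much more than this.

```python
import socket
from dataclasses import dataclass


@dataclass
class Backend:
    host: str
    port: int
    load: float  # hypothetical load metric; lower means less busy


def port_responds(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_backend(backends, is_alive=None):
    """Pick the least busy backend among those that respond.

    `is_alive` is an optional callable taking a Backend; it defaults to a
    plain TCP connect check. Returns None when no backend responds.
    """
    check = is_alive or (lambda b: port_responds(b.host, b.port))
    alive = [b for b in backends if check(b)]
    if not alive:
        return None
    return min(alive, key=lambda b: b.load)
```

Passing the health check in as a parameter keeps the selection rule testable without touching the network, and leaves room to swap the TCP connect for a response-time measurement later.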