Converge already! (Struggling with WLBS)

I hate it when things don’t go my way. One server in our NLB (Network Load Balancing) cluster refused to join the cluster anymore. When I issued a wlbs start, it tried to join for a couple of seconds, but then stayed stuck in the ‘Converging’ state. A couple of times I saw an entry in the System Log: " ... does not have the same number or type of port rules ...".

I tried:

  • Reboot: worked the first time, but did not help after that
  • Compare rules 1: compared the output of wlbs display and changed all load parameters to ‘Equal’ (I normally give the servers weights that take into account the number of processors and the amount of RAM)
  • Compare rules 2: compared output of wlbs display: identical
  • Compare rules 3: compared output of regedit -e wlbs.reg.%COMPUTERNAME%.reg HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\WLBS: no significant differences
  • Recreate rules 1: recreate all 18 port rules on rogue server (so much fun :-/ ): did not help
  • Recreate rules 2: change ALL servers to use only 1 rule: did not help
  • Curse: did not help
  • Restart service: disable NLB on network adapter, press OK, re-enable NLB => Bingo! Server back in the cluster without a blink.
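
The "compare rules" steps above can be partly automated. Here is a minimal sketch, assuming you have saved the output of wlbs display (or the regedit export) from two servers as plain text. The function name diff_config and the sample labels are my own inventions for illustration, and it only does a generic line-by-line comparison; it knows nothing about the actual WLBS output format.

```python
import difflib


def diff_config(text_a, text_b, name_a="server-a", name_b="server-b"):
    """Return unified-diff lines between two configuration dumps.

    Blank lines and trailing whitespace are ignored so that only
    real differences show up.
    """
    def normalize(text):
        # Drop empty lines and strip trailing whitespace from the rest.
        return [line.rstrip() for line in text.splitlines() if line.strip()]

    return list(
        difflib.unified_diff(
            normalize(text_a),
            normalize(text_b),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )
```

Read the two dumps into strings and pass them in; an empty list means the dumps match after normalization. As my case shows, identical rules do not guarantee that the server will converge, so this only rules out one cause.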

Now I only have to re-create the 18 rules on all servers and I’m done!

Mental note to self: find out whether I can build a dedicated load-balancing device on Linux, one that (1) takes server load into account (gives work to the least busy server), (2) monitors response time on individual ports (and automatically disables non-responsive ports), and (3) has a web interface, so I can configure it from any server in the subnet.
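
To make points (1) and (2) a bit more concrete, here is a rough sketch of the selection logic such a device might use. Everything in it is hypothetical: the Backend class, the use of an active connection count as a stand-in for "server load", and the one-second timeout are assumptions of mine, not a description of any existing load balancer.

```python
import socket
from dataclasses import dataclass


@dataclass
class Backend:
    host: str
    port: int
    active_connections: int = 0  # stand-in for "server load" (assumption)
    healthy: bool = True


def port_responds(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def refresh_health(backends):
    """Mark each backend healthy or not based on a TCP connect check."""
    for backend in backends:
        backend.healthy = port_responds(backend.host, backend.port)


def pick_backend(backends):
    """Return the healthy backend with the fewest active connections, or None."""
    candidates = [b for b in backends if b.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda b: b.active_connections)
```

A real device would run refresh_health periodically and call pick_backend for each new connection. The web interface from point (3) is left out entirely.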
