High Availability with heartbeat and ldirectord

tate_harmann · Jun 14, 2006

Hello,
I am setting up a highly available, load-balancing Apache cluster. I think I have everything in place, and everything works except the load balancing. Heartbeat is used for the failover and works fine. I am using source hashing as the scheduling method for ldirectord. Ldirectord does see the two nodes, as the output of "ipvsadm -L -n" shows:
SLES9-CLUSTER1:~ # ipvsadm -L -n
IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.200.79:80 sh
  -> 192.168.200.78:80            Route   1      0          0
  -> 192.168.200.77:80            Local   1      0          0

And when I shut down one of the boxes, it is pulled from the pool, and the master role moves to the other box like it is supposed to. However, the actual web request on port 80 fails when it goes to the non-local node (192.168.200.78 in the above example). It comes through fine on the local node, so about half of the web requests fail. I did enable IP forwarding; is there anything else I need to do? Oh, this is SUSE Linux Enterprise Server 9, and the service address gets bound to eth0 as eth0:0. I don't know if this is right, but most of the examples I found online set up the service address as lo:0.
I can post some config files if needed.

thank you,

tate_harmann · Jun 15, 2006

I guess I just modified the config found here:
http://www.howtoforge.org/high_availability_loadbalanced_apache_cluster

All I need is step 6, everything else works. However, I did mine with only two boxes instead of four. Each has a loadbalancer and http service on it. Only one loadbalancer is active at a time, but I still want both boxes to balance the http requests. There is an article I found on doing this very setup:
http://www.ultramonkey.org/2.0.1/topologies/sl-ha-lb-eg.html

However, I needed to tweak mine a little as I am running SLES 9. I am basically using a hybrid config between the two tutorials.

thanks,

noahlau · Jun 15, 2006

tate_harmann said:

I guess I just modified the config found here:
http://www.howtoforge.org/high_availability_loadbalanced_apache_cluster

All I need is step 6, everything else works. However, I did mine with only two boxes instead of four. Each has a loadbalancer and http service on it. Only one loadbalancer is active at a time, but I still want both boxes to balance the http requests. There is an article I found on doing this very setup:
http://www.ultramonkey.org/2.0.1/topologies/sl-ha-lb-eg.html

However, I needed to tweak mine a little as I am running SLES 9. I am basically using a hybrid config between the two tutorials.

thanks,

In an HA cluster, only one load balancer is active at a time. The other one is a standby load balancer, which becomes active when the primary load balancer fails.

falko · Jun 15, 2006

tate_harmann said:

However, the actual web request on port 80 fails when going to the non-local node (192.168.200.78 in the above example.) It will come through fine on the local node.

I only see local IP addresses in your post...

tate_harmann · Jun 15, 2006

Sorry, what I mean is the output of the "ipvsadm -L -n" command lists the nodes as local or route:
TCP 192.168.200.79:80 sh
-> 192.168.200.78:80 Route 1 0 0
-> 192.168.200.77:80 Local 1 0 0

They are all private IP addresses. 192.168.200.79 is my virtual address; .77 is the active load balancer but is also an available node to receive HTTP requests. .78 is the other node, but I'm not sure whether the load balancer is passing requests to that node. Since half of my requests were failing, I assumed the ones that failed were the ones getting forwarded to .78 and then dropped.
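
For anyone reading the table above: in the Forward column, "Route" marks a real server reached by direct routing and "Local" marks the director itself. A minimal sketch, assuming the whitespace-separated layout shown in this thread, that pulls the real servers out of "ipvsadm -L -n" output. The function name list_real_servers is made up for illustration and is not part of ipvsadm.

```shell
#!/bin/sh
# Hypothetical helper (not part of ipvsadm): read "ipvsadm -L -n" output
# on stdin and print one line per real server:
#   <virtual service> <real server> <forwarding method>
list_real_servers() {
    awk '
        # A virtual service line starts with the protocol name.
        $1 == "TCP" || $1 == "UDP" { virtual = $2; next }
        # A real server line starts with "->" followed by a numeric address;
        # the column header line also starts with "->" and is skipped here.
        $1 == "->" && $2 ~ /^[0-9]/ { print virtual, $2, $3 }
    '
}

# Sample taken from the output posted in this thread.
sample='IP Virtual Server version 1.2.0 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port Forward Weight ActiveConn InActConn
TCP 192.168.200.79:80 sh
  -> 192.168.200.78:80 Route 1 0 0
  -> 192.168.200.77:80 Local 1 0 0'

printf '%s\n' "$sample" | list_real_servers
```

On a live director the same function could be fed with "ipvsadm -L -n | list_real_servers".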

tate_harmann · Jun 15, 2006

OK, I think I found my problem here:
The Linux Virtual Server has three different ways of forwarding packets: Network Address Translation (NAT), IP-IP encapsulation or tunnelling and Direct Routing.

* Direct Routing: Packets from end users are forwarded directly to the real server. The IP packet is not modified, so the real servers must be configured to accept traffic for the virtual server's IP address. This can be done using a dummy interface, or packet filtering to redirect traffic addressed to the virtual server's IP address to a local port. The real server may send replies directly back to the end user. That is if a host based layer 4 switch is used, it may not be in the return path.

I need to set up an IP alias on my loopback (lo:0) so that the Apache web server accepts connections for the virtual IP address (192.168.200.79). However, the tutorial explains how to do it on Debian. Do you know how it is done on SLES 9? I'll check in the meantime. Thanks,

tate_harmann · Jun 15, 2006

Yes,
That was the problem. I just did:

ifconfig lo:0 192.168.200.79 255.255.255.255

to add the alias, and the server started accepting requests.
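
As a hedged sketch of the same step: the helper below only prints the commands for a direct-routing real server and does not run them. It uses the ifconfig form with an explicit "netmask" keyword, which is how the alias is usually written, and the ARP sysctl keys are the ones posted for the nodes further down in this thread. The function name print_dr_realserver_setup and the interface name eth0 are assumptions for illustration.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands that let a
# direct-routing real server accept traffic for the virtual IP.
# print_dr_realserver_setup is a hypothetical helper name.
print_dr_realserver_setup() {
    vip=$1
    # Keep the node from answering or announcing ARP for the virtual IP.
    # eth0 is an assumption; use the interface facing the load balancer.
    for key in all.arp_ignore=1 eth0.arp_ignore=1 \
               all.arp_announce=2 eth0.arp_announce=2; do
        echo "sysctl -w net.ipv4.conf.$key"
    done
    # Host-only netmask so the alias covers just the virtual IP itself.
    echo "ifconfig lo:0 $vip netmask 255.255.255.255 up"
}

print_dr_realserver_setup 192.168.200.79
```

Review the printed commands before running any of them as root. An alias added this way does not survive a reboot.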

thanks,

_stephan_ · Aug 28, 2007

Hi there!

So, I did what the howto described, but when I nmap the VIP, the HTTP and MySQL ports are filtered. Is there anything else I have to do, e.g. change the default route or add a new route on the real servers?

Thanks!

_stephan_ · Aug 28, 2007

No ideas?

Hm, I thought it would be easier to get some hints...

greets

falko · Aug 29, 2007

_stephan_ said:

Hi there!

So, I did what the howto described, but when I nmap the VIP, the HTTP and MySQL ports are filtered.

What do you mean by "filtered"?

_stephan_ · Aug 29, 2007

Hi,

nmap-ing the VIP shows only the HTTP and MySQL ports as filtered. But now they are only sometimes filtered. It works for a couple of hours, and then after a restart of one real server the ports change to filtered. Strange, isn't it?

greets.

Tenebris · Mar 20, 2009

Loopback alias

I've been trying to do this, and the real server loses all contact with the outside world.
In fact, the server won't respond to any requests after I add such a loopback alias.
Is anyone else here having the same issue?

Solomon

tate_harmann said:

Yes,
That was the problem. I just did:

ifconfig lo:0 192.168.200.79 255.255.255.255

to add the alias, and the server started accepting requests.

thanks,

falko · Mar 20, 2009

Which distribution are you using, and which tutorial (URL) did you follow?

Tenebris · Mar 20, 2009

Re: Loopback alias

I'm using CentOS 5, and I was following tutorials from several sources:

First, the O'Reilly Book, Linux System Administrator's Guide, under the chapter for load balancers.
Second, http://www.jedi.com/obiwan/technology/ultramonkey-rhel4.html, which followed pretty much the same logic.
Third, http://www.ultramonkey.org/3/topologies/sl-ha-lb-eg.html

I even used the "correction" script from http://classcast.blogspot.com/2006/12/two-node-lvs-dr-setup-on-centos.html that was supposed to solve the loopback alias problem...
Except the "correction" script locks out everything once it tries to raise the loopback alias. Also, the correction script wants an executable that doesn't exist: /etc/ha.d/rc.d/arptables-noarp-addr_takeip. (I did a yum search for arptables and ended up installing arptables_jf, but that didn't install such an executable either.)

I've tried experimenting with different configurations out of ldirectord.cf, including changing gate to masq and (gasp!) ipip.

I'm pretty sure my sysctl settings are correct, but here they are:
On my load balancer:
net.ipv4.ip_forward = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296

...and on my nodes:
net.ipv4.ip_forward = 1
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.eth0.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.eth0.arp_announce = 2

...and my LB's ldirectord.cf is as follows:
checktimeout=10
checkinterval=12
autoreload=no
logfile="local0"
quiescent=no
virtual=10.0.0.100:80
        real=10.0.0.101:80 gate
        real=10.0.0.102:80 gate
        service=http
        request="ldirectord.html"
        receive="I'm alive!"
        scheduler=rr
        protocol=tcp
        checktype=negotiate

There is an "ldirectord.html" on each of the nodes that is successfully acknowledged... if the node is not running with a loopback alias. If I do set my node's loopback alias as follows:
ifconfig lo:0 10.0.0.100 netmask 255.255.255.255
...the node stops responding to the load balancer. However, I can still hit the node from anywhere else except the load balancer.

If I take the loopback alias down on the nodes, ldirectord says it can see the nodes, but any attempt to hit the virtual IP now times out.
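
One way to narrow this down is to repeat by hand what the negotiate check in the config above roughly does: fetch the request page from a real server and look for the receive string. The sketch below is a simplification and not ldirectord's own code. It does a plain substring match, and the function names body_matches and check_real are invented for illustration. The 10-second limit mirrors checktimeout=10.

```shell
#!/bin/sh
# body_matches BODY EXPECTED
# Succeeds if EXPECTED occurs anywhere in BODY (fixed-string match).
body_matches() {
    printf '%s\n' "$1" | grep -F -q -- "$2"
}

# check_real HOST PORT PAGE EXPECTED
# Fetches http://HOST:PORT/PAGE with a 10 second limit and compares the
# body against EXPECTED. Needs curl and network access to the real server.
check_real() {
    body=$(curl -s --max-time 10 "http://$1:$2/$3") || return 1
    body_matches "$body" "$4"
}

# Example, to be run on the load balancer (commented out: needs the network):
# check_real 10.0.0.101 80 ldirectord.html "I'm alive!" && echo up || echo down
```

Running check_real from the load balancer once with the node's loopback alias up and once with it down would show whether the health check itself is what stops getting an answer.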

falko · Mar 21, 2009

Instead of setting up a loopback alias, you can try this on the nodes (in /etc/sysctl.conf):
net.ipv4.ip_nonlocal_bind=1

This allows the nodes (and therefore Apache) to listen to IPs that are currently not bound to them.
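
To double-check that the line really is in the file, here is a small sketch that reads sysctl.conf-style text and prints the value assigned to a key. The function name sysctl_conf_get is hypothetical and the sample text is made up. The running kernel value can be read separately with "sysctl -n net.ipv4.ip_nonlocal_bind" after loading the file with "sysctl -p".

```shell
#!/bin/sh
# sysctl_conf_get KEY
# Reads sysctl.conf-style text on stdin and prints the last value
# assigned to KEY. Returns non-zero if KEY is not set in the input.
sysctl_conf_get() {
    awk -F= -v key="$1" '
        /^[ \t]*[#;]/ { next }               # skip comment lines
        {
            k = $1; v = $2
            gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim spaces around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim spaces around the value
            if (k == key) value = v
        }
        END { if (value != "") print value; else exit 1 }
    '
}

# Made-up sample in the same format as /etc/sysctl.conf.
sample='# node settings
net.ipv4.ip_nonlocal_bind=1
net.ipv4.conf.all.arp_ignore = 1'

printf '%s\n' "$sample" | sysctl_conf_get net.ipv4.ip_nonlocal_bind
```

Against the real file this would be "sysctl_conf_get net.ipv4.ip_nonlocal_bind < /etc/sysctl.conf".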

Tenebris · Mar 24, 2009

Tried that just now, but...

...still no dice. However, since then, I've noticed some interesting other behavior...

I tried setting the LB's "checkinterval" value to 30, so that it checks to see if it can access nodes 30 seconds apart. (Or a "tick" in old MUD parlance). At this current point, the loopback interface on every node is down.

Then I fire up ldirectord, and let it see the nodes. (If the loopback alias on the nodes is currently up, then it won't get a response from the nodes, and will flag those nodes as unavailable.)

If I were to hit the Virtual IP from a web browser it'll time out.
However, if I turn on the loopback aliases on the nodes right now, everything works perfectly - the requests successfully route to a random node.

At least, until the next tick, maximum 30 seconds later, at which point, the load balancer cannot make a request of the node and marks it as being nonfunctional.

It is almost as if the Load Balancer does forward packets to the node, but cannot receive confirmation that it has done so. ldirectord marks the node disabled after "checkinterval" seconds have passed, because requests to the node don't come back. It is obvious that the node is listening, but is unable to respond to the LB because the node's loopback alias is set to the Virtual IP.

Any help would be appreciated.

(From a loopback standpoint, I don't understand how a node is ever expected to communicate with another server when the node loopback alias is set to be the same as that other server.)

Solomon Chang

adam0x54 · Mar 25, 2009

Use heartbeat v2 and make an active/active apache configuration. The documentation is kinda stupid but once you get it, it simply works. I spent like a week to figure it out but it works.

Get the CentOS RPM from here: http://download.opensuse.org/repositories/server:/ha-clustering:/lha-2.1/RHEL_5/

http://www.linux-ha.org/FactSheetv2

-Adam
