Public and private network + High Availability Apache Cluster

Discussion in 'HOWTO-Related Questions' started by teleted, Jul 9, 2007.

  1. teleted

    teleted New Member

    Hello.

    I have a question about the great HOW-TO article about setting up a high availability Apache cluster.

    I am following this tutorial, but I am using Ubuntu 6.06 LTS Server instead of Debian Sarge. It has the necessary virtual server support in the kernel.

    The tutorial sets up the cluster entirely on a local, publicly inaccessible network. I'm trying to put the load balancers on the public network, with "real" IPs.

    What I am not clear on is exactly how the cluster servers should be networked, given that the load balancers need to be publicly accessible using a "real" IP address, while the cluster nodes themselves need to be on a local 192.168.0.XXX network.

    I have my Internet connection going to a router, then to my two load balancers on eth0 (NIC 1).

    Then, on eth1 (NIC 2) on each load balancer, I have network cables going to another router. Two Apache nodes are plugged in to that router.

    Is this the proper physical network setup? The tutorial didn't cover this, and I haven't been able to find it spelled out on the Linux Virtual Server site either. If it isn't proper, can someone tell me the way to network the cluster using a real address and local network?
     
  2. teleted

    teleted New Member

    Output from tests on page 3

    Presently, pinging or SSHing to my "real" IP address works for about a minute, then stops. It starts again a few minutes later, then stops again, and the cycle repeats. It looks like ldirectord is exiting. Browsing to the "real" address while ldirectord is running never returns a web page.

    Using telnet to port 80 from the load balancer console, I can see that the web servers on the load balancer and on the two cluster nodes respond to requests.

    Code:
    # telnet 192.168.0.101 80
    Trying 199.32.87.62...
    Connected to 199.32.87.62.
    Escape character is '^]'.
    GET /ldirector.html {RETURN KEY PRESSED}
    Test Page
    
    However, if I put a unique document on the two web cluster nodes and request it through the real IP (the load balancer's public address), it is not found. Evidently the request is not being passed to the cluster nodes.

    Code:
    # telnet 192.168.0.101 80
    Trying 199.32.87.62...
    Connected to 199.32.87.62.
    Escape character is '^]'.
    GET /different_file.html {RETURN KEY PRESSED}
    {404 error message from web server returned}
    
    199.32.87.62 is the public address (altered for posting).
    199.32.87.254 is the gateway.

    192.168.0.101 = cluster 1
    192.168.0.102 = cluster 2
    192.168.0.103 = loadb1
    192.168.0.104 = loadb2


    Here is the output from page three of the tutorial.

    Code:
    loadb1:~$ ip addr sh
    1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue 
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
        inet 127.0.0.1/8 scope host lo
        inet6 ::1/128 scope host 
           valid_lft forever preferred_lft forever
    2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:11:43:35:e1:a2 brd ff:ff:ff:ff:ff:ff
        inet 192.168.0.103/24 brd 192.168.0.255 scope global eth0
        inet6 fe80::211:43ff:fe35:e1a2/64 scope link 
           valid_lft forever preferred_lft forever
    3: eth1: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc pfifo_fast qlen 1000
        link/ether 00:11:43:35:e1:a4 brd ff:ff:ff:ff:ff:ff
        inet 199.32.87.62/24 brd 199.32.87.254 scope global eth1
        inet6 2001:18e8:2:330:211:43ff:fe35:e1a4/64 scope global dynamic 
           valid_lft 2591985sec preferred_lft 604785sec
        inet6 fe80::211:43ff:fe35:e1a4/64 scope link 
           valid_lft forever preferred_lft forever
    4: sit0: <NOARP> mtu 1480 qdisc noop 
        link/sit 0.0.0.0 brd 0.0.0.0
    
    Code:
    loadb1:~$ sudo -s
    Password:
    root@loadb1:~# ipvsadm -L -n
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    TCP  199.32.87.62:80 rr
      -> 192.168.0.101:80             Route   1      0          0         
      -> 192.168.0.102:80             Route   1      0          0         
    
    Code:
    root@loadb1:~# ldirectord ldirectord.cf status
    ldirectord for /etc/ha.d/ldirectord.cf is running with pid: 4288
    
    Code:
    loadb1:~$ /etc/ha.d/resource.d/LVSSyncDaemonSwap master status
    master running
    (ipvs_syncmaster pid: 4427)
    
    Code:
    loadb1:~$ cd /etc/ha.d
    loadb1:/etc/ha.d$ cat ha.cf
    logfacility        local0
    bcast        eth0                # Linux
    mcast eth0 225.0.0.1 694 1 0
    auto_failback off
    node        loadb1
    node        loadb2
    respawn hacluster /usr/lib/heartbeat/ipfail
    apiauth ipfail gid=haclient uid=hacluster
    loadb1:/etc/ha.d$ cat haresources
    loadb1        \
            ldirectord::ldirectord.cf \
            LVSSyncDaemonSwap::master \
            IPaddr2::199.32.87.62/24/eth1/199.32.87.254
    
    Code:
    loadb1:/etc/ha.d$ cat ldirectord.cf 
    checktimeout=10
    checkinterval=2
    autoreload=no
    logfile="/var/log/ldirector-local0"
    quiescent=yes
    
    virtual=199.32.87.62:80
            real=192.168.0.101:80 gate
            real=192.168.0.102:80 gate
            fallback=127.0.0.1:80 gate
            service=http
            request="ldirector.html"
            receive="Test Page"
            scheduler=rr
            protocol=tcp
            checktype=negotiate
    
     
  3. teleted

    teleted New Member

    ldirectord log output

    Code:
            
    loadb1:/etc/ha.d$ cat /var/log/ldirector-local0 
    [Mon Jul  9 14:05:02 2007|ldirectord.cf] Removed real server: 192.168.0.101:80 ( x 199.32.87.62:80
    [Mon Jul  9 14:05:02 2007|ldirectord.cf] Removed real server: 192.168.0.102:80 ( x 199.32.87.62:80
    [Mon Jul  9 14:05:02 2007|ldirectord.cf] Removed virtual server: 199.32.87.62:80
    [Mon Jul  9 14:05:02 2007|ldirectord.cf] Linux Director Daemon terminated on signal: TERM
    [Mon Jul  9 14:07:02 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:07:02 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:07:33 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:07:33 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:07:34 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:07:34 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Starting Linux Director v1.77.2.36 as daemon
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Added virtual server: 199.32.87.62:80
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Added fallback server: 127.0.0.1:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Quiescent real server: 192.168.0.102:80 mapped from 192.168.0.102:80 ( x 199.32.87.62:80) (Weight set to 0)
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Quiescent real server: 192.168.0.101:80 mapped from 192.168.0.101:80 ( x 199.32.87.62:80) (Weight set to 0)
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Restored real server: 192.168.0.101:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Deleted fallback server: 127.0.0.1:80 ( x 199.32.87.62:80)
    [Mon Jul  9 14:07:35 2007|ldirectord.cf] Restored real server: 192.168.0.102:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:11:07 2007|ldirectord.cf] Removed real server: 192.168.0.101:80 ( x 199.32.87.62:80
    [Mon Jul  9 14:11:07 2007|ldirectord.cf] Removed real server: 192.168.0.102:80 ( x 199.32.87.62:80
    [Mon Jul  9 14:11:07 2007|ldirectord.cf] Removed virtual server: 199.32.87.62:80
    [Mon Jul  9 14:11:07 2007|ldirectord.cf] Linux Director Daemon terminated on signal: TERM
    [Mon Jul  9 14:12:49 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:12:49 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:12:49 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:12:49 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Starting Linux Director v1.77.2.36 as daemon
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Added virtual server: 199.32.87.62:80
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Added fallback server: 127.0.0.1:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Quiescent real server: 192.168.0.102:80 mapped from 192.168.0.102:80 ( x 199.32.87.62:80) (Weight set to 0)
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Quiescent real server: 192.168.0.101:80 mapped from 192.168.0.101:80 ( x 199.32.87.62:80) (Weight set to 0)
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Restored real server: 192.168.0.101:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Deleted fallback server: 127.0.0.1:80 ( x 199.32.87.62:80)
    [Mon Jul  9 14:12:50 2007|ldirectord.cf] Restored real server: 192.168.0.102:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:13:18 2007|ldirectord.cf] Configuration file '/etc/ha.d/ldirectord.cf' has changed on disk
    [Mon Jul  9 14:13:18 2007|ldirectord.cf]  - ignore new configuration
    [Mon Jul  9 14:13:27 2007|ldirectord.cf] Removed real server: 192.168.0.101:80 ( x 199.32.87.62:80
    [Mon Jul  9 14:13:27 2007|ldirectord.cf] Removed real server: 192.168.0.102:80 ( x 199.32.87.62:80
    [Mon Jul  9 14:13:27 2007|ldirectord.cf] Removed virtual server: 199.32.87.62:80
    [Mon Jul  9 14:13:27 2007|ldirectord.cf] Linux Director Daemon terminated on signal: TERM
    [Mon Jul  9 14:15:25 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:15:25 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:15:56 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:15:56 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:15:56 2007|ldirectord.cf] ldirectord is stopped for /etc/ha.d/ldirectord.cf
    [Mon Jul  9 14:15:56 2007|ldirectord.cf] Exiting with exit_status 3: Exiting from ldirectord status
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Starting Linux Director v1.77.2.36 as daemon
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Added virtual server: 199.32.87.62:80
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Added fallback server: 127.0.0.1:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Quiescent real server: 192.168.0.102:80 mapped from 192.168.0.102:80 ( x 199.32.87.62:80) (Weight set to 0)
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Quiescent real server: 192.168.0.101:80 mapped from 192.168.0.101:80 ( x 199.32.87.62:80) (Weight set to 0)
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Restored real server: 192.168.0.101:80 ( x 199.32.87.62:80) (Weight set to 1)
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Deleted fallback server: 127.0.0.1:80 ( x 199.32.87.62:80)
    [Mon Jul  9 14:15:57 2007|ldirectord.cf] Restored real server: 192.168.0.102:80 ( x 199.32.87.62:80) (Weight set to 1)
    
    Code:
    loadb1:/etc/ha.d$ tail /var/log/messages
    Jul  9 14:15:57 loadb1 heartbeat: info: /sbin/ip -f inet addr add 199.32.87.62/24 brd 199.32.87.254 dev eth1
    Jul  9 14:15:57 loadb1 heartbeat: info: /sbin/ip link set eth1 up
    Jul  9 14:15:57 loadb1 kernel: [42949421.020000] ADDRCONF(NETDEV_UP): eth1: link is not ready
    Jul  9 14:15:57 loadb1 heartbeat: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /var/lib/heartbeat/rsctmp/send_arp/send_arp-199.32.87.62 eth1 199.32.87.62 auto 199.32.87.62 ffffffffffff
    Jul  9 14:15:57 loadb1 kernel: [42949421.070000] NET: Registered protocol family 17
    Jul  9 14:16:00 loadb1 kernel: [42949424.000000] tg3: eth1: Link is up at 10 Mbps, half duplex.
    Jul  9 14:16:00 loadb1 kernel: [42949424.000000] tg3: eth1: Flow control is off for TX and off for RX.
    Jul  9 14:16:00 loadb1 kernel: [42949424.020000] ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
    Jul  9 14:16:07 loadb1 heartbeat[4097]: info: Local Resource acquisition completed. (none)
    Jul  9 14:16:07 loadb1 heartbeat[4097]: info: local resource transition completed.
    
     
  4. spitzbueb

    spitzbueb New Member

    Hi!

    Does anybody have a solution for load balancing with public IPs?

    Thanks and Regards

    Markus
     
  5. falko

    falko Super Moderator Howtoforge Staff

    Ask your provider (where you have your servers) to help you with this. They should be able to give you a virtual IP address. :)
     
  6. spitzbueb

    spitzbueb New Member

    Thanks for your reply!

    I have the following configuration:

    Two real servers with vmware server installed. I'm just looking at one of the two servers:

    1st Real server (host):
    - 1st interface: public IPs, connected to the internet over eth0
    - 2nd interface: 1 private IP (192.168.1.2) connected over a switch to the other virtual server host

    I followed your tutorial, but used this configuration:

    Two virtual machines, loadb1 (192.168.0.11) and loadb2 (192.168.0.12), bridged to eth1 (local net),
    and, instead of the local virtual IP, a public virtual IP bridged to eth0.

    Two virtual machines, web1 (192.168.0.61) and web2 (192.168.0.62), bridged to eth1 (local net).

    I can access websites on web1/2 on the local net directly.
    But I can't connect to the webservers using the load balancer over the public IP.

    Configuration on load1 (load2 is the same except for its local IP):

    /etc/ha.d/haresources:
    Code:
    load1        \
            ldirectord::ldirectord.cf \
            LVSSyncDaemonSwap::master \
            IPaddr2::aaa.aaa.aaa.aaa/24/eth0 \
            IPaddr2::bbb.bbb.bbb.bbb/24/eth0
    
    /etc/ha.d/ldirectord.cf:
    Code:
    checktimeout=10
    checkinterval=2
    autoreload=no
    logfile="local0"
    quiescent=yes
    
    virtual=aaa.aaa.aaa.aaa:80
            real=192.168.0.61:80 gate
            real=192.168.0.62:80 gate
            fallback=127.0.0.1:80 gate
            service=http
            request="ldirector.html"
            receive="Test Page"
            scheduler=wrr
            protocol=tcp
            checktype=negotiate
    
    virtual=bbb.bbb.bbb.bbb:80
            real=192.168.0.61:80 gate
            real=192.168.0.62:80 gate
            fallback=127.0.0.1:80 gate
            service=http
            request="ldirector.html"
            receive="Test Page"
            scheduler=wrr
            protocol=tcp
            checktype=negotiate
    
    Where aaa.aaa.aaa.aaa and bbb.bbb.bbb.bbb are public IPs.

    sysctl -p:
    Code:
    net.ipv4.ip_forward = 1
    
    ipvsadm:
    Code:
    IP Virtual Server version 1.2.1 (size=4096)
    Prot LocalAddress:Port Scheduler Flags
      -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
    TCP  bbb.bbb.bbb.bbb:www wrr
      -> 192.168.0.62:www             Route   1      0          0
      -> 192.168.0.61:www             Route   1      0          0
    TCP  aaa.aaa.aaa.aaa:www wrr
      -> 192.168.0.62:www             Route   1      0          0
      -> 192.168.0.61:www             Route   1      0          0
    
    Connection from a client (internet) over the load balancer's public IP
    ipvsadm -L -c -n:
    Code:
    IPVS connection entries
    pro expire state       source             virtual            destination
    TCP 00:51  SYN_RECV    clientip:61625 aaa.aaa.aaa.aaa:80 192.168.0.62:80
    
    The state "SYN_RECV" never changes and the client gets a timeout.
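    One way to spot this symptom quickly, offered only as a minimal sketch: filter the connection table for entries that are still in SYN_RECV. The function name stuck_syn_recv is hypothetical, and the filter assumes nothing beyond the column layout shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: reads `ipvsadm -L -c -n` output on stdin and prints
# "source -> destination" for every TCP entry still in SYN_RECV.
# Assumes the column order: pro expire state source virtual destination.
stuck_syn_recv() {
    awk '$1 == "TCP" && $3 == "SYN_RECV" { print $4 " -> " $6 }'
}

# Usage on a director (needs root):
#   ipvsadm -L -c -n | stuck_syn_recv
```

    Entries that stay in this list would be consistent with the SYN being forwarded to a real server while the reply never makes it back to the client, which is what the timeout suggests here.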


    Configuration on web1/2:

    ifconfig:
    Code:
    eth0    Link encap:Ethernet  HWaddr 00:0C:29:25:96:2E
              inet addr:192.168.0.61  Bcast:192.168.0.255  Mask:255.255.255.0
              inet6 addr: fe80::20c:29ff:fe25:962e/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:1671007 errors:0 dropped:0 overruns:0 frame:0
              TX packets:1352975 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:148462185 (141.5 MiB)  TX bytes:166755807 (159.0 MiB)
              Interrupt:177 Base address:0x1400
    
    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:0 errors:0 dropped:0 overruns:0 frame:0
              TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:0 (0.0 b)  TX bytes:0 (0.0 b)
    
    lo:0      Link encap:Local Loopback
              inet addr:aaa.aaa.aaa.aaa  Mask:255.255.255.255
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
    
    lo:1      Link encap:Local Loopback
              inet addr:bbb.bbb.bbb.bbb  Mask:255.255.255.255
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
    
    sysctl -p:
    Code:
    net.ipv4.conf.all.arp_ignore = 1
    net.ipv4.conf.eth0.arp_ignore = 1
    net.ipv4.conf.all.arp_announce = 2
    net.ipv4.conf.eth0.arp_announce = 2
    
    The gateway for the webservers:
    route:
    Code:
    Kernel IP routing table
    Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
    192.168.0.0     *               255.255.255.0   U     0      0        0 eth0
    default         192.168.0.1   0.0.0.0         UG    0      0        0 eth0
    
    when the client connects:
    netstat:
    Code:
    Active Internet connections (w/o servers)
    Proto Recv-Q Send-Q Local Address           Foreign Address         State
    tcp        0      0 aaa.aaa.aaa.aaa:www     clientIP:61676 SYN_RECV
    
    And here, it stays in the "SYN_RECV" state, too.

    As far as I can tell, the packets from the client are forwarded through the load balancer to the web server, but Apache never seems to get them...

    I don't know how to check each step to locate the error...

    Do you have any clue what the problem could be?

    Thank you very much and best regards

    Markus
     
  7. spitzbueb

    spitzbueb New Member

    Hi

    I got it to work.

    LVS with two interfaces is only possible with LVS-NAT.
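    For readers wondering what the switch to LVS-NAT looks like on the director, here is a rough sketch and not the exact configuration used in this thread. In ldirectord.cf the forwarding keyword on each real= line would be masq instead of gate; the equivalent ipvsadm commands use -m instead of -g. The VIP 203.0.113.10 is a placeholder, and the script only prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: print the ipvsadm commands for an LVS-NAT virtual service.
# VIP is a placeholder (assumption); the real-server IPs are the ones
# posted earlier in the thread.
VIP="${VIP:-203.0.113.10}"
REALS="${REALS:-192.168.0.61 192.168.0.62}"

lvs_nat_commands() {
    # -A adds the virtual TCP service, -s selects the scheduler
    echo "ipvsadm -A -t ${VIP}:80 -s wrr"
    for rip in $REALS; do
        # -m = masquerading (NAT); -g would be direct routing ("gate")
        echo "ipvsadm -a -t ${VIP}:80 -r ${rip}:80 -m -w 1"
    done
}

lvs_nat_commands
```

    With NAT forwarding the replies have to travel back through the director, so the real servers need the director's private address as their default gateway.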

    I still have one problem adding the gateways over heartbeat (IPaddr2) for the external interface.

    Regards

    Markus
     
  8. andeas-2008

    andeas-2008 New Member

    Same problem

    Hi Markus,

    I am interested in your solution, because I am having the same problem here. I do not have a separate firewall, but my load balancers have the public IPs. I have 2 web servers behind them and configured them with private IPs as well as with the virtual public IPs on lo. Unfortunately, I only get HTTP access if they also have a public interface configured. All the rest is the same as yours. Can you give me a hint?

    Thanks,

    Andreas
     
  9. spitzbueb

    spitzbueb New Member

    Set gateway and iptables so the lb forwards traffic

    Hi

    My problem was that I forgot to set the gateway for the public interfaces, as the loadbalancer heartbeat config only sets the one for the private interface.

    So when the server came up I wrote:

    Code:
    route add default gw aaa.bbb.ccc.ddd eth1
    where aaa.bbb.ccc.ddd is the gateway and eth1 is the public interface.

    My next problem was that the virtual web servers had no Internet connection (they are only on a private network), and I didn't want to set up a separate gateway just for them.

    So I used the active load balancer's private IP as the gateway for the virtual servers, and ran the following iptables command on the load balancer:

    Code:
    iptables -t nat -A POSTROUTING -j MASQUERADE -s 192.168.0.0/24
    This masquerades traffic coming from the 192.168.0.0/24 network, so the web servers can reach the Internet through the load balancer.
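    The two director-side steps from this post (IP forwarding, which appears in the sysctl output earlier in the thread, plus the masquerading rule) could be collected in a small script along these lines. This is a sketch under assumptions: the run wrapper and the DRY_RUN variable are hypothetical, and limiting the rule to the public interface with -o eth1 is a narrowing that the command above does not have.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands.
# Set DRY_RUN=0 and run as root to actually apply them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

PRIVATE_NET="192.168.0.0/24"   # private network from the thread
PUBLIC_IF="eth1"               # assumption: public interface name

setup_director_nat() {
    # Let the kernel route packets between the two interfaces
    run sysctl -w net.ipv4.ip_forward=1
    # Rewrite the source address of traffic leaving via the public interface
    run iptables -t nat -A POSTROUTING -s "$PRIVATE_NET" -o "$PUBLIC_IF" -j MASQUERADE
}

setup_director_nat
```

    Neither setting survives a reboot on its own, so both would have to be applied on each load balancer at boot for a failover to keep working.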

    Just ask if you have more questions.

    regards
     
  10. andeas-2008

    andeas-2008 New Member

    Thanks a lot, this works for me as well. To summarize: There are 2 possibilities for a setup:

    1. 2 interfaces, 1 public, 1 private. The load balancer (lb) forwards the request to the web server and the web server sends its answer directly through the public interface. In this case the web server needs a public IP and can therefore be accessed from outside (although you can close all incoming ports via iptables). In this case you have to assign the virtual public IPs to the loopback device.

    2. LVS-NAT: In this case the request will be sent through the load balancer (as you described above). Here, I do not need any public interfaces for the web server. I guess I also do not need to assign the public virtual IP to the loopback interface, right?
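    To make the difference between the two options concrete on the web server side, here is a minimal sketch that prints the commands each mode would typically need. The function name realserver_commands and the sample addresses are hypothetical; the ARP settings are the ones from the sysctl output posted earlier in the thread.

```shell
#!/bin/sh
# realserver_commands MODE ADDRESS
#   MODE=dr  : ADDRESS is the virtual IP to put on the loopback device
#   MODE=nat : ADDRESS is the load balancer's private IP (default gateway)
# Only prints the commands; nothing is changed on the system.
realserver_commands() {
    mode="$1"
    addr="$2"
    case "$mode" in
        dr)
            # Keep the web server from answering ARP requests for the VIP
            echo "sysctl -w net.ipv4.conf.all.arp_ignore=1"
            echo "sysctl -w net.ipv4.conf.eth0.arp_ignore=1"
            echo "sysctl -w net.ipv4.conf.all.arp_announce=2"
            echo "sysctl -w net.ipv4.conf.eth0.arp_announce=2"
            # Accept packets addressed to the VIP locally
            echo "ip addr add ${addr}/32 dev lo label lo:0"
            ;;
        nat)
            # No VIP on lo; replies must go back through the load balancer
            echo "ip route replace default via ${addr}"
            ;;
        *)
            echo "usage: realserver_commands dr|nat ADDRESS" >&2
            return 1
            ;;
    esac
}

realserver_commands dr 203.0.113.10
realserver_commands nat 192.168.0.11
```

    In the NAT case the only web server change is the default route, which matches the guess above that the loopback alias is not needed there.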


    Thanks,

    Andreas
     
  11. andeas-2008

    andeas-2008 New Member

    Ah, another question: Does the passive load balancer take over the private IP from the active one in case of a failover? Otherwise you will lose your gateway then ...
     
  12. spitzbueb

    spitzbueb New Member

    hi

    1. That's correct, imho.

    2. Ditto

    Yes, it takes over the public IP, as you configured it in heartbeat.

    But you can enable routing and add the default gateway on the passive load balancer from the start. So if anything goes wrong with the active load balancer, it should work once the public IP has been taken over by the passive load balancer.
    I cannot tell you if it really works, because I stopped using this configuration in a production environment on VMware.
    I had difficulties with the load balancer: it stopped distributing the traffic, but heartbeat wasn't aware of it... So there was no access to the websites even with two load balancers...

    regards
     
