Hi everyone,

I am following the HowToForge tutorial that details how to set up a high-availability load-balanced apache2 cluster, and I have run into some problems. I am at step 7 on page 4 of the tutorial, where the author states “You can now access the web site that is hosted by the two Apache nodes by typing http://192.168.0.105 in the browser”, but that step is not working for me. I get “the page could not be displayed” when I try to go to my virtual IP in my browser. I also cannot telnet to port 80 or port 443 on my virtual IP. All the tests on page 3 of the tutorial (ip addr sh eth0, ldirectord ldirectord.cf status, ipvsadm -L -n, and /etc/ha.d/resource.d/LVSSyncDaemonSwap master status) pass successfully with the exact same results as shown in the examples. Any thoughts as to what I am doing wrong?

Here’s what I have. All servers are on the same network segment with no firewalls in between.

balancer1 – load balancer running Debian etch 4.0r4, IP address: 192.168.0.12
balancer2 – load balancer running Debian etch 4.0r4, IP address: 192.168.0.13
maia1 – web server running OpenSuSE 11 and Apache2, IP address: 192.168.0.7
maia2 – web server running OpenSuSE 11 and Apache2, IP address: 192.168.0.6
Virtual cluster IP: 192.168.0.8

Balancer1 ha.cf:

    logfacility local0
    bcast eth0 # Linux
    mcast eth0 225.0.0.1 694 1 0
    auto_failback off
    node balancer1
    node balancer2
    respawn hacluster /usr/lib/heartbeat/ipfail
    apiauth ipfail gid=haclient uid=hacluster

Balancer2 ha.cf:

    logfacility local0
    bcast eth0 # Linux
    mcast eth0 225.0.0.1 694 1 0
    auto_failback off
    node balancer1
    node balancer2
    respawn hacluster /usr/lib/heartbeat/ipfail
    apiauth ipfail gid=haclient uid=hacluster

Balancer1 haresources:

    balancer1 \
        ldirectord::ldirectord.cf \
        LVSSyncDaemonSwap::master \
        IPaddr2::192.168.0.8/24/eth0/192.168.0.255

Balancer2 haresources:

    balancer1 \
        ldirectord::ldirectord.cf \
        LVSSyncDaemonSwap::master \
        IPaddr2::192.168.0.8/24/eth0/192.168.0.255

Balancer1 ldirectord.cf:

    checktimeout=10
    checkinterval=2
    autoreload=no
    logfile="local0"
    quiescent=yes

    ## HTTP
    virtual=192.168.0.8:443
        real=192.168.0.7:443 gate
        real=192.168.0.6:443 gate
        fallback=127.0.0.1:443 gate
        service=https
        request="ldirector.html"
        receive="Test Page"
        scheduler=rr
        protocol=tcp
        checktype=negotiate

    ## HTTPS
    virtual=192.168.0.8:80
        real=192.168.0.7:80 gate
        real=192.168.0.6:80 gate
        fallback=127.0.0.1:80 gate
        service=http
        request="ldirector.html"
        receive="Test Page"
        scheduler=rr
        protocol=tcp
        checktype=negotiate

Balancer2 ldirectord.cf:

    checktimeout=10
    checkinterval=2
    autoreload=no
    logfile="local0"
    quiescent=yes

    ## HTTP
    virtual=192.168.0.8:443
        real=192.168.0.7:443 gate
        real=192.168.0.6:443 gate
        fallback=127.0.0.1:443 gate
        service=https
        request="ldirector.html"
        receive="Test Page"
        scheduler=rr
        protocol=tcp
        checktype=negotiate

    ## HTTPS
    virtual=192.168.0.8:80
        real=192.168.0.7:80 gate
        real=192.168.0.6:80 gate
        fallback=127.0.0.1:80 gate
        service=http
        request="ldirector.html"
        receive="Test Page"
        scheduler=rr
        protocol=tcp
        checktype=negotiate

Thank you very much in advance for looking into this. Please let me know if there is any other information I can provide.
I assume you mean /var/log/messages, right? If so, here's an excerpt of what I see in that log (repeating every so often):

    Oct 3 03:54:03 balancer1 ldirectord[2547]: Quiescent real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 0)
    Oct 3 03:54:08 balancer1 ldirectord[2547]: Restored real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 1)
    Oct 3 04:13:22 balancer1 -- MARK --
    Oct 3 04:33:22 balancer1 -- MARK --
    Oct 3 04:53:23 balancer1 -- MARK --

Is there another log somewhere I can check?
Here are the /var/log/syslog contents from balancer1:

    Oct 4 06:25:06 balancer1 syslogd 1.4.1#18: restart.
    Oct 4 06:53:43 balancer1 -- MARK --
    Oct 4 07:13:43 balancer1 -- MARK --
    Oct 4 07:17:01 balancer1 /USR/SBIN/CRON[4509]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Oct 4 07:33:43 balancer1 -- MARK --
    Oct 4 07:40:20 balancer1 ldirectord[2547]: Quiescent real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 0)
    Oct 4 07:40:24 balancer1 ldirectord[2547]: Restored real server: 192.168.0.6:443 ( x 192.168.0.8:443) (Weight set to 1)
    Oct 4 07:53:44 balancer1 -- MARK --

Here are the /var/log/syslog contents from balancer2:

    Oct 4 06:25:27 balancer2 syslogd 1.4.1#18: restart.
    Oct 4 06:50:17 balancer2 -- MARK --
    Oct 4 07:10:18 balancer2 -- MARK --
    Oct 4 07:17:02 balancer2 /USR/SBIN/CRON[2581]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Oct 4 07:30:21 balancer2 -- MARK --
    Oct 4 07:50:22 balancer2 -- MARK --
    Oct 4 08:10:22 balancer2 -- MARK --
    Oct 4 08:17:01 balancer2 /USR/SBIN/CRON[2586]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Oct 4 08:30:26 balancer2 -- MARK --
    Oct 4 08:50:38 balancer2 -- MARK --
    Oct 4 09:10:38 balancer2 -- MARK --
    Oct 4 09:17:01 balancer2 /USR/SBIN/CRON[2591]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
    Oct 4 09:30:38 balancer2 -- MARK --
    Oct 4 09:50:44 balancer2 -- MARK --
    Oct 4 10:10:49 balancer2 -- MARK --
    Oct 4 10:17:01 balancer2 /USR/SBIN/CRON[2596]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)

There are hundreds of these repeating over and over in the Apache2 logs on maia1, and pretty much nothing else except unrelated HTTP requests to this server (since it is currently an active web server):

    192.168.0.12 - - [04/Oct/2008:13:13:16 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
    192.168.0.12 - - [04/Oct/2008:13:13:18 -0500] "GET /ldirector.html HTTP/1.1" 200 9
    192.168.0.12 - - [04/Oct/2008:13:13:18 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
    192.168.0.12 - - [04/Oct/2008:13:13:20 -0500] "GET /ldirector.html HTTP/1.1" 200 9
    192.168.0.12 - - [04/Oct/2008:13:13:21 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"

I see the same thing in the Apache2 logs on maia2:

    192.168.0.12 - - [05/Oct/2008:21:21:52 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
    192.168.0.12 - - [05/Oct/2008:21:21:54 -0500] "GET /ldirector.html HTTP/1.1" 200 9
    192.168.0.12 - - [05/Oct/2008:21:21:54 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"
    192.168.0.12 - - [05/Oct/2008:21:21:57 -0500] "GET /ldirector.html HTTP/1.1" 200 9
    192.168.0.12 - - [05/Oct/2008:21:21:57 -0500] "GET /ldirector.html HTTP/1.1" 200 9 "-" "libwww-perl/5.805"

Does that help?
Balancer1 ifconfig:

    balancer1:~# ifconfig
    eth0      Link encap:Ethernet  HWaddr 00:08:74:9E:47:12
              inet addr:192.168.0.12  Bcast:192.168.0.255  Mask:255.255.255.0
              inet6 addr: fe80::208:74ff:fe9e:4712/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:3845629 errors:0 dropped:0 overruns:1 frame:0
              TX packets:3192221 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:1453406895 (1.3 GiB)  TX bytes:371182676 (353.9 MiB)
              Interrupt:11 Base address:0x2c00

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:8 errors:0 dropped:0 overruns:0 frame:0
              TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:560 (560.0 b)  TX bytes:560 (560.0 b)

Balancer2 ifconfig:

    eth0      Link encap:Ethernet  HWaddr 00:03:FF:92:95:F0
              inet addr:192.168.0.13  Bcast:192.168.0.255  Mask:255.255.255.0
              inet6 addr: fe80::203:ffff:fe92:95f0/64 Scope:Link
              UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
              RX packets:1042693 errors:0 dropped:0 overruns:0 frame:0
              TX packets:466587 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:1000
              RX bytes:809290912 (771.7 MiB)  TX bytes:85876112 (81.8 MiB)
              Interrupt:11 Base address:0xec00

    lo        Link encap:Local Loopback
              inet addr:127.0.0.1  Mask:255.0.0.0
              inet6 addr: ::1/128 Scope:Host
              UP LOOPBACK RUNNING  MTU:16436  Metric:1
              RX packets:8 errors:0 dropped:0 overruns:0 frame:0
              TX packets:8 errors:0 dropped:0 overruns:0 carrier:0
              collisions:0 txqueuelen:0
              RX bytes:560 (560.0 b)  TX bytes:560 (560.0 b)

The virtual IP is pingable:

    C:\Documents and Settings\User>ping balancer

    Pinging balancer.mydomain.com [192.168.0.8] with 32 bytes of data:

    Reply from 192.168.0.8: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.8: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.8: bytes=32 time<1ms TTL=64
    Reply from 192.168.0.8: bytes=32 time<1ms TTL=64

    Ping statistics for 192.168.0.8:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 0ms, Maximum = 0ms, Average = 0ms

There are no firewalls or anything else between any of the 4 nodes I am working with. Everything else is pingable as well (both balancers individually, both web servers, etc.). It's almost acting like port forwarding isn't working correctly: I can't telnet to ports 80 or 443 on the virtual IP even though I can ping it.

As I noted in my original post, all of the tests that verify the cluster itself is running pass successfully (ip addr sh eth0, ldirectord ldirectord.cf status, ipvsadm -L -n, and /etc/ha.d/resource.d/LVSSyncDaemonSwap master status). I literally copied and pasted the tutorial examples into PuTTY windows when I set these up (changing IPs where appropriate). I even went so far as to download Debian sarge and go through the tutorial again, thinking it was a problem with etch, but I got the same results then too.

One more note: I can telnet directly to ports 80 and 443 on the web servers, so I know they are working (I can also browse web pages on them).
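
The telnet checks described above can be repeated in one pass with a short script. This is only a sketch, assuming Python 3 is available on a client machine in the same network segment; the function names `port_open`, `check_all`, and `main` are made up for illustration, and the addresses are the ones listed earlier in this thread.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Addresses taken from the thread; the labels are only used for display.
TARGETS = {
    "virtual IP": "192.168.0.8",
    "maia1": "192.168.0.7",
    "maia2": "192.168.0.6",
}
PORTS = (80, 443)


def check_all(targets=TARGETS, ports=PORTS, probe=port_open):
    """Probe every host/port pair and return {(label, port): bool}."""
    return {
        (label, port): probe(host, port)
        for label, host in targets.items()
        for port in ports
    }


def main():
    """Print one line per host/port pair."""
    for (label, port), ok in check_all().items():
        state = "open" if ok else "unreachable"
        print(f"{label:<12} port {port:<4} {state}")
```

Calling `main()` prints one line per address and port. If maia1 and maia2 show as open while the virtual IP shows as unreachable, that matches the symptom described here: the web servers answer directly, but connections sent to the virtual IP are not completed.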
To be honest, I'm not sure what's wrong. Maybe you should try this tutorial instead: http://www.howtoforge.com/high-availability-load-balancer-haproxy-heartbeat-debian-etch
Help me, please. Hey buddy, have you resolved the problem you were facing? I am facing the same problem and am stuck at test #7 on page 4.
I ran into similar problems as well. I don't recall why, but after many retries and reviewing several great tutorials out there, I wrote my own documentation that worked when tested several times. You can see it here: http://www.ctrip.ufl.edu/apache2-cluster-in-debian-lenny-howto