Hi there,

I'm having a bit of a problem with haproxy. We've got two haproxy servers pointing at 4 Windows web servers. Mostly they work fine, but when the client reloads their application they insist on doing all four servers simultaneously. I know, but we've discussed it and that's the way they're doing it.

When they reload the application, they sometimes get errors saying the site isn't there (because it isn't). I'm trying to figure out how to get haproxy to hold on and keep trying until it gets a valid server, so that someone browsing the site experiences a wait for a new page but doesn't get an error message.

It takes about 15 seconds for the application to reload, so I thought a timeout of 90 seconds would be sufficient, but it's not. It's also set to retry 30 times and that's not helping either.

Code:

global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    #log loghost local0 info
    #maxconn 4096
    maxconn 8192
    #debug
    #quiet
    user haproxy
    group haproxy
    stats socket /tmp/haproxy.sock mode 777
    daemon

defaults
    log global
    # mode http
    option httplog
    option dontlognull
    retries 25
    option redispatch
    #maxconn 2000
    maxconn 5000
    contimeout 100000
    clitimeout 100000
    # clitimeout 50000
    srvtimeout 100000
    # srvtimeout 50000

listen webfarm 89.185.144.170:80
    mode http
    stats enable
    stats auth tibus:IgsbiW85
    stats auth myhome:myh0m3m0n
    balance roundrobin
    cookie JSESSIONID prefix
    option httpclose
    option abortonclose
    option forwardfor
    option httpchk HEAD /check.txt HTTP/1.0
    server webA 111.111.111.112:80 cookie A check
    server webB 111.111.111.113:80 cookie C check
    server webC 111.111.111.114:80 cookie C check
    server webD 111.111.111.115:80 cookie D check
    errorloc302 503 http://www.MYSITE.com/

listen HTTPS 89.185.144.170:443
    # mode http
    mode tcp
    balance roundrobin
    option httpclose
    option forwardfor
    server webA 111.111.111.112:443 check
    server webB 111.111.111.113:443 check
    server webC 111.111.111.114:443 check
    server webD 111.111.111.115:443 check
    errorloc302 503 http://www.MYSITE.com/

Any suggestions as to where I'm going wrong?
I think you could get away with it by significantly increasing the number of retries and by increasing the health check retries (the "fall" count on the server lines), so that together they cover the quick restart. You don't want haproxy to see your servers as down at any point: once it considers all of them down, it won't even try to connect and will immediately return 503. As long as it believes they're still up and tries to connect multiple times, it should work.

It's not the best you can do for high availability, but your customer seems to know what's best (i.e. breaking availability in order to deploy in one click). So let him be harmed by his own choice, and next time he may listen to your advice, which is much sounder than his.
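
As a rough sketch of what that could look like, here are the relevant lines from the posted config with a higher retry count and explicit health check timing. The numbers (60 retries, 30 failed checks at 2000 ms intervals) are illustrative assumptions chosen to comfortably exceed the 15 second reload, not tested recommendations. The "inter", "fall" and "rise" keywords are standard server parameters, but check them against the documentation for your haproxy version.

```
defaults
    # connection attempts per request before giving up (posted config: 25)
    retries 60
    option redispatch

listen webfarm 89.185.144.170:80
    mode http
    option httpchk HEAD /check.txt HTTP/1.0
    # inter: milliseconds between health checks
    # fall:  consecutive failed checks before the server is marked down
    # rise:  consecutive successful checks before it is marked up again
    # 2000 ms x 30 checks = roughly 60 s of failures tolerated (assumed value)
    server webA 111.111.111.112:80 cookie A check inter 2000 fall 30 rise 2
    # apply the same inter/fall/rise settings to webB, webC and webD
```

The trade-off is the one mentioned above: with a large "fall" value, a server that has genuinely died keeps receiving requests for up to a minute before haproxy stops sending traffic to it.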