LVS-DR localnode issues

Discussion in 'Technical' started by BobGeorge, Aug 11, 2017.

  1. BobGeorge

    I've got two load-balancing nodes, with Heartbeat providing high availability between them. Heartbeat also starts up "ldirectord" to manage LVS.

    Clients send requests to a virtual IP address (VIP), and those requests are then distributed to the realservers. I've gone with "direct routing", where LVS simply rewrites the destination MAC address on the packet to that of the chosen realserver and passes it on. The advantage of this is that the realserver gets the original packet from the client "as is" (save for the MAC address change, which is usually irrelevant at the application level). It's transparent load balancing, with neither the client nor the realserver having any clue that it just happened.
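
    As a rough illustration of the direct routing idea described above, here is a toy model in Python. The `Frame` class and `direct_route` function are hypothetical names invented for this sketch, the MAC and client addresses are made up, and only the destination MAC is modelled. It is not how the kernel implements LVS-DR, just a way to see that the IP addresses and payload reach the realserver unchanged.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    """Toy Ethernet frame: only the fields relevant to this sketch."""
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: bytes


def direct_route(frame: Frame, realserver_mac: str) -> Frame:
    """Return a copy of the frame with only the destination MAC changed."""
    return replace(frame, dst_mac=realserver_mac)


if __name__ == "__main__":
    # Hypothetical client and MAC addresses; 192.168.0.99 is the VIP.
    incoming = Frame("02:00:00:00:00:01", "203.0.113.7", "192.168.0.99", b"GET /")
    forwarded = direct_route(incoming, "02:00:00:00:00:02")
    print(forwarded)
```

    The destination IP is still the VIP after forwarding, which is why each realserver has to accept traffic for that address.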

    Well, except that the client addresses the packet to the VIP, so the realserver has to be persuaded to accept packets addressed to the VIP. But it can't simply be assigned the VIP directly, as then all the realservers (and the load balancers) would hold the VIP, and an ARP request of "who has the VIP?" would have all of them responding simultaneously. Not good. So the realservers must be assigned the VIP in a way that doesn't have them responding to ARP requests for it.

    What I did was simply create a dummy interface carrying the VIP. A dummy interface is not attached to any real device, so it never receives ARP requests itself, yet the realserver still treats a packet addressed to the VIP as a packet addressed to it.
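
    One caveat worth checking on a Linux realserver: whether the host answers ARP for the VIP through its real interface depends on the `arp_ignore` and `arp_announce` sysctls, and the kernel uses the larger of the `all` and per-interface values. The sketch below reads those values and reports whether they match the settings commonly recommended for LVS-DR realservers (`arp_ignore=1`, `arp_announce=2`). The function names are invented for this example, and the interface name `eth0` is an assumption.

```python
from pathlib import Path

# Default location of the per-interface IPv4 sysctls on Linux.
SYSCTL_ROOT = Path("/proc/sys/net/ipv4/conf")


def effective_arp_setting(name: str, iface: str, root: Path = SYSCTL_ROOT) -> int:
    """Return the effective value of an ARP sysctl for one interface.

    The kernel applies the maximum of conf/all/<name> and conf/<iface>/<name>.
    """
    values = [int((root / scope / name).read_text().strip())
              for scope in ("all", iface)]
    return max(values)


def has_recommended_arp_settings(iface: str, root: Path = SYSCTL_ROOT) -> bool:
    """True if the interface uses arp_ignore=1 and arp_announce=2."""
    return (effective_arp_setting("arp_ignore", iface, root) == 1
            and effective_arp_setting("arp_announce", iface, root) == 2)


if __name__ == "__main__":
    # "eth0" is an assumed interface name; substitute the real one.
    print(has_recommended_arp_settings("eth0"))
```

    Other values can also suppress the replies, so a `False` result here only means the settings differ from the usual recommendation, not that ARP is necessarily leaking.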

    This works nicely. If I point a client browser at the VIP, then one of the realservers responds, serving up the page.

    I've also turned on persistence, so that once LVS decides which realserver is going to handle a client's request, it records the choice in its connection table and routes subsequent traffic from that client to the same realserver (until a configurable timeout period expires).
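
    The persistence behaviour can be pictured with a small simulation. This is a toy model, not LVS code: the `PersistentScheduler` class is hypothetical, it uses plain round-robin for new clients, and it assumes the timer is refreshed on every request, which is a simplification made for the sketch.

```python
import itertools


class PersistentScheduler:
    """Toy round-robin scheduler that pins each client to one realserver."""

    def __init__(self, realservers, timeout):
        self._cycle = itertools.cycle(realservers)
        self._timeout = timeout
        self._table = {}  # client_ip -> (realserver, expires_at)

    def pick(self, client_ip, now):
        """Return the realserver for this client at time `now` (seconds)."""
        entry = self._table.get(client_ip)
        if entry is not None and now < entry[1]:
            server = entry[0]           # entry still valid: reuse it
        else:
            server = next(self._cycle)  # new or expired: schedule afresh
        # Assumption of this sketch: every request restarts the timer.
        self._table[client_ip] = (server, now + self._timeout)
        return server


if __name__ == "__main__":
    sched = PersistentScheduler(["rs1", "rs2", "rs3"], timeout=300)
    for t in (0, 100, 1000):
        print(t, sched.pick("10.0.0.1", t))
```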

    The issue I'm having is routing traffic to the load balancer itself. LVS will recognise that a realserver with the IP address "127.0.0.1" is local and will, thus, just route the packet directly to the appropriate port (or so the documentation tells me). This is "localnode" routing.

    But when I try this with the ISPConfig interface - on port 8080 and using HTTPS - then I just get "connection refused".

    My initial thought was that the packet is addressed to the VIP (192.168.0.99), but this is only a virtual IP address (so it can be swapped between load balancers), and the actual IP address of the load balancer is 192.168.0.100. So perhaps Apache is not listening on the right IP address?

    But I tried everything - sticking the VIP in all the relevant "Listen" and VirtualHost places - and this doesn't fix it. Also, the "connection refused" is immediate rather than a timeout, so something is actively rejecting the connection rather than silently dropping the packets.
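
    To tell an immediate refusal apart from a timeout, a small probe like the following can help. It is a minimal sketch using Python's standard `socket` module; the `probe` function name is made up for this example, and it only tests the TCP connection, not HTTPS itself.

```python
import socket


def probe(host: str, port: int, timeout: float = 2.0) -> str:
    """Try a TCP connection and classify the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The peer answered with a reset: nothing accepts on that port there.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: packets are probably dropped.
        return "timeout"
    except OSError as exc:
        return f"error: {exc}"


if __name__ == "__main__":
    # Addresses taken from the post: the VIP and the load balancer's own IP.
    for host in ("192.168.0.99", "192.168.0.100", "127.0.0.1"):
        print(host, probe(host, 8080))
```

    Running it from a client and again on the load balancer itself shows whether the refusal comes from the machine you think you are talking to.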

    Does anyone have experience with this sort of thing, or any clue why I can't route traffic from the VIP to https://127.0.0.1:8080 and have Apache respond to this?

    The actual load balancing itself works, but this problem is blocking access to the ISPConfig interface.

    I guess, if all else fails, I could shift the ISPC interface onto a different server, so it's not "localnode" - as the direct routing part is working fine - but I'd rather not have to de-construct and re-construct this ISPC cluster all over again.
     
  2. BobGeorge

    Strike that.

    I'd simply misconfigured one of the realservers, so it was still answering ARP requests for the VIP. Thank you, Wireshark.

    As mentioned, ARP replies for the VIP have to be off on the realservers. Otherwise, when the router asks "who has the virtual IP address?", multiple machines respond, which doesn't make sense - who should the packet be sent to? - and the connection just gets refused.
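
    The same check that Wireshark made visible can be sketched in a few lines: given the (IP, MAC) pairs seen in ARP replies, flag any IP claimed by more than one MAC. The function name and the MAC addresses below are hypothetical, and the input is assumed to have been extracted from a capture already.

```python
from collections import defaultdict


def conflicting_arp_owners(replies):
    """Map each IP claimed by more than one MAC to its sorted MAC list.

    `replies` is an iterable of (ip, mac) pairs taken from ARP replies.
    """
    owners = defaultdict(set)
    for ip, mac in replies:
        owners[ip].add(mac.lower())
    return {ip: sorted(macs) for ip, macs in owners.items() if len(macs) > 1}


if __name__ == "__main__":
    # Made-up capture: the load balancer and one stray realserver
    # both answer for the VIP 192.168.0.99.
    seen = [
        ("192.168.0.99", "02:00:00:00:00:0a"),
        ("192.168.0.100", "02:00:00:00:00:0a"),
        ("192.168.0.99", "02:00:00:00:00:0b"),
    ]
    print(conflicting_arp_owners(seen))
```

    A capture that spans a Heartbeat failover would also show two MACs for the VIP, so a hit here needs to be read alongside the timing.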

    Once the errant ARP replies were silenced, it all worked as I'd originally expected it to.
     
  3. BobGeorge

    Well, apologies for that. It's not intended to make anyone grumpy.

    If it helps at all: as we upgraded our network connection to a nice symmetric fibre-optic line, I had to completely redo it all, pretty much from scratch, after this anyway. I don't mind if you want to take some schadenfreude from that. ;)
     
