Let's Encrypt (Again)

Discussion in 'ISPConfig 3 Priority Support' started by BobGeorge, May 21, 2018.

  1. BobGeorge

    BobGeorge Member

    I've got a small issue with Let's Encrypt.

    When I try to create a new LE certificate with ISPConfig, Apache fails to restart. It's a fatal error, and I have to fix the issue manually to get Apache back up and running.

    The error itself is actually found in a different vhost file for a website I created - and upgraded to SSL - ages back.

    Basically, the vhost file refers to the SSL certificate found in /var/www/domain.tld/ssl. Looking this up in the file system, I can see the issue. The files in "/var/www/domain.tld/ssl" are named, going by memory, something like "domain.tld-le-0001.crt", with "-0001" suffixed to them. But the actual file has no "-0001" suffix: it's just "domain.tld-le.crt" in the vhost file and in "/etc/letsencrypt/archive".

    So I fixed the problem by manually editing out the "-0001" suffix from the filenames of "/var/www/domain.tld/ssl" (well, actually, as these files are links, then I just deleted them and recreated the links again but without the unnecessary "-0001" suffix).

    Then Apache restarts just fine and everything's back to normal.
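    The manual check described above (looking at the links in the site's "ssl" folder and seeing which ones point nowhere) could be sketched roughly as follows. This is only an illustration: the function name find_dangling_links is made up for this example, and the directory you pass in is whatever your site's "ssl" folder happens to be.

```python
import os


def find_dangling_links(ssl_dir):
    """Return (link name, target) pairs for symlinks whose target is missing.

    ssl_dir is a directory such as a site's "ssl" folder (hypothetical here).
    """
    dangling = []
    for name in sorted(os.listdir(ssl_dir)):
        path = os.path.join(ssl_dir, name)
        # islink() looks at the link itself; exists() follows the link,
        # so a link that fails exists() points at a file that is gone.
        if os.path.islink(path) and not os.path.exists(path):
            dangling.append((name, os.readlink(path)))
    return dangling
```

    It only reports the broken links; deleting and recreating them is left as a manual step, as in the fix described above.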

    Now, I presume that the "-0001" suffix is added, as is convention, when there's more than one file with the same filename to force a unique filename. But the links in "/var/www/domain.tld/ssl" are doing this when there is, in fact, no reason to do so. The vhost file refers to a non-suffixed certificate file and the actual certificate files in Let's Encrypt's directories are non-suffixed too.

    At some point - perhaps when I was originally sorting things out and fixing my earlier problems - this particular website might well have ended up with a "domain.tld-le.crt" certificate and a "domain.tld-le-0001.crt". Just because, you know, as I was setting things up initially, I tried to grab an LE certificate more than once as part of the old "trial and error" needed to get things working.

    The thing is, though, when I got it all working, I probably - though, granted, I don't recall doing so - cleared out the unnecessary earlier "failed attempts" and then grabbed the certificate again. The current one. The one that does work.

    But ISPConfig - as I assume that it's actually ISPConfig that's responsible for creating those link files in "/var/www/domain.tld/ssl" to create the "bridge" between Apache's vhost file and Let's Encrypt's directory of certificates - is still creating links for the old "-0001" versions that no longer exist.

    And so the problem is that the vhost file refers to a file without "-0001" in the website's "ssl" folder and can't find it, as the links have had "-0001" added to them. Apache fails to restart, as the "SSLCertificateFile" directive is pointing to a non-existent file.
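    A rough way to spot this kind of mismatch before Apache does is to pull the certificate paths out of a vhost file and check that each one exists. The sketch below assumes the vhost uses the standard mod_ssl directives SSLCertificateFile, SSLCertificateKeyFile and SSLCertificateChainFile; the helper name missing_ssl_files is invented for this example, and quoted paths containing spaces are not handled.

```python
import os
import re

# mod_ssl directives that name a file on disk. Commented-out lines are
# skipped because the directive must be the first thing on the line.
_SSL_FILE_DIRECTIVE = re.compile(
    r'^\s*(SSLCertificateFile|SSLCertificateKeyFile|SSLCertificateChainFile)'
    r'\s+"?([^"\s]+)"?',
    re.IGNORECASE | re.MULTILINE,
)


def missing_ssl_files(vhost_text, exists=os.path.exists):
    """Return (directive, path) pairs whose path does not exist.

    The exists parameter is there only so the check can be tried out
    without touching the real file system.
    """
    missing = []
    for directive, path in _SSL_FILE_DIRECTIVE.findall(vhost_text):
        if not exists(path):
            missing.append((directive, path))
    return missing
```

    Run against the text of each vhost file, an empty result means every referenced certificate file is present; anything else names the directive and the path that would stop Apache from starting.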

    And the bigger problem is that - and I consider this a design flaw in Apache itself - because it hits this fatal error in this earlier website vhost file, it just aborts and fails to restart. Which, of course, means Apache isn't running and ALL THE WEBSITES go down.

    (This is why I think this is a design flaw in the way Apache works. It loads the entire config as what is, in effect, one long config file: the main config file plus all the individual vhost files pulled in through "Include" directives. So if there's one mistake in any of the included vhost files, the whole thing fails.

    This is just a design flaw, in my opinion. Each vhost file should really be loaded individually and if one of them fails, then just that vhost fails. It can then otherwise continue to load the others and carry on.

    As it currently works, a single syntax error - a typo - or single "obsolete" directive that's no longer accepted or a single reference to a non-existent file (as I've got), brings down the ENTIRE web server. And all the other websites - which are not broken and would work just fine - are also brought down with that single broken site.

    One error - as I've got here - that sneaks into the configuration of just one of the websites and all of them are brought down, as the web server won't come up.

    This is simply a design flaw in the web server itself. This is not sensible nor useful behaviour.

    If, instead, Apache were to load each vhost file individually - with some kind of "load_vhost" directive - then it could pass / fail each one individually. So the web server can still come back up with any / all the non-failed vhosts, continuing to provide a service to the other sites, which are the majority, and only failing to bring up that one vhost that's failed.

    Or, in other words, one broken vhost configuration should not be a fatal error for the entire web server. It is a "partial fatal error", as it were, in that this one vhost can't be brought up - it's fatal to that vhost - but it shouldn't equate to being fatal to all vhosts and the web server itself.)

    Anyway, this design flaw in Apache itself to one side, my issue is that if I do anything regarding Let's Encrypt certificates - create a new one, renewal, removal or anything like that - then ISPConfig will, of course, re-sync the LE certificates and somewhere in the configuration is the "ghost" of a certificate - that "-0001" suffix - that no longer exists.

    So I guess my question here is whether there's some simple way for me to clear this "ghost" or force a "reset" of the configuration to how things are now. Because until I can clear out this "ghost", any activity to do with Let's Encrypt brings back the "ghost" links, which breaks Apache's config and means that Apache dies, bringing down my web servers and - rather catastrophically - killing all web services.

    (Hmm, I guess Apache does come with a config checker utility, so you can run the config through a checker to see if it's valid before you actually pull the trigger and use that config. This, I assume, is what the Apache devs would tell me should happen and, thus, it's not the "design flaw" I claim it to be. Well, I still think it's poor design and it's simply not working here, as the restarts are failing hard and killing all service, over something that shouldn't affect anything but that single broken vhost.)
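    The "check first, then restart" idea can be sketched as a small wrapper. The checker that ships with Apache is "apachectl configtest" (on Debian-style systems also available as "apache2ctl configtest"). The restart command and the service name "apache2" below are assumptions about a Debian-style setup, and the function names are made up for this example. Whether the check catches a missing certificate file may depend on the Apache version, so treat this as a safety net rather than a guarantee.

```python
import subprocess


def config_is_valid(check_cmd=("apachectl", "configtest")):
    """Return True only if the config check exits with status 0."""
    try:
        result = subprocess.run(list(check_cmd), capture_output=True, text=True)
    except FileNotFoundError:
        # The checker itself is not installed or not on PATH.
        return False
    return result.returncode == 0


def safe_restart(check_cmd=("apachectl", "configtest"),
                 restart_cmd=("systemctl", "restart", "apache2")):
    """Restart only when the check passes; otherwise leave Apache running.

    restart_cmd is an assumption (Debian-style service name).
    """
    if not config_is_valid(check_cmd):
        return False
    subprocess.run(list(restart_cmd), check=True)
    return True
```

    The point of the design is that a failed check returns early, so the old, still-working Apache process keeps serving the other sites instead of being stopped and failing to come back.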
     
  2. till

    till Super Moderator Staff Member ISPConfig Developer

    ISPConfig searches for the right SSL cert automatically and adjusts the symlinks.
    1) Ensure that you run the current ISPConfig version, 3.1.11; this function does not exist in older versions.
    2) Ensure that the Let's Encrypt check is not disabled under System > Server Config.
    3) In case you use a custom vhost template, ensure that the SSL sections of that template (the paths to the SSL certs) are written the same way as in the current ISPConfig vhost template.
     
  3. till

    till Super Moderator Staff Member ISPConfig Developer

    And btw: I totally agree that Apache's behavior here is not useful at all. It would be way better if it just skipped SSL for the given site, but that's nothing that we can change. You might file a complaint at their bug tracker, though.
     
