Let's Encrypt + Debian 8 + Nginx - Service not reloaded after cert renewal

Discussion in 'ISPConfig 3 Priority Support' started by JanThiel, Apr 25, 2018.

  1. JanThiel

    JanThiel Member

    Hi there,
    I am looking for some troubleshooting advice. The setup looks as the following:
    - Debian 8
    - Certbot 0.10.2
    - ISPC 3.1.11 (Master / Slave setup)
    - Nginx

    On two of our slaves where rarely config changes are done, we ran (independently) into a situation where the LE cert expired. After doing some research, we found out that the cert renewal works without any issues and the new certs were all around and symlinked correctly. But the nginx process was never reloaded / restarted afterwards and thus no new certs being delivered to the customer.
    A simple service nginx restart made everything working again. Triggering any other action in ISPC that reloads / restarts the service also did the trick. On other machines we never stumbled upon this issue which was mainly because we do have regular site config updates which trigger the nginx restart anyway.

    So as far as I were able to follow up: The ISPC LE Cron which is written in PHP seems to be working fine. Except of the service unit restart. Sadly I have no idea how to dig in deeper at this point.

    I consider that this might be a timing issue looking at the cron Code:
    https://git.ispconfig.org/ispconfig...ib/classes/cron.d/900-letsencrypt.inc.php#L69
    Because if the restart indicator file is not there (synchronously) after the exec is done then no service restart will be performed. We did have runtimes of the LE renewal of several seconds (up to >10s). Depending on the amount of alias domains.

    But this is just the only clue I could come up with. Any ideas or advices how to debug this further? Or are there known configs that might lead to such conditions (although I do not think so looking at the ISPC LE Cron source code)?

    Thanks and best regards,
    Jan
     
  2. till

    till Super Moderator Staff Member ISPConfig Developer

    Exec waits for the command to be finished when the command is not sent to the background, at least as far as I know. But you might try adding a sleep command after the exec. I guess the problem might be that the certs have been renewed already by the master, so there is nothing left to be renewed on the mirror and therefore nginx does not get reloaded.

    Probably a reload of nginx once a night by cronjob might be the easiest solution.
     
  3. JanThiel

    JanThiel Member

    Thanks for your feedback Till.

    I do (still ;-) ) believe that exec works that way too, but that's the only line of code that looks like something being able to cause this situation. And race conditions are always a PITA to debug...
    Regarding Master / Slave. This is not a mirror Setup, it's just the Master as our control server running the interface with all the independent slaves being attached to it. The master does not have the same domains nor certs configured.

    The next expected cert renewal run on one of the servers is in about two or three days, so I might give the sleep() a try then.
    I also already considered the nginx restart by cron but disliked that solution. As this happens on two independent servers of ours, other ISPC users might also run into it. That's why I would prefer finding the issue and a "real" solution for it.

    So if the sleep is everything I can try, I will get back to you in about three days, after the renewal run went through. If someone else comes up with others ideas, I am happy to put them into the code before the renewal happens.

    Best,
    Jan
     
  4. till

    till Super Moderator Staff Member ISPConfig Developer

    If it's not a mirror, then my guess from above was wrong of course :) The beehaviour of a non mirrored slavee should not be different from a standalone or master server, quitee strange. Maybe you should add some debug code too that writes into a log file when exec is called and when the code inside the "if(file_exists($marker_file) && trim(file_get_contents($marker_file)) == '1') {"is executed.
     
  5. JanThiel

    JanThiel Member

    Do you have any convenient logging mechanism in place at this point of code execution I can use to add the debug output?
     
  6. till

    till Super Moderator Staff Member ISPConfig Developer

    The function $app->log($msg, $priority); should be available there. Priority values are: 0 = DEBUG, 1 = WARNING, 2 = ERROR, the level corresponds to the log levels that you can set in ISPConfig UI for the server. Messages will get logegd to ispconfig.log file when priority >= log level.
     
    JanThiel likes this.
  7. JanThiel

    JanThiel Member

    And another question in this context:
    If we assume the exec call being reliable sync (which I would expect at the moment) why the hassle with the le.restart file? If the exec call is done, there should be no problem in restarting the httpd in any case like it is done in the other occurences at the Cron. The only benefit I currently see would be to prevent the httpd from being restarted, if the LE renewal fails. And if this is intended, I would question why this isn't done at the other renewal calls.
    Just something that came up my mind reading the code again.

    //Edit: Thanks for the Log info!
     
  8. till

    till Super Moderator Staff Member ISPConfig Developer

    This code exists to ensure that nginx or apache gets restarted only when there was a renewal and not every night. When you run certbot renew and there is nothing to be renewed, then the post hook command will not be run.
     
  9. JanThiel

    JanThiel Member

    Hi Till,
    the run went through yesterday night as expected. Nginx wasn't restarted again... as expected.
    I patched the ISPC LE Cron file ( /usr/local/ispconfig/server/lib/classes/cron.d/900-letsencrypt.inc.php - this is the correct file, isn't it?) like this and set the Log level for this Server to Debug via the ISPC Interface (I added the debug lines prefixed wit "LE-DEBUG"):

    PHP:
    $server_config $app->getconf->get_server_config($conf['server_id'], 'server');
                    if(!isset(
    $server_config['migration_mode']) || $server_config['migration_mode'] != 'y') {
                            
    $letsencrypt explode("\n"shell_exec('which letsencrypt certbot /root/.local/share/letsencrypt/bin/letsencrypt /opt/eff.org/certbot/venv/bin/certbot'));
                            
    $letsencrypt reset($letsencrypt);
                            if(
    is_executable($letsencrypt)) {
                                    
    $app->log('LE-DEBUG - Running LE Renewal'LOGLEVEL_DEBUG);
                                    
    $version exec($letsencrypt ' --version  2>&1'$ret$val);
                                    if(
    preg_match('/^(\S+|\w+)\s+(\d+(\.\d+)+)$/'$version$matches)) {
                                            
    $type strtolower($matches[1]);
                                            
    $version $matches[2];
                                            if((
    $type != 'letsencrypt' && $type != 'certbot') || version_compare($version'0.7.0''<')) {
                                                    
    exec($letsencrypt ' -n renew');
                                                    
    $app->services->restartServiceDelayed('httpd''force-reload');
                                            } else {
                                                    
    $app->log('LE-DEBUG - Writing LE Restart File'LOGLEVEL_DEBUG);
                                                    
    $marker_file '/usr/local/ispconfig/server/le.restart';
                                                    
    $cmd "echo '1' > " $marker_file;
                                                    
    exec($letsencrypt ' -n renew --post-hook ' escapeshellarg($cmd));
                                                    
    $app->log('LE-DEBUG - Renew is done'LOGLEVEL_DEBUG);
                                                    if(
    file_exists($marker_file) && trim(file_get_contents($marker_file)) == '1') {
                                                            
    unlink($marker_file);
                                                            
    $app->log('LE-DEBUG - Restarting HTTPD Service'LOGLEVEL_DEBUG);
                                                            
    $app->services->restartServiceDelayed('httpd''force-reload');
                                                    }
                                                    
    $app->log('LE-DEBUG - Everything done'LOGLEVEL_DEBUG);
                                            }
                                    } else {
                                            
    exec($letsencrypt ' -n renew');
                                            
    $app->services->restartServiceDelayed('httpd''force-reload');
                                    }
                            }
                    } else {
                            
    $app->log('Migration mode active, not running Let\'s Encrypt renewal.'LOGLEVEL_DEBUG);
                    }
    I then checked the time of the new LE Certs (28.04. - 21:20) and read the /var/log/ispconfig/ispconfig.log

    There are only these two lines (over and over again but nothing else):
    28.04.2018-21:21 - DEBUG - Calling function 'check_phpini_changes' from plugin 'webserver_plugin' raised by action 'server_plugins_loaded'.
    28.04.2018-21:21 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock

    From my current learnigns, it looks like the ISPC LE Cron file was never executed or I should have seen the log output...

    Neither crontab nor cron.d / cron.daily contain any Letsencryt native Cronjobs. So I am a bit confused where the renewal call comes from at all...

    I also checked for the LE executables but those look fine:
    SERVER:/# which letsencrypt certbot /root/.local/share/letsencrypt/bin/letsencrypt /opt/eff.org/certbot/venv/bin/certbot
    /usr/bin/letsencrypt
    /usr/bin/certbot

    SERVER:/# /usr/bin/letsencrypt -v
    Root logging level set at 10
    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    certbot version: 0.10.2
    Arguments: ['-v']
    ...

    SERVER:/# /usr/bin/certbot -v
    Root logging level set at 10
    Saving debug log to /var/log/letsencrypt/letsencrypt.log
    certbot version: 0.10.2
    Arguments: ['-v']
    ...

    How can I check what might be wrong with the ISPC Crons? Any other ideas?

    Thanks and have a sunny sunday,
    Jan
     
    Last edited: Apr 29, 2018
  10. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    Looks like there are two certbots and two let`s encrypts installed. This may cause issues.
     
  11. JanThiel

    JanThiel Member

    Just checked: /usr/bin/letsencrypt is just a symlink to /usr/bin/certbot. Looks like a legacy failsafe.
    So no, there is just one version installed.

    Anyway it shouldn't matter as only one script is/should be called by the ISPC LE Cron.

    Jan
     
  12. till

    till Super Moderator Staff Member ISPConfig Developer

    I'll do some tests here on my server and will report back.
     
  13. JanThiel

    JanThiel Member

    Hi Till, any updates on your side?
    We had this happening again and were only able to circumvent it using a cron with service nginx reload once each month.
    Have a great weekend,
    Jan
     
  14. till

    till Super Moderator Staff Member ISPConfig Developer

    I do not have had this issue on any of my servers nor does anyone else reported this, so it might be that it is something special with your setup that causes this.

    Please check if there are any cronjobs for letsencrypt or certbot in /etc/cron.d/ folder.
     

Share This Page