[SOLVED] Monitoring data not updated after multiserver upgrade

Discussion in 'Installation/Configuration' started by ggallo, Mar 14, 2024.

  1. ggallo

    ggallo New Member

    Hi All!

    I'm in the process of upgrading all of our ISPConfig servers from Debian 10 to Debian 12. In the starting state, all servers were on Debian 10 with ISPConfig 3.2.11p1, and all of them had been updated to the latest available packages.
    I have a separate admin panel server, web server, DB server, mail server, two DNS servers and a webmail server.
    Everything worked fine.

    I decided to start with the least important server, the webmail server. I upgraded it to Debian 11 and then to 12, following Howtoforge's how-tos on upgrading ISPConfig servers from Debian 10 to 11 and from Debian 11 to 12 respectively. I went through all the steps for both the Debian and the ISPConfig upgrade.
    All went well: the webmail server was upgraded to Debian 12 and ISPConfig to 3.2.11p2. I reconfigured everything with ispconfig_updater.sh except the master DB permissions.
    Over the next few days I repeated this on the two DNS servers without any issues.
    All upgrades showed up on the admin panel's monitoring page, and all data (server load, memory usage, logs, etc.) was updated regularly as usual.

    Yesterday evening I upgraded the admin panel server and then the database server in exactly the same way. Both servers were upgraded to Debian 12 and ISPConfig 3.2.11p2 without any problems. On the admin panel server, I instructed ispconfig_updater to update the master DB permissions, too.

    The problem started here: since the admin panel server upgrade, none of the other 6 servers update their status information (monitoring) in the master server's monitor_data table, except for the system_update rows, which are updated by all servers...
    Even the database server, which I upgraded after the admin panel server, still shows up on the admin panel's monitoring page as Debian Buster and ISPConfig 3.2.11p1. This has not been updated after 12 hours (although the server really is on Debian Bookworm and ISPConfig 3.2.11p2).

    Now 2 servers are still on Debian 10 and 5 servers have been upgraded to Debian 12; all servers are running ISPConfig 3.2.11p2. The database version is 99 on all servers. All servers use the distro's default PHP version (7.3 on D10, 8.2 on D12), and I also ran update-alternatives for PHP.

    Of course I downloaded the htf_common_issues script and ran it on every server, but it showed nothing wrong, so I have not included its many lines of output in this post.

    Even the admin panel functionality works correctly (so database connectivity from the slaves to the master works): if I change anything on any server through the panel, the job finishes within a minute, the job queue is empty, and the corresponding server does the job.
    Only the monitoring data is not updated (and I'm afraid the quota usage data is not updated either...).

    I already set the log level to debug on the database server; server.sh and cron.sh both run without errors. I even added some extra debug output to cron.php and some cron classes (I copied the originals, then restored them after debugging). Everything seems to work, but no new data is written to the master database.

    Where should I look next?
     
  2. till

    till Super Moderator Staff Member ISPConfig Developer

    Try to run optimize and repair on the monitor_data table on the master, e.g. with phpMyAdmin; maybe the table has crashed.
     
  3. ggallo

    ggallo New Member

    Thank you for your quick answer!

    Unfortunately these two actions did not solve the problem.
    For the optimize, MySQL said "Table does not support optimize, doing recreate+analyze instead".
    For the repair, MySQL said "The storage engine of the table doesn't support repair" (table is default install, InnoDB based).
     
  4. ggallo

    ggallo New Member

    After a few hours of troubleshooting, I finally found the cause.

    I saw that some of the items in the master's monitor_data table were being updated for all hosts, but many of them were not.
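
    A minimal sketch of how such a check could be scripted, to see which monitoring items have gone stale per server. It assumes that monitor_data has the columns server_id, type and created, and that created holds a Unix timestamp; these column names are assumptions and should be verified against the actual schema. The demo uses an in-memory SQLite table with made-up rows so it runs anywhere; against the real master database a MySQL driver would be used instead, and its placeholder style may be %s rather than ?.

```python
import sqlite3


def stale_monitor_types(conn, now, max_age_seconds=3600):
    """Return (server_id, type, newest_created) for items with no recent row.

    Assumed schema (verify first): monitor_data(server_id, type, created),
    with `created` stored as a Unix timestamp.
    """
    cutoff = now - max_age_seconds
    cur = conn.cursor()
    cur.execute(
        "SELECT server_id, type, MAX(created) AS newest "
        "FROM monitor_data "
        "GROUP BY server_id, type "
        "HAVING MAX(created) < ? "
        "ORDER BY server_id, type",
        (cutoff,),
    )
    return cur.fetchall()


def demo_connection():
    """Build an in-memory stand-in for monitor_data with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE monitor_data (server_id INTEGER, type TEXT, created INTEGER)"
    )
    conn.executemany(
        "INSERT INTO monitor_data VALUES (?, ?, ?)",
        [
            (1, "server_load", 10_000),    # old: stale
            (1, "system_update", 99_000),  # recent
            (2, "server_load", 98_500),    # recent
            (2, "mem_usage", 20_000),      # old: stale
        ],
    )
    return conn


if __name__ == "__main__":
    for row in stale_monitor_types(demo_connection(), now=100_000):
        print(row)
```

    Items that appear in the result have not been written for longer than max_age_seconds, which narrows down which cron sub-classes to inspect.
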
    I added some debug prints to cron.php and the cron sub-classes on one of the non-updating slave servers and found that the non-updating cron sub-classes did not run on schedule at all.

    I went to ISPConfig's GitLab to check the sources of the crontab (parent) class for changes, but it has not changed in 2 years. Reading the code, I found that there is a 'running' field, which prevents job execution if it is true (1). This information is stored on every server in its local database (not on the master), in the sys_cron table.
    I checked the non-updating server's tables, and many rows contained 1 in the 'running' column.

    I then updated all the rows with 'running'=1 to 0, and all servers started updating all the data in the master's monitor_data table.
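
    The check and the reset described above can be sketched roughly as follows. The table sys_cron and the column 'running' come from the description above; the column `name` and the job names in the demo are assumptions for illustration only. The demo runs against an in-memory SQLite table; on a real server the same two statements would be run against that server's local ISPConfig database (not the master).

```python
import sqlite3


def stuck_cron_jobs(conn):
    """List jobs whose 'running' flag is still set.

    The `name` column is an assumption; check the real sys_cron schema.
    """
    cur = conn.cursor()
    cur.execute("SELECT name FROM sys_cron WHERE running = 1 ORDER BY name")
    return [row[0] for row in cur.fetchall()]


def reset_stuck_cron_jobs(conn):
    """Clear the 'running' flag on all flagged rows; return how many changed."""
    cur = conn.cursor()
    cur.execute("UPDATE sys_cron SET running = 0 WHERE running = 1")
    conn.commit()
    return cur.rowcount


def demo_connection():
    """Build an in-memory stand-in for sys_cron with hypothetical job names."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sys_cron (name TEXT PRIMARY KEY, running INTEGER)")
    conn.executemany(
        "INSERT INTO sys_cron VALUES (?, ?)",
        [("cronjob_a", 1), ("cronjob_b", 0), ("cronjob_c", 1)],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print("stuck before:", stuck_cron_jobs(conn))
    print("rows reset:", reset_stuck_cron_jobs(conn))
    print("stuck after:", stuck_cron_jobs(conn))
```

    Listing the flagged rows before resetting them keeps a record of which jobs were blocked.
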

    I don't know where I messed up the upgrade process so that those 'running' fields stayed at 1.
    My guess is that I accidentally restarted the master database at the wrong moment during the upgrade.
     
    till likes this.