I have a big problem with my multi-server environment (5 servers: master/web, mail, db, dns1 and dns2). All of them are Xen VPSes. It all began with an upgrade to Debian 8 that did not succeed. I restored a previous snapshot, and since then the job queue has stopped working. The queue shows:

2016-01-09 12:36 dns2.***********.com Update server
2016-01-09 12:36 dns2.***********.com Update server
2016-01-09 12:35 dns1.***********.com Update server
2016-01-09 12:35 dns1.***********.com Update server
2016-01-09 12:35 db.***********.com Update server
2016-01-09 12:35 db.***********.com Update server
2016-01-09 12:35 mail.***********.com Update server
2016-01-09 12:35 mail.***********.com Update server

If I execute sudo /usr/local/ispconfig/server/server.sh, I get the following.

Master server:

09.01.2016-12:49 - DEBUG - Set Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
09.01.2016-12:49 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
finished.

Mail and the rest of the servers show something similar:

09.01.2016-12:50 - DEBUG - Set Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
/usr/bin/fail2ban-client
/sbin/iptables
/sbin/ip6tables
09.01.2016-12:50 - DEBUG - Remove Lock: /usr/local/ispconfig/server/temp/.ispconfig_lock
finished.

What can I do?
The above shows that the slaves are working fine; there is just no newer data available for them. The problem is probably that the restored snapshot of the master contains an older database, so the master now holds less data than the slaves have already processed. Check that the value of the "updated" column in the "server" database table, on the master server and on the slave servers, is not higher than the highest ID in the sys_datalog table on the master server. The sync works like this: get changes from sys_datalog where the sys_datalog ID > the value of the updated column of that server's record in the server table.
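To make the rule above concrete, here is a minimal Python sketch of the selection logic only. It is not ISPConfig's actual code; the function name pending_changes and all ID values are made up for illustration.

```python
def pending_changes(datalog_ids, updated):
    """Return the sys_datalog IDs a slave would still pick up.

    Mirrors the rule from the post: only entries with an ID greater
    than the server's "updated" value count as new.
    """
    return [i for i in sorted(datalog_ids) if i > updated]


# Hypothetical master datalog after a restore: IDs 100..110.
master_datalog_ids = range(100, 111)

# A slave that is behind the master sees the newer entries.
healthy = pending_changes(master_datalog_ids, 107)

# A slave whose "updated" value is ahead of the restored master sees nothing,
# even though new changes are being logged on the master.
ahead = pending_changes(master_datalog_ids, 115)

print(healthy)  # [108, 109, 110]
print(ahead)    # []
```

The second case is the situation described here: the slave runs without errors, but its "updated" value is larger than every ID the master can offer, so the result is always empty.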
Hi Till.

In server table:

1 - master/web   8456
2 - mail         8429
3 - db           8430
4 - dns1         8431
5 - dns2         8432

In sys_datalog table:

1 - master/web   8456
2 - mail         8466
3 - db           8460
4 - dns1         8462
5 - dns2         8464

I see that a multiserver setup with snapshots is not a good idea...
Check the updated column in the server table on all slave servers. If one of them is > 8456, change it to 8456, so that all updates you make on the master from now on get picked up by the slaves again.
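The correction above could be sketched like this in Python. This is only an illustration against an in-memory SQLite table, not a script to run on a real ISPConfig (MySQL) database. The table layout is reduced to three columns, the helper name clamp_updated is hypothetical, and the slave values 8466 and 8450 are invented; only the ceiling 8456 comes from this thread.

```python
import sqlite3


def clamp_updated(conn, master_max_id):
    """Lower every "updated" value that is above master_max_id.

    Returns the number of rows that were changed. Rows at or below
    the ceiling are left untouched.
    """
    cur = conn.execute(
        "UPDATE server SET updated = ? WHERE updated > ?",
        (master_max_id, master_max_id),
    )
    conn.commit()
    return cur.rowcount


# Simplified stand-in for the "server" table (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE server (server_id INTEGER PRIMARY KEY, server_name TEXT, updated INTEGER)"
)
conn.executemany(
    "INSERT INTO server (server_id, server_name, updated) VALUES (?, ?, ?)",
    [(1, "slave-a", 8466), (2, "slave-b", 8450)],  # invented values
)

changed = clamp_updated(conn, 8456)
rows = conn.execute(
    "SELECT server_name, updated FROM server ORDER BY server_id"
).fetchall()
print(changed)  # 1
print(rows)     # [('slave-a', 8456), ('slave-b', 8450)]
```

Passing the ceiling as a parameter in both places of the statement keeps the written value and the comparison value identical, so a mistyped number cannot end up in only one of them.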
All OK, Till, the job queue works again, but now I have another problem... The monitor tells me that DNS has warnings in the log:

Writing BIND domain file failed: /etc/bind/pri.domain.com
zone domain.com/IN: has no NS records
zone domain.com/IN: not loaded due to errors.

I've deleted the warnings and the affected zone and created the zone again, but the warning persists. When I changed the value in the updated column, I first entered 8056 instead of 8456 (sorry... many hours without sleep...) and then fixed it with the correct value -.-
I've deleted the entire customer and the warnings, and now it seems to be working again. The DNS zone is created correctly in /etc/bind/named, and the job queue processes perfectly. The monitor debug output seems OK. Thank you so much for your help, Till.