Hi, I have added a second MX server to the ISPConfig cluster and would like to synchronize everything via Dovecot replication. However, the replicator stopped working at about half of the 300 GB of mail, or it is extremely slow (2 GB in the last 15 hours). Since then I have been getting more error messages like "failed: Resource temporarily unavailable". Is there any way to force a resync or restart the replication? Or at least to see what is actually happening (something like the rsync verbose output)?

Here is my Dovecot configuration:

Code:
auth_default_realm = example.com
protocol imap {
  mail_plugins = $mail_plugins quota imap_quota notify replication imap_sieve
}
protocol pop3 {
  mail_plugins = $mail_plugins quota notify replication
}
protocol lda {
  mail_plugins = $mail_plugins sieve quota notify replication
}
protocol lmtp {
  mail_plugins = $mail_plugins sieve quota notify replication
}
doveadm_password = ********************************
doveadm_port = 12345
replication_max_conns = 10
ssl_client_ca_dir = /etc/ssl/certs

# Replicator process should be started at startup, so it can start replicating users immediately:
service replicator {
  process_min_avail = 1
}

# The mail processes need to have access to the replication-notify fifo and socket.
service aggregator {
  fifo_listener replication-notify-fifo {
    user = vmail
    mode = 0600
  }
  unix_listener replication-notify {
    user = vmail
    mode = 0600
  }
}

# Enable doveadm replicator commands
service replicator {
  unix_listener replicator-doveadm {
    mode = 0600
  }
}

# Create a listener for doveadm-server
service doveadm {
  user = vmail
  inet_listener {
    port = 12345
    ssl = yes
  }
}

service config {
  unix_listener config {
    user = vmail
  }
}

plugin {
  mail_replica = tcps:mail2.example.com
}

plugin {
  sieve_plugins = sieve_imapsieve sieve_extprograms

  # From elsewhere to Spam folder
  imapsieve_mailbox1_name = Junk
  imapsieve_mailbox1_causes = COPY
  imapsieve_mailbox1_before = file:/etc/dovecot/rspamd/rspamd-learn-spam.sieve

  # From Spam folder to elsewhere
  imapsieve_mailbox2_name = *
  imapsieve_mailbox2_from = Junk
  imapsieve_mailbox2_causes = COPY
  imapsieve_mailbox2_before = file:/etc/dovecot/rspamd/rspamd-learn-ham.sieve

  sieve_pipe_bin_dir = /etc/dovecot/rspamd
  sieve_global_extensions = +vnd.dovecot.pipe +vnd.dovecot.environment
}

Snippet from mail.log:

Code:
root@mail2:~ $ cat /var/log/mail.log | grep replicator-doveadm
2023-04-26T09:03:25.789550+02:00 mail2 dovecot: doveadm([email protected])<194934><onvFJbjMSGR2+QIAFL3U8g>: Error: net_connect_unix(/run/dovecot/replicator-doveadm) failed: Resource temporarily unavailable
2023-04-26T09:03:54.126310+02:00 mail2 dovecot: doveadm([email protected])<194962><abpZM9jMSGSS+QIAFL3U8g>: Error: net_connect_unix(/run/dovecot/replicator-doveadm) failed: Resource temporarily unavailable
2023-04-26T09:04:54.081370+02:00 mail2 dovecot: doveadm([email protected])<195028><QB51DRXNSGTU+QIAFL3U8g>: Error: net_connect_unix(/run/dovecot/replicator-doveadm) failed: Resource temporarily unavailable

I made sure that the ports are open.

Code:
root@mail:~ $ nmap -p 12345 mail2.example.com
Starting Nmap 7.93 ( https://nmap.org ) at 2023-04-26 09:11 CEST
Nmap scan report for mail2.example.com (10.0.1.141)
Host is up (0.00031s latency).

PORT      STATE SERVICE
12345/tcp open  netbus
MAC Address: 1A:14:9B:... (Unknown)

Nmap done: 1 IP address (1 host up) scanned in 0.24 seconds

The current status of replication:

Code:
root@mail:~ $ doveadm replicator status '*'
username           priority  fast sync  full sync  success sync  failed
[email protected]  none      00:06:13   18:50:43   00:06:11      -
[email protected]  none      17:38:01   21:32:40   17:36:37      -
[email protected]  none      00:04:44   21:32:43   -             y
[email protected]  none      00:01:23   21:32:43   -             y
[email protected]  none      00:00:59   20:03:04   -             y
[email protected]  none      00:04:44   21:32:09   -             y
[email protected]  none      00:04:44   21:32:43   -             y

Code:
root@mail:~ $ df | grep /var/vmail
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/rbd3      411725224 276419236 114318084  71% /var/vmail

root@mail2:~ $ df | grep /var/vmail
Filesystem     1K-blocks      Used Available Use% Mounted on
/dev/rbd3      411725224 166388972 224348348  43% /var/vmail
I just saw a new message in the dsync status which may be a hint as to what is going on:

Code:
root@mail:~ $ doveadm replicator dsync-status
username type status
[email protected] normal Waiting for handshake
[email protected] normal Waiting for handshake
- Not connected
- Not connected
...
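To pick out only the users whose last sync failed from the `doveadm replicator status` table, a small filter along these lines might help. It is a minimal sketch that assumes the column layout shown in the status output (username in the first column, `failed` in the last, with `y` marking a failure); `failed_users` is a hypothetical helper name, not a doveadm command.

```shell
# Hypothetical helper: reads `doveadm replicator status '*'` output on stdin
# and prints the usernames whose "failed" column (the last field) is "y".
# Assumes the header is the first line and usernames contain no spaces.
failed_users() {
    awk 'NR > 1 && $NF == "y" { print $1 }'
}

# Usage (on the server running the replicator):
#   doveadm replicator status '*' | failed_users
```

The resulting list can then be used to check individual mailboxes instead of scanning the whole table by eye.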
The Dovecot Wiki is your best friend here. Enable Dovecot's option to log errors and debug messages into separate log files, then check those logs to see what is going on.

If you are using Dovecot 2.2.x, use this wiki: https://wiki.dovecot.org
Otherwise, use: https://doc.dovecot.org/

Which Dovecot version do you use? Older Dovecot 2.3 releases are known to have replication issues. You should also check for yourself whether it makes sense to update to the newest Dovecot version. If you do, make sure that you have a working backup ready in case something goes horribly wrong...
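As a rough illustration of the separate log files, the sketch below prints a logging fragment that could be added to the Dovecot configuration. The setting names (`log_path`, `info_log_path`, `debug_log_path`, `mail_debug`) are standard Dovecot settings; the file paths and the helper name `dovecot_logging_fragment` are assumptions for this example, so adjust them to your system.

```shell
# Hypothetical helper: prints a Dovecot logging fragment to stdout.
# The paths below are assumptions; pick locations that suit your setup
# and make sure the Dovecot processes are allowed to write there.
dovecot_logging_fragment() {
    cat <<'EOF'
# Errors and warnings
log_path = /var/log/dovecot.log
# Informational messages
info_log_path = /var/log/dovecot-info.log
# Debug messages
debug_log_path = /var/log/dovecot-debug.log
# Verbose mail process debugging; turn it off again after troubleshooting
mail_debug = yes
EOF
}

# Usage: review the output, add it to your Dovecot configuration, then reload:
#   dovecot_logging_fragment
#   doveadm reload
```

Afterwards, `doveadm log find` should show where Dovecot is actually writing each log level.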
Apparently, the replication process was stopped for some reason and could not be resumed. The notification itself worked, so new messages were still replicated. I was able to jump-start the whole replication with:

Code:
doveadm sync -1 -A -f remote:mail2.example.com
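For reference, in `doveadm sync` the `-1` flag requests a one-way sync, `-A` applies it to all users, and `-f` forces a full sync. A possible variation is to resync a single user, or to ask the replicator to queue a full sync itself. The wrappers below are a minimal sketch: the function names and the `DOVEADM` variable are hypothetical conveniences (setting `DOVEADM=echo` gives a dry run), and the destination string should be whatever form works in your setup.

```shell
# DOVEADM is a hypothetical override for this sketch, not something doveadm
# itself reads. Set DOVEADM=echo to print the commands instead of running them.
DOVEADM="${DOVEADM:-doveadm}"

# Ask the replicator to queue a full sync for all users matching a mask
# (defaults to every user).
queue_full_resync() {
    $DOVEADM replicator replicate -f "${1:-*}"
}

# Run a one-way full sync for one user.
#   $1 = user, $2 = destination (for example remote:mail2.example.com)
sync_one_user() {
    $DOVEADM sync -u "$1" -1 -f "$2"
}

# Usage:
#   queue_full_resync 'user@example.com'
#   sync_one_user 'user@example.com' 'remote:mail2.example.com'
```

Syncing one mailbox first makes it easier to follow the logs before starting a run over all users.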