dovecot-replicator is slow or not working anymore

Discussion in 'Installation/Configuration' started by frashman, Apr 26, 2023.

  1. frashman

    frashman New Member

    Hi, I have added a second MX server to my ISPConfig cluster and would like to synchronize everything via Dovecot replication. However, the replicator stopped working after about half of the 300 GB of mail, or it has become really slow (2 GB in the last 15 hours). Since then I have been getting more error messages like "failed: Resource temporarily unavailable".

    Is there any way to force a resync or restart the replication? Or at least see what actually happens (something like the rsync verbose output)?

    Here is my Dovecot configuration:
    Code:
    auth_default_realm = example.com
    protocol imap {
      mail_plugins = $mail_plugins quota imap_quota notify replication imap_sieve
    }
    protocol pop3 {
      mail_plugins = $mail_plugins quota notify replication
    }
    protocol lda {
      mail_plugins = $mail_plugins sieve quota notify replication
    }
    protocol lmtp {
      mail_plugins = $mail_plugins sieve quota notify replication
    }
    doveadm_password = ********************************
    doveadm_port = 12345
    replication_max_conns = 10
    
    ssl_client_ca_dir = /etc/ssl/certs
    
    # Replicator process should be started at startup, so it can start replicating users immediately:
    service replicator {
      process_min_avail = 1
    }
    
    # The mail processes need to have access to the replication-notify fifo and socket.
    service aggregator {
        fifo_listener replication-notify-fifo {
            user = vmail
            mode = 0600
        }
    
        unix_listener replication-notify {
            user = vmail
            mode = 0600
        }
    }
    
    # Enable doveadm replicator commands
    service replicator {
        unix_listener replicator-doveadm {
            mode = 0600
        }
    }
    
    # Create a listener for doveadm-server
    service doveadm {
        user = vmail
        inet_listener {
            port = 12345
            ssl = yes
        }
    }
    service config {
        unix_listener config {
            user = vmail
        }
    }
    
    plugin {
        mail_replica = tcps:mail2.example.com
    }
    plugin {
        sieve_plugins = sieve_imapsieve sieve_extprograms
    
        # From elsewhere to Spam folder
        imapsieve_mailbox1_name = Junk
        imapsieve_mailbox1_causes = COPY
        imapsieve_mailbox1_before = file:/etc/dovecot/rspamd/rspamd-learn-spam.sieve
    
        # From Spam folder to elsewhere
        imapsieve_mailbox2_name = *
        imapsieve_mailbox2_from = Junk
        imapsieve_mailbox2_causes = COPY
        imapsieve_mailbox2_before = file:/etc/dovecot/rspamd/rspamd-learn-ham.sieve
    
        sieve_pipe_bin_dir = /etc/dovecot/rspamd
    
        sieve_global_extensions = +vnd.dovecot.pipe +vnd.dovecot.environment
    }
    
    Snippet from mail.log:
    Code:
    root@mail2:~ $ cat /var/log/mail.log | grep replicator-doveadm
    2023-04-26T09:03:25.789550+02:00 mail2 dovecot: doveadm([email protected])<194934><onvFJbjMSGR2+QIAFL3U8g>: Error: net_connect_unix(/run/dovecot/replicator-doveadm) failed: Resource temporarily unavailable
    2023-04-26T09:03:54.126310+02:00 mail2 dovecot: doveadm([email protected])<194962><abpZM9jMSGSS+QIAFL3U8g>: Error: net_connect_unix(/run/dovecot/replicator-doveadm) failed: Resource temporarily unavailable
    2023-04-26T09:04:54.081370+02:00 mail2 dovecot: doveadm([email protected])<195028><QB51DRXNSGTU+QIAFL3U8g>: Error: net_connect_unix(/run/dovecot/replicator-doveadm) failed: Resource temporarily unavailable
    

    I made sure that the port is open:
    Code:
    root@mail:~ $ nmap -p 12345 mail2.example.com
    Starting Nmap 7.93 ( https://nmap.org ) at 2023-04-26 09:11 CEST
    Nmap scan report for mail2.example.com (10.0.1.141)
    Host is up (0.00031s latency).
    
    PORT      STATE SERVICE
    12345/tcp open  netbus
    MAC Address: 1A:14:9B:... (Unknown)
    
    Nmap done: 1 IP address (1 host up) scanned in 0.24 seconds
    
    The current status of replication:
    Code:
    root@mail:~ $ doveadm replicator status '*'
    username                                                         priority fast sync full sync success sync failed
    [email protected]                                                none     00:06:13  18:50:43  00:06:11     -    
    [email protected]                                               none     17:38:01  21:32:40  17:36:37     -    
    [email protected]                                            none     00:04:44  21:32:43  -            y    
    [email protected]                                           none     00:01:23  21:32:43  -            y    
    [email protected]                                             none     00:00:59  20:03:04  -            y    
    [email protected]                                                 none     00:04:44  21:32:09  -            y    
    [email protected]                                              none     00:04:44  21:32:43  -            y    
    
    Code:
    root@mail:~ $ df | grep /var/vmail
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/rbd3      411725224 276419236 114318084  71% /var/vmail
    
    root@mail2:~ $ df | grep /var/vmail
    Filesystem     1K-blocks      Used Available Use% Mounted on
    /dev/rbd3      411725224 166388972 224348348  43% /var/vmail
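
The status table above marks users whose last sync failed with a `y` in the final column. A minimal sketch of how those users could be picked out and queued for a full resync is shown below. The function name `failed_users` is hypothetical, and the parsing assumes the column layout shown above with usernames that contain no whitespace.

```shell
# Print the usernames whose "failed" column (the last field) is "y".
# Reads the output of `doveadm replicator status '*'` on stdin.
failed_users() {
    # NR > 1 skips the header row; $NF is the last whitespace-separated field.
    awk 'NR > 1 && $NF == "y" { print $1 }'
}

# Possible usage (not executed here): request a full sync for each failed user.
#   doveadm replicator status '*' | failed_users | while IFS= read -r user; do
#       doveadm replicator replicate -f "$user"
#   done
```

`doveadm replicator replicate -f` only queues a full sync with the replicator, so it will not help if the replicator process itself is stuck.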
    
     
  2. frashman

    frashman New Member

    I just saw a new message in the dsync status which could be a hint as to what's going on:

    Code:
    root@mail:~ $ doveadm replicator dsync-status
    username                                             type   status               
    [email protected]                            normal Waiting for handshake
    [email protected]                               normal Waiting for handshake
                                                         -      Not connected       
                                                         -      Not connected
    ...
    
     
  3. michelangelo

    michelangelo Active Member

    The Dovecot Wiki is your best friend here.
    Enable Dovecot's feature to log errors and debug messages into separate log files and then check the logs to see what is going on.
    If you are using Dovecot 2.2.x then use this wiki: https://wiki.dovecot.org

    Otherwise https://doc.dovecot.org/

    Which Dovecot version do you use?

    Older releases of Dovecot 2.3 are known to have replication issues.
    You should also check for yourself if it makes sense to update to the newest Dovecot version, but if you do so, make sure that you have a working backup ready, in case something goes horribly wrong...
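
The suggestion above to log errors and debug messages into separate files might look roughly like the sketch below. The helper name `write_debug_logging_conf`, the log file locations, and the drop-in path in the usage comment are assumptions for illustration. `log_path`, `info_log_path`, `debug_log_path` and `mail_debug` are standard Dovecot settings.

```shell
# Write a small Dovecot config fragment that splits logging into separate files.
# $1: path of the fragment to create (an assumption; it must be included by dovecot.conf).
write_debug_logging_conf() {
    conf_file=$1
    cat > "$conf_file" <<'EOF'
# Errors and warnings (replaces syslog as the log target)
log_path = /var/log/dovecot.log
# Informational messages
info_log_path = /var/log/dovecot-info.log
# Debug messages, written only while a debug setting is enabled
debug_log_path = /var/log/dovecot-debug.log
mail_debug = yes
EOF
}

# Possible usage (not executed here), assuming conf.d/*.conf is included:
#   write_debug_logging_conf /etc/dovecot/conf.d/99-debug-logging.conf
#   doveadm reload
```

Debug logging is verbose on a busy mail server, so it is probably best to switch `mail_debug` off again once the cause has been found.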
     
  4. frashman

    frashman New Member

    Apparently the initial replication had stopped for some reason and could not be resumed. The change notification itself still worked, so new messages were replicated. I was able to jump-start the whole replication with:
    Code:
    doveadm sync -1 -A -f remote:mail2.example.com
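
In the command above, `-1` requests a one-way sync, `-A` runs it for all users and `-f` forces a full sync. To watch progress per mailbox owner, in the spirit of rsync's verbose output, the same sync could be run one user at a time with doveadm's global `-v` flag. The sketch below assumes that approach. The function name `sync_users` and the `DRY_RUN` variable are hypothetical.

```shell
# Run a verbose, one-way, full sync for each username read from stdin.
# $1: sync destination, e.g. remote:mail2.example.com
# With DRY_RUN=1 the commands are only printed, not executed.
sync_users() {
    dest=$1
    while IFS= read -r user; do
        [ -n "$user" ] || continue   # skip empty lines
        if [ "${DRY_RUN:-0}" = 1 ]; then
            printf 'doveadm -v sync -1 -f -u %s %s\n' "$user" "$dest"
        else
            # </dev/null keeps doveadm from consuming the user list on stdin.
            doveadm -v sync -1 -f -u "$user" "$dest" </dev/null ||
                printf 'sync failed for %s\n' "$user" >&2
        fi
    done
}

# Possible usage (not executed here), assuming the userdb supports listing users:
#   doveadm user '*' | sync_users remote:mail2.example.com
```

Running users one at a time is slower than `-A`, but a failure then points at a specific account instead of being lost in the overall run.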
    
     
  5. Th0m

    Th0m ISPConfig Developer Staff Member ISPConfig Developer

    See https://forum.howtoforge.com/threads/dsync-faq.86543/ as well.
     
