Hi, over the last months I have worked a lot on improving our mail cluster. We run several Debian 8 based HP servers with Dovecot, Postfix and ISPConfig 3.1.2. The /var/vmail directory is shared across the cluster via GlusterFS; as I write this we are on Gluster version 3.9.1. In general this works fine and we have had really good experiences with GlusterFS: it is robust and the self-healing works well, only speed is a little drawback.

But under heavy load we discovered more and more problems with Dovecot's index files, which by default get stored inside the maildir. We saw more and more messages like these:

[CODE]
Error: Corrupted transaction log file ....
Error: Log synchronization error at seq=5,offset=17212 for ....
Index ..... : Lost log for seq=67 offset=33312
[/CODE]

and so on. We had so many errors that the problems became user-visible. For example, users could not move emails between IMAP folders, the emails simply "jumped back", and sometimes users got internal server error messages.

It seems to me that the file locking for all these Dovecot index files does not work properly across GlusterFS, although I set the corresponding switches in the Dovecot config file /etc/dovecot/conf.d/10-mail.conf:

[CODE]
mail_temp_dir = /var/tmp
mail_fsync = always
mmap_disable = yes
fsync_disable = no
mail_nfs_storage = yes
mail_nfs_index = yes
lock_method = fcntl
[/CODE]

I spent a lot of time searching through several boards, but found no usable solution. So I started experimenting with moving the index files somewhere else, and decided to use /var/vmail-index.
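Before (or after) moving the indexes, one quick way to sanity-check locking on the shared volume is a small smoke test. This is only a rough sketch of my own, not something from the setup above: note that flock(1) exercises flock()-style locks rather than the fcntl locks Dovecot uses with lock_method = fcntl, so a pass here does not prove Dovecot's locking is healthy. The DIR variable and the lock file name are placeholders; point DIR at the GlusterFS mount to exercise the shared volume.

```shell
#!/bin/bash
# Rough locking smoke test (assumptions: flock(1) from util-linux is installed;
# DIR and the lock file name are placeholders, /tmp is only a local default).
DIR="${DIR:-/tmp}"
LOCKFILE="$DIR/locktest.$$"

# Hold an exclusive lock on fd 9 in a background subshell for two seconds.
( flock -x 9; sleep 2 ) 9>"$LOCKFILE" &
HOLDER=$!
sleep 0.5

# A second, non-blocking attempt must now fail if locking works on this mount.
if flock -n "$LOCKFILE" -c true; then
    RESULT="broken"
    echo "LOCKING BROKEN: second lock succeeded while the first was still held"
else
    RESULT="ok"
    echo "locking OK: second lock was refused while the first was held"
fi

wait "$HOLDER"
rm -f "$LOCKFILE"
```

If this reports broken locking on the Gluster mount but works on a local directory, that points at the volume rather than at Dovecot.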
As root I executed:

[CODE]
mkdir /var/vmail-index
chown vmail:vmail /var/vmail-index
[/CODE]

(Remark: /var/vmail-index is on a different file system and different hard drives than /var/vmail; in my case this lowered the I/O load and the IMAP access times.)

Now edit /etc/dovecot/dovecot-sql.conf and comment out this line:

[CODE]
# user_query = SELECT email as user, maildir as home, CONCAT( maildir_format, ':', maildir, '/', IF(maildir_format='maildir','Maildir',maildir_format)) as mail, uid, gid, CONCAT('*:storage=', quota, 'B') AS quota_rule, CONCAT(maildir, '/.sieve') as sieve FROM mail_user WHERE (login = '%u' OR email = '%u') AND `disable%Ls` = 'n' AND server_id = '1'
[/CODE]

and replace it with:

[CODE]
user_query = SELECT email as user, maildir as home, CONCAT( maildir_format, ':', maildir, '/', IF(maildir_format='maildir','Maildir',maildir_format), ':INDEX=/var/vmail-index/%d/%n') as mail, uid, gid, CONCAT('*:storage=', quota, 'B') AS quota_rule, CONCAT(maildir, '/.sieve') as sieve FROM mail_user WHERE (login = '%u' OR email = '%u') AND `disable%Ls` = 'n' AND server_id = '1'
[/CODE]

Then restart Dovecot:

[CODE]
/etc/init.d/dovecot force-reload
[/CODE]

From now on, Dovecot will place the index files under a directory structure like this:

[CODE]
/var/vmail-index/DOMAIN/USER/.xxxxxxx/
[/CODE]

If you start this on a busy server, the I/O load will increase for a while, as Dovecot has to create a lot of directories. But once it is done, IMAP access gets really, really fast! Since then we have seen none of the index errors described above. I wondered whether there might be problems because each Dovecot now uses its own index while users get moved to another server by our load balancer, but it works fine.

The drawback of this procedure is that you have to take care of your index directory yourself, since deleting mailboxes or domains in ISPConfig will not delete the corresponding folders in /var/vmail-index. So I created a little bash script which gets executed every night on each machine:

[CODE]
#!/bin/bash

# Verbose? 0 -> off, 1 -> little, 2 -> a lot
verbose=1
# Really delete? ("echtbetrieb" = live operation)
echtbetrieb=true
# Maildir path
MAILDIR='/var/vmail'
# Index path
INDEXDIR='/var/vmail-index'

test ! -d $MAILDIR && echo "MAILDIR $MAILDIR is no directory, exit." && exit 1
test ! -d $INDEXDIR && echo "INDEXDIR $INDEXDIR is no directory, exit." && exit 1

test $verbose -gt 0 && echo "Clearing index dir $INDEXDIR"
test $echtbetrieb || echo "Testing - no deletion"

cd $INDEXDIR
for domain in $(ls)
do
    if [ -d $INDEXDIR/$domain ]
    then
        if [ -d $MAILDIR/$domain ]
        then
            test $verbose -gt 1 && echo "$MAILDIR/$domain still exists"
            for user in $(ls $INDEXDIR/$domain)
            do
                if [ -d $MAILDIR/$domain/$user ]
                then
                    test $verbose -gt 1 && echo "$MAILDIR/$domain/$user still exists"
                else
                    test $verbose -gt 0 && echo "$MAILDIR/$domain/$user doesn't exist any more, gets deleted"
                    test $echtbetrieb && rm -rf $INDEXDIR/$domain/$user
                fi
            done
        else
            test $verbose -gt 0 && echo "$MAILDIR/$domain doesn't exist any more, gets deleted"
            test $echtbetrieb && rm -rf $INDEXDIR/$domain
        fi
        # sleep 1
    fi
done
[/CODE]

(Note: the original version had two small bugs that are fixed above: the flag was assigned as "echbetrieb" but tested as "echtbetrieb", and the inner loop listed the users in $MAILDIR/$domain, so index folders of already deleted users were never visited; it has to iterate over $INDEXDIR/$domain instead.)

So, maybe this is a little bit helpful for somebody. I'm not sure whether this is "ISPConfig-update-safe"; I'll know after the next update.
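For anyone wondering how the ':INDEX=/var/vmail-index/%d/%n' suffix maps logins to the index directories the cleanup script walks: %d expands to the domain part and %n to the local part of the login. A tiny sketch of that expansion (index_path_for is just a hypothetical helper of mine, not part of Dovecot):

```shell
#!/bin/bash
# Hypothetical helper mirroring Dovecot's %d/%n expansion in the INDEX path.
index_path_for() {
    local login="$1"
    local domain="${login#*@}"   # %d: everything after the '@'
    local user="${login%@*}"     # %n: everything before the '@'
    echo "/var/vmail-index/$domain/$user"
}

index_path_for "alice@example.com"   # -> /var/vmail-index/example.com/alice
```

On a live node, `doveadm user <login>` should print the userdb fields returned by the user_query, so you can cross-check that the mail field really carries the INDEX suffix.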
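To run the cleanup script nightly, a cron entry along these lines should do. The script path /usr/local/sbin/clean-vmail-index.sh, the time, and the log file are my assumptions; adjust them to wherever you keep the script.

```shell
# Hypothetical /etc/cron.d/clean-vmail-index entry (path and time are assumptions)
30 3 * * * root /usr/local/sbin/clean-vmail-index.sh >> /var/log/clean-vmail-index.log 2>&1
```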
Thank you for providing the details on scaling ISPConfig in your setup. I recommend writing a small ISPConfig plugin that binds itself to the mailbox delete event to remove the index files, instead of a bash script that has to go through all directories. Regarding update-safeness: just take care that you store the modified template for the dovecot-sql.conf file in /usr/local/ispconfig/server/conf-custom/install/ to make the config changes update-safe.
Did you try making your changes to /etc/dovecot.conf instead of /etc/dovecot/conf.d/10-mail.conf? A server with ISPConfig does not read conf.d/*.
Oh, you're absolutely right, I must have been blind ... I fixed that. I gave it a try and turned one node back to the original dovecot-sql.conf. The result was frustrating: the locking now works with the settings in /etc/dovecot.conf, but accessing mailboxes becomes really slow. The load average rose to 10, while it is around 0.5-2 with my configuration and the index files on different, fast hard drives. So moving the index files off the shared file system turned out to be an enormous speed enhancement in our setup. We located /var on a SAS RAID1, so access is really fast. Sorry for my stupid question: do I simply put a copy of my modified dovecot-sql.conf in that directory, is that enough? Yes, this will be a project within the next weeks. The shell script was just a first, quick solution.
You can try different mount options for your GlusterFS. Is there any need for such a setup? I would use replication for 2 nodes, and Dovecot Director if you need more servers.
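For anyone curious what the suggested replication route looks like: a two-node replicated volume is set up roughly like this. This is only a sketch of the gluster CLI to be run against a live cluster; the hostnames, the volume name "mailvol" and the brick paths are all assumptions, not from this thread.

```shell
# Sketch: two-way replicated GlusterFS volume for the maildir storage
# (hostnames node1/node2, volume name "mailvol" and brick paths are assumptions)
gluster peer probe node2
gluster volume create mailvol replica 2 node1:/bricks/mail node2:/bricks/mail
gluster volume start mailvol

# Then mount it on each node via the FUSE client:
mount -t glusterfs node1:/mailvol /var/vmail
```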
I tried several mount options for GlusterFS in the past, but that didn't bring any serious improvement. Anyway, we have had GlusterFS running for years, and it works fine. We use a cluster of Fortigate firewalls as load balancer; they use quite intelligent algorithms for load balancing.
Keep in mind that Dovecot is really fast on NFS (https://wiki2.dovecot.org/NFS). You should use Director: "Directors are mainly useful for setups where all of the mail storage is seen by all servers, such as with NFS or a cluster filesystem." (https://wiki2.dovecot.org/Director)
Again: we use an external hardware-based load balancer, so there's no need for the Director; it takes load off the mail servers. It depends on the number of concurrent accesses: during peak times we have >1500 concurrent IMAP accesses per node. So moving files that are accessed very often off the shared file system is definitely a speedup. Shared GlusterFS or NFS volumes will never be faster than local SAS or SSD storage.
Better use the template of that file, which you can find in the install/tpl/ directory of the ISPConfig tar.gz, and put that into conf-custom after you have added your modifications to that template.
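For reference, the steps would look roughly like this. Treat the tarball name and the template file name debian_dovecot-sql.conf.master as assumptions; the exact names vary between ISPConfig versions, so check install/tpl/ in your own tarball.

```shell
# Sketch: make the dovecot-sql.conf change update-safe via conf-custom
# (tarball and template file names are assumptions; check your version)
tar xzf ISPConfig-3.1.2.tar.gz
mkdir -p /usr/local/ispconfig/server/conf-custom/install
cp ispconfig3_install/install/tpl/debian_dovecot-sql.conf.master \
   /usr/local/ispconfig/server/conf-custom/install/
# Now add the :INDEX=/var/vmail-index/%d/%n suffix to the user_query
# in the copied template; future ISPConfig updates will use this copy.
```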
How would you do full-text search now that the indexes differ between nodes? That leaves few options besides external FTS, and FTS with Solr can't share the same Solr index now, so you can't have one server do the indexing and the others use it.
- I wonder if it would work to just rewrite the Dovecot index on the non-FTS-indexing nodes with one of the current indexers.
- I wonder what would happen with something like an Elasticsearch cluster here: it will probably give wrong results and index each email as many times as you have nodes.

How large does your Dovecot index get per 1 GB of emails? I see around 1% of the total.

If you use the GlusterFS FUSE client to connect to the local brick, making sure it reads from the local brick yields noticeable performance results: cluster.nufa on, cluster.choose-local on, maybe even mount with "xlator-option=*replicate*.read-subvolume-index=X".

Also, I have:

[CODE]
mail_fsync = never
#mmap_disable = no
mail_nfs_storage = yes
#mail_nfs_index = no

# but mail_fsync = optimized for lmtp from postfix
protocol lmtp {
  mail_plugins = quota sieve acl zlib listescape #mail_crypt
  auth_socket_path = /usr/local/var/run/dovecot/auth-master
  mail_fsync = optimized
}
[/CODE]
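To make those local-read options concrete, here is a hedged example of how they can be applied; the volume name "mailvol", the server name "gluster1" and the replica index 0 are assumptions, and the right index depends on which replica is local to the node.

```shell
# Volume-side options (run once against the volume; "mailvol" is an assumption):
#   gluster volume set mailvol cluster.nufa on
#   gluster volume set mailvol cluster.choose-local on
#
# /etc/fstab sketch: FUSE mount pinned to a fixed local replica
# ("gluster1", "mailvol" and index 0 are assumptions)
gluster1:/mailvol  /var/vmail  glusterfs  defaults,_netdev,xlator-option=*replicate*.read-subvolume-index=0  0  0
```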
Please disregard my full-text search concern: Solr works fine with the Dovecot indexes outside the file system that is synced with the email. Still, what I mentioned above might be useful for someone.
I had a typo in the server name in the cron commit, and the read-only Solr nodes were not picking up the uncommitted part on reload.
Maybe someone needs this info. In /etc/crontab:

[CODE]
@daily  root curl http://solr:8983/solr/dovecot/update?optimize=true
@hourly root curl http://solr:8983/solr/dovecot/update?commit=true
[/CODE]
It worked out! First of all, thanks for posting this solution. It helped me a lot, because I needed /var/customers/mail to be a symbolic link to another network storage and the index was producing a lot of errors. I am using the latest version of Froxlor (0.10.25) and adapted the index cleaning script, since the Froxlor structure contains the customer folder before the domain (e.g. /var/customers/mail/clientA/domain/user). Below it is, for whoever is interested:

[CODE]
#!/bin/bash

# Verbose? 0 -> off, 1 -> little, 2 -> a lot
verbose=2
# Really delete?
echtbetrieb=true
# Maildir path
MAILDIR='/var/customers/mail'
# Index path
INDEXDIR='/var/customers/indexes'

test $echtbetrieb || echo "Testing - no deletion"

cd $INDEXDIR
for customer in $(ls)
do
    for domain in $(ls $INDEXDIR/$customer)
    do
        if [ -d $INDEXDIR/$customer/$domain ]
        then
            if [ -d $MAILDIR/$customer/$domain ]
            then
                test $verbose -gt 1 && echo "$MAILDIR/$customer/$domain still exists"
                for user in $(ls $INDEXDIR/$customer/$domain)
                do
                    if [ -d $MAILDIR/$customer/$domain/$user ]
                    then
                        test $verbose -gt 1 && echo "$MAILDIR/$customer/$domain/$user still exists"
                    else
                        test $verbose -gt 0 && echo "$MAILDIR/$customer/$domain/$user doesn't exist any more, gets deleted"
                        test $echtbetrieb && rm -rf $INDEXDIR/$customer/$domain/$user
                    fi
                done
            else
                test $verbose -gt 0 && echo "$MAILDIR/$customer/$domain doesn't exist any more, gets deleted"
                test $echtbetrieb && rm -rf $INDEXDIR/$customer/$domain
            fi
            # sleep 1
        fi
    done
done
[/CODE]
Posting the script in CODE tags would make it more readable. Now all indentation is lost, for example.