Hello to all, and thank you for this awesome piece of software. I'm testing it and have found that the server hangs several times per day (more or less at the same times). My server has 1 CPU and 2 GB of RAM, and when it hangs I get "out of memory" errors caused by clamd. It happens 4-5 times per day. What can I check? Should I stop and remove ClamAV? (I would rather avoid this.) Thanks for your help.
Most likely ClamAV is just the symptom and not the source of the problem. You have to find out where the high load comes from: the websites or the mail system. You can see that, for example, with the top command and in mail.log and the website logs.
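To see which processes the kernel actually killed when memory ran out, the kernel log can be filtered for OOM messages. The following is only a sketch: the function name oom_victims is made up for illustration, and the pattern assumes the common kernel message wording "Out of memory: Kill process <pid> (<name>)" or "Out of memory: Killed process <pid> (<name>)", which can differ between kernel versions.

```shell
#!/bin/sh
# oom_victims (hypothetical helper): read kernel log text on stdin and
# count how often each process name was picked by the OOM killer.
# Assumes lines like:
#   Out of memory: Kill process 1234 (clamd) score 300 or sacrifice child
#   Out of memory: Killed process 1234 (clamd) total-vm:...
oom_victims() {
    sed -n 's/.*[Oo]ut of memory: Kill\(ed\)\{0,1\} process [0-9][0-9]* (\([^)]*\)).*/\2/p' \
        | sort | uniq -c | sort -rn
}

# Example usage (as root):
#   dmesg | oom_victims
#   journalctl -k | oom_victims
```

If clamd is at the top of that list every time, it is the process being killed, but not necessarily the one that used up the memory; the timestamps of the matching lines can then be compared with mail.log and the website logs.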
Here is the top output:
Code:
top - 15:34:29 up 6 days, 20:02, 1 user, load average: 0.00, 0.00, 0.75
Tasks: 141 total, 1 running, 104 sleeping, 0 stopped, 0 zombie
%Cpu(s): 1.3 us, 0.3 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 1992084 total, 271880 free, 1155060 used, 565144 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 543004 avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
14087 www-data 20 0 659120 38100 15536 S 0.7 1.9 0:00.13 apache2
775 root 20 0 677772 12968 0 S 0.3 0.7 18:43.22 fail2ban-server
13869 www-data 20 0 659136 38004 15380 S 0.3 1.9 0:00.10 apache2
15190 mysql 20 0 1154408 217080 15064 S 0.3 10.9 0:06.15 mysqld
15458 root 20 0 42908 4164 3428 R 0.3 0.2 0:00.57 top
1 root 20 0 225628 7172 4468 S 0.0 0.4 1:46.41 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.08 kthreadd
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
7 root 20 0 0 0 0 S 0.0 0.0 0:21.60 ksoftirqd/0
8 root 20 0 0 0 0 I 0.0 0.0 1:05.88 rcu_sched
9 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_bh
10 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
11 root rt 0 0 0 0 S 0.0 0.0 0:02.45 watchdog/0
12 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
13 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
14 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 netns
15 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_kthre
16 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kauditd
17 root 20 0 0 0 0 S 0.0 0.0 0:00.37 khungtaskd
18 root 20 0 0 0 0 S 0.0 0.0 0:00.08 oom_reaper
19 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 writeback
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kcompactd0
21 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
22 root 39 19 0 0 0 S 0.0 0.0 0:13.05 khugepaged
23 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 crypto
24 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kintegrityd
25 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kblockd
26 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 ata_sff
27 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 md
28 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 edac-poller
29 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 devfreq_wq
30 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 watchdogd
34 root 20 0 0 0 0 S 0.0 0.0 47:44.10 kswapd0
35 root 20 0 0 0 0 S 0.0 0.0 0:00.00 ecryptfs-kthrea
77 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kthrotld
78 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 acpi_thermal_pm
79 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_0
80 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 scsi_tmf_0
81 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_1
82 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 scsi_tmf_1
88 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 ipv6_addrconf
97 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kstrp
114 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 charger_manager
154 root 20 0 0 0 0 S 0.0 0.0 0:00.00 scsi_eh_2
155 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 scsi_tmf_2
180 root 0 -20 0 0 0 I 0.0 0.0 0:35.00 kworker/0:1H
266 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 raid5wq
318 root 20 0 0 0 0 S 0.0 0.0 0:39.61 jbd2/sda1-8
319 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 ext4-rsv-conver
409 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 iscsi_eh
416 root 20 0 44508 4008 1632 S 0.0 0.2 0:09.50 systemd-udevd
421 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 ib-comp-wq
422 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 ib_mcast
424 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 ib_nl_sa_wq
426 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rdma_cm
445 root 20 0 105904 184 0 S 0.0 0.0 0:00.03 lvmetad
550 root 20 0 12368 3128 0 S 0.0 0.2 0:08.32 haveged
551 systemd+ 20 0 141912 588 0 S 0.0 0.0 0:01.31 systemd-timesyn
589 systemd+ 20 0 80020 716 0 S 0.0 0.0 0:01.42 systemd-network
600 systemd+ 20 0 71004 2292 1244 S 0.0 0.1 0:22.68 systemd-resolve

I have 3 test websites and only one test email account.
mail.log
Code:
Nov 19 14:37:41 panel postfix/cleanup[11497]: 49F9E3F373: message-id=<[email protected]>
Nov 19 14:37:41 panel postfix/smtpd[11281]: disconnect from localhost[127.0.0.1] ehlo=1 mail=1 rcpt=1 data=1 quit=1 commands=5
Nov 19 14:37:41 panel postfix/qmgr[1156]: 49F9E3F373: from=<[email protected]>, size=2402, nrcpt=1 (queue active)
Nov 19 14:37:41 panel amavis[20854]: (20854-04) Passed UNCHECKED {RelayedOpenRelay}, [127.0.0.1] <[email protected]> -> <[email protected]>, Message-ID: <[email protected]>, mail_id: xiZDdIlVVwzV, Hits: 0.25, size: 1951, queued_as: 49F9E3F373, 351395 ms
Nov 19 14:37:42 panel postfix/smtp[11146]: 39DA63F0A9: to=<[email protected]>, relay=127.0.0.1[127.0.0.1]:10024, delay=352, delays=0.09/0.04/0.39/352, dsn=2.0.0, status=sent (250 2.0.0 from MTA(smtp:[127.0.0.1]:10025): 250 2.0.0 Ok: queued as 49F9E3F373)
Nov 19 14:37:42 panel postfix/qmgr[1156]: 39DA63F0A9: removed
Nov 19 14:37:42 panel postfix/smtp[11517]: connect to mx.zoho.com[204.141.42.121]:25: No route to host
Nov 19 14:37:43 panel dovecot: pop3-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<hJccnwR7btd/AAAB>
Nov 19 14:37:43 panel dovecot: imap-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<N9ocnwR7CuJ/AAAB>
Nov 19 14:37:43 panel dovecot: imap-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<ln8dnwR7DOJ/AAAB>
Nov 19 14:37:43 panel dovecot: pop3-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<q4AdnwR7cNd/AAAB>
Nov 19 14:37:43 panel postfix/smtpd[11548]: connect from localhost[127.0.0.1]
Nov 19 14:37:43 panel postfix/smtpd[11548]: lost connection after CONNECT from localhost[127.0.0.1]
Nov 19 14:37:43 panel postfix/smtpd[11548]: disconnect from localhost[127.0.0.1] commands=0/0
Nov 19 14:37:43 panel postfix/smtpd[11548]: connect from localhost[127.0.0.1]
Nov 19 14:37:43 panel postfix/smtpd[11548]: lost connection after CONNECT from localhost[127.0.0.1]
Nov 19 14:37:43 panel postfix/smtpd[11548]: disconnect from localhost[127.0.0.1] commands=0/0
Nov 19 14:37:47 panel postfix/smtp[11517]: 49F9E3F373: to=<[email protected]>, relay=mx2.zoho.com[8.40.222.121]:25, delay=7.5, delays=1.9/0.78/1.6/3.3, dsn=2.0.0, status=sent (250 Message received)
Nov 19 14:37:47 panel postfix/qmgr[1156]: 49F9E3F373: removed
Nov 19 14:39:59 panel postfix/pickup[6750]: 0BEF53F373: uid=5006 from=<[email protected]>
Nov 19 14:39:59 panel postfix/cleanup[11759]: 0BEF53F373: message-id=<[email protected]>
Nov 19 14:39:59 panel postfix/qmgr[1156]: 0BEF53F373: from=<[email protected]>, size=2309, nrcpt=1 (queue active)
Nov 19 14:39:59 panel amavis[20856]: (20856-04) NOTICE: reconnecting in response to: err=2006, HY000, DBD::mysql::st execute failed: MySQL server has gone away at (eval 110) line 173.
Nov 19 14:39:59 panel amavis[20856]: (20856-04) (!)connect to /var/run/clamav/clamd.ctl failed, attempt #1: Can't connect to a UNIX socket /var/run/clamav/clamd.ctl: Connection refused
Nov 19 14:40:00 panel amavis[20856]: (20856-04) (!)connect to /var/run/clamav/clamd.ctl failed, attempt #1: Can't connect to a UNIX socket /var/run/clamav/clamd.ctl: Connection refused
Nov 19 14:40:00 panel amavis[20856]: (20856-04) (!)ClamAV-clamd: All attempts (1) failed connecting to /var/run/clamav/clamd.ctl, retrying (2)
Nov 19 14:40:02 panel dovecot: pop3-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<LxhlpwR7HNh/AAAB>
Nov 19 14:40:02 panel postfix/smtpd[11841]: connect from localhost[127.0.0.1]
Nov 19 14:40:02 panel postfix/smtpd[11841]: lost connection after CONNECT from localhost[127.0.0.1]
Nov 19 14:40:02 panel postfix/smtpd[11841]: disconnect from localhost[127.0.0.1] commands=0/0
Nov 19 14:40:02 panel dovecot: imap-login: Disconnected (no auth attempts in 0 secs): user=<>, rip=127.0.0.1, lip=127.0.0.1, secured, session=<1k5lpwR7tuJ/AAAB>
Nov 19 14:40:06 panel amavis[20856]: (20856-04) (!)connect to /var/run/clamav/clamd.ctl failed, attempt #1: Can't connect to a UNIX socket /var/run/clamav/clamd.ctl: Connection refused
Nov 19 14:40:06 panel amavis[20856]: (20856-04) (!)ClamAV-clamd av-scanner FAILED: run_av error: Too many retries to talk to /var/run/clamav/clamd.ctl (All attempts (1) failed connecting to /var/run/clamav/clamd.ctl) at (eval 113) line 659.\n
Nov 19 14:40:06 panel amavis[20856]: (20856-04) (!)WARN: all primary virus scanners failed, considering backups

Can you help me understand?
You have to run top while the load is high. The top output you posted is from a time when there was almost no load at all. Regarding mail.log, try restarting the ClamAV daemon and then check whether it is running again.
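Since the server hangs when the load is high, it may be hard to run top by hand at the right moment. One possible workaround, sketched here under assumptions, is to record a small snapshot at regular intervals (for example from cron) and read it after the next hang. The function names and the log path are hypothetical; the socket path is the one from the mail.log above.

```shell
#!/bin/sh
# LOG is a hypothetical location; any writable path will do.
LOG="${LOG:-/var/log/mem-snapshot.log}"

# Append a timestamp, the load average and the ten processes with the
# largest resident memory (RSS, in KiB) to the log file.
snapshot() {
    {
        date '+%Y-%m-%d %H:%M:%S'
        cat /proc/loadavg
        ps -eo pid,user,rss,pcpu,comm --sort=-rss | head -n 11
        echo
    } >> "$LOG"
}

# Return success if the clamd UNIX socket exists.
# Default path taken from the amavis errors in mail.log.
clamd_socket_ok() {
    [ -S "${1:-/var/run/clamav/clamd.ctl}" ]
}

# To use this as a cron script, add a call to "snapshot" as the last line.
```

For the restart itself, on Debian/Ubuntu the service is usually called clamav-daemon (an assumption, check your system), so something like "systemctl restart clamav-daemon" followed by "systemctl status clamav-daemon" should show whether clamd came back up.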
There is a thread about clamd and memory: https://www.howtoforge.com/community/threads/centos-7-5-clamd-consuming-too-much-memory.79725/
Also, install the dstat command and run
Code:
dstat --disk-tps --disk-util --dstat-cpu --mem --load --cpu -C total -s 10 360
to see whether the disk might be the bottleneck. If not, experiment with other dstat options to get more information.