I loosely followed this howto: http://www.howtoforge.com/antispam_smtp_proxy and it worked well, but after a few hours of operation, perl would grab insane amounts of memory and severely delay mail delivery. ClamAV made the problem worse by grabbing even more memory, so I disabled/removed it.

My question is this: how can I figure out what causes this memory hogging? I researched and found that Perl sometimes has a problem when Net::DNS is called rapidly, so I disabled most of the DNS-related checks in ASSP (RBL and such). Even with those disabled, ASSP provides superb accuracy... but Perl still goes awry after a couple of hours, and can only be fixed (afaict) by a reboot.

Here's the output of "top" while Perl has "run away". Notice how much memory it's taking (>80%):

Code:
top - 06:17:15 up 13:18,  1 user,  load average: 2.78, 2.30, 1.98
Tasks:  57 total,   2 running,  55 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.7%us,  0.7%sy,  0.0%ni,  0.0%id, 97.7%wa,  0.0%hi,  1.0%si,  0.0%st
Mem:    125084k total,   123100k used,     1984k free,       68k buffers
Swap:   369452k total,   184720k used,   184732k free,      680k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3898 nobody    18   0  189m 102m   64 D  1.0 83.5  14:02.39 perl
 3830 mysql     18   0  123m  800   68 S  0.0  0.6   0:09.35 mysqld
 5816 root      15   0  2320  580  420 R  0.0  0.5   0:00.16 top
 5785 mr        15   0  8020  300  184 R  0.0  0.2   0:00.08 sshd
 3711 root      16   0  1700  144  144 S  0.0  0.1   0:14.03 syslogd
 4126 root      18   0  2280  128  128 S  0.0  0.1   0:01.33 cron
 4155 root      18   0 20020  112   84 S  0.0  0.1   0:00.50 apache2
 4064 root      15   0  4952   68   68 S  0.0  0.1   0:23.26 master

It's true that the box only has 128MB of RAM, and I can see that Perl's memory hogging eats into the swapfile... but it behaves so nicely for the first couple of hours after a reboot - memory consumption is stable, swapfile use is negligible, everything is great.
Here's the "top" output when things are running OK:

Code:
top - 08:40:54 up  2:21,  1 user,  load average: 0.08, 0.06, 0.07
Tasks:  77 total,   3 running,  74 sleeping,   0 stopped,   0 zombie
Cpu(s):  1.3%us,  0.0%sy,  0.0%ni, 97.7%id,  1.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    125084k total,   121808k used,     3276k free,     1272k buffers
Swap:   369452k total,      104k used,   369348k free,    42592k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3896 nobody    15   0 31976  29m 1696 R  1.3 23.9   4:39.70 perl
 3828 mysql     15   0  123m  16m 4796 S  0.0 13.3   0:00.23 mysqld
 4152 root      18   0 20024 6016 3448 S  0.0  4.8   0:00.04 apache2
 4172 www-data  18   0 20024 3200  616 S  0.0  2.6   0:00.00 apache2
 4173 www-data  18   0 20024 3200  616 S  0.0  2.6   0:00.00 apache2
 4174 www-data  18   0 20024 3200  616 S  0.0  2.6   0:00.00 apache2
 4175 www-data  21   0 20024 3200  616 S  0.0  2.6   0:00.00 apache2
 4176 www-data  21   0 20024 3200  616 S  0.0  2.6   0:00.01 apache2
 4647 mr        15   0  5508 3024 1444 S  0.0  2.4   0:00.19 bash
 4612 postfix   15   0  5324 2816 2292 S  0.0  2.3   0:00.25 smtpd
 4613 postfix   15   0  5320 2812 2292 S  0.0  2.2   0:00.26 smtpd

It's been running fine here for just over two hours, but I expect it to wig out any time now. Is there a log somewhere for runaway Perl processes? Anyone have any ideas? Thanks in advance.

BTW, I found a legend for the "STAT" field:

STAT: The state of the task is shown here. The state is either S for sleeping, D for uninterruptible sleep, R for running, Z for zombies, or T for stopped or traced. These states are modified by a trailing < for a process with negative nice value, N for a process with positive nice value, W for a swapped-out process.

"Uninterruptible sleep"? My head hurts....
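One way to pin down when the growth starts is to log the process's memory figures at regular intervals instead of catching them by hand in top. Below is a minimal sketch in Python, assuming a Linux /proc filesystem and Python 3.6 or later. The function names parse_status and watch are made up for illustration, and the VmSwap field only exists on newer kernels, so it may simply be missing from the output on an older box.

```python
import re
import sys
import time


def parse_status(text):
    """Pick the process state and memory sizes (in kB) out of /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        value = value.strip()
        if key == "State":
            # e.g. "D (disk sleep)" -> "D", the same letter top shows in its S column
            fields["State"] = value.split()[0] if value else ""
        elif key in ("VmSize", "VmRSS", "VmSwap"):
            match = re.match(r"(\d+)\s*kB", value)
            if match:
                fields[key] = int(match.group(1))
    return fields


def watch(pid, interval=60):
    """Print one timestamped sample per interval until interrupted (hypothetical helper)."""
    while True:
        with open(f"/proc/{pid}/status") as fh:
            info = parse_status(fh.read())
        print(time.strftime("%H:%M:%S"), info, flush=True)
        time.sleep(interval)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (assumed): python memwatch.py <pid of the perl process>
    watch(int(sys.argv[1]))
```

Redirecting the output to a file and comparing the timestamps against the ASSP and mail logs may show whether the jump in VmRSS lines up with a particular message or check.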
Is there a better place to ask about this? I'm still having the same issue.... I'll keep looking but just wondering.
Well, I bit the bullet and cannibalized an old machine for an extra 256MB of RAM. It's only been up for 5 minutes, but it hasn't touched the swap at all. It appears to have fixed it, but I'll post again if it hasn't!
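For keeping an eye on whether swap stays untouched over the next few hours, a small check against /proc/meminfo would do. This is only a sketch in Python, assuming Linux; swap_used_kb is a made-up name for illustration.

```python
import os


def swap_used_kb(meminfo_text):
    """Return swap in use (kB), computed as SwapTotal - SwapFree from /proc/meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[key.strip()] = int(parts[0])
    return values["SwapTotal"] - values["SwapFree"]


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as fh:
        print("swap used (kB):", swap_used_kb(fh.read()))
```

Run from cron every few minutes with the output appended to a file, this would give a simple record of when swap use starts to climb, if it ever does.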