Hi Folks - I am in the process of setting up a new (remote) web server (Debian 3.1, kernel 2.6) to replace my aging old one. I have successfully transferred the web directories, users and 42goisp configuration files to it, and am testing sites before I make the switch.

Unfortunately, it fairly frequently seems to freeze/lock up for a few seconds, and when I examine the logs I find (in the syslog) what appear to be out-of-memory problems where oom-killer has stepped in and killed a process (usually mysqld). I wonder if this is what is causing the system to lock up, and if so what I might be able to do about it.... It puzzles me because the machine has very little load at the moment, and is running pretty much the bare minimum of processes.

Below is a typical excerpt from the syslog. If anyone can help in any way I would be grateful...

<snip>
Jul 26 11:51:50 localhost kernel: oom-killer: gfp_mask=0x1d2
Jul 26 11:51:50 localhost kernel: DMA per-cpu:
Jul 26 11:51:50 localhost kernel: cpu 0 hot: low 2, high 6, batch 1
Jul 26 11:51:50 localhost kernel: cpu 0 cold: low 0, high 2, batch 1
Jul 26 11:51:50 localhost kernel: Normal per-cpu:
Jul 26 11:51:50 localhost kernel: cpu 0 hot: low 32, high 96, batch 16
Jul 26 11:51:50 localhost kernel: cpu 0 cold: low 0, high 32, batch 16
Jul 26 11:51:50 localhost kernel: HighMem per-cpu: empty
Jul 26 11:51:50 localhost kernel:
Jul 26 11:51:50 localhost kernel: Free pages: 2752kB (0kB HighMem)
Jul 26 11:51:50 localhost kernel: Active:61090 inactive:60234 dirty:0 writeback:0 unstable:0 free:688 slab:2778 mapped:121210 pagetables:889
Jul 26 11:51:50 localhost kernel: DMA free:1424kB min:20kB low:40kB high:60kB active:6012kB inactive:5992kB present:16384kB
Jul 26 11:51:50 localhost kernel: protections[]: 10 356 356
Jul 26 11:51:50 localhost kernel: Normal free:1328kB min:692kB low:1384kB high:2076kB active:238348kB inactive:234944kB present:498560kB
Jul 26 11:51:50 localhost kernel: protections[]: 0 346 346
Jul 26 11:51:50 localhost kernel: HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
Jul 26 11:51:50 localhost kernel: protections[]: 0 0 0
Jul 26 11:51:50 localhost kernel: DMA: 0*4kB 0*8kB 33*16kB 14*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1424kB
Jul 26 11:51:50 localhost kernel: Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1328kB
Jul 26 11:51:50 localhost kernel: HighMem: empty
Jul 26 11:51:50 localhost kernel: Swap cache: add 11543237, delete 11543217, find 1415125/1447558, race 4+38
Jul 26 11:51:50 localhost kernel: Out of Memory: Killed process 981 (mysqld).
</snip>

Occasionally I get 'Out of Memory: Killed process 1631 (grpconv).' instead of mysqld.
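The per-zone lines in the excerpt above can be checked mechanically rather than by eye. What follows is only a minimal sketch, assuming Python is available on the box; the function names `zones_below_low` and `killed_processes` are made up for illustration, and the regular expressions are written against the 2.6-era log format shown in this post, so other kernels may well print something different. It reports zones whose free memory is below the kernel's "low" watermark, and lists the processes the oom-killer removed.

```python
import re

# Matches lines such as:
#   kernel: Normal free:1328kB min:692kB low:1384kB high:2076kB ... present:498560kB
ZONE_RE = re.compile(
    r"kernel: (\w+) free:(\d+)kB min:(\d+)kB low:(\d+)kB high:(\d+)kB"
    r".* present:(\d+)kB"
)
# Matches lines such as:
#   kernel: Out of Memory: Killed process 981 (mysqld).
KILL_RE = re.compile(r"Out of Memory: Killed process (\d+) \(([^)]+)\)")


def zones_below_low(log_text):
    """Return (zone, free_kB, low_kB) for populated zones with free < low."""
    hits = []
    for line in log_text.splitlines():
        m = ZONE_RE.search(line)
        if m is None:
            continue
        zone = m.group(1)
        free, low, present = int(m.group(2)), int(m.group(4)), int(m.group(6))
        # Skip zones with no memory at all (HighMem on this machine).
        if present > 0 and free < low:
            hits.append((zone, free, low))
    return hits


def killed_processes(log_text):
    """Return (pid, name) for every process the oom-killer reported killing."""
    return [(int(pid), name) for pid, name in KILL_RE.findall(log_text)]


# A few lines copied from the syslog excerpt in this post.
SAMPLE = """\
Jul 26 11:51:50 localhost kernel: DMA free:1424kB min:20kB low:40kB high:60kB active:6012kB inactive:5992kB present:16384kB
Jul 26 11:51:50 localhost kernel: Normal free:1328kB min:692kB low:1384kB high:2076kB active:238348kB inactive:234944kB present:498560kB
Jul 26 11:51:50 localhost kernel: HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
Jul 26 11:51:50 localhost kernel: Out of Memory: Killed process 981 (mysqld).
"""

if __name__ == "__main__":
    print(zones_below_low(SAMPLE))
    print(killed_processes(SAMPLE))
```

On the posted excerpt only the Normal zone is flagged, which fits the picture of the machine being genuinely short of free pages at the moment of the kill; it does not by itself say which process used the memory.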
More info (taken right now - obviously it is difficult to check when it actually freezes - but you can see there should be no problem; mysql is the biggest app but is taking up virtually no space)....

vmstat -s -S M
495 M total memory
54 M used memory
17 M active memory
20 M inactive memory
441 M free memory
0 M buffer memory
18 M swap cache
1913 M total swap
27 M used swap
1886 M free swap
1124330 non-nice user cpu ticks
2559 nice user cpu ticks
55440 system cpu ticks
240589628 idle cpu ticks
160936 IO-wait cpu ticks
0 IRQ cpu ticks
111761 softirq cpu ticks
8767092 pages paged in
79986237 pages paged out
222597 pages swapped in
14412690 pages swapped out
2524898472 interrupts
52454484 CPU context switches
1151505178 boot time
403014 forks

ps -e --cols=79 -o %mem,user,pid,command --no-header --sort=%mem | sort -r
2.6 mysql 14198 /usr/sbin/mysqld --basedir=/usr --datadir=/var/lib/mysql --
0.1 root 14228 bash
0.1 root 14227 ps -e --cols=79 -o %mem,user,pid,command --no-header --sort
0.1 root 14226 sleep 10
0.1 root 14199 logger -p daemon.err -t mysqld_safe -i -t mysqld
0.1 root 14197 /bin/sh /usr/bin/mysqld_safe
0.1 root 11124 bash
0.0 www-data 14184 /usr/sbin/apache2 -k start -DSSL
0.0 www-data 14183 /usr/sbin/apache2 -k start -DSSL
0.0 www-data 14182 /usr/sbin/apache2 -k start -DSSL
0.0 www-data 14181 /usr/sbin/apache2 -k start -DSSL
0.0 www-data 14180 /usr/sbin/apache2 -k start -DSSL
0.0 servera 11121 -sh
0.0 servera 11120 sshd: serveradmin@pts/4
0.0 root 8964 /usr/sbin/apache2 -k start -DSSL
0.0 root 8334 /bin/bash /root/42go/sv/42go_wconf
0.0 root 8333 /root/42go/httpd/bin/42go_httpd -DSSL
0.0 root 622 [kjournald]
0.0 root 621 [kjournald]
0.0 root 61 [aio/0]
0.0 root 60 [kswapd0]
0.0 root 5 [kacpid]
0.0 root 59 [pdflush]
0.0 root 58 [pdflush]
0.0 root 4 [khelper]
0.0 root 48 [kblockd/0]
0.0 root 3 [events/0]
0.0 root 370 [kjournald]
0.0 root 338 [md0_raid1]
0.0 root 328 [shpchpd_event]
0.0 root 326 [pciehpd_event]
0.0 root 2 [ksoftirqd/0]
0.0 root 28191 /bin/sh /usr/bin/mysqld_safe
0.0 root 26044 /usr/sbin/cron
0.0 root 234 [khubd]
0.0 root 226 [scsi_eh_1]
0.0 root 225 [scsi_eh_0]
0.0 root 224 [ata/0]
0.0 root 21359 /usr/sbin/courierlogger imapd-ssl
0.0 root 21357 /usr/sbin/couriertcpd -address=0 -stderrlogger=/usr/sbin/co
0.0 root 21311 /usr/sbin/courierlogger imaplogin
0.0 root 21309 /usr/sbin/couriertcpd -address=0 -stderrlogger=/usr/sbin/co
0.0 root 21276 /usr/sbin/courierlogger pop3d-ssl
0.0 root 21274 /usr/sbin/couriertcpd -pid=/var/run/courier/pop3d-ssl.pid -
0.0 root 21220 /usr/sbin/courierlogger courierpop3login
0.0 root 21218 /usr/sbin/couriertcpd -pid=/var/run/courier/pop3d.pid -stde
0.0 root 21192 /usr/lib/courier/authlib/authdaemond.plain
0.0 root 21191 /usr/lib/courier/authlib/authdaemond.plain
0.0 root 21190 /usr/lib/courier/authlib/authdaemond.plain
0.0 root 21189 /usr/lib/courier/authlib/authdaemond.plain
0.0 root 21188 /usr/lib/courier/authlib/authdaemond.plain
0.0 root 21187 /usr/lib/courier/authlib/authdaemond.plain
0.0 root 21186 /usr/sbin/courierlogger -pid=/var/run/courier/authdaemon/pi
0.0 root 2112 /sbin/getty 38400 tty6
0.0 root 2111 /sbin/getty 38400 tty5
0.0 root 2110 /sbin/getty 38400 tty4
0.0 root 2109 /sbin/getty 38400 tty3
0.0 root 2108 /sbin/getty 38400 tty2
0.0 root 2102 /sbin/getty 38400 tty1
0.0 root 20985 /usr/sbin/saslauthd -m /var/spool/postfix/var/run/saslauthd
0.0 root 20984 /usr/sbin/saslauthd -m /var/spool/postfix/var/run/saslauthd
0.0 root 20983 /usr/sbin/saslauthd -m /var/spool/postfix/var/run/saslauthd
0.0 root 20982 /usr/sbin/saslauthd -m /var/spool/postfix/var/run/saslauthd
0.0 root 20981 /usr/sbin/saslauthd -m /var/spool/postfix/var/run/saslauthd
0.0 root 2075 /sbin/mdadm -F -i /var/run/mdadm.pid -m root -f -s
0.0 root 2068 /usr/sbin/sshd
0.0 root 203 [kseriod]
0.0 root 1 init [2]
0.0 root 1988 /usr/sbin/inetd
0.0 root 1925 /usr/sbin/lwresd
0.0 root 1907 /sbin/klogd
0.0 root 1904 /sbin/syslogd
0.0 root 14179 /root/42go/cronolog --symlink=/var/log/httpd/42go_access_lo
0.0 root 14135 /usr/lib/postfix/master
0.0 root 11094 sshd: serveradmin [priv]
0.0 postgres 2061 /usr/lib/postgresql/bin/pg_autovacuum -D -p 5432 -L /var/lo
0.0 postgres 2056 postgres: stats collector process
0.0 postgres 2055 postgres: stats buffer process
0.0 postgres 2051 /usr/lib/postgresql/bin/postmaster -D /var/lib/postgres/dat
0.0 postfix 14138 qmgr -l -t fifo -u -c
0.0 postfix 14136 pickup -l -t fifo -u -c
0.0 nobody 9196 proftpd: (accepting connections)
0.0 daemon 2078 /usr/sbin/atd
0.0 adm42go 9201 /home/adm42go/42go/tools/clamav/bin/freshclam -d -c 10 --da
0.0 adm42go 8655 /root/42go/httpd/bin/42go_httpd -DSSL
0.0 adm42go 8338 /root/42go/httpd/bin/42go_httpd -DSSL
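The vmstat figures above mix a snapshot (free memory right now) with counters that accumulate since boot (pages swapped in and out), and the two can tell different stories. As a rough sketch only, the Python below parses `vmstat -s` style output and converts the cumulative "pages swapped out" counter into MiB. The helper names are hypothetical, and the 4 kB page size is an assumption (it matches the `0*4kB` entries in the oom-killer log, but is not stated by vmstat itself).

```python
import re

# One "value [unit] label" pair per line, e.g. "495 M total memory"
# or "14412690 pages swapped out". The unit letter is optional.
LINE_RE = re.compile(r"^\s*(\d+)\s+(?:[KkMm]\s+)?(.*\S)\s*$")

PAGE_SIZE_BYTES = 4096  # assumption: 4 kB pages, as on a typical i386 box


def parse_vmstat_s(text):
    """Turn `vmstat -s` output into a {label: integer value} dictionary."""
    stats = {}
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            stats[m.group(2)] = int(m.group(1))
    return stats


def swapped_out_mib(stats, page_size=PAGE_SIZE_BYTES):
    """Cumulative swap-out volume since boot, in whole MiB."""
    return stats["pages swapped out"] * page_size // (1024 * 1024)


# A few lines copied from the vmstat output in this post.
SAMPLE = """\
495 M total memory
441 M free memory
1913 M total swap
27 M used swap
222597 pages swapped in
14412690 pages swapped out
"""

if __name__ == "__main__":
    stats = parse_vmstat_s(SAMPLE)
    print(stats["free memory"], "M free now")
    print(swapped_out_mib(stats), "MiB swapped out since boot")
```

With the numbers posted here, and under the 4 kB assumption, the swap-out counter works out to roughly 55 GiB since boot. That sits oddly next to 441 M free in the snapshot, and may suggest the memory pressure arrives in short bursts rather than being visible at a quiet moment.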
I have a feeling that this problem is related to a malfunction with 42goisp... Every time I add a new site inside 42goisp, it takes a couple of minutes before the home directory and the settings in the Vhosts_42go.conf file get changed. It seems that immediately prior to this, the oom-killer message block above appears in messages and syslog.

I have been talking to Til at projektfarm about another problem: the home directories are created owned by root instead of the owner I have set in the server-settings-web-httpd User ...

I also notice that I have to keep repairing the 42go db, because when I check it, a few tables show the error:

warning: 3 clients are using or haven't closed the table properly

I wonder if these problems are related...

The history of this installation is that the 42go db and configuration, as well as blocks of the passwd, shadow, group & gshadow files, were copied from an existing older server running Apache 1.3. This one is running Apache 2, but more or less the same version of mysql (4.0.24 as opposed to 4.0.18). I have modified the isp_server & isp_server_ip tables manually to have the correct IP address and to look for apache2 instead of apache, but maybe I have missed something?....

Of course it could be coincidence that the oom-killer kicks in when I use 42goisp - it is about the only thing I am doing on the machine at the moment.... If you can provide any insight it would be very useful.
I think the problem might be that the 42goISP binaries do not work properly on your new platform (because they were compiled on another platform).
Hi Falko - not quite sure what you mean. I downloaded the tarball from projektfarm and installed it the same as ever (it's an Intel Debian 3.1 box)....

I did, however, copy all the mysql files from my old server (running mysql 4.0.18) to the new one (running mysql 4.0.24), including the mysql db and the 42go db. I then modified the isp_server & isp_server_ip tables manually to reflect the change of IP address, apache to apache2, and the different domain. Is it possible I broke mysql by doing this? Or just db42go?

Til Brehm suggests I reinstall 42go to get back to the original db42go settings for this machine, then load a mysql dump of all the other db42go tables apart from these two. What is the best way to proceed in your opinion? Thanks
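One way to picture the "dump everything except these two tables" step is to pass mysqldump an explicit list of table names after the database name, which it accepts in the form `mysqldump db_name tbl_name ...`. The sketch below only builds that command line and runs nothing. The function name is invented, the table names other than isp_server and isp_server_ip are placeholders rather than real 42go tables, and connection options (user, password, host) are deliberately left out.

```python
def build_dump_command(database, all_tables, skip):
    """Build a mysqldump argument list covering every table not in `skip`."""
    keep = [table for table in all_tables if table not in skip]
    if not keep:
        # With no table names, mysqldump would dump the whole database,
        # which is the opposite of what is wanted here.
        raise ValueError("nothing left to dump after exclusions")
    return ["mysqldump", database] + keep


if __name__ == "__main__":
    # Placeholder table list; on the real server it would come from
    # running SHOW TABLES against the 42go database.
    tables = ["isp_server", "isp_server_ip", "example_table_a", "example_table_b"]
    command = build_dump_command("db42go", tables, {"isp_server", "isp_server_ip"})
    print(" ".join(command))
```

Listing the tables explicitly keeps the two machine-specific tables out of the dump, so the freshly installed isp_server and isp_server_ip rows on the new machine would not be overwritten when the dump is loaded.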