HELP: Debian Squeeze perfect setup crashes Apache and MySQL randomly

Discussion in 'Technical' started by ircf, Feb 25, 2012.

  1. ircf

    ircf Member

    Hello,

    We have 2 servers (web and mail) running ISPConfig 3.0.3.

    Everything ran fine until last Thursday at 6 PM, when Apache2 and MySQL crashed on our web server.

    Since then, the server has been crashing at random intervals (anywhere from 15 minutes to 3-4 hours apart). We found this in kern.log:

    Code:
    Feb 25 21:09:16 ns3 kernel: [ 1037.526123] REISERFS error (device sda8): vs-2100 add_save_link: search_by_key ([-1 2362445 0x1001 DIRECT]) returned 1
    Feb 25 21:09:16 ns3 kernel: [ 1037.526223] REISERFS (device sda8): Remounting filesystem read-only
    Feb 25 21:09:16 ns3 kernel: [ 1037.526230] REISERFS warning (device sda8): clm-6006 reiserfs_dirty_inode: writing inode 2362445 on readonly FS
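
The second log line says the kernel remounted sda8 read-only. Since sda8 holds /var, that alone could plausibly explain Apache and MySQL failing. A minimal sketch for confirming that state after a crash is below. `ro_mounts` is a hypothetical helper name, and the sample mount table in the demo is made up for illustration (only the pairing of sda8 with /var comes from this post).

```shell
#!/bin/sh
# List filesystems currently mounted read-only.
# Reads a mount table in /proc/mounts format on stdin:
#   device mountpoint fstype options dump pass
ro_mounts() {
    awk '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $1, $2; break }
    }'
}

# On the live server: ro_mounts < /proc/mounts
# Demo on sample data (illustrative, not taken from the real server):
ro_mounts <<'EOF'
/dev/sda1 / ext3 rw,relatime,errors=remount-ro 0 0
/dev/sda8 /var reiserfs ro,relatime 0 0
EOF
# prints: /dev/sda8 /var
```

The option list is split on commas so that an option such as errors=remount-ro is not mistaken for ro.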
    
    Both servers have the same hardware, which is fairly new (bought in 2010 but only in use since 2011):

    Dell R610
    CPU : 1 x 2.26 GHz (Intel Xeon E5507)
    RAM : 4GB
    Data partition : /dev/sda8 /var 800GB used 20%
    Log partition : /dev/sda9 /var/log 10GB used 75%
    Swap : 2GB
    OS : Debian Squeeze (apt upgraded)

    Web server programs :
    apache2
    mysql
    vlogger
    postfix
    fcgid+php-cgi+suexec
    fail2ban
    postgresql
    rkhunter
    bind (primary DNS)
    pure-ftpd
    ISPConfig as master (mail server uses ISPConfig as slave)
    Web server is hosting approx 200 domains and websites

    Mail server programs :
    squirrelmail
    pop-courier
    imap-courier
    postfix
    amavis
    spamassassin
    clamav
    mysql
    apache2
    rkhunter
    fail2ban
    bind (secondary DNS)

    At first I thought it was a swapping problem (it seems our provider set up too small a swap partition), so I tried to reduce RAM usage: I set the Apache MPM prefork MaxClients to 40 (instead of the default, which I think is 150) and fcgid FcgidMaxProcesses to 40 (instead of the previous 100). I also removed some unneeded Apache modules and disabled fail2ban and postgresql, which we don't really use for now.

    Unfortunately, this didn't solve anything: the server crashed again about an hour later.

    I tried to repair the filesystem with:
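
The swap hypothesis can be checked directly rather than guessed at. As a rough sketch, the helper below (the name `swap_used_mb` is hypothetical) reads input in /proc/meminfo format and prints how much swap is in use.

```shell
#!/bin/sh
# Print swap currently in use, in MB.
# Reads /proc/meminfo-format input on stdin (values are in kB).
swap_used_mb() {
    awk '/^SwapTotal:/ {t = $2}
         /^SwapFree:/  {f = $2}
         END {printf "%d\n", (t - f) / 1024}'
}

# On the live server: swap_used_mb < /proc/meminfo
```

Running it every few minutes before a crash would show whether swap actually fills up. The si/so columns of `vmstat 5` show ongoing swap activity.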

    Code:
    reiserfsck --fix-fixable /dev/sda8 
    and then rebooted, but that didn't solve it either.

    We also tried to repair the MySQL databases (as we first thought it was a MySQL issue):

    Code:
    mysqlcheck -A -r -p
    That did repair many tables and made MySQL a little faster, but of course it didn't fix the crash.

    Finally, we sent a ticket to our provider asking them to repair the partition (our provider is theoretically responsible for hardware issues).

    On top of this, each time the web server (master) crashes, it writes hundreds of MB of binary data to /var/log/ispconfig/ispconfig.log, and on the mail server amavis also crashes (and messages get stuck in the mail queue).

    We commented out the ISPConfig server.sh cron task, which "fixed" the huge logging issue, but the amavis crash still occurs. I suppose it's related to the crash of the master MySQL database...

    Has this happened to any of you? Does anyone have an idea how we can fix it?

    We would really appreciate some help, because this is really a critical issue for us. Thank you for your help.
     
    Last edited: Feb 25, 2012
  2. ircf

    ircf Member

    Crash fixed

    Hello,

    After hours of investigation over the whole weekend, we finally found the culprit: one production website running PrestaShop with very heavy traffic had the "PS_SMARTY_FORCE_COMPILE" option enabled (since Thursday, 16:00). This option forces Smarty to recompile its templates on every request, and the resulting CPU load led to these repeated crashes.

    We disabled the option this morning, and the server has not crashed since.

    Now we would like to set up cpulimit to keep this from happening again. I found this article: http://www.howtoforge.com/how-to-limit-cpu-usage-of-a-process-with-cpulimit-debian-ubuntu but I don't really see how to apply it to my configuration. Any hints, please?
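
One possible starting point, untested on this setup: cpulimit attaches to a running process by PID, so with fcgid the targets would be the PHP worker processes. The sketch below only prints the commands (a dry run). The helper name `cpulimit_cmds`, the 50% figure and the process name php-cgi are assumptions for illustration.

```shell
#!/bin/sh
# Print one cpulimit command per PID (dry run: nothing is throttled).
# $1 = CPU limit in percent of one core, remaining arguments = PIDs.
# -p selects the PID, -l sets the limit, -z makes cpulimit exit
# when its target process dies.
cpulimit_cmds() {
    limit=$1
    shift
    for pid in "$@"; do
        echo "cpulimit -p $pid -l $limit -z &"
    done
}

# On the live server (assumes the PHP workers are named php-cgi):
#   cpulimit_cmds 50 $(pgrep php-cgi)
```

cpulimit throttles by pausing and resuming the process with SIGSTOP and SIGCONT, so a limited request runs more slowly instead of being refused. Workers spawned later are not covered, so the list would have to be rebuilt periodically.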

    Thank you.
     
  3. ircf

    ircf Member

    More info

    While trying to fix these crashes, we made some changes to reduce our RAM usage:
    - Removing E_NOTICE from the PHP logs, by replacing:
    Code:
    error_reporting = E_ALL & ~E_DEPRECATED
    with:
    Code:
    error_reporting = E_ALL & ~E_DEPRECATED & ~E_NOTICE
    in /etc/php5/*/php.ini. Note that you have to restart apache2 and, if you use php-apc, hope that it clears its cache; otherwise the PHP notices keep being logged.
    We did this because we had found thousands of PHP Notice lines for one website right before a crash, so we first thought it might be a vlogger issue.

    - Replacing mpm_prefork with mpm_worker and removing mod_php (unneeded, because all our websites now use fcgid): I had seen several blog articles saying that mpm_worker is better for multi-core CPUs and uses less RAM than prefork. I'm not sure about that, but removing mod_php from the apache2 config reduced the RES (resident RAM) of each apache2 process by about 3-5MB, so I kept the change even though it didn't fix my problem. Note that you have to enable fcgid on ALL sites (even phpmyadmin and the default site), and you won't be able to use mod_php any more.

    - Removing the apache2 modules we don't use, such as ruby, suphp, cgid and davfs, so each apache2 process now uses about 15MB instead of 25MB.
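
The per-process figures above can be measured in a repeatable way. As a minimal sketch (the helper name `avg_rss_mb` is hypothetical), this averages a list of RSS values as reported by ps:

```shell
#!/bin/sh
# Average resident memory in MB from RSS values in KB, one per line on stdin.
avg_rss_mb() {
    awk '{sum += $1; n++}
         END {if (n) printf "%.1f\n", sum / n / 1024}'
}

# On the live server:
#   ps -C apache2 -o rss= | avg_rss_mb
```

RSS counts memory shared between processes once per process, so the sum over all apache2 processes overstates their real total. The average is still useful for comparing before and after a config change.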

    Over the next weeks/months we'd like to:
    - Set up cpulimit (if possible we'd like a "graceful" limit, i.e. processing in chunks rather than refusing to process, but I don't know how it works...)
    - Reduce the PHP memory_limit: we set it to 128MB, which seems like a lot; maybe 64MB? As with CPU, I don't know whether Apache2 would just throw errors or whether it could process in chunks
    - Increase RAM: 4GB -> 16GB
    - Tune cpulimit, mpm_worker and fcgid parameters (need to find simple formulas to fit each param to our config)
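
As a first approximation of such a formula, a process limit can be derived from the RAM left over after everything else. This is only a rough sketch: the helper name `max_procs` is hypothetical, and the amount reserved for MySQL and the OS is a guess, not a measurement.

```shell
#!/bin/sh
# Rough upper bound for the number of concurrent processes:
#   (total RAM - RAM reserved for everything else) / RAM per process
# All arguments in MB; the result is rounded down.
max_procs() {
    total_mb=$1
    reserved_mb=$2
    per_proc_mb=$3
    echo $(( (total_mb - reserved_mb) / per_proc_mb ))
}

# Illustrative numbers only: 4096 MB RAM, 1024 MB assumed reserved,
# and php-cgi processes that may each grow to a 128 MB memory_limit.
max_procs 4096 1024 128
# prints: 24
```

By the same arithmetic, 40 php-cgi processes at a 128MB memory_limit could in the worst case ask for 40 x 128 = 5120MB, more than the 4GB installed, while 64MB would cap that at 2560MB. A script that exceeds memory_limit stops with a PHP fatal error; it is not processed in chunks.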
     
    Last edited: Feb 27, 2012
