Monit not restarting stalled programs - how to troubleshoot?

Discussion in 'Server Operation' started by flameproof, Jul 17, 2012.

  1. flameproof

    flameproof Member

    I have Monit running on my CentOS 5.2 VPS. Probably because of a lack of RAM (and perhaps some memory hogs in my VPS range), some programs occasionally stall.

    Management software is Webmin/Virtualmin.

    Until recently, Monit restarted them. For a short while now, however, Monit has not been restarting Postfix and Dovecot. Monit itself sometimes stalls too. My understanding is that when I restart Monit, it should automatically start any other apps that are down, but it doesn't.

    How can I troubleshoot that?

    /etc/monit.d/monitrc :

    Code:
    set daemon  60
    set logfile syslog facility log_daemon
    set mailserver localhost
    set mail-format { from: abuse@****.com }
    set alert admin@**********.com
    #set httpd port 2812 and
    #     SSL ENABLE
    #     PEMFILE  /var/certs/monit.pem
    #     allow admin:test
    
    check process postfix with pidfile /var/spool/postfix/pid/master.pid
       group mail
       start program = "/etc/init.d/postfix start"
       stop  program = "/etc/init.d/postfix stop"
       if failed port 25 protocol smtp then restart
       if 5 restarts within 5 cycles then timeout
    
    check process dovecot with pidfile /var/run/dovecot/master.pid
       group mail
       start program = "/etc/init.d/dovecot start"
       stop  program = "/etc/init.d/dovecot stop"
       if failed port 25 protocol smtp then restart
       if 5 restarts within 5 cycles then timeout
    
    check process usermin with pidfile /var/usermin/miniserv.pid
       start program = "/etc/init.d/usermin start"
       stop  program = "/etc/init.d/usermin stop"
       group misc
       if failed host 127.0.0.1 port 20000 type tcp then restart
       depends usermin_init
    	
    check file usermin_init with path /etc/init.d/usermin
       group misc
    
    check process webmin with pidfile /var/webmin/miniserv.pid
       start program = "/etc/init.d/webmin start"
       stop  program = "/etc/init.d/webmin stop"
       group misc
       if failed host 127.0.0.1 port 10000 type tcp then restart
       depends webmin_init
    
    check file webmin_init with path /etc/init.d/webmin
       group misc
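    
    Two things in the config above may be worth checking first. The Dovecot check tests port 25 with the SMTP protocol, which is Postfix's port, so a Postfix outage would make the Dovecot check fail as well. In addition, as far as I know, `if 5 restarts within 5 cycles then timeout` makes Monit stop monitoring a service once the limit is reached, and that state can survive a restart of Monit itself, which would match the symptoms described. `monit -t` checks the control file syntax, `monit summary` should list any service that is no longer monitored, and `monit monitor all` re-enables monitoring. Below is a minimal sketch of a Dovecot check that probes Dovecot itself; it assumes Dovecot serves IMAP on the default port 143, which the post does not confirm.
    
    Code:
    check process dovecot with pidfile /var/run/dovecot/master.pid
       group mail
       start program = "/etc/init.d/dovecot start"
       stop  program = "/etc/init.d/dovecot stop"
       # Assumption: Dovecot serves IMAP on port 143 (use 110 / pop if it is POP3-only).
       if failed port 143 protocol imap then restart
       # After 5 restarts within 5 cycles Monit gives up and stops monitoring this service.
       if 5 restarts within 5 cycles then timeout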
    
    
     
  2. falko

    falko Super Moderator Howtoforge Staff

    Which virtualization technique do you use? Do you have access to the host system?
     
