Server slow to respond to HTTP

Discussion in 'Server Operation' started by abubin, Aug 9, 2011.

  1. abubin

    abubin New Member

    I have a server that serves media ads (graphics files) with heavy traffic, plus a few low-traffic websites. It also has ISPConfig 3 (version 3.0.2.2) installed and runs Ubuntu 8.04.4.

    Lately this server has become very unresponsive. I don't know what happened, but HTTP on it suddenly became very slow to respond.

    I looked through all the log files but could not find the cause. The Apache service goes up and down. One thing I did find in the log is Apache reporting that MaxClients was reached and suggesting that I raise it. However, MaxClients was already at 700. I increased it to 800 and then 1000, and I still get the error within minutes of restarting Apache.

    Then I looked at Apache's server-status and saw lots of workers in state "K", which means they are being held open by keepalive. That is probably why MaxClients was used up so quickly. So I turned KeepAlive off. There are no more Apache processes in "K", but HTTP is still up and down.
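
A quick way to quantify the "K" observation is to count scoreboard characters instead of eyeballing the status page. The sketch below is an illustration, not something posted in the thread: `count_state` is a hypothetical helper, the sample scoreboard string is made up, and the commented-out `curl` line assumes mod_status is reachable at http://localhost/server-status and that curl is installed.

```shell
# count_state STATE SCOREBOARD
# Prints how many workers are in the given mod_status state.
# Common states: _ waiting, R reading request, W sending reply,
# K keepalive, C closing, . open slot with no process.
count_state() {
    printf '%s' "$2" | tr -cd "$1" | wc -c | tr -d ' '
}

# Made-up scoreboard for illustration only.
sample='KKKW_R..KK'
echo "keepalive: $(count_state K "$sample")"   # prints "keepalive: 5"
echo "sending:   $(count_state W "$sample")"   # prints "sending:   1"

# On a live server (assumed URL), the scoreboard could be fetched with:
# sb=$(curl -s 'http://localhost/server-status?auto' | sed -n 's/^Scoreboard: //p')
# count_state K "$sb"
```

If "K" dominates, lowering KeepAliveTimeout may be a gentler option than switching KeepAlive off entirely.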

    Does anyone have any idea what else to look at?

    Is someone attacking my Apache service?
     
    Last edited: Aug 9, 2011
  2. vidas

    vidas New Member

    Hi!

    How is your I/O? What HDDs are you using? What is the Apache memory footprint (and the total memory), and are you using the prefork or worker MPM?
     
  3. abubin

    abubin New Member

    I/O seems to be fine:
    I am not sure what HDDs we are using, but this is a server hosted in a datacenter, so they should be SAS or SCSI.

    It uses the default apache2 that comes with Ubuntu 8.04, so it should be the prefork MPM. The memory footprint is around 9-11 MB.

    Total memory is 4 GB.
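
As a rough sanity check on these figures (an illustration, not something from the thread): with prefork, each client slot is a separate process, so the worst-case memory demand is roughly MaxClients times the per-process footprint. The sketch assumes the 9-11 MB figure is per Apache process and ignores memory shared between processes, so it overestimates. The function names are hypothetical.

```shell
# ram_for_max_clients MAXCLIENTS MB_PER_PROCESS
# Worst-case RAM in MB if every prefork slot is in use.
ram_for_max_clients() {
    echo $(( $1 * $2 ))
}

# max_clients_for_ram RAM_MB MB_PER_PROCESS
# Largest MaxClients whose worst case still fits in RAM_MB (integer division).
max_clients_for_ram() {
    echo $(( $1 / $2 ))
}

ram_for_max_clients 1000 11    # prints 11000 (MB), versus 4096 MB installed
max_clients_for_ram 4096 11    # prints 372, before reserving RAM for MySQL and the OS
```

Whether the server ever came close to that worst case is not something the thread shows.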

    As of today, the server is more responsive over SSH (no more SSH timeouts), and typing commands is more responsive too. However, Apache is still hit and miss.

    We have a load balancer (HAProxy) and Nagios running, and both keep detecting that this server is not responding to HTTP every few clicks.

    BTW,

    There is also this error in apache error log:
    [Tue Aug 09 16:24:23 2011] [notice] mod_fcgid: process /var/www/domain.com/web/wing/index.php(21176) exit(server exited), terminated by calling exit(), return code: 0

    Could that be the cause of the problem? From what I have read, it is not a serious problem. I did change that domain to use mod_php instead, which stopped the notice, but the problem persists.
     
    Last edited: Aug 9, 2011
  4. falko

    falko Super Moderator Howtoforge Staff

    I suggest you set up munin to find out what's going on. I also suspect long I/O wait times.
     
  5. Mark_NL

    Mark_NL Member

    ssh timeouts, check your interfaces
    Code:
    ifconfig | grep errors
    see if they give a lot of errors
    check dmesg as well

    and as Falko said, install munin to track everything.
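
The same counters that `ifconfig` prints are exposed per interface in /proc/net/dev on Linux, which makes them easier to compare over time. A minimal sketch follows. `iface_errors` is a hypothetical helper, and the sample text is made up rather than taken from this server.

```shell
# iface_errors: read /proc/net/dev-style text on stdin and print
# "iface rx_errs rx_drop tx_errs tx_drop" for each interface.
iface_errors() {
    awk 'NR > 2 {
        sub(/:/, " ")    # split "eth0:987654" into "eth0 987654"
        print $1, $4, $5, $12, $13
    }'
}

# Made-up sample in /proc/net/dev layout (two header lines, then data).
sample='Inter-|   Receive                            |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
    lo:   1000      10    0    0    0     0          0         0   1000      10    0    0    0     0       0          0
  eth0: 987654    4321   17    3    0     0          0         0 123456    2345    2    0    0     0       0          0'

printf '%s\n' "$sample" | iface_errors
# On a real Linux host: iface_errors < /proc/net/dev
```

Counters that keep climbing between two runs would point at the NIC, cable or switch port rather than at Apache.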
     
  6. abubin

    abubin New Member

    SSH seems to be stable again now, and ifconfig shows no errors.

    I have narrowed the problem down to HTTP only. Server load is not even high, barely 0.23.

    I am really pulling my hair out on this one (not that I have much hair left). I have already installed Munin and will wait for the data to come in. Any other suggestions?
     
  7. falko

    falko Super Moderator Howtoforge Staff

  8. abubin

    abubin New Member

    Here is the login for the Munin server. The URL is http://174.143.149.122/munin/vn239/vn239.html. The login is tempo and tempo112.

    Please take a look and let me know. Note the gaps in the graphs: they are there because the connection to the server is not very good. I am not sure whether that is due to the existing HTTP problem or to the server's connection in general.

    If I put the Munin master on the same server, it would be worse, because I would have trouble loading the graphs while HTTP keeps timing out.
     
  9. abubin

    abubin New Member

    I do not think it is an Apache performance issue. As you can see from the Munin graphs, Apache is not exactly maxed out. Plus, we have set up a lot of servers with similar Apache settings.
     
  10. Mark_NL

    Mark_NL Member

    Code:
    dmesg | grep "Machine check events logged"
    does that give any results?
     
  11. abubin

    abubin New Member

    No output from that command.

    We have moved the highest-traffic website off this server. Apache seems to be back to normal. Actually, there is virtually no traffic right now, and HTTP is responding as it should.

    So is the problem definitely caused by this website's traffic being too high? If you look at the Munin graphs, you can see the number of Apache processes dip to almost zero. That is when we moved the website out. But how is that possible? If that site were causing the load, there should have been high CPU load, memory use, I/O load and so on.

    The thing is, that website has been running fine with this traffic for a year already. We moved the site to another server very similar to this one, and that server handles it with no problem. The only difference is that it does not run ISPConfig 3. Now, I am not blaming ISPConfig; I just want to put everything on the table and look at it from all perspectives.

    Any ideas on how to proceed would be very much appreciated.

    Here is the web traffic for the two most active websites on the server:
    Code:
    Domain              This month  Last month   This year   Last year
    vn.domain.com         5 821 MB      906 MB   31 535 MB   27 567 MB
    asvn.domain.com      12 100 MB   55 765 MB  320 022 MB        0 MB
    
    We moved asvn.domain.com out 2 days ago and still had the same problem. vn.domain.com was moved out just 6-7 hours ago.

    sysctl.conf
    Code:
    net.ipv6.conf.eth0.autoconf = 0
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    net.ipv4.tcp_no_metrics_save=1
    net.core.netdev_max_backlog = 2500
    net.ipv4.ip_local_port_range = 1024 65535
    fs.file-max = 65555
    net.ipv4.netfilter.ip_conntrack_max = 131072
    
     
    Last edited: Aug 11, 2011
  12. Mark_NL

    Mark_NL Member

    Looking at your graphs now, i still see gaps after you moved the busy site.

    I'm putting my money on a broken network adapter.
     
  13. abubin

    abubin New Member

    Ah, thanks for this suggestion. It made me realize that the hosting provider may be rate limiting our bandwidth. Let me ask them for an MRTG graph.

    Also, how do I show the hosting provider that the network adapter is giving problems? Is there a way to check from my side?

    Edit:

    I just got the MRTG graphs from my hosting company.
    The third graph (FastEthernet0/10) is the server I said is having the problem.

    The first (FastEthernet0/14) and second (FastEthernet0/9) graphs are the other two servers, which are starting to give problems now. It looks like traffic is going through the roof, up to 80mb. But it looks very suspicious because it just spikes up and down.

    Any idea?
     

    Last edited: Aug 12, 2011
  14. Mark_NL

    Mark_NL Member

    The spikes are weird. Are you running cron jobs at those times?

    The spikes are blue on the switch graph, which is outgoing traffic for the switch on a FastEthernet port, i.e. a maximum of 100mbit going towards your server. So there's your problem: you're hitting your maximum port speed. Now to find out why!

    You could install iptraf and keep it open in a screen session (in case it spikes again). When the connection is back, check iptraf to see who sent you all the traffic and on which port.
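
For a sense of scale, the monthly totals posted earlier in the thread can be converted to an average rate and set against the 100 Mbit/s port limit. This is a back-of-the-envelope sketch: `avg_mbit_per_s` is a hypothetical helper, it assumes 1 MB = 10^6 bytes and a 31-day month, and an average says nothing about short peaks. The web statistics also presumably count traffic served by the sites, whereas the spikes on the switch graph are traffic flowing toward the server.

```shell
# avg_mbit_per_s MB DAYS
# Average rate in Mbit/s for MB megabytes transferred over DAYS days.
avg_mbit_per_s() {
    awk -v mb="$1" -v days="$2" \
        'BEGIN { printf "%.2f\n", mb * 8 / (days * 86400) }'
}

# asvn.domain.com, "Last month" column: 55 765 MB
avg_mbit_per_s 55765 31    # prints 0.17
```

An average well under 1 Mbit/s is hard to square with spikes near the port limit, which fits the advice to capture the traffic and see where it is coming from.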
     
  15. falko

    falko Super Moderator Howtoforge Staff

  16. Mark_NL

    Mark_NL Member

    I thought of that as well, iowait can occur from almost everything else that the process needs to wait on .. a MySQL reply, a file operation, a slow nfs mount, a busy network card, etc etc ..

    He's hitting his max switch port speed. I bet he moved the busy website to the machine that's attached to Fa0/9 (max 97mbit). Maybe some script kiddie is DoSing him, since the traffic is OUT on the switch port, so IN for the server.
     
  17. abubin

    abubin New Member

    Wow, lots of suggestions! Let's take them one by one.

    Yes, we do have some rsync scripts running that sync data onto the servers. These are mostly media files that we serve to users. The rsync runs between different servers in different countries, but it has been running for a year and nothing has changed recently. There is no reason for it to suddenly spike the traffic.

    Our MySQL and Apache are actually already optimized. The same setup has been rolled out region-wide on servers in other parts of Asia. That said, I am not ruling out that more improvements can be made.

    As for I/O wait, what else can be done besides switching to a faster HDD? Use a faster file system?

    Anyway, we have completely restructured our system across these 3 servers. HAProxy is now balancing between all three servers, all running multi-backend balancing with ACLs. In theory the problem should still be there, because the same traffic is being directed to the same three servers. I do not understand why it is okay now. However, I will ask for the MRTG graphs tomorrow to see if the traffic trend has changed.

    BTW, speaking of HAProxy: we used the HAProxy howto on this website. Totally great stuff. Thanks to the HowtoForge team.
     
    Last edited: Aug 12, 2011
  18. Turbanator

    Turbanator Member HowtoForge Supporter

    What version of mysql are you running? There are some bugs in 5.5.12 + that won't show any load issues, but will kill your server.

    As a test, put up phpinfo.php on a domain (script with no db access), and see how fast response is. If it's normal, then chances are mysql is your issue.

    I jumped in because this is my issue on Fedora 15...time to downgrade.
     
  19. falko

    falko Super Moderator Howtoforge Staff

    There are some measures you can take. For example, you could use tmpfs ( http://www.howtoforge.com/storing-files-directories-in-memory-with-tmpfs ) for your cache directories (if you do caching) or use memcached to store your cache in memory.

    You could switch off access logging in Apache if you don't need it (for example if you use Google Analytics), but I don't recommend switching off error logging.
    Disable .htaccess files by setting AllowOverride None and placing the .htaccess directives directly in your vhosts.

    Allow browsers to cache static files (like images, css, js) so that they don't have to fetch them from your server after the first access (see http://www.howtoforge.com/make-brow...es-with-mod_expires-on-apache2-debian-squeeze ).
     
    Last edited: Aug 13, 2011
  20. abubin

    abubin New Member

    We are using the latest version of MySQL that comes with Ubuntu 8.04, which is 5.0.51a-3ubuntu5.8. This should be okay, as we are using this version on a lot of our other servers. We do not use bleeding-edge versions of packages unless they have a feature that we need.
     
