[RESOLVED] High CPU usage on my server causing site inaccessibility

Discussion in 'Server Operation' started by anark10n, Dec 2, 2022.

  1. anark10n

    anark10n Member

    Hello there, not sure this is the place to ask this, but if not, my apologies.
    So, checking the resource graphs my vendor provides for my VPS, I see high CPU usage spikes (>140% according to the graphs) two to three times daily, each lasting from about 45 min to >1h (along with high disk activity, though network activity remains low). These make all my sites inaccessible (for all protocols: http, ssh, ftp) but not for all the forcing me to hard reboot the VPS to restore access. How would I go about detecting what's causing these spikes after a reboot?
     
  2. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    Monitor what is happening on that host. There are various top commands that show what is using the most resources: htop, nettop, iotop, and system statistics tools like dstat and vmstat. Maybe start with the commands uptime, df -hT, free -h --wide.
    Use Internet Search Engines with
    Code:
    linux monitor resources
     
    Last edited: Dec 2, 2022
  3. ahrasis

    ahrasis Well-Known Member HowtoForge Supporter

    A VPS normally shares virtual CPUs unless the vendor clearly specifies that it uses a dedicated one, which could be one of the reasons; just a thought. If the spikes are coming from your host, then as @Taleman said, you can monitor your host to find out what caused them.
     
  4. anark10n

    anark10n Member

    Will those methods work after a reboot? I ask because I'm not able to log in via ssh to run those commands when the spikes happen.
     
  5. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    No.
    After reboot you can read logs.
    The methods proposed so far were meant to monitor what is happening right now. It seems you need something that continuously monitors the host and writes what is happening to a log, so you can later see what was going on. There probably are tools like this; use Internet Search Engines to find them.
    If the problem is only that you cannot log in when the high load happens, I would keep a login session running and, when the spike comes, use that already logged-in session to issue commands.
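
The idea of continuously writing to a log what is happening could be sketched roughly as below. This is only an illustration, not a tool anyone in the thread named: the function name `log_snapshot` and the log path are hypothetical, and it assumes a Linux host with `/proc/loadavg` and the procps version of `ps`.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped snapshot of the load and the
# busiest processes to a log file, so it can still be read after a reboot.

log_snapshot() {
    # $1 = log file to append to (the path is chosen by the caller)
    {
        printf '=== %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')"
        # Load averages over the last 1, 5 and 15 minutes
        cat /proc/loadavg
        # Header plus the ten processes with the highest %CPU as reported
        # by ps (CPU time averaged over each process's lifetime)
        ps -eo pid,user,pcpu,pmem,comm --sort=-pcpu | head -n 11
    } >> "$1"
}

# Take one snapshot when the script is run with a log file argument
if [ -n "${1:-}" ]; then
    log_snapshot "$1"
fi
```

Run from cron once a minute (for example `* * * * * /usr/local/bin/cpu-snapshot.sh /var/log/cpu-snapshot.log`, both paths being assumptions), this would leave a record of what was busy just before the host stopped responding.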
     
    ahrasis likes this.
  6. nhybgtvfr

    nhybgtvfr Well-Known Member HowtoForge Supporter

    munin should provide historical graphs.
    or use monit to email alert you to high cpu / load eg > 80% cpu, load > number of vcpu cores. that should give you a chance to login via ssh and run eg top before it becomes completely unresponsive.
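
As a rough illustration of the "load > number of vcpu cores" threshold mentioned above, here is a minimal shell sketch. It is not monit configuration; the function name `load_exceeds_cores` is hypothetical, and it assumes a Linux host with `/proc/loadavg`, `nproc` and `awk`.

```shell
#!/bin/sh
# Hypothetical check: succeed (exit status 0) when the given load average
# is greater than the given number of CPU cores.

load_exceeds_cores() {
    # $1 = load average, $2 = number of cores
    # awk does the floating-point comparison that plain sh cannot
    awk -v load="$1" -v cores="$2" 'BEGIN { exit !(load + 0 > cores + 0) }'
}

# Compare the current 1-minute load average with the core count
load=$(cut -d ' ' -f 1 /proc/loadavg)
cores=$(nproc)
if load_exceeds_cores "$load" "$cores"; then
    echo "High load: $load on $cores core(s)"
fi
```

The `echo` could be swapped for whatever alerting is available on the host; the point is only to get a warning early enough to log in before ssh stops responding.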

    my guess is the culprit would be a bad/dodgy php script running on a site, or clamav running a server scan. i'd currently be leaning towards clamav.
     
    ahrasis likes this.
  7. anark10n

    anark10n Member

    I've installed munin and it's reporting the following critical issue:
    Screenshot 2022-12-11 at 22-56-04.png
    Not entirely sure how to fix that.
     
  8. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    I can not see what is in that image. Please write the issue and what Munin says about it.
     
  9. pyte

    pyte Well-Known Member HowtoForge Supporter

    You could just reboot the machine, stop all services, and see what happens. If the system is working correctly, start the services one by one, see which one causes the issue, and then investigate further why.
    I would reboot, stop all services that are running on the host, like db, ftp, webserver etc., run "htop" and check CPU/RAM usage. After that, start one of the services and check again, and so on...
    The kernel log might give you a hint too.
     
  10. till

    till Super Moderator Staff Member ISPConfig Developer

    The error is about bytes free, so you probably ran out of RAM (if the limits you set there are appropriate for your system; otherwise it can also be a false alert). You could also check with the 'top' command how much free RAM you have, and it might help to add a swap file if you don't have a swap partition. Or add/assign more RAM to the server.
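
Besides `top`, the free memory and swap figures can be read straight from `/proc/meminfo`. The sketch below assumes a Linux kernel that exposes the `MemAvailable` and `SwapFree` fields; the function name `mem_summary` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print available memory and free swap in MiB.
# Reads /proc/meminfo unless another meminfo-style file is given as $1.

mem_summary() {
    # Values in /proc/meminfo are in kB, so divide by 1024 for MiB
    awk '/^MemAvailable:/ { printf "available_mib=%d\n", $2 / 1024 }
         /^SwapFree:/     { printf "swap_free_mib=%d\n", $2 / 1024 }' \
        "${1:-/proc/meminfo}"
}

mem_summary
```

If the available figure drops close to zero while swap is also nearly exhausted, that would point to the out-of-memory explanation rather than a false alert.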
     
  11. anark10n

    anark10n Member

    My memory usage is averaging 60% (4GB total) and my swap averages at 25% (8GB total).
    Found this topic on the issue. Apparently it's been this way for a long while. Do you think the suggested steps are advisable?
     
  12. Taleman

    Taleman Well-Known Member HowtoForge Supporter

    It does not matter what the average is. What matters is whether the host runs out of memory.
    Are the load spikes still happening? How often? At the same time of day or day of week?
    The thread seems to say the error is spurious and due to a bad default configuration in munin. I would ignore it and go back to finding the cause of the high loads.
    What do these commands show (paste in CODE tags, please):
    Code:
    uptime
    free -h
    df -hT /
     
  13. anark10n

    anark10n Member

    So, I managed to log in during one of the spikes (had to wait a long while to get in), and found out it was a client's sites causing the CPU spikes, due to whatever custom PHP code they had uploaded. Disabling PHP on those sites has resolved the spikes in usage.
     
    ahrasis and till like this.
  14. nhybgtvfr

    nhybgtvfr Well-Known Member HowtoForge Supporter

    if they want to run cpu heavy code, upsell them a dedicated vps :D
     
