Hi. For the last couple of days I have been seeing very high CPU load in a VPS container that I manage. The VPS is controlled via ISPConfig 3.1. It serves only one website (WordPress), a couple of mailboxes and one MySQL database. There is no DNS on that container.

The problem started after a sudden reboot of the server, after which fail2ban failed to start. I only noticed that a couple of hours later, after a lot of brute-force attacks on the xmlrpc.php file of the WordPress website (a known WP issue). After a stop/start of fail2ban, the rules (jails) started and banned attackers as they should. The issue appeared again, though, and the situation is getting worse hour by hour. All the services seem to be active, but the web server is down from time to time or very slow. All other services (ssh, mail, ftp etc.) work as they are supposed to.

I ran chkrootkit and no issues were found (except the known false-positive bindshell). I ran rkhunter and it reported:

Code:
System checks summary
=====================

File properties checks...
    Files checked: 146
    Suspect files: 0

Rootkit checks...
    Rootkits checked : 376
    Possible rootkits: 0

Applications checks...
    Applications checked: 6
    Suspect applications: 0

The system checks took: 18 minutes and 32 seconds

All results have been written to the log file: /var/log/rkhunter.log

One or more warnings have been found while checking the system.
Please check the log file (/var/log/rkhunter.log)

The /var/log/rkhunter.log file is included in this post. The top command shows a high load from mysqld (CPU 122%); its output is included as a txt file in this post. The ps aux output is attached to this post too. I noticed the "nobody" user near the end running "PassengerLoggingAgent", which has a warning in the rkhunter log too.

Code:
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root     32065  0.0  0.0 223192  1988 ?        Ssl  Jul22   0:00 PassengerWatchdog
root     32068  0.0  0.0 512892  2428 ?        Sl   Jul22   0:00 PassengerHelperAgent
nobody   32074  0.0  0.0 226616  4692 ?        Sl   Jul22   0:00 PassengerLoggingAgent

Is that normal?
uptime reports:

Code:
00:07:32 up 1 day, 21:19, 1 user, load average: 7.44, 8.81, 9.37

The server is up to date, with the latest non-vulnerable apache2 version, 2.4.10-10+deb8u10. See here: https://www.debian.org/security/2017/dsa-3913

UPDATE: There is one issue after the apt-get upgrade command:

Code:
The following packages have been kept back:
  hhvm
0 upgraded, 0 newly installed, 0 to remove and 1 not upgraded.

Sorry for the long post. I tried to include as much information as I could think of. Any help is much appreciated.

UPDATE 2: The ISPProtect full version found no issues in the /var/www folder.

Kind Regards
The high load in MySQL is most likely caused by a large number of database connections, probably coming from web2. So your server is probably not hacked, which is why the scan tools don't find anything; it is more likely some kind of DoS. Take a look at the access.log of the web2 website, e.g. with the tail -f command; you will probably see a lot of traffic there. If you see many connections from the same IP or the same IP subnet, then you should consider banning them with the iptables or route command.
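To make the "many connections from the same IP or subnet" check concrete, here is a minimal sketch that counts requests per client IP and per IPv4 /24 subnet. It assumes the log is in Apache "combined" format with the client IP as the first field; the function names top_ips and top_subnets are made up for this example, and IPv6 clients are simply ignored by the subnet count.

```shell
# Sketch: list the busiest client IPs and /24 subnets in an access log.
# Assumption: Apache "combined" format, client IP is the first field.

# top_ips LOGFILE [N]: the N client IPs with the most requests (default 10).
top_ips() {
    awk '{print $1}' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}

# top_subnets LOGFILE [N]: same idea, grouped by IPv4 /24 subnet.
# Lines whose first field is not a dotted IPv4 address are skipped.
top_subnets() {
    awk '$1 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ {
             split($1, o, ".")
             print o[1] "." o[2] "." o[3] ".0/24"
         }' "$1" | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

For example, `top_ips access.log 20` and `top_subnets access.log 20` run against the web2 log would show whether a handful of addresses or ranges dominate the traffic before anything is banned.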
The website already has medium to high traffic (10k users/day), so it was kind of difficult to identify the bots. Anyway, you were right. I ran:

Code:
cat 20170723-access.log | awk -F\" '{print $6}' | sort | uniq -c | sort -n

and found 3 (bad?) bots that were abusing the website:

AhrefsBot with over 17000 requests
MJ12bot with over 12000 requests
GrapeshotCrawler with over 9000 requests

while Googlebot made 5000 requests and bingbot 4300.

AhrefsBot, MJ12bot and GrapeshotCrawler seemed to ignore Crawl-Delay and/or Disallow in robots.txt. Additionally, they use a wide range of IPs, so I created a jail (rule) in fail2ban to ban those IPs for one day each. I'm in a Virtuozzo container, so the iptables command throws an error (???). In the access log I now see 403 responses for those bots and the load is much better. Is this normal? Additionally, after a fail2ban restart the IPs get unbanned (maybe because of the iptables error).

I also ran:

Code:
awk -F\" '($2 ~ /\.(jpg|gif)/ && $4 !~ /^https:\/\/www\.mydomain\.com/){print $4}' 20170723-access.log \
| sort | uniq -c | sort

and found thousands of requests from content-copying websites hot-linking to the images on my server. I added a redirect in .htaccess.

I checked the slow queries in the MySQL log but found nothing extreme, except for the search functionality, which seems normal given the large number of articles (~30k).
Code:
# Query_time: 10.092863 Lock_time: 0.000106 Rows_sent: 34 Rows_examined: 28062

I found this error in /var/log/upstart/php5-fpm.log:

Code:
[24-Jul-2017 18:39:00] WARNING: [pool web2] server reached pm.max_children setting (50), consider raising it
[24-Jul-2017 18:39:31] WARNING: [pool web2] child 21785 exited on signal 9 (SIGKILL) after 396.086924 seconds from start
[24-Jul-2017 18:39:31] NOTICE: [pool web2] child 22829 started
[24-Jul-2017 18:40:20] WARNING: [pool web2] server reached pm.max_children setting (50), consider raising it
[24-Jul-2017 18:42:05] WARNING: [pool web2] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 18 idle, and 38 total children
[24-Jul-2017 18:42:06] WARNING: [pool web2] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 19 idle, and 40 total children
[24-Jul-2017 18:42:07] WARNING: [pool web2] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 18 idle, and 41 total children
[24-Jul-2017 18:42:08] WARNING: [pool web2] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 32 children, there are 19 idle, and 43 total children
[24-Jul-2017 18:43:54] WARNING: [pool web2] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 19 idle, and 39 total children
[24-Jul-2017 18:45:54] WARNING: [pool web2] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 8 children, there are 19 idle, and 45 total children
[24-Jul-2017 18:45:55] WARNING: [pool web2] seems busy (you may need to increase pm.start_servers, or pm.min/max_spare_servers), spawning 16 children, there are 19 idle, and 46 total children
[24-Jul-2017 18:46:07] WARNING: [pool web2] server reached pm.max_children setting (50), consider raising it
[24-Jul-2017 18:47:16] WARNING: [pool web2] server reached pm.max_children setting (50), consider raising it

Kind Regards
If you have more RAM available, then you should consider increasing the PHP-FPM limits of that site (Options tab). Also check which FPM mode you use. Personally, I prefer the ondemand mode, if your PHP version is new enough to support it.
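Whether there is room to raise the limits can be roughly estimated from the average memory of a worker. The following is only a sketch of that arithmetic: the helper names avg_rss_kb and max_children_for are made up, the process name php5-fpm is assumed from the log path, and RSS counts shared memory once per process, so the result is a cautious estimate rather than an exact figure.

```shell
# Sketch: estimate how many PHP-FPM workers fit into a RAM budget.

# avg_rss_kb: read one RSS value (in KiB) per line on stdin and
# print the integer average (0 if there is no input).
avg_rss_kb() {
    awk '{sum += $1; n++} END {if (n) printf "%d\n", sum / n; else print 0}'
}

# max_children_for BUDGET_MB AVG_RSS_KB: number of workers of that
# average size that fit into the budget (0 if the average is 0).
max_children_for() {
    [ "$2" -gt 0 ] || { echo 0; return; }
    echo $(( $1 * 1024 / $2 ))
}

# Example on a live system (process name is an assumption):
#   avg=$(ps -o rss= -C php5-fpm | avg_rss_kb)
#   max_children_for 2048 "$avg"    # 2048 MB reserved for PHP
```

If the number printed is below the current pm.max_children, raising that setting further is likely to push the container into memory pressure rather than help.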
Thanks for responding. I use FPM with CGI and SuEXEC enabled. FPM is in dynamic mode:

PHP-FPM pm.max_children: 50
PHP-FPM pm.start_servers: 25
PHP-FPM pm.min_spare_servers: 20
PHP-FPM pm.max_spare_servers: 30
PHP-FPM pm.max_requests: 300

If I raise the limits, I get this kind of error:

Code:
[24-Jul-2017 19:29:27] WARNING: [pool web2] child 29605 exited on signal 9 (SIGKILL) after 296.005956 seconds from start
[24-Jul-2017 19:29:27] NOTICE: [pool web2] child 30501 started
[24-Jul-2017 19:29:28] WARNING: [pool web2] child 29618 exited on signal 9 (SIGKILL) after 295.664036 seconds from start
[24-Jul-2017 19:29:28] NOTICE: [pool web2] child 30508 started

which looks like RAM overload.

What about:

Kind regards
iptables in Virtuozzo can cause some problems, but you can use the route command for banning instead: https://www.faqforge.com/linux/how-to-block-access-to-a-server-by-ip-address-on-linux/ Restarting the server or fail2ban will lead to an unban.
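As a rough sketch of the route-based approach, the helper below turns a list of IPv4 addresses into `route add -host ... reject` commands. It only prints the commands, so they can be reviewed first. The function name ban_cmds and the file /root/banned-ips.txt are hypothetical, and anything that is not made up purely of digits and dots is skipped.

```shell
# Sketch: print (not run) one reject route per IPv4 address read from stdin.
# Input: one address per line; blank lines and anything containing
# characters other than digits and dots are ignored.
ban_cmds() {
    while IFS= read -r ip; do
        case "$ip" in
            ''|*[!0-9.]*) continue ;;   # skip blanks, comments, junk
        esac
        printf 'route add -host %s reject\n' "$ip"
    done
}

# Example (hypothetical list file, run as root once the output looks right):
#   ban_cmds < /root/banned-ips.txt         # review the commands
#   ban_cmds < /root/banned-ips.txt | sh    # apply them
```

Because these routes are lost on reboot, re-running the second command from a boot script would be one way to restore the bans; a single entry can be removed again with `route del -host <ip> reject`.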