I'm pretty sure MySQL is the culprit, but I'm not sure how to find the problem. I hope someone can suggest some ideas for how to debug this. Basically my web server gets brought to its knees after a while... or so it seems. When I last ran top while it was crawling, mysqld was taking 8.2% of the memory, and there were a lot of httpd (more than usual) and sendmail tasks listed at the top when sorted by memory. The machine has 2GB of memory, but when I run free -m it reports: Mem: total 2013, used 1996, free 17. Is this a red flag right there? Any suggestions on how to approach the issue? Thanks.
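In case it helps, here's roughly what I've started running to snapshot things whenever it gets slow (just a sketch; the output file is an arbitrary path I picked):
Code:
# append memory and top-consumer snapshots to a scratch file for later comparison
date >> /tmp/server-state.log
free -m >> /tmp/server-state.log
ps aux --sort=-%mem | head -15 >> /tmp/server-state.log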
There are a bunch of these in the MySQL log, but I think they're just from after the problem, when I was trying to shut down/restart MySQL:
Code:
InnoDB: Check that you do not already have another mysqld process
InnoDB: using the same InnoDB data or log files.
InnoDB: Unable to lock ./ibdata1, error: 11
In the Apache logs, there's a ton of ModSecurity spam that I probably need to clean up somehow:
Code:
[error] [client X] ModSecurity: Warning. Operator EQ match: 0.
[error] [client X] ModSecurity: Could not set variable "resource.alerted_960903_compression" as the collection does not exist.
Lots of other random messages... Warnings: Module pgsql already loaded, Module gd already loaded. My database is about 2.2GB, 1.4GB in MyISAM and the rest in InnoDB.
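From what I've read, the "Unable to lock ./ibdata1, error: 11" lines usually just mean a second mysqld was still holding the data files when I tried to restart. Something like this should confirm whether that's what happened (sketch; assumes the default /var/lib/mysql datadir):
Code:
# see whether more than one mysqld is running and what holds the ibdata1 lock
ps aux | grep [m]ysqld
sudo lsof /var/lib/mysql/ibdata1
# if a stray instance is still holding it, stop MySQL cleanly before restarting
sudo service mysqld stop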
You can try to repair your tables, either in phpMyAdmin or on the command line: http://dev.mysql.com/doc/refman/5.0/en/repair-table.html http://dev.mysql.com/doc/en/Repair.html
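On the command line, mysqlcheck can do every database in one pass; note that REPAIR TABLE only applies to MyISAM tables, not the InnoDB ones. A minimal example, assuming you can connect as root:
Code:
# check all tables, then repair any MyISAM tables that report problems
mysqlcheck -u root -p --all-databases --check
mysqlcheck -u root -p --all-databases --auto-repair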
It happened again when I woke up; the server was pretty much dead. When I ran top, mysqld was at the top with 7.2% of memory, but there were a huge number of httpd processes, so I thought that might be the issue. Restarting httpd was very slow:
Code:
Stopping httpd: [ OK ]
Starting httpd: (98)Address already in use: make_sock: could not bind to address [::]:80
(98)Address already in use: make_sock: could not bind to address 0.0.0.0:80
no listening sockets available, shutting down
Unable to open logs
But shutting down and restarting httpd didn't seem to fix it. It was back up, but still slow... simple pages with very little MySQL worked fine, but anything else was still pretty much dead. I'll try running the repairs.
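For next time: I gather the make_sock error just means the old httpd processes hadn't fully exited before the restart, so I'll check the port before starting it again (rough sketch; netstat is from net-tools):
Code:
# find whatever is still listening on port 80
sudo netstat -tlnp | grep ':80 '
# clear out any leftover httpd processes, then start fresh
sudo killall httpd
sudo service httpd start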
Is it possible this is a memory issue -- or would I have seen errors in the logs about memory? I bumped up the memory on the server from 2GB to 8GB, but I still have no idea what is causing this issue.
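If the kernel had been killing processes under memory pressure, I believe it would show up in dmesg or /var/log/messages, so I'm checking for that too (sketch):
Code:
# look for any sign of the out-of-memory killer
dmesg | grep -i 'out of memory'
sudo grep -iE 'oom|out of memory' /var/log/messages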
Do all log files/log directories exist? What's the output of Code: df -h ? Any errors in Apache's error log?
20 hours of uptime since the memory upgrade. The log files are there; I only got that error when the server was in its messed-up state. sudo cat /etc/httpd/logs/error_log | grep Error and sudo cat /etc/httpd/logs/error_log | grep ERROR return nothing. The Apache log is just cluttered with mod_security warnings and such; I can't seem to find anything relevant.
Code:
df -h
Filesystem                       Size  Used Avail Use% Mounted on
/dev/mapper/VolGroup00-LogVol00  224G   15G  198G   7% /
/dev/sda1                         99M   20M   75M  21% /boot
tmpfs                            4.0G     0  4.0G   0% /dev/shm
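I realized those greps are case-sensitive, so they'd miss lines that only say "error" in lowercase; this casts a wider net and filters out the mod_security noise (sketch):
Code:
# case-insensitive error search, with the ModSecurity chatter filtered out
sudo grep -i error /etc/httpd/logs/error_log | grep -vi modsecurity | tail -50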
Yes, the /etc/httpd/logs/access_log files are there. I thought it might have to do with my MySQL config. my.cnf:
Code:
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
user=mysql
# Default to using old password format for compatibility with mysql 3.x
# clients (those using the mysqlclient10 compatibility package).
old_passwords=1
max_connections=500
query-cache-type = 1
query-cache-size = 128M
set-variable=long_query_time=5
log-slow-queries=/var/log/log-slow-mysql.log

[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
I tried updating this to the large my.cnf example file, but then it fails to find the InnoDB files. And it just went down again; this time its uptime was almost 48 hours.
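I still don't know why the my-large.cnf sample makes it fail to find the InnoDB files, so before trying it again I'm going to compare what the config expects against what's actually on disk and check the MySQL error log (sketch; paths assume the default datadir):
Code:
# compare the InnoDB settings in the config with the files on disk
grep -i innodb /etc/my.cnf
ls -lh /var/lib/mysql/ibdata* /var/lib/mysql/ib_logfile*
# the actual startup error should be in the mysqld log
sudo tail -50 /var/log/mysqld.log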
I can't connect to it at all right now, which seems worse than usual. I was in an ssh term at the time and it started to slow down... it didn't take long before the server seemed to be completely offline.
This last one may have been something completely unrelated. It seems the server was down for maybe 5 minutes; but by the time I got to the box itself, it was back up. I'm trying to look through the logs around the time it went down to see if I can find anything...
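To narrow down the window I'm pulling entries from around that time out of the system log, and checking whether the box actually rebooted or just stalled (sketch; the timestamp pattern is just a placeholder for the real one):
Code:
# did it reboot, or just go unresponsive?
last -x | head
# pull system log entries from around the outage (placeholder timestamp pattern)
sudo grep 'Jun 12 04:' /var/log/messages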
No downtime since, but I've noticed that the used memory is slowly getting larger and larger...
Code:
top - 23:52:11 up 3 days, 1:18, 2 users, load average: 0.16, 0.14, 0.10
Tasks: 159 total, 1 running, 158 sleeping, 0 stopped, 0 zombie
Cpu(s): 0.0%us, 0.1%sy, 0.0%ni, 99.9%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 8201628k total, [B]4597380k used[/B], 3604248k free, 272332k buffers
Swap: 2031608k total, 0k used, 2031608k free, 3673088k cached
It's growing maybe 500-800 MB/day; is this an issue?
As long as it's not swapping, this is ok. Linux tries to use as much memory as it can grab to cache things in order to speed up the system.
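You can see the split yourself; depending on your version of free, the "-/+ buffers/cache" line is the number that matters, and vmstat will show whether any real swapping is going on:
Code:
# the "-/+ buffers/cache" row shows usage with the page cache excluded
free -m
# sample swap activity for ~25 seconds; non-zero si/so columns would be the real warning sign
vmstat 5 5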