Code:
top - 14:53:01 up 2 days,  1:09,  1 user,  load average: 52.22, 68.57, 37.83
Tasks: 346 total,   1 running, 343 sleeping,   0 stopped,   2 zombie
Cpu(s): 26.4%us, 11.2%sy,  0.2%ni,  0.0%id, 61.7%wa,  0.0%hi,  0.4%si,  0.0%st
Mem:   2063384k total,  1940664k used,   122720k free,    12108k buffers
Swap:  1951856k total,   932880k used,  1018976k free,   142528k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
29613 web5      20   0  202m  49m 7656 S   18  2.5   0:02.42 php-cgi
 6500 mysql     20   0  502m  27m 2940 S   12  1.4  24:26.55 mysqld
29458 web5      20   0  202m  49m 7656 S   11  2.5   0:00.96 php-cgi
28987 web8      20   0  195m  31m 7660 D    1  1.6   0:03.24 php-cgi
29201 www-data  20   0     0    0    0 Z    1  0.0   0:00.06 apache2 <defunct>
29335 web29     20   0  190m  33m 7636 S    1  1.7   0:00.48 php-cgi
29408 web8      20   0  203m  47m 7808 D    1  2.4   0:00.64 php-cgi
29515 web5      20   0  202m  50m 7684 S    1  2.5   0:00.78 php-cgi
   46 root      15  -5     0    0    0 S    1  0.0   0:07.94 kblockd/0
 3630 root      20   0 57384 4100  960 S    1  0.2  13:12.69 collectl
29269 web5      20   0  203m  45m 7640 D    1  2.3   0:00.96 php-cgi
29273 web5      20   0  204m  49m 7636 D    1  2.4   0:01.94 php-cgi
29294 web29     20   0  190m  34m 7684 S    1  1.7   0:00.52 php-cgi
29306 web5      20   0  204m  48m 7584 S    1  2.4   0:01.02 php-cgi
29326 web29     20   0  190m  33m 7636 S    1  1.7   0:00.48 php-cgi
29409 web8      20   0  201m  46m 7628 S    1  2.3   0:00.58 php-cgi
29412 web8      20   0  201m  46m 7624 D    1  2.3   0:00.62 php-cgi
29474 web8      20   0  201m  48m 7592 D    1  2.4   0:00.68 php-cgi
29514 web1      20   0  187m  34m 7788 D    1  1.7   0:01.00 php-cgi
29734 root      20   0 19216 1532  940 R    1  0.1   0:00.32 top

Debian Lenny x64, dual 3 GHz P4, 2 GB RAM, 500 GB HDD.

I've noticed that the websites (namely inter5.org and areyouliberal.com) will get hit like crazy and cause massive load spikes. I set a robots.txt crawl delay of 8 seconds to stop the massive Googlebot floods. I've also added caching to all WordPress sites. I've been trying to tweak my.cnf to get better response times, because I think it's going to end up being MySQL related.
The tuning-primer.sh recommendations have been implemented almost entirely. It still locks up. It starts responding again after about 15 minutes, but the load stays at 50+ for about half an hour before things settle, which causes bad lag. The error logs don't help much.

mydns log
Code:
mydns[24690]: mydns: error finding NS type resource records for name `ns1' in zone 12: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: mydns: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2): error during query: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: last message repeated 2 times
mydns[24690]: mydns: error finding NS type resource records for name `' in zone 12: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: mydns: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2): error during query: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: last message repeated 2 times
mydns[24690]: mydns: error finding NS type resource records for name `ns2' in zone 12: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: mydns: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2): error during query: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: last message repeated 2 times
mydns[24690]: mydns: error finding NS type resource records for name `' in zone 12: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: mydns: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2): error during query: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: last message repeated 2 times
mydns[24690]: mydns: ns3.derekgordon.com.: error loading SOA: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: mydns: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2): error during query: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)
mydns[24690]: last message repeated 2 times
mydns[24690]: mydns: ns3.derekgordon.com.: error loading SOA: Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2) (errno=2)

mail.err
Code:
Oct 12 14:48:24 my imapd-ssl: authentication error: Input/output error
Oct 12 14:49:45 my authdaemond: failed to connect to mysql server (server=localhost, userid=ispconfig): Lost connection to MySQL server at 'reading authorization packet', system error: 104
Oct 12 14:50:58 my imapd: authentication error: Input/output error
Oct 12 14:51:28 my imapd: authentication error: Input/output error
Oct 12 15:00:46 my imapd: authentication error: Input/output error
Oct 12 15:00:51 my authdaemond: failed to connect to mysql server (server=localhost, userid=ispconfig): Lost connection to MySQL server at 'sending authentication information', system error: 32

kern.log
Code:
Oct 12 10:15:45 my kernel: [164931.197729] php-cgi[24767]: segfault at 411347f0 ip 676429 sp 7fff02289320 error 4 in php5-cgi[400000+506000]
Oct 12 12:42:34 my kernel: [173901.308359] php-cgi[11310]: segfault at 44afc280 ip 676429 sp 7fffca89e5f0 error 4 in php5-cgi[400000+506000]

error.log web8 / inter5.org
Code:
[Tue Oct 12 15:04:16 2010] [warn] (103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request functi$
[Tue Oct 12 15:04:43 2010] [warn] (104)Connection reset by peer: mod_fcgid: read data from fastcgi server error.
[Tue Oct 12 15:04:43 2010] [warn] (104)Connection reset by peer: mod_fcgid: ap_pass_brigade failed in handle_request function
[Tue Oct 12 15:04:43 2010] [warn] (104)Connection reset by peer: mod_fcgid: read data from fastcgi server error.
[Tue Oct 12 15:04:44 2010] [error] [client 66.249.71.105] Premature end of script headers: index.php
[Tue Oct 12 15:05:19 2010] [warn] (103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request functi$
[Tue Oct 12 15:05:36 2010] [warn] mod_fcgid: read data timeout in 360 seconds
[Tue Oct 12 15:05:37 2010] [error] [client 98.158.20.230] Premature end of script headers: index.php
[Tue Oct 12 15:06:33 2010] [warn] (104)Connection reset by peer: mod_fcgid: read data from fastcgi server error.
[Tue Oct 12 15:06:34 2010] [warn] (104)Connection reset by peer: mod_fcgid: ap_pass_brigade failed in handle_request function
[Tue Oct 12 15:07:50 2010] [warn] mod_fcgid: read data timeout in 360 seconds
[Tue Oct 12 15:07:50 2010] [error] [client 72.14.199.155] Premature end of script headers: index.php
[Tue Oct 12 15:08:04 2010] [warn] (103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request functi$
[Tue Oct 12 15:08:06 2010] [warn] (103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request functi$
[Tue Oct 12 15:08:12 2010] [warn] (103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request functi$
[Tue Oct 12 15:08:12 2010] [warn] (103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request functi$
[Tue Oct 12 15:08:21 2010] [warn] (103)Software caused connection abort: mod_fcgid: ap_pass_brigade failed in handle_request functi$

According to access.log, web8 / inter5.org gets at least 10 requests a second. These all use MySQL (WordPress installs). Nothing is being written to the MySQL log files in /var/log/, damnit. Any thoughts on tuning this thing?
The errors indicate that the daemons cannot reach MySQL, so they cannot serve anything and lock up.

Code:
# The MySQL database server configuration file.
#
# You can copy this to one of:
# - "/etc/mysql/my.cnf" to set global options,
# - "~/.my.cnf" to set user-specific options.
#
# One can use all long options that the program supports.
# Run program with --help to get a list of available options and with
# --print-defaults to see which it would actually understand and use.
#
# For explanations see
# http://dev.mysql.com/doc/mysql/en/server-system-variables.html

# This will be passed to all mysql clients
# It has been reported that passwords should be enclosed with ticks/quotes
# escpecially if they contain "#" chars...
# Remember to edit /etc/mysql/debian.cnf when changing the socket location.
[client]
port            = 3306
socket          = /var/run/mysqld/mysqld.sock

# Here is entries for some specific programs
# The following values assume you have at least 32M ram

# This was formally known as [safe_mysqld]. Both versions are currently parsed.
[mysqld_safe]
socket          = /var/run/mysqld/mysqld.sock
nice            = 0

[mysqld]
#
# * Basic Settings
#
user            = mysql
pid-file        = /var/run/mysqld/mysqld.pid
socket          = /var/run/mysqld/mysqld.sock
port            = 3306
basedir         = /usr
datadir         = /var/lib/mysql
tmpdir          = /tmp
language        = /usr/share/mysql/english
skip-external-locking
#
# Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
#bind-address           = 127.0.0.1
#
# * Fine Tuning
#
key_buffer              = 16M
max_allowed_packet      = 16M
thread_stack            = 128K
thread_cache_size       = 8
# This replaces the startup script and checks MyISAM tables if needed
# the first time they are touched
myisam-recover          = BACKUP
max_connections         = 150
table_cache             = 300
#thread_concurrency     = 10
#
# * Query Cache Configuration
#
query_cache_limit       = 2M
query_cache_size        = 32M
#
# * Logging and Replication
#
# Both location gets rotated by the cronjob.
# Be aware that this log type is a performance killer.
#log            = /var/log/mysql/mysql.log
#
# Error logging goes to syslog. This is a Debian improvement :)
#
# Here you can see queries with especially long duration
log_slow_queries        = /var/log/mysql/mysql-slow.log
#long_query_time = 2
log-queries-not-using-indexes
#
# The following can be used as easy to replay backup logs or for replication.
# note: if you are setting up a replication slave, see README.Debian about
#       other settings you may need to change.
#server-id              = 1
#log_bin                = /var/log/mysql/mysql-bin.log
expire_logs_days        = 10
max_binlog_size         = 100M
#binlog_do_db           = include_database_name
#binlog_ignore_db       = include_database_name
#
# * BerkeleyDB
#
# Using BerkeleyDB is now discouraged as its support will cease in 5.1.12.
skip-bdb
#
# * InnoDB
#
# InnoDB is enabled by default with a 10MB datafile in /var/lib/mysql/.
# Read the manual for more InnoDB related options. There are many!
# You might want to disable InnoDB to shrink the mysqld process by circa 100MB.
#skip-innodb
#
# * Security Features
#
# Read the manual, too, if you want chroot!
# chroot = /var/lib/mysql/
#
# For generating SSL certificates I recommend the OpenSSL GUI "tinyca".
#
# ssl-ca=/etc/mysql/cacert.pem
# ssl-cert=/etc/mysql/server-cert.pem
# ssl-key=/etc/mysql/server-key.pem

[mysqldump]
quick
quote-names
max_allowed_packet      = 16M

[mysql]
#no-auto-rehash # faster start of mysql but no tab completition

[isamchk]
key_buffer              = 16M

#
# * NDB Cluster
#
# See /usr/share/doc/mysql-server-*/README.Debian for more information.
#
# The following configuration is read by the NDB Data Nodes (ndbd processes)
# not from the NDB Management Nodes (ndb_mgmd processes).
#
# [MYSQL_CLUSTER]
# ndb-connectstring=127.0.0.1

#
# * IMPORTANT: Additional settings that can override those from this file!
#   The files must end with '.cnf', otherwise they'll be ignored.
#
!includedir /etc/mysql/conf.d/
join_buffer_size = 2M
max_heap_table_size = 80M
tmp_table_size = 80M
low_priority_updates = 1
concurrent_insert=2
Your server uses a lot of swap, which is normally an indication that it does not have enough RAM. MySQL and Apache need a lot of RAM to run fast: MySQL caches tables and queries in RAM, and performance drops rapidly once it has to use swap instead. Currently you have 2 GB of RAM installed. Can you increase it to 4, 6 or 8 GB?
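To put a number on "a lot of swap", here is a minimal Python sketch that pulls the used/total figures out of the `Swap:` line of the `top` header posted above. The function name `swap_used_fraction` is made up for illustration, and the regex assumes the old `top` layout shown in this thread (values in `k`); newer versions of `top` format this line differently.

```python
import re

def swap_used_fraction(top_header: str) -> float:
    """Return used/total swap from text containing a 'Swap:' line
    in the format printed by the top version shown in this thread."""
    m = re.search(r"Swap:\s*(\d+)k total,\s*(\d+)k used", top_header)
    if m is None:
        raise ValueError("no 'Swap:' line found")
    total, used = int(m.group(1)), int(m.group(2))
    return used / total if total else 0.0

# Swap line copied from the first post.
header = "Swap: 1951856k total, 932880k used, 1018976k free, 142528k cached"
print(f"swap in use: {swap_used_fraction(header):.1%}")
```

For the header in the first post this comes out at just under half of the swap partition in use, on top of RAM that is almost fully used.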
Memory is probably your main issue, but you may also want to raise your MySQL max_connections well above 150, e.g. to 500 (and ignore tuning-primer.sh when it says you have it set too high or are allocating too much memory to MySQL). I say that because mydns lookups and email/spam fighting also hit MySQL (I think). It solved many of my problems at least. The segfaults you get are an ongoing issue that many of us are still trying to fix. I think falko pointed us to some articles just recently, but I haven't had a chance to research them.
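For context on why a tuning script complains when max_connections goes up, here is a rough Python sketch of the usual worst-case estimate: global buffers plus per-connection buffers multiplied by max_connections. This is a simplification and not tuning-primer.sh's exact formula. key_buffer, query_cache_size, thread_stack and join_buffer_size are taken from the my.cnf posted above; the other three per-thread sizes are assumed values roughly matching stock defaults, so check `SHOW VARIABLES` on the real server.

```python
KIB = 1024
MIB = 1024 * KIB

# Global buffers taken from the posted my.cnf.
GLOBAL_BUFFERS = {
    "key_buffer": 16 * MIB,
    "query_cache_size": 32 * MIB,
}

# Buffers that can be allocated once per connection.
PER_THREAD_BUFFERS = {
    "thread_stack": 128 * KIB,         # from the posted my.cnf
    "join_buffer_size": 2 * MIB,       # from the posted my.cnf
    "sort_buffer_size": 2 * MIB,       # ASSUMED, not in the posted config
    "read_buffer_size": 128 * KIB,     # ASSUMED, not in the posted config
    "read_rnd_buffer_size": 256 * KIB, # ASSUMED, not in the posted config
}

def worst_case_bytes(max_connections: int) -> int:
    """Upper bound if every connection allocated every buffer at once."""
    per_thread = sum(PER_THREAD_BUFFERS.values())
    return sum(GLOBAL_BUFFERS.values()) + max_connections * per_thread

for conns in (150, 500):
    print(f"max_connections={conns}: ~{worst_case_bytes(conns) // MIB} MiB worst case")
```

The bound is only reached if every connection is open and running a query that uses all of its buffers, which is rare in practice. That is why the warning can often be ignored, but on a 2 GB machine it is worth knowing how far above physical RAM the theoretical ceiling sits.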
Till, I don't know about memory. The standard price at the datacenter is a $180 one-time fee for an additional 2 GB, or an extra $18 a month. If they let me send in my own memory and pay an installation fee of something like $20, I'd be more inclined to do it. Right now there is 512 MB of memory free and almost all swap is free. It just goes crazy at random points for some reason and everything spikes. I'm 99.99999% sure it's SQL related. I've tweaked it a bit more and will give it 24 hours before I run those tests again to see whether I need to tweak further.
The good news is I can see from your top display that you have collectl running. Download and install collectl-utils and plot everything with colplot. Maybe something will jump out at you as to which resource is being starved. -mark
When I see high load averages the first thing I always check is disk IO. Your load average should be less than or equal to the number of CPU cores you have. If the load is higher than the number of cores then you have a bottleneck somewhere, and it is usually disk IO. When this issue is happening, run:

vmstat 1

(Ctrl+C will exit.) This will print lots of data, including blocks read/written each second. You will see that IO in/out is rather high.

Till and Turbanator are right, you need more RAM. You were using nearly 1 GB of swap and your CPUs were spending 61% of their time waiting on disk IO (61.7%wa, 932880k swap used):

Code:
Cpu(s): 26.4%us, 11.2%sy,  0.2%ni,  0.0%id, 61.7%wa,  0.0%hi,  0.4%si,  0.0%st
Mem:   2063384k total,  1940664k used,   122720k free,    12108k buffers
Swap:  1951856k total,   932880k used,  1018976k free,   142528k cached
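The two checks described above (load average against core count, and the share of CPU time spent in iowait) can be sketched in a few lines of Python. The helper names are made up for illustration, the regex assumes the `Cpu(s):` format of the `top` version shown in this thread, and the core count of 2 is an assumption based on the "dual 3ghz P4" description (with hyperthreading the kernel may report more).

```python
import re

def parse_cpu_line(line: str) -> dict:
    """Parse a top 'Cpu(s):' summary line into {'us': 26.4, 'wa': 61.7, ...}."""
    return {name: float(value)
            for value, name in re.findall(r"([\d.]+)%(\w+)", line)}

def load_per_core(load_1min: float, cores: int) -> float:
    """Load average divided by core count; above 1.0 means work is queueing."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_1min / cores

# Values copied from the top output in the first post.
sample = "Cpu(s): 26.4%us, 11.2%sy, 0.2%ni, 0.0%id, 61.7%wa, 0.0%hi, 0.4%si, 0.0%st"
cpu = parse_cpu_line(sample)
ratio = load_per_core(52.22, cores=2)  # cores=2 is an assumption

print(f"iowait: {cpu['wa']}%  load per core: {ratio:.2f}")
if ratio > 1.0 and cpu["wa"] > cpu["us"] + cpu["sy"]:
    print("more time is spent waiting on IO than computing: look at disk/swap first")
```

On a live system the same inputs are available from `/proc/loadavg` and `os.cpu_count()` instead of pasted text.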
I understand that principle and I thank you all for the commentary. What confuses me is why this occurs so randomly, and at times when the server is generally not under heavy use (web, imap, ftp, etc. are not being hit too hard). It seems as if there's a memory leak or something somewhere. RAM has been on my list, and I'll get some more put in tonight to test it out, at least temporarily.
Also, what would a good wait value generally be? I'm at 4 GB as of now. Wait has been as high as 15.4%, but just for a second or two.

Code:
top - 22:41:33 up 19 min,  1 user,  load average: 0.71, 0.72, 0.72
Tasks: 175 total,   2 running, 171 sleeping,   0 stopped,   2 zombie
Cpu(s): 16.9%us, 11.8%sy,  0.0%ni, 70.9%id,  0.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4063804k total,  1659392k used,  2404412k free,    49000k buffers
Swap:  1951856k total,        0k used,  1951856k free,   435708k cached
I've never paid attention to wait unless I'm having an issue, so I guess I would say that if it is mostly low, that's a good sign. The server was overloaded when you ran top: 15 php-cgi processes and one mysqld process were all fighting for swap and CPU time. Based on the info you provided, I would say this issue was caused by too much web traffic. If you have email and other services on this server, that only adds to the problem. My suggestion is to edit your Apache configs and reduce the number of PHP processes that are allowed to run at a time. You need to limit the PHP processes to a number that your hardware can handle. When too many processes are running, the CPU wastes a considerable amount of time just switching from one process to another. Reducing the number of PHP processes will also reduce the amount of memory needed.
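One rough way to pick "a number that your hardware can handle" is to budget RAM per PHP process. The Python sketch below is a back-of-the-envelope calculation, not a rule from Apache or mod_fcgid. The total (about 2015 MB) and the roughly 50 MB resident size per php-cgi come from the `top` output in the first post; the 1200 MB reserved for MySQL, Apache, mail and the OS is a placeholder assumption that should be replaced with measured values.

```python
def max_php_children(total_mb: int, reserved_mb: int, per_child_mb: int) -> int:
    """How many PHP processes fit in RAM after reserving memory for everything else."""
    if per_child_mb <= 0:
        raise ValueError("per_child_mb must be positive")
    usable = total_mb - reserved_mb
    return max(0, usable // per_child_mb)

# total_mb and per_child_mb are read off the first top output;
# reserved_mb is a hypothetical figure for MySQL, Apache, mail and the OS.
limit = max_php_children(total_mb=2015, reserved_mb=1200, per_child_mb=50)
print(f"php-cgi processes that fit without swapping: {limit}")
```

This ignores memory shared between php-cgi processes (the SHR column), so it errs on the cautious side. The point is only that the process limit should be derived from measured memory use, not left at whatever the default allows.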
Agreed. But what is the best way to edit such settings with ISPC3 in use? I'd love to limit PHP processes for my two popular websites (each raking in several thousand hits a day).
Trying to figure out what is going on by looking at a few data points over a handful of seconds is a really good way to make the wrong decision. Like I said, you have collectl running, so if you look in /var/log/collectl you should see a number of files, probably one per day, that contain samples of almost every performance metric on your system. One set every 10 seconds! All you need to do is 'play it back' with the right parameters. You'll be able to see cpu, disk, network, memory, nfs (if you use it), page faults, interrupts and a whole lot more. Even detailed process and slab memory data. To get started just type:

collectl -p /var/log/collectl/filename -scdnm -oT

The -scdnm will display 'brief' format for cpu, disk, network and memory. The -oT switch will include optional timestamps. You can change the subsystems to see individual CPU, NETWORK or DISK loads by specifying them in uppercase, but you'll get far less compact data. You can even run with --vmstat instead of -s if you prefer that format. Check out http://collectl.sourceforge.net to learn more. -mark
Mark, I'll look into collectl after a while. Till, I'm using fcgi for the main websites as they're WordPress. The rest use mod_php.
That's a good choice! Which value have you set for "FastCGI Children" under System > Server Config on the FastCGI tab? If it is greater than 1, set it to 1. You will then have to change a setting in every website (e.g. the quota) and click Save to apply the new value.
FCGI children is at 8, with max requests at 5000. Should it just be something less than 8, or exactly 1? I changed children, then went to the two big sites and changed the HDD quota by 1 MB. Now the waiting game.
while you're waiting, just type "collectl<return>" and watch the output. very compact, very low overhead. <0.1%. -mark
Till, I made those changes. Now it takes about 4 seconds to access inter5.org and areyouliberal.com.

Code:
top - 08:37:57 up 10:15,  2 users,  load average: 2.17, 2.10, 1.67
Tasks: 185 total,   2 running, 182 sleeping,   0 stopped,   1 zombie
Cpu(s): 11.9%us,  8.5%sy,  0.0%ni, 79.6%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4063804k total,  2898012k used,  1165792k free,   619852k buffers
Swap:  1951856k total,        0k used,  1951856k free,   670860k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
15614 web5      20   0  206m  53m 7768 S   36  1.4   0:29.66 php-cgi
16287 root      20   0     0    0    0 Z    3  0.0   0:00.10 miniserv.pl <defunct>
 2933 mysql     20   0  251m  79m 5904 S    1  2.0  13:44.49 mysqld
15630 www-data  20   0  248m  10m 1796 S    1  0.3   0:00.10 apache2
    1 root      20   0 10316  752  620 S    0  0.0   0:01.14 init
    2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT  -5     0    0    0 S    0  0.0   0:00.08 migration/0
    4 root      15  -5     0    0    0 S    0  0.0   0:00.14 ksoftirqd/0

Code:
waiting for 1 second sample...
#<--------CPU--------><----------Disks-----------><----------Network---------->
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut
 100  28    59   8282      0      0    168     20      5     52      4      44
  96   9    49   9013     28      3    172     10      4     27      4      17
 100   9   100  11693      0      0    332      8      3     28      5      21
  87  14    71   2287      0      0    704     30      4     44      5      37
  50  18    49   1217      0      0     80     14      3     22      3      16
  24  12   101    857      0      0      0      0      3     31      7      28
   0   0    80    148      0      0    196     27      5     52     37      56
   0   0    75    150      4      1    148     14      4     29      7      19
  45   2    77   5209      0      0    252     18      5     36      8      28
  82   5    53  10375      0      0    216     12      5     39      3      21
  79   5    40   9537      0      0     52      8      3     27      2      13
  48   3    61   7862      0      0    296     17      2     21      5      12
   0   0    35    112      0      0     64     10      3     24      3      18
   0   0    33    100      0      0      0      0      1     17      1       9
  24   2   178    104      0      0    928    151      1     16      2      12
  87  11    52   2327      0      0    496      8      2     22      1      13

Ouch!
Well, I see you got collectl going. By the way, if you want to see what your memory is doing at the same time, just add the switch -s+m. If you want timestamps, just include -oT. More suggestions later if you care.

Clearly you don't have a network or disk problem. Your CPUs are certainly getting hammered though, and keep in mind that the CPU number reported is an average over all of them. In fact, if you have 4 CPUs and collectl reports 25%, one of them could be at 100%. To see individual CPU loads, use "collectl -sC". Unfortunately this can be a pain to view, so if you add the --home switch it will provide a display similar to top, but with no history.

One thing that is curious is that I've never seen interrupts so low! Typically you see 1000 on an idle system because the clock interrupts 1K times/second. Being a web hosted environment, perhaps you're running in a VM and the clock is being processed by the hypervisor? No big deal, just a curiosity.

Getting back to the high CPU load, it feels like this is indeed a case where the application needs to be tuned or simply needs more CPU. In any event, as you try to tune, you can always run collectl in another window and observe the results immediately. Enjoy -mark
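The point about the averaged CPU figure is simple arithmetic, sketched here in Python. The function name is made up for illustration; it just computes the highest load any single CPU could have for a given reported average.

```python
def worst_case_single_cpu(avg_pct: float, cpus: int) -> float:
    """Highest utilisation one CPU could have given the average across all CPUs."""
    if cpus <= 0:
        raise ValueError("cpus must be positive")
    return min(100.0, avg_pct * cpus)

# The example from the post: 4 CPUs, 25% reported on average.
print(worst_case_single_cpu(25, 4))
# On a 2-CPU box, a reported 50% can already mean one CPU is saturated.
print(worst_case_single_cpu(50, 2))
```

This is why a single busy php-cgi process can pin one CPU while the summary line still looks moderate, and why the per-CPU view (`collectl -sC`) is worth checking.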
I'll tell you, I notice a difference in response when I kill off the most popular website, inter5.org. (I just deactivated it again to watch.)

It's online in this one:

Code:
top - 10:38:55 up 12:16,  2 users,  load average: 0.83, 1.06, 1.14
Tasks: 180 total,   2 running, 178 sleeping,   0 stopped,   0 zombie
Cpu(s): 62.7%us, 18.7%sy,  0.0%ni, 18.1%id,  0.3%wa,  0.0%hi,  0.2%si,  0.0%st
Mem:   4063804k total,  1847300k used,  2216504k free,   245612k buffers
Swap:  1951856k total,     1792k used,  1950064k free,   370840k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
24176 web5      20   0  207m  53m 7992 S   55  1.3   4:28.24 php-cgi
23646 web8      20   0  194m  40m 7984 S   54  1.0   4:25.35 php-cgi
20516 web5      20   0  209m  55m 7996 R   39  1.4   6:05.86 php-cgi
 2933 mysql     20   0  261m  81m 5988 S   14  2.1  20:36.26 mysqld

It's offline in this one:

Code:
top - 10:41:34 up 12:19,  2 users,  load average: 0.35, 0.81, 1.03
Tasks: 164 total,   1 running, 163 sleeping,   0 stopped,   0 zombie
Cpu(s): 50.0%us,  0.0%sy,  0.0%ni, 50.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4063804k total,  1443496k used,  2620308k free,   246112k buffers
Swap:  1951856k total,     1792k used,  1950064k free,   372280k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
30509 root      20   0 18960 1324  940 R  254  0.0   0:00.46 top
    1 root      20   0 10316  752  620 S    0  0.0   0:01.20 init
    2 root      15  -5     0    0    0 S    0  0.0   0:00.00 kthreadd
    3 root      RT  -5     0    0    0 S    0  0.0   0:00.10 migration/0
    4 root      15  -5     0    0    0 S    0  0.0   0:00.20 ksoftirqd/0

Here is collectl showing where I turned the website back on. You will see the CPU spike and some disk activity.

Code:
   0   0    65    125      0      0    576     17      1     16      1       5
   0   0    53    127      0      0      0      0      2     27      2      21
   1   0    44    127      4      1    108     10      3     38      3      33
   0   0     9     42      0      0      0      0      1     10      1       4
   1   0    62    157      0      0      0      0      1      8      0       2
   0   0    44     89      0      0      0      0      3     38      3      32
   0   0    36     88      0      0     88      9      3     24      3      15
  37   5    30    334     16      4      0      0      2     20      1      12
  49  11    33    472      8      2      0      0      1     15      1      12
  50  24    24   1673      0      0      0      0      1     17      1      13
   0   0    43     82      0      0      0      0      1     12      4      10
   1   0    68    111      0      0    476     16      2     34     33      30
   0   0    29     68      0      0      0      0      3     28     13      23
   0   0    26     95      0      0      0      0      2     19      2      10
   1   0    25     61      0      0      0      0      1     15      1       8
#<--------CPU--------><----------Disks-----------><----------Network---------->
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut
  24   6    45    488      0      0    660     14      2     20      1      10
  14   7    43    214      0      0      0      0      3     21      3      17
  32   3    12    152      4      1      0      0      2     21      1      10
   5   1    16     75      0      0      0      0      1      9      0       3
  57  17    32    728      0      0      0      0      1     11      1       5
  84  25   142   1585    340     72   1604     30      2     25      1      12
  97  15    70   4485     32      6      0      0      2     24      7      20
 100  29    65   5916      0      0      0      0      4     53     27      40
  76  29    41   4397      8      2      0      0      2     30      2      19
  62  16    55    904      0      0      0      0      2     32     24      23
  34  18    41   1225      0      0    464     13      3     37     39      35
  20   0    30    106      0      0      0      0      1     19      1       9
  96  16    31   4941     24      3      0      0      2     20      5      14
  84  25    58   7904      0      0      4      1      2     19      2       8

Also, check out the graphs for anything interesting I'm not seeing: http://monitor.derekgordon.com/munin/derekgordon.com/index.html
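Spotting the spike by eye in pasted collectl output gets tedious, so here is a small Python sketch that parses collectl "brief" rows and picks out the busy samples. The column names follow the header printed in this thread; the function names and the 90% threshold are arbitrary choices for illustration, and the four sample rows are copied from the output above.

```python
COLUMNS = ["cpu", "sys", "inter", "ctxsw", "KBRead", "Reads",
           "KBWrit", "Writes", "KBIn", "PktIn", "KBOut", "PktOut"]

def parse_brief(text: str) -> list:
    """Turn collectl brief-format lines into dicts, skipping headers and noise."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Header lines start with '#'; status lines start with a letter.
        if not line or not line[0].isdigit():
            continue
        fields = line.split()
        if len(fields) != len(COLUMNS):
            continue
        rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows

def busy_samples(rows: list, threshold: int = 90) -> list:
    """Samples where total CPU is at or above the threshold (arbitrary default)."""
    return [r for r in rows if r["cpu"] >= threshold]

sample = """\
#cpu sys inter  ctxsw KBRead  Reads KBWrit Writes   KBIn  PktIn  KBOut  PktOut
  57  17    32    728      0      0      0      0      1     11      1       5
  84  25   142   1585    340     72   1604     30      2     25      1      12
  97  15    70   4485     32      6      0      0      2     24      7      20
 100  29    65   5916      0      0      0      0      4     53     27      40
"""
rows = parse_brief(sample)
for r in busy_samples(rows):
    print(f"cpu={r['cpu']}% ctxsw={r['ctxsw']} KBRead={r['KBRead']} KBWrit={r['KBWrit']}")
```

In these rows the busiest samples coincide with thousands of context switches per second and almost no disk traffic, which fits the picture of PHP processes competing for CPU more than an IO bottleneck.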
I took a quick look at your graphs and I'm afraid they're not going to be very useful. The samples are taken far too infrequently to show anything meaningful such as spikes. Furthermore, you're using RRD, which normalizes the data it plots, a fancy name for 'it lies!!!'.

Perhaps the next thing you might want to look into is downloading collectl-utils, which contains a tool called colplot. It uses gnuplot and a web interface to display very detailed and accurate plots from collectl data. The way it works is that you use collectl to turn the collected data into something plottable (or even loadable into a spreadsheet). Just use the playback command like this:

collectl -p /var/log/collectl/filename -P -f/tmp

and that will create a plottable file in /tmp. Then you run colplot, point it at /tmp and tell it to draw all plots. There is a sample of one of the plots on the collectl webpage, as well as more info here: http://collectl-utils.sourceforge.net/colplot.html -mark