Massive disk I/O spike

What could be the reason for this?

![](http://img440.imageshack.us/img440/8309/picture3nz.png)

And how do I find the errant process?

9 Replies

Log rotation or backups are the usual culprits. Install iotop and it'll tell you what's using your I/O.

Got iotop installed.

Can I use it to look back at this timeframe, or do I need to set up a cron job and wait for the next outage?
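As far as I know, iotop only shows live activity, so it can't look back at a past spike; logging samples going forward is the usual workaround. Here is a minimal sketch, assuming Python 3 and iotop are installed and the script is run as root from cron. The log path and the function names are hypothetical.

```python
import subprocess
from datetime import datetime


def iotop_command(iterations=3, delay=5):
    """Build an iotop batch-mode command line.

    -b batch mode, -o only processes doing I/O, -t timestamp each line,
    -q fewer header lines, -n number of iterations, -d seconds between them.
    """
    return ["iotop", "-b", "-o", "-t", "-q",
            "-n", str(iterations), "-d", str(delay)]


def log_sample(log_path, iterations=3, delay=5):
    """Run iotop once and append its output to log_path."""
    result = subprocess.run(iotop_command(iterations, delay),
                            capture_output=True, text=True, check=False)
    with open(log_path, "a") as log:
        log.write("--- %s ---\n" % datetime.now().isoformat())
        log.write(result.stdout)


# Example call (hypothetical path), e.g. from a script that cron runs every minute:
# log_sample("/var/log/iotop-samples.log")
```

After the next spike, the log can be searched for the timestamps that match the graph.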

iotop is giving me this.

Total DISK READ: 8.10 M/s | Total DISK WRITE: 120.36 K/s
  PID USER      DISK READ  DISK WRITE   SWAPIN    IO    COMMAND
  183 root           0 B/s       0 B/s  0.00 % 86.19 % [kswapd0]
 1927 www-data    7.92 M/s       0 B/s 26.10 % 24.06 % apache2 -k start
 1858 mysql     182.49 K/s   50.47 K/s  0.00 %  1.53 % mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --pid-file=/var/run/mysqld/mysqld.pid --skip-external-locking --port=3306 --socket=/var/run/mysqld/mysqld.sock
 1959 root        7.77 K/s       0 B/s  0.00 %  1.36 % python /usr/bin/iotop -bo --iter=100
 1763 root           0 B/s   69.89 K/s  0.00 %  0.00 % java -Djava.util.logging.config.file=/opt/tomcat/conf/logging.properties -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Djava.endorsed.dirs=/opt/tomcat/endorsed -classpath /opt/tomcat/bin/bootstrap.jar -Dcatalina.base=/opt/tomcat -Dcatalina.home=/opt/tomcat -Djava.io.tmpdir=/opt/tomcat/temp org.apache.catalina.startup.Bootstrap start
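For longer captures, the batch output can be ranked by the IO column instead of read by eye. This is a rough sketch that assumes the column layout shown above (no `-t` timestamp prefix); the function names are made up for illustration.

```python
import re

# One process row of `iotop -b` output, e.g.
#   183 root  0 B/s  0 B/s  0.00 % 86.19 % [kswapd0]
ROW = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<user>\S+)\s+"
    r"(?P<read>[\d.]+ \S+/s)\s+(?P<write>[\d.]+ \S+/s)\s+"
    r"(?P<swapin>[\d.]+) %\s+(?P<io>[\d.]+) %\s+(?P<command>.*)$"
)


def parse_iotop(text):
    """Return one dict per process row; header and total lines are skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            row = match.groupdict()
            row["pid"] = int(row["pid"])
            row["swapin"] = float(row["swapin"])
            row["io"] = float(row["io"])
            rows.append(row)
    return rows


def rank_by_io(rows):
    """Sort rows so the process waiting most on I/O comes first."""
    return sorted(rows, key=lambda r: r["io"], reverse=True)


SAMPLE = """\
Total DISK READ: 8.10 M/s | Total DISK WRITE: 120.36 K/s
  PID USER      DISK READ  DISK WRITE   SWAPIN    IO    COMMAND
  183 root           0 B/s       0 B/s  0.00 % 86.19 % [kswapd0]
 1927 www-data    7.92 M/s       0 B/s 26.10 % 24.06 % apache2 -k start
"""

for row in rank_by_io(parse_iotop(SAMPLE)):
    print(row["io"], row["swapin"], row["command"])
```

With the sample above, `[kswapd0]` sorts to the top, followed by the apache2 process with its non-zero SWAPIN figure.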

It looks like you're swapping? You've either configured your software to use too much RAM, or you don't have enough RAM.

Right you are. Syslog is reporting out-of-memory kills and swapping.

Dec 20 12:41:28 li176-178 kernel: Swap cache stats: add 8999236, delete 8982449, find 5311322/6233419
Dec 20 12:41:28 li176-178 kernel: Free swap  = 0kB
Dec 20 12:41:28 li176-178 kernel: Total swap = 262136kB
Dec 20 12:41:28 li176-178 kernel: 131072 pages RAM
Dec 20 12:41:28 li176-178 kernel: 0 pages HighMem
Dec 20 12:41:28 li176-178 kernel: 3409 pages reserved
Dec 20 12:41:28 li176-178 kernel: 1094 pages shared
Dec 20 12:41:28 li176-178 kernel: 125143 pages non-shared
Dec 20 12:41:28 li176-178 kernel: Out of memory: kill process 12788 (apache2) score 20764 or a child
Dec 20 12:41:28 li176-178 kernel: Killed process 12788 (apache2)
Dec 20 12:42:47 li176-178 kernel: apache2 invoked oom-killer: gfp_mask=0x200da, order=0, oom_adj=0
Dec 20 12:42:47 li176-178 kernel: Pid: 12794, comm: apache2 Not tainted 2.6.32.16-linode28 #1
Dec 20 12:42:47 li176-178 kernel: Call Trace:
Dec 20 12:42:47 li176-178 kernel: [<c01778fa>] ? oom_kill_process+0x9a/0x280
Dec 20 12:42:47 li176-178 kernel: [<c0177f6c>] ? __out_of_memory+0xfc/0x160
Dec 20 12:42:47 li176-178 kernel: [<c0178024>] ? out_of_memory+0x54/0xb0
Dec 20 12:42:47 li176-178 kernel: [<c017b251>] ? __alloc_pages_nodemask+0x561/0x580
Dec 20 12:42:47 li176-178 kernel: [<c0196707>] ? read_swap_cache_async+0xc7/0x110
Dec 20 12:42:47 li176-178 kernel: [<c01967b8>] ? swapin_readahead+0x68/0x90
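To pull the key numbers out of a log like this, something along these lines might help. It is only a sketch: it assumes 4 KiB pages (the usual size on 32-bit x86) and the message wording shown in this excerpt, and `summarize` is a made-up helper name.

```python
import re

PAGE_SIZE_KIB = 4  # assumption: 4 KiB pages

KILLED = re.compile(r"Killed process (\d+) \(([^)]+)\)")
PAGES_RAM = re.compile(r"(\d+) pages RAM")
SWAP = re.compile(r"(Free|Total) swap\s*=\s*(\d+)kB")


def summarize(log_text):
    """Collect OOM kills, RAM size and swap figures from kernel log lines."""
    summary = {"killed": [], "ram_mib": None, "swap_kb": {}}
    for line in log_text.splitlines():
        match = KILLED.search(line)
        if match:
            summary["killed"].append((int(match.group(1)), match.group(2)))
        match = PAGES_RAM.search(line)
        if match:
            summary["ram_mib"] = int(match.group(1)) * PAGE_SIZE_KIB / 1024
        match = SWAP.search(line)
        if match:
            summary["swap_kb"][match.group(1).lower()] = int(match.group(2))
    return summary


SAMPLE = """\
Dec 20 12:41:28 li176-178 kernel: Free swap  = 0kB
Dec 20 12:41:28 li176-178 kernel: Total swap = 262136kB
Dec 20 12:41:28 li176-178 kernel: 131072 pages RAM
Dec 20 12:41:28 li176-178 kernel: Killed process 12788 (apache2)
"""

print(summarize(SAMPLE))
```

Under the 4 KiB assumption, 131072 pages works out to 512 MiB of RAM, and free swap of 0 kB out of 262136 kB means swap was completely exhausted when the kernel killed apache2.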

The setup is a Linode 512 running Debian Lenny with PHP (memory limit 128M), MySQL, Drupal 6, and next to no traffic (the site is under development). I have also installed Java and Tomcat6 for Apache Solr (I suspect this is probably causing the memory leak). Checking memory config now…

Some thoughts:

1) PHP's default memory limit is 32MB (previously 16MB in PHP4, I believe), which in my experience is plenty. I've not run Drupal, though, so perhaps it's a memory hog. But, is there a reason you've raised the limit?

2) Tomcat is a huge memory hog, and you're running it alongside Java, Apache httpd, etc. Are there any alternatives to Solr you can use? If not, you might need a bigger Linode.

3) Which mysql config file are you using? It's possible you're giving it more RAM than it needs.

4) What webserver are you running, and how have you configured it? Apache by default is configured for machines with massive amounts of RAM, and tweaking Apache can usually result in enormous savings.
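On point 4, a back-of-the-envelope way to size Apache's prefork `MaxClients` is to divide the RAM left over for Apache by the size of one Apache child. The sketch below uses made-up figures (300 MB reserved for MySQL, Tomcat and the OS, 25 MB per child); measure your own before trusting the result.

```python
def max_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Largest number of Apache prefork children that fits in the remaining RAM.

    total_ram_mb:   RAM on the machine
    reserved_mb:    RAM set aside for everything that is not Apache
    per_process_mb: resident size of one Apache child
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available = total_ram_mb - reserved_mb
    return max(1, int(available // per_process_mb))


# Hypothetical figures for a 512 MB machine.
print(max_clients(512, 300, 25))    # typical child size -> 8
print(max_clients(512, 300, 128))   # every child at the 128M PHP limit -> 1
```

If the `MaxClients` value in the Apache config is much higher than this kind of estimate, a burst of requests can push the box into swap.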

1) Yes, Drupal is a memory hog. For complex sites with lots of image processing, 96M is recommended, so I think 128M is fairly standard.

2) Pretty sure this is where the memory leak is coming from. The alternative is to use Jetty instead of Tomcat. Need to check that out. Either way I do need to stick with Solr.

3) my.cnf looks like this:

key_buffer              = 20M
max_allowed_packet      = 64M
thread_stack            = 192K
thread_cache_size       = 8

myisam-recover          = BACKUP
max_connections        = 100

query_cache_limit       = 1M
query_cache_size        = 64M

log_slow_queries        = /var/log/mysql/mysql-slow.log
long_query_time = 2

expire_logs_days        = 10
max_binlog_size         = 100M

skip-bdb
skip-innodb

[mysqldump]
quick
quote-names
max_allowed_packet      = 64M

[mysql]
#no-auto-rehash # faster start of mysql but no tab completition

[isamchk]
key_buffer              = 20M

query_cache_type = 1
default_character_set = utf8
collation_server = utf8_general_ci
character_set_server = utf8
table_cache = 200

Anything odd there?
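One way to sanity-check a config like this is to add up the big buffers. The sketch below is a rough lower bound only: it counts `key_buffer` and `query_cache_size` once, plus `thread_stack` for each of `max_connections`, and ignores the per-connection sort, read and join buffers because their values are not in the posted file. The helper names are made up.

```python
UNITS = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def to_bytes(value):
    """Convert a my.cnf size such as '20M' or '192K' to bytes."""
    value = value.strip().upper()
    if value[-1] in UNITS:
        return int(float(value[:-1]) * UNITS[value[-1]])
    return int(value)


def estimate_mysql_bytes(settings):
    """Rough lower bound: global buffers + thread stack for every connection."""
    global_buffers = (to_bytes(settings["key_buffer"])
                      + to_bytes(settings["query_cache_size"]))
    per_connection = to_bytes(settings["thread_stack"])
    return global_buffers + int(settings["max_connections"]) * per_connection


# Values copied from the config above.
settings = {
    "key_buffer": "20M",
    "query_cache_size": "64M",
    "thread_stack": "192K",
    "max_connections": "100",
}

print(estimate_mysql_bytes(settings) / 1024 ** 2)  # 102.75 (MiB)
```

That is about 103 MiB if all 100 connections were in use, which is a noticeable slice of a 512 MB machine but far less than the worst case for Apache with a 128M PHP limit.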

4) Running Apache with no special configuration. Server has just been set up using the LAMP stack instructions on Linode.

We always intended to get the server tuned up by a sysadmin before launching, but it looks like something is borked and it needs doing now.

Install munin. It will monitor your usage for you and make graphs (like the ones on the Linode dashboard) for CPU, RAM, swap, etc.

Also install htop. It's a better version of top (imho), and it'll show you what's using the most RAM.
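If you'd rather script it than watch htop, the same information is available from `/proc`. This is a minimal sketch, assuming a Linux `/proc` filesystem and Python 3; it ranks processes by resident memory (the `VmRSS` field), and the function names are made up.

```python
import glob


def parse_status(text):
    """Return (name, rss_kib) from the text of /proc/<pid>/status.

    Kernel threads have no VmRSS line, so their RSS is reported as 0.
    """
    name, rss = "?", 0
    for line in text.splitlines():
        if line.startswith("Name:"):
            name = line.split(":", 1)[1].strip()
        elif line.startswith("VmRSS:"):
            rss = int(line.split()[1])  # value is in kB
    return name, rss


def top_rss(n=10):
    """Return the n processes with the largest resident set, biggest first."""
    procs = []
    for path in glob.glob("/proc/[0-9]*/status"):
        try:
            with open(path, errors="replace") as f:
                name, rss = parse_status(f.read())
        except OSError:
            continue  # the process exited while we were reading
        pid = int(path.split("/")[2])
        procs.append((rss, pid, name))
    return sorted(procs, reverse=True)[:n]


if __name__ == "__main__":
    for rss, pid, name in top_rss():
        print("%8.1f MiB  %6d  %s" % (rss / 1024, pid, name))
```

Resident size counts shared pages once per process, so summing the Apache children will overstate their real total.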

Your mysql config looks fine.

Is this a 32-bit or 64-bit system?

I had similar graphs, starting a few days ago. Symptoms included fail2ban logging thousands of PAM failures at a pop due to brute-force attacks, and the kernel running out of memory and killing anything it could to keep saslauthd running. Got quite a few error_exit conditions in kern.log, too. Disk I/O, and to some extent CPU, was spiking as the system thrashed and searched for stuff to kill.

Restarted saslauthd and disabled password authentication for all ssh shells (like I should have in the first place!) and have not had a problem in 24 hours.

Good luck!
