Ways to monitor Disk IO and Traffic to pinpoint an error
Ok so I'm still a total Linux novice even though I've been running my Linodes for a couple of years now. Lately (as in the last couple of days), I've been having occasional massive CPU spikes, my disk IO rate has been far more active, and there seem to be a lot of transfer spikes. Most of these happen when I'm not paying attention (not that I spend my time watching top scroll by).
What are some methods or tools that I can use to try to pinpoint what causes the increased load? I would also love to be able to see exactly which group of files is being transferred. It is possible that I suddenly got a bunch more traffic, but the spikes are a bit worrying.
Also, I already checked the Apache logs to see if there was anything out of the ordinary, and I made a couple of corrections to APC, but that was about it. I don't know if there is a program I can run for a day and then check through its logs to see what's going on.
So I'm running:
Thanks for the help!
Also, since I made my change to APC today, my IO load is way higher. At what point should I be concerned about it?
How are you running PHP? How big are these spikes? Can you give some numbers?
You really need to check the access logs for the times during those spikes and see what was accessed.
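If it helps, here is a rough sketch of how you could total up the bytes served per minute and per URL from an access log, so the spike times and the files behind them stand out. It assumes Apache's common/combined log format; the regex and the function name summarize are just mine for illustration, so adjust them if your LogFormat differs.

```python
import re
import sys
from collections import Counter

# Assumption: Apache "common"/"combined" log format.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

def summarize(lines):
    """Return (bytes_per_minute, bytes_per_path) as Counters."""
    per_minute = Counter()
    per_path = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not look like access-log entries
        size = int(m.group("size")) if m.group("size").isdigit() else 0
        minute = m.group("ts")[:17]  # e.g. "10/Oct/2012:20:01"
        per_minute[minute] += size
        per_path[m.group("path")] += size
    return per_minute, per_path

# Usage: python3 logsum.py /path/to/access.log
if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], errors="replace") as f:
        per_minute, per_path = summarize(f)
    print("Busiest minutes (bytes):")
    for minute, total in per_minute.most_common(10):
        print(f"  {minute}  {total}")
    print("Heaviest paths (bytes):")
    for path, total in per_path.most_common(10):
        print(f"  {path}  {total}")
```

Point it at whichever access log covers the spike and compare the busiest minutes with the times on your graphs.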
As for the disk IO increase, well, since you say you tampered with APC I'd say that's the cause. I'd hazard a guess and say you reduced APC memory and it went from swapping to disk IO because the PHP files have to be purged and reloaded between requests. But that's really a lot of assumption.
You should post your APC config.
;32M per WordPress install
;Relative to the number of cached files (you may need to watch your stats for a day or two to find out a good number)
;Relative to the size of WordPress
;The number of seconds a cache entry is allowed to idle in a slot before APC dumps the cache
;Setting this to 0 will give you the best performance, as APC will
;not have to check the IO for changes. However, you must clear
;the APC cache to recompile already cached files. If you are still
;developing, updating your site daily in WP-ADMIN, and running W3TC
;set this to 1
;This MUST be 0, WP can have errors otherwise!
;Only set to 1 while debugging
;Allow 2 seconds after a file is created before it is cached to prevent users from seeing half-written/weird pages
;Leave at 2M or lower. WordPress doesn't have any file sizes close to 2M
That spike at 20h is nothing I did. It's one of the things I'm concerned about and trying to track down.
The thing is, I don't really know what it SHOULD look like. I'll take any and all advice at this point
What did you change in the APC config? And I suppose you checked the relevant error logs? Are any of them filling up quickly? Other than that, you should check iotop and maybe even ftop for the most IO-active process.
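If you can't install iotop, a minimal sketch of the same idea is to read the per-process counters the kernel exposes in /proc/<pid>/io twice and compare them. This assumes a kernel with per-task IO accounting enabled and that you run it as root (otherwise you only see your own processes); the function names here are hypothetical, not part of any tool.

```python
import os
import sys
import time

def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of int counters."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep and value.strip().isdigit():
            counters[key.strip()] = int(value)
    return counters

def snapshot():
    """Return {pid: (name, read_bytes, write_bytes)} for readable processes."""
    snap = {}
    for pid in filter(str.isdigit, os.listdir("/proc")):
        try:
            with open(f"/proc/{pid}/io") as f:
                io = parse_proc_io(f.read())
            with open(f"/proc/{pid}/comm") as f:
                name = f.read().strip()
        except OSError:
            continue  # process exited, or we lack permission
        snap[int(pid)] = (name, io.get("read_bytes", 0), io.get("write_bytes", 0))
    return snap

def top_io(before, after, n=5):
    """Rank processes by bytes read plus written between two snapshots.

    A process missing from `before` is counted from zero, so its whole
    lifetime total shows up as the delta.
    """
    deltas = []
    for pid, (name, r1, w1) in after.items():
        _, r0, w0 = before.get(pid, (name, 0, 0))
        deltas.append(((r1 - r0) + (w1 - w0), pid, name))
    return sorted(deltas, reverse=True)[:n]

# Usage: sudo python3 iowatch.py 10   (sample interval in seconds)
if __name__ == "__main__" and len(sys.argv) > 1:
    before = snapshot()
    time.sleep(float(sys.argv[1]))
    for total, pid, name in top_io(before, snapshot()):
        print(f"{total:>12} bytes  pid {pid}  {name}")
```

Run it during a spike, or from cron with the output appended to a file, and see which process name keeps landing at the top.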
All I did was change it from the default APC setup.
At least prune it by 300M
Can you explain a little bit more what you are referring to?