What do the Linode CPU and Munin numbers mean?

OK, I finally went live with my system on a Linode 2048.

Pretty sure that's way more than I'll need, but I was in a hurry to move off a dedicated box that had 2 GB of RAM, and I figure once I watch some counters for a while, I can downsize.

So, I'm looking at about a day's worth of usage:

Linode's dashboard says I'm using about 25% of CPU.

Munin's graph looks about the same (25%).

It seems like Linode's number only goes to 100%, while Munin goes to 800% (the max on the idle counter is 799.9).

What is that telling me? Should I be concerned that I'm using too much CPU? Or is there really room to add tons of users to my vBulletin installation without worry?

9 Replies

1 core = 100%

25% = 1/4 core

Your host machine probably has 8 cores = 800%

According to this page, they put 10 Linode 2048s per machine. So you can probably use up to 80% without any adverse consequences, maybe even a lot more if the other customers aren't using much CPU.

By the way, the Y-axis of the dashboard graph will scale according to how much you're actually using. It makes more sense than having your graph stick to the very bottom of a humongous rectangle.
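
The arithmetic in this reply can be sketched in a few lines of Python. This is only an illustration: the function names are made up here, and the figures (8 host cores, 10 Linode 2048s per host) are the ones quoted above, not verified values.

```python
def cores_used(percent):
    """Convert a 'percent of one core' reading into a number of cores."""
    return percent / 100.0


def guaranteed_share_percent(host_cores, linodes_per_host):
    """CPU each guest would get if every guest on the host were busy at once,
    expressed on the scale where 100% means one full core."""
    return host_cores * 100.0 / linodes_per_host


if __name__ == "__main__":
    # Figures taken from the reply above; treat them as assumptions.
    print(cores_used(25))                    # the 25% reading, in cores
    print(guaranteed_share_percent(8, 10))   # 8 cores shared by 10 guests
```

Anything above that share is only available when the neighbours are not using their own.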

@ericholtman:

> It seems like Linode's number only goes to 100%, while Munin goes to 800% (the max on the idle counter is 799.9).
Are you using one of the paravirt kernels (2.6.3x)? I've found that in those cases, Munin seems to mis-identify the idle maximum, which should be 400% rather than the full set of cores, as a single Linode only has access to 4 cores of the host at most. It doesn't happen with the older non-paravirt kernel (e.g., 2.6.18), but I know that there are differences in how pvops and non-pvops virtualization is handled by the Xen host.

See http://www.linode.com/forums/viewtopic.php?t=4788

I ended up patching the CPU plugin (as Jed suggested in the above thread) to simply divide idle values by 2 when using a paravirt kernel.

– David
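
For anyone wondering where these percentages come from: on Linux they can be derived from the tick counters on the aggregate `cpu` line of /proc/stat. The following is a minimal Python sketch of that calculation. It is not the Munin plugin and not David's patch. The function names, the `idle_divisor` knob, and the sample numbers are made up for illustration, and it assumes 100 clock ticks per second per CPU (the usual USER_HZ value, which `getconf CLK_TCK` reports).

```python
USER_HZ = 100  # assumed clock ticks per second per CPU; check `getconf CLK_TCK`

FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return {state: ticks} from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return dict(zip(FIELDS, (int(p) for p in parts[1:])))
    raise ValueError("no aggregate 'cpu' line found")


def count_cpus(stat_text):
    """Count the per-CPU lines (cpu0, cpu1, ...) visible to the guest."""
    return sum(1 for line in stat_text.splitlines()
               if line.startswith("cpu") and line[3:4].isdigit())


def usage_percent(before, after, seconds, idle_divisor=1):
    """Convert tick deltas to percent, where 100 means one fully busy core."""
    pct = {state: (after[state] - before[state]) * 100.0 / (USER_HZ * seconds)
           for state in before}
    pct["idle"] /= idle_divisor  # hypothetical workaround knob, e.g. 2
    return pct


if __name__ == "__main__":
    # Made-up samples, 10 seconds apart, on a guest that sees 4 CPUs.
    per_cpu = "".join("cpu%d 0 0 0 0 0 0 0 0\n" % n for n in range(4))
    first = "cpu  100 0 50 7000 0 0 0 0\n" + per_cpu
    second = "cpu  350 0 50 14750 0 0 0 0\n" + per_cpu
    result = usage_percent(parse_cpu_line(first), parse_cpu_line(second),
                           seconds=10, idle_divisor=2)
    print(count_cpus(second), result["user"], result["idle"])
```

With `idle_divisor=1` the made-up sample shows idle near 800% on a 4-CPU guest, which is the symptom described above. Halving it brings the total back under the 400% that four cores can deliver.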

@db3l:

> Are you using one of the paravirt kernels (2.6.3x)?

Yup, 2.6.3x.

So, if I understand correctly... the real physical hardware I'm running on is 4 CPUs, each dual core.

My virtual instance only has access to four cores.

So, the graph on the linode dashboard should scale to 400%.

And munin would go to 400% as well (ignoring idle, which might be busted, according to the referenced thread).

And 'top', if it showed 100% utilization would be 100% of all cores, or the equivalent of 400% on the other graphs. And when I hit '1' in top, I only see 4 CPUs, not 8, because my virtualized instance only has 4.

So, assuming I'm consuming 1/10th of a physical machine, that would mean a steady state 40% number in the dashboard or in munin would be 'fair' (although I am allowed to have periods of higher use).

@ericholtman:

> So, if I understand correctly... the real physical hardware I'm running on is 4 CPUs, each dual core.
I believe most (if not all) hosts have dual quad-core Xeon processors. But yes, a total of 8 cores available to the host, of which each guest is permitted access to up to 4.

> So, the graph on the linode dashboard should scale to 400%.
It depends on what "should" means - I don't think it's what the dashboard currently does, if that's what you mean. I believe the dashboard is scaled to a single host CPU core, so 100% on the dashboard is only a single full core. The benefit to that is it's independent of any change in cores for the guest, but the downside is it's a different scale than guest-based utilities, because Linux counts CPU up to 100% * CPU cores.

> And 'top', if it showed 100% utilization would be 100% of all cores, or the equivalent of 400% on the other graphs. And when I hit '1' in top, I only see 4 CPUs, not 8, because my virtualized instance only has 4.
Yes. Of course, when expanded to show all CPUs, then 100% is per-CPU on each line, as opposed to 100% of all CPUs when shown on a single line in single cpu mode.

> So, assuming I'm consuming 1/10th of a physical machine, that would mean a steady state 40% number in the dashboard or in munin would be 'fair' (although I am allowed to have periods of higher use).
Well, the dashboard is relative to a single CPU core, so 40% on the dashboard is 40% of 1 core, or 5% of the physical machine (all 8 cores). 1/10 of the physical machine (8 cores) would be 80% of a single core (the dashboard value).

CPU is pretty much always "fair", since in contention it is always shared equally among the guests. If multiple Linodes on the same host are really pushing CPU, at worst, they'll share it equally.

It's true you could derive an expected capacity by dividing total CPU amongst the Linodes sharing that host (less some overhead for the host), yielding a value that could be used as a minimum expected allowance, since all guests running at maximum CPU simultaneously would produce that value. And there's a plausible argument that keeping average utilization close to that limit is being "neighborly" to the other guests sharing your host. That's also why running, say, a distributed computation just to absorb free CPU may not be the nicest thing to do.

But in reality, that minimum percentage (especially for the entry configurations) is quite a bit lower than the CPU available on average in practice to a single guest given statistical sharing. So by and large I'd use the CPU you need, and let it be equally shared with everyone else on the host.

– David
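
The scale conversions in David's reply can be written down as a small Python sketch. The function and constant names are made up for illustration, and the 8 host cores and 4 guest cores are the figures stated in this thread, so treat them as assumptions. The conversion to `top`'s single summary line is nominal: it assumes that 100% there means all guest CPUs busy, as discussed above.

```python
HOST_CORES = 8   # per this thread: dual quad-core host (assumption)
GUEST_CORES = 4  # per this thread: cores visible to one guest (assumption)


def dashboard_to_host_fraction(dashboard_percent, host_cores=HOST_CORES):
    """Dashboard reading (100% = one core) as a fraction of the whole host."""
    return dashboard_percent / (100.0 * host_cores)


def dashboard_to_top_summary(dashboard_percent, guest_cores=GUEST_CORES):
    """Dashboard reading expressed on top's single-line scale,
    where 100% means every guest CPU is fully busy."""
    return dashboard_percent / float(guest_cores)


if __name__ == "__main__":
    for reading in (40, 80, 400):
        print(reading,
              dashboard_to_host_fraction(reading),
              dashboard_to_top_summary(reading))
```

So a steady 40% on the dashboard works out to 5% of the host, and 80% is the one-tenth share.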

@db3l:

@ericholtman:

> > So, the graph on the linode dashboard should scale to 400%.
> It depends on what "should" means - I don't think it's what the dashboard currently does, if that's what you mean. I believe the dashboard is scaled to a single host CPU core, so 100% on the dashboard is only a single full core.

I just tried on my test linode, running 8 instances of

int main (int argc, char **argv) 
{
    while (1);
}

and that pegged all four CPUs in 'top', but the Linode dashboard maxed out at about 260%.

So it's definitely not 100% max, my guess is 400%

I suppose a better gauge would be that thing in the bottom right, the "Host Load Summary", to try and figure out whether I'm being a hog.

I've seen Linodes go all the way up to 396% when they go belly up (e.g. out of memory). The Dashboard graph scales up to 400%. Try it if you like, but the staff is not going to like it 8)

@ericholtman:

> So it's definitely not 100% max, my guess is 400%
Sorry, I didn't mean to imply that the maximum value the dashboard could show was 100%, only that the dashboard scale was based on CPU cores, so 100% on the dashboard represents a single CPU core.

– David

I think you're focusing too much on your CPU usage when deciding how to scale your Linode. Even a 512 has access to four of eight cores, and contention tends to be low (although there are more linodes on the box to contend with, so your guaranteed share is smaller).

You should run your box under typical load for a while and take a look at memory usage; if you're not using all the RAM (ignoring cache/buffer), then you might be able to save some money by downgrading to a smaller linode. If things aren't running smoothly after downgrading, you can always upgrade again; downgrades and upgrades are very quick processes, and if you do them at an off-peak period, it's likely that your users won't even notice the few minutes of downtime.
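
One way to get the "ignoring cache/buffer" figure is to subtract free, buffer, and cache memory from the total reported in /proc/meminfo. Here is a minimal Python sketch of that. The function names and the sample numbers are made up for illustration; on a real box you would pass in the contents of /proc/meminfo instead of the sample text.

```python
def parse_meminfo(text):
    """Return {name: value_in_kB} from /proc/meminfo-style text."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            info[name.strip()] = int(fields[0])
    return info


def app_memory_kb(info):
    """Memory held by applications, excluding free, buffers and page cache."""
    return info["MemTotal"] - info["MemFree"] - info["Buffers"] - info["Cached"]


if __name__ == "__main__":
    # Made-up sample; replace with open("/proc/meminfo").read() on a real box.
    sample = (
        "MemTotal:        2097152 kB\n"
        "MemFree:          262144 kB\n"
        "Buffers:          131072 kB\n"
        "Cached:          1048576 kB\n"
    )
    used_kb = app_memory_kb(parse_meminfo(sample))
    print("applications are using about %d MB" % (used_kb // 1024))
```

If that figure stays well below the RAM of the next plan down while the site is under typical load, a smaller plan is probably worth trying.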

Of course, there are also various things that you can do to ensure that your installation is well optimized. vBulletin is PHP/MySQL, so there are a few options to help with performance:

1) Lighttpd/nginx/litespeed can be more easily tweaked for efficient memory usage than Apache

2) An opcode cache for PHP like APC can make a decent dent in CPU usage

3) There are drop-in replacement storage engines for MySQL that are faster than the defaults (can't remember their names off-hand), and there are also the various post-Oracle/Sun forks. Making sure you're using a decent MySQL config helps too (just pick the sample config for the memory target you want; they work fine without tweaking)

@Guspaz:

> I think you're focusing too much on your CPU usage when deciding how to scale your Linode. Even a 512 has access to four of eight cores, and contention tends to be low (although there are more linodes on the box to contend with, so your guaranteed share is smaller).

Oh, I agree. I'm just trying to get a handle on what the different numbers mean.

> You should run your box under typical load for a while and take a look at memory usage; if you're not using all the RAM (ignoring cache/buffer), then you might be able to save some money by downgrading to a smaller linode.

That's what I'm planning on doing. Honestly, I had planned on load testing a test server for a longer while before going into production, but my three-year-old (unpatched, my own fault) dedicated box elsewhere got hacked, so I moved before I was 100% ready.

I'm sure the box I have now is total overkill, I'll watch the traffic (just sent out our weekly newsletter this AM, so the load should be spiking), and then do exactly what you describe. In fact, I could just reboot the current one with less RAM allocated, to see what's what, faster even than downgrading.

> Of course, there are also various things that you can do to ensure that your installation is well optimized. vBulletin is PHP/MySQL, so there are a few options to help with performance:

Yup, these are exactly the next things on my list.

Also, the main reason I asked is I'm trying to be proactively nice, that is, be on the Linode that best fits my usage profile. Sure, I could just run the thing flat out until someone complains, but that seems sub-optimal from a community standpoint.
