little to no performance difference between 360 and 2880?

can someone explain why i'm not seeing a (reasonable) performance gain between a 360 and 2880 node?

here is what i benchmarked:

i have 5 text files (mysql dumps), each being ~850MB. i wrote a script that does the following:

  • gzip *.sql

  • gunzip *.gz

  • bzip2 *.sql

  • bunzip2 *.bz2

i ran this on both a 360 node and a 2880 node. the 360 finished in ~36mins, the 2880 in ~33.5mins.
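For anyone wanting to repeat this kind of test, here is a minimal sketch of how the four steps could be timed one by one, so it is visible which step dominates. The function names `time_step` and `run_benchmark` are made up for illustration and are not from the original script, and the timing is whole-second wall-clock only.

```shell
# Hypothetical helper: run one command and print "<label> <seconds>"
# using whole-second wall-clock time.
time_step() {
    local label=$1
    shift
    local start end
    start=$(date +%s)
    "$@"
    end=$(date +%s)
    echo "$label $((end - start))"
}

# Hypothetical wrapper: run the four steps from the post against
# every *.sql file in the given directory.
run_benchmark() {
    local dir=$1
    (
        cd "$dir" || exit 1
        time_step gzip    gzip *.sql
        time_step gunzip  gunzip *.gz
        time_step bzip2   bzip2 *.sql
        time_step bunzip2 bunzip2 *.bz2
    )
}
```

Calling `run_benchmark /path/to/dumps` on each node prints one line per step, which makes it easier to compare the two nodes step by step instead of only by total runtime.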

with an 8X ratio of CPU power, i expected much more than a 7% savings in runtime. as far as i can tell, my nodes are practically identical (running on ubuntu 8.04LTS if that makes a diff).

can someone tell me what's going on? is my test invalid? is the 360 unfairly spiking or something? is there something i can do to make the test fair? or is this expected performance?

thanks in advance!

16 Replies

On your Linode 360, does your CPU graph show a 350%+ spike? Linodes have burstable CPU, meaning if no one else is using the CPU at the time, you're free to use as much of it as you want. That is, until they need it, at which time they get priority.

What you may be seeing is that your 360 was taking up 90% of the CPU/IO of the Linode Host because no one else was using it. Since it's called "burstable", using that much for an extended period is frowned upon.

Perhaps someone else could explain it better.

@Smark:

What you may be seeing is that your 360 was taking up 90% of the CPU/IO of the Linode Host because no one else was using it. Since it's called "burstable", using that much for an extended period is frowned upon.
I'm not sure there's any reason to avoid using the CPU if you can get it. It's shared equally among all the nodes on the host if there is contention, so if you can get 400%, nobody else wants it. If someone else is burning CPU at the same time, you'll share equally and you won't get to 400%.

To schmingle: for purely CPU-bound tasks, an otherwise idle 360 and an idle 2880 probably won't be all that different. As Smark points out, Xen shares the CPU equally among all its nodes (well, up to 4 CPUs per node), so an individual node can get a significant amount of CPU if the host is otherwise idle.

What your test won't show is what the average CPU or disk availability is over time. Best case the 360 and 2880 are similar, but worst case is far worse for the 360 than 2880 (given the difference of on average, 40 nodes competing on the host vs. 5). So it's really a question of buying assurance of resource availability.

It's not clear if your test was I/O or CPU bound, but on average you should see less contention for both resources on the 2880 host (again, fewer nodes competing). Disk contention especially can be a performance killer with larger database tasks or other disk-heavy operations. The machines themselves are largely the same, though, so if other nodes aren't competing for the disk at a given point, the raw performance of the 360 is likely similar to the 2880.
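One rough way to tell whether a run like this was CPU bound or I/O bound is to compare CPU time (user + sys) against wall-clock time (real), as reported by the shell's `time`. The sketch below only does that arithmetic; `cpu_share` is a hypothetical helper name, and the numbers fed to it are assumed to come from something like `time gzip dump.sql`.

```shell
# Hypothetical helper: print CPU time as a whole-number percentage of
# wall-clock time. $1 = real seconds, $2 = user seconds, $3 = sys seconds.
cpu_share() {
    awk -v r="$1" -v u="$2" -v s="$3" 'BEGIN {
        # Guard against a zero or negative wall-clock time.
        if (r <= 0) { print "n/a"; exit }
        printf "%d\n", (u + s) * 100 / r
    }'
}
```

For a single-threaded tool like gzip, a result near 100 suggests the step was CPU bound, while a much lower number suggests it spent most of its time waiting, most likely on disk. On a shared host the waiting can also come from other nodes competing for the CPU, so treat the number as a hint and not a measurement.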

And of course, you have a lot more memory to work with (which is the only real guaranteed resource under the xen setup). That in and of itself could be reason enough to get the larger configuration.

In short, it's not going to obviously boost peak performance to have a 2880 vs. a 360, depending on your workload and how you structure your host. That's actually something I find very attractive about Linode, since it lets you stay economical while on average getting more bang for the buck in performance. I'm guessing that it's rare for all the nodes on a given 360 host to be saturated (CPU or disk) simultaneously.

But worst case for a 360 will be much worse than worst case for a 2880, so it's a question of odds of hitting the worst case, and how much it will impact your application should it happen.

– David

@Smark:

Linodes have burstable CPU,
They do?

@schmingle:

with an 8X ratio of CPU power, i expected much more than a 7% savings in runtime.
Ratio between what? Linode plans are identical in CPU; the only provisioned difference is how much RAM you get assigned to you. Since your test cases here are mostly I/O-bound (and I/O is fairly consistent across plans as well), I would expect these numbers.

We don't take the approach of some other providers and give you a slice of the CPU proportional to your plan size. You get the full computing power of the host server at all times. There are reserves built in to the system to prevent you from pegging other Linodes to doom and back, and this also explains why CPU graphs go to 400%.

@db3l:

I'm not sure there's any reason to avoid using the CPU if you can get it. It's shared equally among all the nodes on the host if there is contention, so if you can get 400%, nobody else wants it. If someone else is burning CPU at the same time, you'll share equally and you won't get to 400%.

If memory serves, this is mostly a leftover attitude from the UML days where CPU sharing wasn't quite as solid.

That aside, in my opinion there is a theoretical downside to people doing heavy sustained workloads, albeit maybe not for the person doing them.

If most linodes have a bursty/interactive workload, where how fast a job gets done (e.g. loading a webpage) is important for the user experience, then theoretically, if the host is populated mainly by other "interactive" linodes, the typical case would be that when any given request or group of requests comes in to any linode on a host, a significant amount of the host CPU would be available to complete it quickly (since the odds of any two particular linodes needing to complete non-trivial requests at the exact same point in time would usually be fairly low). All linodes on such a host benefit through faster average response times.

This would break down if you have a lot of heavy, sustained work going on in other linodes.

As I said, though, this is all theory, and it wouldn't make a lot of difference if there were only a couple heavy users on a host. I don't have any real-world data to know for sure what the typical linode workload is, nor if there are hosts that chronically lack burstable CPU due to clusters of heavy sustained users.

@jed:

Linode plans are identical in CPU; the only provisioned difference is how much RAM you get assigned to you.

thanks for clarifying that, jed. shame on me for my ignorance, thinking that those little green boxes represented proportional CPU power. i'm coming from EC2 so i just assumed that's what it meant. it sounds like i may be downgrading my node then.

@jed:

I/O is fairly consistent across plans as well

jed, could you go into more detail about this? i actually opened a support ticket asking about this and got the following response:

"Linodes on larger plans have a lower contention ratio on the hosts, so better IO performance (especially when compared to a 360 host) is quite probable."

thanks for taking the time to answer my questions. i appreciate your help in evaluating hosting solutions for my company.

Lower plans have more nodes per physical machine, so there is usually more competition for disk I/O, which is the real killer on a VPS.

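On a somewhat related topic, I love pigz and pbzip2. They're parallel (multithreaded) implementations of gzip and bzip2 that scale close to linearly with the number of cores. They're also largely drop-in replacements, accepting the same basic command-line options and producing files the regular tools can read.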
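As a minimal sketch of what swapping in pigz might look like: the function below uses pigz when it is installed and falls back to plain gzip otherwise. `compress_parallel` and the default of 4 threads are assumptions for illustration (4 matches the CPU count mentioned earlier in the thread), not anything from the original benchmark.

```shell
# Hypothetical helper: compress one file with pigz if available,
# otherwise with plain gzip. $1 = file, $2 = thread count (default 4).
compress_parallel() {
    local file=$1
    local threads=${2:-4}
    if command -v pigz >/dev/null 2>&1; then
        # -p sets the number of compression threads pigz uses.
        pigz -p "$threads" "$file"
    else
        gzip "$file"
    fi
}
```

Either branch leaves a regular `.gz` file behind, so `gunzip` still works on the result. pbzip2 can be used the same way in place of bzip2, with the thread count given as `-p4` (no space).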

You will see a remarkable improvement in performance if you have, for example, a 1.5GB database. The 2880 can keep the entire database in memory and give you blazing fast responses to whatever query you throw at it, whereas the 360 will need to read from disk all the time.

@hybinet:

You will see a remarkable improvement in performance if you have, for example, a 1.5GB database. The 2880 can keep the entire database in memory and give you blazing fast responses to whatever query you throw at it, whereas the 360 will need to read from disk all the time.

hybinet, could you propose a good test for me to run? i tested some intensive sql scripts and the 360 outperformed the 2880 by quite some margin.

my DB dump is only ~850MB. unfortunately, i can't reveal my scripts. i can tell you however that they involve a lot of updates, inserts, and table swapping.

btw, i also have a small EC2 instance as well as possibly a large one with EBS storage that i can also run tests on for benchmarking. i'll be happy to publish the results.

@schmingle:

@jed:

I/O is fairly consistent across plans as well

jed, could you go into more detail about this?
Sorry, should have been clearer – I/O capability does not vary per-plan. Every plan has access to the same hardware.

@schmingle:

@hybinet:

You will see a remarkable improvement in performance if you have, for example, a 1.5GB database. The 2880 can keep the entire database in memory and give you blazing fast responses to whatever query you throw at it, whereas the 360 will need to read from disk all the time.

hybinet, could you propose a good test for me to run? i tested some intensive sql scripts and the 360 outperformed the 2880 by quite some margin.

my DB dump is only ~850MB. unfortunately, i can't reveal my scripts. i can tell you however that they involve a lot of updates, inserts, and table swapping.

btw, i also have a small EC2 instance as well as possibly a large one with EB storage that i can also run tests on for benchmarking. i'll be happy to publish the results.

I am assuming when you ran your SQL tests you had changed the configuration to make use of all the extra resources.

Query speeds only change greatly if MySQL is able to read the table from memory rather than disk. And you need to have your my.cnf set up to make use of the extra memory.

gzip is block-based, not solid, so it wouldn't need or want to keep the entire file in memory at the same time. Apart from a bit of read-ahead, there should be no performance difference, even if you were compressing a 10GB file with 256MB of RAM.

If you're IO bound (you don't seem to specify), you may be able to get better performance by doing your work in a ramdisk:

mount -t tmpfs none /scratch/directory

Now put temporary files in /scratch/directory - they'll be kept in RAM, reducing disk IO. Of course, this takes away from RAM available for other processes.

Why would that produce faster performance? The time spent copying the file to the RAM disk before compressing it would seem to negate any possible benefits derived during the actual compression.

@schmingle:

@hybinet:

You will see a remarkable improvement in performance if you have, for example, a 1.5GB database. The 2880 can keep the entire database in memory and give you blazing fast responses to whatever query you throw at it, whereas the 360 will need to read from disk all the time.

hybinet, could you propose a good test for me to run? i tested some intensive sql scripts and the 360 outperformed the 2880 by quite some margin.

my DB dump is only ~850MB. unfortunately, i can't reveal my scripts. i can tell you however that they involve a lot of updates, inserts, and table swapping.

I don't know about table swapping, but if your workload involves a lot of updates and inserts, then the benefits of more memory are going to be limited by the write performance to disk. Or maybe not, if the default MySQL settings favor speed over durability.

i've done some more benchmarking and i wanted to post an update. for the applications i'm running, i've been seeing 4X+ performance gain moving from EC2 (small instance) to Linode. i'm not specifying which linode size because they're actually very comparable. you just need to get more memory or storage depending on your application's needs.

i also managed to benchmark Rackspace. i tried a few instances and found about a 2X+ gain with Linode across all the sizes. pretty amazing. another thing worth noting is that the first 4 times i launched an instance i got locked out due to some network issues. i spent maybe 45 minutes with online support and, although they were nice, they couldn't help me until eventually it "magically" fixed itself.

Rackspace is 2X EC2. Linode is 2X Rackspace. and the price is hard to beat when you consider the included bandwidth.
