Can I tell if I need to upgrade my Linode by looking at my graphs?
I want to improve the speed of my server. How can I tell if upgrading will improve performance?
Graphs and More Graphs
Graphs are a very useful tool for seeing where your Linode could use a boost, whether from upgrading or from optimizing to squeeze the best performance out of your current plan. The Linode Manager provides graphs for the things that the host can see: CPU, network usage, and I/O. To get the complete picture, you will want to augment this with information about what is going on inside your Linode. You can install Linode Longview, which runs inside your Linode and reports more detail, such as CPU load, available disk space and inodes, memory usage, running processes, and how long they have to wait to access storage or get a chance to run. There are also command line tools that provide this information. Deciding whether you need to upgrade based on these graphs (and command output) requires thinking about the system as a whole, but in general, if you are using all of an available resource, upgrading that resource or optimizing its usage will likely improve your system's performance.
Bottlenecks and Metrics
To improve performance of your server, you will want to look for bottlenecks and identify what you are trying to improve. A bottleneck occurs anytime a part of the system is preventing another part of the system from being fully utilized.
Imagine a four lane highway suddenly narrowing down to a one lane bridge for traffic to go over a river and then the highway becomes three lanes. The three lane highway further down the road is not ever going to be used to its full potential because of the bottleneck where the number of lanes went down to one. Expanding that three lane highway to four lanes would be a waste of effort as long as the bridge is only one lane.
Let's examine this from the point of view of two different metrics:
If we are measuring the average speed of the cars traveling on the road and there are only a few cars, most of them will be able to merge into the bridge lane quickly, and the average speed along the highway will not change very much.
If on the other hand we have a lot of cars and want to measure how many cars can travel along the road in a given amount of time, we will have a different story. Cars will start to back up at the bridge and even though they will get back up to speed once they are on the bridge and beyond, they may have to wait a long time for their turn to cross and the result is a traffic jam.
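The second metric can be illustrated with a small simulation. This is only a toy sketch of the analogy: the function name and the traffic numbers are made up for illustration, and it assumes cars arrive at a steady rate and the bridge lets a fixed number of cars through per minute.

```python
def simulate_bridge(arrivals_per_min, bridge_capacity_per_min, minutes):
    """Return (cars_crossed, cars_still_waiting) after the given number of minutes."""
    waiting = 0
    crossed = 0
    for _ in range(minutes):
        waiting += arrivals_per_min                      # new cars reach the bridge
        leaving = min(waiting, bridge_capacity_per_min)  # the bridge limits how many can cross
        waiting -= leaving
        crossed += leaving
    return crossed, waiting


# Light traffic: the bridge keeps up and nobody is left waiting.
print("light traffic:", simulate_bridge(5, 20, 60))    # (300, 0)
# Heavy traffic: throughput is capped by the bridge and the queue keeps growing.
print("heavy traffic:", simulate_bridge(60, 20, 60))   # (1200, 2400)
```

With light traffic the bridge never limits anything. With heavy traffic the number of cars that get across is fixed by the bridge, no matter how many lanes the highway has before or after it.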
Similar situations slow down computer systems. There are five major constraints that most often become bottlenecks: memory, CPU, storage space, storage speed (I/O), and network.
If your system runs out of memory, your Linode may need to start moving things to slower storage such as disk. This can significantly slow down performance. You may have sufficient CPU or network capacity to process a request, but the system will sit idle most of the time while it waits for information to be pulled off of disk and processed. Linode's servers use SSD storage to lessen the impact of this, but it is still a performance consideration. Linode does not have access to the internals of your Linode, so it cannot know how much memory you are using unless you install software that reports this information, such as Linode Longview. If you install it and see that your Linode is using most of its memory, that would be a reason to upgrade. If you don't want to install Longview, you can check your memory by running commands such as
free -m or
top to see what is going on.
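If you would rather script this check, here is a minimal sketch that works from the text of /proc/meminfo, where Linux reports its memory figures. The function name and the sample numbers are made up for illustration, and it assumes a kernel recent enough to report the MemAvailable field.

```python
def memory_used_percent(meminfo_text):
    """Percentage of RAM in use, based on the MemTotal and MemAvailable fields."""
    values = {}
    for line in meminfo_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[name.strip()] = int(parts[0])  # the first token is the number (kB for memory fields)
    total = values["MemTotal"]
    available = values["MemAvailable"]
    return 100.0 * (total - available) / total


# Made-up sample in the same layout as /proc/meminfo.
sample = """MemTotal:        4000000 kB
MemFree:          500000 kB
MemAvailable:    1000000 kB
"""
print(f"{memory_used_percent(sample):.1f}% of memory in use")  # 75.0% of memory in use
```

On your Linode you would pass in the real file contents instead of the sample, for example open("/proc/meminfo").read().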
If you have large calculations that cannot run in parallel, where each part of the calculation depends on the previous part, your CPU will be your bottleneck. If your Linode has 2 CPUs, you can tell from the graphs in the Linode Manager that they are at capacity when the graph reports 200% usage. CPUs can also get stuck behind a different bottleneck, such as when they are all waiting for the disk to return information before they can move on.
Your Linode has a certain amount of storage allocated to it. This storage can be allocated into disks. On your Linode's dashboard, you will see a bar showing how much of your storage has been dedicated to disks.
Once you create disks with your storage, your Linode will be able to use the space to store files. Linode cannot see how much of your disk is being used for files (unless you have installed software that reports this, like Linode Longview). If your Linode needs to write something to disk and there is no space left, it will very quickly get bogged down.
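To check from inside your Linode how full a filesystem is, a short sketch using Python's standard library looks like this. The function name is made up for illustration, and the path "/" assumes you want the root filesystem. The figure may differ slightly from the percentage other tools show if they account for reserved blocks differently.

```python
import shutil


def disk_used_percent(path="/"):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple with total, used and free, in bytes
    return 100.0 * usage.used / usage.total


print(f"{disk_used_percent('/'):.1f}% of / is in use")
```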
A large amount of a CPU's time is spent waiting for information to be read from or written to storage. Often there are different levels of storage, with small amounts of very fast storage and increasingly larger amounts of slower storage. The fastest storage is built into the CPU itself. CPUs have registers of extremely fast memory, but only 38 registers of various sizes between 16–128 bits on a typical 64-bit CPU core.
CPUs also have varying amounts of Level 2 and Level 3 cache memory, which is slower than the registers and faster than the system's RAM.
As we said before, sometimes the system will have to move information out of RAM to a slower disk.
Additionally, network storage can be used for infrequently used information that has to be stored in large quantities.
Beyond that, offline storage is used when you burn information to a DVD or make backup tapes.
There are tools and graphs to analyze how freely information is flowing inside your Linode. One of these is a command called iostat, which gives information from the perspective of the CPU. Another is the I/O graph in the Linode Manager, which shows performance from the perspective of your disk.
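For a rough idea of where a CPU-side "waiting on I/O" figure comes from, here is a sketch that compares two snapshots of the aggregate "cpu" line in /proc/stat. The function name and the sample numbers are made up for illustration. It assumes the usual Linux field order (user, nice, system, idle, iowait, irq, softirq, steal, then guest fields), and it is not a reimplementation of iostat.

```python
def iowait_percent(cpu_line_before, cpu_line_after):
    """Share of CPU time spent waiting on I/O between two /proc/stat 'cpu' lines."""
    before = [int(x) for x in cpu_line_before.split()[1:]]
    after = [int(x) for x in cpu_line_after.split()[1:]]
    deltas = [b - a for a, b in zip(after, before)]
    # Only the first 8 fields are summed; guest time is already counted in user/nice.
    total = sum(deltas[:8])
    if total == 0:
        return 0.0
    return 100.0 * deltas[4] / total  # index 4 is the iowait field


# Made-up snapshots taken some time apart.
first = "cpu  100 0 50 800 50 0 0 0 0 0"
second = "cpu  200 0 100 1500 200 0 0 0 0 0"
print(f"{iowait_percent(first, second):.1f}% of CPU time spent waiting on I/O")  # 15.0%
```

A consistently high value suggests that storage, not the CPU, is what the system is waiting on.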
If you have a billion pieces of information to send but a slow network link, it will not matter whether your CPU can calculate more information, because it will be sitting and waiting for what it has already computed to be sent across the line.
If any of your graphs are at or close to their maximum values, it is time to rethink your setup to reduce usage, or to upgrade to more capacity. In addition to your Linode Manager graphs, consider using Linode Longview or command line tools such as
df -h in order to monitor your resource usage and know when it might be time for an upgrade.
Please see the following guides for additional information:
What is Longview and How to Use it
Monitoring and Maintaining Your Server
Troubleshooting Memory and Networking Issues
On behalf of the original poster, thanks for your astonishingly complete and well-written response.
Sadly, I am still struggling at a much more basic level. Consider the built-in dashboard graph "traffic - day (5 min avg)".
There is no label on the horizontal axis, and the first time I saw it, 00:00 happened to be at the origin, which left open all kinds of possibilities. But time went by, and I noticed that the local time at the server does not appear on either the left end or the right end, so I'm guessing that I'm seeing GMT (Zulu) time. Is that right? Great, this is fun! Next, because the same time-of-day appears at both ends, I will need to do some more research to find out which end is "now" and which end is "one day ago".
The OP asked "Can I tell if I need to upgrade my Linode by looking at my graphs?" I had the same thought. The vertical axis shows bits/sec, so all I need to do is get out my calculator, and---after some fiddling---I can convert from "bits/sec" to what I really want to know: "% TB-per-month-limit" for my linode instance type. It would be nice to have that on a vertical axis.
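The calculator step can be sketched as a small function. The name is made up for illustration, and it rests on a few assumptions: a 30-day month, decimal terabytes (10^12 bytes), and an average rate that holds steady for the whole month. The quota value is only an example, and you would need to check which direction of traffic your plan counts against its limit.

```python
SECONDS_PER_MONTH = 30 * 24 * 60 * 60  # assumes a 30-day month


def percent_of_monthly_quota(avg_bits_per_sec, quota_tb):
    """Percent of a monthly transfer quota used if the average rate is sustained all month."""
    bytes_per_month = avg_bits_per_sec / 8 * SECONDS_PER_MONTH
    tb_per_month = bytes_per_month / 1e12  # decimal terabytes
    return 100.0 * tb_per_month / quota_tb


# Example: a steady 1 Mbit/s average against a hypothetical 1 TB monthly quota.
print(f"{percent_of_monthly_quota(1_000_000, 1.0):.1f}%")  # 32.4%
```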
Judging by the level of your question, you're probably not out of disk, RAM, or bandwidth. The only thing left to look at is your CPU usage. Look at the CPU charts and calculate 100% x the number of cores you have. If you have 2 cores, the max is 200%. If you have 8 cores, the max is 800%. Then see if your CPU usage is regularly at or above 85% of your max. If so, it might be time to resize, or at least to optimize your web application to decrease CPU usage (caching can help with this).
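That rule of thumb can be written down as a quick check. This is just a sketch: the function name is made up, and treating "regularly" as "at least half of the samples" is an assumption you may want to tune.

```python
def cpu_regularly_high(samples_percent, cores, threshold=0.85, fraction=0.5):
    """True if enough samples are at or above `threshold` of the CPU maximum.

    samples_percent: CPU readings as shown on the graph (0 to 100 x cores).
    fraction: share of samples that must be high to count as "regularly" (an assumption).
    """
    if not samples_percent:
        return False
    max_percent = 100.0 * cores          # e.g. 2 cores gives 200%
    limit = threshold * max_percent      # e.g. 85% of 200% is 170%
    high = sum(1 for s in samples_percent if s >= limit)
    return high / len(samples_percent) >= fraction


print(cpu_regularly_high([180, 175, 90, 172], cores=2))  # True
print(cpu_regularly_high([50, 60, 180, 40], cores=2))    # False
```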