Handling a large spike (php/mysql web application)

Hi all,

I have to configure a web application that might experience quite a large spike next week. We talked with the Linode guys, and from the data I gave them they calculated that we should be able to handle 300-400k users per hour. I have a couple of concerns:

1. How can I test that beforehand? Is there any way to create a stress test that large?

2. The application is a simple PHP/MySQL website with only a couple of pages. It will list audio files which users can listen to with a Flash player. The audio content itself comes from Amazon's cloud, so it shouldn't stress the server.

Currently we are using Apache, but I plan to try this configuration:

http://markmaunder.com/2009/how-to-handle-1000s-of-concurrent-users-on-a-360mb-vps/

I haven't tried it yet so I'm not sure if I can get it to work.

At first I thought of having a load balancer and a couple of Linodes, but unfortunately Linode does not offer load balancers yet (only in beta).

Does anyone here have any other ideas, concerns, or tips on what to do? I haven't dealt with this kind of traffic before, so I'm not exactly sure how much is needed.

-A

9 Replies

Installing nginx in front of Apache is a good idea.

You can also install APC for PHP. It can speed up PHP execution by roughly 2-3 times and reduce the number of I/O operations.

APC (or Memcache) can also be used for caching data, which is very important on a heavily loaded server.

All static files (images, JS, CSS) should be served by nginx alone, without Apache. That can significantly reduce the amount of RAM used.

A very popular setup for heavily loaded servers is php-fpm + nginx (without Apache), so if you can build this tandem, try it :)
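The data-caching suggestion above is essentially the cache-aside pattern: look in memory first and only hit the database on a miss. Below is a minimal sketch of that idea in Python rather than PHP, with a plain dictionary standing in for APC or Memcache. The `TTLCache` class, the `load_audio_list` loader, and the 60-second lifetime are all hypothetical names and values chosen for illustration.

```python
import time


class TTLCache:
    """Tiny in-process cache with per-entry expiry (a stand-in for APC or Memcache)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)

    def get_or_load(self, key, loader):
        """Return the cached value for key, calling loader() only on a miss or after expiry."""
        entry = self._store.get(key)
        now = self.clock()
        if entry is not None and entry[0] > now:
            return entry[1]
        value = loader()
        self._store[key] = (now + self.ttl, value)
        return value


def load_audio_list():
    # Hypothetical placeholder for the real MySQL query.
    return ["track-1.mp3", "track-2.mp3"]


cache = TTLCache(ttl_seconds=60)
files = cache.get_or_load("audio_list", load_audio_list)  # first call runs the loader
files = cache.get_or_load("audio_list", load_audio_list)  # second call is served from memory
```

With a short lifetime like this, most page views during a spike would never reach the database, at the cost of the list being up to a minute stale.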

Using just a single web server to handle both static and dynamic content (like nginx) will free up RAM, and probably be faster too. Reverse proxies are a waste of RAM unless you have some specific need for Apache features that nginx (or lighttpd or cherokee or whatever) can't do. Consider if you will be bandwidth limited or CPU limited; removing compression modules/support will reduce CPU usage at the expense of bandwidth, and vice versa. Selecting the appropriate number of fastcgi PHP processes for the available RAM is important.
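As a rough illustration of sizing the PHP process count to the available RAM, the arithmetic could look like the sketch below. The function name and every number in the example are assumptions, not measurements; the real per-process figure has to be read off the running server.

```python
def max_php_workers(total_ram_mb, reserved_mb, per_worker_mb):
    """Return how many PHP FastCGI workers fit in RAM after reserving memory for everything else."""
    if per_worker_mb <= 0:
        raise ValueError("per_worker_mb must be positive")
    usable = total_ram_mb - reserved_mb
    return max(usable // per_worker_mb, 0)


# Hypothetical example: a 1024 MB VPS, 384 MB kept for MySQL, nginx and the OS,
# and about 30 MB of resident memory per PHP worker.
print(max_php_workers(1024, 384, 30))
```

With php-fpm, the result would correspond to a setting such as `pm.max_children`. Leaving some headroom below the computed value is probably wise, since swapping under load is far worse than queueing a few requests.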

For testing, you can try ab (Apache Benchmark), which works with any server. Make sure your database queries have appropriate indexes.
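To decide how hard to push a test like that, it helps to turn the hourly figure from the original post into a per-second request rate. This is only a back-of-envelope sketch: the pages-per-user count and the peak factor are guesses, not values from the thread.

```python
def requests_per_second(users_per_hour, pages_per_user=3, peak_factor=2.0):
    """Convert an hourly visitor estimate into a peak request rate.

    pages_per_user and peak_factor are assumed values; replace them with
    numbers from your own analytics if you have them.
    """
    average = users_per_hour * pages_per_user / 3600
    return average * peak_factor


for users in (300_000, 400_000):
    rate = requests_per_second(users)
    print(f"{users} users/hour -> about {rate:.0f} dynamic requests/s at peak")
```

The resulting rate can guide the total request count (`-n`) and concurrency (`-c`) you give ab, so the test at least matches the traffic you expect.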

Haven't tried 'em, but these guys look like they give a good test: http://browsermob.com/performance-testing

You can build your own lightweight front end load balancer(s) now, and when the Linode product is production-ready, transition over. You don't have to stick with your first infrastructure forever.

The #1, most important, holy-cow thing I can recommend for scalable architecture is separating your application from your data. You should be able to add and destroy application servers without losing or having to copy/sync data.

The #2 thing is to automate. If you notice you need another app server, the absolute maximum amount of work you should have to do is run one command, grab a beer, open it with a bottle opener, then drink the beer. It is recommended that the poolside bartender open the beer for you; it is suggested that you have this process down to one button on your phone. Your scaling process should look like this:

![](http://drop.hoopycat.com/linodestaff_caker.jpg)

The #3 thing is measurement. How do you know you need to push that button? How do you know you're doing your job right? Measure everything, figure out what normal is, figure out what abnormal is, then figure out how to have the system figure out what abnormal is for you so that you don't have to figure out what is abnormal.

Also, if you do it right and end up drinking gold smoothies, get a Blendtec.

To elaborate on hoopycat's beer post, check out api.linode.com for how to create new instances easily from anywhere automagically, install munin to see how your servers are handling the load, and I recommend a decent stout for the beer!

Thanks for all the help so far! I really appreciate this. I'll dig into this tomorrow, so then I'll know better what works and what doesn't.

So far I'm thinking that maybe I should replace Apache with nginx completely, since there shouldn't be anything Apache-specific.

@Guspaz I'm not sure how to figure out whether the service will be bandwidth limited or CPU limited. Is checking the graphs the only way, or are there other ways as well?

And what about the ab tool? How large a test can I run with it?

@hoopycat :D That's exactly how I'm going to set it up! I like the idea that the more traffic we get the more tipsy I will be at the office :)

How easy is it to set up a load balancer of my own? That would of course seem like the best solution. The data is actually separated, so the automation could possibly work pretty easily.

What I failed to mention is that the MySQL layer is actually read-only. All the writes happen on another server, and the database is dumped onto this site every now and then. I know that there's some cool stuff like this available:

http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html

Do you think MySQL is a big enough issue to spend time on? Or should I just concentrate on the other (mentioned) stuff instead?

@Artsi:

> How easy is it to set up a load balancer of my own? That would of course seem like the best solution.

With nginx,

upstream examplecom_loadbal {
  server 192.0.2.1:80;
  server 192.0.2.2:80;
  # ...
}

server {
  listen 80;
  server_name example.com www.example.com;
  proxy_set_header Host $host;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  proxy_set_header X-Real-IP $remote_addr;

  location / {
    proxy_pass http://examplecom_loadbal;
  }
}

That's pretty much it. Easy breezy. There's tutorials and docs out there, too.
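For anyone wondering what that upstream block does with incoming requests: unless configured otherwise, nginx hands them to the listed servers in round-robin order. A toy model of that rotation, using the same two documentation addresses as the config above, might look like this. The `round_robin` helper is purely illustrative and ignores weights, health checks, and failures.

```python
from itertools import cycle


def round_robin(servers):
    """Return an endless iterator that yields the given servers in turn."""
    if not servers:
        raise ValueError("at least one server is required")
    return cycle(servers)


picker = round_robin(["192.0.2.1:80", "192.0.2.2:80"])
for _ in range(4):
    print(next(picker))  # alternates between the two backends
```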

> Do you think MySQL is a big enough issue to spend time on? Or should I just concentrate on the other (mentioned) stuff instead?

I probably wouldn't bother, especially for read-only situations. Then again, if you hit the point where you have to do such heroics, you'll know it. :-)

Premature optimization is… well, premature.

You can also try DNS load balancing which just means adding an A record for each app server for the same hostname.
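A quick way to check which addresses a hostname currently resolves to, for example after adding several A records, is to ask the system resolver. This is a small sketch using only Python's standard library; `resolve_ipv4` is a hypothetical helper name, and `localhost` is used only so the example runs anywhere.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses the resolver reports for hostname."""
    infos = socket.getaddrinfo(hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


# Replace "localhost" with your own hostname to see every A record the resolver returns.
print(resolve_ipv4("localhost"))
```

If the list shows one address per app server, clients will be spread across them, although how evenly depends on resolver and client caching.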

I've been doing a lot of work lately getting Magento to run well on a linode VPS, and some of my findings may well be useful.

1. Ditch apache completely, and use nginx + php-fpm.

2. Tune MySQL. Magento uses InnoDB, so that's what needs tuning most. However, by default, MySQL doesn't even enable the query cache. Enabling it will make a huge difference to your disk I/O.

3. (As others have said.) Install APC. The Ubuntu package throws loads of horrible errors, so I use PECL to install a newer version.

4. Can you use memory-backed storage for anything? Any data that can be lost between reboots, such as cache storage, will benefit from this.

Things like setting up compression, expiry times, etc. will also help, but it seems like you're not going to be serving much content from this site, so work on the ones above first.

hth,

Steve

Nice one, I gave that URL a go and, well, I like it :D
