Optimize a 540: Tuning PHP/cgi+Apache/worker+APC+fcgid+MySQL

Hi,

I'm running a Linode 540. It's currently running 32-bit OpenSuSE 11.1,

    uname -a
        Linux dev 2.6.27.29-0.1-xen #1 SMP 2009-08-15 17:53:59 +0200 i686 i686 i386 GNU/Linux

and its primary utilization is as a Drupal (v6.14) server – running on top of
* PHP 5.3.0 (fastcgi)
* Apache 2.2.13/worker-mpm
* mod_fcgid (apache svn/trunk r816755, == '2.3.2-dev')
* APC 3.1.3p1
* MySQL v5.1.36

Details of my current config follow below. I've yet to find a resource that discusses this particular scenario. And, even for similar ones, for each suggestion to "do X", there's another site suggesting to do exactly the opposite. Translated – "much voudou and mojo required!" :-/ That said, it works -- so far -- well enough.

In my experience, this is, generally, NOT an uncommon config. What IS less common is running it all on a RAM/CPU-limited Linode 540 … rather than, e.g., a standalone box with 4-dedicated CPU cores & 8 GB RAM.

So, "The Question" is … for a "small-to-moderate" (yes, that's subjective …) site, running on a Linode 540, how can/should the PHP/Apache/FCGId config be "optimized" to get the most bang for the buck?

I'll be very interested in any/all comments, suggestions, experience, etc for doing this right @ Linode. Yes, it seems that these configs are very usage-specific, and ultimately benchmarking is, of course, needed. Hopefully, though, others will share some interest here.

Thanks!

    php-cgi5 -v
        PHP 5.3.0 (cgi-fcgi) (built: Sep  8 2009 16:47:38)
        Copyright (c) 1997-2009 The PHP Group
        Zend Engine v2.3.0, Copyright (c) 1998-2009 Zend Technologies
            with Xdebug v2.0.5, Copyright (c) 2002-2008, by Derick Rethans
            with Suhosin v0.9.29, Copyright (c) 2007, by SektionEins GmbH

    httpd2 -V
        Server version: Apache/2.2.13 (Linux/SUSE)
        Server built:   Aug 10 2009 02:14:02
        Server's Module Magic Number: 20051115:23
        Server loaded:  APR 1.3.8, APR-Util 1.3.9
        Compiled using: APR 1.3.8, APR-Util 1.3.9
        Architecture:   32-bit
        Server MPM:     Worker
          threaded:     yes (fixed thread count)
            forked:     yes (variable process count)
        Server compiled with....
         -D APACHE_MPM_DIR="server/mpm/worker"
         -D APR_HAS_SENDFILE
         -D APR_HAS_MMAP
         -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled)
         -D APR_USE_SYSVSEM_SERIALIZE
         -D APR_USE_PTHREAD_SERIALIZE
         -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT
         -D APR_HAS_OTHER_CHILD
         -D AP_HAVE_RELIABLE_PIPED_LOGS
         -D DYNAMIC_MODULE_LIMIT=128
         -D HTTPD_ROOT="/srv/www"
         -D SUEXEC_BIN="/usr/sbin/suexec2"
         -D DEFAULT_PIDLOG="/var/run/httpd2.pid"
         -D DEFAULT_SCOREBOARD="logs/apache_runtime_status"
         -D DEFAULT_ERRORLOG="/var/log/apache2/error_log"
         -D AP_TYPES_CONFIG_FILE="/etc/apache2/mime.types"
         -D SERVER_CONFIG_FILE="/etc/apache2/httpd.conf"

    mysql -V
        mysql  Ver 14.14 Distrib 5.1.36, for suse-linux-gnu (i686) using readline 5.2

    cat server-tuning.conf
    <IfModule mod_worker.c>
        StartServers         1
        MinSpareServers      1
        MaxSpareServers      2
        MaxClients           8
        ThreadsPerChild      4
        MinSpareThreads      4
        MaxSpareThreads      8
        MaxRequestsPerChild  2000
        ThreadLimit          64
        ThreadStackSize      1048576
    </IfModule>

    KeepAlive                On
    MaxKeepAliveRequests     10
    KeepAliveTimeout         120
    EnableMMAP               off
    EnableSendfile           off
    LimitRequestBody         1048576
    AddDefaultCharset        utf-8
    AddEncoding x-compress   .Z
    AddEncoding x-gzip       .gz .tgz
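
A rough sketch of what the worker settings above imply, assuming the usual worker-MPM relationship (MaxClients caps simultaneous threads, so child processes = MaxClients / ThreadsPerChild) and that under worker each open keep-alive connection keeps one thread occupied until KeepAliveTimeout expires. The function names are made up for illustration, and the stack figure is address space reserved, not necessarily resident memory.

```python
import math


def worker_capacity(max_clients, threads_per_child, thread_stack_bytes):
    """Return (child_processes, total_threads, stack_reservation_mib)."""
    # MaxClients limits simultaneous threads; each child runs ThreadsPerChild of them.
    children = math.ceil(max_clients / threads_per_child)
    threads = children * threads_per_child
    # ThreadStackSize is reserved per thread (virtual memory, not necessarily resident).
    stack_mib = threads * thread_stack_bytes / (1024 * 1024)
    return children, threads, stack_mib


def free_threads(total_threads, idle_keepalive_connections):
    """Threads left for new requests while idle keep-alive connections are held open."""
    return max(total_threads - idle_keepalive_connections, 0)


if __name__ == "__main__":
    # Values taken from the server-tuning.conf posted above.
    children, threads, stack_mib = worker_capacity(8, 4, 1048576)
    print(children, threads, stack_mib)
    # Eight browsers idling on keep-alive connections would leave no free thread
    # for up to KeepAliveTimeout (120 s in the posted config).
    print(free_threads(threads, 8))
```

If that model holds, a long KeepAliveTimeout combined with only 8 threads is worth benchmarking against a much shorter timeout.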

    cat conf.d/mod_fcgid.conf
    <IfModule mod_fcgid.c>
        Options                     +ExecCGI
        PHP_Fix_Pathinfo_Enable     1
        SharememPath                /var/cache/apache2/fcgid_shm
        SocketPath                  /var/cache/apache2
        BusyScanInterval            120
        BusyTimeout                 300
        DefaultInitEnv PHP_FCGI_CHILDREN     1
        DefaultInitEnv PHP_FCGI_MAX_REQUESTS 1000
        DefaultMaxClassProcessCount          2
        DefaultMinClassProcessCount          2
        ErrorScanInterval           3
        IdleTimeout                 300
        IdleScanInterval            120
        IPCCommTimeout              120
        IPCConnectTimeout           60
    #   MaxProcessCount             2000
        OutputBufferSize            64
        ProcessLifeTime             3600
        SpawnScore                  1
        SpawnScoreUpLimit           10
        TerminationScore            2
        ZombieScanInterval          3
    </IfModule>

    cat vhosts.d/master.conf
    ...
    Include /etc/apache2/conf.d/mod_fcgid.conf
    ...
    AddHandler fcgid-script .php
    FCGIWrapper "/usr/bin/php-cgi5 -d apc.shm_size=25 -c /etc/php5/fastcgi/" .php
    ...

    cat /etc/php5/conf.d/apc.ini
    extension=apc.so
    apc.enabled="1"
    apc.cache_by_default="1"
    apc.shm_segments="1"
    apc.ttl="7200"
    apc.user_ttl="7200"
    apc.gc_ttl="3600"
    apc.num_files_hint="1024"
    apc.mmap_file_mask="/tmp/apc.XXXXXX"
    apc.enable_cli="0"
    apc.slam_defense="0"
    apc.file_update_protection="2"
    apc.max_file_size="1M"
    apc.stat="1"
    apc.write_lock="1"
    apc.report_autofilter="0"
    apc.include_once_override="0"
    apc.rfc1867="0"
    apc.rfc1867_prefix="upload_"
    apc.rfc1867_name="APC_UPLOAD_PROGRESS"
    apc.rfc1867_freq="0"
    apc.localcache="0"
    apc.localcache.size="512"
    apc.coredump_unmap="0"

13 Replies

@PG1326:

>     DefaultMaxClassProcessCount 2
>     DefaultMinClassProcessCount 2

This is the only part that jumps out at me as being possibly bad.

If I understand correctly, your site is almost entirely Drupal-based, but here you're effectively limiting yourself to two simultaneous Drupal requests. Your 540 can handle more than that!

I wouldn't set a maximum of less than 4, and I'd suggest trying at least 8 and seeing how well it works. I also wouldn't have it pre-spawn less than 4.

And remember, even though you're sharing the host with other people, you can use up to four full CPU cores if it's available, and it often will be.
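
One way to sanity-check a process limit like "4 to 8" is a back-of-the-envelope memory budget. The sketch below is only that: the 546 MB total comes from the `free -m` output posted in this thread, while the reserved amount and the per-process footprint are placeholder assumptions that would need to be measured on the actual box (e.g. from the resident size of the php-cgi processes).

```python
def max_php_processes(total_mb, reserved_mb, per_process_mb):
    """Largest PHP process count whose combined footprint fits in the RAM budget."""
    budget = total_mb - reserved_mb
    if budget <= 0 or per_process_mb <= 0:
        return 0
    return budget // per_process_mb


if __name__ == "__main__":
    TOTAL_MB = 546        # "Mem: total" from `free -m` in this thread
    RESERVED_MB = 250     # ASSUMPTION: MySQL + Apache + OS headroom
    PER_PROCESS_MB = 35   # ASSUMPTION: resident size of one php-cgi process
    print(max_php_processes(TOTAL_MB, RESERVED_MB, PER_PROCESS_MB))
```

With those placeholder numbers the budget lands at 8 processes; different measurements will give a different ceiling.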

hi,

@nknight:

> If I understand correctly, your site is almost entirely Drupal-based,

that's correct (for the moment …)

@nknight:

> but here you're effectively limiting yourself to two simultaneous Drupal requests. Your 540 can handle more than that!
>
> I wouldn't set a maximum of less than 4, and I'd suggest trying at least 8 and seeing how well it works. I also wouldn't have it pre-spawn less than 4.

per your suggestion, i've changed to

    DefaultMaxClassProcessCount          8
    DefaultMinClassProcessCount          4

and, so far, no problems.

@nknight:

> And remember, even though you're sharing the host with other people, you can use up to four full CPU cores if it's available, and it often will be.

that implies that core usage is controllable by me … what specific setting are you referring to?

Fwiw, the largest 'effect' I've seen so far in terms of memory usage / performance is from

reducing swappiness, @ /etc/sysctl.conf

    vm.swappiness = 0

and adding a cronjob to flush caches

    45 * * * * sync; echo 3 > /proc/sys/vm/drop_caches

atm, that results in

    free -m
                 total       used       free     shared    buffers     cached
    Mem:           546        171        374          0          2         81
    -/+ buffers/cache:         87        459
    Swap:         1023          0       1023

which appears encouraging …

I'm still seeing a not insignificant startup delay @ first access to the site. I think this has to do with preload of PHP children by fcgid … namely, i've

    DefaultInitEnv PHP_FCGI_CHILDREN     1
    DefaultInitEnv PHP_FCGI_MAX_REQUESTS 1000

and wonder if this is right for this usage … thoughts?

thanks!

@nknight:

> I also wouldn't have it pre-spawn less than 4.

checking on the pre-spawning, i find

http://www.mail-archive.com/mod-fcgid-users@lists.sourceforge.net/msg00179.html

which purports,

> > YOU CAN NOT PRESPAWN FCGID.
>
> mod-fastcgi has an option to do this but mod-fcgid does not.

i'll dig some more … but, afayk, has that changed?

@ "Re: [Mod-fcgid-users] Spawning explanation",

http://www.mail-archive.com/mod-fcgid-users@lists.sourceforge.net/msg00300.html

there's an interesting discussion about setting DefaultMaxClassProcessCount & DefaultMinClassProcessCount (mod_fcgid forking) "versus" PHP_FCGI_CHILDREN (php forking).

iiuc, php forking doesn't work (well?) with mod_fcgid.

BUT, if PHP_FCGI_CHILDREN is not used (unclear whether that's ==1, ==0, or really rm'd) and instead forking control is handed over to mod_fcgid, one loses shared memory capability (APC, Xcache, etc).

is that really true? investigating further …

per,

FastCGI with a PHP APC Opcode Cache

http://www.brandonturner.net/blog/2009/07/fastcgi_with_php_opcode_cache/

reading,

> "… The maxClassProcesses option is very important: it tells FastCGI to only spawn one php-cgi process regardless of how many requests are pending. Remember that our PHP process will spawn its own children, so FastCGI only needs to spawn one. Until this APC bug is fixed, this is necessary to allow sharing the APC cache among children."

he's set

    maxClassProcesses 1

checking @ the referenced bug,

http://pecl.php.net/bugs/bug.php?id=11988

it's unclear as to whether apc 3.1.3p1 (which i use) has a solution as yet; a comment therein does refer to an external solution,

PHP-FPM: PHP FastCGI Process Manager

PHP-FPM is a patch for PHP4/5 to greatly improve PHP's FastCGI SAPI capabilities and administration

http://php-fpm.org/Main_Page#Is_php-fpm_compatible_with_the_Zend_Platform

http://php-fpm.org/WhatisPHP-FPM

but at least as of Aug 6, 2009

http://michaelshadle.com/category/php-fpm/

PHP-FPM is not PHP 5.3.0-friendly …

so, if we believe THIS scenario, it seems,

    DefaultInitEnv PHP_FCGI_CHILDREN     5
    DefaultInitEnv PHP_FCGI_MAX_REQUESTS 500
    DefaultMaxClassProcessCount          1
    DefaultMinClassProcessCount          1

is a best/required config for APC to be useful.

i think :-/
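
To make the trade-off concrete, here is a toy model of the two layouts, assuming the behaviour described in the linked posts: every php-cgi process started by mod_fcgid creates its own APC segment, and only children forked by that PHP master (via PHP_FCGI_CHILDREN) share it. The function name is made up, and the 25 MB figure is the `apc.shm_size=25` from the FCGIWrapper line in the original post.

```python
def apc_layout(fcgid_processes, php_children, shm_mb):
    """Model worker count, number of separate APC caches, and total shared memory."""
    # ASSUMPTION: one APC segment per process spawned by mod_fcgid; children
    # forked by that PHP master share the master's segment.
    workers = fcgid_processes * max(php_children, 1)
    caches = fcgid_processes
    return {"workers": workers, "caches": caches, "shm_total_mb": caches * shm_mb}


if __name__ == "__main__":
    # mod_fcgid does the forking: 8 processes, each with its own cache.
    print(apc_layout(fcgid_processes=8, php_children=1, shm_mb=25))
    # PHP does the forking: 1 process tree, 5 children, one shared cache.
    print(apc_layout(fcgid_processes=1, php_children=5, shm_mb=25))
```

Under that assumption the second layout keeps a single warm 25 MB cache, while the first fills up to eight separate caches with the same opcodes.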

@PG1326:

> @nknight:
>
> > And remember, even though you're sharing the host with other people, you can use up to four full CPU cores if it's available, and it often will be.
>
> that implies that core usage is controllable by me … what specific setting are you referring to?

No, what nknight means is that you have access to four cores on the host machine. At a minimum, you are guaranteed a proportional amount of CPU time (e.g., if every Linode on the same host were running full bore something like while true; do :; done). However, if other Linodes on your host are using less than their full allotment (which they are most of the time), you can use more than your full allotment up to 400% CPU. No configuration is necessary on your part (although the application must be either multi-threaded or split among multiple processes to be able to take advantage of more than one core).

Sorry, can't be of any help on the FCGI stuff.

Vance got it right on the CPU usage.

It's been quite a while since I dealt much with either of the Apache FastCGI modules (these days I can usually get away with alternate solutions that let me use e.g. mod_php/python/perl), so the advice that you've found about how exactly to configure the number of processes stands a better chance of being right than me. :)

It seems you've worked out a good configuration (or at least a working one you can keep tweaking if needed) using PHP's built-in facilities, but out of curiosity I went hunting a bit and it does seem that mod_fcgid indeed has no ability to pre-spawn processes.

It looks like DefaultMinClassProcessCount 'n' just means that it will never kill processes if the number of processes is 'n' or less. This is something of an oddity, particularly in the context of Apache, where normally such an option would mean that 'n' processes would always be running, even at initial startup.
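
The semantics described above can be illustrated with a toy model. This is not mod_fcgid's code, just a sketch of the behaviour as understood in this thread: idle processes past the timeout are reaped only while more than the minimum count remain, and nothing is started ahead of the first request. The function name and the sample idle times are hypothetical.

```python
def reap_idle(idle_seconds, idle_timeout, min_count):
    """Return the idle times of the processes that survive one idle scan."""
    survivors = sorted(idle_seconds)  # least idle first, longest idle last
    # Kill the longest-idle process while it is past the timeout AND the
    # process count is still above the configured minimum.
    while len(survivors) > min_count and survivors[-1] > idle_timeout:
        survivors.pop()
    return survivors


if __name__ == "__main__":
    # Four processes, IdleTimeout 300, minimum 2: two over-timeout processes are
    # reaped, then the floor of 2 protects the remaining over-timeout one.
    print(reap_idle([10, 400, 500, 900], idle_timeout=300, min_count=2))
    # Right after startup there are no processes, and the minimum creates none.
    print(reap_idle([], idle_timeout=300, min_count=4))
```

In this model the minimum is a floor for reaping, not a pre-spawn count, which would be consistent with the slow first request reported earlier.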

If you're using fastcgi, you may want to look into lighttpd or nginx, as both of them use fastcgi as their primary means of interacting with PHP.

That isn't exactly an optimization of Apache, but unless you need some functionality of apache that can't be replicated with lighty or nginx, you'd probably save some RAM; you're already paying the overhead of fcgi.

The reason most people here frequently recommend Lighttpd/nginx is because they fit in the amount of memory provided by a linode quite well out-of-the-box. For example, lighttpd is a uniprocess server, and so doesn't need to have various thread/process related things tweaked. Most tweaking of lighttpd performance has to do with just picking the right max connections setting, and the default is usually good for most cases anyhow.

@PG1326:

> Fwiw, the largest 'effect' I've seen so far in terms of memory usage / performance is from
>
> reducing swappiness, @ /etc/sysctl.conf
>
>     vm.swappiness = 0
>
> and adding a cronjob to flush caches
>
>     45 * * * * sync; echo 3 > /proc/sys/vm/drop_caches
>
> atm, that results in
>
>     free -m
>                  total       used       free     shared    buffers     cached
>     Mem:           546        171        374          0          2         81
>     -/+ buffers/cache:         87        459
>     Swap:         1023          0       1023
>
> which appears encouraging …

Is this really improving performance for you? Far from encouraging, I'd call that very discouraging. You're flushing out a lot of cached disk reads, which is only going to hurt performance later on.

You're only using 87 MB of RAM, might as well let the system use the rest of it to do something useful.

~JW
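
The number that matters in that `free -m` output is the "-/+ buffers/cache" row: what applications really use, and what would be available if the kernel gave up its disk cache. A small sketch that pulls it out of the old-style output quoted in this thread (the function name is made up; newer versions of `free` print an "available" column instead of this row, so the parser only fits the format shown here):

```python
def parse_free(output):
    """Extract (app_used_mb, free_plus_cache_mb) from old-style `free -m` output."""
    for line in output.splitlines():
        if line.startswith("-/+ buffers/cache:"):
            used, free = line.split(":", 1)[1].split()
            return int(used), int(free)
    raise ValueError("no '-/+ buffers/cache' row found")


# Output copied from the post being discussed.
SAMPLE = """\
             total       used       free     shared    buffers     cached
Mem:           546        171        374          0          2         81
-/+ buffers/cache:         87        459
Swap:         1023          0       1023
"""

if __name__ == "__main__":
    used, available = parse_free(SAMPLE)
    print(f"applications use {used} MB; {available} MB is free or reclaimable cache")
```

Cached pages are handed back to applications on demand, so dropping them by cron only forces the same files to be read from disk again.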

@Guspaz:

> If you're using fastcgi, you may want to look into lighttpd or nginx

unfortunately, drupal is still a kludge (mod_rewrite? more?) on both lighttpd and nginx … _not_ using apache adds too many 'gotchas' (for now …)

@JshWright:

> Far from encouraging, I'd call that very discouraging.

encouraging only in the sense i'd misunderstood linux caching :-/

@JshWright:

> You're flushing out a lot of cached disk reads, which is only going to hurt performance later on.
>
> You're only using 87 MB of RAM, might as well let the system use the rest of it to do something useful.

you're absolutely right here. i stopped using that "sync; echo 3 > /proc/sys/vm/drop_caches".

i stepped back and took a look at the whole system, rather than just the (supposed) apc performance.

realized that Apache's 'fat', mod_*cache & mod_ssl are sloooow, and learned that drupal's use of caching is underperforming …

that said, i switched:

* drupal 6.14 -> pressflow-6
* installed drupal CacheRouter module, config'd for APC
* installed Pound as front-end/proxy, installed SSL certs there
* got rid of Apache mod_ssl, mod_cache, mod_file_cache & mod_disk_cache
* installed Varnish as a caching reverse proxy

with that config, i'm starting to do some local hammering with httperf,

    httperf --hog --http-version=1.1 \
    --server=my.site.com --uri=/info.php \
    --port=443 --ssl --ssl-no-reuse --ssl-ciphers=AES256-SHA \
    --send-buffer=4096 --recv-buffer=16384 \
    --num-calls=10 --num-conns=5000 --timeout=5 --rate=10

shows a decent CPU load,

    top - 18:35:55 up 21:58,  3 users,  load average: 2.29, 8.01, 13.52
    Tasks: 102 total,   2 running, 100 sleeping,   0 stopped,   0 zombie
    Cpu(s):  0.4%us,  0.3%sy,  0.0%ni, 97.8%id,  1.5%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:    559976k total,   330100k used,   229876k free,     3028k buffers
    Swap:  1048568k total,   274004k used,   774564k free,    56232k cached

mem's ok – much better, relatively, than before,

    free -m
                 total       used       free     shared    buffers     cached
    Mem:           546        322        224          0          2         55
    -/+ buffers/cache:        263        282
    Swap:         1023        267        756

and, httperf returns

    Maximum connect burst length: 1

    Total: connections 5000 requests 49970 replies 49966 test-duration 499.964 s

    Connection rate: 10.0 conn/s (100.0 ms/conn, <=109 concurrent connections)
    Connection time [ms]: min 50.6 avg 485.9 max 14368.1 median 67.5 stddev 1537.5
    Connection time [ms]: connect 102.5
    Connection length [replies/conn]: 9.995

    Request rate: 99.9 req/s (10.0 ms/req)
    Request size [b]: 89.0

    Reply rate [replies/s]: min 34.2 avg 99.9 max 223.8 stddev 18.9 (99 samples)
    Reply time [ms]: response 28.7 transfer 9.4
    Reply size [b]: header 266.0 content 75975.0 footer 0.0 (total 76241.0)
    Reply status: 1xx=0 2xx=49966 3xx=0 4xx=0 5xx=0

    CPU time [s]: user 156.91 system 308.77 (user 31.4% system 61.8% total 93.1%)
    Net I/O: 7449.6 KB/s (61.0*10^6 bps)

    Errors: total 4 client-timo 4 socket-timo 0 connrefused 0 connreset 0
    Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

this is @ access of a verbose php page (the /info.php target above), not reusing SSL session IDs, and hence renegotiating ...

if i switch to a 'lightweight' html page as --uri target, i'm seeing request rates ~ 400 req/s.

with the 'old' apache config i'd had in place, i couldn't even get 10 req/s consistently ...

note that, atm, this is testing from the linode itself ... so performance is tainted by the load of the testing executable itself.

now, to figure out how the rates i _am_ seeing compare to 'norms' ... for linodes (somebody 'in here' has to have checked at some point ...), and drupal in general.
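
When comparing runs, it can help to recompute httperf's headline figures from the raw totals, so that two reports are compared on the same basis. The sketch below uses only numbers from the output above; the function name is made up, and the Net I/O formula (request bytes plus reply bytes, in KB of 1024 bytes, over the test duration) is an assumption that happens to reproduce the reported 7449.6 KB/s.

```python
def httperf_summary(requests, replies, duration_s, request_bytes, reply_bytes, errors):
    """Recompute request rate (req/s), net I/O (KB/s) and error percentage."""
    request_rate = requests / duration_s
    # ASSUMPTION: Net I/O counts bytes sent plus bytes received, with 1 KB = 1024 bytes.
    net_io_kb_s = (requests * request_bytes + replies * reply_bytes) / 1024 / duration_s
    error_pct = 100.0 * errors / requests
    return request_rate, net_io_kb_s, error_pct


if __name__ == "__main__":
    rate, net_io, err = httperf_summary(
        requests=49970,
        replies=49966,
        duration_s=499.964,
        request_bytes=89.0,
        reply_bytes=76241.0,
        errors=4,
    )
    print(f"{rate:.1f} req/s, {net_io:.1f} KB/s, {err:.3f}% errors")
```

Note that with --rate=10 and --num-calls=10 the run is capped at about 100 req/s, so a result near that figure shows the server kept up with the offered load rather than its maximum throughput.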

oh, and, also did some mysql tweaking:

* put mysql /tmp in tmpfs (@ /etc/fstab)

        tmpfs  /tmp/mysqltmp tmpfs rw,gid=105,uid=105,size=128M,nosuid,nodev,noexec,nr_inodes=10k,mode=0700 0 0

* monkeyed with (and continue to ...) cache, query, thread & buffer sizes
* increased thread_concurrency from 2 -> 8 (i.e., 2x # of CPUs)

all together, a flash-heavy, dynamic php page that had taken ~ 15 secs to load is coming up in ~1-2 secs now ...

still more room to improve, i suspect ....

@PG1326:

> drupal is still a kludge (mod_rewrite? more?) on both lighttpd and nginx … _not_ using apache adds too many 'gotchas' (for now …)

Personally, I've had no problems at all running Drupal over Nginx. If you're only after clean urls, this is fairly trivial: just use something like the following:

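    # Route any request that does not match an existing file, directory,
    # or symlink to Drupal's front controller.
    if (!-e $request_filename) {
        rewrite  ^/(.*)$  /index.php?q=$1 last;
        break;
    }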

That said, it seems like you're pretty set on apache ;)

For the record, and using your httperf params from above (no real optimization, no apc) accessing a page containing phpinfo() over ssl, on a 360 also running mysql, php-fastcgi, spamd and exim4:

    Total: connections 5000 requests 50000 replies 50000 test-duration 499.990 s

    Connection rate: 10.0 conn/s (100.0 ms/conn, <=2 concurrent connections)
    Connection time [ms]: min 50.0 avg 87.8 max 175.2 median 87.5 stddev 4.4
    Connection time [ms]: connect 40.8
    Connection length [replies/conn]: 10.000

    Request rate: 100.0 req/s (10.0 ms/req)
    Request size [b]: 105.0

    Reply rate [replies/s]: min 99.4 avg 100.0 max 100.4 stddev 0.1 (100 samples)
    Reply time [ms]: response 1.9 transfer 2.8
    Reply size [b]: header 182.0 content 52920.0 footer 2.0 (total 53104.0)
    Reply status: 1xx=0 2xx=50000 3xx=0 4xx=0 5xx=0

    CPU time [s]: user 150.35 system 333.25 (user 30.1% system 66.7% total 96.7%)
    Net I/O: 5196.2 KB/s (42.6*10^6 bps)

    Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
    Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

And accessing a (thinly modified) default drupal front page, still with the above httperf settings:

    Total: connections 5000 requests 50000 replies 50000 test-duration 500.174 s

    Connection rate: 10.0 conn/s (100.0 ms/conn, <=21 concurrent connections)
    Connection time [ms]: min 222.8 avg 290.5 max 2257.2 median 274.5 stddev 123.4
    Connection time [ms]: connect 19.2
    Connection length [replies/conn]: 10.000

    Request rate: 100.0 req/s (10.0 ms/req)
    Request size [b]: 95.0

    Reply rate [replies/s]: min 81.8 avg 100.0 max 117.8 stddev 2.6 (100 samples)
    Reply time [ms]: response 27.1 transfer 0.0
    Reply size [b]: header 487.0 content 6666.0 footer 2.0 (total 7155.0)
    Reply status: 1xx=0 2xx=50000 3xx=0 4xx=0 5xx=0

    CPU time [s]: user 103.38 system 295.68 (user 20.7% system 59.1% total 79.8%)
    Net I/O: 707.6 KB/s (5.8*10^6 bps)

    Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
    Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

@nknight:

@PG1326:

> And remember, even though you're sharing the host with other people, you can use up to four full CPU cores if it's available, and it often will be.

Be careful with this. I don't remember exactly how Xen handles this, but most of the time VM hosts will process all of a VM client's CPU "slices" synchronously, meaning that if you have 4 vCPUs set up, the host will wait to schedule that VM until it has room on all 4 real CPUs. On VMware a lot of things actually end up running better with 1/2 "cores" on 4-16 core machines when they used to run on 4-8 core hardware pre-virtualization.

I raised this concern as well. See caker's response at http://www.linode.com/forums/viewtopic.php?p=24092#24092
