php-fpm memory issues / 504 Gateway Time-out
* Linode 1024 (was Linode 768 until two days ago)
* LEMP install on Ubuntu 10.04 LTS (built using cut n paste guides on VPSBible 2 yrs back)
* 10 WordPress websites (8 use WP-SuperCache, 2 do not)
* 1 vBulletin 4 site
* 1 Mumble installation
When I run free -m on the server, it generally looks as depleted as the table below shows:
             total       used       free     shared    buffers     cached
Mem:          1004        983         21          0          2         64
-/+ buffers/cache:        916         87
Swap:         1023        770        253
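To put those figures in proportion, here is a minimal Python sketch that turns them into percentages. The numbers are copied from the free -m output above; the helper name `percent` is made up for illustration.

```python
def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)

# Values in MB, copied from the free -m output above.
mem_total = 1004
used_by_apps = 916   # "used" on the -/+ buffers/cache row
swap_total = 1023
swap_used = 770

app_pct = percent(used_by_apps, mem_total)
swap_pct = percent(swap_used, swap_total)

print(f"RAM held by applications: {app_pct}%")
print(f"Swap in use: {swap_pct}%")
```

The "-/+ buffers/cache" row is used here because it excludes memory the kernel can reclaim, so it is the closer measure of what the running processes actually hold.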
Despite the low memory, this never appeared to be a problem for me, since sites loaded quickly and I've been able to serve circa 2000 page views a day across various domains.
A couple of days ago I updated all my WordPress plugins to their newest versions. (I appreciate that when I do so I get bug fixes and enhancements which may bring extra load on the server.) A few hours after doing so, visitors to my sites would get the dreaded "504 Gateway Time-Out" message. If I restarted php-fpm and nginx, everything was fine for a short while before the problem recurred.
What have I done to remediate?
I was on a Linode 768 plan. I upgraded to the 1024 plan in case I was simply trying to do too much with the server; however, the problems persisted. The additional memory was gobbled up, just not as quickly.
I've been through and disabled all plugins I've found are now redundant to my requirements. I've checked that caching is in fact turned on and working on the various sites. I've also made sure the distro / installed modules are all up to date.
I contacted Linode and they advised me the error causing the issues is as follows:
> Out of memory: Kill process 13161 (php-fpm) score 246 or sacrifice child
> Killed process 13161 (php-fpm) total-vm:523528kB, anon-rss:118892kB, file-rss:9460kB
> Out of memory: Kill process 13178 (php-fpm) score 212 or sacrifice child
> Killed process 13178 (php-fpm) total-vm:472628kB, anon-rss:128876kB, file-rss:7884kB
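For reference, the resident memory of each killed worker can be read straight from those lines. The following is a rough Python sketch, assuming the log format is exactly as quoted above; the function name `resident_mb` is hypothetical.

```python
import re

# The two "Killed process" lines from the log above.
LOG = """\
Killed process 13161 (php-fpm) total-vm:523528kB, anon-rss:118892kB, file-rss:9460kB
Killed process 13178 (php-fpm) total-vm:472628kB, anon-rss:128876kB, file-rss:7884kB
"""

PATTERN = re.compile(
    r"Killed process (\d+) \((\S+)\) "
    r"total-vm:(\d+)kB, anon-rss:(\d+)kB, file-rss:(\d+)kB"
)

def resident_mb(log_text):
    """Map each killed PID to its resident memory (anon-rss + file-rss) in MB."""
    result = {}
    for pid, _name, _vm, anon, file_rss in PATTERN.findall(log_text):
        result[int(pid)] = round((int(anon) + int(file_rss)) / 1024, 1)
    return result

for pid, mb in resident_mb(LOG).items():
    print(f"php-fpm {pid}: about {mb} MB resident")
```

Note that total-vm is address space rather than physical memory, which is why the sketch sums only the two rss fields.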
What can I do?
I'm a bit stumped as to what direction to go now. My install is now old and I wondered if it was worthwhile setting up a new Linode with the 12.x distro; however I wasn't sure if LEMP was still the best thing for me to go for or if LAMP might be better suited to my needs.
Whilst Googling and hunting around for a solution to my problems I've seen max clients popping up with high frequency. As I write I'm trying to work out where this will be in my configuration files to be able to detail what mine is set to / make an adjustment.
It's probably apparent, but I am a bit of a *nix noob. I get by, and came to this world because I was sick and tired of the shared hosting (or expensive, less-shared hosting) that companies like 1&1 ran, which just didn't suit my needs. I'm happy to dig in and make changes; I just want to understand them and have the confidence I'm not going to royally screw things up, since I have live sites I obviously want to limit downtime for.
Apologies for the wall of text - I hope someone will be able to point me in the right direction.
Can you post the contents of your php-fpm pool configuration?
Here's the php-fpm pool (I think)
<configuration>
  All relative paths in this config are relative to php's install prefix

  Pid file
  <value name="pid_file">/var/run/php-fpm.pid</value>

  Error log file
  <value name="error_log">/var/log/php-fpm.log</value>

  Log level
  <value name="log_level">notice</value>

  When this amount of php processes exited with SIGSEGV or SIGBUS ...
  <value name="emergency_restart_threshold">10</value>

  ... in a less than this interval of time, a graceful restart will be initiated.
  Useful to work around accidental curruptions in accelerator's shared memory.
  <value name="emergency_restart_interval">1m</value>

  Time limit on waiting child's reaction on signals from master
  <value name="process_control_timeout">5s</value>

  Set to 'no' to debug fpm
  <value name="daemonize">yes</value>

  <workers>
    Name of pool. Used in logs and stats.
    <value name="name">default</value>

    Address to accept fastcgi requests on.
    Valid syntax is 'ip.ad.re.ss:port' or just 'port' or '/path/to/unix/socket'
    <value name="listen_address">127.0.0.1:9000</value>

    <value name="listen_options">
      Set listen(2) backlog
      <value name="backlog">-1</value>

      Set permissions for unix socket, if one used.
      In Linux read/write permissions must be set in order to allow connections from web server.
      Many BSD-derrived systems allow connections regardless of permissions.
      <value name="owner">www-data</value>
      <value name="group">www-data</value>
      <value name="mode">0666</value>
    </value>

    Additional php.ini defines, specific to this pool of workers.
    These settings overwrite the values previously defined in the php.ini.
    <value name="php_defines"></value>

    Unix user of processes
    <value name="user">www-data</value>

    Unix group of processes
    <value name="group">www-data</value>

    Process manager settings
    <value name="pm">
      Sets style of controling worker process count.
      Valid values are 'static' and 'apache-like'
      <value name="style">static</value>

      Sets the limit on the number of simultaneous requests that will be served.
      Equivalent to Apache MaxClients directive.
      Equivalent to PHP_FCGI_CHILDREN environment in original php.fcgi
      Used with any pm_style.
      <value name="max_children">5</value>

      Settings group for 'apache-like' pm style
      <value name="apache_like">
        Sets the number of server processes created on startup.
        Used only when 'apache-like' pm_style is selected
        <value name="StartServers">20</value>

        Sets the desired minimum number of idle server processes.
        Used only when 'apache-like' pm_style is selected
        <value name="MinSpareServers">5</value>

        Sets the desired maximum number of idle server processes.
        Used only when 'apache-like' pm_style is selected
        <value name="MaxSpareServers">35</value>
      </value>
    </value>

    The timeout (in seconds) for serving a single request after which the worker process will be terminated
    Should be used when 'max_execution_time' ini option does not stop script execution for some reason
    '0s' means 'off'
    <value name="request_terminate_timeout">0s</value>

    The timeout (in seconds) for serving of single request after which a php backtrace will be dumped to slow.log file
    '0s' means 'off'
    <value name="request_slowlog_timeout">0s</value>

    The log file for slow requests
    <value name="slowlog">/var/log/php-fpm.log.slow</value>

    Set open file desc rlimit
    <value name="rlimit_files">1024</value>

    Set max core size rlimit
    <value name="rlimit_core">0</value>

    Chroot to this directory at the start, absolute path
    <value name="chroot"></value>

    Chdir to this directory at the start, absolute path
    <value name="chdir"></value>

    Redirect workers' stdout and stderr into main error log.
    If not set, they will be redirected to /dev/null, according to FastCGI specs
    <value name="catch_workers_output">yes</value>

    How much requests each process should execute before respawn.
    Useful to work around memory leaks in 3rd party libraries.
    For endless request processing please specify 0
    Equivalent to PHP_FCGI_MAX_REQUESTS
    <value name="max_requests">500</value>

    Comma separated list of ipv4 addresses of FastCGI clients that allowed to connect.
    Equivalent to FCGI_WEB_SERVER_ADDRS environment in original php.fcgi (5.2.2+)
    Makes sense only with AF_INET listening socket.
    <value name="allowed_clients">127.0.0.1</value>

    Pass environment variables like LD_LIBRARY_PATH
    All $VARIABLEs are taken from current environment
    <value name="environment">
      <value name="HOSTNAME">$HOSTNAME</value>
      <value name="PATH">/usr/local/bin:/usr/bin:/bin</value>
      <value name="TMP">/tmp</value>
      <value name="TMPDIR">/tmp</value>
      <value name="TEMP">/tmp</value>
      <value name="OSTYPE">$OSTYPE</value>
      <value name="MACHTYPE">$MACHTYPE</value>
      <value name="MALLOC_CHECK_">2</value>
    </value>
  </workers>
</configuration>
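Since the "max clients" setting came up earlier, this is a small Python sketch of where it lives in a file of this shape. It works on a trimmed excerpt of the configuration above (only the process manager block is kept), and the function name `pm_settings` is invented for the example; it is not part of php-fpm.

```python
import xml.etree.ElementTree as ET

# Trimmed excerpt of the configuration above: process manager block only.
CONFIG = """\
<configuration>
  <workers>
    <value name="pm">
      <value name="style">static</value>
      <value name="max_children">5</value>
      <value name="apache_like">
        <value name="StartServers">20</value>
        <value name="MinSpareServers">5</value>
        <value name="MaxSpareServers">35</value>
      </value>
    </value>
  </workers>
</configuration>
"""

def pm_settings(xml_text):
    """Collect the leaf settings under the 'pm' group as a name -> text dict."""
    root = ET.fromstring(xml_text)
    pm = root.find("./workers/value[@name='pm']")
    settings = {}
    for node in pm.iter("value"):
        # Container nodes have child elements; keep only the leaves.
        if len(node) == 0:
            settings[node.get("name")] = (node.text or "").strip()
    return settings

for name, value in pm_settings(CONFIG).items():
    print(f"{name} = {value}")
```

With style set to static, max_children is the value that matters; the three apache_like values only apply if the style is changed.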
And here's the Perl script output:
 >>  MySQLTuner 1.2.0 - Major Hayden <firstname.lastname@example.org>
 >>  Bug reports, feature requests, and downloads at http://mysqltuner.com/
 >>  Run with '--help' for additional options and output filtering
[OK] Logged in using credentials from debian maintenance account.

-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.1.66-0ubuntu0.10.04.3
[OK] Operating on 32-bit architecture with less than 2GB RAM

-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 205M (Tables: 1714)
[--] Data in InnoDB tables: 4M (Tables: 67)
[--] Data in MEMORY tables: 505K (Tables: 10)
[!!] Total fragmented tables: 297

-------- Security Recommendations -------------------------------------------
[OK] All database users have passwords assigned

-------- Performance Metrics -------------------------------------------------
[--] Up for: 1d 16h 35m 36s (607K q [4.159 qps], 15K conn, TX: 9B, RX: 127M)
[--] Reads / Writes: 86% / 14%
[--] Total buffers: 58.0M global + 2.7M per thread (151 max threads)
[OK] Maximum possible memory usage: 463.8M (46% of installed RAM)
[OK] Slow queries: 0% (7/607K)
[OK] Highest usage of available connections: 3% (6/151)
[OK] Key buffer size / total MyISAM indexes: 16.0M/39.4M
[OK] Key buffer hit rate: 97.9% (2M cached / 56K reads)
[OK] Query cache efficiency: 63.9% (333K cached / 522K selects)
[!!] Query cache prunes per day: 18831
[OK] Sorts requiring temporary tables: 0% (0 temp sorts / 23K sorts)
[!!] Joins performed without indexes: 3790
[!!] Temporary tables created on disk: 34% (11K on disk / 33K total)
[OK] Thread cache hit rate: 99% (6 created / 15K connections)
[!!] Table cache hit rate: 0% (64 open / 24K opened)
[OK] Open file limit used: 12% (124/1K)
[OK] Table locks acquired immediately: 99% (264K immediate / 264K locks)
[OK] InnoDB data size / buffer pool: 4.0M/8.0M

-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    Enable the slow query log to troubleshoot bad queries
    Adjust your join queries to always utilize indexes
    When making adjustments, make tmp_table_size/max_heap_table_size equal
    Reduce your SELECT DISTINCT queries without LIMIT clauses
    Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
    query_cache_size (> 16M)
    join_buffer_size (> 128.0K, or always use indexes with joins)
    tmp_table_size (> 16M)
    max_heap_table_size (> 16M)
    table_cache (> 64)
Looking at your PHP-FPM settings, you have a static count of 5 PHP processes. This may be your problem: if you get 5 long-running requests (e.g. image uploads), any new requests will be queued and may time out. Try increasing that gradually.
On another note, judging by the syntax of the PHP-FPM file, it looks like you're running a pretty old version of PHP-FPM. I know that PHP-FPM doesn't come as standard on Ubuntu 10.04, so how did you install it?
You may want to switch FPM to 'apache-like' instead of static, with max_children of, say, 10 and StartServers of 4.
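Before raising max_children, it may be worth checking how many workers the box can actually hold, given the OOM kills reported earlier in the thread. A back-of-the-envelope Python sketch follows. The 1004 MB total is from the free -m output and the roughly 130 MB per worker is taken from the OOM log (anon-rss plus file-rss); the 400 MB reserved for MySQL, nginx, Mumble and the OS is purely a guess, and the function name is hypothetical.

```python
def max_children_for(ram_mb, reserved_mb, per_worker_mb):
    """Largest worker count whose combined memory fits in the RAM left over."""
    available = ram_mb - reserved_mb
    if available <= 0 or per_worker_mb <= 0:
        return 0
    return int(available // per_worker_mb)

# ram_mb: total from free -m earlier in the thread.
# per_worker_mb: roughly what the OOM log shows for a killed php-fpm worker.
# reserved_mb: an assumed allowance for everything else, not a measurement.
print(max_children_for(ram_mb=1004, reserved_mb=400, per_worker_mb=130))
```

Under those assumptions the result is below the current setting of 5, which suggests per-worker memory (or max_requests) deserves a look alongside the worker count. Measuring real worker sizes on the server would give a better figure than the two samples in the log.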
The first thing I'll do is update php-fpm. I hadn't realised it wasn't updating when I've been using two bash scripts to perform general server updates. Thinking about it, I am sure I've got a link somewhere from VPSBible.com (site I used to config the server in the first place) which tells me how to update PHP-FPM with zero downtime. I'll give that a go.
Re: how did I install PHP-FPM? There was a guide on VPSBible for how to install and configure it.
If you have the time and want to switch to 12.04, I'd create a new Linode, install your software, copy your data over, then test, test and test again. After that you can use the Linode swap IPs function so you don't have to worry about downtime during DNS updates.
Appreciate the time given to me on this. I wasn't aware of the swap IP function so that's very useful to be able to take advantage of.