PostgreSQL and MySQL crashes

I was very happy with my Linode until the last few months, and had never had any problems. Recently, though, I've started seeing crashes of PostgreSQL and MySQL. I haven't changed any parameters. After not needing to touch the server for up to 6 months at a time, I now have to check on it a few times a week. This is becoming unmanageable.

Any ideas what may be causing the problems, or which logs you'd want to see to begin diagnosing them? Is it simply a load issue? (Although, as I say, I don't think much has changed since the "early days".)

I'm running Mandrake 9.1 (minimal install) on a Linode 64.

Thanks, Tom

11 Replies

Are you running out of memory? Which kernel version are you running, and have you switched kernels or rebooted lately?

Check "dmesg" for memory allocation errors, too…

-Chris

Seems like it could be a memory issue:

I get a large number of these in dmesg:

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1f0/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)

(This was just a sample). I'd been worried about memory before, and ran top to determine how much was being used at any one time, but I'd read on a post somewhere that it's okay that top reports high usage, as this has to do with the way that UML allocates memory.

I'm using the following kernel:

Linux li2-77 2.4.27-linode34-1um

Thanks for the help,

Tom

So what would you suggest from here? Is there anything I can do to determine whether I am running out of memory, and how can I limit memory use by MySQL/PostgreSQL?

Thanks, Tom

Hello…

I think you may find this relevant:

> http://www.linode.com/forums/viewtopic.php?t=1128

And also:

> http://www.linode.com/forums/viewtopic.php?t=1210

jp

You are almost certainly running out of memory. The __alloc_pages error message shows the kernel trying to allocate a single memory page (a "0-order" allocation) and failing, which means that either physical memory and swap are both completely full, or all current physical pages have been marked non-swappable (exceedingly unlikely). (As it's a single-page allocation, page contiguity considerations don't apply.)
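To get a feel for how often this is happening, the failure lines can be tallied by allocation order and gfp mask. This is only a rough sketch: it assumes the message format quoted earlier in the thread, the function name `count_failures` is made up for illustration, and the pattern deliberately tolerates missing underscores, since forum formatting can eat them.

```python
import re
from collections import Counter

# Assumed format (2.4-era kernel), as quoted in this thread:
#   __alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
# The underscores are matched loosely in case formatting stripped some.
FAILURE = re.compile(
    r"_{1,2}alloc_?pages: (\d+)-order allocation failed "
    r"\(gfp=(0x[0-9a-fA-F]+)/(\d+)\)"
)

def count_failures(dmesg_text):
    """Return a Counter keyed by (order, gfp_mask) for each failure line."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = FAILURE.search(line)
        if match:
            counts[(int(match.group(1)), match.group(2))] += 1
    return counts

# Three lines taken from the dmesg sample posted above.
sample = """\
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
__alloc_pages: 0-order allocation failed (gfp=0x1f0/0)
__alloc_pages: 0-order allocation failed (gfp=0x1d2/0)
"""

for (order, gfp), n in sorted(count_failures(sample).items()):
    print(f"order={order} gfp={gfp}: {n}")
```

To run it against a live system, the text of `dmesg` would be passed in instead of `sample`. A count that keeps growing between checks suggests the memory pressure is ongoing rather than a one-off spike.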

When the OOM killer is choosing something to kill, it has a preference for processes which are both consuming a lot of memory and have been running for only a short time. MySQL, which spawns many short-lived threads, is a favourite because it is greedy for memory and looks short-lived because it has young threads. I imagine this also holds true for PostgreSQL.

I suggest that you post the output of:

```
ps -e -o pid,cmd,%mem,rss,trs,sz,vsz
```

and

```
cat /proc/meminfo
```

to see what your memory usage is like.
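If you would rather keep an eye on this over time than read the raw file, something like the following could boil `/proc/meminfo` down to one number. It is a minimal sketch, not a monitoring tool: `parse_meminfo` and `headroom_kb` are names invented here, and "headroom" is a rough upper bound that assumes buffers and page cache could all be reclaimed, which is optimistic.

```python
import os

def parse_meminfo(text):
    """Map 'Key: 123 kB' lines to {key: kilobytes}; other lines are skipped."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Only accept the simple "<number> kB" form; this skips the
        # multi-column "Mem:" and "Swap:" summary rows older kernels print.
        if sep and len(fields) == 2 and fields[0].isdigit() and fields[1] == "kB":
            values[key.strip()] = int(fields[0])
    return values

def headroom_kb(info):
    """Free RAM + buffers + page cache + free swap, in kB (rough upper bound)."""
    return info["MemFree"] + info["Buffers"] + info["Cached"] + info["SwapFree"]

if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    print(f"approximate headroom: {headroom_kb(info)} kB")
```

Running it from cron and logging the result would show whether the headroom shrinks steadily (a leak or growing process) or drops suddenly when a particular service starts.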

I think I may have identified the issue - I'm running Zope/Plone and I think that's taking up all the memory. Now that I think about it, the issues came about the same time that I installed it. I guess it's something of a memory hog and has been knocking the other sites out.

I may just decide to remove Zope/Plone. It was for a website that I run, and that site can be converted back to a PHP solution.

This is without Zope/Plone running:

Here's the output of ps -e -o pid,cmd,%mem,rss,trs,sz,vsz

PID CMD %MEM RSS TRS SZ VSZ

1 init [3] 0.1 80 26 335 1340

2 [keventd] 0.0 0 0 0 0

3 [ksoftirqd_CPU0] 0.0 0 0 0 0

4 [kswapd] 0.0 0 0 0 0

5 [bdflush] 0.0 0 0 0 0

6 [kupdated] 0.0 0 0 0 0

7 [jfsIO] 0.0 0 0 0 0

8 [jfsCommit] 0.0 0 0 0 0

9 [jfsSync] 0.0 0 0 0 0

10 [xfsbufd] 0.0 0 0 0 0

11 [xfslogd/0] 0.0 0 0 0 0

12 [xfsdatad/0] 0.0 0 0 0 0

13 [mdrecoveryd] 0.0 0 0 0 0

14 [kjournald] 0.0 0 0 0 0

269 /bin/bash /etc/r 0.0 4 586 658 2632

517 /sbin/dhclient - 0.7 464 319 486 1944

594 syslogd -m 0 0.2 172 24 421 1684

653 /usr/sbin/atd 0.1 104 12 344 1376

672 /usr/sbin/sshd 0.5 336 279 724 2896

1016 /usr/bin/postmas 0.4 272 1846 2710 10840

1037 crond 0.1 116 19 349 1396

1070 postgres: stats 0.1 64 1846 2958 11832

1075 postgres: stats 0.6 388 1846 2760 11040

1101 /bin/sh /etc/rc3 0.0 4 586 651 2604

1119 /usr/local/sbin/ 0.1 104 73 391 1564

22507 httpd2 -f /etc/h 0.4 252 304 3935 15740

30334 /usr/bin/perl /u 0.0 4 8 1003 4012

31890 /usr/bin/perl /u 0.0 4 8 1003 4012

1477 httpd2 -f /etc/h 12.8 7644 304 4728 18912

3687 /bin/sh /usr/bin 0.0 4 586 605 2420

3688 logger -t mysqld 0.0 4 6 394 1576

3713 /usr/sbin/mysqld 3.6 2156 1764 3522 14088

3714 /usr/sbin/mysqld 3.6 2156 1764 3522 14088

3715 /usr/sbin/mysqld 3.6 2156 1764 3522 14088

9756 httpd2 -f /etc/h 7.2 4296 304 3995 15980

9760 httpd2 -f /etc/h 13.0 7800 304 4703 18812

9762 httpd2 -f /etc/h 13.7 8208 304 4707 18828

9763 httpd2 -f /etc/h 13.6 8152 304 4692 18768

9775 httpd2 -f /etc/h 11.1 6624 304 4679 18716

12031 httpd2 -f /etc/h 4.1 2464 304 3955 15820

12032 httpd2 -f /etc/h 12.4 7440 304 4697 18788

12607 httpd2 -f /etc/h 3.7 2248 304 3951 15804

12608 httpd2 -f /etc/h 3.7 2252 304 3951 15804

12644 sshd: mthaddon [ 2.8 1700 279 1579 6316

12646 sshd: mthaddon@p 3.3 1988 279 1593 6372

12647 -bash 2.5 1508 586 646 2584

12685 su - 1.6 988 16 566 2264

12689 -bash 2.6 1576 586 663 2652

12747 ps -e -o pid,cmd 1.0 636 58 626 2504

And here's the output of cat /proc/meminfo

total: used: free: shared: buffers: cached:

Mem: 60997632 58667008 2330624 0 3559424 25677824

Swap: 68149248 15175680 52973568

MemTotal: 59568 kB

MemFree: 2276 kB

MemShared: 0 kB

Buffers: 3476 kB

Cached: 17684 kB

SwapCached: 7392 kB

Active: 17920 kB

Inactive: 30768 kB

HighTotal: 0 kB

HighFree: 0 kB

LowTotal: 59568 kB

LowFree: 2276 kB

SwapTotal: 66552 kB

SwapFree: 51732 kB

And this is with Zope/Plone running:

ps -e -o pid,cmd,%mem,rss,trs,sz,vsz

PID CMD %MEM RSS TRS SZ VSZ

1 init [3] 0.1 80 26 335 1340

2 [keventd] 0.0 0 0 0 0

3 [ksoftirqd_CPU0] 0.0 0 0 0 0

4 [kswapd] 0.0 0 0 0 0

5 [bdflush] 0.0 0 0 0 0

6 [kupdated] 0.0 0 0 0 0

7 [jfsIO] 0.0 0 0 0 0

8 [jfsCommit] 0.0 0 0 0 0

9 [jfsSync] 0.0 0 0 0 0

10 [xfsbufd] 0.0 0 0 0 0

11 [xfslogd/0] 0.0 0 0 0 0

12 [xfsdatad/0] 0.0 0 0 0 0

13 [mdrecoveryd] 0.0 0 0 0 0

14 [kjournald] 0.0 0 0 0 0

269 /bin/bash /etc/r 0.0 0 586 658 2632

517 /sbin/dhclient - 0.3 204 319 486 1944

594 syslogd -m 0 0.2 172 24 421 1684

653 /usr/sbin/atd 0.0 52 12 344 1376

672 /usr/sbin/sshd 0.3 220 279 724 2896

1016 /usr/bin/postmas 0.4 256 1846 2710 10840

1037 crond 0.1 116 19 349 1396

1070 postgres: stats 0.0 24 1846 2958 11832

1075 postgres: stats 0.1 92 1846 2760 11040

1101 /bin/sh /etc/rc3 0.0 0 586 651 2604

1119 /usr/local/sbin/ 0.0 44 73 391 1564

22507 httpd2 -f /etc/h 0.1 112 304 3935 15740

30334 /usr/bin/perl /u 0.0 0 8 1003 4012

31890 /usr/bin/perl /u 0.0 0 8 1003 4012

1477 httpd2 -f /etc/h 5.7 3444 304 4728 18912

3687 /bin/sh /usr/bin 0.0 0 586 605 2420

3688 logger -t mysqld 0.0 0 6 394 1576

3713 /usr/sbin/mysqld 0.4 292 1764 3522 14088

3714 /usr/sbin/mysqld 0.4 296 1764 3522 14088

3715 /usr/sbin/mysqld 0.4 296 1764 3522 14088

9756 httpd2 -f /etc/h 2.4 1468 304 3995 15980

9760 httpd2 -f /etc/h 4.0 2436 304 4703 18812

9762 httpd2 -f /etc/h 1.7 1072 304 4707 18828

9763 httpd2 -f /etc/h 1.8 1108 304 4692 18768

9775 httpd2 -f /etc/h 4.1 2448 304 4679 18716

12031 httpd2 -f /etc/h 1.3 820 304 3955 15820

12032 httpd2 -f /etc/h 1.8 1104 304 4697 18788

12607 httpd2 -f /etc/h 2.9 1760 304 4149 16596

12608 httpd2 -f /etc/h 1.7 1060 304 3957 15828

12845 sshd: mthaddon [ 2.8 1672 279 1579 6316

12847 sshd: mthaddon@p 3.2 1960 279 1593 6372

12848 -bash 2.5 1508 586 646 2584

12886 su - 1.6 988 16 566 2264

12890 -bash 2.6 1576 586 663 2652

12977 /usr/local/bin/p 62.1 37004 780 12046 48184

12989 /usr/local/bin/p 62.1 37004 780 12046 48184

12990 /usr/local/bin/p 62.1 37004 780 12046 48184

12991 /usr/local/bin/p 62.1 37004 780 12046 48184

12992 /usr/local/bin/p 62.1 37004 780 12046 48184

12993 /usr/local/bin/p 62.1 37004 780 12046 48184

12999 ps -e -o pid,cmd 1.0 636 58 626 2504

cat /proc/meminfo

total: used: free: shared: buffers: cached:

Mem: 60997632 59174912 1822720 0 3080192 12967936

Swap: 68149248 36655104 31494144

MemTotal: 59568 kB

MemFree: 1780 kB

MemShared: 0 kB

Buffers: 3008 kB

Cached: 9964 kB

SwapCached: 2700 kB

Active: 9332 kB

Inactive: 42436 kB

HighTotal: 0 kB

HighFree: 0 kB

LowTotal: 59568 kB

LowFree: 1780 kB

SwapTotal: 66552 kB

SwapFree: 30756 kB
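One caution when reading the ps listing above: the six `/usr/local/bin/p` rows each show 62.1% and cannot simply be added up. The sketch below assumes that rows with an identical command, RSS and VSZ are threads sharing one address space (how threads typically show up in ps on 2.4 kernels) and counts them once. That is a heuristic, not something ps guarantees, and the function names are made up for illustration.

```python
def parse_ps_row(row):
    """Split one 'pid cmd %mem rss trs sz vsz' row; cmd may contain spaces."""
    fields = row.split()
    return {
        "pid": int(fields[0]),
        "cmd": " ".join(fields[1:-5]),  # everything between pid and the 5 numeric columns
        "rss": int(fields[-4]),
        "vsz": int(fields[-1]),
    }

def total_rss_kb(rows):
    """Sum RSS in kB, counting rows with identical (cmd, rss, vsz) only once."""
    seen = set()
    total = 0
    for row in rows:
        p = parse_ps_row(row)
        key = (p["cmd"], p["rss"], p["vsz"])
        if key not in seen:  # assumed to be a thread of an already-counted process
            seen.add(key)
            total += p["rss"]
    return total

# Rows copied from the listing with Zope/Plone running.
rows = [
    "9760 httpd2 -f /etc/h 4.0 2436 304 4703 18812",
    "12977 /usr/local/bin/p 62.1 37004 780 12046 48184",
    "12989 /usr/local/bin/p 62.1 37004 780 12046 48184",
    "12990 /usr/local/bin/p 62.1 37004 780 12046 48184",
    "12991 /usr/local/bin/p 62.1 37004 780 12046 48184",
    "12992 /usr/local/bin/p 62.1 37004 780 12046 48184",
    "12993 /usr/local/bin/p 62.1 37004 780 12046 48184",
]

naive = sum(parse_ps_row(r)["rss"] for r in rows)
print(f"naive sum:   {naive} kB")
print(f"deduplicated: {total_rss_kb(rows)} kB")
```

The naive sum comes out far above the 59568 kB of MemTotal, which is the giveaway that those rows share memory. Even counted once, about 37 MB resident is well over half of the RAM on this machine.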

Try increasing the size of your swap space. I know you still have ~30 MiB free with Zope/Plone running, but that isn't a big margin on a busy server. The normal rule of thumb is sizeofswap = 2 * sizeofram, but on a Linode, where you are trying to do a lot with not much RAM, you need a bigger swap space. I have 256 MiB on my Linode 64, and while it may slow down if it gets busy, at least the OOM killer doesn't start killing processes.
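For a sense of scale, here is the arithmetic on the figures posted above with Zope/Plone running. It is just a sketch of the rule of thumb (swap = 2 × RAM), with the helper name `swap_report` invented for the example, and it treats the kB values in /proc/meminfo as units of 1024 bytes.

```python
KIB_PER_MIB = 1024

def swap_report(mem_total_kb, swap_total_kb, swap_free_kb):
    """Return swap usage and the 2 x RAM rule-of-thumb size, all in MiB."""
    used_kb = swap_total_kb - swap_free_kb
    return {
        "swap_used_mib": used_kb / KIB_PER_MIB,
        "swap_free_mib": swap_free_kb / KIB_PER_MIB,
        "rule_of_thumb_mib": 2 * mem_total_kb / KIB_PER_MIB,
    }

# MemTotal, SwapTotal and SwapFree from the /proc/meminfo output
# posted with Zope/Plone running.
report = swap_report(mem_total_kb=59568, swap_total_kb=66552, swap_free_kb=30756)
for name, mib in report.items():
    print(f"{name}: {mib:.1f} MiB")
```

On those numbers the existing swap (about 65 MiB) is already well short of the roughly 116 MiB the rule of thumb gives, before allowing any extra margin for a small-RAM server.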

You may also check with some Zope experts how to tune it so it won't require so much memory. I was recently told that this is possible (although it may require some knowledge of Python and the internals of Zope).

jp

I've allocated all my disk space on the Linode. Is there any way to change the disk allocations so that I can increase the swap size? I think I'd like to try this as an option to begin with.

Thanks, Tom

Yes, you can resize your partitions (I did that when I had the same problem) – go to Hard drive images, click on a partition and you'll see what I mean. Your data should not be lost after resizing, but you'll have to reboot (can't resize a partition when it's being used).

Thanks!! I've updated my swap space to 320MB (to be sure) and will see how things go.

Thanks again to everyone for the response!

Tom
