Page allocation errors

Using the Latest 2.6 kernel, I've regularly been seeing page allocation errors.

In my case the error always appears to be in the IP stack, so I wonder if there's some Xen-level issue. It also seems to coincide with my home machine doing an rsync of my /BACKUP directory, which might explain why the stacks display tcp4 entries. Note the process ID…

e.g.

swapper: page allocation failure. order:2, mode:0x20
Pid: 0, comm: swapper Not tainted 2.6.39.1-linode34 #1
Call Trace:
 [<c0189a30>] ? __alloc_pages_nodemask+0x530/0x6f0
 [<c01afb13>] ? T.819+0xb3/0x2e0
 [<c01aff86>] ? cache_alloc_refill+0x246/0x290
 [<c0139826>] ? local_bh_enable+0x16/0x80
 [<c01b008d>] ? __kmalloc+0xbd/0xd0
 [<c050f07e>] ? pskb_expand_head+0x12e/0x200
 [<c050f5bd>] ? __pskb_pull_tail+0x4d/0x2b0
 [<c05d9263>] ? ipv4_confirm+0xd3/0x180
 [<c0517d6d>] ? dev_hard_start_xmit+0x1dd/0x3e0
 [<c059a900>] ? ip_finish_output2+0x260/0x260
 [<c059a900>] ? ip_finish_output2+0x260/0x260
 [<c052bcc2>] ? sch_direct_xmit+0xb2/0x170
 [<c0518069>] ? dev_queue_xmit+0xf9/0x320
 [<c059aa3b>] ? ip_finish_output+0x13b/0x300
 [<c059acaa>] ? ip_output+0xaa/0xe0
 [<c0599e78>] ? ip_local_out+0x18/0x20
 [<c059a257>] ? ip_queue_xmit+0x117/0x3d0
 [<c01062bb>] ? xen_restore_fl_direct_reloc+0x4/0x4
 [<c068fb71>] ? _raw_spin_unlock_irqrestore+0x11/0x20
 [<c013fcb9>] ? mod_timer+0xf9/0x1b0
 [<c05ad70f>] ? tcp_transmit_skb+0x37f/0x660
 [<c05b0165>] ? tcp_write_xmit+0x1e5/0x4f0
 [<c05b04d4>] ? __tcp_push_pending_frames+0x24/0x90
 [<c05ac4e2>] ? tcp_rcv_established+0x3d2/0x610
 [<c05b2fee>] ? tcp_v4_do_rcv+0xce/0x170
 [<c05b3749>] ? tcp_v4_rcv+0x6b9/0x7a0
 [<c0595887>] ? ip_local_deliver_finish+0x97/0x220
 [<c05957f0>] ? ip_rcv+0x320/0x320
 [<c059524b>] ? ip_rcv_finish+0xfb/0x380
 [<c0516ca9>] ? __netif_receive_skb+0x339/0x3d0
 [<c0516f47>] ? netif_receive_skb+0x67/0x70
 [<c04b84cc>] ? handle_incoming_queue+0x17c/0x250
 [<c04b87bc>] ? xennet_poll+0x21c/0x540
 [<c0131061>] ? load_balance+0x71/0x590
 [<c05176da>] ? net_rx_action+0xea/0x190
 [<c013956c>] ? __do_softirq+0x7c/0x110
 [<c01394f0>] ? __local_bh_enable+0x70/0x70
 <irq>[<c013944e>] ? irq_exit+0x6e/0x90
 [<c044d14d>] ? xen_evtchn_do_upcall+0x1d/0x30
 [<c0690c07>] ? xen_do_upcall+0x7/0xc
 [<c01013a7>] ? hypercall_page+0x3a7/0x1000
 [<c0105b3f>] ? xen_safe_halt+0xf/0x20
 [<c010f1ff>] ? default_idle+0x2f/0x60
 [<c0107e52>] ? cpu_idle+0x42/0x70
 [<c0830797>] ? start_kernel+0x2c8/0x2cd
 [<c083030d>] ? kernel_init+0x126/0x126
 [<c083395b>] ? xen_start_kernel+0x4f7/0x4ff
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  52
CPU    1: hi:  186, btch:  31 usd:  64
CPU    2: hi:  186, btch:  31 usd:  83
CPU    3: hi:  186, btch:  31 usd: 158
active_anon:1419 inactive_anon:1841 isolated_anon:0
 active_file:54491 inactive_file:55433 isolated_file:0
 unevictable:1137 dirty:3 writeback:0 unstable:0
 free:2130 slab_reclaimable:5041 slab_unreclaimable:2407
 mapped:1924 shmem:6 pagetables:270 bounce:0
DMA free:2048kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB active_file:440kB inactive_file:3988kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:100kB slab_unreclaimable:112kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 500 500 500
Normal free:6472kB min:2816kB low:3520kB high:4224kB active_anon:5676kB inactive_anon:7364kB active_file:217524kB inactive_file:217744kB unevictable:4548kB isolated(anon):0kB isolated(file):0kB present:512064kB mlocked:4548kB dirty:12kB writeback:0kB mapped:7696kB shmem:24kB slab_reclaimable:20064kB slab_unreclaimable:9516kB kernel_stack:832kB pagetables:1080kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:4 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 126*4kB 81*8kB 28*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2048kB
Normal: 1618*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 6472kB
111437 total pagecache pages
609 pages in swap cache
Swap cache stats: add 5243, delete 4634, find 170851/171200
Free swap  = 254928kB
Total swap = 263164kB
133104 pages RAM
0 pages HighMem
5748 pages reserved
44857 pages shared
89078 pages non-shared

52 Replies

@MotoHoss:

I am bumping it as well. Same for me. NJ Linode uname -r returns 3.0.18-linode43

No IPV6… maybe it's not disabled completely? maybe I need to go to a different kernel… I dun know for sure but help :x

The errors are likely related to IPv4 traffic; moving more traffic to IPv6 reduced the errors for me, at least. Also, are you experiencing any actual problems related to this? I haven't yet heard of anyone actually having a problem, aside from seeing errors.

As far as I can tell, the error messages are independent of RAM usage and are benign, so odds are good your OOM problem is elsewhere.

You know, interesting you mentioned that…

A scrollback through my recent few half-dozen PAFs shows many mentions of IPv4-related things in the call trace. Also, in my histogrammetric(*) analysis, I mentioned I was testing the backup server… you wanna know what I was testing? Moving the rsyncs from IPv4 to IPv6. Post-IPv6, no more spikes…

Both of these servers have a fair bit of IPv4 traffic (~130 packets/second for one, and probably within an order of magnitude for the other, although it is not yet graphed). Obviously, not as much IPv6 traffic yet.

If it is IPv4-related, then great news! It will fix itself within the next century! :-)

(*) i just made that word up
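The "IPv4-related things in the call trace" observation can be checked mechanically. Below is a minimal sketch, assuming that symbol prefixes such as `ip_`, `ipv4_`, `tcp_v4_` and `ipt_` are a fair proxy for "IPv4-related"; that prefix list and the helper names are hypothetical, chosen from the traces posted in this thread.

```python
import re

# Assumption: these prefixes mark IPv4 code paths. The list is illustrative
# and taken from symbols that appear in the traces in this thread.
IPV4_HINTS = ("ip_", "ipv4_", "tcp_v4_", "ipt_")

# Matches the function name in a frame such as
# " [<c05d9263>] ? ipv4_confirm+0xd3/0x180"
SYMBOL_RE = re.compile(r"\?\s+([A-Za-z_][\w.]*)\+0x")

def trace_symbols(trace_text):
    """Return the function names found in a kernel call trace."""
    return SYMBOL_RE.findall(trace_text)

def ipv4_symbols(trace_text):
    """Return only the frames whose names start with an IPv4 hint."""
    return [s for s in trace_symbols(trace_text) if s.startswith(IPV4_HINTS)]

# A few frames copied from the first trace in the thread.
sample = """\
 [<c050f07e>] ? pskb_expand_head+0x12e/0x200
 [<c05d9263>] ? ipv4_confirm+0xd3/0x180
 [<c059acaa>] ? ip_output+0xaa/0xe0
 [<c05b3749>] ? tcp_v4_rcv+0x6b9/0x7a0
 [<c04b87bc>] ? xennet_poll+0x21c/0x540
"""
hits = ipv4_symbols(sample)
print(hits)
```

A trace with no hits would suggest the failure came from some other path, though a prefix match like this is only a rough heuristic.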

Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.

@theckman:

Try switching to our Latest 3.0 kernel. We recently deployed a new kernel that should help with this.
I'm running CentOS 5.7; how much testing has been done with this older OS and the new kernel? I don't want to be a guinea pig :-)

I never thought I'd see the day when root became a process… :shock: :D

Seriously, though, I always thought that the pid would always increment by one with every process spawned on a system, and that the pid should always start with 1, which should only be used by the first ever process spawned on the system when the OS initially starts booting? O.o First time I've ever seen that :)

@Piki:

I never thought I'd see the day when root became a process… :shock: :D

Seriously, though, I always thought that the pid would always increment by one with every process spawned on a system, and that the pid should always start with 1, which should only be used by the first ever process spawned on the system when the OS initially starts booting? O.o First time I've ever seen that :)
Exactly; the error is showing from kernel allocation failures and not userspace overloading; this isn't a typical OOM error, no process is failing :-)

By default the new CentOS deployments use the Latest 3.0 kernel. There should not be any problems as "3.0" is simply "2.6.40". The version numbers were changed by Linus because they were getting too long.

p.s. if anyone wants my kernel logs for some sort of analysis, let me know.

rtucker@hennepin:/var/log$ zgrep -h "kernel:" remote-2600:3c03:*:ce69.log* remote-2600:3c03:*:1dc9.log* | wc
 297278 2893029 31386088

I just thought he changed to 3.0 because that's a better age than 40… After all, wouldn't anybody in their 40's want to go back to being 30? :lol:

Actually, I remember seeing somewhere that it had to do with the 20 year anniversary of the kernel.

@theckman:

Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.
Did not fix the problem.

swapper: page allocation failure: order:3, mode:0x20
Pid: 0, comm: swapper Not tainted 3.0.4-linode38 #1
Call Trace:   
 [<c018b258>] ? warn_alloc_failed+0x98/0x100
 [<c018baa4>] ? __alloc_pages_nodemask+0x3f4/0x630
 [<c014000a>] ? mod_timer_pending+0xba/0x110
 [<c01b22b3>] ? T.833+0xb3/0x2e0
 [<c01b2726>] ? cache_alloc_refill+0x246/0x290
 [<c060f0ff>] ? ipt_do_table+0x24f/0x580
 [<c01b282d>] ? __kmalloc+0xbd/0xd0
 [<c053a7fe>] ? pskb_expand_head+0x12e/0x200
 [<c053ad3d>] ? __pskb_pull_tail+0x4d/0x2b0
 [<c0607ad3>] ? ipv4_confirm+0xf3/0x1b0
 [<c05436dd>] ? dev_hard_start_xmit+0x1dd/0x3e0
 [<c05c8020>] ? ip_finish_output2+0x260/0x260
 [<c05c8020>] ? ip_finish_output2+0x260/0x260
 [<c0557a62>] ? sch_direct_xmit+0xb2/0x170
 [<c05439d9>] ? dev_queue_xmit+0xf9/0x320
 [<c05c815b>] ? ip_finish_output+0x13b/0x300
 [<c05c83ca>] ? ip_output+0xaa/0xe0
 [<c05c7568>] ? ip_local_out+0x18/0x20
 [<c05daf25>] ? tcp_transmit_skb+0x385/0x670
 [<c05dd965>] ? tcp_write_xmit+0x1e5/0x4f0
 [<c05ddcd4>] ? __tcp_push_pending_frames+0x24/0x90
 [<c05d9cf2>] ? tcp_rcv_established+0x3d2/0x610
 [<c05e080e>] ? tcp_v4_do_rcv+0xce/0x1a0
 [<c05e0f99>] ? tcp_v4_rcv+0x6b9/0x7a0
 [<c05c2fc7>] ? ip_local_deliver_finish+0x97/0x220
 [<c05c2f30>] ? ip_rcv+0x320/0x320
 [<c05c298b>] ? ip_rcv_finish+0xfb/0x380
 [<c0540dae>] ? __netif_receive_skb+0x2fe/0x370
 [<c0542597>] ? netif_receive_skb+0x67/0x70
 [<c04e35fc>] ? handle_incoming_queue+0x17c/0x250
 [<c04e38ec>] ? xennet_poll+0x21c/0x540
 [<c0542d2a>] ? net_rx_action+0xea/0x190
 [<c0139cfc>] ? __do_softirq+0x7c/0x110
 [<c0139c80>] ? irq_enter+0x60/0x60
 <irq>[<c0139ade>] ? irq_exit+0x6e/0xa0
 [<c047829d>] ? xen_evtchn_do_upcall+0x1d/0x30
 [<c06c0947>] ? xen_do_upcall+0x7/0xc
 [<c01013a7>] ? hypercall_page+0x3a7/0x1000
 [<c0105c7f>] ? xen_safe_halt+0xf/0x20
 [<c010f41e>] ? default_idle+0x2e/0x60
 [<c0107f72>] ? cpu_idle+0x42/0x70
 [<c086977f>] ? start_kernel+0x2ce/0x2d3
 [<c08692ef>] ? kernel_init+0x112/0x112
 [<c086c943>] ? xen_start_kernel+0x4f7/0x4ff
Mem-Info:
DMA per-cpu:  
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd: 121
CPU    1: hi:  186, btch:  31 usd: 143
CPU    2: hi:  186, btch:  31 usd: 183
CPU    3: hi:  186, btch:  31 usd: 213
active_anon:1695 inactive_anon:2199 isolated_anon:0
 active_file:56223 inactive_file:50078 isolated_file:0
 unevictable:1137 dirty:16 writeback:0 unstable:0
 free:5126 slab_reclaimable:5292 slab_unreclaimable:1992
 mapped:2511 shmem:4 pagetables:291 bounce:0
DMA free:3428kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB active_file:788kB inactive_file:1416kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:1492kB slab_unreclaimable:132kB kernel_stack:72kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 500 500 500
Normal free:17076kB min:2816kB low:3520kB high:4224kB active_anon:6780kB inactive_anon:8796kB active_file:224104kB inactive_file:198896kB unevictable:4548kB isolated(anon):0kB isolated(file):0kB present:512064kB mlocked:4548kB dirty:64kB writeback:0kB mapped:10044kB shmem:16kB slab_reclaimable:19676kB slab_unreclaimable:7836kB kernel_stack:824kB pagetables:1164kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 251*4kB 142*8kB 79*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3404kB
Normal: 4269*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 17076kB
108423 total pagecache pages
1226 pages in swap cache
Swap cache stats: add 86205, delete 84979, find 55713/67312
Free swap  = 251912kB
Total swap = 263164kB
133104 pages RAM
0 pages HighMem
5833 pages reserved
64798 pages shared
66511 pages non-shared

@sweh:

@theckman:

Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.
Did not fix the problem.
Yes and no. It fixed the panic problem that you were having. Now that that has been resolved your Linode is OOMing instead like it should. Tune up your memory usage a bit and you should be good to go.

Good catch HoopyCat. Arg, and I was so glad to be done with this too. Now to stare at it harder.

The body of the trace looks similar to an issue we saw with the amount of RAM the kernel was reserving. What does "sysctl vm.min_free_kbytes" show? Is it less than 16384? If so that was supposed to be fixed already. If not that means more digging.

@psandin:

@sweh:

@theckman:

Try switching to our Latest 3.0 (3.0.4-linode38) kernel. We recently deployed a new kernel that should help with this.

Did not fix the problem.
Yes and no. It fixed the panic problem that you were having. Now that that has been resolved your Linode is OOMing instead like it should. Tune up your memory usage a bit and you should be good to go.
I think you're confusing me with someone else. I was not panic()ing and wasn't OOMing. The kernel was giving page allocation faults inside the kernel itself whenever I rsync (over ssh) my /BACKUP directory back home.

@psandin:

The body of the trace looks similar to an issue we saw with the amount of RAM the kernel was reserving. What does "sysctl vm.min_free_kbytes" show? Is it less than 16384? If so that was supposed to be fixed already. If not that means more digging.
% sysctl vm.min_free_kbytes

vm.min_free_kbytes = 2906

sweh - reboot into Latest 3.0 if you haven't already, and try bumping vm.min_free_kbytes to 4096 or more. Let us know if that fixes it.

Thanks,

-Chris

````

uname -a

Linux linode 3.0.4-linode38 #1 SMP Thu Sep 22 14:59:08 EDT 2011 i686 i686 i386 GNU/Linux

tail -1 /etc/sysctl.conf

vm.min_free_kbytes = 4096

sysctl vm.min_free_kbytes

vm.min_free_kbytes = 4096

````

Machine rebooted. We'll see if it shows up in the next few days!

I wonder what's different about the linode kernels; on my Panix v-colo (Xen based, equiv to a linode512) the value is 2882, and this problem never seems to show.
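For anyone comparing values across machines, here is a small sketch that reads the setting straight from procfs instead of calling `sysctl`. The 4096 target is simply the value suggested above, not a recommendation, and the helper names are hypothetical.

```python
from pathlib import Path

# Standard procfs file behind `sysctl vm.min_free_kbytes`.
MIN_FREE_PATH = Path("/proc/sys/vm/min_free_kbytes")

def parse_min_free(text):
    """Accept either the raw procfs content ('2906') or sysctl-style
    output ('vm.min_free_kbytes = 2906') and return the value in kB."""
    return int(text.strip().split("=")[-1])

def needs_bump(current_kb, target_kb=4096):
    """True if the current reserve is below the value being tried."""
    return current_kb < target_kb

if MIN_FREE_PATH.exists():  # only present on Linux
    current = parse_min_free(MIN_FREE_PATH.read_text())
    print(f"vm.min_free_kbytes = {current}, below 4096: {needs_bump(current)}")
```

With the numbers from this thread, `needs_bump(2906)` is true and `needs_bump(4096)` is false.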

@caker:

sweh - reboot into Latest 3.0 if you haven't already, and try bumping vm.min_free_kbytes to 4096 or more. Let us know if that fixes it.

Thanks,

-Chris

Nope! Still doing it…

swapper: page allocation failure: order:4, mode:0x20
Pid: 0, comm: swapper Not tainted 3.0.4-linode38 #1
Call Trace:
 [<c018b258>] ? warn_alloc_failed+0x98/0x100
 [<c018baa4>] ? __alloc_pages_nodemask+0x3f4/0x630
 [<c014000a>] ? mod_timer_pending+0xba/0x110
 [<c01b22b3>] ? T.833+0xb3/0x2e0
 [<c01b2726>] ? cache_alloc_refill+0x246/0x290
 [<c060f0ff>] ? ipt_do_table+0x24f/0x580
 [<c01b282d>] ? __kmalloc+0xbd/0xd0
 [<c053a7fe>] ? pskb_expand_head+0x12e/0x200
 [<c053ad3d>] ? __pskb_pull_tail+0x4d/0x2b0
 [<c0607ad3>] ? ipv4_confirm+0xf3/0x1b0
etc etc etc

The dump appears to be the same as in the previous message.

I'm having the same problem, but on only one of my Linodes. What I noticed is that it happens on the Linode with a heavy CPU load (around 200-250%).

My temporary solution is to use an older kernel and restart apache/nginx/memcached from time to time, because over time the kernel moves some data to swap even when there is plenty of RAM, and at some point it starts to OOM for no apparent reason. However, that is not a bulletproof solution: after about a week or so the server stops handling any traffic, and in Lish I can see a lot of UFW messages.

I hope that next kernel will fix whatever is causing this.

I'm still getting them with -linode38 and vm.min_free_kbytes = 4096:

==> /var/log/remote-2600:3c03::f03c:91ff:fe96:ce69.log <==
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: swapper: page allocation failure: order:3, mode:0x20
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Pid: 0, comm: swapper Not tainted 3.0.4-linode38 #1
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Call Trace:
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c018b258>] ? warn_alloc_failed+0x98/0x100
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c018baa4>] ? __alloc_pages_nodemask+0x3f4/0x630
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b22b3>] ? T.833+0xb3/0x2e0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b2726>] ? cache_alloc_refill+0x246/0x290
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c060f0ff>] ? ipt_do_table+0x24f/0x580
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b282d>] ? __kmalloc+0xbd/0xd0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c053a7fe>] ? pskb_expand_head+0x12e/0x200
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c053ad3d>] ? __pskb_pull_tail+0x4d/0x2b0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0607ad3>] ? ipv4_confirm+0xf3/0x1b0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05436dd>] ? dev_hard_start_xmit+0x1dd/0x3e0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c8020>] ? ip_finish_output2+0x260/0x260
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c8020>] ? ip_finish_output2+0x260/0x260
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0557a62>] ? sch_direct_xmit+0xb2/0x170
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05439d9>] ? dev_queue_xmit+0xf9/0x320
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c815b>] ? ip_finish_output+0x13b/0x300
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c83ca>] ? ip_output+0xaa/0xe0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c7568>] ? ip_local_out+0x18/0x20
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05daf25>] ? tcp_transmit_skb+0x385/0x670
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05dd965>] ? tcp_write_xmit+0x1e5/0x4f0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05ddcd4>] ? __tcp_push_pending_frames+0x24/0x90
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05d9cf2>] ? tcp_rcv_established+0x3d2/0x610
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05e080e>] ? tcp_v4_do_rcv+0xce/0x1a0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05e0f99>] ? tcp_v4_rcv+0x6b9/0x7a0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c2fc7>] ? ip_local_deliver_finish+0x97/0x220
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c2f30>] ? ip_rcv+0x320/0x320
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c298b>] ? ip_rcv_finish+0xfb/0x380
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0540dae>] ? __netif_receive_skb+0x2fe/0x370
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0542597>] ? netif_receive_skb+0x67/0x70
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c04e35fc>] ? handle_incoming_queue+0x17c/0x250
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c04e38ec>] ? xennet_poll+0x21c/0x540
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0542d2a>] ? net_rx_action+0xea/0x190
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0139cfc>] ? __do_softirq+0x7c/0x110
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0139c80>] ? irq_enter+0x60/0x60
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: <irq>[<c0139ade>] ? irq_exit+0x6e/0xa0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c047829d>] ? xen_evtchn_do_upcall+0x1d/0x30
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c06c0947>] ? xen_do_upcall+0x7/0xc
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01013a7>] ? hypercall_page+0x3a7/0x1000
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0105c7f>] ? xen_safe_halt+0xf/0x20
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c010f41e>] ? default_idle+0x2e/0x60
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0107f72>] ? cpu_idle+0x42/0x70
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c086977f>] ? start_kernel+0x2ce/0x2d3
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c08692ef>] ? kernel_init+0x112/0x112
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c086c943>] ? xen_start_kernel+0x4f7/0x4ff
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Mem-Info:
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: DMA per-cpu:
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Normal per-cpu:
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    0: hi:  186, btch:  31 usd: 139
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    1: hi:  186, btch:  31 usd:  27
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    2: hi:  186, btch:  31 usd:  23
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    3: hi:  186, btch:  31 usd: 167
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: active_anon:35584 inactive_anon:35624 isolated_anon:0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: active_file:19975 inactive_file:20016 isolated_file:0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: unevictable:0 dirty:16 writeback:868 unstable:0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: free:9374 slab_reclaimable:1040 slab_unreclaimable:2331
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: mapped:6137 shmem:29185 pagetables:415 bounce:0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: DMA free:2532kB min:120kB low:148kB high:180kB active_anon:1724kB inactive_anon:1916kB active_file:76kB inactive_file:184kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB mapped:16kB shmem:16kB slab_reclaimable:136kB slab_unreclaimable:520kB kernel_stack:0kB pagetables:132kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: lowmem_reserve[]: 0 500 500 500
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Normal free:34964kB min:3972kB low:4964kB high:5956kB active_anon:140612kB inactive_anon:140580kB active_file:79824kB inactive_file:79880kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:512064kB mlocked:0kB dirty:64kB writeback:3472kB mapped:24532kB shmem:116724kB slab_reclaimable:4024kB slab_unreclaimable:8804kB kernel_stack:600kB pagetables:1528kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: lowmem_reserve[]: 0 0 0 0
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: DMA: 221*4kB 182*8kB 12*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2532kB
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Normal: 5751*4kB 1367*8kB 64*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 34964kB
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: 80575 total pagecache pages
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: 11428 pages in swap cache
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Swap cache stats: add 3503073, delete 3491645, find 8773626/9187228
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Free swap  = 151020kB
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: Total swap = 262140kB
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: 133104 pages RAM
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: 0 pages HighMem
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: 5833 pages reserved
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: 21491 pages shared
Oct  1 16:21:03 2600:3c03::f03c:91ff:fe96:ce69 kernel: 102684 pages non-shared

For data nerds, some annotated grepping and counting:

$ zgrep -h "page allocation failure" remote-2600:3c03:*:1dc9.log* | cut -b1-6 | sort | uniq -c
Sep 4: 2.6.39.1-linode34 #1
     11 Sep  4
     16 Sep  5
      3 Sep  6
      3 Sep  7
     16 Sep  8
     12 Sep  9
     11 Sep 10
      8 Sep 11
      4 Sep 12
      7 Sep 13
      5 Sep 14
     21 Sep 15
     18 Sep 16
     14 Sep 17
     10 Sep 18
     21 Sep 19
     12 Sep 20
     14 Sep 21
     13 Sep 22
    634 Sep 23
      6 Sep 24
Sep 25: 3.0.4-linode38 #1
      5 Sep 25
     24 Sep 26
     18 Sep 27
     41 Sep 28
      8 Sep 29
     52 Sep 30
    211 Oct  1

$ zgrep -h "page allocation failure" remote-2600:3c03:*:ce69.log* | cut -b1-6 | sort | uniq -c
Aug 19: 2.6.39.1-linode34 #1
     22 Aug 19
     34 Aug 20
     41 Aug 21
     36 Aug 22
     43 Aug 23
     37 Aug 24
     36 Aug 25
     50 Aug 26
     43 Aug 27
     75 Aug 28
     57 Aug 29
     58 Aug 30
     81 Aug 31
     55 Sep  1
     80 Sep  2
     62 Sep  3
(gap due to the remote logging system being broken)
Sep 17: 3.0.4-linode36 #1
     10 Sep 17
     88 Sep 18
     54 Sep 19
     53 Sep 20
     32 Sep 21
     66 Sep 22
    148 Sep 23
     48 Sep 24
    163 Sep 25
Sep 26: 3.0.4-linode38 #1
     92 Sep 26
    223 Sep 27
    224 Sep 28
    238 Sep 29
    170 Sep 30
    145 Oct  1

On the first system (1dc9), note the spike on September 23… that happens to correspond to a full backup rsync, but there was also one on September 6. For the second system (ce69), there was also a full backup rsync on September 23. (I was testing the backup server that day.)
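The `zgrep | cut | sort | uniq -c` counting above can also be done in a script, which makes it easier to graph later. This is a rough sketch of the same idea; the function names are hypothetical, `host` in the sample lines is a placeholder, and the sample lines are illustrative rather than real log excerpts.

```python
import gzip
from collections import Counter

def open_maybe_gzip(path):
    """Open plain or gzip-compressed logs as text, similar to zgrep."""
    with open(path, "rb") as probe:
        magic = probe.read(2)
    if magic == b"\x1f\x8b":  # gzip magic number
        return gzip.open(path, "rt", errors="replace")
    return open(path, "rt", errors="replace")

def count_by_day(lines, needle="page allocation failure"):
    """Count matching lines per syslog date prefix (first 6 characters,
    e.g. 'Oct  1'), like `cut -b1-6 | sort | uniq -c`."""
    counts = Counter()
    for line in lines:
        if needle in line:
            counts[line[:6]] += 1
    return counts

def count_files(paths):
    """Sum the per-day counts over several log files."""
    total = Counter()
    for path in paths:
        with open_maybe_gzip(path) as handle:
            total.update(count_by_day(handle))
    return total

# Illustrative lines in the same syslog format as the logs in this thread.
sample = [
    "Oct  1 16:21:03 host kernel: swapper: page allocation failure: order:3, mode:0x20",
    "Oct  1 16:21:03 host kernel: Pid: 0, comm: swapper Not tainted 3.0.4-linode38 #1",
    "Sep 30 11:53:20 host kernel: lighttpd: page allocation failure: order:3, mode:0x20",
    "Oct  1 17:02:11 host kernel: swapper: page allocation failure: order:3, mode:0x20",
]
for day, n in sorted(count_by_day(sample).items()):
    print(f"{n:7d} {day}")
```

Only the first line of each failure report contains the search string, so each report is counted once regardless of how long its trace is.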

That doesn't look like an OOM:

Free swap  = 251912kB 
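The free-list lines in the same dump point at fragmentation rather than exhaustion: an order:3 request needs 2^3 = 8 contiguous pages (32 kB with 4 kB pages), and the Normal zone shows only 4 kB blocks free. Here is a simplified sketch of that reading. It ignores watermarks and lowmem reserves, which the real allocator also applies, and the helper names are hypothetical.

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, matching the "N*4kB" entries in the dump

def parse_buddy_line(line):
    """Map block size in kB -> number of free blocks, from a line such as
    'Normal: 4269*4kB 0*8kB ... = 17076kB'."""
    return {int(size): int(count)
            for count, size in re.findall(r"(\d+)\*(\d+)kB", line)}

def can_satisfy(free_blocks, order):
    """True if some free block holds at least 2**order contiguous pages.
    Simplified: no watermark or reserve checks."""
    needed_kb = PAGE_KB << order
    return any(count > 0 and size >= needed_kb
               for size, count in free_blocks.items())

# Normal-zone line from the order:3 failure posted earlier in the thread.
normal = ("Normal: 4269*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
          "0*512kB 0*1024kB 0*2048kB 0*4096kB = 17076kB")
blocks = parse_buddy_line(normal)
total_free_kb = sum(size * count for size, count in blocks.items())
print(total_free_kb)            # total free memory in the zone, in kB
print(can_satisfy(blocks, 3))   # the failing request was order:3
print(can_satisfy(blocks, 0))   # single pages are still available
```

So roughly 17 MB is free in that zone, yet no single free block is large enough for the request, which fits the view that this is not an ordinary out-of-memory condition.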

I took a look at my logs this morning, and indeed, I'm seeing it on my busy nodes. On both, vm.min_free_kbytes = 2906. I'm upping to 4096 and rebooting three times to see if that helps.

==> /var/log/remote-2600:3c03::f03c:91ff:fe96:ce69.log <==
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: lighttpd: page allocation failure: order:3, mode:0x20
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Pid: 2098, comm: lighttpd Not tainted 3.0.4-linode38 #1
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Call Trace:
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c018b258>] ? warn_alloc_failed+0x98/0x100
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c018baa4>] ? __alloc_pages_nodemask+0x3f4/0x630
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c014000a>] ? mod_timer_pending+0xba/0x110
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b22b3>] ? T.833+0xb3/0x2e0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b2726>] ? cache_alloc_refill+0x246/0x290
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c060f0ff>] ? ipt_do_table+0x24f/0x580
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b282d>] ? __kmalloc+0xbd/0xd0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c053a7fe>] ? pskb_expand_head+0x12e/0x200
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c053ad3d>] ? __pskb_pull_tail+0x4d/0x2b0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0607ad3>] ? ipv4_confirm+0xf3/0x1b0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05436dd>] ? dev_hard_start_xmit+0x1dd/0x3e0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c8020>] ? ip_finish_output2+0x260/0x260
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c8020>] ? ip_finish_output2+0x260/0x260
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0557a62>] ? sch_direct_xmit+0xb2/0x170
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05439d9>] ? dev_queue_xmit+0xf9/0x320
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c815b>] ? ip_finish_output+0x13b/0x300
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c83ca>] ? ip_output+0xaa/0xe0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05c7568>] ? ip_local_out+0x18/0x20
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05daf25>] ? tcp_transmit_skb+0x385/0x670
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05dd965>] ? tcp_write_xmit+0x1e5/0x4f0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05ddcd4>] ? __tcp_push_pending_frames+0x24/0x90
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05d0aae>] ? tcp_sendmsg+0x81e/0xb30
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0105c27>] ? xen_force_evtchn_callback+0x17/0x30
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0106404>] ? check_events+0x8/0xc
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05eed37>] ? inet_sendmsg+0x47/0xb0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0532366>] ? sock_aio_write+0x116/0x170
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05d8520>] ? tcp_clean_rtx_queue+0x590/0x8b0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0614a8e>] ? bictcp_cong_avoid+0x1e/0x420
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0532250>] ? sock_aio_dtor+0x10/0x10
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b5036>] ? do_sync_readv_writev+0xb6/0xf0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0105c27>] ? xen_force_evtchn_callback+0x17/0x30
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b52b6>] ? rw_verify_area+0x66/0x120
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b57ba>] ? do_readv_writev+0xaa/0x1a0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0532250>] ? sock_aio_dtor+0x10/0x10
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c0535233>] ? sock_common_setsockopt+0x23/0x30
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c05338ec>] ? sys_setsockopt+0x6c/0xd0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b58ee>] ? vfs_writev+0x3e/0x60
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c01b5a11>] ? sys_writev+0x41/0xa0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c06bfb91>] ? syscall_call+0x7/0xb
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: [<c06b0000>] ? sctp_err_lookup+0x90/0x110
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Mem-Info:
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: DMA per-cpu:
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Normal per-cpu:
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    0: hi:  186, btch:  31 usd: 152
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    1: hi:  186, btch:  31 usd: 157
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    2: hi:  186, btch:  31 usd: 178
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: CPU    3: hi:  186, btch:  31 usd:  50
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: active_anon:36638 inactive_anon:36659 isolated_anon:32
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: active_file:21023 inactive_file:24423 isolated_file:0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: unevictable:0 dirty:12 writeback:364 unstable:0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: free:1439 slab_reclaimable:1247 slab_unreclaimable:2337
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: mapped:4976 shmem:26429 pagetables:368 bounce:0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: DMA free:2168kB min:84kB low:104kB high:124kB active_anon:2104kB inactive_anon:2220kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:1120kB mapped:12kB shmem:16kB slab_reclaimable:412kB slab_unreclaimable:200kB kernel_stack:16kB pagetables:80kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: lowmem_reserve[]: 0 500 500 500
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Normal free:3588kB min:2816kB low:3520kB high:4224kB active_anon:144448kB inactive_anon:144416kB active_file:84092kB inactive_file:97692kB unevictable:0kB isolated(anon):128kB isolated(file):0kB present:512064kB mlocked:0kB dirty:48kB writeback:336kB mapped:19892kB shmem:105700kB slab_reclaimable:4576kB slab_unreclaimable:9148kB kernel_stack:568kB pagetables:1392kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: lowmem_reserve[]: 0 0 0 0
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: DMA: 23*4kB 132*8kB 63*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2156kB
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Normal: 421*4kB 228*8kB 5*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3588kB
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: 91700 total pagecache pages
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: 19833 pages in swap cache
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Swap cache stats: add 10930739, delete 10910906, find 34210627/35708199
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Free swap  = 116176kB
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: Total swap = 262140kB
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: 133104 pages RAM
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: 0 pages HighMem
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: 5833 pages reserved
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: 18151 pages shared
Sep 30 11:53:20 2600:3c03::f03c:91ff:fe96:ce69 kernel: 110656 pages non-shared
==> /var/log/remote-2600:3c03::f03c:91ff:fe96:1dc9.log <==
Sep 30 13:30:15 2600:3c03::f03c:91ff:fe96:1dc9 kernel: nginx: page allocation failure: order:1, mode:0x20
Sep 30 13:30:15 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Pid: 18595, comm: nginx Not tainted 3.0.4-linode38 #1
Sep 30 13:30:15 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Call Trace:
Sep 30 13:30:16 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c018b258>] ? warn_alloc_failed+0x98/0x100
Sep 30 13:30:16 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c018baa4>] ? __alloc_pages_nodemask+0x3f4/0x630
Sep 30 13:30:16 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c01b22b3>] ? T.833+0xb3/0x2e0
Sep 30 13:30:16 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c01b2726>] ? cache_alloc_refill+0x246/0x290
Sep 30 13:30:16 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0105c27>] ? xen_force_evtchn_callback+0x17/0x30
Sep 30 13:30:16 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c01b20e9>] ? kmem_cache_alloc+0x79/0x90
Sep 30 13:30:16 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0536adf>] ? sk_prot_alloc+0x2f/0x100
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0536c75>] ? sk_clone+0x15/0x260
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05cc14b>] ? inet_csk_clone+0xb/0xb0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05e28a7>] ? tcp_create_openreq_child+0x17/0x400
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c060f0ff>] ? ipt_do_table+0x24f/0x580
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05e10b1>] ? tcp_v4_syn_recv_sock+0x31/0x1c0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05e2610>] ? tcp_check_req+0x200/0x480
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0607ad3>] ? ipv4_confirm+0xf3/0x1b0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05e0694>] ? tcp_v4_hnd_req+0x54/0x100
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c057fb3c>] ? nf_iterate+0x6c/0x90
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05e0861>] ? tcp_v4_do_rcv+0x121/0x1a0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0552977>] ? sk_filter+0x17/0x80
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05e0f99>] ? tcp_v4_rcv+0x6b9/0x7a0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05c2fc7>] ? ip_local_deliver_finish+0x97/0x220
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05c2f30>] ? ip_rcv+0x320/0x320
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05c298b>] ? ip_rcv_finish+0xfb/0x380
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05c2890>] ? inet_del_protocol+0x30/0x30
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0540dae>] ? __netif_receive_skb+0x2fe/0x370
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0540eea>] ? process_backlog+0xca/0x1a0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0542d2a>] ? net_rx_action+0xea/0x190
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0139cfc>] ? __do_softirq+0x7c/0x110
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0139c80>] ? irq_enter+0x60/0x60
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: <IRQ> [<c0139b91>] ? local_bh_enable_ip+0x71/0x80
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c05ef195>] ? inet_stream_connect+0x55/0x1b0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0533c05>] ? sys_connect+0xd5/0xf0
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0105c27>] ? xen_force_evtchn_callback+0x17/0x30
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0106404>] ? check_events+0x8/0xc
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c01063fb>] ? xen_restore_fl_direct_reloc+0x4/0x4
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c06bf8b1>] ? _raw_spin_unlock_irqrestore+0x11/0x20
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0105c27>] ? xen_force_evtchn_callback+0x17/0x30
Sep 30 13:30:26 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0106404>] ? check_events+0x8/0xc
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c01063fb>] ? xen_restore_fl_direct_reloc+0x4/0x4
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c06bf8b1>] ? _raw_spin_unlock_irqrestore+0x11/0x20
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c01e989d>] ? ep_insert+0x14d/0x220
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c0534c2e>] ? sys_socketcall+0x28e/0x2e0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c06bfb91>] ? syscall_call+0x7/0xb
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: [<c06b0000>] ? sctp_err_lookup+0x90/0x110
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Mem-Info:
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: DMA per-cpu:
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    0: hi:    0, btch:   1 usd:   0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    1: hi:    0, btch:   1 usd:   0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    2: hi:    0, btch:   1 usd:   0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    3: hi:    0, btch:   1 usd:   0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Normal per-cpu:
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    0: hi:  186, btch:  31 usd:  28
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    1: hi:  186, btch:  31 usd:  30
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    2: hi:  186, btch:  31 usd:   0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: CPU    3: hi:  186, btch:  31 usd:   0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: active_anon:10230 inactive_anon:10362 isolated_anon:32
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: active_file:35748 inactive_file:54897 isolated_file:0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: unevictable:0 dirty:157 writeback:649 unstable:0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: free:1246 slab_reclaimable:4855 slab_unreclaimable:6029
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: mapped:7045 shmem:1884 pagetables:736 bounce:0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: DMA free:2092kB min:84kB low:104kB high:124kB active_anon:944kB inactive_anon:1016kB active_file:0kB inactive_file:196kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:888kB mapped:0kB shmem:0kB slab_reclaimable:2308kB slab_unreclaimable:448kB kernel_stack:112kB pagetables:160kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: lowmem_reserve[]: 0 500 500 500
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Normal free:2892kB min:2816kB low:3520kB high:4224kB active_anon:39976kB inactive_anon:40432kB active_file:142992kB inactive_file:219392kB unevictable:0kB isolated(anon):128kB isolated(file):0kB present:512064kB mlocked:0kB dirty:628kB writeback:1708kB mapped:28180kB shmem:7536kB slab_reclaimable:17112kB slab_unreclaimable:23668kB kernel_stack:1328kB pagetables:2784kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: lowmem_reserve[]: 0 0 0 0
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: DMA: 523*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2092kB
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Normal: 589*4kB 57*8kB 5*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 2892kB
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: 95982 total pagecache pages
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: 3433 pages in swap cache
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Swap cache stats: add 5523129, delete 5519696, find 39134429/39792794
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Free swap  = 95340kB
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: Total swap = 262140kB
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: 133104 pages RAM
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: 0 pages HighMem
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: 5833 pages reserved
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: 60210 pages shared
Sep 30 13:30:27 2600:3c03::f03c:91ff:fe96:1dc9 kernel: 93821 pages non-shared

Hmm, funky… a couple of days ago I made some DNS changes. One of the primary impacts of the DNS changes is that I now talk IPv6 between my home and my linode; previously I spoke IPv4.

And I haven't seen any page allocation errors in the past 2 days, either.

Coincidence?

I'm getting the same sort of messages. After reading this thread, I can guess what I did to trigger it. Posting here in case it helps someone somewhere.

I recently enabled IPv6 on a linode, and used the reboot to switch to a version 3 kernel at the same time. Everything was running fine, except one remote system could not connect anymore to exchange data. Problem was obvious: the system was now trying to connect via IPv6, but that particular service was not set up to accept IPv6 connections.

I took the easy way out, and told the remote system to always use IPv4 to connect. The system could now connect, but I started getting log messages similar to those others have posted.

Thought at first it might be related to vm.min_free_kbytes being too low (had a similar problem on another linode months ago, with an older kernel). So I upped that. But the log messages persisted.

Then I found this thread. Will now change the service to accept IPv6 connections, and see what happens.

Of course, it's pretty worrying if there's a problem with IPv4 connections. I'll hold off updating the kernel on my web-serving linodes. They already do IPv6, but use older kernels.

I also have a server which keeps going OOM for no apparent reason. Just stumbled across this thread and I have the same messages in my kernel log.

I'm running the latest 3.0 kernel (3.0.4-linode38) with Debian Squeeze 32-bit on a 512 MB Linode.

Have increased min_free_kbytes to 4096 as suggested and will see if that fixes the problem…

I see. Well, I've posted in this thread as it seems I'm experiencing the same bizarre problem with the same setup. I'd be very grateful for some advice on how to troubleshoot. I appear to have buckets of free memory but it's still going into swap!

Has anyone made progress on the page-allocation-failure-despite-plenty-of-free-memory issue? I too am on 3.0.4-linode38 and have vm.min_free_kbytes = 8192 and am still getting it, though I just found this thread a couple of hours ago and the normal periods where it's most likely to show up (during an rsync) haven't fired yet.

And indeed, it is seemingly based on IPv4 traffic somehow. Usually swapper/kswapd0 is the process implicated, but I've also seen sshd, rsync, munin-graph (?), iostat (!?)… all with ipv4_confirm et al in the stack trace.

Just grow your min_free_kbytes setting until you get to a place where the warnings stop.

And, if you have a high level of iowaits, check if you can reduce the disk contention - then you may "sneak past" with a lower value of min_free_kbytes.

I've been bumping it with the hopes that the problem eventually goes away, but given my amount of free memory and no anomalies in the Munin graphs on these boxes, I would think vm.min_free_kbytes is a red herring. But either way, it seems like a regression, and if Linode folks need guinea pigs, I'm saying I'm game to test some kernels or whatever. :)

I am bumping it as well. Same for me. NJ Linode uname -r returns 3.0.18-linode43

No IPV6… maybe it's not disabled completely? maybe I need to go to a different kernel… I dun know for sure but help :x

Maybe the problem is just that -seeing the errors. I saw them again today and this time anyway rsync wasn't involved. I did some more researching…(i.e- thinking) yeah I know I shouldn't do that… but…

I upped vm.min_free_kbytes (to 5120, was 4096) and shut off a daemon that was using way too much RAM and was really idle (not used). I'll see if the errors at least slow down. I hope my attempts at being prudent aren't disturbing an otherwise great forum; I just don't want this to escalate into a 'shark byte' just because I only thought it was a 'red herring'. :wink: Thanks for the thread and reply!

Just wanted to chime in on this thread. Since Debian Lenny recently went beyond its support date, I begrudgingly decided to upgrade to Squeeze. I also switched from the deprecated 2.6 kernel I was using to 3.0.

The upgrade process certainly wasn't without its frustrating issues, breaking stuff from MySQL to Postfix. Heck, even PHP was complaining, thanks to suhosin. But I finally thought I got all the problems wrangled. Except this one. And it's pretty much the same'ol thing:

kernel: swapper: page allocation failure: order:5, mode:0x20
kernel: Pid: 0, comm: swapper Not tainted 3.0.18-linode43 #1
(etc)

I tried changing vm.min_free_kbytes but that made no difference. And I honestly don't know if this error is causing any kind of problem or not. Does it mean a process is requesting more memory than is available, and then when it tries to swap, there's a failure, and then maybe the allocation request in the program simply returns an error/null? I haven't seen anything else crashing, so I just don't know what the consequence is.

Anyhoo, just wanted to add myself to the list of people with the potential problem.

I haven't seen the problem since March 8th, so only once since I set the following:

vm.min_free_kbytes = 5120.

incidentally I have a linode 512 :?

@FyberOptic:

Does it mean a process is requesting more memory than is available, and then when it tries to swap, there's a failure, and then maybe the allocation request in the program simply returns an error/null?
No. This is happening in kernel space, not user space. My linode has more free memory than used memory by an order of magnitude (used 55, free 439… I have a lot of IO buffers!). I think pid0 is just being displayed because it's a kernel thread and not related to a user process.
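For anyone trying to read these messages, here is a rough sketch of what the `order:N` figure means. It assumes the usual 4 kB page size on 32-bit x86 (consistent with the `*4kB` columns in the dumps posted in this thread): an order-N request asks for 2^N physically contiguous pages. The function names are made up for illustration, not anything from the kernel.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 kB pages, as on the 32-bit x86 kernels in this thread

# Matches both the 2.6 style ("failure. order:2") and the 3.x style ("failure: order:5").
FAILURE_RE = re.compile(
    r"(?P<comm>\S+): page allocation failure[.:] "
    r"order:(?P<order>\d+), mode:(?P<mode>0x[0-9a-fA-F]+)"
)

def order_to_kb(order):
    """Size in kB of one contiguous block of 2**order pages."""
    return (2 ** order) * PAGE_SIZE_KB

def parse_failure(line):
    """Pull the process name, order and gfp mode out of a failure header line."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    order = int(m.group("order"))
    return {
        "comm": m.group("comm"),
        "order": order,
        "mode": int(m.group("mode"), 16),
        "contiguous_kb": order_to_kb(order),
    }

if __name__ == "__main__":
    for line in (
        "kernel: nginx: page allocation failure. order:2, mode:0x20",
        "kernel: swapper: page allocation failure: order:5, mode:0x20",
    ):
        print(parse_failure(line))
```

So an order:2 failure is a request for 16 kB in one piece, and order:5 is 128 kB in one piece, which is a different thing from the total amount of free memory.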

Hello,

I've updated the value with sysctl and added it to sysctl.conf.

Then applied sysctl -p.

Do I need to reboot too?

thanks?

This has plagued me for quite a long time on my CentOS 5.8 32-bit VM.

I have upgraded to kernel 2.6.x and 3.x several times and eventually downgraded to the legacy 2.6.18 because of this swapper error.

The latest 3.0 and 3.1, as well as 3.2, all have the same symptom. Not OOM.

ywliu

As MotoHoss suggested above, I tried for my Linode 512..

vm.min_free_kbytes = 5120

And have not seen an error since.

I'm seeing this in London too, running 3.0.18-linode43. I've even tried increasing min_free_kbytes to 16384, but it does not appear to help.

Recently encountered this problem on my Linode 512.

From syslog:

kernel: nginx: page allocation failure. order:2, mode:0x20

kernel: Pid: 5003, comm: nginx Not tainted 2.6.39.1-linode34 #1

--

Doing as suggested:

vm.min_free_kbytes=5120

Not getting the debugging error on syslog since. :)

edwinlee:

2.6.39.1-linode34? That's really old, and subject to a serious security vulnerability. You should upgrade.

Also, the problem you ran in to is most likely a bug that has since been fixed. But yeah, you should get off that kernel. :)

-Tim

I'm on kernel 3.4.2 (vm.min_free_kbytes = 4096) and still getting page allocation errors.

My OSSEC emails are very spammy.

I'm also running into very similar errors. I'm now running 3.4.2-linode44, but also had issues on 2.6 kernels. I'm reasonably sure it's triggered by my daily rsync backup jobs, which correlates with the network layer functions in the backtrace.

swapper/0: page allocation failure: order:5, mode:0x20
Pid: 0, comm: swapper/0 Not tainted 3.4.2-linode44 #1
Call Trace:
 [<c019b538>] ? warn_alloc_failed+0x98/0x100
 [<c019bfa8>] ? __alloc_pages_nodemask+0x4d8/0x6e0
 [<c01c3b41>] ? T.874+0x31/0xe0
 [<c01c3e31>] ? T.871+0x91/0x250
 [<c01c423e>] ? cache_alloc_refill+0x24e/0x290
 [<c01c433e>] ? __kmalloc+0xbe/0xd0
 [<c056232e>] ? pskb_expand_head+0x12e/0x240
 [<c01c32d2>] ? kmem_cache_free+0x42/0x60
 [<c05628cd>] ? __pskb_pull_tail+0x4d/0x2a0
 [<c05675cf>] ? netif_skb_features+0xaf/0xc0
 [<c056b9dd>] ? dev_hard_start_xmit+0x1ed/0x410
 [<c05a876c>] ? nf_iterate+0x6c/0x90
 [<c057fd7a>] ? sch_direct_xmit+0xba/0x180
 [<c05fb7f0>] ? ip_finish_output2+0x280/0x280
 [<c056bcff>] ? dev_queue_xmit+0xff/0x340
 [<c05fad58>] ? ip_local_out+0x18/0x20
 [<c060eb65>] ? tcp_transmit_skb+0x395/0x660
 [<c06114cd>] ? tcp_write_xmit+0x1dd/0x500
 [<c0611854>] ? __tcp_push_pending_frames+0x24/0x90
 [<c060d922>] ? tcp_rcv_established+0x3d2/0x610
 [<c05f6600>] ? ip_rcv+0x330/0x330
 [<c061413b>] ? tcp_v4_do_rcv+0xbb/0x190
 [<c06148cd>] ? tcp_v4_rcv+0x6bd/0x7a0
 [<c05f6697>] ? ip_local_deliver_finish+0x97/0x220
 [<c05f6600>] ? ip_rcv+0x330/0x330
 [<c05f6067>] ? ip_rcv_finish+0xd7/0x340
 [<c0568dc3>] ? __netif_receive_skb+0x2c3/0x350
 [<c056a79f>] ? netif_receive_skb+0x1f/0x70
 [<c0506b51>] ? handle_incoming_queue+0x1a1/0x270
 [<c0506e44>] ? xennet_poll+0x224/0x570
 [<c056afda>] ? net_rx_action+0xea/0x1a0
 [<c0131c4c>] ? __do_softirq+0x7c/0x110
 [<c0131bd0>] ? irq_enter+0x70/0x70
 <IRQ>  [<c0131a26>] ? irq_exit+0x66/0x90
 [<c049414d>] ? xen_evtchn_do_upcall+0x1d/0x30
 [<c06f3147>] ? xen_do_upcall+0x7/0xc
 [<c01013a7>] ? hypercall_page+0x3a7/0x1000
 [<c010603f>] ? xen_safe_halt+0xf/0x20
 [<c010fcac>] ? default_idle+0x1c/0x40
 [<c010feea>] ? cpu_idle+0x4a/0x80
 [<c089986e>] ? start_kernel+0x2e3/0x2e8
 [<c0899406>] ? kernel_init+0x127/0x127
 [<c089c813>] ? xen_start_kernel+0x520/0x528
Mem-Info:
DMA per-cpu:
CPU    0: hi:    0, btch:   1 usd:   0
CPU    1: hi:    0, btch:   1 usd:   0
CPU    2: hi:    0, btch:   1 usd:   0
CPU    3: hi:    0, btch:   1 usd:   0
Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  91
CPU    1: hi:  186, btch:  31 usd:  23
CPU    2: hi:  186, btch:  31 usd: 130
CPU    3: hi:  186, btch:  31 usd:  86
HighMem per-cpu:
CPU    0: hi:   18, btch:   3 usd:  10
CPU    1: hi:   18, btch:   3 usd:  17
CPU    2: hi:   18, btch:   3 usd:  14
CPU    3: hi:   18, btch:   3 usd:   4
active_anon:35793 inactive_anon:49055 isolated_anon:0
 active_file:41148 inactive_file:36416 isolated_file:0
 unevictable:0 dirty:437 writeback:0 unstable:0
 free:16325 slab_reclaimable:4754 slab_unreclaimable:2843
 mapped:29371 shmem:11542 pagetables:1298 bounce:0
DMA free:3904kB min:108kB low:132kB high:160kB active_anon:0kB inactive_anon:484kB active_file:1228kB inactive_file:1232kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15808kB mlocked:0kB dirty:0kB writeback:0kB mapped:100kB shmem:0kB slab_reclaimable:28kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 700 754 754
Normal free:61192kB min:5008kB low:6260kB high:7512kB active_anon:132192kB inactive_anon:183580kB active_file:157112kB inactive_file:135248kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:717288kB mlocked:0kB dirty:1740kB writeback:0kB mapped:106336kB shmem:44056kB slab_reclaimable:18988kB slab_unreclaimable:11372kB kernel_stack:1864kB pagetables:5192kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 428 428
HighMem free:204kB min:128kB low:220kB high:316kB active_anon:10980kB inactive_anon:12156kB active_file:6252kB inactive_file:9184kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:54868kB mlocked:0kB dirty:8kB writeback:0kB mapped:11048kB shmem:2112kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 684*4kB 112*8kB 17*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3904kB
Normal: 4854*4kB 3130*8kB 900*16kB 51*32kB 11*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 61192kB
HighMem: 11*4kB 2*8kB 3*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 204kB
100731 total pagecache pages
11592 pages in swap cache
Swap cache stats: add 205947, delete 194355, find 1954619/1973411
Free swap  = 446976kB
Total swap = 524284kB
198640 pages RAM
13826 pages HighMem
6561 pages reserved
155807 pages shared
118099 pages non-shared
SLAB: Unable to allocate memory on node 0 (gfp=0x20)
  cache: size-131072, object size: 131072, order: 5
  node 0: slabs: 0/0, objs: 0/0, free: 0
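As a reading aid for the dump above: the `Normal:` line lists free blocks by size, and an order:5 request (128 kB with 4 kB pages, matching the `size-131072` SLAB cache in the last lines) needs at least one free block that large. Below is a minimal sketch that parses such a line, assuming the format shown and a line without the syslog timestamp prefix. It ignores watermarks and the reserves the allocator also checks, so treat it as an illustration rather than what the kernel actually does.

```python
import re

PAGE_SIZE_KB = 4  # assumption: 4 kB pages

def parse_buddy_line(line):
    """Return (zone, {block_size_kb: free_block_count}) for a bare line like
    'Normal: 4854*4kB 3130*8kB ... = 61192kB' (no syslog prefix)."""
    zone, _, rest = line.partition(":")
    blocks = {int(size): int(count)
              for count, size in re.findall(r"(\d+)\*(\d+)kB", rest)}
    return zone.strip(), blocks

def has_block_for_order(blocks, order):
    """True if some free block is at least 2**order pages large."""
    needed_kb = (2 ** order) * PAGE_SIZE_KB
    return any(count > 0 for size, count in blocks.items() if size >= needed_kb)

if __name__ == "__main__":
    line = ("Normal: 4854*4kB 3130*8kB 900*16kB 51*32kB 11*64kB 0*128kB "
            "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 61192kB")
    zone, blocks = parse_buddy_line(line)
    total_kb = sum(size * count for size, count in blocks.items())
    print(zone, "free:", total_kb, "kB")
    print("order 5 block available:", has_block_for_order(blocks, 5))
```

On the `Normal:` line from this dump there is about 60 MB free in total, but every column from 128 kB upwards is zero, so there is no single free block big enough for the order:5 request.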

Hi,

mnordhoff:

Thanks for heads up. I have checked on the security issue relating to kernel version 2.6.39.

---

Vulnerability Summary for CVE-2012-0056

The mem_write function in Linux kernel 2.6.39 and other versions, when ASLR is disabled, does not properly check permissions when writing to /proc/<pid>/mem, which allows local users to gain privileges by modifying process memory


I have also run the exploit (mempodipper)- http://blog.zx2c4.com/749

on my linodes with a normal login account and did not gain root shell.


pentester@mercury:~$ whoami && id

pentester

uid=1012(pentester) gid=1011(pentester) groups=1011(pentester)

pentester@mercury:~$ ./mempodipper

===============================

= Mempodipper =

= by zx2c4 =

= Jan 21, 2012 =

===============================

[+] Waiting for transferred fd in parent.

[+] Executing child from child fork.

[+] Received fd at -1.

[-] recv_fd: Address already in use

pentester@mercury:~$ [+] Opening parent mem /proc/11146/mem in child.

[-] open: No such file or directory

pentester@mercury:~$ whoami && id

pentester

uid=1012(pentester) gid=1011(pentester) groups=1011(pentester)

---

Confirmed this vulnerability on Debian - http://security-tracker.debian.org/tracker/CVE-2012-0056 (linux-2.6 source, squeeze: not affected)


pentester@mercury:~$ lsb_release -a

No LSB modules are available.

Distributor ID: Debian

Description: Debian GNU/Linux 6.0.5 (squeeze)

Release: 6.0.5

Codename: squeeze


This is a lower concern for now, as it's only locally exploitable and I am the only user with shell access; non-login and system accounts are properly secured. I have additional security measures in place with filesystem quotas, shell limits, and OSSEC HIDS (syscheck, rootkit detection, file changes, log monitoring, alerts and responses). I am actively looking into the security of my linodes with nmap and nessus scans.

theckman:

Thanks, I am considering using newer kernels when I order new linodes. :)

We're experiencing this as well (it happened once and I'd like to ensure that it doesn't happen again). Has anyone found a definite fix for this?

I found a post that seems to imply that it's a problem with Linode's Xen configuration, not something that we can change ourselves: http://xen.1045712.n5.nabble.com/SLUB-allocation-error-on-3-0-3-4-1-1-td4795696.html

And I found a discussion, which links to a Red Hat Bugzilla entry that I don't have permission to access, but which seems to imply that the message is fairly harmless: it just means that the kernel didn't have enough free RAM to process a network packet, so the packet was dropped. This could happen even if you have free swap, as the packet is received in interrupt context and it's not safe to swap memory out to disk from that context, so if there's not enough free RAM then the kernel's hands are tied. (At least that's my interpretation of the discussion.)

http://thr3ads.net/centos/2012/10/2111457-swapper-page-allocation-failure.-order-1-mode-0x20
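On the interrupt-context point: the `mode:0x20` in these messages fits that reading. Here is a minimal sketch of decoding it, assuming the low gfp flag bit values from `include/linux/gfp.h` in kernels of the 2.6.39 / 3.x era (they have changed in later kernels, so check the header for your version). With those values, 0x20 is `__GFP_HIGH` on its own, which is what `GFP_ATOMIC` expanded to back then: the wait bit is not set, so the allocation is not allowed to sleep while memory is reclaimed or swapped out.

```python
# Assumed bit values, as in include/linux/gfp.h of 2.6.39 / 3.x era kernels.
GFP_FLAGS = {
    0x01: "__GFP_DMA",
    0x02: "__GFP_HIGHMEM",
    0x04: "__GFP_DMA32",
    0x08: "__GFP_MOVABLE",
    0x10: "__GFP_WAIT",
    0x20: "__GFP_HIGH",
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
}
GFP_WAIT_BIT = 0x10

def decode_gfp(mode):
    """List the names of the known low flag bits set in a gfp mode value."""
    return [name for bit, name in sorted(GFP_FLAGS.items()) if mode & bit]

def may_sleep(mode):
    """True if the allocation is allowed to block (wait bit set)."""
    return bool(mode & GFP_WAIT_BIT)

if __name__ == "__main__":
    for mode in (0x20, 0xd0):  # 0x20 from the logs; 0xd0 = WAIT|IO|FS for comparison
        print(hex(mode), decode_gfp(mode), "may sleep:", may_sleep(mode))
```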

So is this just a scary message or does it have real consequences?

Sounds just about right.

"Real world consequence" is a dropped packet, which will get retransmitted like after any other tiny networking blip.

Increasing the vm.min_free_kbytes setting reserves a larger in-RAM buffer for various operations like this, reducing the chance you'll run into this error.
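A small sketch for checking what the running kernel is currently using, assuming the standard `/proc/sys/vm/min_free_kbytes` file (the proc view of the `vm.min_free_kbytes` sysctl). The helper names and the optional path argument are made up for illustration.

```python
from pathlib import Path

# Standard procfs location of the vm.min_free_kbytes sysctl on Linux.
MIN_FREE_PATH = Path("/proc/sys/vm/min_free_kbytes")

def read_min_free_kbytes(path=MIN_FREE_PATH):
    """Return the current reserve in kB as an integer."""
    return int(Path(path).read_text().strip())

def sysctl_conf_line(kbytes):
    """Format the line that would go into /etc/sysctl.conf to persist a value."""
    return f"vm.min_free_kbytes = {kbytes}"

if __name__ == "__main__":
    print("current:", read_min_free_kbytes(), "kB")
    print("to persist 5120, add to /etc/sysctl.conf:", sysctl_conf_line(5120))
```

A value set with `sysctl -w` or loaded with `sysctl -p` takes effect immediately; the sysctl.conf entry is only there so the setting survives a reboot.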

I wonder if this will fix it… http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=976a702ac9eeacea09e588456ab165dc06f9ee83

I'm not sure if this got fixed by a new kernel but anyway the message appears to be harmless. As stated before it's the kernel trying and failing to allocate a chunk of memory.

Does someone with this problem want to try:

ethtool -K eth0 lro off

Did you look at the patch? It's in the IPv4 memory allocation routines for metrics, and if a kzalloc fails then it will fall back to a vzalloc - which kinda matches the symptoms we're seeing (always in the IPv4 stack, and memory allocation failures) :-)

I received that error once, then I downloaded the latest version and it fixed everything.
