[SOLVED] My Linode Dies

Hi All,

I've been having some problems recently with one of my Linodes (on Atlanta57).

I'll be doing something simple and boring and the whole thing will die.

Today's example: I had just started extracting a tarball when it died.

This is all I can gather from LISH:

 [<c016141d>] mempool_alloc+0x2d/0xe0
 [<c016141d>] mempool_alloc+0x2d/0xe0
 [<c01a72ab>] bvec_alloc_bs+0x7b/0x140
 [<c01a7571>] bio_alloc_bioset+0x51/0xe0
 [<c0425852>] clone_bio+0x42/0x90
 [<c0426a60>] __split_bio+0x370/0x3a0
 [<c0426e3f>] dm_request+0xff/0x170
 [<c03a6566>] generic_make_request+0xe6/0x230
 [<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
 [<c01825f7>] kmem_cache_alloc+0x57/0xb0
 [<c016141d>] mempool_alloc+0x2d/0xe0
 [<c03a78d3>] submit_bio+0x63/0xf0
 [<c01a72bd>] bvec_alloc_bs+0x8d/0x140
 [<c01a758b>] bio_alloc_bioset+0x6b/0xe0
 [<c01a389a>] submit_bh+0xba/0xf0
 [<c01a5639>] __block_write_full_page+0x1a9/0x310
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0212880>] ext3_get_block+0x0/0x100
 [<c01a588a>] block_write_full_page+0xea/0x100
 [<c0212880>] ext3_get_block+0x0/0x100
 [<c02141b3>] ext3_ordered_writepage+0xa3/0x170
 [<c0210f70>] bget_one+0x0/0x10
 [<c0164c78>] __writepage+0x8/0x30
 [<c016521f>] write_cache_pages

I know these things are nearly impossible to diagnose, but any suggestions, folks? It's quite annoying :(

EDIT: I forgot to mention that I'm running Arch Linux with kernel 2.6.28-linode15.

EDIT 2: Here are the logs from the time it died:

Jun 25 17:12:34 platypus kernel: [IPT ISC] : IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:fe:fd:40:16:47:15:08:00 SRC=192.168.139.100 DST=192.168.255.255 LEN=243 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=138 DPT=138 LEN=223 
Jun 25 17:12:34 platypus kernel: [IPT ISC] : IN=eth0 OUT= MAC=ff:ff:ff:ff:ff:ff:fe:fd:40:16:47:15:08:00 SRC=192.168.139.100 DST=192.168.255.255 LEN=235 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=138 DPT=138 LEN=215 
Jun 25 17:24:16 dingo syslog-ng[3743]: syslog-ng starting up; version='3.0.1'
Jun 25 17:24:16 dingo kernel: Reserving virtual address space above 0xf5800000
Jun 25 17:24:16 dingo kernel: Linux version 2.6.28-linode15 (root@db1.linode.com) (gcc version 4.2.4 (Ubuntu 4.2.4-1ubuntu3)) #2 SMP Wed Jan 14 09:18:53 EST 2009

3 Replies

Well, it happened again, and this time I managed to get a proper kernel trace:

------------[ cut here ]------------
kernel BUG at drivers/block/xen-blkfront.c:243!
invalid opcode: 0000 [#1] SMP
last sysfs file: /sys/block/dm-4/removable
Modules linked in:

Pid: 21028, comm: perl Not tainted (2.6.28-linode15 #2)
EIP: 0061:[<c03ee830>] EFLAGS: 00010046 CPU: 0
EIP is at do_blkif_request+0x2e0/0x360
EAX: 00000001 EBX: 00000000 ECX: d43a5bc0 EDX: c343edb0
ESI: d5952288 EDI: d59522c8 EBP: 000001c3 ESP: c151fe98
 DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
Process perl (pid: 21028, ti=c151e000 task=d49d2040 task.ti=c151e000)
Stack:
 00000005 d5952288 00000288 d5988028 d5956000 c420864c 00000007 0000000d
 d5956000 00000002 00000006 d5952000 00000000 d43a5bc0 d2c7de0c ffffffff
 d5988028 d5956000 0000000b 00000014 c03a6ca5 d5956000 c03ee8c6 00000000
Call Trace:
 [<c03a6ca5>] blk_invoke_request_fn+0x95/0x100
 [<c03ee8c6>] kick_pending_request_queues+0x16/0x30
 [<c03eea6d>] blkif_interrupt+0x18d/0x1d0
 [<c0159510>] handle_IRQ_event+0x30/0x60
 [<c015b428>] handle_level_irq+0x78/0xf0
 [<c010aae7>] do_IRQ+0x77/0x90
 [<c03c8968>] xen_evtchn_do_upcall+0xe8/0x150
 [<c0109197>] xen_do_upcall+0x7/0xc
Code: 2c 8d 54 03 40 8d 44 0e 54 b9 6c 00 00 00 e8 98 a5 fc ff 8b 44 24 3c e8 ff 92 fd ff 83 44 24 18 01 e9 40 fd ff ff 0f 0b eb fe 90 <0f> 0b eb fe 8b 44 24 20 ba 40 e5 3e c0 8b 4c 24 20 c7 04 24 0b
EIP: [<c03ee830>] do_blkif_request+0x2e0/0x360 SS:ESP 0069:c151fe98
Kernel panic - not syncing: Fatal exception in interrupt
------------[ cut here ]------------
WARNING: at kernel/smp.c:333 smp_call_function_mask+0x1cb/0x1d0()
Modules linked in:
Pid: 21028, comm: perl Tainted: G      D    2.6.28-linode15 #2
Call Trace:
 [<c0128adf>] warn_on_slowpath+0x5f/0x90
 [<c03b8e26>] memmove+0x36/0x40
 [<c03dcc5a>] scrup+0x7a/0xe0
 [<c0140987>] atomic_notifier_call_chain+0x17/0x20
 [<c03dccdf>] notify_update+0x1f/0x30
 [<c03dcf6a>] vt_console_print+0x20a/0x2d0
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0105cea>] check_events+0x8/0xe
 [<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0105cea>] check_events+0x8/0xe
 [<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
 [<c01295e0>] vprintk+0x170/0x350
 [<c014a46b>] smp_call_function_mask+0x1cb/0x1d0
 [<c0105fd0>] stop_self+0x0/0x30
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0105cea>] check_events+0x8/0xe
 [<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
 [<c0561ca3>] _spin_unlock_irqrestore+0x13/0x20
 [<c03dec96>] do_unblank_screen+0x16/0x130
 [<c014a484>] smp_call_function+0x14/0x20
 [<c0128b6e>] panic+0x4e/0x100
 [<c010ac3c>] oops_end+0x8c/0xa0
 [<c0109b50>] do_invalid_op+0x0/0xa0
 [<c0109bcf>] do_invalid_op+0x7f/0xa0
 [<c03ee830>] do_blkif_request+0x2e0/0x360
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0105cea>] check_events+0x8/0xe
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0105cea>] check_events+0x8/0xe
 [<c0105407>] xen_force_evtchn_callback+0x17/0x30
 [<c0105cea>] check_events+0x8/0xe
 [<c0105c53>] xen_restore_fl_direct_end+0x0/0x1
 [<c0561ca3>] _spin_unlock_irqrestore+0x13/0x20
 [<c0561f4a>] error_code+0x72/0x78
 [<c03ee830>] do_blkif_request+0x2e0/0x360
 [<c03a6ca5>] blk_invoke_request_fn+0x95/0x100
 [<c03ee8c6>] kick_pending_request_queues+0x16/0x30
 [<c03eea6d>] blkif_interrupt+0x18d/0x1d0
 [<c0159510>] handle_IRQ_event+0x30/0x60
 [<c015b428>] handle_level_irq+0x78/0xf0
 [<c010aae7>] do_IRQ+0x77/0x90
 [<c03c8968>] xen_evtchn_do_upcall+0xe8/0x150
 [<c0109197>] xen_do_upcall+0x7/0xc
---[ end trace c449499288c87a80 ]---

It's a known bug with kernel 2.6.28-linode15.

2.6.28.3-linode17 has a fix.
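If you want to check several Linodes at once, something like the following might help. It is only a rough sketch: it assumes release strings always look like `<version>-linode<build>` (as the two in this thread do) and that a higher version or build number means a newer kernel. The names `parse_release` and `predates_fix` are made up for illustration.

```python
import re

# Release named in this thread as containing the fix.
FIXED_RELEASE = "2.6.28.3-linode17"


def parse_release(release):
    """Split e.g. '2.6.28-linode15' into ((2, 6, 28, 0), 15).

    Assumes the '<version>-linode<build>' pattern; anything else is rejected.
    """
    match = re.fullmatch(r"(\d+(?:\.\d+)*)-linode(\d+)", release)
    if match is None:
        raise ValueError("unrecognised release string: %r" % release)
    version = tuple(int(part) for part in match.group(1).split("."))
    # Pad to four components so 2.6.28 compares as 2.6.28.0.
    version += (0,) * (4 - len(version))
    return version, int(match.group(2))


def predates_fix(release, fixed=FIXED_RELEASE):
    """Return True if `release` sorts before the fixed release."""
    return parse_release(release) < parse_release(fixed)
```

On a running box the release string is what `uname -r` prints (or `platform.release()` in Python), so `predates_fix("2.6.28-linode15")` would flag the kernel from the original post.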

Thanks – I opened a support ticket and Chris just told me the same thing. I keep forgetting that pacman doesn't keep me up to date with Linode kernels ;)

I've updated all my Linodes to 2.6.30 now :)
