| Author |
Message |
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Thu May 29, 2008 12:31 pm Post subject: my xen linode crashed (solved) |
|
|
Here is what I found on the Lish console. I'm on fremont37 and running the 2.6.25-linode9 kernel. Any ideas?
Code:
BUG: unable to handle kernel paging request at 889cbbea
IP: [<c0182b97>] prune_dcache+0x87/0x180
*pdpt = 000000022a452027
Oops: 0002 [#4] SMP
Modules linked in:
Pid: 122, comm: kswapd0 Tainted: G D (2.6.25-linode9 #1)
EIP: 0061:[<c0182b97>] EFLAGS: 00010202 CPU: 1
EIP is at prune_dcache+0x87/0x180
EAX: 889cbbea EBX: da6cb094 ECX: da6cb0bc EDX: c0615d4c
ESI: 00000068 EDI: 00000000 EBP: 00000001 ESP: ecdc1ef8
DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process kswapd0 (pid: 122, ti=ecdc0000 task=ecd40eb0 task.ti=ecdc0000)
Stack: 00006f54 c0615d38 00000096 000290a5 c0182cca c0155c65 c12d4be0 c130d900
ecdc1f18 00000020 00000000 c069f640 000000d0 00000000 00000180 00000001
c0612680 c0613200 00000001 c01570dc 00000001 00000000 c06154fc 00000000
Call Trace:
[<c0182cca>] shrink_dcache_memory+0x3a/0x40
[<c0155c65>] shrink_slab+0x115/0x190
[<c01570dc>] kswapd+0x2cc/0x470
[<c0155ad0>] isolate_pages_global+0x0/0x80
[<c0136330>] autoremove_wake_function+0x0/0x50
[<c0156e10>] kswapd+0x0/0x470
[<c0136134>] kthread+0x74/0x80
[<c01360c0>] kthread+0x0/0x80
[<c01077a7>] kernel_thread_helper+0x7/0x10
=======================
Code: 8d 59 d8 3b 7b 50 0f 84 f4 00 00 00 40 8b 49 04 39 c2 75 e0 81 f9 4c 5d 61 c0 74 73 8d 59 d8 4e 8b 41 04 8b 11 89 42 04 89 49 04 <89> 10 a1 50 5d 61 c0 89 09 0f 18 00 90 8d 43 08 ff 0d 24 5d 61
EIP: [<c0182b97>] prune_dcache+0x87/0x180 SS:ESP 0069:ecdc1ef8
---[ end trace 312d266a1134b5bd ]---
----------------------------------------------------------------------------------------------------
|
|
caker
Joined: 15 Apr 2003
Posts: 2370
Location: Galloway, NJ
|
| Posted: Thu May 29, 2008 12:34 pm Post subject: |
|
|
Well .. the pv_ops Xen kernels (those > 2.6.18) are still somewhat experimental. I'd give it a few more versions before running those in production.
I'll forward your crash dump along to the pv_ops guys.
-Chris |
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Thu May 29, 2008 12:43 pm Post subject: |
|
|
That's fine. I've been using that kernel for about a month now without problems till this morning.
Thanks! |
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Thu May 29, 2008 1:01 pm Post subject: |
|
|
Well, I just rebooted and it crashed again very quickly (after approximately 10 minutes) using the same kernel. I'll try the Latest 2.6 Series kernel. I can't get the full trace from Lish because of the terminal size, but here is what I can see:
Code:
[<c039407d>] notify_update+0x1d/0x30
[<c039518d>] vt_console_print+0x1ed/0x2d0
[<c0394fa0>] vt_console_print+0x0/0x2d0
[<c0122dee>] __call_console_drivers+0x5e/0x70
[<c051c69c>] _spin_unlock_irqrestore+0xc/0x10
[<c014272d>] sys_futex+0x9d/0x130
[<c0123830>] vprintk+0x1c0/0x390
[<c012073f>] mm_release+0x7f/0x90
[<c01248d2>] exit_mm+0x12/0xe0
[<c012636c>] do_exit+0x14c/0x6d0
[<c0123a1b>] printk+0x1b/0x20
[<c010857f>] die+0x17f/0x180
[<c0115624>] do_page_fault+0x564/0xa20
[<c0151155>] __alloc_pages+0x55/0x370
[<c010348a>] xen_set_pte_at+0x6a/0xf0
[<c01150c0>] do_page_fault+0x0/0xa20
[<c051c932>] error_code+0x72/0x78
[<c016007b>] vma_link+0x6b/0x100
[<c01500d8>] setup_per_zone_lowmem_reserve+0x48/0xf0
[<c0103195>] xen_make_pte+0x45/0x50
[<c01620da>] mprotect_fixup+0x3da/0x600
[<c051c60a>] _spin_lock_irq+0xa/0x30
[<c012c3e0>] run_timer_softirq+0x130/0x190
[<c0162489>] sys_mprotect+0x189/0x240
[<c0106bce>] syscall_call+0x7/0xb
=======================
Code: b8 01 00 00 00 e9 d6 ff ff ff 8d b6 00 00 00 00 64 8b 15 10 e1 69 c0 b8 6c e1 69 c0 8b 0c 10 85 c9 75 08 c7 04 10 02 00 00 00 c3 <0f> 0b eb fe 90 b8 02 00 00 00 e9 a6 ff ff ff 8d b6 00 00 00 00
EIP: [<c0113a9b>] paravirt_enter_lazy_cpu+0x1b/0x20 SS:ESP 0069:d773bc0c
---[ end trace 6e3af9dcb4e20f12 ]---
Fixing recursive fault but reboot is needed!
----------------------------------------------------------------------------------------------------
I can't help but think something is wrong with fremont37. |
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Mon Jun 02, 2008 12:14 pm Post subject: |
|
|
My linode crashed yet again while running the 2.6.18.8-domU-linode7 kernel... :-(
Code:
Call Trace:
[<c0173edf>] prune_one_dentry+0x54/0x75
[<c0174031>] prune_dcache+0x131/0x15b
[<c0174094>] shrink_dcache_memory+0x39/0x3b
[<c01439a0>] shrink_slab+0x111/0x186
[<c012eb7b>] finish_wait+0x25/0x4b
[<c0144d5b>] kswapd+0x2e9/0x3eb
[<c012e960>] autoremove_wake_function+0x0/0x37
[<c0144a72>] kswapd+0x0/0x3eb
[<c012e89a>] kthread+0xde/0xe2
[<c012e7bc>] kthread+0x0/0xe2
[<c0102b75>] kernel_thread_helper+0x5/0xb
Code: 06 89 d8 ff d2 5b c3 89 da a1 00 de 58 c0 5b e9 4f 52 fe ff 0f 0b ae 00 26 6b 4c c0 eb c5 53 89 c3 85 c0 74 58 8b 80 9c 00 00 00 <8b> 40 20 83 bb 40 01 00 00 20 74 48 85 c0 74 0e 8b 50 14 85 d2
EIP: [<c01750f4>] iput+0xd/0x6b SS:ESP 0069:ec5dfee4
<1>BUG: unable to handle kernel paging request at virtual address 009f4557
printing eip:
c01750f4
21e7d000 -> *pde = 00000002:10fff027
083bc000 -> *pme = 00000000:00000000
Oops: 0000 [#2]
SMP
Modules linked in:
CPU: 0
EIP: 0061:[<c01750f4>] Not tainted VLI
EFLAGS: 00210286 (2.6.18.8-domU-linode7 #1)
EIP is at iput+0xd/0x6b
eax: 009f4537 ebx: d110b0d4 ecx: d110b0ec edx: d110b0ec
esi: 0000007f edi: 00000000 ebp: ec38da3c esp: d5c91c44
ds: 007b es: 007b ss: 0069
Process ruby (pid: 25356, ti=d5c90000 task=ecae0ab0 task.ti=d5c90000)
Stack: c5393b14 c0173edf c5393b14 0000007f c0174031 00000080 00000000 0000f230
c169eac0 00000090 00027c99 c0174094 c01439a0 00000010 c017ea49 c1037020
00000018 00000000 c05c6fe0 00000080 000201d2 00000000 0000000c 00000000
Call Trace:
[<c0173edf>] prune_one_dentry+0x54/0x75
[<c0174031>] prune_dcache+0x131/0x15b
[<c0174094>] shrink_dcache_memory+0x39/0x3b
[<c01439a0>] shrink_slab+0x111/0x186
[<c017ea49>] mpage_bio_submit+0x19/0x1d
[<c0144f94>] try_to_free_pages+0x137/0x1f3
[<c0140ae7>] __alloc_pages+0x12e/0x2d2
[<c0141f6b>] __do_page_cache_readahead+0x1f4/0x28a
[<c01d073a>] ext3_get_block+0x0/0xc8
[<c0114c65>] __wake_up+0x32/0x43
[<c01421f5>] blockable_page_cache_readahead+0x53/0xbb
[<c01422b8>] make_ahead_window+0x5b/0x9b
[<c014244e>] page_cache_readahead+0x156/0x1c8
[<c013d52c>] do_generic_mapping_read+0x40e/0x496
[<c013de28>] __generic_file_aio_read+0x16e/0x243
[<c013bc05>] file_read_actor+0x0/0xc7
[<c010808e>] timer_interrupt+0x448/0x685
[<c012351c>] __capable+0xc/0x1f
[<c013e09f>] generic_file_aio_read+0x3e/0x4f
[<c015db64>] do_sync_read+0xc1/0x11c
[<c0131741>] hrtimer_run_queues+0xc1/0x193
[<c012e960>] autoremove_wake_function+0x0/0x37
[<c0120d86>] __do_softirq+0x8b/0x116
[<c015daa3>] do_sync_read+0x0/0x11c
[<c015df67>] vfs_read+0xa2/0x160
[<c015ea2e>] sys_read+0x41/0x6a
[<c0105137>] syscall_call+0x7/0xb
Code: 06 89 d8 ff d2 5b c3 89 da a1 00 de 58 c0 5b e9 4f 52 fe ff 0f 0b ae 00 26 6b 4c c0 eb c5 53 89 c3 85 c0 74 58 8b 80 9c 00 00 00 <8b> 40 20 83 bb 40 01 00 00 20 74 48 85 c0 74 0e 8b 50 14 85 d2
EIP: [<c01750f4>] iput+0xd/0x6b SS:ESP 0069:d5c91c44
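Several frames in this trace also appear in the 2.6.25-linode9 trace from the first post. The following is a minimal Python sketch for checking that kind of overlap mechanically, assuming the dumps have been saved as plain text. The names `trace_symbols` and `common_symbols` are hypothetical helpers written for this illustration; they are not part of any kernel tooling, and finding shared frames does not by itself identify a root cause.

```python
import re

# Matches any "[<address>] symbol+0xoffset/0xlength" entry. This covers the
# lines under "Call Trace:" and also the "EIP: [<...>]" summary line.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def trace_symbols(text):
    """Return the symbol names found in an oops dump, in order of appearance."""
    return [match.group(1) for match in FRAME_RE.finditer(text)]


def common_symbols(*dumps):
    """Return the symbols that appear in every one of the given dumps."""
    symbol_sets = [set(trace_symbols(dump)) for dump in dumps]
    return set.intersection(*symbol_sets) if symbol_sets else set()


if __name__ == "__main__":
    # Excerpts copied from the dumps posted in this thread.
    crash_2_6_25 = """\
EIP: [<c0182b97>] prune_dcache+0x87/0x180 SS:ESP 0069:ecdc1ef8
[<c0182cca>] shrink_dcache_memory+0x3a/0x40
[<c0155c65>] shrink_slab+0x115/0x190
[<c01570dc>] kswapd+0x2cc/0x470
"""
    crash_2_6_18 = """\
[<c0173edf>] prune_one_dentry+0x54/0x75
[<c0174031>] prune_dcache+0x131/0x15b
[<c0174094>] shrink_dcache_memory+0x39/0x3b
[<c01439a0>] shrink_slab+0x111/0x186
"""
    print(sorted(common_symbols(crash_2_6_25, crash_2_6_18)))
```

Run on these two excerpts, the script lists the dentry-cache shrinking path that both kernels were executing when they faulted.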
|
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Mon Jun 16, 2008 12:17 pm Post subject: |
|
|
| My linode crashed again. Note that after the crash the system is halted and CPU utilization on the dashboard shows 100%, and Lassie doesn't catch it or reboot it. Shouldn't my linode get rebooted automatically after a panic? |
|
ArbitraryConstant
Joined: 10 Feb 2007
Posts: 52
|
| Posted: Mon Jun 16, 2008 12:40 pm Post subject: |
|
|
| I had exactly the same thing happen to a client's linode. Lassie didn't reboot it, but I did get a warning that I'd gone 100% CPU. I saw a kernel trace like the ones posted here. |
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Fri Jun 27, 2008 11:28 am Post subject: |
|
|
Crashed again, this time a bit differently. I still have a login prompt via Lish, and when I attempt to log in I always get the following stack dump. After the dump is printed, the login prompt comes back.
<1>BUG: unable to handle kernel paging request at virtual address 74732e31
printing eip:
c0174d30
13c6f000 -> *pde = 00000004:6f1a9007
25e55000 -> *pme = 00000000:00000000
Oops: 0000 [#236]
SMP
Modules linked in:
CPU: 3
EIP: 0061:[<c0174d30>] Not tainted VLI
EFLAGS: 00010202 (2.6.18.8-domU-linode7 #1)
EIP is at __d_lookup+0x5b/0xf0
eax: c1602228 ebx: 74732e31 ecx: 00000011 edx: 000056db
esi: d56d9e0c edi: d56d9f34 ebp: d56d9e18 esp: d56d9d94
ds: 007b es: 007b ss: 0069
Process imap (pid: 4950, ti=d56d8000 task=ebd81ab0 task.ti=d56d8000)
Stack: dfbcd594 32fdb346 ec537b94 ec537b94 d56d9e0c 00000000 0000000b d97dd011
d56d9e0c d56d9e0c d56d9f34 d56d9e18 c016b774 ec4eb840 c01dca50 00000001
dfbcd614 32fdb346 d56d9e0c d56d9f34 d97dd01c c016c04e d56d9f34 d97dd011
Call Trace:
[<c016b774>] do_lookup+0x1a/0x127
[<c01dca50>] ext3_permission+0x0/0xa
[<c016c04e>] __link_path_walk+0x7cd/0xeec
[<c016c7b2>] link_path_walk+0x45/0xcb
[<c015d00b>] get_unused_fd+0x57/0xbc
[<c016cbdf>] do_path_lookup+0xab/0x23b
[<c016d1d1>] __path_lookup_intent_open+0x48/0x83
[<c016d280>] path_lookup_open+0x20/0x25
[<c016d4ea>] open_namei+0x6e/0x65c
[<c015d5f3>] do_filp_open+0x25/0x40
[<c015d00b>] get_unused_fd+0x57/0xbc
[<c015d64c>] do_sys_open+0x3e/0xc7
[<c012351c>] __capable+0xc/0x1f
[<c015d712>] sys_open+0x1c/0x20
[<c0105137>] syscall_call+0x7/0xb
Code: 07 01 d0 89 c2 81 f2 01 00 37 9e 8b 0d f4 dd 58 c0 d3 ea 31 d0 23 05 f8 dd 58 c0 01 c0 01 c0 03 05 f0 dd 58 c0 8b 18 85 db 74 20 <8b> 03 0f 18 00 90 8d 6b f0 8b 54 24 04 3b 55 1c 75 08 8b 34 24
EIP: [<c0174d30>] __d_lookup+0x5b/0xf0 SS:ESP 0069:d56d9d94 |
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Fri Jun 27, 2008 11:34 am Post subject: |
|
|
| I'm using the Latest 2.6 Series (2.6.18.8-linode10) kernel. Is there a recommended kernel to use with Xen? |
|
caker
Joined: 15 Apr 2003
Posts: 2370
Location: Galloway, NJ
|
| Posted: Fri Jun 27, 2008 12:23 pm Post subject: |
|
|
Your crash reports are all from 2.6.18.8-domU-linode7. Let me know if -linode10 solves this for you.
Thanks,
-Chris |
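The kernel that produced a given dump can be read from the oops header itself, which prints the build string in parentheses (for example `(2.6.18.8-domU-linode7 #1)`) along with the faulting symbol on the `EIP is at` line. Below is a minimal Python sketch that pulls both out of a saved dump, assuming the header format seen in this thread. The function name `summarize_oops` is hypothetical and exists only for this illustration.

```python
import re

# Matches the parenthesised build string printed in the oops header,
# e.g. "(2.6.18.8-domU-linode7 #1)" or "(2.6.25-linode9 #1)".
KERNEL_RE = re.compile(r"\((\d+\.\d+\.\d+[^\s()]*) #\d+\)")
# Matches the faulting symbol, e.g. "EIP is at iput+0xd/0x6b".
EIP_RE = re.compile(r"EIP is at (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(text):
    """Return (kernel_version, faulting_symbol) found in an oops dump.

    Either element is None when the corresponding line is missing,
    for example when the console output was truncated.
    """
    kernel = KERNEL_RE.search(text)
    eip = EIP_RE.search(text)
    return (kernel.group(1) if kernel else None,
            eip.group(1) if eip else None)


if __name__ == "__main__":
    # Header lines copied from the most recent dump in this thread.
    sample = """\
EIP: 0061:[<c0174d30>] Not tainted VLI
EFLAGS: 00010202 (2.6.18.8-domU-linode7 #1)
EIP is at __d_lookup+0x5b/0xf0
"""
    print(summarize_oops(sample))
```

Checking the reported version this way before and after a reboot is a quick way to confirm which kernel a crash actually came from.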
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Fri Jun 27, 2008 2:56 pm Post subject: |
|
|
caker wrote: Your crash reports are all from 2.6.18.8-domU-linode7. Let me know if -linode10 solves this for you.
Thanks,
-Chris
Oh ok. I see my latest reboot picked up -linode10. I'll let you know how it goes... |
|
edavis
Joined: 20 Jul 2004
Posts: 35
|
| Posted: Thu Jul 17, 2008 11:42 am Post subject: my xen linode crashed (solved) |
|
|
| I haven't had any problem since running the new -linode10 kernel. Thank you. |
|