Linode Community Forums
 


my xen linode crashed (solved)


 
       Linode.com Forum Forum Index -> Feature Request/Bug Report
Author Message
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Thu May 29, 2008 12:31 pm    Post subject: my xen linode crashed (solved)  

Here is what I found on the Lish console. I'm on fremont37 and running the 2.6.25-linode9 kernel. Any ideas?

Code:
BUG: unable to handle kernel paging request at 889cbbea
IP: [<c0182b97>] prune_dcache+0x87/0x180
*pdpt = 000000022a452027
Oops: 0002 [#4] SMP
Modules linked in:

Pid: 122, comm: kswapd0 Tainted: G      D  (2.6.25-linode9 #1)
EIP: 0061:[<c0182b97>] EFLAGS: 00010202 CPU: 1
EIP is at prune_dcache+0x87/0x180
EAX: 889cbbea EBX: da6cb094 ECX: da6cb0bc EDX: c0615d4c
ESI: 00000068 EDI: 00000000 EBP: 00000001 ESP: ecdc1ef8
 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
Process kswapd0 (pid: 122, ti=ecdc0000 task=ecd40eb0 task.ti=ecdc0000)
Stack: 00006f54 c0615d38 00000096 000290a5 c0182cca c0155c65 c12d4be0 c130d900
       ecdc1f18 00000020 00000000 c069f640 000000d0 00000000 00000180 00000001
       c0612680 c0613200 00000001 c01570dc 00000001 00000000 c06154fc 00000000
Call Trace:
 [<c0182cca>] shrink_dcache_memory+0x3a/0x40
 [<c0155c65>] shrink_slab+0x115/0x190
 [<c01570dc>] kswapd+0x2cc/0x470
 [<c0155ad0>] isolate_pages_global+0x0/0x80
 [<c0136330>] autoremove_wake_function+0x0/0x50
 [<c0156e10>] kswapd+0x0/0x470
 [<c0136134>] kthread+0x74/0x80
 [<c01360c0>] kthread+0x0/0x80
 [<c01077a7>] kernel_thread_helper+0x7/0x10
 =======================
Code: 8d 59 d8 3b 7b 50 0f 84 f4 00 00 00 40 8b 49 04 39 c2 75 e0 81 f9 4c 5d 61 c0 74 73 8d 59 d8 4e 8b 41 04 8b 11 89 42 04 89 49 04 <89> 10 a1 50 5d 61 c0 89 09 0f 18 00 90 8d 43 08 ff 0d 24 5d 61
EIP: [<c0182b97>] prune_dcache+0x87/0x180 SS:ESP 0069:ecdc1ef8
---[ end trace 312d266a1134b5bd ]---

----------------------------------------------------------------------------------------------------
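For readers unfamiliar with the notation in the dump above: each trace entry has the form `[<address>] symbol+0xOFFSET/0xSIZE`, where the offset is how far into the function the address lies and the size is the function's total length in bytes. The following is a minimal Python sketch for pulling those fields out of a pasted oops. The names `FRAME_RE` and `parse_frames` are made up for this illustration and are not part of any kernel tool.

```python
import re

# Matches trace entries such as "[<c0182cca>] shrink_dcache_memory+0x3a/0x40".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frames(oops_text):
    """Return (address, symbol, offset, size) tuples for every entry found."""
    frames = []
    for m in FRAME_RE.finditer(oops_text):
        frames.append(
            (int(m["addr"], 16), m["symbol"], int(m["offset"], 16), int(m["size"], 16))
        )
    return frames


# A few lines copied from the dump above.
sample = """\
IP: [<c0182b97>] prune_dcache+0x87/0x180
Call Trace:
 [<c0182cca>] shrink_dcache_memory+0x3a/0x40
 [<c0155c65>] shrink_slab+0x115/0x190
"""

for addr, symbol, offset, size in parse_frames(sample):
    # The function's start address is the entry address minus the offset.
    print(f"{symbol}: start=0x{addr - offset:08x}, offset {offset} of {size} bytes")
```

This only splits the text into fields. Mapping an offset back to a source line would still need the matching kernel image with debug symbols, which is not available in this thread.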
caker



Joined: 15 Apr 2003
Posts: 2370
Location: Galloway, NJ

Posted: Thu May 29, 2008 12:34 pm    Post subject:  

Well, the pv_ops Xen kernels (those newer than 2.6.18) are still somewhat experimental. I'd give it a few more versions before running those in production.

I'll forward your crash dump along to the pv_ops guys.

-Chris
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Thu May 29, 2008 12:43 pm    Post subject:  

That's fine. I've been using that kernel for about a month now without problems till this morning.

Thanks!
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Thu May 29, 2008 1:01 pm    Post subject:  

Well, I just rebooted and it crashed again very quickly (approx 10 minutes) using the same kernel. I'll try the Latest 2.6 Series kernel. I can't get the full trace from Lish because of the terminal size but here is what I can see:

Code:
 [<c039407d>] notify_update+0x1d/0x30
 [<c039518d>] vt_console_print+0x1ed/0x2d0
 [<c0394fa0>] vt_console_print+0x0/0x2d0
 [<c0122dee>] __call_console_drivers+0x5e/0x70
 [<c051c69c>] _spin_unlock_irqrestore+0xc/0x10
 [<c014272d>] sys_futex+0x9d/0x130
 [<c0123830>] vprintk+0x1c0/0x390
 [<c012073f>] mm_release+0x7f/0x90
 [<c01248d2>] exit_mm+0x12/0xe0
 [<c012636c>] do_exit+0x14c/0x6d0
 [<c0123a1b>] printk+0x1b/0x20
 [<c010857f>] die+0x17f/0x180
 [<c0115624>] do_page_fault+0x564/0xa20
 [<c0151155>] __alloc_pages+0x55/0x370
 [<c010348a>] xen_set_pte_at+0x6a/0xf0
 [<c01150c0>] do_page_fault+0x0/0xa20
 [<c051c932>] error_code+0x72/0x78
 [<c016007b>] vma_link+0x6b/0x100
 [<c01500d8>] setup_per_zone_lowmem_reserve+0x48/0xf0
 [<c0103195>] xen_make_pte+0x45/0x50
 [<c01620da>] mprotect_fixup+0x3da/0x600
 [<c051c60a>] _spin_lock_irq+0xa/0x30
 [<c012c3e0>] run_timer_softirq+0x130/0x190
 [<c0162489>] sys_mprotect+0x189/0x240
 [<c0106bce>] syscall_call+0x7/0xb
 =======================
Code: b8 01 00 00 00 e9 d6 ff ff ff 8d b6 00 00 00 00 64 8b 15 10 e1 69 c0 b8 6c e1 69 c0 8b 0c 10 85 c9 75 08 c7 04 10 02 00 00 00 c3 <0f> 0b eb fe 90 b8 02 00 00 00 e9 a6 ff ff ff 8d b6 00 00 00 00
EIP: [<c0113a9b>] paravirt_enter_lazy_cpu+0x1b/0x20 SS:ESP 0069:d773bc0c
---[ end trace 6e3af9dcb4e20f12 ]---
Fixing recursive fault but reboot is needed!

----------------------------------------------------------------------------------------------------


I can't help but think something is wrong with fremont37.
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Mon Jun 02, 2008 12:14 pm    Post subject:  

My linode crashed yet again while running the 2.6.18.8-domU-linode7 kernel... :-(

Code:
Call Trace:
 [<c0173edf>] prune_one_dentry+0x54/0x75
 [<c0174031>] prune_dcache+0x131/0x15b
 [<c0174094>] shrink_dcache_memory+0x39/0x3b
 [<c01439a0>] shrink_slab+0x111/0x186
 [<c012eb7b>] finish_wait+0x25/0x4b
 [<c0144d5b>] kswapd+0x2e9/0x3eb
 [<c012e960>] autoremove_wake_function+0x0/0x37
 [<c0144a72>] kswapd+0x0/0x3eb
 [<c012e89a>] kthread+0xde/0xe2
 [<c012e7bc>] kthread+0x0/0xe2
 [<c0102b75>] kernel_thread_helper+0x5/0xb
Code: 06 89 d8 ff d2 5b c3 89 da a1 00 de 58 c0 5b e9 4f 52 fe ff 0f 0b ae 00 26 6b 4c c0 eb c5 53 89 c3 85 c0 74 58 8b 80 9c 00 00 00 <8b> 40 20 83 bb 40 01 00 00 20 74 48 85 c0 74 0e 8b 50 14 85 d2
EIP: [<c01750f4>] iput+0xd/0x6b SS:ESP 0069:ec5dfee4
 <1>BUG: unable to handle kernel paging request at virtual address 009f4557
 printing eip:
c01750f4
21e7d000 -> *pde = 00000002:10fff027
083bc000 -> *pme = 00000000:00000000
Oops: 0000 [#2]
SMP
Modules linked in:
CPU:    0
EIP:    0061:[<c01750f4>]    Not tainted VLI
EFLAGS: 00210286   (2.6.18.8-domU-linode7 #1)
EIP is at iput+0xd/0x6b
eax: 009f4537   ebx: d110b0d4   ecx: d110b0ec   edx: d110b0ec
esi: 0000007f   edi: 00000000   ebp: ec38da3c   esp: d5c91c44
ds: 007b   es: 007b   ss: 0069
Process ruby (pid: 25356, ti=d5c90000 task=ecae0ab0 task.ti=d5c90000)
Stack: c5393b14 c0173edf c5393b14 0000007f c0174031 00000080 00000000 0000f230
       c169eac0 00000090 00027c99 c0174094 c01439a0 00000010 c017ea49 c1037020
       00000018 00000000 c05c6fe0 00000080 000201d2 00000000 0000000c 00000000
Call Trace:
 [<c0173edf>] prune_one_dentry+0x54/0x75
 [<c0174031>] prune_dcache+0x131/0x15b
 [<c0174094>] shrink_dcache_memory+0x39/0x3b
 [<c01439a0>] shrink_slab+0x111/0x186
 [<c017ea49>] mpage_bio_submit+0x19/0x1d
 [<c0144f94>] try_to_free_pages+0x137/0x1f3
 [<c0140ae7>] __alloc_pages+0x12e/0x2d2
 [<c0141f6b>] __do_page_cache_readahead+0x1f4/0x28a
 [<c01d073a>] ext3_get_block+0x0/0xc8
 [<c0114c65>] __wake_up+0x32/0x43
 [<c01421f5>] blockable_page_cache_readahead+0x53/0xbb
 [<c01422b8>] make_ahead_window+0x5b/0x9b
 [<c014244e>] page_cache_readahead+0x156/0x1c8
 [<c013d52c>] do_generic_mapping_read+0x40e/0x496
 [<c013de28>] __generic_file_aio_read+0x16e/0x243
 [<c013bc05>] file_read_actor+0x0/0xc7
 [<c010808e>] timer_interrupt+0x448/0x685
 [<c012351c>] __capable+0xc/0x1f
 [<c013e09f>] generic_file_aio_read+0x3e/0x4f
 [<c015db64>] do_sync_read+0xc1/0x11c
 [<c0131741>] hrtimer_run_queues+0xc1/0x193
 [<c012e960>] autoremove_wake_function+0x0/0x37
 [<c0120d86>] __do_softirq+0x8b/0x116
 [<c015daa3>] do_sync_read+0x0/0x11c
 [<c015df67>] vfs_read+0xa2/0x160
 [<c015ea2e>] sys_read+0x41/0x6a
 [<c0105137>] syscall_call+0x7/0xb
Code: 06 89 d8 ff d2 5b c3 89 da a1 00 de 58 c0 5b e9 4f 52 fe ff 0f 0b ae 00 26 6b 4c c0 eb c5 53 89 c3 85 c0 74 58 8b 80 9c 00 00 00 <8b> 40 20 83 bb 40 01 00 00 20 74 48 85 c0 74 0e 8b 50 14 85 d2
EIP: [<c01750f4>] iput+0xd/0x6b SS:ESP 0069:d5c91c44
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Mon Jun 16, 2008 12:17 pm    Post subject:  

My linode crashed again. Note that after the crash the system is halted and CPU utilization on the dashboard shows 100%, and Lassie doesn't catch it or reboot it. Shouldn't my linode get rebooted automatically after a panic?
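On the guest side, whether Linux reboots by itself after a panic is governed by two standard sysctls: `kernel.panic` (seconds to wait before rebooting after a panic, where 0 means no automatic reboot) and `kernel.panic_on_oops` (whether an oops like the ones above is escalated to a panic at all). The following is a rough Python sketch for inspecting them. The helper names `panic_settings` and `describe` are invented for this example. How the Xen host or Lassie reacts to a guest reboot is not covered here and is an open assumption.

```python
from pathlib import Path


def panic_settings(proc_root="/proc/sys/kernel"):
    """Read the panic-related sysctls; a missing or unreadable file yields None."""
    settings = {}
    for name in ("panic", "panic_on_oops"):
        try:
            settings[name] = int((Path(proc_root) / name).read_text().strip())
        except (OSError, ValueError):
            settings[name] = None
    return settings


def describe(settings):
    """Turn the raw values into short human-readable statements."""
    lines = []
    oops = settings.get("panic_on_oops")
    if oops is None:
        lines.append("panic_on_oops: unknown")
    elif oops:
        lines.append("an oops is escalated to a panic")
    else:
        lines.append("an oops does not trigger a panic")

    timeout = settings.get("panic")
    if timeout is None:
        lines.append("panic timeout: unknown")
    elif timeout == 0:
        lines.append("no automatic reboot after a panic")
    else:
        lines.append(f"automatic reboot after a panic (kernel.panic={timeout})")
    return lines


for line in describe(panic_settings()):
    print(line)
```

If both values are 0, a guest that hits an oops is not expected to reboot itself, which would at least be consistent with the halted state described above.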
ArbitraryConstant



Joined: 10 Feb 2007
Posts: 52

Posted: Mon Jun 16, 2008 12:40 pm    Post subject:  

I had exactly the same thing happen to a client's linode. Lassie didn't reboot it, but I did get a warning that I'd gone 100% CPU. I saw a kernel trace like the ones posted here.
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Fri Jun 27, 2008 11:28 am    Post subject:  

Crashed again, but a bit differently this time. I still have a login prompt via Lish, and whenever I attempt to log in I get the following stack dump. After the dump is printed, the login prompt comes back.

<1>BUG: unable to handle kernel paging request at virtual address 74732e31
printing eip:
c0174d30
13c6f000 -> *pde = 00000004:6f1a9007
25e55000 -> *pme = 00000000:00000000
Oops: 0000 [#236]
SMP
Modules linked in:
CPU: 3
EIP: 0061:[<c0174d30>] Not tainted VLI
EFLAGS: 00010202 (2.6.18.8-domU-linode7 #1)
EIP is at __d_lookup+0x5b/0xf0
eax: c1602228 ebx: 74732e31 ecx: 00000011 edx: 000056db
esi: d56d9e0c edi: d56d9f34 ebp: d56d9e18 esp: d56d9d94
ds: 007b es: 007b ss: 0069
Process imap (pid: 4950, ti=d56d8000 task=ebd81ab0 task.ti=d56d8000)
Stack: dfbcd594 32fdb346 ec537b94 ec537b94 d56d9e0c 00000000 0000000b d97dd011
d56d9e0c d56d9e0c d56d9f34 d56d9e18 c016b774 ec4eb840 c01dca50 00000001
dfbcd614 32fdb346 d56d9e0c d56d9f34 d97dd01c c016c04e d56d9f34 d97dd011
Call Trace:
[<c016b774>] do_lookup+0x1a/0x127
[<c01dca50>] ext3_permission+0x0/0xa
[<c016c04e>] __link_path_walk+0x7cd/0xeec
[<c016c7b2>] link_path_walk+0x45/0xcb
[<c015d00b>] get_unused_fd+0x57/0xbc
[<c016cbdf>] do_path_lookup+0xab/0x23b
[<c016d1d1>] __path_lookup_intent_open+0x48/0x83
[<c016d280>] path_lookup_open+0x20/0x25
[<c016d4ea>] open_namei+0x6e/0x65c
[<c015d5f3>] do_filp_open+0x25/0x40
[<c015d00b>] get_unused_fd+0x57/0xbc
[<c015d64c>] do_sys_open+0x3e/0xc7
[<c012351c>] __capable+0xc/0x1f
[<c015d712>] sys_open+0x1c/0x20
[<c0105137>] syscall_call+0x7/0xb
Code: 07 01 d0 89 c2 81 f2 01 00 37 9e 8b 0d f4 dd 58 c0 d3 ea 31 d0 23 05 f8 dd 58 c0 01 c0 01 c0 03 05 f0 dd 58 c0 8b 18 85 db 74 20 <8b> 03 0f 18 00 90 8d 6b f0 8b 54 24 04 3b 55 1c 75 08 8b 34 24
EIP: [<c0174d30>] __d_lookup+0x5b/0xf0 SS:ESP 0069:d56d9d94
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Fri Jun 27, 2008 11:34 am    Post subject:  

I'm using the Latest 2.6 Series (2.6.18.8-linode10) kernel. Is there a recommended kernel to use with Xen?
caker



Joined: 15 Apr 2003
Posts: 2370
Location: Galloway, NJ

Posted: Fri Jun 27, 2008 12:23 pm    Post subject:  

Your crash reports are all from 2.6.18.8-domU-linode7. Let me know if -linode10 solves this for you.

Thanks,
-Chris
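The kernel that produced an oops can be read from the dump itself: each report in this thread carries a banner such as `(2.6.18.8-domU-linode7 #1)` on its `Pid:` or `EFLAGS:` line. The following is a small Python sketch for tallying those banners across saved console output. `VERSION_RE` and `kernel_versions` are illustrative names, and the pattern is only assumed to fit the banner format seen in these dumps.

```python
import re
from collections import Counter

# Version banner as it appears in the dumps, e.g. "(2.6.18.8-domU-linode7 #1)".
VERSION_RE = re.compile(r"\((\d+\.\d+\.\d+[\w.\-]*) #\d+\)")


def kernel_versions(log_text):
    """Count how often each kernel version banner appears in log_text."""
    return Counter(VERSION_RE.findall(log_text))


# Banner lines copied from the reports earlier in the thread.
sample = """\
Pid: 122, comm: kswapd0 Tainted: G      D  (2.6.25-linode9 #1)
EFLAGS: 00210286   (2.6.18.8-domU-linode7 #1)
EFLAGS: 00010202 (2.6.18.8-domU-linode7 #1)
"""

for version, count in kernel_versions(sample).items():
    print(version, count)
```

On a running system, `uname -r` reports the currently booted kernel, which can then be compared with the versions found in older dumps.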
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Fri Jun 27, 2008 2:56 pm    Post subject:  

caker wrote: Your crash reports are all from 2.6.18.8-domU-linode7. Let me know if -linode10 solves this for you.

Thanks,
-Chris

Oh ok. I see my latest reboot picked up -linode10. I'll let you know how it goes...
edavis



Joined: 20 Jul 2004
Posts: 35

Posted: Thu Jul 17, 2008 11:42 am    Post subject: my xen linode crashed (solved)  

I haven't had any problem since running the new -linode10 kernel. Thank you.
 