| Author |
Message |
caker
Joined: 15 Apr 2003
Posts: 2392
Location: Galloway, NJ
|
| Posted: Tue Apr 04, 2006 10:03 pm Post subject: Reboot: host56 (graceful) |
|
|
The xen beta box is going to be rebooted in a few. Under heavy load (a few migrations, a deployment, and a resize), it looks like it triggered CONFIG_DETECT_SOFTLOCKUP and created a few zombie domains, preventing people from booting. I'm going to grab the latest Xen updates, turn that off, update the host kernel and reboot.
Those that are still up and running should see a graceful shutdown in a few minutes...
-Chris |
|
|
Xan
Joined: 08 Feb 2004
Posts: 311
Location: Austin
|
| Posted: Tue Apr 04, 2006 10:52 pm Post subject: |
|
|
Great to hear it was what amounts to an instrumentation problem and not a real failure.
Xen seems to be working out really well.
Any idea when the nodes will come back up? :-) |
|
|
Xan
Joined: 08 Feb 2004
Posts: 311
Location: Austin
|
| Posted: Tue Apr 04, 2006 10:58 pm Post subject: |
|
|
| ...oh. It's in the queue. Never mind! |
|
|
caker
Joined: 15 Apr 2003
Posts: 2392
Location: Galloway, NJ
|
| Posted: Tue Apr 04, 2006 10:58 pm Post subject: |
|
|
About half have been booted already, and it's working its way through the rest.
-Chris |
|
|
Xan
Joined: 08 Feb 2004
Posts: 311
Location: Austin
|
| Posted: Tue Apr 04, 2006 11:05 pm Post subject: |
|
|
It doesn't seem to have worked...
Code:
xen_linode_boot: failed to get domid
xen_linode_boot: warning - li-network might not have ran |
|
|
caker
Joined: 15 Apr 2003
Posts: 2392
Location: Galloway, NJ
|
| Posted: Tue Apr 04, 2006 11:09 pm Post subject: |
|
|
Yeah, I've seen that one as well. Issue another reboot and it should work.
-Chris |
|
|
Xan
Joined: 08 Feb 2004
Posts: 311
Location: Austin
|
| Posted: Tue Apr 04, 2006 11:10 pm Post subject: |
|
|
| Sure did. |
|
|
pclissold
Joined: 24 Oct 2003
Posts: 472
Location: Netherlands
|
| Posted: Wed Apr 05, 2006 4:20 am Post subject: |
|
|
My Linode reported the same error after the host-initiated restart:
Code:
xen_linode_boot: failed to get domid
xen_linode_boot: warning - li-network might not have ran
LPM showed it as 'Powered off'. (I'm at work, where ssh is only allowed to predefined hosts - not including my Linode - so I guess it was off but I'm not totally sure).
I issued a boot command and got the same error message, and the Linode was still shown as 'Powered off'.
A second boot command again gave the error messages but the Linode was shown by LPM as 'Running'.
A reboot command produced a successful shutdown followed by a boot with error messages and a 'Powered off' Linode.
Another boot command and it came up without error messages and LPM shows 'Running'.
The Linode is attempting to boot into a vanilla Debian 3.1 distro, so I don't think it's a problem with the system. |
|
|
egatenby
Joined: 19 Sep 2004
Posts: 26
Location: New York, NY
|
| Posted: Wed Apr 05, 2006 12:56 pm Post subject: Re: Reboot: host56 (graceful) |
|
|
caker wrote: The xen beta box is going to be rebooted in a few. Under heavy load (a few migrations, a deployment, and a resize), it looks like it triggered CONFIG_DETECT_SOFTLOCKUP and created a few zombie domains, preventing people from booting. I'm going to grab the latest Xen updates, turn that off, update the host kernel and reboot.
Is host56 experiencing more problems today, or has anyone else noticed anything wrong? For at least the past 2 hours, the performance has been absolutely horrible. |
|
|
Xan
Joined: 08 Feb 2004
Posts: 311
Location: Austin
|
| Posted: Wed Apr 05, 2006 12:59 pm Post subject: |
|
|
| Yes, it's been struggling... Maybe Caker's migrating some folks to the new Xen box right now. |
|
|
caker
Joined: 15 Apr 2003
Posts: 2392
Location: Galloway, NJ
|
| Posted: Wed Apr 05, 2006 1:37 pm Post subject: |
|
|
We found another bug in Xen. It looks related to what we hit last night. I've got an email thread going on the xen-devel mailing list.
If you want to be un-migrated, please open a support ticket and specify if we can just "reset" you back to the host you were previously on without moving the disks, or if you need your disk images moved.
-Chris |
|
|
Xan
Joined: 08 Feb 2004
Posts: 311
Location: Austin
|
| Posted: Wed Apr 05, 2006 1:41 pm Post subject: |
|
|
Is there a forecast for when things will be better? Are we having to wait for the Xen developers to fix something, or can we go back to the state where things were working fine?
And if we do choose to move our disk images, will that happen at a reasonable speed, or will it be subject to the same slowdown?
If there's a chance that rebooting the host will make things better, I'd say let's try it. It's not doing me a whole lot of good as is... |
|
|
caker
Joined: 15 Apr 2003
Posts: 2392
Location: Galloway, NJ
|
| Posted: Wed Apr 05, 2006 1:52 pm Post subject: |
|
|
It'll take at least another reboot for those people that can't boot currently.
I've already suspended pending migrations to the box, so anyone with a migration pending, you'll need to hold off for now.
Things seemed to work fine until the number of Linodes on the machine hit a certain threshold. If we can get a few people off the machine, I think we'll be OK while this gets resolved.
To answer your question re speed of migrating off ... I honestly don't know at this point. The disk performance might be being masked by this bug in Xen, since a few of us were able to totally thrash the box without any other domains even noticing. I've also been able to get easily 60M/sec reads, so something weird is going on.
If you're just worried about performance, check back in about 10 minutes. There's one final migration that was underway when this happened, and it's about to finish...
-Chris |
|
|
egatenby
Joined: 19 Sep 2004
Posts: 26
Location: New York, NY
|
| Posted: Wed Apr 05, 2006 2:00 pm Post subject: |
|
|
caker wrote:
If you want to be un-migrated, please open a support ticket and specify if we can just "reset" you back to the host you were previously on without moving the disks, or if you need your disk images moved.
I moved from a Dallas host to the Xen host. Is there any availability in Fremont? (I'm trying to avoid an IP change)
Thanks! |
|
|
caker
Joined: 15 Apr 2003
Posts: 2392
Location: Galloway, NJ
|
| Posted: Wed Apr 05, 2006 2:04 pm Post subject: |
|
|
egatenby wrote: I moved from a Dallas host to the Xen host. Is there any availability in Fremont? (I'm trying to avoid an IP change)
Yes, that would be best -- it would involve migrating your disk images again (no big deal).
Send me a new ticket with this request for tracking...
-Chris |
|
|