--- Day changed --- Log opened Tue Apr 04 23:59:04 2006 00:02 -!- TheFirst [gaveup@your.friendly.neighborhood.hellmouth.info] has joined #linode-xenbeta 00:07 < caker> I really need to serialize bootups 00:08 < caker> kernel msg: ebtables bug: please report to author: Wrong nr. of counters requested <-- been fighting that one since day 1 00:08 < caker> numerous fixes have gone in, but I still see it 00:09 < TheFirst> could i suggest have the control panel/lish report properly too...when the node was down it was still report as up (assuming cached from last 'check'?) 00:12 < caker> right .. the host uses that to know which Linodes to restart when it comes back up 00:13 < caker> I should add another status value, I suppose .. "Shutdown, but host will auto-start" or something 00:14 < TheFirst> yah...this is the third time i've checked it and started worrying...catches me just about every time 00:14 < caker> valen2: btw, that isn't the rdns name that's displayed on the network page -- it's the default FQDN for your IP 00:34 -!- superbeef [lane@69-165-56-198.clvdoh.adelphia.net] has quit [Quit: My damn controlling terminal disappeared!] 00:35 -!- superbeef [lane@69-165-56-198.clvdoh.adelphia.net] has joined #linode-xenbeta 00:35 < superbeef> lol still migrating eh? 00:35 < mikegrb> lolz 00:36 < caker> yeah .. it's causing unusually high lag for the guests 00:39 < superbeef> cool 00:39 < superbeef> well i'll have a xenode to look forward to in the morning with my breakfast tacos 01:36 -!- superbeef [lane@69-165-56-198.clvdoh.adelphia.net] has quit [Quit: My damn controlling terminal disappeared!] 
01:39 -!- fs2k [~cfffc5d2@webuser.linode.com] has quit [Quit: CGI:IRC (Ping timeout)] --- Log closed Wed Apr 05 02:01:44 2006 --- Log opened Wed Apr 05 02:01:49 2006 04:19 -!- vodka_ [~knarf@ip-83-134-83-209.dsl.scarlet.be] has joined #linode-xenbeta 04:20 -!- vodka [~knarf@ip-83-134-78-67.dsl.scarlet.be] has quit [Ping timeout: 480 seconds] 04:33 -!- TheFirst [gaveup@your.friendly.neighborhood.hellmouth.info] has quit [Remote host closed the connection] 04:42 -!- vodka [~knarf@ip-83-134-83-103.dsl.scarlet.be] has joined #linode-xenbeta 04:43 -!- TheFirst [gaveup@your.friendly.neighborhood.hellmouth.info] has joined #linode-xenbeta 04:43 -!- vodka_ [~knarf@ip-83-134-83-209.dsl.scarlet.be] has quit [Ping timeout: 480 seconds] 04:44 -!- TheFirst [gaveup@your.friendly.neighborhood.hellmouth.info] has quit [Remote host closed the connection] 04:50 -!- TheFirst [gaveup@your.friendly.neighborhood.hellmouth.info] has joined #linode-xenbeta 04:52 -!- vodka_ [~knarf@ip-83-134-86-46.dsl.scarlet.be] has joined #linode-xenbeta 04:53 -!- vodka [~knarf@ip-83-134-83-103.dsl.scarlet.be] has quit [Ping timeout: 480 seconds] 05:04 < linbot> New news from forums: Kernel: 2.6.16.1-linode18 with NPTL/TLS support in Linode.com Announcements 05:10 -!- vodka [~knarf@ip-83-134-86-35.dsl.scarlet.be] has joined #linode-xenbeta 05:12 -!- vodka_ [~knarf@ip-83-134-86-46.dsl.scarlet.be] has quit [Ping timeout: 480 seconds] 06:25 -!- vodka [~knarf@ip-83-134-86-35.dsl.scarlet.be] has quit [Quit: vodka] 09:43 -!- TheFirst [gaveup@your.friendly.neighborhood.hellmouth.info] has quit [Ping timeout: 480 seconds] 10:38 -!- sonorous [~simon@grolsch.attenuate.org] has left #linode-xenbeta [] 12:06 -!- valen2 [~valen@adsl-70-238-134-49.dsl.stlsmo.sbcglobal.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- phlaegel [~phlaegel@atdot.ca] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- dsoul [darksoul@vice.ii.uj.edu.pl] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- SupaZubon 
[~crack@frotz.zork.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- tierra [~tierra@ibaku.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- alnr [~alan@cpe-69-200-85-107.nyc.res.rr.com] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- Spads [~crack@dsl081-246-246.sfo1.dsl.speakeasy.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- warewolf [warewolf@warewolf.org] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- encode [~encode@blah.i.hate.w1ndo.ws] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- gpd [~gpd@70.85.16.173] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- caker [~caker@caker.netrep.oftc.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- ElectricElf [~dbharris@electricelf.noc.oftc.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- JasonF [~jay@cialis.oldos.org] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- linbot [~supybot@ns.theshore.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- sprouse [sprouse@chewbacca.infonurse.net] has quit [helium.oftc.net oxygen.oftc.net] 12:06 -!- fo0bar [fo0bar@38.99.66.211] has quit [helium.oftc.net oxygen.oftc.net] 12:11 -!- fo0bar [fo0bar@38.99.66.211] has joined #linode-xenbeta 12:12 -!- fo0bar [fo0bar@38.99.66.211] has quit [helium.oftc.net oxygen.oftc.net] 12:17 -!- fo0bar [fo0bar@38.99.66.211] has joined #linode-xenbeta 12:22 -!- SupaZubon [~crack@frotz.zork.net] has joined #linode-xenbeta 12:22 -!- gpd [~gpd@70.85.16.173] has joined #linode-xenbeta 12:22 -!- ElectricElf [~dbharris@electricelf.noc.oftc.net] has joined #linode-xenbeta 12:22 -!- alnr [~alan@cpe-69-200-85-107.nyc.res.rr.com] has joined #linode-xenbeta 12:22 -!- encode [~encode@blah.i.hate.w1ndo.ws] has joined #linode-xenbeta 12:22 -!- warewolf [warewolf@warewolf.org] has joined #linode-xenbeta 12:22 -!- JasonF [~jay@cialis.oldos.org] has joined #linode-xenbeta 12:22 -!- Spads [~crack@dsl081-246-246.sfo1.dsl.speakeasy.net] has joined #linode-xenbeta 12:22 -!- linbot 
[~supybot@ns.theshore.net] has joined #linode-xenbeta 12:22 -!- caker [~caker@caker.netrep.oftc.net] has joined #linode-xenbeta 12:22 -!- tierra [~tierra@ibaku.net] has joined #linode-xenbeta 12:22 -!- mode/#linode-xenbeta [+o caker ] by kinetic.oftc.net 12:23 -!- sprouse [sprouse@chewbacca.infonurse.net] has joined #linode-xenbeta 12:23 -!- phlaegel [~phlaegel@atdot.ca] has joined #linode-xenbeta 12:23 -!- dsoul [darksoul@vice.ii.uj.edu.pl] has joined #linode-xenbeta 12:23 -!- valen2 [~valen@adsl-70-238-134-49.dsl.stlsmo.sbcglobal.net] has joined #linode-xenbeta 12:27 -!- valen2 [~valen@adsl-70-238-134-49.dsl.stlsmo.sbcglobal.net] has quit [arion.oftc.net neutron.oftc.net] 12:27 -!- phlaegel [~phlaegel@atdot.ca] has quit [arion.oftc.net neutron.oftc.net] 12:27 -!- sprouse [sprouse@chewbacca.infonurse.net] has quit [arion.oftc.net neutron.oftc.net] 12:27 -!- dsoul [darksoul@vice.ii.uj.edu.pl] has quit [arion.oftc.net neutron.oftc.net] 12:28 -!- sprouse [sprouse@chewbacca.infonurse.net] has joined #linode-xenbeta 12:28 -!- phlaegel [~phlaegel@atdot.ca] has joined #linode-xenbeta 12:28 -!- dsoul [darksoul@vice.ii.uj.edu.pl] has joined #linode-xenbeta 12:28 -!- valen2 [~valen@adsl-70-238-134-49.dsl.stlsmo.sbcglobal.net] has joined #linode-xenbeta 12:39 -!- Kandur [Kandur@83-131-96-91.adsl.net.t-com.hr] has joined #linode-xenbeta 12:39 < Kandur> hellp 12:39 < Kandur> hello,help 12:40 < Kandur> hi? 12:40 < Spads> yes hi. 12:41 < Kandur> hey. have a question... 12:41 < Kandur> about migrating to xen 12:41 < Spads> sure 12:41 < Spads> I can try to answer 12:41 < Spads> although I'm just a lowly beta customer :) 12:42 < Kandur> i started it, and it got stuck on Migrate Filesystem, i mean it's in process for an hour now 12:42 < Spads> yeah 12:42 < Spads> how big is your filesystem? 12:42 < Kandur> 6GB 12:42 * Spads nods 12:42 < Spads> it can take a while 12:43 < Kandur> huh thanks.. 
12:43 < Spads> the way I was asked to do it 12:43 < Spads> was to shrink the filesystem down to meet the used space 12:43 < Kandur> i thought i f.. something up 12:43 < Spads> then transfer 12:43 < Spads> and re-expand 12:43 < Kandur> hmmmm 12:43 < Spads> but at this point 12:43 < Spads> you may as well just wait for it to finish 12:44 < Spads> you can queue up a boot already too, I think 12:44 < Kandur> it will finish, wont it? i allready did 12:44 < Spads> yeah, it will finish 12:44 < Kandur> how's xen like anyhow? 12:45 < Spads> oh man, it's such an improvement 12:45 < Spads> to give you an idea of how much better, when this new xen host went up, caker and I and someone else started stress-testing it 12:46 < Spads> caker ran a huge memory-consuming process 12:46 < Spads> as did someone else 12:46 < Spads> I set off a forkbomb on my node 12:46 < Spads> and we kept asking people "hey, see our load yet?" 12:46 < Spads> and folks were all "Um, what are you talking about?" 12:46 < Kandur> hehhehee nice 12:47 < Spads> so we had two nodes swapping like mad, mine consuming 200% CPU when it could, and still people on other nodes could easily slip in and run their programs 12:47 < Spads> without even knowing we were there 12:47 < Spads> and, hey... NO IO TOKENS 12:48 < warewolf> so caker, there are no problems when one or more guests in Xen are thrashing :) 12:48 < Spads> pardon my ignorance, but is Hrvatska the same as what we call Croatia? 12:49 < Kandur> yes it is 12:49 < Spads> heh 12:49 < Spads> phew! 12:49 < Spads> my mental map of Europe is still stuck in 1987, unfortunately 12:49 < Kandur> just a sec 12:49 < Spads> I get pretty confused when I head east of Berlin 12:50 < Kandur> i have a call...just a sec 12:50 * Spads nods 12:52 < caker> warewolf: well, not that we saw. But I did see some performance issues lsat night with a bunch of migrations going on. I think the host domain is stealing CPU time. 
I'm going to spend the day reseaching all the scheduling features/options of Xen 12:52 < Spads> the migration processes are peers to the hypervisor, yes? 12:53 < Spads> or I guess the xen hypervisor has more tendrils in the kernel... 12:53 < caker> no, they're in dom0, which is a "guest" like the other nodes 12:53 < caker> it's just that guest has h/w access 12:53 < Kandur> i'm back! 12:54 < Kandur> :) 12:54 < caker> oh, and the back-end drivers for disk and whatnot -- all the other guests talk to it to to have it service disk/net requests 12:58 < Kandur> does anybody know how long does it take to transfer 6GB filesystem to xen? 13:00 < caker> Kandur: from which host? 13:00 < Kandur> host11 13:00 < caker> hmm.. I'd say 5-10MB/sec, easy 13:00 < Kandur> ok 13:01 < Kandur> thanks 13:03 < Kandur> wait that means that it should last like 10 min? and it's lasting over an hour now 13:05 < caker> it also depends on the load on both the hosts 13:06 < Kandur> ok. i'm going to leave it and come back in an hour. adios! 13:06 < Kandur> exit 13:06 -!- Kandur [Kandur@83-131-96-91.adsl.net.t-com.hr] has left #linode-xenbeta [] 13:18 -!- orospakr [~orospakr@ip-151.52.99.216.dsl-cust.ca.inter.net] has joined #linode-xenbeta 13:39 < sunny> hey 13:40 < sunny> something wrong ? 13:49 < Spads> ? 13:50 < sunny> it says sucess 13:50 < sunny> *success 13:50 < sunny> but gives the erorrs: 13:50 < sunny> xen_linode_boot: failed to get domid 13:50 < sunny> xen_linode_boot: warning - li-network might not have ran 13:51 < sunny> and it fails to power on 13:55 -!- Kandur [Kandur@83-131-96-91.adsl.net.t-com.hr] has joined #linode-xenbeta 13:56 < Kandur> hello again 13:56 < Kandur> i booted up my xen and got this problem xen_linode_boot: failed to get domid 13:56 < Kandur> xen_linode_boot: warning - li-network might not have ran 13:56 < sunny> Kandur: I get the same issue 13:57 < Kandur> tried to do a reboot and reboot again and failed 13:58 < Kandur> sunny: do you now why is this happening? 
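Kandur's "like 10 min?" estimate can be checked with a little arithmetic. The sketch below assumes 1 GB = 1024 MB and a perfectly sustained transfer rate; the function name is made up for illustration. As caker points out, load on both hosts can stretch the real figure well past this.

```python
def transfer_minutes(size_gb: float, rate_mb_per_s: float) -> float:
    """Minutes needed to copy size_gb gigabytes at a sustained rate in MB/sec."""
    size_mb = size_gb * 1024  # assumption: 1 GB counted as 1024 MB
    return size_mb / rate_mb_per_s / 60


# Kandur's 6GB filesystem at both ends of caker's 5-10MB/sec estimate.
for rate in (5, 10):
    print(f"{rate} MB/sec: about {transfer_minutes(6, rate):.0f} minutes")
```

So the ideal range is roughly 10 to 20 minutes, which is why an hour-long migration looked suspicious.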
13:58 < sunny> no 13:59 < Kandur> did staff reply to your problem? 14:07 < alnr> if i've a migration pending will it automatically on next boot? 14:07 < caker> automatically what? 14:07 < alnr> sorry, migrate 14:07 < caker> no 14:07 < alnr> ok good :) 14:07 < caker> but if you're changing datacenters, your IPs won't work 14:08 < caker> Kandur/sunny: hold off on reboots for a bit, ok? 14:08 < Kandur> ok 14:09 < caker> looks like a timeout issue 14:20 < Kandur> caker: any progress? 14:35 < sunny> caker: sure 14:36 < sunny> just lemme know when I can power on :) 14:39 < caker> Kandur / sunny: I think it might be best if we un-migrate you back where you guys were. 14:40 < caker> Kandur / sunny: Please let me know if I can just reset you back where you were, without migrating these filesystems 14:41 < sunny> now is fine 14:41 < caker> sunny: ok .. which host were you previously on? 14:41 < sunny> o_O 14:41 < sunny> I think 11 ? 14:41 * sunny has *no* clue 14:41 < caker> ok, I'll check 14:41 < caker> make sure you're logged OUT of the LPM 14:41 < sunny> caker: done 14:42 < caker> sunny: ok, give me 5 14:42 < sunny> sure 14:46 < caker> sunny: looks like you're back home 14:49 < sunny> cool 14:49 < sunny> yay 14:53 < Kandur> hey it's ok with me 14:54 < Kandur> caker: take me back where i belong 14:54 < caker> Kandur: ok, which host were you on? 14:54 < Kandur> huh...i think 11 14:54 < caker> yup... 14:54 < caker> Kandur: please log out of the website 14:55 < Kandur> i'm not loged in 14:56 < caker> Kandur: ok, you're back home 14:57 < Kandur> caker: thanks man! 14:57 -!- fmdns1 [~466cfc96@webuser.linode.com] has joined #linode-xenbeta 14:57 < fmdns1> I am having issues with a new XEN install. 14:57 < caker> what's up? 14:57 < fmdns1> I use ubuntu and it won't bootup 14:58 < caker> username? 14:58 < Kandur> caker: will we have a chance to try again? 
14:58 -!- Jeremy [stormy@2001:4830:2064::9] has joined #linode-xenbeta 14:58 < fmdns1> fmiranda 14:58 < caker> Kandur: yes 14:58 < caker> fmdns1: I'm going to reset you back to host44. Please log out of the website, and let me know when that's done 14:59 < fmdns1> I'm logged out. 15:00 < Kandur> caker: you'll post on forum when this problem is solved? 15:01 < caker> Kandur: yes 15:01 < caker> fmdns1: ok, you're back on host44, and booted up 15:01 < fmdns1> Any ideas what could the problem be? 15:02 < caker> yeah, a problem in xen's backend block device driver 15:02 < caker> http://lists.xensource.com/archives/html/xen-devel/2006-04/msg00170.html 15:03 < fmdns1> Thanks, I will try again when this is solved. 15:09 -!- Kandur [Kandur@83-131-96-91.adsl.net.t-com.hr] has left #linode-xenbeta [] 15:29 -!- taupehat [me@taupehat.com] has joined #linode-xenbeta 16:09 < valen2> Humm 16:09 < valen2> Think my node just zombied out again. :/ 16:10 < taupehat> you on 56? 16:10 < valen2> Yep 16:10 < taupehat> same here 16:10 < taupehat> ping caker 16:10 < taupehat> it's doggy 16:10 < caker> yes? 16:10 < taupehat> host56 is slow for both us us 16:10 < taupehat> of us* 16:10 < valen2> Actually 16:10 < valen2> It's not that 16:10 < valen2> It's doing what it did last night 16:11 < taupehat> I'm getting extremely slow throughput and performance 16:11 < caker> want to be reset back where you were before migrating? 
16:11 < taupehat> heh 16:11 < taupehat> traceroute is interesting: 16:11 < taupehat> 12 pos10-0.gsr12416.fmt.he.net (216.218.229.38) 44.808 ms * 43.478 ms 16:11 < taupehat> 13 pos8-0.gsr12012.fmt.he.net (66.220.20.138) 55.045 ms 41.942 ms 42.205 ms 16:11 < taupehat> 14 taupehat.com (64.62.231.41) 72.414 ms 274.323 ms 249.821 ms 16:11 < valen2> I'm thinking something is up with Xen 16:12 < caker> boots won't work until I reboot the host 16:12 < caker> xen broke 16:12 < taupehat> d'oh 16:12 < caker> http://lists.xensource.com/archives/html/xen-devel/2006-04/msg00170.html <-- 16:13 < taupehat> "It also appears that after this happens, no new block devices can be attached." 16:13 < taupehat> zoinks 16:13 < caker> yup 16:14 < taupehat> this is less than ideal =] 16:14 < taupehat> anyhow 16:14 < taupehat> are we in zombie mode right now? 16:14 < caker> nodes that are running are ok 16:14 < caker> no new nodes can boot 16:14 < caker> until I bounce the host 16:14 < taupehat> eh 16:14 < taupehat> I'm getting horrible performance 16:15 < taupehat> less than ok =] 16:15 < valen2> Yea 16:15 < caker> yeah, that's another problem 16:15 < valen2> I was getting the performance issue 16:15 < caker> I haven't figured that one out yet .. performance is terrible 16:15 < valen2> Which caused me to try and bounce my node. 16:15 < taupehat> heh 16:15 < taupehat> oops 16:15 < taupehat> well 16:15 < caker> it was all-ok until we hit a certain number of nodes, or IO ... 16:15 < taupehat> bounce if you need to 16:15 < taupehat> hmm 16:16 < valen2> Humm 16:16 < taupehat> I wonder if there isn't some kind of race condition or looping that happens when the host swaps 16:16 < taupehat> or caches IO 16:16 < caker> I think it's xen's backend block driver 16:16 < caker> it's looping like mad trying to service a disconnected request 16:16 < taupehat> heh 16:16 < taupehat> can you write a watchdog for it? 
16:17 < taupehat> I'm assuming you're referring to this bit: 16:17 < taupehat> Apr 5 14:28:40 host56 kernel: xvd 73 fd:85: I/O pending, delaying exit 16:17 < taupehat> Apr 5 14:28:40 host56 kernel: xvd 73 fd:85: not connected (13 pending) 16:17 < caker> yup 16:18 < caker> and that leads to zombies, and what I suspect is the perf problem, although that's unconfirmed 16:18 < taupehat> odd 16:18 < caker> anyhow .. lemme queue up a reboot 16:18 < taupehat> and you're getting about 12 hours per boot out of it right now, eh? 16:18 < caker> No, this happened right away for some people 16:19 * taupehat will d/c when the reboot happens 16:20 < caker> there was also a timeout problem in Xend waiting for devices to connect .. I've increased that from 10 seconds to 2 minutes 16:20 < taupehat> do you have any patches inline for this boot? 16:20 < caker> no 16:20 < caker> other than my fix above 16:21 < caker> that's where the majority of the "failed to get domid" errors came from, and what I *think* initiated all of this 16:21 < taupehat> ok 16:21 < caker> of course, once it went into this bad state, timeout didn't matter -- it's not going to attach 16:23 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has joined #linode-xenbeta 16:24 < FireRabbit> hello, is there someone here that can help me? my linode wont start apparently 16:24 < FireRabbit> Host Message xen_linode_boot: failed to get domid 16:24 < FireRabbit> xen_linode_boot: warning - li-network might not have ran 16:24 < valen2> Um 16:24 < valen2> Doesn't matter... Host 56 is going down soon. 16:25 < FireRabbit> its going down? 16:25 < FireRabbit> what does that mean for me? 16:25 < Spads> http://www.google.com/search?hl=en&lr=&q=failed+to+get+domid 16:25 < valen2> There's issues with nodes not starting right. 16:25 < Spads> # If we try to update and fail, we must have been + # deleted from the hypervisor 16:25 < Spads> deleted huh 16:25 < Spads> FireRabbit: did you migrate? 
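taupehat's watchdog idea could start from something as small as the sketch below, which scans log lines for the "not connected (N pending)" kernel message quoted above. The regular expression is fitted to those two sample lines only; the meaning of the individual fields is not explained in the log, so the tuple is returned without interpreting it, and the function name is hypothetical.

```python
import re

# Fitted to the "not connected (N pending)" kernel message quoted in the log.
PENDING = re.compile(r"kernel: xvd (\d+) (\S+): not connected \((\d+) pending\)")


def stuck_devices(log_lines):
    """Return (number after 'xvd', device field, pending count) per matching line."""
    hits = []
    for line in log_lines:
        match = PENDING.search(line)
        if match:
            hits.append((int(match.group(1)), match.group(2), int(match.group(3))))
    return hits


sample = [
    "Apr 5 14:28:40 host56 kernel: xvd 73 fd:85: I/O pending, delaying exit",
    "Apr 5 14:28:40 host56 kernel: xvd 73 fd:85: not connected (13 pending)",
]
print(stuck_devices(sample))
```

A real watchdog would still have to decide what to do on a hit; per caker, the only cure at this point was rebooting the host.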
16:25 < FireRabbit> i did 16:25 < Spads> huh 16:25 < FireRabbit> iam regretting it now 16:26 < FireRabbit> my system has hardly been up at all since 16:26 < Spads> well, I guess caker is working on this then 16:26 < caker> Spads: that's my error message, not from something else 16:26 < Spads> ahhh 16:26 < valen2> He's currently working on the so called "zombie node" issue. 16:27 < valen2> Where xend would get messed up and not allow any nodes to boot. 16:27 < FireRabbit> host 56 is going down indefinetly? are you moving all the xen domains to another box? what's the plan? should i open a support ticket? 16:27 < caker> FireRabbit: wait a few minutes, please 16:27 < FireRabbit> okay. 16:28 < valen2> No. It's temporary. 16:29 < linbot> New news from forums: Reboot: host56 (graceful) #2 in Xen Public Beta 16:30 < mikegrb> There once was a cake from Stafford. 16:30 < mikegrb> He found Xen to please the landlord. 16:30 < mikegrb> He lost his domid, so he tried again 16:30 < mikegrb> But the daemon fell on a broadsword. 16:31 < caker> host56:~# xm list | grep Zombie | wc -l 16:31 < caker> 39 16:31 < caker> nice... 16:31 < valen2> Oh my 16:32 < mikegrb> There once was a cake from Stafford. 16:32 < mikegrb> He found Xen to ease the discord. 16:32 < mikegrb> He lost his domid, so he tried again 16:32 < mikegrb> But the daemon acted of its own accord. 16:32 < mikegrb> ok, I'm done 16:33 < mikegrb> the creative stuff sure is hard 16:33 < FireRabbit> mikegrb, haha, nice 16:33 < mikegrb> s/the/this/ 16:34 -!- taupehat [me@taupehat.com] has quit [Ping timeout: 480 seconds] 16:36 < valen2> The system is going down for reboot NOW! <-- Just seen on host56 ssh session. 16:37 < Spads> heh 16:37 < FireRabbit> yep 16:42 < FireRabbit> should I hold off on booting my linode? 
16:43 < Spads> probablyt 16:44 < Spads> there may be a host-initiated load 16:44 < caker> not if it wasn't running at time of shutdown 16:44 < caker> FireRabbit: so yes, issue a boot, if there isn't already one in the queue 16:45 < Spads> ah 16:45 < FireRabbit> nothing in the queue.. ill do that now 16:46 < valen2> Waiting on host to start boot process 16:47 < caker> It's stupid they've hardcoded all these timeouts all over the place 16:48 < caker> their "shutdown or kill" timeout value is way too small, too 16:48 < FireRabbit> hmmmm my ssh session to host56 died after i issued the boot 16:48 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has left #linode-xenbeta [Leaving] 16:48 < valen2> Humm 16:49 < valen2> Maybe Xen needs a lesson in configuration files? 16:49 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has joined #linode-xenbeta 16:49 < Spads> valen2: not half as much as asterisk does 16:50 < FireRabbit> okay, boot command is queued 16:50 < mikegrb> Spads: huh? 16:51 < Spads> mikegrb: asterisk needs to learn a different lesson than xen seem sto 16:51 < mikegrb> that lesson is? 16:51 < Spads> I'd say it's "either design a config file format or a scripting language. Don't try to pretend there's an in-between" 16:53 -!- TheFirst [gaveup@CPE-70-92-72-102.new.res.rr.com] has joined #linode-xenbeta 16:54 -!- sednet [~51cf1045@webuser.linode.com] has joined #linode-xenbeta 16:54 -!- taupehat [me@taupehat.com] has joined #linode-xenbeta 16:54 < taupehat> much nicer =] 16:54 < Spads> asterisk feels like they started with a quick test-harness parser for stuff, and just kept making up syntax as they needed more features 16:54 < taupehat> maybe they did =] 16:55 < sednet> Hows host56? 16:55 < taupehat> happier for me at least 16:55 < sednet> Hmm. my linode isn't booting 16:56 < valen2> Same here. 
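caker's complaint about hardcoded timeouts suggests the usual alternative: a polling wait whose limit is passed in rather than baked in. This is a generic sketch, not Xend's code; `wait_until` and the constant name are invented here, and only the 10 second and 2 minute figures come from the log.

```python
import time


def wait_until(ready, timeout_s, poll_s=0.5, clock=time.monotonic, sleep=time.sleep):
    """Poll ready() until it returns True or timeout_s seconds have elapsed."""
    deadline = clock() + timeout_s
    while True:
        if ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)


# Hypothetical setting: caker raised the device-connect wait from 10 seconds to 2 minutes.
DEVICE_CONNECT_TIMEOUT_S = 120

# A device that is already connected returns immediately.
print(wait_until(lambda: True, DEVICE_CONNECT_TIMEOUT_S))
```

Taking `clock` and `sleep` as parameters keeps the function testable without real waiting.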
16:56 < caker> give it time 16:56 < caker> I added a delay in between jobs 16:56 < valen2> Ah 16:56 < valen2> Well 16:56 < valen2> I hope the delay is temporary. :) 16:56 < sednet> It says I've been waiting since 01/01/1974 :) 16:57 < valen2> It's been almost 15 minutes. 16:57 < Spads> sednet: for priority 16:57 < caker> I really should make that my birthday .. it's only 3 days off 16:57 < taupehat> hehe 16:58 < caker> anyhow .. so far, no "failed to get domid" boots 16:58 < mikegrb> caker started time 16:58 < taupehat> w00t 16:58 < taupehat> caker is the Flying Spaghetti Monster 16:58 < Spads> he found the domid 16:58 < Spads> maybe he's lost our swaps again! 16:58 < valen2> taupehat: Was your Linode just touched by his noodly appendage? 16:59 < taupehat> indeed it was 16:59 < FireRabbit> my boot command is still in the queue here 17:00 -!- orospakr [~orospakr@ip-151.52.99.216.dsl-cust.ca.inter.net] has quit [Quit: Ex-Chat] 17:00 < sednet> W00t 17:02 < mikegrb> FireRabbit: they go one at a time 17:02 < FireRabbit> ah 17:02 < sednet> Mine is going rather slowly 17:02 < Spads> oh, mine was not queued 17:02 < Spads> I must have misplaced my domid today 17:03 < Spads> good thing caker found it! 17:03 < caker> crap this is slow 17:03 < taupehat> baby steps... 17:05 < sednet> Any idea whats causing the delay? 17:07 < caker> it's either CPU sched related (which I can tweak), or some bug in the backend driver, or both 17:08 < taupehat> [===============>.....] recovery = 76.7% (170371456/221921792) finish=25.9min speed=33046K/sec 17:08 < taupehat> zzz 17:09 < sednet> I've seen DMA switched off on disks make xen slow as hell? 17:10 < sednet> Surely it's not that 17:13 -!- TheFirst [gaveup@CPE-70-92-72-102.new.res.rr.com] has quit [Quit: You're a bloody puppet!] 17:17 < sednet> w00t. It's up. 
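The "one at a time" boot queue with "a delay in between jobs" that caker and mikegrb describe can be sketched as follows. How the real job queue works is not shown in the log; the function and the job callables here are illustrative only.

```python
import time
from collections import deque


def run_serially(jobs, delay_s, sleep=time.sleep):
    """Run queued jobs one at a time, pausing delay_s seconds between them."""
    queue = deque(jobs)
    results = []
    while queue:
        job = queue.popleft()
        results.append(job())
        if queue:  # no need to wait after the last job
            sleep(delay_s)
    return results


# Stand-in boot jobs; a zero delay keeps the demo instant.
boots = [lambda name=name: f"booted {name}" for name in ("valen2", "sednet", "FireRabbit")]
print(run_serially(boots, delay_s=0))
```

The trade-off is the one people in the channel are feeling: serializing avoids a boot storm on the host, but the last job in the queue waits for everyone ahead of it.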
17:19 < caker> xm sched-sedf 0 0 0 0 1 1
17:19 < caker> ^-- seems to have helped
17:24 -!- cow [Ap0ll0@modemcable160.99-83-70.mc.videotron.ca] has quit [Read error: Connection reset by peer]
17:27 < FireRabbit> okay, mine is back up... seems a bit slower than it used to be though
17:30 < caker> sednet: yeah, doubtful -- hardware raid
17:30 < linbot> New news from forums: What are sit0 and gre0? in Xen Public Beta
17:31 < caker> cache is on, it looks good to go
17:32 < Spads> well I'm up
17:32 < Spads> a little sluggish though
17:32 < caker> only a little?
17:32 < Spads> are people still fscking or something?
17:32 < caker> not that I see
17:32 < Spads> yeah, not as bad as my UML node :)
17:33 < caker> heh
17:34 < caker> # time xm list
17:34 < caker> real 0m27.305s
17:34 < caker> user 0m0.280s
17:34 < caker> sys 0m0.020s
17:34 < caker> that's pretty sad
17:37 < Spads> ouch
17:38 < Spads> [nick@golgatem(~)] time ls /usr/lib
17:38 < Spads> real 0m11.477s
17:38 < Spads> user 0m0.000s
17:38 < Spads> sys 0m0.000s
17:38 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has quit [Ping timeout: 480 seconds]
17:38 < caker> hvae something to test cpu-only time?
17:38 < Spads> hmmm
17:39 < caker> like number of loops in 10 seconds, or something?
17:39 < Spads> hard to be pure CPU-only
17:39 < Spads> but maybe CPU with lots of RAM access
17:39 < Spads> I mean
17:39 < Spads> second ls is just this:
17:39 < Spads> real 0m0.029s
17:39 < Spads> user 0m0.010s
17:39 < Spads> sys 0m0.000s
17:39 < Spads> cache helps
17:39 < caker> well, that's fine -- just something that's not tied to disk
17:39 < valen2> Humm
17:39 < Spads> hmmm
17:39 < Spads> lemme see
17:40 < caker> or .. how long it takes to do 10000 loops in bash
17:40 < Spads> yeah
17:40 < Spads> I've got an idea
17:40 < Spads> why don't I have xrange...
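caker's "number of loops in 10 seconds" test, something not tied to disk, might look like the sketch below. It counts empty loop iterations against wall-clock time, so CPU stolen by other domains shows up as a lower count. The function name is made up, and the demo uses one second rather than ten to keep it quick.

```python
import time


def count_loops(seconds: float) -> int:
    """Count how many empty loop iterations complete in `seconds` of wall time."""
    deadline = time.perf_counter() + seconds
    loops = 0
    while time.perf_counter() < deadline:
        loops += 1
    return loops


# The absolute number only means something when compared across machines or runs.
print(count_loops(1.0))
```

Since the loop touches no files, a count that stays healthy while `ls` crawls points at I/O rather than CPU.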
17:40 < caker> these sched options are blowing my mind
17:40 < Spads> bah
17:41 < caker> host56:~# xm help sched-sedf
17:41 < caker> sched-sedf [DOM] [OPTIONS]   Show|Set simple EDF parameters
17:41 < caker>   -p, --period     Relative deadline(ms).
17:41 < caker>   -s, --slice      Worst-case execution time(ms)
17:41 < caker>                    (slice < period).
17:41 < caker>   -l, --latency    scaled period(ms) in case the domain
17:41 < caker>                    is doing heavy I/O.
17:41 < caker>   -e, --extra      flag (0/1) which controls whether the
17:41 < caker>                    domain can run in extra-time
17:41 < caker>   -w, --weight     mutually exclusive with period/slice and
17:41 < caker>                    specifies another way of setting a domain's
17:41 < caker>                    cpu period/slice.
17:41 < caker> host56:~# xm sched-sedf
17:41 < caker> Name        ID  Period(ms)  Slice(ms)  Lat(ms)  Extra  Weight
17:41 < caker> Domain-0     0        20.0       15.0      0.0      1       0
17:41 < caker> acot        42       100.0        0.0      0.0      1       0
17:42 < caker> all the guests are set like acot's
17:42 < caker> I think that's a clue
17:43 < caker> I need to dumb down the other domains
17:43 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has joined #linode-xenbeta
17:45 < Spads> [nick@golgatem(~)] time for i in 1 2 3 4 5 6 7 8 9 0; do echo $i | md5sum -; done
17:45 < Spads> real 0m0.028s
17:45 < Spads> user 0m0.000s
17:45 < Spads> sys 0m0.010s
17:46 < Spads> that's faster than my laptop
17:46 < Spads> by an order of magnitude
17:46 < Spads> but it's only 10
17:46 < valen2> It's IO
17:46 < sednet> spads - x=0;time while [[ $x -lt 100000 ]]; do :;(( x++ )); done
17:46 < Spads> yeah it is
17:46 < Spads> hmmm
17:47 < Spads> I get about 2-3s for that on xen
17:47 < Spads> 5-6 on my laptop
17:47 < Spads> CPU doesn't seem to be the problem
17:48 < Spads> [nick@golgatem(~)] time for i in $( seq 1 1000 ); do echo $i | md5sum - > /dev/null; done
17:48 < Spads> real 0m3.215s
17:48 < Spads> user 0m0.390s
17:48 < Spads> sys 0m1.610s
17:49 < Spads> [nick@xyzzy(~)] time for i in $( seq 1 1000 ); do echo $i | md5sum - > /dev/null; done
17:49 < Spads> real 0m8.967s
17:49 < Spads> user 0m5.876s
17:49 < Spads> sys 0m1.759s
17:51 < linbot> New news from forums: Spamlists and exim4? in Email/SMTP Related Forum
17:56 -!- sednet [~51cf1045@webuser.linode.com] has quit [Quit: bye]
17:57 < caker> better now?
17:57 < Spads> ooh
17:57 < Spads> login was quick
17:59 < caker> which way do you think "weight" would ... provide more/less cpu juice to a domain?
17:59 < Spads> haha
17:59 < Spads> the old priority problem
17:59 < caker> these docs suck
17:59 < Spads> "Wait, is a higher priority number ordinal or cardinal?"
18:00 < Spads> "Is priority 1 less priority than 2, or is it 1st priority vs 2nd?"
18:01 < caker> anyhow .. does time ls /usr/lib not suck as bad as it did?
18:01 < Spads> [nick@golgatem(~)] time ls -lR /usr/lib
18:01 < caker> bah, mine too 5 seconds, non recursive
18:01 < caker> *took
18:02 < Spads> mine went quick and is now bogged a bit
18:02 < Spads> and it's not TCP lag
18:02 < caker> ok .. .012 that time
18:02 < caker> cache, I guess
18:02 < Spads> yep
18:02 < Spads> cache makes fools of us all
18:03 < Spads> my recursive one is still going
18:03 < Spads> haha
18:03 < Spads> two minutes so far
18:04 < caker> what a jole
18:04 < caker> joke
18:08 * caker codes up a shell script to set the weight for all the guests
18:10 < Spads> real 4m4.132s
18:10 < Spads> user 0m0.030s
18:10 < Spads> sys 0m0.020s
18:10 < Spads> !!
18:10 < Spads> linbot: hush
18:16 < Spads> of course it's only 13 seconds the second time
18:16 < caker> looks like higher weights get more CPU
18:16 < Spads> heh
18:16 < caker> setting the host to 512, and the guests to 80 (for now)
18:17 < caker> on the 5th one ...
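caker's script for setting the weights is not shown in the log. A dry-run sketch of the idea follows: it only builds command strings and runs nothing. The `xm sched-sedf <domain> -w <weight>` form is an assumption read off the `sched-sedf [DOM] [OPTIONS]` help text pasted above, and the function name is invented; the 512 and 80 values are the ones caker mentions.

```python
def weight_commands(guest_names, host_weight=512, guest_weight=80):
    """Build one `xm sched-sedf` command per domain without running anything."""
    # Assumed invocation form, inferred from the help output in the log.
    commands = [f"xm sched-sedf Domain-0 -w {host_weight}"]
    for name in guest_names:
        commands.append(f"xm sched-sedf {name} -w {guest_weight}")
    return commands


# "acot" is the one guest name visible in the pasted scheduler table.
for command in weight_commands(["acot"]):
    print(command)
```

Printing the commands first makes it easy to eyeball them before anything touches a host where each `xm` call is already taking many seconds.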
18:17 < Spads> heh 18:17 < caker> (slowwwwwwwwwwwwww &*(&*(@&#*(@ 18:17 -!- TheFirst [gaveup@your.friendly.neighborhood.hellmouth.info] has joined #linode-xenbeta 18:17 < Spads> the CSUA at berkeley used to have like 1000 users on at once 18:17 < Spads> on a single pentium machine 18:17 < caker> this has got to help 18:18 < Spads> like, you'd log in and you'd get like pts;% 18:18 < Spads> because they ran out of numbers and letters 18:18 < caker> geez 18:18 < Spads> and they used a lottery-based scheduler 18:18 < Spads> which worked pretty well 18:18 < caker> ?? random, essentailly? 18:18 < Spads> no 18:18 < Spads> it was in some ways like your IO bucket system 18:19 < Spads> idle time earned you lottery tickets 18:19 < caker> huh 18:19 < TheFirst> the xen box still acting up? things seem sluggish 18:19 < Spads> and if one of your tickets won, you got a timeslice 18:19 < caker> TheFirst: I'm working on it 18:19 < TheFirst> ah k 18:20 < TheFirst> what's the issue? 18:20 < caker> http://en.wikipedia.org/wiki/Earliest_deadline_first_scheduling 18:21 < Spads> what is up with wikipedia's animated moon GIF logo? 18:21 < caker> on that page? 18:21 < caker> no animation for me 18:23 < Spads> their logo 18:23 < Spads> top left 18:23 < caker> yeah, I don't seen any animation in it 18:23 < Spads> oh weird 18:24 < Spads> super weird 18:24 < Spads> I can't see the image 18:24 < Spads> firefox doesn't see it 18:24 < Spads> just a background... 18:24 * Spads did a force-reload 18:24 < Spads> probably cache nonsense 18:30 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has quit [Ping timeout: 480 seconds] 18:32 < caker> any better now? 
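The lottery-based scheduler Spads describes (tickets earned by idling, a timeslice for whoever holds the winning ticket) can be illustrated with a weighted random draw. This is a toy model of the general technique, not the CSUA machine's actual scheduler; the user names and ticket counts are invented.

```python
import random


def draw_winner(tickets, rng):
    """Pick one user, with probability proportional to tickets held."""
    users = list(tickets)
    weights = [tickets[user] for user in users]
    return rng.choices(users, weights=weights, k=1)[0]


rng = random.Random(0)  # seeded so repeated runs give the same tallies
tickets = {"idle_user": 9, "busy_user": 1}  # hypothetical ticket counts
wins = {"idle_user": 0, "busy_user": 0}
for _ in range(1000):
    wins[draw_winner(tickets, rng)] += 1
print(wins)
```

This also answers caker's "random, essentially?": the draw is random, but the odds are not uniform, so a user who has been idle gets most of the timeslices when they finally need them.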
18:34 < Spads> hard to say
18:34 < Spads> hmm
18:34 < Spads> lots of stuff is cached
18:34 < caker> # hdparm -t /dev/sda
18:34 < caker> /dev/sda:
18:34 < caker> Timing buffered disk reads: 2 MB in 3.66 seconds = 559.10 kB/sec
18:34 < caker> pitiful
18:34 < caker> no clue what's going on
18:34 < caker> we had this box rockin early on
18:34 < Spads> heh
18:34 < TheFirst> things are still sligish
18:34 < Spads> well a shaboom looks to happen okay
18:35 < TheFirst> though the load is down a little bit...
18:35 < caker> host57:~# hdparm -t /dev/sda
18:35 < Spads> [nick@golgatem(~)] time sudo updatedb
18:35 < caker> /dev/sda:
18:35 < caker> Timing buffered disk reads: 172 MB in 3.00 seconds = 57.28 MB/sec
18:35 < Spads> let's see how that goes
18:35 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has joined #linode-xenbeta
18:35 < TheFirst> before just logging in cause a 1.5-2.0 load...now it's down to .9-1.0 at least
18:38 < Spads> [nick@golgatem(~)] time sudo updatedb
18:38 < Spads> real 2m50.201s
18:38 < Spads> user 0m0.670s
18:38 < Spads> sys 0m0.080s
18:39 < Spads> [nick@golgatem(~)] time sudo updatedb
18:39 < Spads> real 0m3.032s
18:39 < Spads> user 0m1.050s
18:39 < Spads> sys 0m0.200s
18:39 < Spads> lol cache
18:39 < mikegrb> lolz
18:40 < caker> ok, I'm going to do the pause-all, unpause one by one trick to see if it's a specific node
18:41 < Spads> haha
18:41 < caker> yeah, it hauls ass with everyone paused
18:41 < Spads> isolate variables baby
18:42 < TheFirst> load numbers shouldn't be affected by other nodes, right?
18:43 < caker> found one
18:43 < TheFirst> found one? problem node?
18:43 < Spads> what's it doing?
18:44 < caker> I can't see anything from /proc/ the nodes anymore, so hard to tell
18:44 < caker> found another
18:44 < TheFirst> whatever you did things are blazing now
18:45 < caker> ok, found two nodes .. they must be thrashing :(
18:45 < caker> I've kept them paused ...
18:45 < caker> how are things?
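The "pause-all, unpause one by one trick" is a simple isolation search. The sketch below models it with injected callbacks, since how caker actually paused domains and judged slowness is not shown; the domain names, the callbacks, and the simulated "thrashing" set are all hypothetical.

```python
def find_offenders(domains, pause, unpause, host_is_slow):
    """Pause every domain, resume them one by one, and keep slow-makers paused."""
    for domain in domains:
        pause(domain)
    offenders = []
    for domain in domains:
        unpause(domain)
        if host_is_slow():
            pause(domain)  # leave the suspect paused and carry on
            offenders.append(domain)
    return offenders


# Simulation: the host counts as slow whenever a thrashing domain is running.
running = set()
thrashing = {"b", "d"}  # hypothetical domains that swamp the disk
offenders = find_offenders(
    ["a", "b", "c", "d"],
    pause=running.discard,
    unpause=running.add,
    host_is_slow=lambda: bool(running & thrashing),
)
print(offenders)
```

Re-pausing each suspect before moving on is what lets the search find more than one offender in a single pass, matching caker's "found one ... found another".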
18:45 < Spads> fine
18:45 < TheFirst> blazin'
18:46 < valen2> Speedy Gonzalez...
18:46 < caker> shitty
18:46 < valen2> It works.
18:46 < Spads> to be honest, I didn't really notice the problem earlier
18:46 < TheFirst> 18:45:02 up 1:57, 1 user, load average: 0.02, 0.42, 0.62
18:46 < TheFirst> huge diff...
18:46 < Spads> yeah
18:46 < Spads> I'm 0.00 0.18 0.20
18:46 < Spads> although the pause probably bolloxes some of that
18:46 < valen2> My IMAP actually works now (previously it would timeout). :)
18:46 < TheFirst> a ls was causing my load to go >1.5 a few min ago
18:46 < caker> /dev/sda1:
18:46 < caker> Timing buffered disk reads: 134 MB in 3.09 seconds = 43.42 MB/sec
18:46 < caker> :)
18:47 < TheFirst> ok so who were the offenders so we can deal with them appropriately ;P
18:47 < caker> bendy was one
18:47 * TheFirst looks for his beatin' bat
18:47 < caker> the other isn't on irc
18:48 < Spads> so what is it these people are doing that swamps I/O?
18:48 < Spads> swap?
18:48 < valen2> I think it has to do with swap file/partition usage
18:48 < caker> not sure ... the b/o numbers were in the 2-3k range
18:48 < Spads> hmmm
18:48 < caker> when we were thrashing, we had it doing 30k
18:48 < Spads> yeah
18:49 < valen2> I think what happens is... They load up too much, eating up all free RAM... Then it goes to the swap file.
18:49 < Spads> valen2: yeah, that's what "thrashing" refers to, historically
18:49 < caker> valen2: that's how it works :)
18:49 < caker> even with memory upgrades, people are still going to misconfigure stuff, OOM, etc
18:50 < valen2> Yea.
18:50 < Spads> but I had a mofoin FORK BOMB going on my node
18:50 < Spads> and nobody noticed
18:50 < caker> it'll be less of an issue with default configs, but that shit still will happen
18:50 < valen2> People that have a 1024M buffer on their MySQL
18:50 < caker> pity that real-world it wasn't more resilient
18:50 < Spads> heh
18:50 < Spads> still
18:50 < Spads> it's a damn sight better than UML
18:50 < caker> that's got to change
18:50 < Spads> I mean
18:50 < Spads> I had to really struggle to find a way to feel the lag on my node
18:51 < Spads> and it was just sluggish, not slow
18:51 < TheFirst> so it's the io scheduling that's failing/causing the issues?
18:51 < caker> I don't think I'm willing to push this into production without some control over IO bandwidth per node, and I *need* something better to look at than the pause-all, start-em one by one deal
18:51 < valen2> I've got way too many running processes on my node... Runs web, dns, mail, virus scan, database, half dozen other things I can't think of right now.
18:52 < caker> TheFirst: more or less, yeah
18:52 < caker> I think that the only thing that would slow down Xen (and UML really) is being disk bound
18:52 < Spads> hmmm
18:52 < TheFirst> i could only see badness if this was pushed out now
18:52 < caker> agreed
18:52 < valen2> Yea
18:52 < caker> it's pre-token-limiter-hell all over again
18:52 < Spads> hmmm
18:52 < Spads> so what I/O schedulers are used at each level?
18:52 < caker> maybe I should look at implementing something for Xen ...
18:53 < caker> Spads: that's the thing -- I'm not even sure it matters
18:53 < TheFirst> caker: if this is as bad as the pre-token limiter was then this is nothing compared to other providers i've dealt with
18:53 < valen2> As much as I hated the token limiter when putting a massive amount of backqueued email through a spamfilter.
18:53 < Spads> yeah dude, this is *nothing* compared to a thrashing UML
18:53 < warewolf> heh
18:53 < caker> from what I've gathered from the lists, they SAY it does go through the dom0's block layer (not cache), and therefore the sched .. meaning, the backend disk driver goes through whatever elevator I use
18:53 < Spads> caker: and which one are you using now? CFQ?
18:53 < caker> Spads: yes
18:53 < warewolf> caker- so it goes through the host's scheduler?
18:54 < Spads> I notice that my guest kernel has everything as noop
18:54 < warewolf> caker- you can't .. hmm.
18:54 < caker> Spads: with ioprios, etc .. don't seem to do shit
18:54 < TheFirst> seems to do a lot of shitting actually ;P
18:54 < Spads> caker: I'm not so sure
18:54 < Spads> I mean honestly
18:54 < Spads> I had slowish disk access
18:54 < warewolf> run bonnie
18:54 < Spads> but I had disk access
18:54 < caker> I'll unpause one of the thrashers and mess with ioprio .. good test
18:54 < warewolf> if you think you're having disk io problems
18:54 < Spads> oh yeah
18:55 < caker> Spads: this is true
18:55 < Spads> a bonnie run could be good
18:55 < warewolf> and bonnie is hella tunable too
18:55 < caker> Spads: previously (UML) it would be a dead stop. But, there's no knowing if that's because of Xen, LVM, or because of hardware raid .. which the rest of the hosts do not have
18:55 < Spads> it could rule out hardware I/O chain problems
18:55 < warewolf> you can make it write/read enough to blow away the kernel's default cache too
18:55 < Spads> ooh
18:55 < warewolf> caker- so run bonnie on the host, then run it on a guest.
18:56 < warewolf> caker- and compare
18:56 < caker> warewolf: yeah, I have .. it's more or less the same, with the expected overhead
18:56 < warewolf> hmm.
18:57 < caker> can someone generate some massive copy right now? Like make a 1gig file, and then copy it?
18:57 < warewolf> egad
18:57 < Spads> sure
18:57 < valen2> Hah
18:57 < caker> dd if=/dev/zero of=/bigfile bs=1M count=1024
18:57 < warewolf> lights just blinked at work
18:57 < valen2> Sure thing. Haha
18:57 < valen2> Just have to let dist-upgrade finish first. :)
18:57 < caker> well, just one, please
18:58 < Spads> [nick@golgatem(~)] dd if=/dev/zero of=/tmp/bigfile bs=1M count=1024
18:58 < Spads> it's running
18:58 < caker> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
18:58 < caker> 0 0 19296 75144 28824 11396 0 0 222 16754 467 383 0 1 99 0
18:58 < caker> 0 0 19296 75144 28824 11396 0 0 34 66209 1117 1235 0 3 97 0
18:58 < caker> 0 0 19296 75144 28824 11396 0 0 39 31836 735 865 0 0 100 0
18:58 < caker> etc...
18:58 < caker> that looks fine.
18:58 < Spads> 989+0 records in
18:58 < Spads> 988+0 records out
18:58 < Spads> 1035993088 bytes transferred in 43.454779 seconds (23840717 bytes/sec)
18:59 < caker> ok, now copy that sumbitch
18:59 < Spads> [nick@golgatem(/tmp)] gzip bigfile
18:59 < Spads> :D
18:59 < valen2> LOL
18:59 < mikegrb> lolz
18:59 -!- superbeef [lane@69-165-56-198.clvdoh.adelphia.net] has joined #linode-xenbeta
18:59 < caker> well, that'll involve CPU .. I wanted to see how good reads are while other stuff is going on
18:59 < Spads> haha ok
18:59 < Spads> [nick@golgatem(/tmp)] cp bigfile justasbigfile
19:00 < valen2> bigfile.gz: Before = 1GB, After = 0KB, Ratio = 100.0%
19:00 < Spads> valen2: close. We used to troll people with huge gzipped files in dcc
19:00 < Spads> it's like the frozen shaving foam prank, only it actually works
19:00 < valen2> *chuckles*
19:01 < valen2> I think it was a mailbomb tactic too.
19:01 < Spads> it might actually work for that
19:01 < caker> Spads: I'm not seeing anything better than 3k blocks/sec from vmstat
19:01 < caker> I think there's another thrasher
19:01 < caker> Spads: mind stopping that for a sec?
19:01 < Spads> yeah, I'm getting like 2000
19:01 * Spads kills it
19:01 < Spads> -rw-r--r-- 1 nick nick 1037041664 2006-04-05 15:58 bigfile
19:01 < Spads> -rw-r--r-- 1 nick nick 164462592 2006-04-05 16:01 justasbigfile
19:01 < Spads> fyi
19:02 < caker> wow .. lag
19:02 < Spads> netlag?
19:02 < caker> yeah .. to oftc or TP
19:02 < warewolf> check vmstat
19:02 < warewolf> er oh
19:02 < Spads> that's what we're watching
19:02 < caker> 3-4k/sec in
19:03 < Spads> I've got a vmstat 5 going
19:03 < Spads> I'm 100% idle
19:03 < caker> maybe cfq sucks
19:03 < caker> for this, anyhow
19:03 < Spads> heh
19:03 < Spads> it's possible
19:03 < warewolf> doesn't cfq stand for completely fair queue ?
19:03 < Spads> deadline maybe?
19:03 < caker> you think?
19:03 < warewolf> what are the tunable params of cfq?
19:04 < warewolf> maybe the defaults (or what you have) really aren't suited for what Linode hosts expect to do
19:05 < caker> # echo deadline > /sys/block/sda/queue/scheduler
19:05 < caker> ...
19:05 < Spads> hmmm
19:06 < warewolf> caker- and you've been using ionice as documented in linux/Documentation/block/ioprio.txt?
19:06 < caker> warewolf: yes
19:06 < caker> warewolf: I believe it's also tied to nice value .. regardless, I've been changing both :)
19:07 < caker> ok, I unpaused the two thrashers
19:07 < warewolf> oh shit
19:07 < Spads> heh
19:07 < warewolf> This document mainly details the current possibilities
19:07 < warewolf> with cfq, other io schedulers do not support io priorities so far.
19:07 < caker> right
19:07 < warewolf> looks like you're stuck with CFQ
19:07 < Spads> well
19:07 < warewolf> (sorry, I apologize if I'm stepping in really late here with things already said)
19:07 < Spads> unless other schedulers don't need them
19:07 < caker> does the box seem slow again?
19:08 < Spads> want me to try copying again?
19:08 < caker> no .. just normal usage
19:08 < caker> it sure seems like it's slower
19:08 < Spads> hmmm, yeah
19:08 < Spads> man pages took a while to load up
19:08 < taupehat> weird
19:08 < taupehat> I'm actually fine
19:09 < Spads> taupehat: you're probably running off of cache
19:09 < Spads> try something new
19:09 * taupehat is busy sinking a raid5 array into an LVM, but not on his node!
19:09 < warewolf> caker- yeah, doco says 'The mapping between cpu nice level to io nice level is determined as: io_nice = (cpu_nice +20) /5'
19:09 < caker> # echo anticipatory > /sys/block/sda/queue/scheduler
19:09 < caker> warewolf: those are docs -- I have yet to check the code :)
19:09 * warewolf nods
19:09 < Spads> caker: I'm fast again
19:09 < Spads> caker: anticipatory seemed to speed me up
19:09 < TheFirst> caker: things seem to have slowed slightly, not as bad as it was
19:10 < Spads> anticipatory was a dramatic improvement for me
19:10 < caker> # echo cfq > /sys/block/sda/queue/scheduler
19:10 < warewolf> doco says heavy disk io systems should use deadline
19:10 < warewolf> I don't know how fair that is though
19:10 < Spads> hmmm
19:11 < caker> of course, doing all of these tests, we don't know if they're still thrashing or what
19:11 < caker> not controlled
19:11 < Spads> yeah
19:11 < warewolf> good point
19:11 < Spads> I'm having a hard time realizing when I'm hitting cache for some things
19:13 < caker> # renice 20 -p `ps auxhf | grep xvd | grep -v grep | awk {'print $2'}`
19:13 * caker snickers
19:13 < warewolf> heh
19:13 < caker> I should only really be doing that for the nodes that thrash
19:13 < Spads> haha
19:13 < taupehat> hehe
19:13 < Spads> if that works
19:13 < caker> right
19:13 < taupehat> remind me to set your initdefault to six sometime, caker
19:13 < taupehat> =P
19:13 < Spads> then it could be a better replacement for an IO token system
19:13 < taupehat> http://70.86.201.113/imageserv2/temporary/PBF095ADPrankDragon.html
19:13 < Spads> haha
19:13 < Spads> yes
19:14 < caker> lemme do the pause-all thing again, one sec
19:14 < Spads> haha
19:15 < Spads> yeah
19:15 < Spads> science!
19:16 < caker> first one I found is still doing it .. pausing
19:16 < caker> oh, he stopped
19:17 < Spads> heh
19:17 < Spads> okay, I'm gunna go do laundry
19:17 < taupehat> heh
19:17 * taupehat has kicked caker in the shins!
19:17 < Spads> caker: if you want, I can give you a login on my node
19:17 < taupehat> was it me?
19:18 < warewolf> heh
19:18 < warewolf> he can just create a new linode to test on
19:18 < Spads> so you can pull its strings
19:18 < Spads> yeah he can
19:18 < caker> Spads: nah, that's ok -- I've got a guest on this box
19:18 < warewolf> already taken care of :)
19:18 < Spads> cool
19:18 < Spads> okay, I'm out
19:18 < warewolf> caker- are you monitoring /sys/block//stat?
19:19 < caker> warewolf: no, vmstat ...
19:19 < warewolf> caker- that appears to be vmstat for the block layer.
19:19 < warewolf> caker- I'd suggest monitoring that
19:19 < warewolf> read linux/Documentation/block/stat.txt
19:20 < caker> yeah, vmstat -d
19:21 < warewolf> hmm there appears to be one more field in /sys/block//stat
19:21 < caker> ok .. I'm going to walk away for a few
20:03 -!- fmdns1 [~466cfc96@webuser.linode.com] has quit [Quit: CGI:IRC (Session timeout)]
20:10 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has quit [Ping timeout: 480 seconds]
20:15 -!- superbeef [lane@69-165-56-198.clvdoh.adelphia.net] has quit [Quit: BitchX-1.1-final -- just do it.]
20:23 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has joined #linode-xenbeta
21:57 -!- FireRabbit [~ebutler@71-35-163-161.tukw.qwest.net] has quit [Ping timeout: 480 seconds]
--- Log closed Wed Apr 05 23:59:00 2006
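[Note on the nice-to-ioprio mapping warewolf quotes at 19:09: the formula io_nice = (cpu_nice + 20) / 5 can be worked through with a few lines of Python. This is only a sketch of the arithmetic as quoted from the docs; as caker points out, it has not been checked against the kernel code here. The function name is hypothetical, and integer division is assumed.]

```python
def io_nice_from_cpu_nice(cpu_nice):
    """Apply the documented mapping io_nice = (cpu_nice + 20) / 5,
    using integer division (an assumption), for nice values -20..19."""
    if not -20 <= cpu_nice <= 19:
        raise ValueError("cpu nice must be between -20 and 19")
    return (cpu_nice + 20) // 5


if __name__ == "__main__":
    # Highest priority, default, and lowest priority nice values.
    for nice in (-20, 0, 19):
        print(f"cpu nice {nice:>3} -> io nice {io_nice_from_cpu_nice(nice)}")
```

[By this arithmetic the default nice of 0 maps to io nice 4, and the lowest CPU priority of 19 maps to io nice 7. Since Linux clamps nice values above 19, caker's "renice 20" at 19:13 would land on that lowest level, assuming the mapping applies as documented.]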