Wed Apr 10 15:49:22 CEST 2013
GCC 4.8 - building everything?
The last few days I've spent a few hundred CPU-hours building things with gcc 4.8. So far, alphabetically up to app-office/, it's been really boring.
The number of failing packages is definitely lower than with 4.6 or 4.7. And most of the current troubles are unrelated to the compiler - for example the whole info page generation madness.
At the current rate of filing and fixing bugs we should be able to unleash this new version on the masses really soon - maybe in about a month? (Or am I just too optimistic?)
Thu Mar 7 03:45:34 CET 2013
Having fun with integer factorization
Given the input
# yafu "factor(10941738641570527421809707322040357612003732945449205990913842131476349984288934784717997257891267332497625752899781833797076537244027146743531593354333897)" -threads 4 -v -noecm
if one is patient enough gives this output:
sqrtTime: 1163
NFS elapsed time = 3765830.4643 seconds.
pretesting / nfs ratio was 0.00
Total factoring time = 3765830.6384 seconds
***factors found***
PRP78 = 106603488380168454820927220360012878679207958575989291522270608237193062808643
PRP78 = 102639592829741105772054196573991675900716567808038066803341933521790711307779
What does that mean?
The input number is conveniently chosen from the RSA challenge numbers and was the "world record" until 2003. Advances in algorithms, compilers and hardware have made it possible for me to re-do that record attempt in about a month walltime on a single machine ( 4-core AMD64).
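To put a number on "about a month", the elapsed time yafu printed can be converted by hand. A minimal sketch in plain shell arithmetic; the helper name to_days_hours is made up for illustration and the fractional seconds are dropped:

```sh
# Convert a duration in whole seconds to "<days>d <hours>h".
# to_days_hours is a hypothetical helper, not part of yafu.
to_days_hours() {
    secs=$1
    days=$((secs / 86400))
    hours=$(((secs % 86400) / 3600))
    echo "${days}d ${hours}h"
}

to_days_hours 3765830
```

That works out to a bit over 43 days of wall-clock time.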
Want to try yourself?
emerge yafu
That's the "easiest" tool to manage. The dependencies are a bit fiddly, but it works well for up to ~512bit, maybe a bit more. It depends on msieve, which is quite impressive, and gmp-ecm, which I find even more intriguing.
If you feel like more of a challenge:
emerge cado-nfs
This tool even supports multi-machine setups out of the box using ssh, but it's slightly intimidating and might not be obvious to figure out. Also for a "small" input in the 120 decimal digits range it was about 25% slower than yafu - but it's still impressive what these tools can do.
Wed Nov 14 04:14:44 CET 2012
An informal comparison
A few people asked me to write this down so that they can reference it - so here it is.
A completely unscientific comparison between Linux flavours and how they behave:
CentOS 5 (because upgrading is impossible):
             total       used       free     shared    buffers     cached
Mem:          3942       3916         25          0        346       2039
-/+ buffers/cache:       1530       2411
And on the same hardware, doing the same jobs, a Gentoo:
             total       used       free     shared    buffers     cached
Mem:          3947       3781        166          0        219       2980
-/+ buffers/cache:        582       3365
So we use roughly 1/3rd the memory to get the same things done (fileserver),
and an informal performance analysis gives us roughly double the IO throughput.
On the same hardware!
(The IO difference could be attributed to the ext3 -> ext4 upgrade and the kernel 2.6.18 -> 3.2.1 upgrade)
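For anyone reading the free output for the first time: the "-/+ buffers/cache" row is just used minus buffers minus cached, and that is the figure the memory comparison rests on. A small sketch that re-derives it from the MB values shown above; the helper name app_used is made up here:

```sh
# Memory actually held by applications = used - buffers - cached (all in MB).
# app_used is a hypothetical helper for illustration.
app_used() {
    echo $(($1 - $2 - $3))
}

app_used 3916 346 2039   # CentOS 5 numbers
app_used 3781 219 2980   # Gentoo numbers
```

The first result comes out one MB higher than what free printed (1531 vs. 1530), presumably because free rounds each column on its own. Either way the Gentoo box sits at a bit more than a third of the CentOS figure.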
Another random data point: A really clumsy mediawiki (php+mysql) setup.
Since php is singlethreaded the performance is pretty much CPU-bound; and as we have a small enough dataset it all fits into RAM.
So we have two processes (mysql+php) that are serially doing things.
Original CentOS install: ~900 qps peak in mysql, ~60 seconds walltime to render a pathological page
Default-y Gentoo: ~1200 qps peak, ~45-50 seconds walltime to render the same page
Gentoo with -march=native in CFLAGS: ~1800qps peak, ~30 seconds render time (this one was unexpected for me!)
And a "move data around" comparison: 63GB in 3.5h vs. 240GB in 4.5h - or roughly 3x the throughput (about 18 GB/h vs. about 53 GB/h)
So, to summarize: For the same workload on the same hardware we're seeing substantial improvements between a few percent and roughly three times the throughput, for IO-bound as well as for CPU-bound tasks. The memory use goes down for most workloads while still getting the exact same results, only a lot faster.
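The "move data around" figures can be divided out to get hourly rates. A minimal sketch using awk for the floating point part; the helper name rate is an assumption for illustration:

```sh
# Throughput in GB per hour, one decimal place.
# rate is a hypothetical helper: rate <gigabytes> <hours>
rate() {
    awk -v gb="$1" -v h="$2" 'BEGIN { printf "%.1f\n", gb / h }'
}

rate 63 3.5     # first figure
rate 240 4.5    # second figure
```

That is about 18 GB/h against about 53 GB/h, so close to three times the throughput while moving almost four times the data.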
Oh yeah, and you can upgrade without a reinstall.
Sat Oct 13 15:58:12 CEST 2012
Reanimating #gentoo-commits
Today I got annoyed with the silence in #gentoo-commits and spent a few hours fixing that. We have a bot reporting ... well, I hope all commits, but I haven't tested it enough.
So let me explain how it works so you can be very amused ...
First stage: Get notifications
Difficulty: I can't install postcommit hooks on cvs.gentoo.org
Workaround: gentoo-commits@lists.gentoo.org emails
Code (procmailrc):
:0:
* ^TO_gentoo-commits@lists.gentoo.org
{
    # keep a copy in a maildir subfolder
    :0 c
    .maildir/.INBOX.gentoo-commits/
    # and feed the mail to the wrapper script
    :0
    | bash ~/irker-wrapper.sh
}
So this runs all mails that come from the ML through a script, and puts a copy into a subfolder.
Second stage: Extracting the data
Difficulty: Email is not a structured format
Workaround: bashing things with bash until happy
Code (irker-wrapper.sh):
#!/bin/bash
# irker wrapper helper thingy
# Reads one commit mail on stdin, picks out the X-VCS-* headers and the line
# after "Log:", and passes an irker JSON request to the remote host via ssh.
while IFS= read -r line; do
    # echo "$line" # debug
    case $line in
        "X-VCS-Repository: "*) REPO=${line#X-VCS-Repository: } ;;
        "X-VCS-Committer:"*) AUTHOR=${line#X-VCS-Committer:} ;;
        "X-VCS-Directories:"*) DIRECTORIES=${line#X-VCS-Directories:} ;;
        "Subject:"*) SUBJECT=${line#Subject:} ;;
    esac
    EVERYTHING+=$line$'\n'
done
# The line following "Log:" is used as the commit message
COMMIT_MSG=$(printf '%s' "$EVERYTHING" | grep -A1 "Log:" | grep -v "Log:")
# Caveat: quotes or backslashes in the message are not escaped for JSON
ssh commitbot@lolcode.gentooexperimental.org "{\"to\": [\"irc://chat.freenode.net/#gentoo-commits\"], \"privmsg\": \"$REPO: ${AUTHOR} ${DIRECTORIES}: $COMMIT_MSG \"}"
Why the ssh stuff? Well, the server where the mails arrive is a bit restricted, hard to run a daemon there 'n stuff, so let's just pipe it somewhere more liberal
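One weak spot of gluing JSON together by hand is that a commit message containing a double quote or a backslash produces an invalid request. A possible way around it, as a sketch only: json_escape and irker_payload are hypothetical helpers, not part of irker, and control characters or embedded newlines are still not handled.

```sh
# Escape backslashes and double quotes so a string can sit inside a JSON string.
# Hypothetical helper; control characters and newlines are not handled.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Build a request in the {"to": [...], "privmsg": "..."} shape used above.
irker_payload() {
    printf '{"to": ["%s"], "privmsg": "%s"}\n' "$1" "$(json_escape "$2")"
}

irker_payload 'irc://chat.freenode.net/#gentoo-commits' 'repo: fix "foo" again'
```

The output of irker_payload could then be passed to ssh in place of the hand-quoted string.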
Third stage: Sending the notifications
Difficulty: How to communicate with irkerd?
Workaround: nc, a hammer, a few thumbs
Code:
#!/bin/bash
echo $@ | nc --send-only 127.0.0.1 6659
And that's how the magic works.
Bonus trick: using command="" in ~/.ssh/authorized_keys
... and now I really need a beer :)
Tue Jun 26 07:44:19 CEST 2012
The Janitor's Manifesto
(as there are Council Elections again here's my Manifesto. Or something almost, but not completely unlike it)
Most of what I do is cleaning up. Fighting entropy. Removing bugs so we can have the best distribution there is. And because there just never are enough hours in a day I like to find replacements for me. Recruit new people to do what I did so I can find the next problem.
I won't stop doing that.
Most of the things we do are outside the sphere of influence of the council. We get at least 90% of our Gentoo-work done without needing anyone to nudge us in the right direction. But it's the rest of our troubles that needs lots of discussion and motivation to find the most effective solution to our issues.
I'm an opinionated muppet. But my opinions are usually based on experience. And when I see a bad idea I'm not going to tolerate it because it might upset your feelings. Deal with it ;) But I guess people are aware that I'm, uhm, "polarizing". And getting me into the Council will make it a lot easier for me to punch stupid ideas until they go away. Like, say, the zombie GLEPs 54/55 that have been summoned back for about 2 years until people slowly got the idea that they won't be tolerated. (Amusingly we now finally have the "EAPI at beginning of file" rule that obsoletes it. So it goes.)
Independent of what you vote I'll continue doing what I do - try to recruit people, do random build experiments across the tree for fun, file bugs ... it won't get boring soon. So much to do ... but it'll be a tiny bit easier for me if I'm councilerated.
Here's a rough list of things that would be nice to have, in approximately ascending order of difficulty:
- deprecate EAPIs so that only 0 and latest are in use
- Signing policies - keyring, key type, etc. Make it make sense.
- PR: make people notice us again. GWN revival, articles for LWN, etc.
- Tools and automation: Make it easy to be lazy. Get more done in less time
- Recruitment: assimilation of derived distros (pentoo, TinHat, Sabayon, funtoo, calculate) and others (Slackware, OpenSUSE, FreeBSD ?)
- Recruitment: make it easy for people to contribute, and as soon as they do don't let them leave
- Stop the Windowsification (offline updates?!) and MacOSification (systemd, /usr move, initrd as everything, trollolololooooo)
- Be Awesome
Mon May 21 07:33:56 CEST 2012
LSI Controllers Part II, or how to be sad
Last (psychotic) episode we looked at how to download and install LSI controller management tools.
Today's special will look at using them, or trying to, or finding reasons to be totally drunk.
# file ./MegaCli64
./MegaCli64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.4.0, stripped
# ./MegaCli64
Fatal error - Command Tool invoked with wrong parameters
Exit Code: 0x01
Ho hum. Err. 0x01 you say? -ESTUPIDUSER I guess.
# ./MegaCli64 --help
Invalid input at or near token -
Exit Code: 0x01
Invalid? --help? Erf. Uhm. Now I could really use some, err, help?
Will -h work? Yes it does ... but ...
# ./MegaCli64 -h | wc -l
277
But at least: Exit Code: 0x00 - yey?
The bad news is the syntax of this ancient demon summoning device. To quote:
MegaCli -AdpPR -Dsbl|EnblAuto|EnblMan|Start|Suspend|Resume|Stop|Info|SSDPatrolReadEnbl |SSDPatrolReadDsbl
|{SetDelay Val}|{-SetStartTime yyyymmdd hh}|{maxConcurrentPD Val} -aN|-a0,1,2|-aALL
First there's no explanation at all what AdpPR means. Or that it's actually case-insensitive.
Second one notes Dsbl instead of disable, because characters are precious (and why not use Off then?)
And that's the best part of it.
There's also a few excellent features in these "RAID" "Controllers" that might make you a bit grumpy.
For example it will only boot off the first disk, so in JBOD mode (with software raid on top of it maybe?) if the first disk fails you will have to manually change the config at boot time to, err, boot. But on the upside it will stop if *any* disk has failed, or any array is degraded, so you'll do that often enough.
You might ask, why software raid? Well - the controller firmware does not support growing volumes that are on a shared disk, so if you have [sda1 sda2] etc. and make a raid1 out of sd{a,b,c,d}1 and a raid5 out of sd{a,b,c,d}2 then you cannot grow these partitions in the future, which means you have to move all data away, destroy array, recreate array, copy back.
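The layout described above could be set up with mdadm roughly as follows. This is a sketch that only prints the commands instead of running them; the md device names and the disk letters passed in are assumptions for illustration.

```sh
# Print (not run) the mdadm commands for the layout described above:
# partition 1 of every disk -> RAID1, partition 2 of every disk -> RAID5.
# plan_arrays, /dev/md0 and /dev/md1 are hypothetical names.
plan_arrays() {
    p1="" p2="" n=0
    for d in "$@"; do
        p1="$p1 /dev/sd${d}1"
        p2="$p2 /dev/sd${d}2"
        n=$((n + 1))
    done
    echo "mdadm --create /dev/md0 --level=1 --raid-devices=$n$p1"
    echo "mdadm --create /dev/md1 --level=5 --raid-devices=$n$p2"
}

plan_arrays a b c d
```

With md, reshaping such arrays later is meant to be handled by mdadm's --grow mode, which is the part the controller firmware refuses to do for this layout.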
And that's supposed to be the Market Leader ?!
Mon May 7 07:15:20 CEST 2012
How not to treat your customers
Here's a funny game - let's try to use an LSI controller from the command line in Linux.
First difficulty: Since a short while ago all downloads are behind a registration wall. Ok, no biggie, let's register then (grumble, grumble, moan.)
The registration demands a password containing at least seven characters, including at least one non-alphanumeric. Yey!
Then there's the Captcha, which looks quite amusing, but ... erf. After three or four guesses you might even get it right.
Fast forward to the driver download page. The specific model (according to lspci) doesn't even exist. Oh, haha, u so funny! But no worry, all the MegaRAID models share the same driver.
Now, here it gets a bit esoteric:
$ file 8.02.21_MegaCLI.zip
8.02.21_MegaCLI.zip: Zip archive data, at least v2.0 to extract
$ unzip 8.02.21_MegaCLI.zip
Archive: 8.02.21_MegaCLI.zip
inflating: 8.02.21_MegaCLI.txt
inflating: 8.02.21_Windows_MegaCLI/MegaCli.exe
inflating: 8.02.21_Windows_MegaCLI/MegaCli64.exe
inflating: 8.02.21_Windows_MegaCLI/Readme.txt
inflating: 8.02.21_VMware_MegaCLI/MegaCli
extracting: 8.02.21_VMware_MegaCLI/MegaCli.zip
inflating: 8.02.21_Solaris_MegaCLI/MegaCli
inflating: 8.02.21_Solaris_MegaCLI/MegaCli.pkg
inflating: 8.02.21_Solaris_MegaCLI/readme.txt
extracting: 8.02.21_Linux_MegaCLI/MegaCliLin.zip
inflating: 8.02.21_Linux_MegaCLI/readme.txt
inflating: 8.02.21_FreeBSD_MegaCLI/MegaCli
inflating: 8.02.21_FreeBSD_MegaCLI/MegaCli64
inflating: 8.02.21_DOS_MegaCLI/LICENSE_DOS32A.txt
inflating: 8.02.21_DOS_MegaCLI/MegaCli.exe
... wow, we got a Solaris driver for free, and a linux ... zipfile? eh whut?
$ unzip 8.02.21_Linux_MegaCLI/MegaCliLin.zip
Archive: 8.02.21_Linux_MegaCLI/MegaCliLin.zip
inflating: MegaCli-8.02.21-1.noarch.rpm
inflating: Lib_Utils-1.00-09.noarch.rpm
Oh, right, not really a zipfile but an rpm. My bad! haha! oh ... uhm ... :(
$ rpm2tar MegaCli-8.02.21-1.noarch.rpm
$ tar --list -f MegaCli-8.02.21-1.noarch.tar
./
./opt/
./opt/MegaRAID/
./opt/MegaRAID/MegaCli/
./opt/MegaRAID/MegaCli/MegaCli
./opt/MegaRAID/MegaCli/MegaCli64
$ file ./opt/MegaRAID/MegaCli/MegaCli64
./opt/MegaRAID/MegaCli/MegaCli64: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.4.0, stripped
zipception! it's a single binary, in an rpm, in a zip, in a zip, on a protected website! And look, "for GNU/Linux 2.4.0" - it's built to run everywhere. Even on Really Ancient Computers.
So, uhm, taddah:
$ ./opt/MegaRAID/MegaCli/MegaCli64
Fatal error - Command Tool invoked with wrong parameters
Exit Code: 0x01
And now you know why so many sysadmins are alcoholics.
(Bonus feature: We're not allowed to mirror any of these highly proprietary files, so every single person trying to interact with an LSI "Raid" controller on Linux gets to experience that)
Wed Feb 22 16:29:48 CET 2012
o_O
Total: 2409 packages (33 upgrades, 2269 new, 7 in new slots, 100 reinstalls, 1 uninstall), Size of downloads: 6,073,164 kB
Fetch Restriction: 8 packages (8 unsatisfied)
Conflict: 4 blocks
Would you like to merge these packages? [Yes/No]
Sat Oct 22 18:11:25 CEST 2011
Booting Gentoo - from init to console
I've spent some time with OpenRC and sysvinit trying to understand a few things (for example how to integrate
CGroups support), and along the way I've learned a few things about the boot process that are not that well
documented.
So why not document it for posterity ...
In the beginning, through some magic, the kernel is booted. How that happens is another story and not our concern at the moment. At some point the kernel has initialized, figured out where the rootfs is (for example through the root= kernel parameter), mounted it ... and now what?
At this point we need to start userspace process #1, traditionally known as "init". The kernel has some hardcoded defaults that by default will try /sbin/init, but it's easy enough to override that with the init= kernel parameter if you want to have something else run (like /bin/bash to get just a rescue shell)
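The rule from the paragraph above can be written down as a tiny parser. This is a sketch only: init_from_cmdline is a made-up helper, root=/dev/sda2 is a placeholder, and letting the last init= win when it appears more than once is an assumption.

```sh
# Report which program a given kernel command line would run as PID 1.
# Mirrors the text: /sbin/init unless init= overrides it. Letting the last
# init= win is an assumption about how repeated parameters are treated.
init_from_cmdline() (
    set -f                      # no globbing while word-splitting the string
    result=/sbin/init
    for arg in $1; do
        case $arg in
            init=*) result=${arg#init=} ;;
        esac
    done
    echo "$result"
)

init_from_cmdline "root=/dev/sda2 ro"
init_from_cmdline "root=/dev/sda2 init=/bin/bash"
```

On a running system the same function can be fed the real thing with init_from_cmdline "$(cat /proc/cmdline)".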
init comes from the sysvinit package and is small enough to be read in an afternoon. There are some surprisingly elegant bits in it, but it's still just doing one job well - starting the rest of the userland processes. It takes its info from /etc/inittab, which just lists different runlevels and what to start. Now in the case of Gentoo this is a bit unusual as it mostly just calls "rc", which is part of the OpenRC package, with a parameter like "rc single". This is the name of the runlevel then - "default" by default. We have sane defaults!
Init can also be told to change runlevels; this is usually done through the "init" or "telinit" commands.
Now OpenRC needs to figure out what to start. Before the "default" runlevel is started we need to start the "sysinit" and "boot" runlevels (for details have a look at /etc/runlevels ). This starts a few things like udev and mounts local filesystems, then starts all the daemons you requested. The bookkeeping for that (what has started, is starting, has failed etc.) can be found in /lib/rc/init.d/ - just another simple directory with self-explaining filenames. And running "rc" without arguments will just try to get us back to the current runlevel defaults - start what is stopped, and stop everything not defined in /etc/runlevels. Running "rc" in cron is a nice way to keep things like sshd running even through accidents like "killall sshd" :)
How OpenRC figures out the dependencies is quite "magic", but if you trace it you find runscript (an executable) running runscript.sh with a parameter like "depend", which sources the init scripts and just outputs the value of the DEPEND line. (Read /lib/rc/sh/runscript.sh to get an idea, or if you get bored read the source of runscript). And that information is cached in /lib/rc/init.d/deptree to avoid having to re-source the init scripts as this is a "slow" process (maybe 50msec per init script, but if you have 100 scripts that's still 5 seconds you lose just parsing the init scripts instead of starting stuff)
So OpenRC starts all the things from /etc/runlevels and is now done. It returns control to init, which now notices that it has a few lines like this in its config (/etc/inittab):
# TERMINALS
c1:12345:respawn:/sbin/agetty -c 38400 tty1 linux
So what it does now is very simple - it runs agetty, which configures the (pseudo-)terminals (tty1 here) and starts a login program (in this case /bin/login, the default). This asks us for username and password (another interesting story for a different time), and when this is done runs the login shell specified for that user.
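An inittab entry has four colon-separated fields: id, runlevels, action and the process to run. A small sketch that pulls the terminal line apart; parse_inittab_line is a hypothetical helper written for this illustration:

```sh
# Split an inittab entry into id, runlevels, action and process.
# Only the first three colons separate fields; the process may contain more.
# parse_inittab_line is a hypothetical helper.
parse_inittab_line() {
    line=$1
    id=${line%%:*};        rest=${line#*:}
    runlevels=${rest%%:*}; rest=${rest#*:}
    action=${rest%%:*};    process=${rest#*:}
    printf 'id=%s\nrunlevels=%s\naction=%s\nprocess=%s\n' \
        "$id" "$runlevels" "$action" "$process"
}

parse_inittab_line 'c1:12345:respawn:/sbin/agetty -c 38400 tty1 linux'
```

The respawn action is what makes init start a fresh agetty whenever the old one exits, which is why a new login prompt shows up after logging out.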
And here we are, booted up and ready to serve our human overlords ;)