Mon Jul 14 08:39:32 CEST 2014

Biggest ebuilds in-tree

Random datapoint: There are only about 10 packages with ebuilds over 600 lines.

Sorted by lines, duplicate entries per-package removed, these are the biggest ones:
828 dev-lang/ghc/ghc-7.6.3-r1.ebuild
817 dev-lang/php/php-5.3.28-r3.ebuild
750 net-nds/openldap/openldap-2.4.38-r2.ebuild
664 www-client/chromium/chromium-36.0.1985.67.ebuild
658 games-rpg/nwn-data/nwn-data-1.29-r5.ebuild
654 www-servers/nginx/nginx-1.4.7.ebuild
654 media-video/mplayer/mplayer-1.1.1-r1.ebuild
644 dev-vcs/git/git-9999-r3.ebuild
621 x11-drivers/ati-drivers/ati-drivers-13.4.ebuild
617 sys-freebsd/freebsd-lib/freebsd-lib-9.1-r11.ebuild
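
A list like this can be produced with standard tools. This is a minimal sketch, assuming the usual category/package/package-version.ebuild layout; the function name biggest_ebuilds is made up for illustration and is not necessarily how the numbers above were generated:

```shell
# biggest_ebuilds TREE [MIN_LINES]
# Print "LINES category/package/file.ebuild" for the largest ebuild of each
# package, biggest first, keeping only ebuilds with more than MIN_LINES lines.
biggest_ebuilds() {
    local tree=$1 min=${2:-600}
    ( cd "$tree" || exit 1
      find . -mindepth 3 -maxdepth 3 -type f -name '*.ebuild' |
      while IFS= read -r f; do
          # $(( ... )) strips any padding that wc may add
          printf '%s %s\n' "$(( $(wc -l < "$f") ))" "${f#./}"
      done |
      sort -k1,1nr |
      # fields: lines, category, package, file; keep the first (largest) hit per package
      awk -v min="$min" -F'[ /]' '$1 > min && !seen[$2 "/" $3]++'
    )
}
```

Usage would be something like "biggest_ebuilds /usr/portage 600", with the path adjusted to wherever the tree lives.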

Posted by Patrick | Permalink

Fri Jun 27 10:01:01 CEST 2014

Build times

Just for fun, over about 8500 packages built, the slowest three:
     Fri Jun 13 19:40:13 2014 >>> dev-python/pypy-2.2.1
       merge time: 2 hours, 7 minutes and 23 seconds.

     Fri Jun 20 09:58:38 2014 >>> app-office/libreoffice-4.2.4.2
       merge time: 1 hour, 37 minutes and 22 seconds.

     Fri Jun 27 12:52:19 2014 >>> sci-libs/openfoam-2.3.0
       merge time: 1 hour, 5 minutes and 8 seconds.
(Quad-core AMD64, 3.4GHz, 8GB RAM)

These are also the only packages above 1h build time.
Average seems to be near 5 minutes (hard to filter out all the binpkg merges, which are silly-fast)

Edit: New highscore!
     Sun Jun 29 20:36:09 2014 >>> sci-mathematics/nusmv-2.5.4
       merge time: 2 hours, 58 minutes.
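
To rank entries like the ones above it helps to turn the "merge time" lines into plain seconds. A small sketch, assuming input in exactly the two-line format shown here (it resembles genlop output, but the tool is not named above); merge_seconds is a made-up helper:

```shell
# merge_seconds: read "date >>> package" / "merge time: ..." line pairs on
# stdin and print "SECONDS package" per entry, slowest first.
merge_seconds() {
    awk '
        />>>/ { pkg = $NF }                  # remember the package name
        /merge time:/ {
            secs = 0
            for (i = 1; i < NF; i++) {       # a number is followed by its unit
                if ($(i+1) ~ /^hour/)   secs += $i * 3600
                if ($(i+1) ~ /^minute/) secs += $i * 60
                if ($(i+1) ~ /^second/) secs += $i
            }
            print secs, pkg
        }' | sort -k1,1nr
}
```

Feed it a saved log, e.g. "merge_seconds < buildtimes.txt" (file name hypothetical).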

Posted by Patrick | Permalink

Wed Jun 25 08:27:30 CEST 2014

Building Everything

Preparation:
  • Take recent stage3 and unpack to a temporary location
  • Set up things: make.conf, resolv.conf, keywords, ...
  • Update @system, check gcc version etc.
  • Clone this snapshot to 4 locations (4 because of CPU cores)
  • bindmount /usr/portage and friends
Run:
Start a screen session for each clone. Chroot in. Apply magic oneliner:
for i in $( qsearch -NC --all | sort -R ); do
    if emerge --nodeps -pk "$i" > /dev/null; then
        emerge --depclean; echo "$i"; emerge -uNDk1 "$i"
    fi
done
Wait 4-5 days, get >10k binary packages, lots of logfiles.

Space usage:
~2.5G logfiles
~35G distfiles
~20G binary packages
~100G temp space (/var/tmp has lots of cruft unless FEATURES="fail-clean")


Triage of these logfiles yields about 1% build failures, on average.
It's not hard to do, just tedious!
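
A first pass over the logs can be automated. This sketch assumes that a failed build leaves a line of the form "ERROR: <package> failed (<phase> phase)" in its log and that logs are uncompressed *.log files; failure_rate is a made-up name, and the real triage (why did it fail?) still has to be done by hand:

```shell
# failure_rate LOGDIR
# Count the build logs below LOGDIR and how many of them contain the
# assumed failure marker "ERROR: <package> failed (<phase> phase)".
failure_rate() {
    local dir=$1 total failed
    total=$(find "$dir" -type f -name '*.log' | wc -l)
    failed=$(find "$dir" -type f -name '*.log' \
                 -exec grep -l 'ERROR: .* failed (.* phase)' {} + | wc -l)
    echo "$(( failed )) of $(( total )) logs failed"
}
```

With the make.conf additions below this would be run as "failure_rate /var/log/portage".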

make.conf additions:
FEATURES="buildpkg split-log -news"
PORT_LOGDIR="/var/log/portage/"
MAKEOPTS="-j4"
EMERGE_DEFAULT_OPTS="--jobs 4"

CLEAN_DELAY="0"
EMERGE_WARNING_DELAY="0"
ACCEPT_PROPERTIES="* -interactive"

Posted by Patrick | Permalink

Tue Jun 17 09:13:34 CEST 2014

EAPI statistics, again

Start: Thu Jan 16 08:18:45 UTC 2014
End:   Mon Jun 16 00:00:01 UTC 2014

EAPI 0:   5966 ebuilds (15.78 percent) ->  5477 ebuilds (14.40 percent)
EAPI 1:    370 ebuilds (0.98 percent)  ->   215 ebuilds ( 0.57 percent)
EAPI 2:   3335 ebuilds (8.82 percent)  ->  2938 ebuilds ( 7.72 percent)
EAPI 3:   3005 ebuilds (7.95 percent)  ->  2585 ebuilds ( 6.79 percent)
EAPI 4:  12385 ebuilds (32.76 percent) -> 10375 ebuilds (27.27 percent)
EAPI 5:  12742 ebuilds (33.71 percent) -> 16455 ebuilds (43.25 percent)
Total    37803 -> 38045

EAPI 0 change:  -8.2%
EAPI 1 change: -41.9%
EAPI 2 change: -11.9%
EAPI 3 change: -14.0%
EAPI 4 change: -16.2%
EAPI 5 change: +29.1%
So over the last 5 months the total number of ebuilds grew by about 0.6%. The only growing class is EAPI 5, which is quite excellent.

EAPI 0 is the slowest to shrink; as long as there's no coordinated effort to get rid of it, it'll be there forever. EAPI 1 is now very close to extinction.

EAPI 2, 3 and 4 are slowly shrinking away, but at this rate it'll still take years.
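
The change figures follow from the counts in the table as (new - old) / old. A small sketch for recomputing them, with the counts copied from the table above; pct_change is a made-up helper name:

```shell
# pct_change OLD NEW: relative change from OLD to NEW in percent, one decimal.
pct_change() {
    awk -v old="$1" -v new="$2" \
        'BEGIN { printf "%+.1f%%\n", (new - old) / old * 100 }'
}

# Columns: EAPI, count in January, count in June (from the table above).
while read -r eapi old new; do
    echo "EAPI $eapi change: $(pct_change "$old" "$new")"
done <<'EOF'
0 5966 5477
1 370 215
2 3335 2938
3 3005 2585
4 12385 10375
5 12742 16455
EOF
```

The same helper works for the totals: pct_change 37803 38045.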

Posted by Patrick | Permalink

Fri Jun 13 10:33:22 CEST 2014

A one-line Tinderbox

Needs portage-utils, best to run in a chroot:
for i in $( qsearch --all -CN | sort -R ); do emerge -1 $i; emerge --depclean; done

Posted by Patrick | Permalink

Sat May 3 03:40:52 CEST 2014

KDE's Baloo Indexer: Constructive Criticism

KDE 4.13 was released with a new indexer, named "Baloo". It mostly replaces the 'old' Akonadi indexer, which at first glance appears to be a good idea. It seems to work, so that's quite swell. There's only one problem. Or rather, some little problems, and upstream is one of them as they don't want to acknowledge that these issues exist. So let me try to explain ...
  • There are times when I just need the indexer to not run. For example when I'm watching a movie (IO activity -> stutter), doing a presentation (random lag?!) etc. And there are times (e.g. at night) when the indexer can run as much as it wants.
  • There are times when the indexer interferes with normal operation - e.g. when using Firefox, the added IO activity causes the Firefox UI to lag severely, as if the machine were swapping. This is partly because the IO activity evicts the filesystem cache, which is quite funny. And fsync plus lots of reads means the latency goes up to multiple seconds or even multiple tens of seconds for a single IO operation.
  • The indexer claims to not interfere with normal operation. It limits itself to 10% CPU usage - which is the wrong metric, since I have lots of CPU and very little IO, relatively speaking. Thus it takes 100% of available IO bandwidth. Akonadi used up to 4 CPUs for longer amounts of time, but as it didn't hurt IO much I could just ignore it.
  • The indexer takes a LONG time. On boot it needs about 20 minutes walltime just to figure out if anything has changed. During that time service quality is severely degraded.
  • The indexer takes a long time. The initial scan of my home directory takes about, hmm, 36-48h I think, during which time service quality is severely degraded
  • The indexer isn't polite, it auto-respawns if you just kill the baloo_file_indexer process. You have to kill its parent too, otherwise it'll just respawn and bother you some more
  • [Fixed in next release] Removing a directory from the index causes an index cleaner to run, which is even more severe than the indexer itself
So, to summarize: As much as I like the indexer, it prevents me from working normally, so I have to insist that it has a simple "off" button. A lesson that akonadi learned, that gnome's tracker learned, is that you need to nice yourself down. It would be very much appreciated if baloo were to nice and ionice itself down to idle, which usually avoids the severe lag that foreground tasks may experience.
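
Until the indexer does this itself, the same effect can be had from the outside. A minimal sketch, assuming util-linux's renice and ionice are installed; make_idle is a made-up name, and the process names to match are an assumption (check with ps first):

```shell
# make_idle PID...
# Drop the given processes to the lowest CPU priority and, where ionice is
# available, to the idle I/O scheduling class.
make_idle() {
    local pid
    for pid in "$@"; do
        renice -n 19 -p "$pid" > /dev/null
        if command -v ionice > /dev/null; then
            ionice -c 3 -p "$pid"
        fi
    done
}

# Example (process name pattern is a guess, not taken from baloo's docs):
#   make_idle $(pgrep baloo)
```

Note that this only helps until the process gets respawned; the new one starts with default priorities again. The idle IO class is also only honoured by IO schedulers that implement priorities.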

An extra bonus would be this: The indexer should do a microbenchmark on startup (or let the user provide a guesstimate) to figure out IO capacity in IO/s, and then limit itself to a configurable amount of that. If it takes 1/10th of my IO bandwidth (about 10-15 IO/s with a single SATA disk) it wouldn't even bother me more than, say, Firefox running in the background.

Another interesting glitch is that most indexers use inotify listeners to see if anything in a directory changes. This has the funny effect that it only works on small data sets - on my desktop I get random popups that an application wants to change system limits. Well, /proc/sys/fs/inotify/max_user_watches is already set to "262144" by default, and that's still not enough? This also takes memory, and it can't scale up. I "only" have a few million files, that's not even a lot.
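
An inotify watch covers one directory and is not recursive, so a rough estimate of what an indexer needs is the number of directories. A sketch for comparing that against the limit, assuming the indexer really watches every directory; both function names are made up:

```shell
# watches_needed DIR: number of directories at and below DIR
# (one watch each, staying on the same filesystem).
watches_needed() {
    find "$1" -xdev -type d | wc -l
}

# watch_headroom DIR [LIMIT]: watches left over after covering DIR; a
# negative number means the limit is too small. LIMIT defaults to the
# current value of fs.inotify.max_user_watches.
watch_headroom() {
    local limit=${2:-$(cat /proc/sys/fs/inotify/max_user_watches)}
    echo $(( limit - $(watches_needed "$1") ))
}
```

For example "watch_headroom ~" shows whether the home directory fits into the current limit at all.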

So, to summarize:
Simple fixes:
  • Nice and ionice the indexer on startup
  • Provide users with a simple on/off mechanism
Advanced fixes:
  • Throttle on IO instead of CPU
  • Delay indexer startup for a little while on boot. Maybe 120sec grace period
  • Figure out system limits and fail gracefully instead of annoying users with popups
Well, dear upstream, don't accuse me of not being constructive ...
<DrEeevil> people complain about user-hostile behaviour, and you tell them to ... be nicer and not complain so loud?
<unormal> DrEeevil: To be honest. The only things I see from you in here since hours and days is hostile behaviour. I really would like to ask you to stop this and be constructive or otherwise leave
<DrEeevil> unormal: well, if I didn't have to remove binaries and kill processes I'd be a lot happier
<DrEeevil> since upstream hasn't shown any understanding I'll rather escalate until the bugs are resolved
<DrEeevil> constructive: give me an off button so I can stop the indexer when it hurts me, give me a rate limit so I can run it while using the computer
<DrEeevil> (using 99% of available IO bandwidth for up to 72h is just not acceptable in normal use)
<DrEeevil> I don't want to remove the indexer, but I want to control how much resource usage it has
<DrEeevil> (bonus: ionice + nice it down to lowest/idle, then it doesn't bother that much)
<DrEeevil> it's not THAT hard to figure that out ...
<unormal> DrEeevil: Ok and now explicitely: Please do us a favor and leave the channel.
<DrEeevil> unormal: once I can have baloo installed and working as described above you'll never hear from me again
<DrEeevil> just every time I get a local DoS you get a complaint, so that you don't lose the motivation to fix the bugs
<unormal> That's not how you motivate people, please leave.
<DrEeevil> that's not how you write software, please fix
<DrEeevil> I'll be the stone in your shoe until you stop being the one in mine
<DrEeevil> heck, I'll even test patches once they are provided!
<krop> and ultimately, you'll roll on the floor crying until something happens ? how can gentoo accept immature people in their staff ?
--> seaLne (~seaLne@kde/kenny) has joined #kde-baloo
*** Mode #kde-baloo +o seaLne by ChanServ
<DrEeevil> krop: how can kde have releases with such serious regressions?
<DrEeevil> sorry, I don't deal with C++, in this case I'm just a QA tool
<krop> no, you just behave like a stubborn child
<DrEeevil> because I actually would like to USE kde
<DrEeevil> not sure how you see that, but it's kinda nice usually, except when someone staples in a DoS and then tells me that's all fine and dandy
<DrEeevil> maybe I should use git HEAD again to catch regressions earlier
*** Mode #kde-baloo +b DrEeevil!*@* by seaLne
*** Mode #kde-baloo +b not!*@* by seaLne
*** Mode #kde-baloo +b being!*@* by seaLne
*** Mode #kde-baloo +b constructive!*@* by seaLne
<DrEeevil> heh
<-* seaLne has kicked DrEeevil from #kde-baloo (DrEeevil)

Posted by Patrick | Permalink

Sat Apr 5 12:29:12 CEST 2014

"smart" software

1) Grab webbrowser
2) Enter URL
3) Figure out that webbrowser doesn't want to use HTTP because ... saturday? I don't know, but ass'u'me'ing that some URLs are ftp is just, well, stupid, because your heuristic is whack.

Or, even more beautiful:
$ clementine
18:02:59.662 WARN  unknown                          libpng warning: iCCP: known incorrect sRGB profile 
Bus error


I have no idea what this means, so I'll be explicitly writing http:// at the beginning of all URLs I offer to Firefox. And Clementine just got a free trip behind the barn, where it'll get properly retired - after all it doesn't do the simple job it was hired to do. Ok, before this it randomly didn't play "some" music files because gstreamer, which makes no sense either, but open rebellion will not have happy results.

I guess the moral of the story is: Don't misengineer things, Clementine should output music and not be a bus driver. Firefox should not interpret-dance the URLs offered to it, but since it's still less retarded than the competition it'll be allowed to stay a little bit longer.

Sigh. Doesn't anyone engineer things anymore?

Posted by Patrick | Permalink

Fri Feb 28 08:37:26 CET 2014

INSTALL_MASK'ing for a better future

So today I was pointed at a funny one:
/etc/systemd/system/ntpdate.service.d/00gentoo.conf
Now instead of being wrongly installed in /usr/lib (whuarghllaaaaaaaawwreghhh!?!$?) there's some config files for systemd bleeding into /etc.

Apart from being inconsistent with itself this evades all previous ways of keeping useless files from being installed. The proper response thus looks like this now:
INSTALL_MASK="/lib/systemd /lib32/systemd /lib64/systemd /usr/lib/systemd /usr/lib32/systemd /usr/lib64/systemd /etc/systemd"
And on the upside this will break udev unless you carefully move config to /etc (lolwat ur no haz EUNICHS system operation?) - which just motivated me to shift everything I can to eudev.

Reading recommendation: FHS

Posted by Patrick | Permalink

Thu Feb 20 09:32:05 CET 2014

gentoo-x86 to git, round two

After my not-so-good experiments with cvs2git I was pointed at cvsps. The currently masked 3.13 release (plus the latest ~arch version of cvs) seems to do the trick quite well. It throws a handful of warnings about timestamps that appear to be harmless to me.
What I haven't figured out yet is how to "fix" the email addresses, but that's a minor thing.
Take the raw cvs repo as in the first blogpost, then:
$ time cvsps --root :local:/var/tmp/git-test/gentoo-x86-raw/ --fast-export gentoo-x86 > git-fast-export-stream
cvsps: NOTICE: used alternate strip path /var/tmp/git-test/gentoo-x86-raw/gentoo-x86/
cvsps: broken revision date: 2003-02-18 13:46:55 +0000 -> 2003-02-18 13:46:55 file: dev-php/PEAR-Date/PEAR-HTML_Common-1.0.ebuild, repairing.

[SNIP]

real    212m56.219s
user    12m11.170s
sys     6m59.110s
So this step takes about 3.5h walltime and consumes ~10GB RAM. It generates about 17GB of temporary data.
To get performance up you'd need a machine with 32GB+ RAM so that you can do that in tmpfs (and don't forget to make /tmp a tmpfs too, because tmpfile() creates lots and lots of temporary files there) - and the tmpfs needs to be >18GB.

In theory you can pipe that directly into git-fast-import. To make testing easier I didn't do that.
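
The import from the saved stream can be sketched like this. It assumes a git new enough to know "git -C"; import_stream is a made-up wrapper, and the repository path is whatever you pick:

```shell
# import_stream STREAM REPO
# Create an empty repository at REPO and load a fast-export stream into it.
import_stream() {
    local stream=$1 repo=$2
    git init --quiet "$repo" &&
    git -C "$repo" fast-import --quiet < "$stream"
}
```

Something like "import_stream git-fast-export-stream gentoo-x86-git" would do the import (leave out --quiet to get the statistics printed at the end). fast-import only writes objects and refs, so the working tree stays empty until you check out a branch.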
Throwing everything into git takes "a while" (forgot to time it, about 20 minutes I think):
Alloc'd objects:    9680000
Total objects:      9675121 (    190979 duplicates                  )
      blobs  :      3020032 (    158366 duplicates    1389088 deltas of    2989578 attempts)
      trees  :      5150778 (     32613 duplicates    4633675 deltas of    4709477 attempts)
      commits:      1504311 (         0 duplicates          0 deltas of          0 attempts)
      tags   :            0 (         0 duplicates          0 deltas of          0 attempts)
Total branches:           8 (         3 loads     )
      marks:     1073741824 (   4682709 unique    )
      atoms:         431658
Memory total:        516969 KiB
       pools:         63219 KiB
     objects:        453750 KiB

pack_report: getpagesize()            =       4096
pack_report: core.packedGitWindowSize = 1073741824
pack_report: core.packedGitLimit      = 8589934592
pack_report: pack_used_ctr            =    7139457
pack_report: pack_mmap_calls          =    1976288
pack_report: pack_open_windows        =          3 /          9
pack_report: pack_mapped              = 2545679911 / 8589934592
And then run git gc (warning: another mem-hungry operation, peaking at ~8GB).
The result is a git repository of about 7.2GB that appears to have the full history.

Files to play around with:
Raw copy of the CVS repo (~440MB)
The git-fast-importable stream created by cvsps (biiig)
The mangled compressed git repository that results from it (~6GB)
Edit:
The same repo recompressed (~1.7GB)
"git repack -a -d -f --max-pack-size=10g --depth=100 --window=250" takes ~3 CPU-hours and collapses the size nicely. Thanks, Mr. Klausmann!

Posted by Patrick | Permalink

Wed Feb 19 07:43:21 CET 2014

Thunderbird - double sending is better sending

So here's something brilliant I've found while debugging some PGP-issues:
0q2CYNVFEz6wXHAGYArfO/F/faOL5L6fQw9f93FurZgx7Y+iR1J7Civaa7LHxQ8h
FzstP7BYEhCx2HmEZuDf18htDsTBZAlNVGsI0DMb2wFKudCaI7hXhMHpYBQF/rdZ
=3Dw1hZ
-- --  END PGP MESSAGE     


--  -- --  --  -- 070107010101000406000609
Content Type: text/html; charset=ISO 8859 1
Content Transfer Encoding: 8bit

<html>
  <head>
    <meta content="text/html; charset=ISO 8859 1"
      http equiv="Content Type">
  </head>
  <body bgcolor="#FFFFFF" text="#000000">
    <br>
         BEGIN PGP MESSAGE      <br>
    Charset: ISO 8859 1 <br>
    Version: GnuPG v2.0.22 (MingW32) <br>
    Comment: Using GnuPG with Thunderbird   <a class="moz txt link freetext" href="http://www.enigmail.net/">http://www.enigmail.net/</a> <br>
     <br>
    hQIMA0dhXCfgRaeBAQ/+P2NCYSVE7vxW742D9eYJmJ/7g7xHSvPFuYvGSZk2gRaJ <br>
    JoZ98x+TPjSlvYVWuS+Y2Fz04ydhi4vNcK+QAqImVO0nO6dFvxUfmZiERBcYGs4C <br>
    Lhe+B/I0P/hEDl+Zu/QJ/v+SEcFoXKv2iclrXwWF6RyLlO97iu8UsLYUjLIZ7Y+r <br>
    YGqphoIdJLfVZ9bb05RIb0ZKnYX5dzunpqu6V6zRpwckWCkos7qBOZ9hfBjaFkvD <br>
    ZQAoJM78qQ0//vV6qyxSpXXFEFbDZuJjPjjDfIF+qyNbcW657bDHQH2ctcyvdcTf <br>
(Modulo some dashes, but you get the idea)

So, uhm, there's a multipart-mime mail, with a PGP-encrypted attachment, and then there's a properly quoted HTML attachment, CONTAINING the same PGP attachment BASE64 encoded. Or something. The funny thing is that Thunderbird itself fails to display the body directly, but displays it in the editor window when you reply.
In vino veritas, and tonight I will need lots of veritas to unremember this madness.

Posted by Patrick | Permalink