Sat Mar 12 13:05:37 CET 2016

How to break sysctl

A long time ago sysctl used one config file: /etc/sysctl.conf

There was a simple way to (re)load the values from that file: sysctl -p
The aliases -f and --file do the same.

Then things were Improved and Enhanced. Now sysctl -p will either fail (bug in procps-3.3.9) or not apply the config (3.3.10+). Which is possibly at times a bit fatal on production machines that rely on nonstandard settings to handle the workload.

How did things break? Of course new config paths must be added. Like any Modern Application, sysctl will read snippets from a directory, and not just from one place but six:
/run/sysctl.d/*.conf
/etc/sysctl.d/*.conf
/usr/local/lib/sysctl.d/*.conf
/usr/lib/sysctl.d/*.conf
/lib/sysctl.d/*.conf
/etc/sysctl.conf

So let's think ...

/run ? Why would you put config there. Srsly wat. Use sysctl -w if you want to temporarily set a value.

/etc/sysctl.d ? Looks reasonable.

/usr/local/lib ? WAT. That's not a path where config lives. /usr/lib ? Why do you put things that are not libs in libdir? And since you need administrative access to modify that path, it is like /etc/sysctl.d, only strictly worse.

/lib ? oh, because ... uhm ... I can't figure this one out

and finally, the classic /etc/sysctl.conf.

So four of the six paths are poop, and we could completely remove this misfeature by adding an 'include /etc/sysctl.d/*.conf' to /etc/sysctl.conf. Then we wouldn't need sysctl --system, sysctl -p would still work, and there'd be less code written to implement this misfortune and less code written to mitigate the failures caused by it.
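
As a rough sketch of that simpler world (not how procps actually does it), a small shell function can load the snippets from /etc/sysctl.d and then the classic file, one after another. It assumes that sysctl -p accepts a file name argument; load_sysctl, SYSCTL_ROOT and SYSCTL_CMD are names made up for this example, the latter two only so that it can be dry-run without touching the kernel.

```shell
# Hypothetical helper: load every snippet, then the classic file last.
# SYSCTL_ROOT and SYSCTL_CMD are made-up knobs for dry runs and testing.
load_sysctl() {
    root="${SYSCTL_ROOT:-}"
    cmd="${SYSCTL_CMD:-sysctl -p}"
    for f in "$root"/etc/sysctl.d/*.conf "$root"/etc/sysctl.conf; do
        # An unmatched glob stays literal, so skip anything that is not a file.
        [ -f "$f" ] || continue
        # $cmd is left unquoted on purpose so "sysctl -p" splits into two words.
        $cmd "$f" || return 1
    done
}
```

Calling load_sysctl as root applies everything; SYSCTL_CMD=echo load_sysctl only prints the files in the order they would be loaded.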

Having to fight such changes and the breakage they cause is frustrating; by changing less we could achieve more.
What amuses me most about this is that this change actually broke the new feature (--system) in the first iteration, after breaking the old behaviour. Amazing amount of churn that doesn't fix a problem we've had. No, I'm not grumpy!

Posted by Patrick | Permalink

Sun Nov 15 14:32:31 CET 2015

Memories of the future of the past

Every now and then I lament the badness of current things, and then I remember the things we had that we can't get anymore ...

  • ISDN telephones were a wonderful upgrade to analog telephones: 8kB/s (64 kbit/s) of dedicated bandwidth with low latency. Compared to that, all VoIP things I've used were just horribly cheap shoddy crap of unspeakably bad quality. Luckily ISDN has been discontinued and is no longer available to consumers, so we have no reference for what good audio quality means.
  • High-res displays like the IBM T220. Look it up, it's a time traveller! And, of course, it was discontinued, with no modern device coming close.
  • Mobile phones that we recharged every week, like the Motorola StarTAC I bought a few years ago. Now 24h seems to be 'ok' ...
  • Washing machines that took 30 minutes for one load which is not energy-efficient, so instead the modern ones run for 1-2h. Not sure how that helps, and it looks like they use more water too. So we just hide the problem and PROBLEM SOLVED?
  • ThinkPad Notebooks
And many other things that were better in the past, but have now regressed to a lower-quality, fewer-features, harder-to-repair state.

Can we please have more future?

Posted by Patrick | Permalink

Thu Oct 22 14:02:09 CEST 2015

WTF Google

Every now and then this happens: I have no idea what it is supposed to do, but it makes the whole site non-interactive, which sucks.
If this is Google's attempt to recruit me, or whatever, it should try to do it with less JavaScript.

Maybe I should just DuckDuckGo instead?

Posted by Patrick | Permalink

Sun Sep 6 12:46:21 CEST 2015

Printers in Linux ...

I was trying to make a Brother printer do some printing. (Optimistic, I know ... but ...)
Out of the box CUPS can't handle it. Since it's CUPS it doesn't do errors, so who knows why. Anyway, the autoconfigured stuff ends up being eh whut ur doin is naught gud.

The config file points at a hostname that is not even within our local network, so I think this won't print locally. Cloud printing? CUPS can do!
There's a printer driver from Brother, which is horribly stupid, bad, stupid, and bad. I read through the 'installer' script trying to figure out what it would do, but then I realized that I'm not on an RPM distro so it's futile.

So then I figured "why not postscript?"

And guess what. All the documentation I found was needlessly wrong and broken, when all you have to do is configure an IPP printer, generic postscript, and ... here comes instant good quality colour duplex output from the printer.

I'm confused about many things here - the complexity of Brother's attempt at a driver that doesn't work, CUPS autoconfiguration that sees the printer and then derps out, and all the documentation that doesn't point out that all this weird layercake of workarounds is not needed because it's a standards-compliant postscript printer.
How exhausting.

Posted by Patrick | Permalink

Fri Jul 17 05:29:37 CEST 2015

OpenLDAP upgrade trap

After spending a few hours trying to figure out why OpenLDAP 2.4 did not want to return any results while 2.3 worked so nicely ...

The following addition to the slapd config file Makes Things Work (tm) - I have no idea if this is actually correct, but now things don't fail.
access to dn.base="dc=example,dc=com"
    by anonymous search
    by * none
I didn't notice this change in search behaviour in any of the Changelog, Upgrade docs or other documentation - so this is quite 'funny', but not really nice.

Posted by Patrick | Permalink

Wed Apr 29 05:03:41 CEST 2015

Code Hygiene

Some convenient Makefile targets that make it very easy to keep code clean:
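.PHONY: scan indent

# Run clang's static analyzer while building foo (recipe lines need a real tab).
scan:
	scan-build clang foo.c -o foo

# Reformat all C sources in place, kernel style.
indent:
	indent -linux *.c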
scan-build is llvm/clang's static analyzer and generates some decent warnings. Using clang to build (in addition to 'default' gcc in my case) helps diversity and sometimes catches different errors.

indent makes code pretty. The 'linux' default settings are not exactly what I want, but close enough that I don't care to fine-tune yet.

Every commit should be properly indented and not cause more warnings to appear!
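
One way to enforce the indentation half of that rule is a check that fails whenever indent would change a file. This is only a sketch: it assumes GNU indent, whose -st option writes the result to standard output instead of rewriting the file, and check_indent and INDENT_CMD are made-up names.

```shell
# Hypothetical check: succeed only if every given file is already indented.
# INDENT_CMD is a made-up override; the default assumes GNU indent with -st
# (write to standard output, leave the file alone).
check_indent() {
    cmd="${INDENT_CMD:-indent -linux -st}"
    rc=0
    for f in "$@"; do
        # Compare the formatter's output with the file as it is on disk.
        if ! $cmd "$f" | cmp -s "$f" -; then
            echo "needs indent: $f"
            rc=1
        fi
    done
    return $rc
}
```

Something like check_indent *.c could then run from a pre-commit hook or an extra make target.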

Posted by Patrick | Permalink

Sat Apr 11 13:06:54 CEST 2015

Almost quiet dataloss

Some harddisk manufacturers have interesting ideas ... using some old Samsung disks in a RAID5 config:
[15343.451517] ata3.00: exception Emask 0x0 SAct 0x40008410 SErr 0x0 action 0x6 frozen
[15343.451522] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451527] ata3.00: cmd 61/20:20:d8:7d:6c/01:00:07:00:00/40 tag 4 ncq 147456 out
                        res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[15343.451530] ata3.00: status: { DRDY }
[15343.451532] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451536] ata3.00: cmd 61/30:50:d0:2f:40/00:00:0d:00:00/40 tag 10 ncq 24576 out
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[15343.451538] ata3.00: status: { DRDY }
[15343.451540] ata3.00: failed command: WRITE FPDMA QUEUED
[15343.451544] ata3.00: cmd 61/a8:78:90:be:da/00:00:0b:00:00/40 tag 15 ncq 86016 out
                        res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[15343.451546] ata3.00: status: { DRDY }
[15343.451549] ata3.00: failed command: READ FPDMA QUEUED
[15343.451552] ata3.00: cmd 60/38:f0:c0:2b:d6/00:00:0e:00:00/40 tag 30 ncq 28672 in
                        res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[15343.451555] ata3.00: status: { DRDY }
[15343.451557] ata3: hard resetting link
[15343.911891] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[15344.062112] ata3.00: configured for UDMA/133
[15344.062130] ata3.00: device reported invalid CHS sector 0
[15344.062139] ata3.00: device reported invalid CHS sector 0
[15344.062146] ata3.00: device reported invalid CHS sector 0
[15344.062153] ata3.00: device reported invalid CHS sector 0
[15344.062169] ata3: EH complete
Hmm, that doesn't look too good ... but mdadm still believes the RAID is functional.

And a while later things like this happen:
[ 2968.701999] XFS (md4): Metadata corruption detected at xfs_dir3_data_read_verify+0x72/0x77 [xfs], block 0x36900a0
[ 2968.702004] XFS (md4): Unmount and run xfs_repair
[ 2968.702007] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702011] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702015] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702018] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702021] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702048] XFS (md4): metadata I/O error: block 0x36900a0 ("xfs_trans_read_buf_map") error 117 numblks 8
[ 2968.702476] XFS (md4): Metadata corruption detected at xfs_dir3_data_reada_verify+0x69/0x6d [xfs], block 0x36900a0
[ 2968.702491] XFS (md4): Unmount and run xfs_repair
[ 2968.702494] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702498] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702501] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702505] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702508] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702825] XFS (md4): Metadata corruption detected at xfs_dir3_data_read_verify+0x72/0x77 [xfs], block 0x36900a0
[ 2968.702831] XFS (md4): Unmount and run xfs_repair
[ 2968.702834] XFS (md4): First 64 bytes of corrupted metadata buffer:
[ 2968.702839] ffff8802ab5cf000: 04 00 00 00 99 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702842] ffff8802ab5cf010: 03 00 00 00 00 00 00 00 02 00 00 00 9e 00 00 00  ................
[ 2968.702866] ffff8802ab5cf020: 0c 00 00 00 00 00 00 00 13 00 00 00 00 00 00 00  ................
[ 2968.702871] ffff8802ab5cf030: 04 00 00 00 82 00 00 00 fc ff ff ff ff ff ff ff  ................
[ 2968.702888] XFS (md4): metadata I/O error: block 0x36900a0 ("xfs_trans_read_buf_map") error 117 numblks 8
fsck finds quite a lot of data not being where it should be.
I'm not sure who to blame here - the kernel should actively punch out any harddisk that is fish-on-land flopping around like that, the md layer should hate on any device that even looks weird, but somehow "just doing a link reset" is considered enough.

I'm not really upset that an old cheap disk that is now ~9 years old decides to have dementia, but I'm quite unhappy with the firmware programming that doesn't seem to consider data loss as a problem ... (but at least it's not Seagate!)
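
Since md keeps calling the array healthy, the kernel log is the place to notice a disk like this early. Here is a small sketch that counts link resets per ATA port in log text read from standard input. The function name is made up, and the match is based only on the "hard resetting link" lines shown above.

```shell
# Count "hard resetting link" events per ATA port in kernel log text on stdin.
# Prints one "port count" pair per line, sorted by port name.
count_link_resets() {
    awk '/ata[0-9]+: hard resetting link/ {
             for (i = 1; i <= NF; i++)
                 if ($i ~ /^ata[0-9]+:$/) { sub(/:$/, "", $i); n[$i]++ }
         }
         END { for (p in n) print p, n[p] }' | sort
}
```

Usage would be something like dmesg | count_link_resets; any port with a non-trivial count deserves a closer look before the filesystem starts complaining.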

Posted by Patrick | Permalink

Wed Mar 18 03:35:39 CET 2015

Upgrading Thunderbird

With the recent update from the LongTimeSuffering / ExtendedSufferingRelease of Thunderbird from 24 to 31 we encountered some serious badness.

The best description of the symptoms would be "IMAP doesn't work at all"
On some machines the existing accounts would be disappeared, on others they would just be inert and never receive updates.

After some digging I was finally able to find the cause of this:
Too old config file.

Uhm ... what? Well - some of these accounts have been around since TB2. Some newer ones were enhanced by copying the prefs.js from existing accounts. And so there's a weird TB bugreport that is mostly triggered by some bits being rewritten around Firefox 30, and the config parser screwing up with translating 'old' into 'new', and ... effectively ... IMAP being not-whitelisted, thus by default blacklisted, and hilarity happens.

Should you encounter this bug you "just" need to revert to a prefs.js from before the update (sigh) and then remove all lines involving "capability.policy".
Then update and ... things work. Whew.
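
The cleanup step can be sketched in shell. This assumes Thunderbird is not running (it rewrites prefs.js on exit) and that prefs.js has already been restored from a pre-update copy. strip_capability_policy is a made-up name, and the profile path in the usage line is a placeholder.

```shell
# Hypothetical helper: drop every line mentioning capability.policy from a
# prefs.js, keeping the untouched original next to it as prefs.js.bak.
strip_capability_policy() {
    prefs="$1"
    cp -p "$prefs" "$prefs.bak" || return 1
    # Read from the backup and write the filtered result over the original.
    sed '/capability\.policy/d' "$prefs.bak" > "$prefs"
}
```

Usage would be along the lines of strip_capability_policy ~/.thunderbird/<profile>/prefs.js, once per affected profile.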

Why not just remove the profile and start with a clean one, you say? Well ... for one, TB gets brutally unusably slow if you have emails. So just re-reading the mailbox content from a local fast IMAP server will take ~8h and TB will not respond to user input during that time.
And then you manually have to go into eeeevery single subfolder so that TB remembers it is there and actually updates it. That's about one work-day per user lost to idiocy, so sed'ing the config file into compliance is the easy way out.
Thank you, Mozilla, for keeping our lives exciting!

Posted by Patrick | Permalink

Mon Feb 2 03:33:51 CET 2015

Mozilla: Hating you so you don't have to

Ahem. I'm mildly amused, Firefox 35 shows me this nice little informational message in the "Get addons" view:
Secure Connection Failed

An error occurred during a connection to services.addons.mozilla.org. 
Peer's Certificate has been revoked. (Error code: sec_error_revoked_certificate) 
Oh well. Why was I looking at that anyway? Well, for some reason I've had adb (android thingy) running on my desktop. Which makes little sense ... but ... find tells me:
./.mozilla/firefox/badrandomvalue.default/extensions/adbhelper@mozilla.org/linux64/adb
So now there's a random service running *when I start firefox* because ...


err, I might want to "test, deploy and debug HTML5 web apps on Firefox OS phones & Simulator, directly from Firefox browser."
Which I don't. But I appreciate having extra crap default-enabled for no reason. Sigh.

Mozilla: We hate you so you don't have to

Posted by Patrick | Permalink

Wed Jan 28 06:26:16 CET 2015

CGit

Dirty hack of the day:

A CGit Mirror of git.overlays.gentoo.org

I wonder if the update cronjob actually works ...

Posted by Patrick | Permalink