From 482ddccda5bd5cadda154930cd78b2be1d8f6622 Mon Sep 17 00:00:00 2001 From: Wynn Wolf Arbor Date: Sat, 11 Jan 2020 23:14:56 +0100 Subject: posts: Add new post --- posts/debug-ncmpcpp-taglib.md | 174 +++++++++++++++++++++++++++++++++++++++ posts/img/ncmpcpp-double-tag.png | Bin 0 -> 33098 bytes 2 files changed, 174 insertions(+) create mode 100644 posts/debug-ncmpcpp-taglib.md create mode 100644 posts/img/ncmpcpp-double-tag.png diff --git a/posts/debug-ncmpcpp-taglib.md b/posts/debug-ncmpcpp-taglib.md new file mode 100644 index 0000000..d8d6d57 --- /dev/null +++ b/posts/debug-ncmpcpp-taglib.md @@ -0,0 +1,174 @@ +title: Debugging adventures: ncmpcpp and taglib +date: 2020-01-11 +author: Wolf + +I have a pretty extensive music library which I manage with +[mpd(1)](https://www.musicpd.org/) and +[ncmpcpp(1)](https://github.com/arybczak/ncmpcpp). Whilst I would love to +find a standalone tag editor that is easily usable from the command line +(or alternatively a nicely designed GUI one), ncmpcpp's built-in tag +editor has always been enough for my needs. + +About two weeks ago I was in the process of adding a new album to my +collection. This always includes a quick glance at the tags and some +cleanup. This time, however, after I saved my changes, the following +happened: + +![ncmpcpp displaying duplicate entries](img/ncmpcpp-double-tag.png) + +Every time I saved my changes, another duplicate entry would be added to +each tag. I had not updated `ncmpcpp(1)` in a while and was initially +confused as to why it failed suddenly. I left it, removed all tags using +`metaflac(1)`, and tagged it manually. Then I decided that I could not +be bothered but still wanted to investigate it at a later point, so I +added an entry to my to-do list. + +## Diving in + +Today I decided to look into this finally. First stop: the `ncmpcpp(1)` +source code. I already knew that it used `taglib` to access and modify +Vorbis comments, so I quickly found the [part of the +code](https://github.com/arybczak/ncmpcpp/blob/master/src/tags.cpp#L173) +that was responsible for saving changes: + +```cpp +void writeXiphComments(const MPD::MutableSong &s, TagLib::Ogg::XiphComment *tag) +{ + auto writeXiph = [&](const TagLib::String &type, const TagLib::StringList &list) { + tag->removeField(type); + for (auto it = list.begin(); it != list.end(); ++it) + tag->addField(type, *it, false); + }; + // remove field previously used as album artist + tag->removeField("ALBUM ARTIST"); + // remove field TRACK, some taggers use it as TRACKNUMBER + tag->removeField("TRACK"); + // remove field DISC, some taggers use it as DISCNUMBER + tag->removeField("DISC"); + // remove field DESCRIPTION, it's displayed as COMMENT + tag->removeField("DESCRIPTION"); + writeXiph("TITLE", tagList(s, &MPD::Song::getTitle)); + // [...] +} +``` + +Here, `ncmpcpp(1)` calls `removeField(type)` before adding a tag, which +looked fine. Maybe the list contains more than one entry? Stepping +through this code in `gdb(1)` revealed that `addField` was only called +once per tag, meaning it was only added once. At this point I remembered +that I was pretty sure `ncmpcpp(1)` had not been updated in a while, so +I refocused on `taglib` instead: Maybe `removeField` stopped working for +some reason. + +## Unorthodox packaging decisions + +A quick query against the Gentoo package database revealed that I was +using `taglib-1.11.1_p20190920-r1` which looked very suspicious already. +The latest stable version on the [website](https://taglib.org/) is +`1.11.1` from __October 24, 2016__. + +```sh +nabokov ~$ eix taglib +[I] media-libs/taglib (1.11.1_p20190920-r1@11/01/20) +``` + +So what is going on here? `taglib` is still under active development, +but there has not been a stable release for about four years. Gentoo +instead decided to +[package](https://gitweb.gentoo.org/repo/gentoo.git/tree/media-libs/taglib/taglib-1.11.1_p20190920-r1.ebuild#n7) +the [latest +commit](https://github.com/taglib/taglib/commit/54508df30bc888c4d2359576ceb0cc8f2fa8dbdf) +(at time of writing) as a "stable" version. For projects that have some +kind of stability guarantee regarding their API, this is a very +dangerous and strange choice on the part of the packagers; quite +frankly, it is inviting random breakage. + +## Exploring taglib + +Digging around in `taglib`'s [github +repo](https://github.com/taglib/taglib) revealed that `removeField` had +been [marked](https://github.com/taglib/taglib/issues/651) deprecated +internally in mid-2015 because of an apparent linking error when using +`String::null`. Let's have a look at the prototype of `removeField` from +[back then](https://github.com/taglib/taglib/blob/84e3582332eb593497912d055f15827a57815478/taglib/ogg/xiphcomment.h#L188): + +```cpp +void removeField(const String &key, const String &value = String::null); +``` + +This method removes a tag entry of a specific type and optionally +matches it against a given value. For example, `removeField("TITLE")` +removes all `TITLE` tags, and `removeField("TITLE", +"Garbagemx36")` only removes `TITLE` entries if they match +"Garbagemx36". + +The [issue](https://github.com/taglib/taglib/issues/651) mentions that +changing the default parameter for `value` to an empty string would +change the behaviour of the method. Suddenly, `removeField("TITLE")` +would only remove `TITLE` entries if they matched the empty string - +leaving everything else intact. + +Sounds like we found our culprit. But didn't the issue say that they +would not make this change? What gives? + +## Enter GDB + +At this point I started doubting the correctness of five year old issue +comments, and decided to have a look at things myself. Thankfully, +Gentoo makes it very easy to add debugging information to packages and +have the source code installed alongside binaries and library files. A +single addition to `/etc/portage/package.env` later and I was debugging +`removeField` in `gdb(1)`. + +Stepping ahead to [where the method +checks](https://github.com/taglib/taglib/blob/84e3582332eb593497912d055f15827a57815478/taglib/ogg/xiphcomment.cpp#L267) +whether `value` is null or not, I printed its contents: + +``` +(gdb) p value +$1 = (const TagLib::String &) @0x7fffffffc240: { + _vptr.String = 0x7ffff5be83b8 , static null = { + _vptr.String = 0x7ffff5be83b8 , + static null = , + static WCharByteOrder = TagLib::String::UTF16LE, d = 0x5555559a95c0}, + static WCharByteOrder = TagLib::String::UTF16LE, d = 0x555555b9f700} +``` + +Ah, like any good project, `taglib` is using a custom string type. Let's convert it to a C +string: + +``` +(gdb) print value.toCString(false) +$5 = 0x555555b9f740 "" +``` + +And there it was. `taglib` indeed was using an empty string (and not +`null`) as the default for `value` in `removeField`. + +## Closing Blame + +One question remained: Why was this changed after all? It seemed pretty +clear back then that the developers did not want to break the API. +Perhaps there was a different bug at work here? + +As I had learnt earlier, the default value for a parameter is found in +the header file, so I used `git-blame(1)` to find commits that introduced +changes to the prototype. Surprisingly, what I found was a very [recent +change](https://github.com/taglib/taglib/commit/c05fa78406fd8ce7382a11c1f63a17c4bfbe83fa#diff-b48915d5a31d8a598dc1f3e92810bbd1L189) +from September that added actual deprecation warnings... and +quietly changed `String::null` to `String()`. I was not able to find a +reason for this; to my knowledge, no commit or issue mentions this +change. + +So in the end, whilst wanting to keep up API promises, the behaviour of +an already deprecated method was changed in what can only be called the +"development branch" of the project, and it hit me because Gentoo +decided that it was a good idea to use work-in-progress code as a stable +candidate for a package. + +Around the time I discovered this, someone else had already opened an +[issue](https://github.com/arybczak/ncmpcpp/issues/371) on GitHub. I +shared my findings and later opened a [pull +request](https://github.com/arybczak/ncmpcpp/pull/373). I am pondering +bringing this to the package maintainer's attention, but am not +confident it will result in any positive change. diff --git a/posts/img/ncmpcpp-double-tag.png b/posts/img/ncmpcpp-double-tag.png new file mode 100644 index 0000000..c9efbb6 Binary files /dev/null and b/posts/img/ncmpcpp-double-tag.png differ -- cgit v1.2.3-2-gb3c3