title: Debugging adventures: ncmpcpp and taglib
date: 2020-01-11
author: Wynn Wolf Arbor
I have a pretty extensive music library which I manage with
[mpd(1)](https://www.musicpd.org/) and
[ncmpcpp(1)](https://github.com/ncmpcpp/ncmpcpp). Whilst I would love to
find a standalone tag editor that is easily usable from the command line
(or alternatively a nicely designed GUI one), ncmpcpp's built-in tag
editor has always been enough for my needs.
About two weeks ago I was in the process of adding a new album to my
collection. This always includes a quick glance at the tags and some
cleanup. This time, however, after I saved my changes, the following
happened:
![ncmpcpp displaying duplicate entries](img/ncmpcpp-double-tag.png)
Every time I saved my changes, another duplicate entry would be added to each
tag. I had not updated `ncmpcpp(1)` in a while and was initially confused as to
why it failed suddenly. I left it, removed all tags using `metaflac(1)`, and
tagged the album manually. At the time I could not really be bothered to fix
it, but I still wanted to investigate it at a later point, so I added an entry
to my to-do list.
## Diving in
Today I decided to look into this finally. First stop: the `ncmpcpp(1)`
source code. I already knew that it used `taglib` to access and modify
Vorbis comments, so I quickly found the [part of the
code](https://github.com/ncmpcpp/ncmpcpp/blob/31ee76e8eee0cdc953f85d6c9369fb0aca010889/src/tags.cpp#L173-L199)
that was responsible for saving changes:
```cpp
void writeXiphComments(const MPD::MutableSong &s, TagLib::Ogg::XiphComment *tag)
{
auto writeXiph = [&](const TagLib::String &type, const TagLib::StringList &list) {
tag->removeField(type);
for (auto it = list.begin(); it != list.end(); ++it)
tag->addField(type, *it, false);
};
// remove field previously used as album artist
tag->removeField("ALBUM ARTIST");
// remove field TRACK, some taggers use it as TRACKNUMBER
tag->removeField("TRACK");
// remove field DISC, some taggers use it as DISCNUMBER
tag->removeField("DISC");
// remove field DESCRIPTION, it's displayed as COMMENT
tag->removeField("DESCRIPTION");
writeXiph("TITLE", tagList(s, &MPD::Song::getTitle));
// [...]
}
```
Here, `ncmpcpp(1)` calls `removeField(type)` before adding a tag, which
looked fine. Maybe the list contains more than one entry? Stepping
through this code in `gdb(1)` revealed that `addField` was only called
once per tag, meaning it was only added once. At this point I remembered
that I was pretty sure `ncmpcpp(1)` had not been updated in a while, so
I refocused on `taglib` instead: Maybe `removeField` stopped working for
some reason.
## Unorthodox packaging decisions
A quick query against the Gentoo package database revealed that I was
using `taglib-1.11.1_p20190920-r1` which looked very suspicious already.
The latest stable version on the [website](https://taglib.org/) is
`1.11.1` from __October 24, 2016__.
```sh
nabokov ~$ eix taglib
[I] media-libs/taglib (1.11.1_p20190920-r1@11/01/20)
```
So what is going on here? `taglib` is still under active development,
but there has not been a stable release for about four years. Gentoo
instead decided to
[package](https://gitweb.gentoo.org/repo/gentoo.git/tree/media-libs/taglib/taglib-1.11.1_p20190920-r1.ebuild#n7)
the [latest
commit](https://github.com/taglib/taglib/commit/54508df30bc888c4d2359576ceb0cc8f2fa8dbdf)
(at time of writing) as a "stable" version. For projects that have some
kind of stability guarantee regarding their API, this is a very
dangerous and strange choice on the part of the packagers; quite
frankly, it is inviting random breakage.
## Exploring taglib
Digging around in `taglib`'s [github
repo](https://github.com/taglib/taglib) revealed that `removeField` had
been [marked](https://github.com/taglib/taglib/issues/651) deprecated
internally in mid-2015 because of an apparent linking error when using
`String::null`. Let's have a look at the prototype of `removeField` from
[back then](https://github.com/taglib/taglib/blob/84e3582332eb593497912d055f15827a57815478/taglib/ogg/xiphcomment.h#L188):
```cpp
void removeField(const String &key, const String &value = String::null);
```
This method removes a tag entry of a specific type and optionally
matches it against a given value. For example, `removeField("TITLE")`
removes all `TITLE` tags, and `removeField("TITLE",
"Garbagemx36")` only removes `TITLE` entries if they match
"Garbagemx36".
The [issue](https://github.com/taglib/taglib/issues/651) mentions that
changing the default parameter for `value` to an empty string would
change the behaviour of the method. Suddenly, `removeField("TITLE")`
would only remove `TITLE` entries if they matched the empty string -
leaving everything else intact.
Sounds like we found our culprit. But didn't the issue say that they
would not make this change? What gives?
## Enter GDB
At this point I started doubting the correctness of five year old issue
comments, and decided to have a look at things myself. Thankfully,
Gentoo makes it very easy to add debugging information to packages and
have the source code installed alongside binaries and library files. A
single addition to `/etc/portage/package.env` later and I was debugging
`removeField` in `gdb(1)`.
Stepping ahead to [where the method
checks](https://github.com/taglib/taglib/blob/84e3582332eb593497912d055f15827a57815478/taglib/ogg/xiphcomment.cpp#L267)
whether `value` is null or not, I printed its contents:
```
(gdb) p value
$1 = (const TagLib::String &) @0x7fffffffc240: {
_vptr.String = 0x7ffff5be83b8 <vtable for TagLib::String+16>, static null = {
_vptr.String = 0x7ffff5be83b8 <vtable for TagLib::String+16>,
static null = <same as static member of an already seen type>,
static WCharByteOrder = TagLib::String::UTF16LE, d = 0x5555559a95c0},
static WCharByteOrder = TagLib::String::UTF16LE, d = 0x555555b9f700}
```
Ah, like any good project, `taglib` is using a custom string type. Let's convert it to a C
string:
```
(gdb) print value.toCString(false)
$5 = 0x555555b9f740 ""
```
And there it was. `taglib` indeed was using an empty string (and not
`null`) as the default for `value` in `removeField`.
## Closing Blame
One question remained: Why was this changed after all? It seemed pretty
clear back then that the developers did not want to break the API.
Perhaps there was a different bug at work here?
As I had learnt earlier, the default value for a parameter is found in
the header file, so I used `git-blame(1)` to find commits that introduced
changes to the prototype. Surprisingly, what I found was a very [recent
change](https://github.com/taglib/taglib/commit/c05fa78406fd8ce7382a11c1f63a17c4bfbe83fa#diff-b48915d5a31d8a598dc1f3e92810bbd1L189)
from September that added actual deprecation warnings... and
quietly changed `String::null` to `String()`. I was not able to find a
reason for this; to my knowledge, no commit or issue mentions this
change.
So in the end, whilst wanting to keep up API promises, the behaviour of
an already deprecated method was changed in what can only be called the
"development branch" of the project, and it hit me because Gentoo
decided that it was a good idea to use work-in-progress code as a stable
candidate for a package.
Around the time I discovered this, someone else had already opened an
[issue](https://github.com/ncmpcpp/ncmpcpp/issues/371) on GitHub. I
shared my findings and later opened a [pull
request](https://github.com/ncmpcpp/ncmpcpp/pull/373). I am pondering
bringing this to the package maintainer's attention, but am not
confident it will result in any positive change.