Avoiding the Chasm

March 30, 2008

MusicBrainz

Filed under: Media, Software. Posted by vextasy @ 10:45 pm

MusicBrainz is a community-driven music meta-database. It is almost a Wikipedia of the music world. Content is maintained by a community in which changes have to be approved by a vote before they become permanently accepted into the database, and a team of dedicated moderators oversees the whole thing. The words orderly and consistent spring to mind; emphasis is placed on correctness and on consistency of style (by that I mean, for example, capitalisation and abbreviation conventions). In return for their input, community members are rewarded with a great tool for maintaining their own music databases – the tags in their own music collection. The tool is called Picard and is free to download. In acknowledgement of the quality of the database, MusicBrainz has now been licensed by an impressive list of customers including MusicIP, the British Broadcasting Corporation (BBC) and Last.fm.
At some point during the Christmas holiday of 2005 I began the process of ripping my CD collection to allow it to be recalled and replayed in a more controlled manner and to reduce the amount of space that it occupied. I opted to make it visible on my home LAN using a combination of a Linksys NSLU2 (affectionately known as a SLUG) and a 250GB Buffalo DriveStation. The DriveStation is a USB hard drive and the Linksys bridges such a USB drive to an Ethernet network. The Linksys will support two drives, although I’ve not needed to use both here.
The nice thing about the Linksys is that it runs Linux and can be customised to perform a number of tasks in addition to file serving. Both devices sit quietly tucked out of the way in an upstairs room and are directly connected to my wireless broadband router. The Linksys runs the ubiquitous (at least in the Unix world) Samba SMB file server, which allows it to look just like a networked PC to other PCs on the home network. Files can be accessed from its drive(s), given the appropriate permission, as if they were on a PC but, of course, there is no fan noise, hot processor or display, and so power consumption is kept to a minimum. The beauty of this arrangement is that both devices can be left running and so are always available.

I used Windows Media Player to rip roughly 5,700 tracks from CD to mp3 format. Media Player makes a great attempt to tag the mp3 files correctly, but for an irritatingly large number of tracks the information is either incorrect or inconsistent. This is where MusicBrainz comes to the rescue, specifically MusicBrainz Picard, its free, open-source, cross-platform music file tagger.
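
To give a feel for the kind of inconsistency I mean, here is a minimal Python sketch that groups tag values differing only in capitalisation or spacing. It assumes the tags have already been read into plain dictionaries (reading them from the files would need a tagging library, which I leave out), and the function name and sample tracks are made up for illustration. It is not how Picard or Media Player works.

```python
from collections import defaultdict


def inconsistent_spellings(tracks, field="artist"):
    """Group values of one tag field that differ only in case or spacing."""
    variants = defaultdict(set)
    for track in tracks:
        value = track.get(field, "")
        # Normalise case and collapse runs of whitespace to build a grouping key.
        key = " ".join(value.casefold().split())
        variants[key].add(value)
    # Keep only the keys that were spelt more than one way.
    return {key: sorted(names) for key, names in variants.items() if len(names) > 1}


# Hypothetical tag data, as it might look after a bulk rip.
tracks = [
    {"artist": "The Rolling Stones"},
    {"artist": "the rolling stones"},
    {"artist": "Rolling Stones, The"},
    {"artist": "Miles Davis"},
]
print(inconsistent_spellings(tracks))
```

Note that a reordered name such as "Rolling Stones, The" slips through a check this simple, which is exactly where a curated database earns its keep.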

Picard uses the MusicBrainz database to correctly and consistently tag mp3, wav, vorbis, flac, mpc, mp4 and wma format files. If asked to identify a music CD it will recognise the artist and release from the CD's table of contents (the number of tracks and where each one starts), which it uses to construct a unique disc-id that can be compared to known disc-ids in the MusicBrainz database (at the time of writing there are approximately 228,000 such known disc-ids). Alternatively, Picard can recognise individual music files by a form of audio fingerprinting, and it makes a special effort to associate clusters of music files with a particular release or album. If neither of these techniques succeeds, the GUI allows manual associations to be made with the correct titles from the database.
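
For the curious, the disc-id can be sketched in a few lines of Python. This follows my understanding of the published MusicBrainz disc ID scheme: a SHA-1 hash over the first and last track numbers and the track start offsets, base64-encoded with three characters swapped. The function name and the sample offsets are my own illustration, and this is not Picard's actual code.

```python
import base64
import hashlib


def disc_id(first_track, last_track, lead_out, track_offsets):
    """Return a MusicBrainz-style disc ID for a CD table of contents.

    Offsets are in CD frames (75 per second) and include the usual
    150-frame lead-in.
    """
    # Slot 0 holds the lead-out offset; slots 1..99 hold the track
    # offsets, with unused slots left at zero.
    offsets = [0] * 100
    offsets[0] = lead_out
    for number, offset in enumerate(track_offsets, start=first_track):
        offsets[number] = offset

    # The hash input is upper-case hexadecimal ASCII text.
    text = "%02X%02X" % (first_track, last_track)
    text += "".join("%08X" % value for value in offsets)
    digest = hashlib.sha1(text.encode("ascii")).digest()

    # Standard base64, with three characters swapped to keep the ID URL-safe.
    encoded = base64.b64encode(digest).decode("ascii")
    return encoded.replace("+", ".").replace("/", "_").replace("=", "-")


# Illustrative six-track disc; the offsets are sample values only.
example = disc_id(1, 6, 95462, [150, 15363, 32314, 46592, 63414, 80489])
print(example)
```

Because only the track layout goes into the hash, two pressings of the same album with slightly different timings produce different disc-ids, which goes some way to explaining why the database holds so many of them.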

Once associations have been made, Picard displays the tag information currently stored in the music file alongside the suggested information (from the MusicBrainz database), together with an indication of closeness of fit, and allows (selective) correcting of the tags in the music file. Plugins to Picard allow you to pull down cover art or incorporate genre information from Last.fm but I haven’t tried either of these.
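
I don't know how Picard actually computes its closeness of fit, but the idea can be illustrated with Python's standard difflib module. In this rough sketch the function names, the choice of fields and the equal weighting are all my own assumptions, and the sample tags are invented.

```python
from difflib import SequenceMatcher


def tag_similarity(current, suggested, fields=("artist", "album", "title")):
    """Return a score from 0.0 to 1.0 for how closely two tag sets agree."""
    scores = []
    for field in fields:
        # Compare case-insensitively so "the beatles" matches "The Beatles".
        old = current.get(field, "").casefold().strip()
        new = suggested.get(field, "").casefold().strip()
        scores.append(SequenceMatcher(None, old, new).ratio())
    # Assumption: every field counts equally towards the overall score.
    return sum(scores) / len(scores)


def changed_fields(current, suggested):
    """List the fields whose stored value differs from the suggestion."""
    return [field for field, value in suggested.items() if current.get(field) != value]


# Invented example: tags as ripped versus tags as the database suggests.
current = {"artist": "the beatles", "album": "Abbey Rd", "title": "Come Together"}
suggested = {"artist": "The Beatles", "album": "Abbey Road", "title": "Come Together"}
print(round(tag_similarity(current, suggested), 3))
print(changed_fields(current, suggested))
```

The list of changed fields is what makes selective correction possible: you can accept the corrected album title while leaving any other field as it was.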

Reading Ian Dixon and Ed Bott’s postings on how they organise their music collections made me realise just how many different ways there are to achieve the same outcome. Where I think the MusicBrainz tools score is in the quality of the database that sits behind them.
