The Music Database: Prehistory

In 1997, after leaving NIWEB but before starting BlackStar, I launched The Music Database. It was a simple concept with big plans. The Internet Movie Database (IMDB) was one of the most useful sites on the Web (at this point it hadn’t yet been sold to Amazon), but there was no equivalent for music. A few noble attempts had been made, but they relied on the population of the net building them up from scratch, and none had gained the critical momentum necessary.

And then I had an interesting idea. I’d been using CDDB for a while to save me having to type in the tracklists for the CDs I was listening to on my computer. A few emails later I’d become an official mirror of the CDDB, and was busily converting their information from a series of flatfiles into a relational database. Then I put together a simple web-based browse and search engine, and we got a local designer to knock up a simple design for it. The Music Database, version 1.0, was alive.
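For the curious, here is a minimal sketch of what that kind of flatfile-to-relational conversion can look like. It is not the original code: it assumes the xmcd-style `KEY=value` layout that CDDB entries used (`DISCID`, `DTITLE` as "Artist / Title", `TTITLE0`, `TTITLE1`, and so on), the sample entry is entirely made up, and the table and function names are invented for illustration.

```python
import sqlite3

# A made-up entry in the xmcd-style "KEY=value" layout (illustrative only).
SAMPLE = """\
# xmcd
DISCID=00000003
DTITLE=Example Artist / Example Album
TTITLE0=First Song
TTITLE1=Second Song
TTITLE2=Third Song
"""


def parse_entry(text):
    """Turn one flatfile entry into a dict with artist, title and tracks."""
    disc = {"discid": None, "artist": None, "title": None, "tracks": {}}
    for line in text.splitlines():
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and anything that is not KEY=value
        key, value = line.split("=", 1)
        if key == "DISCID":
            disc["discid"] = value.strip()
        elif key == "DTITLE":
            # By convention the disc title line reads "Artist / Title".
            artist, sep, title = value.partition(" / ")
            disc["artist"] = artist.strip()
            disc["title"] = title.strip() if sep else artist.strip()
        elif key.startswith("TTITLE"):
            number = int(key[len("TTITLE"):])
            # Long titles may be split over repeated keys, so concatenate.
            disc["tracks"][number] = disc["tracks"].get(number, "") + value
    return disc


def load(conn, disc):
    """Store a parsed entry in two tables: one row per disc, one per track."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS disc "
        "(discid TEXT PRIMARY KEY, artist TEXT, title TEXT)"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS track "
        "(discid TEXT, position INTEGER, title TEXT, "
        "PRIMARY KEY (discid, position))"
    )
    conn.execute(
        "INSERT OR REPLACE INTO disc VALUES (?, ?, ?)",
        (disc["discid"], disc["artist"], disc["title"]),
    )
    conn.executemany(
        "INSERT OR REPLACE INTO track VALUES (?, ?, ?)",
        [(disc["discid"], n, t) for n, t in sorted(disc["tracks"].items())],
    )


conn = sqlite3.connect(":memory:")
load(conn, parse_entry(SAMPLE))
```

Once the data is in tables like these, browsing by artist or searching track titles becomes an ordinary SQL query rather than a scan through thousands of text files.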

At this point there were over 50,000 CDs in this database, which made it much larger than most of the versions trying to bootstrap from nothing. Of course, as all the data had been entered by people who just wanted to see it appear in their CD players, not in a database, there was little consistency of spelling, or even adherence to the CDDB’s rather loose specification for how info should be entered. But we had a couple of media trainee placement students with us for several months, and we got one of them to start cleaning the data up. The nine different spellings of Einstürzende Neubauten were consolidated; multiple pressings of the same CD that differed only in the last song running two thousandths of a second longer were removed. And the daily feed of updates and new CDs from CDDB got larger and larger. People started linking to us. And the traffic kept growing.
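Our cleanup was done by hand, but the two kinds of fix described above are easy to illustrate. The following is only a sketch under my own assumptions, not anything we ran at the time: the function names, the choice of normalisation (strip accents, ignore case and extra spaces) and the timing tolerance are all invented for the example.

```python
import unicodedata
from collections import defaultdict


def artist_key(name):
    """Reduce a name to a comparison key: no accents, no case, single spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.casefold().split())


def group_spellings(names):
    """Group names that share a key, so a human can pick the canonical one."""
    groups = defaultdict(set)
    for name in names:
        groups[artist_key(name)].add(name)
    return dict(groups)


def same_pressing(lengths_a, lengths_b, tolerance=0.01):
    """Treat two discs as one pressing if every track length (in seconds)
    matches to within `tolerance`. The tolerance value is an assumption."""
    if len(lengths_a) != len(lengths_b):
        return False
    return all(abs(a - b) <= tolerance for a, b in zip(lengths_a, lengths_b))


variants = ["Einstürzende Neubauten", "einsturzende neubauten", "EINSTÜRZENDE  NEUBAUTEN"]
groups = group_spellings(variants)
```

A key like this only catches the mechanical variations. Spellings such as “Einstuerzende” or outright typos would still need fuzzier matching, or a patient trainee.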

We had two features that made the initial version a success. Firstly, not only was information on the band you were interested in probably there, you could actually find it. We were pretty much the only music site that was able to find things by “The The”. Secondly, you could see which compilation albums an artist had appeared on. Most music sites treated compilation albums as second class, only really listing what they classed as “significant” appearances – usually on soundtracks of big films.

But as it grew we started getting swamped by emails. People started pointing out corrections, which we could fix ourselves, but we preferred to ask them to get a CDDB-aware player, put their CD in it, and resubmit the corrected version back to CDDB. Then not only would we get the fixes, but so would everyone else who used the player. But we also got lots of emails from people asking “How do I get a copy of this CD?” We initially linked to NTK, one of the nascent online retailers, but then we noticed that most of the CDs that people were asking about weren’t actually available.

So we launched MDB version 2. This time it had a simple button beside every CD allowing visitors to buy or sell their CD. There was no fancy eBay-style transactional engine behind this. People just entered the price they’d sell at, or the price they’d buy at, and these details, along with the user’s email address, would be displayed to anyone else who was interested. Thousands of CDs were listed within the first few weeks, and the volume of email dropped to a sensible level again.

We had great plans for version 3 that would allow users to start entering extra information for the CDs, to start turning this into a great service, equivalent to IMDB.

Then we got a call from the aforesaid retailer, asking if we could license them the data we were using. We explained that it was mostly just CDDB data, and that they could license it from CDDB directly. They explained that they had looked at CDDB, and it was the fact that we had “cleaned” the data up that interested them. We sought some legal advice and learned more about copyright law for data than anyone should ever need to know. Although legally we probably could license the cleaned-up data, it didn’t seem right. CDDB was GPLed, and the data had always been listed as GPLed too, although this notice had recently disappeared from the download files, possibly because the GPL couldn’t apply to data.

We eventually came up with an agreement that kept both sides happy. We would license the database of Valley, the distributor that most of the online retailers used as a dropshipper, and write a piece of code that could turn the CD’s unique fingerprint (using a different algorithm from CDDB, whose approach had significant clashing issues with shorter CDs) into a reference into the Valley database.
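To show why clashes happen, here is a sketch of the classic CDDB disc ID as it is commonly documented, not the replacement fingerprint we wrote for the deal. The ID packs three things into 32 bits: a checksum made from the digit sums of each track’s start time in seconds, the disc length in seconds, and the track count. The function names and the two example discs are made up for illustration.

```python
FRAMES_PER_SECOND = 75  # CD audio offsets are counted in 1/75-second frames


def digit_sum(n):
    """Sum of the decimal digits of n, e.g. 120 -> 1 + 2 + 0 = 3."""
    total = 0
    while n > 0:
        total += n % 10
        n //= 10
    return total


def cddb_disc_id(track_offsets, leadout_offset):
    """Classic CDDB-style disc ID from track start offsets (in frames)
    and the lead-out offset (in frames)."""
    starts = [offset // FRAMES_PER_SECOND for offset in track_offsets]
    checksum = sum(digit_sum(s) for s in starts)
    length = leadout_offset // FRAMES_PER_SECOND - starts[0]
    return ((checksum % 255) << 24) | (length << 8) | len(track_offsets)


# Two hypothetical two-track discs of the same overall length. The second
# track starts at 120 seconds on one and 111 seconds on the other, but both
# start times have a digit sum of 3, so the IDs come out identical.
disc_a = cddb_disc_id([150, 9000], 22500)
disc_b = cddb_disc_id([150, 8325], 22500)
print(f"{disc_a:08x} {disc_b:08x}")
```

With only a handful of tracks there are very few digit sums feeding the checksum, which is one plausible reason singles and other short CDs collided more often than full-length albums.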

Unfortunately CDDB weren’t happy. Despite our assurances that we weren’t licensing “their” data on, and despite us sending NTK to talk to them to see if there was anything they could license directly, they accused us of going behind their back and stealing their information, and subsequently removed our feed.

We had several ideas for how to get around this, but instead we took the money we’d made from the deal with NTK and started BlackStar. Anni, whom we’d just employed to keep cleaning up the data and to contact lots of music sites, telling them about MDB and getting them to link to it, handled the customer care for BlackStar instead. She was phenomenal in this role, and almost single-handedly responsible for setting the standard of care that made BlackStar legendary.

BlackStar grew so quickly that we didn’t have time to go back to the MDB, which, starved of new information, grew staler and staler over time.

NTK went on to merge with CDnow. CDDB went on to become a commercial organisation and launched CDDB2, leaving many of its early supporters and partners feeling screwed over. freedb was launched as an alternative, non-commercial version. It now has over half a million CDs.

And now the time is right to bring back The Music Database, built, this time, on freedb. We’ve a few new plans for it, but mostly the plan remains the same. There is still no music equivalent to IMDB. It’s time to change that.
