Archive for January, 2004

A Month of Bogofilter

January 12th, 2004 No comments

I’ve been using bogofilter against my spam for about a month now, and the results are looking good.

It’s catching a much higher percentage of my spam than SpamAssassin was, and I’ve only had one false positive. Although any number of false positives is a major problem, this doesn’t concern me for two reasons.

Firstly, SpamAssassin was giving me at least one false positive every couple of days. I get a lot of solicited commercial emails, including quite a lot of finance-related news. SpamAssassin had a nasty habit of assuming that anything that talked about mortgages etc. was spam. I trained bogofilter specifically against archives of this mail, and so far it hasn’t marked any as spam.

Secondly, the one false positive was rather an odd case. About a year ago I released a Perl module, Games::Boggle, that finds words on a Boggle board.

Recently I received email from a user who was having difficulty getting a script using it to work. With this script he included the entire dictionary file he was running the script against!

According to the theory of pseudo-Bayesian spam filtering, the spam detector should only pay attention to the 10 (or whatever) most significantly ham or spam words in your message (which is why the new wave of “include random phrases” or “include a chapter of a book” emails isn’t really causing me any difficulty). However, if there are more than 10 “definitely spam” and more than 10 “definitely ham” words, I’m guessing it doesn’t cope very well…

The Perpetuation of Errors

January 10th, 2004 No comments

There are many sites on the internet carrying the lyrics to Lou Reed and John Cale’s “Songs for Drella” album. Interestingly, it seems that the vast majority of them are wrong.

There’s a wonderful couplet in “Faces and Names”: People who want to meet the name I have / Are always disappointed in me. But almost every site that carries the lyrics has this as “… always disappointed when they meet me”.

Similarly there seem to be many sites that carry the text of the Robert Frost poem “The Road Less Traveled”. Of course, there’s no such poem (that phrase doesn’t even exist in it); the title is actually “The Road Not Taken”.

There’s always a temptation to believe that the better attested something is, the more likely it is to be true. Truth is never that simple.

The Abacus, Malone Road, Belfast

January 10th, 2004 No comments

Note to self:

Although the Abacus Chinese Restaurant on the corner of Eglantine Avenue and Malone Road does really good food really cheaply, remember that it pulls the nasty trick of pouring its soft drinks from 2-litre bottles that have been sitting around and have probably gone flat. (They do 500ml bottles of Ballygowan though, which they bring to the table.) Also remember that they pull the other nasty trick of charging 3% extra if you want to pay by credit card.

The Death of the B-Side

January 5th, 2004 No comments

Once upon a time lots of singles would be released with lots of extra B-Sides. It was quite common for a single release to consist of 2 CD singles, each with 3 bonus tracks, a 12″ with another 3, and a 7″/cassingle with yet another, thus hooking fans and collectors into purchasing all 4, but giving them an album’s worth of otherwise unreleased tracks. Even if they didn’t expect you to buy the vinyl, you’d still get 6 new tracks with the 2 CDs (which were often imported back into the UK as an EP from the US).

These days it seems you’re not allowed to do this. According to the Official Chart Rules [pdf], now only 3 versions of a single can count towards the charts, and you can only have 3 distinct tracks on a CD single (although you can have alternate versions of the main track).

Also, no individual release can be over 20 minutes long, unless you only have different remixes of the same song, in which case you can have any number of them, as long as the total running time is no longer than 40 minutes.

But of course none of this could possibly have anything whatsoever to do with the declining sales…

Worst Phone Interface ever

January 5th, 2004 1 comment

I would like to nominate the Student Loans Company as having the worst phone interface ever.

I got a letter from them last week stating that I have overpaid my loan, and that I “may” be entitled to a refund, and should call them.

You are presented with 3 options: “If you are calling on behalf of a borrower, press 1. If you want to change your bank details, press 2. Otherwise, press 3.” So, if I’m a borrower, am I calling on behalf of me? Do I press 1 or 3? I went for 1. Wrong. I was politely informed that due to customer confidentiality they couldn’t talk to me, and my call was disconnected.

Next time around, I pressed 3. Now I had to enter my “automated response ID”. If I didn’t know what it was, I could press 1. Of course, I had absolutely no idea what one was, or even why I might have one. So I pressed 1. Now I was informed that it was an 11-digit number which would be at the top of any recent correspondence (why they couldn’t have just said that without needing me to press 1, I’m not entirely sure).

So I checked my letter, and there was no such thing on it. There was a “when you call us quote the following reference – REF.” that just ended there with no reference whatsoever, and there were the loan account numbers for my 3 loans (all of which are 11 characters long, but contain letters as well as numbers). So I pressed 0 to just get an operator. After 10 minutes of being told that I was still in the queue, I hung up.

Next time around I tried my most recent loan account “number”, and was told it was invalid. Then I tried the earliest one, and hey presto, got put into presumably the same queue, with the same annoying voice telling me that I was in a queue every 30 seconds or so. Ten minutes later I hung up again.

I think I’ll just write instead. And probably make a Data Protection request too…