In the future, there will be so much open source software available, programmers will be judged by how much they know about it and how well they can glue it together to build solutions.
– dive into mark
In the Perl world this is already the case. The Perl programmer who knows what’s on CPAN has a huge head start on the one who doesn’t…
Kasia decided to investigate her highest score from SpamAssassin.
41.90 points is fairly impressive, but I’ve got 22 mails with higher scores than that in the last month!
My current highest is 56.8:
Content analysis details: (56.80 points, 5 required)
RCVD_FAKE_HELO_DOTCOM (2.3 points) Received contains a faked HELO hostname
NO_REAL_NAME (0.7 points) From: does not include a real name
SUBJ_HAS_SPACES (2.0 points) Subject contains lots of white space
AS_SEEN_ON (2.1 points) BODY: As seen on national TV!
ONLY_COST (0.2 points) BODY: Only $$$
MLM (0.8 points) BODY: Multi Level Marketing mentioned
BANG_GUARANTEE (0.5 points) BODY: Something is emphatically guaranteed
EARN_MONEY (1.2 points) BODY: Message talks about earning money
EXCUSE_14 (0.1 points) BODY: Tells you how to stop further spam
JODY (2.9 points) BODY: Contains "My wife, Jody" testimonial
OPT_IN (0.3 points) BODY: Talks about opting in (lowercase version)
BANG_MONEY (0.7 points) BODY: Talks about money with an exclamation!
BULK_EMAIL (2.1 points) BODY: Talks about bulk email
ORDER_REPORT (2.9 points) BODY: Order a report from someone
SENT_IN_COMPLIANCE (4.3 points) BODY: Claims compliance with spam regulations
READ_TO_END (2.9 points) BODY: You'd better read all of this spam!
FINANCIAL (4.3 points) BODY: Financial Freedom
SECTION_301 (3.2 points) BODY: Claims compliance with spam regulations
INVALUABLE_MARKETING (2.9 points) BODY: Invaluable marketing information
NOT_INTENDED (2.9 points) BODY: Not intended for residents of somewhere or other
RISK_FREE (1.0 points) BODY: Risk free. Suuurreeee....
COPY_ACCURATELY (2.9 points) BODY: Common pyramid scheme phrase (1)
INITIAL_INVEST (2.9 points) BODY: Requires Initial Investment
SERIOUS_CASH (2.7 points) BODY: Serious cash
MSGID_OUTLOOK_TIME (4.4 points) Message-Id is fake (in Outlook Express format)
SUBJ_HAS_UNIQ_ID (0.8 points) Subject contains a unique ID
DATE_IN_FUTURE_12_24 (2.8 points) Date: is 12 to 24 hours after Received: date
CASHCASHCASH (0.0 points) Contains at least 3 dollar signs in a row
BlackStar.co.uk recently redesigned their website to supposedly be XHTML 1.0.
Although it doesn’t actually validate, it has drastically reduced the download time of pages – particularly the front page.
However, someone seems to have listened to too many branding experts or something. Their homepage mentions the full URL of the site 191 times! Of course, this is all solely in the HTML source so any potential branding impact only applies to the freaky people who actually read the source of web pages…
The one simple act of making all the URLs absolute rather than relative bloats the file size of the page by over 19%. That’s over 4GB of completely needless traffic per million page views. They must have money to burn on bandwidth or something…
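As a rough sanity check on that figure, here is a minimal Python sketch of the arithmetic. The 191 occurrences come from the count above; the URL prefix is a hypothetical stand-in of plausible length (the exact string in their source isn't reproduced here), and the sketch ignores any compression the server might apply.

```python
# Rough estimate of the bandwidth cost of repeating an absolute URL prefix.
# ASSUMPTION: the prefix below is a hypothetical stand-in; the real site's
# prefix and its exact length are not given in the post.
ASSUMED_PREFIX = "http://www.example.co.uk"  # hypothetical, 24 bytes


def absolute_url_overhead(occurrences: int, prefix: str, page_views: int) -> int:
    """Bytes spent repeating `prefix` across `page_views` downloads of a page."""
    return occurrences * len(prefix.encode("utf-8")) * page_views


per_page = absolute_url_overhead(191, ASSUMED_PREFIX, 1)
per_million = absolute_url_overhead(191, ASSUMED_PREFIX, 1_000_000)
print(f"{per_page} bytes per page, {per_million / 1e9:.2f} GB per million views")
```

With a prefix of roughly that length, the repeated text alone comes to a few kilobytes per page, which is in the same ballpark as the figure quoted above.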
So Steve creates a bookmarklet that lets me see that my default MT RSS feed is bad.
Then Beowulf points me to Cynthia, who tells me that my alt = “RSS” tag on my RSS image is too short, as alt tags are supposed to be between 7 and 81 characters long, my “powered by MT” logo doesn’t have an alt tag at all, and my permalinks are bad as they all display the same text but link to different places.
I’ve fixed the others, but what are you meant to do about permalinks?
Yesterday, when writing some tests involving data stored in MySQL, I was quite impressed to discover that I could add random data to the table by doing:
UPDATE table SET column = FLOOR(8 * RAND()) + 1;
I was originally worried that this would pick a random value and then assign it to every row, but no, it picks a new random number each time.
Of course, I shouldn’t have been quite so surprised as I’ve long been a fan of picking a random row via:
SELECT * FROM table ORDER BY RAND() LIMIT 1
I’d just never thought through the implication that this must indeed be selecting a random value for each row in order to sort by it…
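The per-row behaviour is easy to check for yourself. The following is a small sketch using Python's built-in sqlite3 module rather than MySQL, so treat it as an analogy: SQLite spells the function RANDOM() and returns a signed 64-bit integer, which is why the expression differs from the MySQL one above. The table and column names are made up for the example.

```python
import sqlite3

# ASSUMPTION: SQLite stands in for MySQL here; its RANDOM() returns a signed
# 64-bit integer, so the expression differs from FLOOR(8 * RAND()) + 1.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (id INTEGER PRIMARY KEY, rating INTEGER)")
conn.executemany("INSERT INTO scores (rating) VALUES (?)", [(0,)] * 200)

# (RANDOM() & 7) keeps the low three bits, giving 0..7; adding 1 gives 1..8.
conn.execute("UPDATE scores SET rating = (RANDOM() & 7) + 1")

ratings = [row[0] for row in conn.execute("SELECT rating FROM scores")]
distinct_ratings = set(ratings)

# Picking one random row: every row gets its own random sort key.
random_row = conn.execute(
    "SELECT id, rating FROM scores ORDER BY RANDOM() LIMIT 1"
).fetchone()

print(sorted(distinct_ratings), random_row)
```

If the random value were computed once and reused, distinct_ratings would contain a single value; with 200 rows it should contain several.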
“I went straight to the control tower and demanded to see the airport manager. In Somalia, it is the custom to pay a man double the market price if you accidentally kill his beast. I had the price of two goats in my hand before the plane had finished taxiing back to the terminal.”
“What luck!” I cried. “What did you do with the money?”
“I went to the market, of course, and bought two goats…”
– The Somaliland News [via Wesley]
People who know me know that I spend much more time ranting about really bad service than praising really good service.
For the most part this is because I get much more bad service than good service.
But last week I had a great experience. I left work one evening and discovered that one of the windows of my car had been smashed with a large rock, which was still sitting on the passenger seat. This, of course, was not a great experience. But the way it was handled was. I drove the car home, and the following morning phoned AutoGlass (after a few puzzling minutes working out where to find them in the Yellow Pages). If I’d been prepared to wait a while longer they could have come out to my house and replaced the window, but instead I chose to drive to them, where they replaced it whilst I went for food.
They handled all the dealings with my insurance company, assured me that this wouldn’t impact my No Claims Bonus, and best of all, there wasn’t even an excess to pay.
It was all very simple – literally drive in, give them my insurance details, go away for an hour to get food, come back, sign a form, drive away again and forget about the whole thing.
I like companies that can turn bad experiences into good ones. There just aren’t enough of them around.
I’m a big believer in automated testing. Not just that your code performs the correct functions, but also that the code meets whatever ‘policy’ decisions you’ve chosen. So, for example, at work we can’t check in Perl code that hasn’t provided POD documentation for all its public methods. This is achieved using the wonderful Pod::Coverage module.
But today, when trying to check some code in, I was warned that I hadn’t documented the methods ("" and (bool in a class!
A little digging revealed an interesting facet of Perl that I hadn’t previously been aware of. I was using Perl’s overloading ability, which lets you specify how an object should be treated in various contexts (in this case, when you try to stringify it or check it in a boolean context). This is a really useful ability, but I’ve never examined how it actually works before. It turns out that it does this, in part, by creating functions in your class that are named after the operation you’re overloading, prefixed with a bracket. (Beyond that it gets much scarier!)
So, when Pod::Coverage comes to do its checking (by digging through the symbol table for the class it’s examining to see what methods exist and should be documented) it discovers these strange methods and, finding no documentation, complains.
Until I see how Richard Clamp decides to work around this in Pod::Coverage I’m now faced with either providing a bizarre piece of documentation or adapting our build script to not complain about this sort of error…
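The build-script option amounts to a simple filter. Here is a language-neutral sketch of that idea in Python; it is not Pod::Coverage's API and not our actual build script, the function and the symbol names are hypothetical, and it assumes that overload-generated entries can be recognised purely by their leading bracket, as described above.

```python
def undocumented_methods(symbols, documented):
    """Return symbols lacking documentation, ignoring overload entries.

    ASSUMPTION: overload-generated entries are recognised purely by a
    leading '(' as described in the post; this is not Pod::Coverage's API.
    """
    return sorted(
        name for name in symbols
        if not name.startswith("(") and name not in documented
    )


# Hypothetical symbol names for a class that overloads "" and bool.
symbols = ['(""', "(bool", "new", "as_string", "is_valid"]
documented = {"new", "as_string"}
missing = undocumented_methods(symbols, documented)
print(missing)
```

The downside of filtering on the prefix alone is that it would also hide any genuinely undocumented symbol that happened to start with a bracket, though that seems unlikely for a hand-written method.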
We waited until the Debian package for the new SpamAssassin was out before upgrading, and I’ve now had a weekend to play with it.
Other than the fact that the previous version was starting to get far too many false negatives, the main reason I wanted to upgrade was the fact that SA no longer broke all the attachments. In the previous version any ham marked as spam was pretty much ruined as all the attachments were inlined into the main body. Now they stay as attachments.
As SA tended to occasionally mark commercial emails that I had actually signed up for as spam (such as the eBay auction watch mails) this was rather irritating.
[I discovered a few days ago that I can edit the mail and do ':%!spamassassin -d' to remove the mark-up but that's still annoying]
The main issue now is that SA no longer marks the Subject line with a **** SPAM *** header. In general this is a good thing, but it also makes my deletion strategy more difficult. Previously I was using mutt’s scoring rules to automatically decrease the score of all mail so marked, and had a rule that marked anything below a score threshold as ready to be deleted as soon as I quit the folder.
Now I’ve had to bind a function key to a macro to do this as you can’t score on just any random header.
But for now I’ve decided to make that macro move the spam to a different folder rather than delete it, so I can play about with training the new Bayesianesque filter.