Tourist Taxi Treatment

February 16th, 2012 1 comment

For the last few weeks I’ve been travelling in Latin America, swapping the -30°C of Tallinn for the +30°C here. One thing I’ve found somewhat odd is the elaborate system of taxi pricing that several places have. I encountered this first in Brazil. The taxi would arrive at my destination, with the meter reading, say, R$14.10. As I dug through my wallet looking for R$15, the driver would produce a printed chart and point at the box that translated the meter price into the R$17 he was actually demanding. At first I thought this was some sort of elaborate scam on foreigners, but then it was explained to me that it’s a complex system to cope with the high rate of inflation, where the meters only get adjusted every 6 months, but every quarter there’s a new chart that ups the price in the interim. Last night I got a cab in Montevideo which took this to its logical extreme: the meter didn’t even bother with displaying prices at all. It just had a counter, starting at 1, and clicking upwards until you arrived at your destination, at which point the driver would look up the 53, or whatever, on his chart to tell you how much that cost.

It’s a fairly simple system, with some obvious advantages for drivers, but it seems remarkably unfriendly for passengers in general, and even more so for tourists. Although in many places being a tourist carries with it a presumption that you’re going to get ripped off, actually being so tends to leave a lasting negative impression on your overall view of a country or city. And tourism is one of those things that can make a huge economic difference to somewhere, so it’s worthwhile making the experience as pleasant as possible. Now, Brazil in particular has a reputation as a place where you might actually be physically unsafe as a tourist, so lots of cities have Tourist Police who patrol areas frequented by tourists, looking out for common crimes, and speak multiple languages. I don’t have any information on what the actual effects of that are, but even making people feel safer presumably has lots of useful benefits.1

I was particularly pleased to note that Rio has implemented a similar nicety for taxis too. If you arrive on one of the (very pleasant) long distance coaches, the taxi rank outside the Rodoviária has a dispatcher who asks where you’re going, passes that information to the next available driver, and hands you a sheet of paper telling you the fixed-fee price to that destination. I first encountered this at JFK many years ago, where there’s a similar fixed-fee to anywhere in Manhattan, which you’re told up-front by someone other than the driver, making you feel pretty sure you’re not being ripped off. This sort of thing is really simple and really effective at improving the experience of travellers.

For all I love about Estonia, this is one that they seem to get wrong, and in the last year or so it’s gotten worse. Taxis are regulated to the extent that they have to prominently display their prices, and must give you a receipt (otherwise the journey is free). But the average tourist is going to have no idea what a reasonable rate is, and there are some taxi companies who notoriously wait outside hotels frequented by Finns, charging five times the average rates. From the airport there are only a small number of taxi companies operating, which cuts down on the chances of being ripped off, but a new bus company has started operating a single line to and from the airport, and it has a habit of pulling up at the bus stop approximately two minutes before the normal city bus (which is half the price).

In most areas of life I’m not a huge fan of interfering with the market. But a market only works well with a good information flow, and tourists tend not to be well informed. Saying that they should be, and that it’s their own fault if they’re not, doesn’t really work so well. My suspicion is that if you actually want tourists (and, of course, not everywhere does), you need to make certain adjustments to cope with them (e.g. not necessarily expecting everyone to be able to speak the local language), and I suspect that what might otherwise be over-aggressive meddling with tourist-heavy transport infrastructure is one area where the second-order effects are more valuable than any initial problems caused.

It’s entirely possible, though, that I’m just promoting my own desires as a frequent traveller (and one who’s never well enough prepared or educated about the places I’m visiting) into areas where actually they’d be detrimental. So I’m certainly willing to be pointed to research on this, or hear arguments as to why it’s a really bad idea. But arriving into Rio after dark and simply being told that it’s a flat rate R$35 to Ipanema was such a pleasant surprise, that I’m hoping it’s not, and that more places will copy it!

  1. and, yes, I’m aware of how dangerous that approach can be if you take it too far.

Transparency Statistics

March 28th, 2010 3 comments

I had a fascinating discussion yesterday about Freedom of Information statistics. Apparently, in the country in question (which shall remain anonymous as the actual details here are less important than the general concept), the percentage of requests that are being rejected has gone up significantly over the last year. It seems that many people think that this is a bad thing and shows that transparency is on the decline. My immediate reaction, on the other hand, was exactly the opposite: I assumed that this was a good thing, and transparency must be much greater now than before!

To explain, let me first step back to my days at BlackStar, at the time Europe’s largest online DVD retailer. Amongst my jobs there was tracking the response times to customer support emails. But the flip side of that was to keep a close eye on what sorts of emails we were getting, to work out what we could do to no longer need to get those emails. Every week we would see which thing was taking our care staff most time to deal with that could most easily be minimised by making the site work better in the first place. Lots of people emailing us to see if they could cancel their order? Well, then we needed to push the ability for them to cancel it themselves out as late as possible: not just from when it started to be packed, but right through until the Royal Mail delivery truck actually picked it up. Lots of people emailing us to see if we were selling an item that didn’t seem to be listed on our site? Then we found and licensed a database of every video and DVD that had ever been released, even if they were no longer available. Lots of people now emailing us confused as to why we were listing an item on the site if they couldn’t buy it? Then we came up with a service to track down used copies for people. This process repeated itself time and time again and led us to have the most customer-friendly ecommerce site in Europe. However, the natural result of this was that over time the average customer support email got more complex. The low-hanging fruit were constantly being picked off so that no-one would ever again need to ask those sorts of questions. So, by definition, most of the requests we were getting were becoming significantly harder to answer, would take longer to deal with, and had a much higher chance of being unresolvable.

I view Freedom of Information requests in much the same way. Having to make a request should be an exception. Each one is highlighting an area where a government department should have been more proactively transparent. A perfectly functioning system would never require any Freedom of Information requests, as everything that would be released on request has already been published anyway. Most countries aren’t at that stage yet, but there’s a definite trend in that direction, helped greatly by initiatives like data.gov in the US and data.gov.uk in the UK. Both of these countries (and, presumably, any others doing likewise), aren’t entirely sure yet what the most important datasets to release are, though. Patterns of Freedom of Information requests can provide great insight into that. If certain types of information are being requested more than others, that’s probably a good sign that that’s something that would benefit greatly from pre-emptive disclosure.

So, to return to the original question of statistics: what would it mean if a department (or indeed entire country) rejected 100% of all Freedom of Information requests? Well, I guess it could mean that it was the most secretive, refusing to release anything at all. But it could also mean that it’s the most open, and each request made is being rejected either because the information is already publicly available anyway, or because it’s amongst the subset of information that is exempt from disclosure. The raw numbers tell us almost nothing. As always, the truth requires much more digging behind the statistics.
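A rough sketch of how that can happen, using entirely made-up numbers (they are assumptions for illustration, not figures from any real country): if the easy, commonly requested material gets published up front, the easy requests disappear and the rejection percentage climbs even though nothing has become more secretive.

```python
# Hypothetical figures, purely to illustrate the argument; not real FOI data.
def rejection_rate(granted: int, rejected: int) -> float:
    """Fraction of all requests that were rejected."""
    return rejected / (granted + rejected)


# Year 1: little is published proactively, so lots of easy requests arrive.
year1 = rejection_rate(granted=900, rejected=100)

# Year 2: the material behind 800 of those easy requests is now published
# up front, so nobody needs to ask for it. The hard requests still arrive
# and are still rejected at exactly the same volume as before.
year2 = rejection_rate(granted=100, rejected=100)

print(f"Year 1: {year1:.0%} rejected")  # Year 1: 10% rejected
print(f"Year 2: {year2:.0%} rejected")  # Year 2: 50% rejected
```

The number of rejections is identical in both years; only the denominator has changed.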


Goodbye _why

August 20th, 2009 1 comment

_why the lucky stiff has vanished. He was one of Ruby’s most curious characters, always treading carefully that thin line between the eccentric and the surreal. His “Poignant Guide to Ruby” was one of the best books about programming I’ve ever read, and certainly the only one to have an accompanying soundtrack album. But now he’s gone. His websites, code repositories, twitter stream, etc. have all been deleted.

And, inevitably, some people are moaning about how terrible this is. Not just because the future will be that bit dimmer without him, but instead because of how unprofessional it is that he just vanished without warning and orphaned all his code, and how much work they’re going to have to do now to replace it all in their projects that depended on it. Somehow the actions of the developer have tainted the existing code so much that it’s now toxic to even use.

This is a perennial issue in the FOSS world, particularly where it intersects with the business sphere. Companies fear they’ll be at the mercy of developers they don’t, and can’t, control, and so some Open Source evangelists assuage their clients’ fears with a lie. Sometimes this is explicit: by promising a wealth of free (as in beer) software, created by great developers all around the world who’ll gladly work with your developers, answer all your questions, and make all the changes you require. More often it’s a lie of omission, touting some of the benefits, without mentioning the possible negatives. What they should be saying is that there is absolutely no guarantee that the people who write this code are not cranky, illogical, unprofessional, hostile, rude, crazy, or even just plain nasty. Many, if not most, of them may not be, but from a purely commercial risk management position, you’re better assuming that they are.

Only once you truly accept this can you start to see the real benefits and opportunities. FOSS is not like commercial software, other than with a price tag of zero. Beyond a few high-profile exceptions there is no support available for most projects. You may have a helpful author, or a mailing list, wiki, or message-board filled with knowledgeable people who’ll give timely, friendly, useful advice for free. Or, you may not.

Pretending you always will is foolish. Assuming you always should is dangerous.

FOSS does not make this promise. Instead it makes a better one. It provides you the source code so that you can find a programmer anywhere in the world to fix your problem (or do so yourself if so inclined and skilled). And, better still, it uses a license that gives you the freedom to actually do this.

One of the many freedoms of FOSS is the freedom from vendor dependency. Like all freedoms, it has a price. But perpetuating the myth of a volunteer army of slaves ready to serve your every whim is just a fantasy and like all illusions, leads only to two possible outcomes: disillusionment or insanity.

I’ve never met _why and can’t claim to have known him. But he certainly never matched any of the caricatures of a FOSS developer — whether good or bad. He always trod his own path. He offered some great software to the world, simply as a gift. Some people assumed that that gift came with implied promises. They were wrong. That doesn’t detract from the software, and it most certainly doesn’t detract from _why. I hope he enjoys whatever crazy path he chooses next.


Get Excited and Make Stuff

June 29th, 2009 1 comment

Last weekend I braved a visit to the UK for Social Innovation Camp Scotland. My remit had been to be a roving expert, flitting from team to team, but I was so impressed by the first team I started working with that I ended up staying with them the whole weekend — right through to the point where we won!

Yesterday the Sunday Times ran a rather hilarious scare-mongering piece about it. They quote Calum Steele, general secretary of the Scottish Police Federation, as saying: “the police service already have ways for the public to express dissatisfaction”.

The point that this so magnificently misses, however, is that those ways aren’t good enough. Local councils already have ways for the public to report potholes, graffiti, broken streetlights etc. — yet FixMyStreet flourishes. All public authorities already have ways for the public to make Freedom of Information requests — yet in just over a year WhatDoTheyKnow has already grown to handle about 10% of all such requests. The National Health Service already have ways for the public to provide feedback — yet over 7,000 people have preferred to use Patient Opinion (who have also managed to pull off the neat trick of getting the NHS to pay them to deliver complaints to them.)

There are many reasons why someone would prefer to use these sorts of sites rather than going directly. For some it’s purely practical: in many cases it’s much easier to visit a single easy-to-use site with a consistent interface rather than navigate the more, erm, interesting, waters of official sites, some of which still sport “Beware of the Leopard” signs.

For others it’s the communal nature of these sites, where others who have experienced the same problems can chip in with support and advice, or even just learn that they’re not alone.

But for me it’s all about how transparency reverses the balance of power. For too long too many government agencies have forgotten that they are meant to be our servants, not our masters. Before they will engage with us, they make us jump through hoops that do nothing but frustrate us, in the guise of making their lives somehow easier (though usually anyone with any inkling of business processes can’t help but wonder how it possibly ever could). And more often than not, complaints get the stonewall or runaround treatment, and those who persist often get little more than a bland not-quite-apology with no indication that anyone ever took the time to engage with the matter, and certainly no sign that anything might actually change as a result.

The simple act of moving all this out into the open changes things dramatically. Everyone knows that “what gets measured gets done”, and, in the UK at least, government bodies tend to be rather sensitive to what the public at large think of them. As such, rubbish that has been left in an alleyway for weeks has a habit of suddenly being collected rather quickly when there’s a public report of it on FixMyStreet for anyone browsing that Council’s page to view. Agencies tend to be less inclined to take 6 months to respond to Freedom of Information requests when anyone looking at their WhatDoTheyKnow page could see at a glance that they never meet the required timescales. (We’ve heard, for example, that the Information Commissioner’s Office love WDTK as now they get to see all manner of patterns and common problems that are missed when only dealing with complaints that get escalated to them.)

Transparency is powerful, as the UK has learned dramatically over the last couple of months. And once it’s in place, it’s extremely difficult to remove it. A central proposition of the Open Source movement has been that “given enough eyeballs, all bugs are shallow”. I wish I were witty and wise enough to come up with an equivalent for Open Government (suggestions welcome!), but even without a catch-phrase the underlying idea still holds. Government in the open will, more often than not, be better government. In some parts of the world, this is a concept that still needs to be fought for. In the rest, where there’s at least a token agreement, even if (or perhaps especially if) it’s more honour’d in the breach than the observance, then join in. Create your own site. Shine some more sunlight. You don’t need permission. You’re already in charge. Just Do It.

What is a spreadsheet-wiki?

June 3rd, 2009 4 comments

While I’m on the subject of products I really want to see, it would be remiss of me not to mention the spreadsheet-wiki. This one should already exist by now, and I hold myself largely responsible for the fact that it doesn’t: after all, I spent almost a year working with Dan Bricklin and Socialtext trying to make it happen. When we parted ways, I hoped to be able to continue the project, but, for a variety of reasons, that never came together either. There have, from time to time, been vaguely encouraging noises from Socialtext, but this still doesn’t seem like a high priority for them, and the information that leaks out from time to time implies they’re still going down a different path. I’ve deliberately held back from talking about some of this stuff to give them a chance to get something out, but it’s 18 months now since I left, any inside knowledge I had is long past its sell-by date, and I really want to see this come together from somewhere.

By far the most common response when I tried to explain to people what I was working on, and what a spreadsheet-wiki actually meant, was “Oh, you mean like Google Spreadsheets?” But Google, and their online spreadsheet rivals, aren’t really creating what I want. Google Spreadsheets is no more a spreadsheet wiki than Google Docs is a text wiki. Yes, they’re great for collaboration, but that’s only half the wiki story. The critical other ingredient on a wiki is the humble link. Even outside wiki-land the power of the hyperlink is still poorly understood and massively underrated. It’s the fundamental building block of the Web, but even still hasn’t lived up to anywhere near its potential. Almost everyone, when they talk or write about Wikipedia, focuses on the “Anyone Can Edit!” part (whether with awe or despair), but the vast majority of readers never edit anything; the key for them is that absolutely everything is a link.

My dream is that that could also be true for numbers.

Wikipedia, of course, is full of numbers. People can talk about them, change them, cross-reference them, and do all manner of wiki goodness with them. But that’s not enough. Those numbers currently live in splendid isolation. They can’t interact in a spreadsheety way.

There have been various attempts to fix this, generally involving embedding spreadsheets into wiki pages as a replacement for plain tables. But although that achieves the goal of being able to perform some basic calculations in-place, it’s no better than being able to embed an Excel sheet in a Word document. It doesn’t solve any of the well known problems with large spreadsheets (aka Spreadsheet Hell). In a spreadsheet-wiki the spreadsheet should not be a second class citizen, subservient to the wiki. Rather, the spreadsheet should itself contain wikiness. Forget simple single sheet spreadsheets; I’m talking here about hundreds or thousands of properly cross-linked sheets, all mutually feeding each other. Forget having to email around your monthly financial statements comparing actuals to budgets with everything gradually drifting out of sync as no-one is quite sure which is the master copy any more, and no ability to examine how you got to what you have. In a spreadsheet-wiki every number is a link. You can see where it came from, and where it’s being used. If an assumption changes, everything that depends on it automatically changes. No more wasting 3 weeks in a dead end because you were working from old numbers. No more wondering why that P&L entry for “Miscellaneous Expenses” was so high in March. No more wasted time collating projections and forecasts from department heads, harmonising them into a divisional budget for the upcoming year, only to have to redo the entire process 4 times when the CFO trims your budget, or the COO explains some of the impact of a new office opening in September. Instead everyone can work on their own page, have the data pulled automatically into a series of other sheets, and have changes take effect universally and instantly whilst everyone hammers out the details, with, of course, full transparency of who changed what when (and hopefully why), and the ability to roll-back to any earlier stage.
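To make “every number is a link” a little more concrete, here is a toy sketch. It is not how Socialtext or anyone else built anything; the `Workbook` class, its method names and the sheet and cell names are all invented for illustration. Each cell records which cells it was computed from, so you can follow a number back to its sources and forward to everything that uses it.

```python
from typing import Callable, Dict, Set, Tuple, Union

Ref = Tuple[str, str]  # (sheet name, cell name), e.g. ("Marketing", "budget")
Formula = Callable[["Workbook"], float]


class Workbook:
    """Toy model: every cell is a plain number or a formula over other cells."""

    def __init__(self) -> None:
        self.cells: Dict[Ref, Union[float, Formula]] = {}
        self.sources: Dict[Ref, Set[Ref]] = {}  # where each number came from

    def set_value(self, ref: Ref, value: float) -> None:
        self.cells[ref] = value
        self.sources[ref] = set()

    def set_formula(self, ref: Ref, inputs: Set[Ref], formula: Formula) -> None:
        self.cells[ref] = formula
        self.sources[ref] = set(inputs)

    def get(self, ref: Ref) -> float:
        cell = self.cells[ref]
        # Formulas are recomputed on every read, so a changed assumption
        # flows through to everything that depends on it.
        return cell(self) if callable(cell) else cell

    def used_by(self, ref: Ref) -> Set[Ref]:
        """The reverse link: which cells read this one directly."""
        return {r for r, srcs in self.sources.items() if ref in srcs}


wb = Workbook()
marketing = ("Marketing", "budget")
engineering = ("Engineering", "budget")
total = ("Division", "budget")

wb.set_value(marketing, 40_000.0)
wb.set_value(engineering, 90_000.0)
wb.set_formula(total, {marketing, engineering},
               lambda w: w.get(marketing) + w.get(engineering))

print(wb.get(total))          # 130000.0
wb.set_value(marketing, 35_000.0)  # the CFO trims a budget
print(wb.get(total))          # 125000.0
print(wb.used_by(marketing))  # {('Division', 'budget')}
```

A real version would also need history (who changed what, when) and something smarter than recomputing on every read once there are thousands of sheets.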

Most of the technology to make this work already exists. There are some interesting issues when you start talking about thousands of inter-related sheets, but that can evolve when we see what the real usage patterns are. Making something come together that will show just how powerful a concept this is, is mostly just a matter of vision, SMOP, and tuits. Like any of my other ideas, I’d love to work on it, but I can’t build it on my own. If you’re interested in working with me on it, or just building it yourself and picking my brains from time to time, please get in touch.

Track Every Penny

June 1st, 2009 6 comments

Personal finance software universally sucks. I have two theories for why this is:

Firstly, there actually is no personal finance software. It’s all just dumbed down versions of corporate accounting software. No matter how it’s dressed up to be ‘user friendly’, at the heart of it all is the core underlying assumption that you want to run your personal life like an accountant runs a business. Many people have been sucked into this way of thinking, in large part because it’s the mindset that using such software foists upon you, but really it’s a poor model.

The second reason is that it’s almost all created by Americans. By itself, of course, that’s not necessarily a bad thing. But it becomes a bad thing when it reflects their view of the world, which, particularly in matters of personal finance, doesn’t really hold true in other countries. And I’m not just talking here about the assumptions that are generally built into how relatively complex things like mortgages, or sharedealing, or taxes, or retirement accounts, etc work in different countries. Mostly I’m talking about the really simple things like not assuming everything is in US dollars! Sure, there’s generally a token nod to other currencies, but most finance software can’t even cope with simple multi-currency transactions (like all the times in Switzerland when I got a bill in Swiss Francs but paid in Euros), never mind the more complicated ones (like the time I bought my bus ticket from Bulgaria to Macedonia using my last remaining levs and made up the difference in Euros). And because most people who write finance software have never lived somewhere like New Zealand where the smallest coin is the ten cent piece, they tend not to cope very well with things like Swedish Rounding, where your total is rounded if you’re paying by cash (but not if by card).

Of course it’s possible to do these things in most software, but only if, under theory #1, you can think like an accountant and do the equivalent of lots of complex ledger entries. But lots of things that are deliberately really complex in business accounting are commonplace in people’s personal lives – like asking your friend if they have eighty cents to avoid needing to break another €10 note.
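Neither Swedish Rounding nor the borrowed eighty cents needs ledger-style thinking to model. The sketch below is only an illustration: the `amount_due`, `Part`, `Purchase` and `contributions_from_others` names are made up, and rounding a tie upwards is an assumption (real shops and countries differ on that).

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Dict, List, NamedTuple, Tuple


def amount_due(total: Decimal, smallest_coin: Decimal, cash: bool) -> Decimal:
    """Swedish Rounding: cash totals go to the nearest multiple of the
    smallest coin; card payments are charged exactly."""
    if not cash:
        return total
    # Assumption: an exact tie rounds up.
    steps = (total / smallest_coin).quantize(Decimal("1"), rounding=ROUND_HALF_UP)
    return steps * smallest_coin


class Part(NamedTuple):
    payer: str
    currency: str
    amount: Decimal


class Purchase(NamedTuple):
    description: str
    currency: str      # currency the bill is priced in
    total: Decimal
    parts: List[Part]  # who actually handed over what, in which currency


def contributions_from_others(p: Purchase, me: str) -> Dict[Tuple[str, str], Decimal]:
    """Amounts other people chipped in, keyed by (payer, currency)."""
    owed: Dict[Tuple[str, str], Decimal] = {}
    for part in p.parts:
        if part.payer != me:
            key = (part.payer, part.currency)
            owed[key] = owed.get(key, Decimal(0)) + part.amount
    return owed


ten_cents = Decimal("0.10")
print(amount_due(Decimal("14.83"), ten_cents, cash=True))   # 14.80
print(amount_due(Decimal("14.83"), ten_cents, cash=False))  # 14.83

groceries = Purchase(
    "groceries", "EUR", Decimal("14.80"),
    [Part("me", "EUR", Decimal("14.00")), Part("Steve", "EUR", Decimal("0.80"))],
)
print(contributions_from_others(groceries, me="me"))
# {('Steve', 'EUR'): Decimal('0.80')}
```

Because each part carries its own currency, the same structure would hold a bill priced in one currency and paid partly in another, without pretending everything happened in one.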

Most people get round all this by just ignoring it. Who really wants to have to record that their shopping bill was €14.80 but eighty cents of that came from Steve anyway? Well, me. I’m a firm believer in the “track every penny” school of thought. And I hate how hard it is for me to do so. In the 5 months of this year so far, I’ve been in 9 different countries. I keep every receipt, and record every item from every one of them. And it’s much too much hassle. I’ve tried numerous different software packages, and they’re all terrible for me. There’s a lot of innovation in the area recently with sites like Mint and Wesabe springing up and giving the old faithfuls of Quicken and Microsoft Money a serious run for their money. But there’s increasingly an assumption that most of your spending detail can be automatically obtained from your bank records so you don’t need to type it in. It makes sense for them to concentrate there, as having to painstakingly enter all your spending is the thing that puts most people off ever actually keeping track of where their money is going. But the more that part gets automated away, the less these companies work on making it really easy to enter transactions manually — which leaves me worse off, as I do the vast majority of my spending using cash, and my bank records thus tell me next to nothing. Even on the rare occasion where I pay for my groceries on my debit card, I don’t just want a total spend entered—I want a full breakdown of every line item. I want to know at the end of the year just how much I spent on milk or eggs, not just on “groceries”.

So I want software that works for me. That assumes I’ll be travelling a lot and working with multiple currencies. That makes it easy for me to enter detailed records rather than a chore. That deals with all the little details I raised last time I ranted about this.

It may be that I’m the only person in the world that actually wants this software, but I suspect I’m not. In the current economic climate people are watching their pennies carefully. Almost every personal finance book suggests that people literally track every cent they spend for at least a month. I think lots of people would like to know much more about where their money goes, but the pain of keeping track currently outweighs the benefits for a lot of people. So I want to make that easy.

I have a detailed vision of how that software would work, but I can’t build it by myself. Anyone want to help?

Where can I fly to this month?

June 1st, 2009 1 comment

All my playing with end-of-year travel plans has given me itchy feet. I’d like to go somewhere interesting for a few days sometime soon, but I don’t really care so much where. This is something the internets are meant to help with, but though the US is well served with any number of useful quirky travel sites, Europe doesn’t have so many of the “Just show me good deals” versions if you don’t live in certain key cities. So, in the DIY spirit, I wrote my own. I gathered a list of all the commercial airports I could find in Europe, grouped them by country, and wrote a script that searched on ITA in turn for all flights from Tallinn to any airport in that country over the next 30 days, and told me the cheapest date to travel there. It’s a slightly nasty site to screen-scrape (and I’m pretty sure they don’t have any alternatives that you don’t have to pay for, as some of the puzzles they set job applicants involve scraping the site), and the code certainly isn’t pretty, but, thanks to Google Charts, the results are:

(Green is the cheapest, red the most expensive, yellow somewhere in between.)
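The original script isn’t reproduced here, but the shape of the search described above is simple enough to sketch. Everything below is an assumption for illustration: `cheapest_by_country`, the `FareLookup` signature and the fares in `fake_lookup` are invented, and the part that actually fetches prices is deliberately left out.

```python
from datetime import date, timedelta
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical signature: given origin, destination and a date, return the
# cheapest fare found, or None if nothing flies that day. How the fare is
# obtained (scraping, an API, a cached file) is not shown.
FareLookup = Callable[[str, str, date], Optional[float]]


def cheapest_by_country(
    origin: str,
    airports_by_country: Dict[str, List[str]],
    lookup: FareLookup,
    start: date,
    days: int = 30,
) -> Dict[str, Tuple[float, date, str]]:
    """For each country, the lowest fare with the date and airport it was on."""
    best: Dict[str, Tuple[float, date, str]] = {}
    for country, airports in airports_by_country.items():
        for airport in airports:
            for offset in range(days):
                day = start + timedelta(days=offset)
                fare = lookup(origin, airport, day)
                if fare is None:
                    continue
                # Strict "<" keeps the earliest date when fares tie.
                if country not in best or fare < best[country][0]:
                    best[country] = (fare, day, airport)
    return best


def fake_lookup(origin: str, dest: str, day: date) -> Optional[float]:
    # Made-up fares so the sketch runs without touching any real site.
    table = {("TLL", "RIX"): 60.0, ("TLL", "ARN"): 90.0, ("TLL", "GOT"): 75.0}
    base = table.get((origin, dest))
    return None if base is None else base + 5 * (day.day % 3)


result = cheapest_by_country(
    "TLL",
    {"Latvia": ["RIX"], "Sweden": ["ARN", "GOT"]},
    fake_lookup,
    start=date(2009, 6, 1),
)
for country, (fare, day, airport) in result.items():
    print(f"{country}: {fare:.2f} to {airport} on {day.isoformat()}")
# Latvia: 60.00 to RIX on 2009-06-03
# Sweden: 75.00 to GOT on 2009-06-03
```

Running it daily and emailing when a fare drops below some limit would just be a filter over `result`.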

My plan is to widen this beyond Europe, have it run every day, set some thresholds and have it email me any time something interesting appears. I suspect, however, that I’m much better served from Riga:

Thankfully there’s a comfortable bus to there!

More bmi Hacking

May 26th, 2009 3 comments

Star Alliance claim to be ‘committed to delivering to you the latest flight schedules from the Star Alliance members on multiple platforms Anytime, Anywhere.’ (emphasis mine). What’s more, they go on to explain that that means that it will be ‘Automatically updated on your platform of choice.’

That is unless your ‘platform of choice’ is anything other than a Windows PC or a handheld with Palm OS, as their Electronic Timetable doesn’t run on, for example, a Mac. Instead we need to just make do with a hulking big PDF.

So, I decided to parse all the data out of that PDF, and on the basis that others might find it useful, make it available as a CSV file: Star Alliance Timetable 2009-05.

It’s nothing fancy, but being able to open it in Excel and filter on the various columns is still quite useful, and of course it opens up any number of other possibilities. I’m also considering building a little mini-application that makes it easier to play with, so if anyone has any suggestions for that, I’m all ears.

bmi Hacking

May 25th, 2009 No comments

I’ve been a bmi Diamond Club holder for many years. Unlike most Frequent Flier programs, airmiles you earn in this scheme never expire, so I’ve built up quite a few of them. However, it’s looking increasingly likely that bmi won’t actually be around for much longer — at least not in its current form. The most likely outcome seems to be a takeover by Lufthansa, and subsequent conversion of Diamond Club to their nowhere-near-as-good Miles and More scheme. So it’s looking like a good time to turn all my airmiles into a fun end-of-year escape-the-Tallinn-winter trip.

I’ve spent quite a bit of time over the last week learning how best to go about that, and discovering all manner of interesting ways of combining the various rules. (Much of this is learned from the fine folks at Flyer Talk, which, once you can get beyond all the jargon, is an amazing source of tips, tricks, and useful advice.)

The first thing you need to get the hang of is the bmi zone chart. Rather than spending miles based on the actual distance you fly, the world is divided up into a series of zones, and you pay a fixed rate per flight based on the zones you’re flying to/from. (This is purely in terms of the miles spent—you still need to pay the taxes depending on the airports you use, which, of course, differ everywhere.) I found it hard to keep track of which countries were in which zone, so I drew a pretty map.

The biggest problem with constructing a suitably interesting trip is that you’re only allowed one stop-over (visiting a city en-route for more than 24 hours) per ticket. So, for example, if you were to book a return from London to Sydney you’d only be allowed to stop off in one other place (e.g. Singapore) in either direction. However, you can purchase one way tickets, so by getting two of those, instead of a return, you now get a stop-over in each direction, so could stop, for example, in Singapore for a couple of weeks on the way there, and Thailand on the way back.

What I then noticed was that to go from Zone 2 (Central/Eastern Europe, where I currently am) to Zone 10 (Australia/NZ, where I want to go) is 50,000 miles each way, but two singles from Zone 2 to Zone 8 (East Asia) and then Zone 8 to Zone 10 are only 25,000 each. Thus, by going via South Korea or Japan, for example, you can get 3 free stops in each direction, effectively turning a naïve two destination trip (e.g. Copenhagen – (Bangkok) – Auckland – Copenhagen) into a seven destination trip for the same price (e.g. Copenhagen – (Bangkok) – Tokyo – (Hong Kong) – Auckland – (Sydney) – Seoul – (Delhi) – Copenhagen)! These are all published Star Alliance routes: Asiana Airlines, for example, fly Seoul to Copenhagen via Delhi and Zurich three times a week.

If you really wanted to, you could also (again, for the same price) omit the last ticket, and return Auckland–Copenhagen via L.A. or Vancouver turning it into a complete round the world trip at half the mileage cost of an actual round-the-world ticket!

I wrote a little script to analyse the entire zone chart for other free multi-zone detours, and discovered there were quite a few of them (including some where the detour actually lowered the total price, such as Zones 2–7 via 10 which is only 70,000 miles, instead of 80,000 direct!)
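That script isn’t shown here, but the check it describes is easy to sketch: for every ordered triple of zones, compare the direct price against the sum of two singles via the middle zone. The `free_detours` function and the `PRICES` layout below are assumptions for illustration, and the table holds only the three one-way figures quoted above rather than the full bmi chart.

```python
from itertools import permutations
from typing import Dict, List, Tuple

# One-way mileage prices keyed by (from zone, to zone). Only the three
# figures quoted in this post are included; the real chart has many more.
PRICES: Dict[Tuple[int, int], int] = {
    (2, 10): 50_000,
    (2, 8): 25_000,
    (8, 10): 25_000,
}


def free_detours(prices: Dict[Tuple[int, int], int]) -> List[Tuple[int, int, int, int, int]]:
    """Find (start, via, end, via_cost, direct_cost) where two singles via
    an intermediate zone cost no more than the direct ticket."""
    zones = sorted({zone for pair in prices for zone in pair})
    found = []
    for start, via, end in permutations(zones, 3):
        direct = prices.get((start, end))
        leg1 = prices.get((start, via))
        leg2 = prices.get((via, end))
        if direct is None or leg1 is None or leg2 is None:
            continue  # no published price for one of the legs
        if leg1 + leg2 <= direct:
            found.append((start, via, end, leg1 + leg2, direct))
    return found


for start, via, end, via_cost, direct in free_detours(PRICES):
    print(f"Zone {start} to {end} via {via}: {via_cost:,} vs {direct:,} direct")
# Zone 2 to 10 via 8: 50,000 vs 50,000 direct
```

With the full chart loaded, the cases where the detour is strictly cheaper are simply the rows where `via_cost < direct`.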

Of course, the longer the route, the more complexity there is in trying to piece it all together. You get significantly more value spending the miles on business class flights than on economy, but availability on those disappears quite far in advance on popular routes (and isn’t available at all on many Singapore Airlines flights as they reserve those for their own card-holders rather than their Star Alliance partners). But I’m currently contemplating trying to piece together a 2-10-7-9-8-2 route, which is only 110,000 base miles, and would theoretically allow something along the following lines:

Riga – (Cairo) – Bombay – (Bangkok) – Manila – (Tokyo or Sydney) – Auckland – (Shanghai) – Tashkent or Almaty – (Istanbul) – Riga.

Which, if I can pull it off, isn’t bad for only 10,000 miles more than a simple Riga–Auckland return! Suggestions / alternatives / gotchas / etc. welcomed!

Splitting a WordPress blog in two

May 13th, 2009 No comments

This blog had its seventh birthday recently. I know there are many amongst you who have been blogging since before the term was even coined, and who make more posts in a month than I’ve made in seven years, but still.

Anyway, back in the early days of blogging, a significant percentage of blog posts weren’t original content, but the equivalent of retweeting: a way of passing on to your readers something interesting you’d read elsewhere. Of course the vast majority of those were links to other people’s blogs. It’s how word spread about interesting posts before digg and reddit and twitter and the like.

I tried to do something slightly different for a while: rather than just regurgitating other blog posts, I instead regurgitated interesting snippets from real dead tree books I was reading, picking interesting excerpts chapter by chapter.

It seemed to be well received, and I had a lot of fun choosing which couple of paragraphs from each chapter could convey something interesting enough to both stand alone without the surrounding context and also encourage others to seek out the book for more depth.

Early in 2004, I seem to have abandoned the idea. Likely it’s just because I was super-busy with Twingle, and then with Unite, and I probably always meant to get around to picking it up again, but just never did. Until now.

I decided, however, to do this on a new separate blog: dustyvolumes.com. So I had to work out how to move all the old posts to there. This was significantly more complicated than I expected. Doubtless someone will point me to a WordPress plugin that could have made the whole thing take 30 seconds, but in the absence of that, here are the gory details for anyone else who ever wants to do something like this.

First, of course, I needed to have the new blog set up. I’m assuming that’s self-evident, and needs no further explanation.

Next I needed to find all the posts I wanted to move. I already had them all tagged with “Books”, so this part was fairly easy and avoided an even longer manual process. WordPress doesn’t have an ‘export by tag/category’ option, though—the only way to restrict an export is by author. So I had to go into “Posts > Edit”, find a post with the relevant tag, and click that tag to give me a list of all those posts. Then I could do a Bulk Edit of each to change the author to a new temporary account I set up just for this purpose. There were multiple pages of them, and there doesn’t seem to be a way to operate on more than one page at a time, so I went through them page by page. It was repetitive enough to make me want to find a short-cut, but there weren’t quite enough pages to make it worthwhile.

Then I exported all the posts by my new author, and imported those into the new blog. I did some more tidying up there of tags and categories etc, and found a few posts that should probably still remain on this blog instead (they were tagged with Books too, but were, for example, about me getting rid of my collection before moving to Estonia, rather than being excerpts suitable for Dusty Volumes), so deleted them from there, and changed the author here back to me on each of them in turn (I wanted that author to match exactly the posts that were on the other blog so I could continue to operate on those here).

Now I had the new blog working, but hit the much harder problem of what to do about the posts here. I could, of course, just have deleted the posts that I’d moved, but I still get quite a few hits on them from Google searches and links from other blogs, as well as some internal links to them, and I didn’t want to break all those. After some research I found a couple of WordPress plugins for setting up redirection. The first one I tried, “Redirection”, has lots and lots of features, but wasn’t quite what I wanted. The second, “Redirect”, was perfect. It does only one thing, but does it simply, and does it well. Using the Custom Field options in WordPress, it lets you set a ‘Redirect’ field with a value of the URL that viewers should be redirected to on viewing a given post. So now it was just a matter of going through and setting those up one by one.

Thankfully the WordPress import maintains the post ID from the export, so I didn’t need to spend any time building a map of which IDs should map where: each relevant post would just need to redirect to http://dustyvolumes.com/archives/<id>. I did a couple of these manually to make sure everything was working, but there was no way I wanted to do another 150 or so by hand. It was time to go to the database.

I’ve never actually explored the WordPress schema before, but there aren’t very many tables, and it’s fairly easy to work out what’s going on. (There’s probably decent documentation for it all too, but I tend to prefer to just work things like this out manually.) I’m not going to detail all the SQL commands I had to run: if you don’t know enough to work them out yourself you probably shouldn’t be playing with the database directly anyway, and should just do this the longwinded way (and I really don’t want to be fielding questions on it 6 months from now when the schema has changed). But it was a simple matter to just select the IDs of all posts by my fake ‘author’, and insert the relevant Redirect custom field values.
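Purely to show the shape of the idea, and not the statements that were actually run, here is a sketch against a throwaway in-memory SQLite database. The two tables are cut-down stand-ins for WordPress’s wp_posts and wp_postmeta (a real install has many more columns, and would also need to skip revisions and pages), the author ID of 7 is invented, and the function names are hypothetical.

```python
import sqlite3

# URL pattern from the post: http://dustyvolumes.com/archives/<id>
BASE_URL = "http://dustyvolumes.com/archives/"


def add_redirects(conn, author_id, base_url=BASE_URL):
    """Give every post by the temporary author a 'Redirect' custom field
    pointing at the post with the same ID on the new blog."""
    conn.execute(
        """
        INSERT INTO wp_postmeta (post_id, meta_key, meta_value)
        SELECT ID, 'Redirect', ? || ID
        FROM wp_posts
        WHERE post_author = ?
        """,
        (base_url, author_id),
    )
    conn.commit()


def make_demo_db():
    """Build a tiny stand-in database: NOT the full WordPress schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_author INTEGER);
        CREATE TABLE wp_postmeta (
            meta_id INTEGER PRIMARY KEY,
            post_id INTEGER,
            meta_key TEXT,
            meta_value TEXT
        );
        INSERT INTO wp_posts (ID, post_author) VALUES (101, 1), (102, 7), (103, 7);
        """
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    add_redirects(conn, author_id=7)  # 7 is a made-up temporary author ID
    for row in conn.execute(
        "SELECT post_id, meta_key, meta_value FROM wp_postmeta ORDER BY post_id"
    ):
        print(row)
```

Because the import kept the post IDs, a single INSERT ... SELECT is enough and no lookup table of old to new IDs is needed.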

However, this still left a large number of ‘Books’ entries in my tagcloud that really weren’t there any more, so I also wanted to remove all the tags from these posts too. Ideally the Bulk Edit should be capable of this, but it currently only allows you to add a tag to multiple posts, not remove one, so again I went to the database. This one was slightly trickier, as it’s a cross-table DELETE, but again, if you don’t know how to do that, you shouldn’t just be pasting in random SQL you found on someone’s blog somewhere.
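In that spirit, the following is for illustration only and runs against a toy in-memory SQLite database, not a live blog. It uses subselects where MySQL would allow a multi-table DELETE, the tables are simplified stand-ins for wp_posts, wp_term_taxonomy and wp_term_relationships, and the author ID and function names are assumptions.

```python
import sqlite3


def strip_tags(conn, author_id):
    """Remove every tag link (but not category links) from posts by the
    given author. Returns the number of links removed."""
    cur = conn.execute(
        """
        DELETE FROM wp_term_relationships
        WHERE object_id IN (
                SELECT ID FROM wp_posts WHERE post_author = ?
              )
          AND term_taxonomy_id IN (
                SELECT term_taxonomy_id FROM wp_term_taxonomy
                WHERE taxonomy = 'post_tag'
              )
        """,
        (author_id,),
    )
    conn.commit()
    return cur.rowcount


def make_tag_db():
    """Build a tiny stand-in database: NOT the full WordPress schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_author INTEGER);
        CREATE TABLE wp_term_taxonomy (
            term_taxonomy_id INTEGER PRIMARY KEY,
            taxonomy TEXT
        );
        CREATE TABLE wp_term_relationships (
            object_id INTEGER,
            term_taxonomy_id INTEGER
        );
        INSERT INTO wp_posts (ID, post_author) VALUES (1, 1), (2, 7);
        INSERT INTO wp_term_taxonomy VALUES (10, 'post_tag'), (20, 'category');
        INSERT INTO wp_term_relationships VALUES (1, 10), (2, 10), (2, 20);
        """
    )
    return conn


if __name__ == "__main__":
    conn = make_tag_db()
    removed = strip_tags(conn, author_id=7)  # 7 is a made-up author ID
    print("tag links removed:", removed)
```

Running the same WHERE clause as a SELECT first, as described further down, is a cheap way to see what would be deleted before committing to it.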

Unfortunately, although that successfully removed all the tags, the tag cloud still proudly declared that I had a huge number of “Books” posts. WordPress, presumably for speed, keeps a total of how many posts are assigned to each category in a different table, and, being a typical modern webapp, maintains that count in client code rather than in the database itself. So having manually removed lots of tags without updating the count field too, my database was now out of sync with itself. MySQL doesn’t do cross-table UPDATEs with aggregates, so this time I needed an UPDATE with a subselect of a COUNT(*).
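The general shape of such a statement, again sketched against a toy in-memory SQLite database rather than copied from what was actually run, is a correlated subselect that recounts the links for each row. The tables are simplified stand-ins for wp_term_taxonomy and wp_term_relationships, the function names are hypothetical, and the recount deliberately ignores details a real install cares about, such as only counting published posts.

```python
import sqlite3


def resync_counts(conn):
    """Recompute each term's stored post count from the link table."""
    conn.execute(
        """
        UPDATE wp_term_taxonomy
        SET "count" = (
            SELECT COUNT(*)
            FROM wp_term_relationships
            WHERE wp_term_relationships.term_taxonomy_id
                  = wp_term_taxonomy.term_taxonomy_id
        )
        """
    )
    conn.commit()


def make_count_db():
    """Build a tiny stand-in database whose stored counts are stale."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE wp_term_taxonomy (
            term_taxonomy_id INTEGER PRIMARY KEY,
            taxonomy TEXT,
            "count" INTEGER
        );
        CREATE TABLE wp_term_relationships (
            object_id INTEGER,
            term_taxonomy_id INTEGER
        );
        -- Term 1 claims 150 posts but only 2 links remain.
        INSERT INTO wp_term_taxonomy VALUES (1, 'post_tag', 150), (2, 'post_tag', 3);
        INSERT INTO wp_term_relationships VALUES (5, 1), (6, 1), (5, 2), (6, 2), (7, 2);
        """
    )
    return conn


if __name__ == "__main__":
    conn = make_count_db()
    resync_counts(conn)
    for row in conn.execute(
        'SELECT term_taxonomy_id, "count" FROM wp_term_taxonomy ORDER BY 1'
    ):
        print(row)
```

A term left with no links at all gets a count of zero, since COUNT(*) over an empty set returns 0 rather than NULL.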

Including lots of cautious exploratory SELECTs, lots of LIMITs on my UPDATEs and DELETEs to make sure the right thing was happening each time, and backing up carefully after each major change, the whole thing took about an hour. I could possibly have done it all via the web interface in that time, but it would have been a close call, and there was a very high chance that I’d have gotten so bored in the middle of it that I’d have abandoned it half-way through, promising to finish it another day (and likely never quite gotten around to it). This way was mentally stimulating rather than draining, thus giving much more satisfaction when done, and I learned much more about the WordPress database structure that could be very useful if I ever decide to write a plugin.

And now I have two blogs to rarely write in…