Archive for January, 2007

Wiki Gardening

January 16th, 2007

Chris Dent, whose blog is a treasure trove of interesting thinking about wikis, asks:

Even if you have everyone writing in blogs every day, how do you ensure that all those stories are distilled for information that is useful tomorrow, next month, next year and five years down the road?

This is a great question, and one that I have also been puzzling over a lot in my current employ. When we took over the running of Ireland’s oldest ISP a couple of years ago there was a huge information loss problem. So within the first week or so we set up RT to track customer enquiries, a blog for each member of staff to narrate their work, and a wiki to act as a basic customer management system and repository of useful information (it has since grown into much more, including our accounts package, but that’s a different story).

Now, we often have the opposite problem. I remember seeing the information somewhere, but can’t remember where it is – is it buried in a customer dialogue within RT? Did someone write it up on their blog? Was it added to the wiki?

In this version of the problem, ‘search’ probably is the simplest answer. But as Chris points out, search isn’t always the right answer. I would go further in my reasons why, however.

Search is useful when you’re looking for something, and you know what it is. Often you don’t have both halves of that. When a customer contacts you, for example, you should be able to pull up a single page of details about them where all the important facts will be listed: what level of support they have; what services they have; major problems on their account; key personnel etc. If a customer has recently had significant problems with their email, but this hasn’t been recorded on the wiki, and the person dealing with the customer now doesn’t know that, then they’re not even going to consider searching for information about it. Even if they have a vague memory of overhearing someone talking about some sort of problem with the customer’s account a month or so ago, they probably don’t even have enough information to search with.

It wouldn’t matter how great a search appliance we had, normal search just wouldn’t help here. This is where, as Chris points out, the process of wiki gardening comes in. Someone needs to tend to the wiki, carefully pruning back the less relevant information, and reshaping each page into its most useful form.

But this is a time-consuming operation, and generally most people don’t have that sort of time. It’s hard enough trying to ensure that a summary of the key facts from each customer interaction gets copied over onto their wiki page, without also needing to spend another five minutes tidying that page up. In larger organisations where call-centre staff are measured on how many queries they can handle per hour, the disincentive is much, much stronger.

And, of course, any time a human is copying information between two different computer systems, a giant red flag should pop up and scream that something really bad is going on.

I don’t have any great answers to this problem, but I wholeheartedly agree that enterprise wikis need to provide better tools for dealing with information stored outside themselves.

JotSpot did quite a lot of work in this area, providing two-way integration with Salesforce.com etc., but although that made for a cool demo, I don’t think that’s really what’s needed. Rather than just replicating the data stored elsewhere, the wiki needs to allow you to summarise what’s there, and then direct you off to view the full detail in situ. (Ideally, searching within the wiki should pick up the full content, though.) But, crucially, there needs to be a way for the wiki to know when there’s un-summarised data needing handled, rather than users needing to remember to copy the data across.

Perhaps in the first instance it’s as simple as being able to add a little gizmo to a page that tells it how to find, for example, all RT tickets for this customer, which will then automatically list each ticket – initially in a default manner but where each entry can be edited in the traditional wiki way? As we move more and more into a world where different pieces of software can talk to each other with web services, it should become easier and easier for this sort of information to be pulled across.
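As a rough sketch of what such a gizmo might do, here is a small Python example. It assumes the tickets have already been fetched from the tracker by some other means; the Ticket and TicketList classes, their field names, and the sample data are all hypothetical, and nothing here talks to RT or to any real wiki. It only illustrates the two behaviours described above: each ticket gets a default listing that a hand-written summary can override, and the page can report which tickets nobody has summarised yet.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Ticket:
    """A ticket as it might come back from the tracker (hypothetical shape)."""
    ticket_id: int
    subject: str
    status: str


@dataclass
class TicketList:
    """Page gizmo: auto-listed tickets plus any hand-written summaries."""
    customer: str
    # Hand edits made in the wiki, keyed by ticket id (an assumption).
    summaries: dict[int, str] = field(default_factory=dict)

    def render(self, tickets: list[Ticket]) -> list[str]:
        """Return one line per ticket, preferring the hand-written summary."""
        lines = []
        for t in sorted(tickets, key=lambda t: t.ticket_id):
            text = self.summaries.get(t.ticket_id)
            if text is None:
                # Default rendering, flagged so a gardener can find it.
                text = f"{t.subject} ({t.status}) [unsummarised]"
            lines.append(f"#{t.ticket_id}: {text}")
        return lines

    def unsummarised(self, tickets: list[Ticket]) -> list[int]:
        """Ids of tickets that have no summary on the wiki yet."""
        return sorted(
            t.ticket_id for t in tickets if t.ticket_id not in self.summaries
        )


if __name__ == "__main__":
    # Made-up sample data for illustration only.
    tickets = [
        Ticket(101, "Email bouncing", "resolved"),
        Ticket(107, "New mailbox request", "open"),
    ]
    page = TicketList(
        "Example Ltd",
        summaries={101: "Mail outage for two days; MX record fixed."},
    )
    for line in page.render(tickets):
        print(line)
    print("Needs summarising:", page.unsummarised(tickets))
```

Keeping the summaries separate from the fetched tickets means the full detail stays in the tracker, and the list of unsummarised ids is the wiki's way of knowing there is gardening still to do.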

Semantic MediaWiki already provides a syntax for querying information within the wiki (although there’s no way yet to manually manipulate the results), so something like this could probably be repurposed to query information outside the wiki in a similar manner. Time to go ask on the semantic wiki list whether anyone’s working on anything like this. And I’ll certainly be paying close attention to how Socialtext (or anyone else, for that matter) tries to solve this issue.

mytop and locked threads

January 5th, 2007

Recently I’ve had to do some heavy-duty maintenance work on a MySQL database that’s still in heavy use and can’t take more than about 30 seconds of downtime. It’s having serious problems due to a 50 million row table that really needs trimmed down, but the table is missing the crucial indexes that would allow that to happen easily. Without the indexes, even deleting the complete data set for obsolete accounts can take several hours, never mind the time to perform the more complex purges for active accounts (which should be happening daily, but as they take too long, haven’t been happening for a long time, thus making the problem worse every day!). Of course, adding the indexes that would make this all much easier would also lock the table up even longer.

I have a plan to solve this by temporarily replicating the table to a slave version that has the correct indexes and then swapping the tables (I’ll blog the complete details later if I can get it to work), but whilst I’ve been in investigation mode, I’ve been relying heavily on mytop. This is a wonderful little utility for watching what’s happening in a MySQL database, similar to the Unix ‘top’ command.

Because the data is in a MyISAM table, and thus has table-level locking, it’s prone to the old problem of a long select causing an insert or update to block, which in turn causes all other selects behind that to block as well. So I need to be very careful that none of my queries are causing a big queue. For this, mytop is almost perfect, with one small caveat: out of the box it doesn’t show which threads are locked. I don’t care if my select takes too long if it isn’t blocking anything, but once a queue forms, I need to be ready to kill my thread. Mostly I can work this out from what else is executing, but I prefer letting the computer do that sort of work for me. So I made a simple one-line addition to the code.

At around line 1000 there’s an ‘if ($HAS_COLOR)’ block to print the output in different colours depending on the type of command being executed at the database. At the end of that I added:

print RED()    if $thread->{State} && $thread->{State} eq 'Locked';

Now any locked thread is instantly recognizable, and I can react much quicker to any problem. It’s also quite interesting to watch what’s happening even when I’m not meddling, and see how many locks are naturally arising anyway!
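The same check can be sketched outside mytop. This minimal Python example assumes the process list has already been fetched as a list of dictionaries using the column names that SHOW FULL PROCESSLIST returns (Id, Command, Time, State, Info). The two function names and the sample rows are made up for illustration, the 'Locked' state string is the one used in the patch above (other server versions may word it differently), and the "likely blocker" is only a heuristic guess based on the queueing pattern described earlier.

```python
def find_locked(rows):
    """Return the rows whose State is 'Locked' (the threads the patch colours red)."""
    return [r for r in rows if r.get("State") == "Locked"]


def longest_running_unlocked(rows):
    """Heuristic guess at the blocker: the longest-running active query
    that is not itself locked. Returns None if there is no such query."""
    active = [
        r for r in rows
        if r.get("Command") == "Query" and r.get("State") != "Locked"
    ]
    return max(active, key=lambda r: r["Time"], default=None)


if __name__ == "__main__":
    # Made-up sample rows for illustration only.
    rows = [
        {"Id": 1, "Command": "Query", "Time": 120, "State": "Sending data",
         "Info": "SELECT ..."},
        {"Id": 2, "Command": "Query", "Time": 90, "State": "Locked",
         "Info": "UPDATE ..."},
        {"Id": 3, "Command": "Query", "Time": 80, "State": "Locked",
         "Info": "SELECT ..."},
        {"Id": 4, "Command": "Sleep", "Time": 500, "State": "", "Info": None},
    ]
    locked = find_locked(rows)
    if locked:
        suspect = longest_running_unlocked(rows)
        print(len(locked), "locked thread(s); likely blocker:", suspect["Id"])
```

If the suspect turns out to be my own select, that is the thread to kill.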

There hasn’t been a release of mytop for a few years now, the mailing list has vanished, and I’m not sure whether it’s even actively maintained any more. So I’m not expecting to see this show up in the live code any time soon. But this post will remind me to add this on any future installations where it might come in useful. The beauty of Free Software!
