I was talking to Marty again recently about his anti-Java stance, and we were trying to think of ways in which different languages could be rated.
This of course reminded me to go back and check out Doug Bagley’s Great Language Shootout, which compares multiple languages’ speed and memory usage for doing the same task.
I’d previously helped optimise some of the Perl solutions, and last night I noticed that there were a few new tests from when I last looked at it. In partiuclar the results for the Matrix Multiplication test seemed slightly strange: Perl was much further down the list that I’d have expected.
The meat of the code seemed to be the function mmult:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | sub mmult { my ($rows, $cols, $m1, $m2) = @_; my @m3 = (); --$rows; --$cols; for my $i (0 .. $rows) { my @row = (); my $m1i = $m1->[$i]; for my $j (0 .. $cols) { my $val = 0; for my $k (0 .. $cols) { $val += $m1i->[$k] * $m2->[$k]->[$j]; } push(@row, $val); } push(@m3, \@row); } return(\@m3); } |
I played around for a while, and got about a 50% speedup with quite a nasty nested map approach:
1 2 3 4 5 6 7 8 9 10 11 12 13 | sub mmult2 { my ($rows, $cols, $m1, $m2) = @_; --$rows; --$cols; my @m3 = (); for (0 .. $rows) { my $i = $_; push @m3, [ map { my $j = $_; sum map $m1->[$i]->[$_] * $m2->[$_]->[$j], 0..$cols; } 0 .. $cols ] } return \@m3; } |
I tried various approaches to turn the outer for() into a map as well, but my brain started hurting too much as it all got very messy.
And then I noticed that we were pushing to @m3 each time around a loop that counted from 0, and realised it would probably be much more efficient to just assign directly each time. So I replaced the push @m3, ... with $m3[$i] = ... , and performance shot up.
So I rolled back all the other changes I’d made, and just applied this straight through:
1 2 3 4 5 6 7 8 9 10 11 | sub mmult3 { my ($rows, $cols, $m1, $m2) = @_; my $m3 = []; --$rows; --$cols; for my $i (0 .. $rows) { for my $j (0 .. $cols) { $m3->[$i][$j] += $m1->[$i][$_] * $m2->[$_][$j] for 0..$cols; } } return $m3; } |
I think this version is much neater, more idiomatic Perl, and also more understandable and maintainable than not just my optimised one, but the original as well. And it’s 3 times faster.
Optimising for speed doesn’t always mean trading off maintainability. Usually finding a better approach gets better results that micro-optimisations, and can end up producing an all-round better solution, not just a faster one.
Unfortunately Doug has stopped updating the Shootout pages, so perl will just have to languish 4 places lower on this test than it should be …
The Computer Language Shootout is alive and well
http://shootout.alioth.debian.org/gp4/