Posts Tagged ‘perl’

Camel Poop

January 5th, 2003 No comments

In response to the Camel POOP article on Evolt, Simon Willison complains: “I have no intention of starting a language war, but my God this is ugly. Still, I guess it must work for some people.”

As a user and fan of Perl, I have to agree. This is exceptionally ugly. In fact it’s the sort of thing that turns people against Perl. Thankfully OO in Perl doesn’t have to be like that.

Let’s take his Person class:

#class Person
package Person;
use strict;
use Address;    #Person class will contain an Address
 
#constructor
sub new {
  my ($class) = @_;
  my $self = {
    _firstName => undef,
    _lastName  => undef,
    _ssn       => undef,
    _address   => undef
  };
  bless $self, $class;
  return $self;
}
 
#accessor method for Person first name
sub firstName {
  my ( $self, $firstName ) = @_;
  $self->{_firstName} = $firstName if defined($firstName);
  return $self->{_firstName};
}
 
#accessor method for Person last name
sub lastName {
  my ( $self, $lastName ) = @_;
  $self->{_lastName} = $lastName if defined($lastName);
  return $self->{_lastName};
}
 
#accessor method for Person address
sub address {
  my ( $self, $address ) = @_;
  $self->{_address} = $address if defined($address);
  return $self->{_address};
}
 
#accessor method for Person social security number
sub ssn {
  my ( $self, $ssn ) = @_;
  $self->{_ssn} = $ssn if defined($ssn);
  return $self->{_ssn};
}
 
sub print {
  my ($self) = @_;
 
  #print Person info
  printf("Name:%s %s\n\n", $self->firstName, $self->lastName );
}
 
1;

There are numerous problems with this. Let’s start from the top.

  1. The ‘use Address’ is completely needless. Misleading comment notwithstanding, nothing in the package actually uses the Address module, so there’s no need to load it.
  2. The constructor is complete overkill. The object is going to be a hash, but in Perl hash keys autovivify the first time you assign to them, so there’s absolutely no need to set up lots of keys that contain undef. As the constructor does nothing, it could simply be sub new { bless {}, shift }.
  3. The data methods all do exactly the same thing. This breaks one of the cardinal rules of programming – Don’t Repeat Yourself. firstName, lastName, address, and ssn are all trivial accessor/mutator methods. They could all be abstracted away in a variety of ways, but as this is Perl, someone else has already done that for us. Class::Accessor lets us set up all these methods simply by doing:
    Person->mk_accessors(qw/firstName lastName address ssn/);
  4. The constructor doesn’t allow you to set object data. It’s pretty much a matter of style, but in general it’s nice to be able to instantiate your object with data, rather than having to call all the mutators in turn. As this is a simple hash-based object (as with 90% of all Perl objects), Class::Accessor gives us a default new() as well, which lets us pass a hashref of the data members.
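
To make points 3 and 4 concrete, here is a minimal sketch of the kind of accessor generation that a module like Class::Accessor saves you from writing. It is not Class::Accessor’s actual implementation: the package name Sketch::Person and the generating loop are purely my own illustration. Note one assumption that differs from the article’s code: these accessors set the value whenever an argument is passed, even undef.

```perl
package Sketch::Person;    # hypothetical name, for illustration only
use strict;
use warnings;

# Constructor that accepts an optional hashref of initial data.
sub new {
    my ($class, $fields) = @_;
    return bless { %{ $fields || {} } }, $class;
}

# Generate one accessor/mutator per field instead of writing each by hand.
for my $field (qw/firstName lastName address ssn/) {
    no strict 'refs';    # needed to install subs by name
    *{ __PACKAGE__ . "::$field" } = sub {
        my $self = shift;
        $self->{$field} = shift if @_;    # set only when a value is passed
        return $self->{$field};
    };
}

package main;

my $person = Sketch::Person->new({ firstName => 'Khurt' });
$person->lastName('Williams');
printf "Name:%s %s\n", $person->firstName, $person->lastName;
```

Each generated sub is a closure over $field, so the four methods share one body. In real code I would still reach for the CPAN module rather than maintain this myself.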

So, this entire class could be replaced with:

package Person;
 
use strict;
use base 'Class::Accessor';
 
Person->mk_accessors(qw/firstName lastName address ssn/);
 
sub print {
  my ($self) = @_;
  printf "Name:%s %s\n\n", $self->firstName, $self->lastName;
}
 
1;

Similarly, the Employee subclass, which simply adds ‘id’ and ‘title’ data members, and overrides print could become:

package Employee;
 
use strict;
use base 'Person';
 
Employee->mk_accessors(qw/id title/);
 
sub print {
  my ($self) = @_;
  $self->SUPER::print;
  printf("Name:%s %s\n\n", $self->id, $self->title );
}
 
1;

(The example on the site makes no sense as it keeps referring to an Address class that doesn’t seem to exist. I’ve assumed that the overridden print should be outputting the extra data members of this class instead…)

Then, instead of the long-winded test program (with an eval to catch exceptions that’s completely needless, as neither class throws them):

use Employee;
 
#create Employee class instance
my $khurt =  eval { new Employee(); }  or die ($@);
 
#set object attributes
$khurt->firstName('Khurt');
$khurt->lastName('Williams');
$khurt->id(1001);
$khurt->title('Executive Director');
 
#diplay Employee info
$khurt->print();

We can simply have:

use Employee;
 
my $khurt = Employee->new({
  firstName => 'Khurt',
  lastName  => 'Williams',
  id        => 1001,
  title     => 'Executive Director',
});
 
$khurt->print();

It may not be as nice as OO in some other languages, but it’s a lot nicer than the example in the article… and as it’s Perl, there are plenty of other approaches if you want to dice it another way.


Is Perl the new COBOL?

October 2nd, 2002 1 comment

It seems to me that COBOL and other popular high-level procedural languages should be blamed for most of the current problems in programming. It is so easy to program in such languages that even novice programmers can write large programs – but the result is often poorly structured. With OO languages such as Smalltalk, programmers take pains to learn how to program, and the result is well-structured programs. Having programming approaches that can be used only by experts may be a good idea. Our past attitude of easy-going programming is a major cause of the present software crisis.

– Takaya Ishida, in Steve McConnell’s “The Best Influences on Software Engineers”, Software, 2000


The Great Language Shootout

October 2nd, 2002 1 comment

I was talking to Marty again recently about his anti-Java stance, and we were trying to think of ways in which different languages could be rated.

This of course reminded me to go back and check out Doug Bagley’s Great Language Shootout, which compares multiple languages’ speed and memory usage for doing the same task.

I’d previously helped optimise some of the Perl solutions, and last night I noticed that there were a few new tests since I last looked at it. In particular the results for the Matrix Multiplication test seemed slightly strange: Perl was much further down the list than I’d have expected.

The meat of the code seemed to be the function mmult:

sub mmult {
    my ($rows, $cols, $m1, $m2) = @_;
    my @m3 = ();
    --$rows; --$cols;
    for my $i (0 .. $rows) {
        my @row = ();
        my $m1i = $m1->[$i];
        for my $j (0 .. $cols) {
            my $val = 0;
            for my $k (0 .. $cols) {
                $val += $m1i->[$k] * $m2->[$k]->[$j];
            }
            push(@row, $val);
        }
        push(@m3, \@row);
    }
    return(\@m3);
}

I played around for a while, and got about a 50% speedup with quite a nasty nested map approach:

use List::Util qw(sum);    # sum() is not a builtin

sub mmult2 {
  my ($rows, $cols, $m1, $m2) = @_;
  --$rows; --$cols;
  my @m3 = ();
  for (0 .. $rows) {
    my $i = $_;
    push @m3, [ map {
      my $j = $_;
      sum map $m1->[$i]->[$_] * $m2->[$_]->[$j], 0..$cols;
    } 0 .. $cols ]
  }
  return \@m3;
}

I tried various approaches to turn the outer for() into a map as well, but my brain started hurting too much as it all got very messy.

And then I noticed that we were pushing to @m3 each time around a loop that counted from 0, and realised it would probably be much more efficient to just assign directly each time. So I replaced the push @m3, ... with $m3[$i] = ... , and performance shot up.

So I rolled back all the other changes I’d made, and just applied this straight through:

sub mmult3 {
  my ($rows, $cols, $m1, $m2) = @_;
  my $m3 = [];
  --$rows; --$cols;
  for my $i (0 .. $rows) {
    for my $j (0 .. $cols) {
      $m3->[$i][$j] += $m1->[$i][$_] * $m2->[$_][$j] for 0..$cols;
    }
  }
  return $m3;
}

I think this version is much neater, more idiomatic Perl, and also more understandable and maintainable than not just my optimised one, but the original as well. And it’s 3 times faster.
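
If you want to check the comparison yourself, something along these lines should do it. This is only a sketch: mkmatrix and flatten are hypothetical helpers of mine, the matrix size is arbitrary, and I’ve renamed the two functions mmult_push and mmult_assign so they can live in one file. The timings will vary with your perl version and machine, so I wouldn’t assume the 3x figure reproduces exactly.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical helper: build a $rows x $cols matrix filled with 1, 2, 3, ...
sub mkmatrix {
    my ($rows, $cols) = @_;
    my $count = 1;
    my @matrix;
    for my $i (0 .. $rows - 1) {
        $matrix[$i][$_] = $count++ for 0 .. $cols - 1;
    }
    return \@matrix;
}

# The Shootout version: builds each row and the result with push.
sub mmult_push {
    my ($rows, $cols, $m1, $m2) = @_;
    my @m3 = ();
    --$rows; --$cols;
    for my $i (0 .. $rows) {
        my @row = ();
        my $m1i = $m1->[$i];
        for my $j (0 .. $cols) {
            my $val = 0;
            for my $k (0 .. $cols) {
                $val += $m1i->[$k] * $m2->[$k]->[$j];
            }
            push(@row, $val);
        }
        push(@m3, \@row);
    }
    return \@m3;
}

# The rewritten version: accumulates directly into the result cells.
sub mmult_assign {
    my ($rows, $cols, $m1, $m2) = @_;
    my $m3 = [];
    --$rows; --$cols;
    for my $i (0 .. $rows) {
        for my $j (0 .. $cols) {
            $m3->[$i][$j] += $m1->[$i][$_] * $m2->[$_][$j] for 0 .. $cols;
        }
    }
    return $m3;
}

# Hypothetical helper: flatten a matrix to a string for comparison.
sub flatten { return join ',', map { @$_ } @{ $_[0] } }

my $size = 30;    # arbitrary, chosen to keep the run short
my $m1 = mkmatrix($size, $size);
my $m2 = mkmatrix($size, $size);

# Check both versions agree before trusting any timings.
my $same = flatten(mmult_push($size, $size, $m1, $m2))
        eq flatten(mmult_assign($size, $size, $m1, $m2));
print "Results match: ", ($same ? "yes" : "no"), "\n";

# Run each version for at least one CPU second and print a comparison table.
cmpthese(-1, {
    push   => sub { mmult_push($size, $size, $m1, $m2) },
    assign => sub { mmult_assign($size, $size, $m1, $m2) },
});
```

Checking that the two results agree first matters more than the timings: a faster function that gives different answers isn’t an optimisation.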

Optimising for speed doesn’t always mean trading off maintainability. Usually finding a better approach gets better results than micro-optimisations, and can end up producing an all-round better solution, not just a faster one.

Unfortunately Doug has stopped updating the Shootout pages, so perl will just have to languish 4 places lower on this test than it should be …


Perl Aikido

September 7th, 2002 No comments

I’ve spent the last two days in Damian Conway’s “Advanced Module Implementation” course. In it he steps through, line by line, the core code of his Attribute::Handlers, NEXT, Class::Delegation, Inline::Files, Filter::Simple, Perl6::Variables, Perl6::Currying, Hook::LexWrap, Regexp::Common, and, yes, Acme::Bleach modules to show lots of different ways in which to build modules. And twist your brain in lots of painful ways along the way.

A key refrain along the way is that of Perl Aikido – having enough mastery of your craft to know how to get lots done, not by writing lots of code, but by doing just the right thing at the right time. Most of these modules are pretty much implemented in less than one screenful of code. And whilst some of it is insanely devious (such as the infamous goto , "$imposter" } }; line), much of it is just piecing together straightforward things in an unusual way (the much more subtle deviousness of Regexp::Common that doesn’t look at all scary until you try to follow just what is actually going on).

I’ve recently been noticing a glitch in my theory of extraneous code. Almost everyone knows that a beginner in any programming language writes far too much code. They don’t yet know all the idioms of the language, and try to replicate the approaches they know from other languages, even if this language has a built in manner for doing this. As you get better at writing the language you learn the better approaches, and your code gets shorter. However, I no longer think there’s a simple progression from this point to the ‘master’ level where modules like the above ones can be written in one screenful of code. I think that ‘expert’ programmers, still on their way towards ‘master’, actually revert to writing much too much code again. When faced with a difficult problem, they know enough of the nasty things they can do to attempt to stomp through the problem. Ironically someone with less knowledge would probably not attempt this, and try to find a way around the problem instead. An expert may have lots more tools in his toolbox, but it takes a master to know exactly which one is best for every job.

Space shuttle engineers use Python to streamline mission design

June 19th, 2002 No comments

Software engineers have long told their bosses and clients that they can have software “fast, cheap, or right,” as long as they pick any two of those factors. Getting all three? Forget about it! But United Space Alliance (USA), NASA’s main shuttle support contractor, had a mandate to provide software that meets all three criteria.

[via Simon Willison]

This is a nice article on how using the right tool for the job [in this case Python] can shrink your development time significantly. It’s a great piece of advocacy for the Python folks, but also for the whole “dynamic scripting language as a ‘real’ language” approach.

Unfortunately it reinforces the usual Perl issues: “Without a lot of documentation, it is hard to grasp what is going on in Java and C++ programs and even with a lot of documentation, Perl is just hard to read and maintain.”

I know no-one believes me, but I’ll just yell it again anyway: It doesn’t have to be like that! Well written Perl can be as readable and maintainable as any other language.

Unfortunately most of it isn’t. It’s too easy to write terrible Perl. And it’s too acceptable to write terrible Perl. And unless software managers start insisting that their developers’ Perl is as readable and maintainable as it would have to be in any other language, we’re always going to hear stories like this.


Lightweight XSLT with TT

April 16th, 2002 No comments

I discovered a wonderful Template Toolkit plugin yesterday: XML::Style.

The basic idea, according to the docs, is that you can apply various attributes to your HTML. The example given is of transforming an HTML table:

           [% USE xmlstyle
                  table = {
                      attributes = {
                          border      = 0
                          cellpadding = 4
                          cellspacing = 1
                      }
                  }
           %]
 
           [% FILTER xmlstyle %]
 
           <table>
           <tr>
             <td>Foo</td> <td>Bar</td> <td>Baz</td>
           </tr>
           </table>
 
           [% END %]

This didn’t sit quite right with me though, as that seemed to be something you should be doing in CSS. But as I read through the docs I discovered you can also change tags. Again though the docs gave a bad example:

           [% FILTER xmlstyle
                     th = {
                         element = 'td'
                         attributes = { bgcolor='red' }
                     }
           %]
           <tr>
             <th>Heading</th>
           </tr>
           <tr>
             <td>Value</td>
           </tr>
           [% END %]

Having been playing with XSLT recently, though, a lightbulb went off. The real power of this plugin is more to be able to do things like:

   [% USE xmlstyle
        video = {
          pre_start = '<html><head><title>Video Info</title></head><body>'
          element = 'table'
          attributes = { class='videoTable' },
          post_end  = '</body></html>'
        }
 
        title = {
          pre_start = '<tr><td>Title:</td>'
          element    = 'td'
          attributes = { class='videoTitle' }
          post_end  = '</tr>'
        }
 
        price = {
          pre_start = '<tr><td>Price:</td>'
          element    = 'td'
          attributes = { class='videoPrice' }
          post_end  = '</tr>'
        }
   %]

And then, given some XML such as:

    <video>
      <title>La Double Vie De Veronique</title>
      <price>10.99</price>
    </video>

We end up with:

    <html><head><title>Video Info</title></head><body><table class="videoTable">
      <tr><td>Title:</td><td class="videoTitle">La Double Vie De Veronique</td></tr>
      <tr><td>Price:</td><td class="videoPrice">10.99</td></tr>
    </table></body></html>

This could be used as a first step towards “true” XSLT if you’re already using TT. If the only reason you’re moving towards XSLT is because a PHB says to, it might even be enough to convince them that you’ve done so :)


Tainted SOAP

April 10th, 2002 No comments

Jon Udell says that “I’m sure Paul Kulchenko will soon fix the SOAP::Lite vulnerability that was just noticed.”

This is quite a strange tale. It was discussed on the Perl-5-Porters list back in December. At that time the discussion centred less on SOAP than on why Perl’s taint mode didn’t help.

For those that don’t know, this feature tells the language not to trust any data received from the outside world unless the author has taken steps to verify that that information is safe (usually through a regular expression, although there are a variety of tools that can help with the monotony of this).
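
As a rough illustration of the “verify through a regular expression” step, here is a minimal sketch. The function name untaint_username and the pattern are my own assumptions about what counts as safe input, not anything from the discussion. Run it with perl -T to see taint mode in action; without -T nothing is ever marked tainted, but the validation logic works the same.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Return a cleaned copy of $input if it looks like a simple username,
# or undef otherwise. Under taint mode, a value captured by a regex
# match is considered untainted.
sub untaint_username {
    my ($input) = @_;
    return undef unless defined $input;
    return $input =~ /\A([A-Za-z0-9_]{1,32})\z/ ? $1 : undef;
}

# Command-line arguments are tainted when running under -T.
my $raw   = @ARGV ? $ARGV[0] : 'khurt';
my $clean = untaint_username($raw);

printf "raw value tainted: %s\n", tainted($raw) ? 'yes' : 'no';
if (defined $clean) {
    printf "clean value '%s' tainted: %s\n",
        $clean, tainted($clean) ? 'yes' : 'no';
}
else {
    print "input rejected\n";
}
```

The pattern should describe what you will accept, not what you want to exclude. A capture of (.*) would untaint the data just as effectively while checking nothing at all.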

Several orthogonal issues arose out of this. Firstly, in a language as dynamic as Perl, what exactly should you check? In this case the problem comes down to resolving method calls at run-time based on possibly unsafe data, whereas previously taint had been used more for areas like I/O and executing other arbitrary programs.
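
To show the general shape of that first problem, here is a minimal sketch of run-time method dispatch guarded by an explicit list of permitted names. It is not SOAP::Lite’s code, nor the fix that was eventually applied: Sketch::Service, dispatch and %allowed are all hypothetical names of mine.

```perl
use strict;
use warnings;

package Sketch::Service;    # hypothetical class exposed to outside callers

sub new    { return bless {}, shift }
sub add    { my ($self, $x, $y) = @_; return $x + $y }
sub secret { return 'not meant to be callable from outside' }

package main;

# Only these method names may be chosen by outside data.
my %allowed = map { $_ => 1 } qw(add);

sub dispatch {
    my ($object, $method, @args) = @_;
    die "Method '$method' is not permitted\n" unless $allowed{$method};
    return $object->$method(@args);    # method resolved by name at run-time
}

my $service = Sketch::Service->new;

my $sum = dispatch($service, 'add', 2, 3);
print "add(2, 3) = $sum\n";

my $result = eval { dispatch($service, 'secret') };
print "Rejected: $@" unless defined $result;
```

Without the %allowed check, whoever supplies $method gets to pick any method the object can resolve, which is exactly the sort of thing taint mode was never watching for.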

Secondly, and more importantly IMO, a change arose (not directly out of this, but around the same time) to help with the gradual migration to taint-safe code. One of the areas in which taint mode is most useful, and most important, is in web-programming. When your forms are taking input from the big bad web(tm) you need to be very sure of what you’re doing with that information. But, if you’re running under mod_perl, taintedness is very much ‘all or nothing’ across your entire application.

This causes problems if you come in late to a project that isn’t running in taint mode. It’s very difficult to bootstrap your way back into safe territory. You can attempt to migrate your code to a taintsafe approach (and abstracting all input processing to using a tool like that described earlier can help greatly with this). But you can’t actually turn taint on until you’ve found every last unsafe construct without risking your entire application blowing up at run-time (trying to do something unsafe with an unchecked tainted variable throws a fatal exception).

This issue has come up many times on the perl internals list, and it generally generates a lot of heat. Proponents of the current system point out that if you haven’t closed every possible problem case then your entire application can blow up in even more dangerous ways at run time anyway – you’re just relying on no-one else finding the holes before you. People in precisely this situation, on the other hand, would point out that they’d like to move to a truly safe environment, but without being able to turn taint on an area at a time, it’s a very difficult path to take.

This time around however, we got a result. Larry Wall blessed the idea of taint warnings. From version 5.8 you will now have the ability to turn on a taint pragma which warns rather than dies when you attempt to do something potentially unsafe. In most cases this should alert you to the hole before someone passes something actually unsafe through it. So now (or at least soon, if you don’t run bleadperl), you can turn that on everywhere, watch for warnings, and when you’re comfortable that you should be clean, turn full taint mode on.

However. No-one then went back to the original problem. The SOAP::Lite problem remained. Possibly no-one ever told Paul about it. And then it raised its head again this week, 4 months later.

Jarkko, the current pumpking, had a wonderful response:

While it’s true that Perl strives to give you enough rope I sometimes wonder was it morally right to buy a whole sisal plantation.

This time, the loophole has been closed.

The story provides an interesting take on the open source approach to security. When everything comes together properly, as has often been pointed out, security works well. Holes can be closed quicker than in the equivalent closed source world. But things don’t always come together properly, and attention can very easily be diverted elsewhere. Of course that “elsewhere” can often be very useful too :)


Dynamic META tags in Template Toolkit

April 5th, 2002 3 comments

Thanks to Jonas Liljegren, I finally found a workaround to a limitation in the Template Toolkit.

I tend to wrap all my templates in a base template, which will add a standard header and footer etc to every page. TT allows you to set META variables in your inner template which will affect how this gets processed.

So, for example, in my default wrapper file I tend to have something like:

   DEFAULT wrapper = template.wrapper or 'wrapper/main_content';
   PROCESS components/browser_title;
   PROCESS components/header;
   PROCESS $template WRAPPER $wrapper;
   PROCESS components/footer;

This allows the inner content of every page to be wrapped in a certain way (in the main_content wrapper), but this can be overridden within a given template by saying:

  [% META wrapper = "wrapper/flibble" %]

However, due to how the current version of TT works, this can only contain static text. This causes problems in, for example, my “browser_title” template, which will contain something like:

  <title>[% template.browser_title || "My Site Name" %]</title>

It’s fine if I want to set the title of a particular page to “Product Information”, but less good if I want to set it to the name of the particular product being passed to that template!

Jonas came up with a useful hack to get around this. He defined a new Interpolate plugin which would, when USEd, translate any template variable starting with a minus sign into a re-evaluated version of that:

package MyPath::Template::Plugin::Meta::Interpolate;
use strict;
use base "Template::Plugin";
sub new {
    my ($self, $context, @params) = @_;
    my $template = $context->stash->{'template'};
    foreach my $key (keys %{$template}) {
        next if $key =~ /^_/;
        my $val = $template->{$key};
        if( $val =~ /^-(.*)/ ) {
            my $src = "[% $1 %]";
            $template->{$key} = $context->process(\$src, {});
        }
    }
    return $self;
}

Now, in my template, I can do:

  [% META browser_title = '-product.name' %]

Perfect for improving the google ranking of all my dynamically generated product pages!

Thanks Jonas.
