Coderefs are like other references: they allow you to indirectly access the item they reference. They also allow you to treat code as if it were data.
This is a quick overview of references. Read the documentation listed later to get a fuller understanding.
Use the \ operator to create a reference.
my $sref = \$scalar;
my $aref = \@array;
my $href = \%hash;
my $cref = \&func;
# anonymous array and hash
my $aref2 = [ 1, 2, 3 ];
my $href2 = { a => 1, b => 2 };
Here are some examples of how to make references. Scalar references are not used nearly as often as array or hash references. Both named and anonymous references to arrays and hashes are quite useful.
Code references are much more useful than most people seem to think.
Note that you do not make a coderef with &func;. This construct calls the subroutine func, passing it the current @_, that is, the arguments of the subroutine that made the call. This calling convention was required in Perl 4, but not in Perl 5. Its effects are different from what people expect, so the general recommendation is to not call subroutines this way.
Assigning the output of this call to a variable captures the return value of the subroutine call, not a reference to the subroutine.
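The difference between calling with & and taking a reference with \& can be seen in a small sketch. The subroutine names count_args and wrapper are made up for this illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: reports how many arguments it received.
sub count_args { return scalar @_; }

sub wrapper
{
    my $result = &count_args;    # a call: count_args sees wrapper's @_
    my $cref   = \&count_args;   # a reference: nothing is called here
    return ( $result, $cref );
}

my ( $result, $cref ) = wrapper( 'a', 'b', 'c' );
print "$result\n";               # 3, the return value of the call
print ref( $cref ), "\n";        # CODE
print $cref->( 1, 2 ), "\n";     # 2, calling through the reference
```

Only the second form gives you something you can store and call later.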
print "${$sref}\n";
my $len = @{$aref}; # the whole array
$aref->[0] = 15; # an element
my $keyscnt = keys %{$href}; # the whole hash
$href->{'c'} = 3; # an element
These are basic examples of using the different kinds of references.
The curly braces are not the part of the syntax that causes the dereference. The sigil performs the dereference. The curly braces serve two purposes. They make the dereference somewhat easier to read and they specify what is being dereferenced. If the reference is stored in a more complicated structure than a single scalar, this becomes much more important.
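A short sketch of why the braces matter once the reference lives inside a larger structure. The %config data is invented for the example.

```perl
use strict;
use warnings;

my %config = ( ports => [ 80, 443, 8080 ] );   # made-up sample data
my $href   = \%config;

# The braces say exactly what is being dereferenced:
# the array reference stored under the 'ports' key.
my @ports = @{ $href->{ports} };
my $count = @{ $href->{ports} };

# For a plain scalar holding the reference, the braces are optional.
my $aref  = $href->{ports};
my $first = ${$aref}[0];     # same element as $aref->[0]
my $same  = $$aref[0];       # braces dropped, same meaning

print "$count ports, first is $first\n";
```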
perldoc perlreftut
perldoc perlref
perldoc perllol
perldoc perldsc
The standard Perl documentation has a lot of good material on references. I would definitely recommend starting with perlreftut.
Subroutines are created with the sub operator.
The sub operator returns a coderef.
Arguments are passed in the @_ array.
Values are returned with return.
Now we're back to the main topic. There's really not much to say about creating subroutines.
To be precise, the sub operator returns a coderef when it is called to create an anonymous subroutine. If it is called with a subroutine name, it does not.
sub sum
{
my $sum = shift;
$sum += $_ foreach @_;
return $sum;
}
my $sumref = \&sum;
print $sumref->( 1..6 ), "\n";
This is how you make a reference to a named subroutine. Then, we call it.
my $sumref = sub
{
my $sum = shift;
$sum += $_ foreach @_;
return $sum;
};
print $sumref->( 1..6 ), "\n";
You can also use sub as an operator to create an anonymous subroutine. It's interesting to note that a named subroutine is defined at compile time, while this form is evaluated at run time, which is when the coderef is created.
You can also call through a coderef like this:
&{$cref}( 1..6 );
I really don't like this style, but it's here for completeness.
Why use coderefs?
If you've never seen coderefs or function pointers before it is quite easy to underestimate how useful they are.
sort, map, grep
People are often surprised to find they have already been using coderefs with the list operators. If you've used Perl/Tk, you'll probably already be familiar with the callback mechanism. (I'm sure most of the UI frameworks do something similar.) The last use just extends the first idea, but few people stumble onto it themselves.
@list = sort { $b <=> $a } @list;
sub backwards { return $b cmp $a; }
@list = sort backwards @list;
The code in the curlies is actually a coderef. Alternatively, you can use the name of a subroutine. This subroutine is special because it takes its parameters in $a and $b instead of @_. These are package variables, which can be a bit of a surprise under some circumstances.
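As a rough illustration of the list operators taking code, here is a sketch with made-up data. It also shows that sort accepts a scalar holding a coderef in place of the block or subroutine name.

```perl
use strict;
use warnings;

my @list = ( 3, 10, 1, 25 );   # sample data for the illustration

# sort accepts a scalar holding a coderef in place of the block.
# The comparison still reads the package variables $a and $b.
my $descending = sub { $b <=> $a };
my @sorted = sort $descending @list;

# map and grep take a block in the same position; they use $_ instead.
my @doubled = map  { $_ * 2 } @list;
my @big     = grep { $_ > 5 } @list;

print "@sorted\n";    # 25 10 3 1
print "@doubled\n";   # 6 20 2 50
print "@big\n";       # 10 25
```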
my $quit_btn = $window->Button(
-text => "Quit",
-command => sub { exit( 0 ); }
);
Supplying a callback to be executed when the button is clicked.
How do you do things generically with complex data structures?
Let's do this with a concrete example: a hierarchical filesystem
One of the problems with complex data structures is access. The more complicated the data structure, the more work you need to do to get to the part of the data you want. Accessing all of the data in the structure may require a significant amount of code.
The file system is not really very complicated, but it requires more work than a list or hash to access. If you've ever tried to write code to manipulate files and directories, you are probably aware of many places where things go wrong.
sub find_large_files
{
my ($dir, $size) = @_;
my @files = grep { -s $_ > $size } $dir->files();
foreach my $d ($dir->dirs())
{
push @files, find_large_files( $d, $size );
}
return @files;
}
In this case, I decided to process all of the files first, then the subdirectories. I could also have done the subdirectories first, or mixed the two.
Since I moved the recursion to the bottom, we could have actually used an iterative solution fairly easily, but that would have been a little harder to grasp.
sub find_small_files
{
my ($dir, $size) = @_;
my @files = grep { -s $_ < $size } $dir->files();
foreach my $d ($dir->dirs())
{
push @files, find_small_files( $d, $size );
}
return @files;
}
Notice the two changes that would need to be made to each copy: the comparison in the grep, and the subroutine name in the recursive call. How often do you think the second would be forgotten?
Notice how much of the code is devoted to walking the container vs. the actual work we wanted to do.
With most of the code devoted to walking the container, the code we would need to copy would mostly be boilerplate, requiring no real thought or creativity.
sub grep_files
{
my ($pred, $dir) = @_;
my @files = grep { $pred->( $_ ) } $dir->files();
foreach my $d ($dir->dirs())
{
push @files, grep_files( $pred, $d );
}
return @files;
}
Except for the coderef and the grep call, this is basically identical to the other solutions. The coderef is called a predicate because it tests one parameter and returns true or false.
my @large_files =
grep_files( sub { -s $_[0] > $bigsize }, $dir );
my @small_files =
grep_files( sub { -s $_[0] < $smallsize }, $dir );
Calling the subroutine with the appropriate predicates. With a little more magic we could avoid adding the sub keyword, but I'm not going to go there now.
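To try grep_files without touching a real disk, here is a self-contained sketch. The MemDir class is a hypothetical in-memory stand-in for whatever directory object provides files() and dirs(). Each "file" is a hash with a name and a size, so the predicate tests a size field instead of using -s.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for the directory object used above.
package MemDir;

sub new
{
    my ( $class, %args ) = @_;
    my $self = {
        files => $args{files} || [],
        dirs  => $args{dirs}  || [],
    };
    return bless $self, $class;
}
sub files { my $self = shift; return @{ $self->{files} }; }
sub dirs  { my $self = shift; return @{ $self->{dirs} }; }

package main;

sub grep_files
{
    my ( $pred, $dir ) = @_;
    my @files = grep { $pred->( $_ ) } $dir->files();
    foreach my $d ( $dir->dirs() )
    {
        push @files, grep_files( $pred, $d );
    }
    return @files;
}

# A tiny made-up tree: two files at the top, one in a subdirectory.
my $tree = MemDir->new(
    files => [
        { name => 'a.txt', size => 10 },
        { name => 'b.iso', size => 9000 },
    ],
    dirs => [
        MemDir->new( files => [ { name => 'c.log', size => 5000 } ] ),
    ],
);

my $bigsize = 1000;
my @large = map { $_->{name} }
    grep_files( sub { $_[0]{size} > $bigsize }, $tree );
print "@large\n";   # b.iso c.log
```

The walking code never changes. Only the predicate passed in decides what is kept.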
For some reason, people seem to have a harder time coming to grips with this idea.
foreach my $cref (@code)
{
@data = $cref->( @data );
}
Notice that we are looping over the code routines not over the data. This can be especially useful if the routines you are building up come from some sort of configuration or user input.
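A minimal sketch of the loop above with something to run. The three steps in @code are arbitrary choices for the illustration; in practice they might be selected by configuration or user input.

```perl
use strict;
use warnings;

# Each step takes a list and returns a list.
my @code = (
    sub { return grep { $_ % 2 == 0 } @_; },   # keep even numbers
    sub { return map  { $_ * 10 }     @_; },   # scale each value
    sub { return reverse @_; },                # reverse the order
);

my @data = ( 1 .. 6 );
foreach my $cref (@code)
{
    @data = $cref->( @data );
}
print "@data\n";   # 60 40 20
```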
Part of a four-function calculator.
my %operations = (
'+' => sub { return $_[0] + $_[1]; },
'-' => sub { return $_[0] - $_[1]; },
'*' => sub { return $_[0] * $_[1]; },
'/' => sub { return $_[0] / $_[1]; },
'q' => sub { exit( 0 ); },
);
You could actually map any strings. Also good for language-based interfaces.
This anonymous sub approach works best when the calls fit on a single line. I would not recommend doing this with long subroutines. If you need to do more work, write a real subroutine elsewhere and use the \ operator to take its reference here.
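Here is one way the table might be used, as a sketch. The calculate driver and the divide subroutine are hypothetical additions; divide stands in for an operation that is too long for a one-line anonymous sub.

```perl
use strict;
use warnings;

# A longer operation is written as a named subroutine...
sub divide
{
    my ( $x, $y ) = @_;
    die "Division by zero\n" if $y == 0;
    return $x / $y;
}

# ...and referenced from the table with \&.
my %operations = (
    '+' => sub { return $_[0] + $_[1]; },
    '-' => sub { return $_[0] - $_[1]; },
    '*' => sub { return $_[0] * $_[1]; },
    '/' => \&divide,
);

# Hypothetical driver: look up the operator, then call through the coderef.
sub calculate
{
    my ( $op, $x, $y ) = @_;
    my $code = $operations{$op}
        or die "Unknown operator '$op'\n";
    return $code->( $x, $y );
}

print calculate( '+', 2, 3 ), "\n";   # 5
print calculate( '/', 8, 2 ), "\n";   # 4
```

Adding a new operation means adding one entry to the hash. The driver does not change.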
my $obj = {
'name' => 'Fred',
'age' => 37,
'command' => sub { return $_[0] * 2; },
};
It is hard to come up with a generic example of this. I have used it several times, so it's funny that I can't come up with a good example. A good use would be when the subroutine is needed to perform a minor transformation on some of the data before it can be used. This would be useful when this structure is being stored with a list of other similar structures. By providing this little sub we could, temporarily, massage the data into a form needed for processing, without losing its original nature.
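One possible illustration of that idea, with invented data: each record stores a length in its own unit and carries the coderef that converts it to a common unit. The field names are assumptions made for the sketch.

```perl
use strict;
use warnings;

# Made-up records: lengths stored in different units, each carrying
# the coderef that converts its own value to inches.
my @records = (
    { name => 'shelf', length => 6, to_inches => sub { return $_[0] * 12; } },  # feet
    { name => 'rope',  length => 2, to_inches => sub { return $_[0] * 36; } },  # yards
    { name => 'nail',  length => 3, to_inches => sub { return $_[0]; } },       # inches
);

foreach my $rec (@records)
{
    # The stored value is left alone; only the working copy is converted.
    my $inches = $rec->{to_inches}->( $rec->{length} );
    print "$rec->{name}: $inches in\n";
}
```

The processing loop does not need to know which unit a record uses, and the original value stays in the record.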
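# Collect matches in a separate array; pushing onto @files
# would modify the list while it is being iterated.
my @large_files;
foreach my $file (@files)
{
if($is_relative)
{
push @large_files, $file
if -s "$base_dir$file" > $size;
}
else
{
push @large_files, $file
if -s $file > $size;
}
}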
Notice that the condition will be tested for each item in the list. For a complex conditional, this could be expensive. Worse, if there are multiple conditions, the code quickly becomes really hard to understand.
my $test;
if($is_relative)
{
$test = sub { -s "$base_dir$_[0]" > $size ? ($_[0]) : () };
}
else
{
$test = sub { -s $_[0] > $size ? ($_[0]) : () };
}
# Collect matches in a separate array rather than pushing onto @files.
my @large_files;
foreach my $file (@files)
{
push @large_files, $test->( $file );
}
The code doesn't look much smaller. However, the conditions at the beginning are only executed once, not every time through the loop. We can also move this code elsewhere to make this code easier to maintain.
Lexical variables are effectively the same as my variables.
sub make_multiplier
{
my $factor = shift;
return sub { return shift() * $factor; };  # parentheses so * is read as multiplication
}
my $doubler = make_multiplier( 2 );
my $tripler = make_multiplier( 3 );
This seems to be the way most people introduce closures, not because it is particularly useful or realistic, but because it is easy to understand.
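Calling the two closures shows that each one keeps its own copy of $factor. This is a minimal sketch. It writes shift() with parentheses so that the * is unambiguously multiplication.

```perl
use strict;
use warnings;

sub make_multiplier
{
    my $factor = shift;
    # The anonymous sub captures this particular $factor.
    return sub { return shift() * $factor; };
}

my $doubler = make_multiplier( 2 );
my $tripler = make_multiplier( 3 );

print $doubler->( 21 ), "\n";   # 42
print $tripler->( 5 ), "\n";    # 15
print $doubler->( 5 ), "\n";    # 10, unaffected by creating $tripler
```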
Given: $obj holds an object we want to activate using its apply method.
my $apply_btn = $window->Button(
-text => "Apply",
-command => sub { $obj->apply(); }
);
One problem with the normal UI callback method is that it is hard to call methods on a separate object using a simple coderef. This problem is easy to solve by using a closure to capture a reference to the object and the method to call.
sub add { return $_[0] + $_[1]; }
sub subtract { return $_[0] - $_[1]; }
sub bind_second
{
my $coderef = shift;
my $second = shift;
return sub { return $coderef->( $_[0], $second ); };
}
my $inc = bind_second( \&add, 1 );
my $dec = bind_second( \&subtract, 1 );
Once again, not particularly useful, but it is easy to understand.
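For completeness, a sketch of calling the bound subroutines. Note that the bound value is always the second argument, so $dec computes x - 1, not 1 - x.

```perl
use strict;
use warnings;

sub add      { return $_[0] + $_[1]; }
sub subtract { return $_[0] - $_[1]; }

sub bind_second
{
    my $coderef = shift;
    my $second  = shift;
    # The closure remembers both the coderef and the fixed second argument.
    return sub { return $coderef->( $_[0], $second ); };
}

my $inc = bind_second( \&add,      1 );
my $dec = bind_second( \&subtract, 1 );

print $inc->( 41 ), "\n";   # 42
print $dec->( 43 ), "\n";   # 42
print $dec->( 10 ), "\n";   # 9
```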
sub filesize_test
{
my $pred = shift;
return sub { $pred->( -s $_[0] ) };
}
my @small_files =
grep_files( filesize_test( sub { $_[0] < $max } ), $dir );
my @large_files =
grep_files( filesize_test( sub { $min < $_[0] } ), $dir );
This is pretty complicated, but you can walk through it fairly easily. As you become used to this technique, reading these kinds of expressions gets easier.
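To walk through the composition, here is a sketch that uses plain grep on a list of paths instead of grep_files, so it can run on its own. The temporary files and their sizes are made up for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );

sub filesize_test
{
    my $pred = shift;
    # Turns a predicate on sizes into a predicate on file names.
    return sub { $pred->( -s $_[0] ) };
}

# Create two throwaway files of known size (names are arbitrary).
my $tmp   = tempdir( CLEANUP => 1 );
my %sizes = ( 'small.dat' => 10, 'large.dat' => 5000 );
foreach my $name ( keys %sizes )
{
    open my $fh, '>', "$tmp/$name" or die "Cannot write $tmp/$name: $!";
    print {$fh} 'x' x $sizes{$name};
    close $fh or die "Cannot close $tmp/$name: $!";
}
my @paths = map { "$tmp/$_" } sort keys %sizes;

my $max      = 100;
my $is_small = filesize_test( sub { $_[0] < $max } );   # inner sub: size -> true/false
my @small    = grep { $is_small->( $_ ) } @paths;       # outer sub: path -> true/false
print "@small\n";
```

Reading from the inside out: the inner sub knows only about numbers, filesize_test supplies the -s, and the caller supplies the file names.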