So,
each
,
keys
, and
values
all use the same iterator under the covers. I knew this and it is the reason I almost never use
each
. There is just too much chance of each leaving the iterator partway through the hash. For instance, the following code is fine:
while (my ($key, $value) = each %hash) {
do_something($key, $value);
}
But this code has a subtle bug just waiting to bite you:
eval {
while (my ($key, $value) = each %hash) {
do_something($key, $value);
}
1;
} or do {
die $@ unless $@ eq "We're all fine here now, thank you. How are you?\n";
}
If the
eval
block
die
s with the acceptable message then the code will continue on with a borked iterator. A more common mistake is the use of
last
in a loop using
each
:
while (my ($key, $value) = each %hash) {
last unless do_something($key, $value);
}
Adding or removing items from the hash while using
each
can also bite you. So, all of those little problems means I tend to write the loops above like this:
for my $key (keys %hash) {
my $value = $hash{$key};
do_something($key, $value);
}
All of this means I only dust off
each
and try to remember all of its issues when I know the number of keys in the hash (or the size of the keys themselves) is going to be huge in relation to memory (which generally means it is a
tie
d dbm). However, it just struck me today that this behavior has can be used for good. I want to loop over a bunch of hash entries checking to see if the are all equal to each other. Now I could say:
my ($first, @others) = values %hash;
die "bad" if grep { $first ne $_ } @others;
but I could also say:
my ($key, $value) = each %hash;
die "bad" if grep { $value ne $_ } values %hash;
And the second bit of code is between twice as fast (tiny hashes) and five times as fast (mid-size and up):
#!/usr/bin/perl
use strict;
use warnings;
use Carp;
use Benchmark;
sub benchmark {
my $subs = shift;
my %results;
for my $sub (keys %$subs) {
$results{$sub} = $subs->{$sub}->();
}
my ($k, $v) = each %results;
croak "bad" if grep { $v ne $_ } values %results;
Benchmark::cmpthese -1, $subs;
}
for my $n (10, 100, 1_000, 10_000) {
my %h = map { $_ => $_ } 1 .. $n;
print "for $n:\n";
benchmark {
values => sub {
my ($first, @others) = values %h;
return join "", $first, @others;
},
each => sub {
my ($k, $v) = each %h;
return join "", values %h;
},
};
}
And yes, the reason I started thinking about this is the bit in the benchmark function. Of course, now that I am looking at it with a critical eye, I see it should be
my ($k, $sub) = each %$subs;
my $value = $sub->();
croak "bad" if first { $value ne $_->() } values %$subs
You can use each in scalar context – it will return just the key. No need for the throw-away $value declaration.
ReplyDeleteAlso, keys in void context will reset the iterator without doing anything else; this is documented behaviour. Therefore you can safely use each in loops that won’t ever call any code that’s not under your control – eg. simple construction of a string from the data in a hash.