Background: I’ve been working on a suite of tools for the past few months to help with some config automation, and one of the config files allows you to put in a range (e.g. 1-100, or even 0200-0399). The tool will take the range, look for running jobs with an ID in that range, then do something with them. In some cases, we do have jobs whose ID starts with a zero (or multiple zeroes), so the leading zero can be important. This is where I ran into some interesting discoveries!
Okay, so, to make it easy to iterate over the range, I was using the range operator in Perl, which is the little “..” thing (or dotdot, or dot-dot, or whatever you might call it). For example:
    for (1 .. 100) {
        printf("%s\n", $_);
    }
.. or, being a little more realistic:
    my $start = 1;
    my $end = 100;
    for my $i ($start .. $end) {
        printf("%s\n", $i);
    }
Run those, and you’ll get all the numbers from 1 to 100. Fantastic. Want to get even more crazy? How about letters. Yeah, it does those too. Sweet.
In this example, we’ll get everything from a to z:
    my $start = "a";
    my $end = "z";
    for my $i ($start .. $end) {
        printf("%s\n", $i);
    }
Hrmm, what about if we want the range from a to zzzzz? Yup, it does that too:
    my $start = "a";
    my $end = "zzzzz";
    for my $i ($start .. $end) {
        printf("%s\n", $i);
    }
Of course, I’m not giving you the output, because that would be insane.. but it works. Nice!
There’s actually a bunch more you can do with it, too. There’s even a "..." (dot-dot-dot) variant, which as far as I can tell from the docs only behaves differently from ".." when it’s used as a flip-flop in scalar context. It is all documented here:
http://perldoc.perl.org/perlop.html#Range-Operators.
But that’s not what this post is about. It’s about something interesting I ran into. Still interested? Good. Keep reading.
So anyway, in the case of my config parser, I would parse out ranges as tokens (e.g. 0200-0300), split them, and do a search across the range. For example:
    # $range = "0200-0300"
    my ($start, $end) = $range =~ /^(\d+)\-(\d+)$/;
    for my $id ($start .. $end) {
        if (found_job_id($id)) {
            do_something_with($id);
        }
    }
This basically allows us to lazily put large ranges into the config, but requires us to be more specific when creating jobs with new IDs (to ensure they fall within our range). In any case, this works exactly as we’d expect, leading zeroes and all.
Now, as part of error checking, I decided I should make sure that the $start value was not greater than the $end value, since a numeric range whose $start is greater than its $end does not count backwards. It just returns an empty list. This is documented behaviour and not an error, so I wanted to throw an error myself.
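Here’s a quick sketch of that empty-list behaviour. The numbers are made up for illustration and have nothing to do with my actual config:

```perl
use strict;
use warnings;

my @forward  = (1 .. 5);
my @backward = (5 .. 1);   # start greater than end: empty list, no warning

printf("forward has %d items, backward has %d\n",
    scalar(@forward), scalar(@backward));
```

No warning, no error, just nothing to loop over, which is why an explicit check seemed like a good idea.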
To do so, I just modified my code to implement a simple comparison check:
    # $range = "0200-0300"
    my ($start, $end) = $range =~ /^(\d+)\-(\d+)$/;
    if ($start > $end) {
        throw_error_and_exit("Start is greater than end! Fix your range, buddy.");
    }
    for my $id ($start .. $end) {
        if (found_job_id($id)) {
            do_something_with($id);
        }
    }
The change seemed harmless enough, but when I ran the config parser again to make sure things worked, I got nothing. Literally. Nothing. It appeared as if everything broke. Weird. No errors, the range is valid (I was using 0100-0200), but, um, what?
After scratching my head for about 20 seconds, I went, “aha!” Yes, just like that.
Sure enough, when I parse out my range from the config, the values I get back are strings (which makes sense), so in the case of 0200-0300, the range operator is actually working on the string value of each, using what the Perl documentation calls the “magical auto-increment algorithm”. Per perlop, that algorithm applies to strings matching /^[a-zA-Z]*[0-9]*\z/ and increments each character within its own range (digits stay digits, letters stay letters), with carry. That’s why the leading zeroes survive.
In any case, AFTER I added my new comparison operation (to check if $start was greater than $end), Perl had to work out a numeric value for each of my variables (which were strings) in order to compare them, and it keeps that numeric value around on the variable. From then on the range operator treats them as numbers and counts 200, 201, 202 and so on, therefore dropping the leading zeroes! Here’s something a little more graphic:
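If you want to see the magical increment on its own, here’s a minimal sketch. The values are made up for illustration, they aren’t from my config:

```perl
use strict;
use warnings;

# Two string operands with leading zeroes: the range is built with the
# magical string increment, so the padding is preserved.
my @padded = ("0098" .. "0102");
print "@padded\n";    # 0098 0099 0100 0101 0102

# The same increment on a plain scalar, carrying from one letter to the next.
my $label = "az";
$label++;
print "$label\n";     # ba
```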
    # $range = "0200-0300"
    my ($start, $end) = $range =~ /^(\d+)\-(\d+)$/;    # <--- STRINGS
    if ($start > $end) {                               # <--- OMG NOW THEY'RE INTEGERS
        throw_error_and_exit("Start is greater than end! Fix that.");
    }
    for my $id ($start .. $end) {                      # <--- HEY WHERE'D THE LEADING ZERO GO?
        if (found_job_id($id)) {
            do_something_with($id);
        }
    }
This is one of those things with Perl where a mix of dynamic and weak typing leads to strings magically (and uncontrollably) being treated as integers where needed. In most cases, having a language with both dynamic and weak typing on variables can be nice, but it can also lead to laziness and unexpected results if you’re not careful.
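To poke at this outside of the config parser, something like the following stripped-down sketch should reproduce it. The range "0098-0102" is just a small made-up example so the output stays short:

```perl
use strict;
use warnings;

my ($start, $end) = "0098-0102" =~ /^(\d+)-(\d+)$/;

# Before any numeric use, both captures are plain strings.
my @before = ($start .. $end);

# A numeric comparison makes Perl compute (and keep) a numeric value
# for each scalar.
my $reversed = $start > $end;

# The very same expression now counts numerically.
my @after = ($start .. $end);

print "before: @before\n";   # before: 0098 0099 0100 0101 0102
print "after:  @after\n";    # after:  98 99 100 101 102
```

Same variables, same expression, different result, and the only thing that happened in between was a comparison.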
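One way around it, as a rough sketch rather than what the real tool does, is to stop depending on how the variables were used earlier: always count numerically and pad the IDs back out with sprintf. The expand_range helper and the rule of padding to the width of the start token are my own assumptions for this example:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original tool): expand "0098-0102"
# into zero-padded IDs, regardless of earlier numeric use of the captures.
sub expand_range {
    my ($range) = @_;
    my ($start, $end) = $range =~ /^(\d+)-(\d+)$/
        or die "Bad range: $range\n";
    die "Start is greater than end in range $range\n" if $start > $end;

    my $width = length($start);   # assumption: pad to the start token's width
    return map { sprintf("%0*d", $width, $_) } int($start) .. int($end);
}

my @ids = expand_range("0098-0102");
print "@ids\n";    # 0098 0099 0100 0101 0102
```

A range like 1-100 has a start token of width 1, so those IDs come out unpadded, which matches what the plain numeric range would give.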
Well, that’s all for today. Just thought I’d share that one.



