From Zero to 35 with Raku

[5] Published 7. April 2019.

Perl 6 → Raku

This article has been moved from «perl6.eu» and updated to reflect the language rename in 2019.

This is my response to the Perl Weekly Challenge #2.

Challenge #2.1

Write a script or one-liner to remove leading zeros from positive numbers.

Leading zeros means that the value is a string. All we have to do, is coerce it to a number. The shortest way to do that is using the + (Numeric) prefix.

As a one liner (with single or double quotes):

$ raku -e 'say +@*ARGS[0]' 0000012
12

$ raku -e "say +@*ARGS[0]" 0000000009
9

And as a script:

File: sans-zero

say +@*ARGS[0];

This will choke on non-numeric input, and removes leading zeroes from all numeric values.

We can write it as a more traditional looking progam with a MAIN procedure, and add the Numeric type constraint on the input value. The type constraint doesn't exclude leading zeroes, as the input from the command line is of the IntStr type (if it looks like an integer) or the RatStr type (if it looks like a rational number).

This will allow any numeric value, and removes leading zeroes:

File: sans-zero2

sub MAIN (Numeric $number)
{
  # say $number.WHAT;
  say +$number;
}

See docs.raku.org/language/create-cli#index-entry-MAIN for more information about the special MAIN procedure.

Uncomment the first say line, and run the program to see the types:

$ raku sans-zero2 12
(IntStr)
12

$ raku sans-zero2 12.1
(RatStr)
12.1

The reason that IntStr and some others exist is that Raku gets the values from the command line without any quotes, as the shell removes them. In Raku "12" is a string, whereas 12 is an integer. When we get the value from the command line (without the quotes), we must be able to treat it as both - without an error. Hence the special type.

The task was to remove the leading zeroes from positive numbers only, so we'll fix that:

File: sans-zero3

multi sub MAIN (Numeric $number where $number >= 0)
{
 say +$number;
}

multi sub MAIN ($value)
{
 say $value;
}

Positive numbers with leading zeroes, are stripped of leading zeroes. The challenge didn't say what to do with other values, so in this version they are printed unchanged.

Note that multi sub gives us multiple dispatch, where we set up procedures with the same name, but with different signatures. The version to use is decided when called, depending on the arguments.

See docs.raku.org/syntax/multi for more information about multi sub.

Note the where clause on the type. It excludes negative values. (I have chosen to consider zero as a positive value, so that any leading zeros are removed from it as well.)

See docs.raku.org/type/Signature for more information about where clauses.

Challenge #2.2

Write a script that can convert integers to and from a base35 representation, using the characters 0-9 and A-Y.

Raku has built in support for this, so this is easy(ish):

Base 35 to Base 10

We can specify a number in any base we want (between 2 and 36, both included) with the :<value> syntax:

Try it out in REPL:

$ raku
> :2<1111>
15
> :16<FF>
255

As a program:

File: base35to10

sub MAIN (Str $number)
{
  say :35($number);
}

Running it:

$ base35to10 12X
1328

$ ./base35to10 99x
11373

See the top of docs.raku.org/type/Int for more information about :<value>.

We can also use the parse-base method:

File: base35to10-2

sub MAIN (Str $number)
{
  say $number.parse-base: 35;
}

See docs.raku.org/routine/parse-base for more information about parse-base.

The built in support recognizes both upper- and lower case letters. The challenge stated upper case letters only, but I'll disregard that for now.

Base 10 to Base 35

The challenge stated «to and from». So we'll do the other direction, with the base method:

File: base10to35

sub MAIN (Str $number)
{
  say $number.base: 35;
}

Running it:

$ base10to35 11373
99X

The value round trips. base uses upper case letters, so we are good.

See docs.raku.org/routine/base for more information about base.

Combined

The challenge actually said «write a script that...», and I have written two. So I'll merge them.

This program assumes base 10 if there are digits only in the given value, and base 35 otherwise:

File: base1035

multi sub MAIN (Int $number where $number !~~ /\D/)
{
  ay $number.base: 35;
}

multi sub MAIN (Str $number)
{
  say $number.parse-base: 35;
}

The problem is that a sequence of digits can be a valid base 35 value. The next version fixes that, by allowing the use of the «--base=35» flag to force the usage of base 35:

File: base1035-2

multi sub MAIN (Int $number where $number !~~ /\D/, :$base is copy = 10)
{
  $base == 35
   ?? say $number.parse-base: $base
   !! say $number.base: $base;
}

multi sub MAIN (Str $number)
{
  say $number.parse-base: 35;
}

It is unneccessary to have both «multi»s handle base 35. We can rewrite it, and fix the problem of allowing other values than 10 and 35 as base at the same time:

File: base1035-3

subset Base10or35 of Int where { $_ == 10 || $_ == 35 };

sub MAIN (Str $value, Base10or35 :$base is copy = 10)
{
  $base = 35 if $value ~~ /\D/;

  $base == 35
   ?? say $value.parse-base: $base  # base 35 -> 10
   !! say $value.base: $base        # base 10 -> 35
}

See docs.raku.org/language/typesystem#subset for more information about subset.

I should address the problem that we allow lower case letters in the base 35 values. This version fixes that:

File: base1035-4

subset BaseValue of Str where { /^<[ 0 .. 9, A .. Y ]>+$/ }
subset Base10or35 of Int where { $_ == 10 || $_ == 35 }

sub MAIN (BaseValue $value, Base10or35 :$base is copy = 10)
{
  $base = 35 if $value ~~ /\D/;

  $base == 35
   ?? say $value.parse-base: $base  # base 35 -> 10
   !! say $value.base: $base        # base 10 -> 35
}

Even Higher (Bonus)

The built in support in Raku stops at base 36, as we only have ten digits and 26 letters to choose from. But what if we want to go higher? Let's say to base 42, also known as Duoquadragesimal.

The first thing we'd have to, is define the characters to use after the initial 36 ones (0 .. 9, A .. Z). (I have chosen to support upper case letters only, so I could have used the lower case letters as the next characters. But that would be confusing, right?) I am from Norway, so using the additional Norwegian letters «Æ», «Ø» and «Å» seemed natural. I needed three more, and chose the Swedish letters «Ä» and «Ö» and the German «ß» (which I know is a lower case letter).

The module uses base 42 unless told otherwise.

To base 42

File: lib/BaseX.pm6 (partial)

unit module BaseX;

subset LegalBase of Int where { 2 <= $_ <= 42 };

my @digits = (0 .. 9, "A" .. "Z", "Æ", "Ø", "Å", "Ä", "Ö", "ß").flat;

our sub to-baseX (Int $value is copy, LegalBase :$base = 42)
{
  my @result;
  while ($value)
  {
    @result.push($value % $base);   # The remainder 
    $value = $value div $base;      # Integer division
  }
  return @result.reverse.map( { @digits[$_] } ).join;
}

File: baseX (partial)

use lib "lib";

use BaseX;

say BaseX::to-baseX(1234567890);       # -> ÄSEZF
say BaseX::to-baseX(256, base => 16);  # -> 100
say BaseX::to-baseX(255, base => 2);   # -> 11111111

A walk-through may help, starting with the value 123456789:

123456789 / 42 = 2939447 * 42 + 15
2939447 / 42 = 69986 * 42 + 35
69986 / 42 = 1666 * 42 + 14
1666 / 42 = 39 * 42 + 28
39 / 42 = 0 * 42 + 39

This gives us the @result array with the values in reverse order: 15,35,14,28,39.

We use map to convert from the values (in correct order: 39,28,14,35,15) to the digits ÄSEZF. Try it in REPL, to make sure:

$ raku
> my @digits = (0 .. 9, "A" .. "Z", "Æ", "Ø", "Å", "Ä", "Ö", "ß").flat;
> say @digits[39,28,14,35,15];
(Ä S E Z F)

Too much code, perhaps? It is possible to do the conversion much shorter, with polymod:

File: lib/BaseX.rakumod (partial)

our sub to-baseX2 (Int $value is copy, LegalBase :$base = 42)
{
  return $value.polymod($base xx 50).reverse.map( { @digits[$_] } ).join;
}

See docs.raku.org/routine/polymod for more information about polymod.

This gives us a lot of leading zeroes (the total number of "digits" is 50 - a number we get from xx 50):

0000000000000000000000000000000000000000000000ÄSEZF
000000000000000000000000000000000000000000000000100
000000000000000000000000000000000000000000011111111

The leading zeroes must be removed, and we cannot use + as the value isn't numeric. A Regex does the trick:

File: lib/BaseX.pm6 (partial)

our sub to-baseX3 (Int $value is copy, LegalBase :$base = 42)
{
  $value.polymod($base xx 50).reverse.map( { @digits[$_] } ).join ~~ /^0*(.*)/;
  return $0.Str;
}

$0 is a match object and not a string, so we have to stringify it manually.

Sadly it isn't a one liner any more...

Excercise: What happens if we specify a number that has more than 50 digits after conversion?

The array slice I used to explain the values in REPL can replace the map:

File: lib/BaseX.pm6 (partial)

our sub to-baseX4 (Int $value is copy, LegalBase :$base = 42)
{
  return (@digits[$value.polymod($base xx 50).reverse].join ~~ /^0*(.*)/)[0].Str;
}

We can use an index on the match object (which is returned by the regex, but usually not used directly), so we ended up with a one liner after all.

From base 42

File: lib/BaseX.pm6 (partial)

my %values = @digits.map( { $_ => $++ } );

our sub from-baseX (Str $value, LegalBase :$base = 42)
{
  my $return = 0;
  my $exp = 0;

  for $value.flip.comb -> $digit
  {
    $return += %values{$digit} * $base ** $exp++;
  }
  return $return;
}

flip flips a string (reverses it), and comb splits a string into a list of characters, one at a time.

File: baseX (partial)

use lib "lib";

use BaseX;

say BaseX::from-baseX("ÄSEZF");             # -> 123456789
say BaseX::from-baseX("FFFF", base => 16);  # -> 65535
say BaseX::from-baseX("1000", base => 2);   # -> 8

A walk-through may help this time as well, starting with the string "ÄSEZF". The values can be computed like this (in REPL):

> my @digits = (0 .. 9, "A" .. "Z", "Æ", "Ø", "Å", "Ä", "Ö", "ß").flat;
> my %values = @digits.map( { $_ => $++ } );
> say %values{"ÄSEZF".comb};  
(39 28 14 35 15)

"F" -> 15 * 42 ** 0 = 15
"Z" -> 35 * 42 ** 1 = 1470
"E" -> 14 * 42 ** 2 = 24696
"S" -> 28 * 42 ** 3 = 2074464
"Ä" -> 39 * 42 ** 4 = 121356144

Then we can check the sum in REPL:

> say sum(<15 1470 24696 2074464 121356144>);  # -> 123456789

We can shorten this one as well to a one liner with map and sum, even without a trick like polymod:

File: lib/BaseX.pm6 (partial)

our sub from-baseX2 (Str $value, LegalBase :$base = 42)
{
  return $value.flip.comb.map( {%values{$_} * $base ** $++ } ).sum;
}

$ is an Anonymous State Variable, starting with zero as value the first time it is used. I have used it so that I don't have to declare an explicit variable, as that would have messed up the one liner.

See docs.raku.org/syntax/$ for more information about $.

Links

About me
Keyword Index
Raku Courses
Beginning Raku (book)
RSS Feed
Reddit Comments
Code (zip)