Raku EU Edition

by Arne Sommer

Published 16. March 2019. Rakuified 29. February 2020

Perl 6 → Raku

This article has been moved from «perl6.eu» and updated to reflect the language rename in 2019.

Let us assume that the EU decides to demand the usage of € instead of $ as the sigil for scalars in Raku programs funded by Brussels.

Or if that sounds too bureaucratically far-fetched, blame my need to pay homage to the fact that the name of this web site ends with «eu». [Update: The old site was «perl6.eu», before the language rename Perl 6 → Raku in 2019.]

Source Filters and Preprocessors

Raku doesn't have source filters (as Perl does) yet, nor a preprocessor (as C does).

EVAL

I'll ignore Grammars for now, and start programming. We can write a sort of preprocessor with EVAL:

File: raku-eu
#! /usr/bin/env raku

use MONKEY-SEE-NO-EVAL;       # [1]

EVAL slurp.trans('€'=> '$');  # [2]

[1] We must specify use MONKEY-SEE-NO-EVAL in order to use EVAL, as it is considered dangerous. Now the programmer assures the compiler that he (hopefully) knows what he is doing.

[2] We read the whole program (with slurp), replacing all the «€» characters in the source code with «$», before passing the code on to EVAL for execution.
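
The two steps can also be tried in isolation. The following is a minimal sketch that works on a string instead of a file; the variable names ($source, $translated, $result) are mine and not part of «raku-eu»:

```raku
use MONKEY-SEE-NO-EVAL;

# A tiny "program" held in a string; single quotes keep it from being interpolated.
my $source = 'my €a = 12; €a * 2';

# Swap every «€» for «$», as «raku-eu» does with the slurped file.
my $translated = $source.trans('€' => '$');
say $translated;          # my $a = 12; $a * 2

# EVAL compiles and runs the string, returning the value of the last statement.
my $result = EVAL $translated;
say $result;              # 24
```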

Here is a program using this modified version of Raku:

File: my-eu
#! /usr/bin/env raku-eu # [1]

my €a = 12;             # [2]

say €a;                 # [2]

[1] Note that we specify the «raku-eu» program here. If it is in the system path, we can simply execute the «my-eu» program like this:

$ ./my-eu
12

[2] Now we can use «€» instead of «$» as much as we like in the code.

See docs.raku.org/routine/EVAL for more information about EVAL.

Brexit Blues

The UK may have a problem with our program, so we should present them with a version that uses £ instead of € (and $):

File: raku-uk
#! /usr/bin/env raku

use MONKEY-SEE-NO-EVAL;

EVAL slurp.trans('£'=> '$');

«raku-uk» will fail if we try to pass it €-prefixed scalars, and that should satisfy hard-core Brexiters.

A program using this:

File: my-uk
#! /usr/bin/env raku-uk

my £a = 12;
say £a;

Command Line Arg(uments)!

«raku-eu» (and «raku-uk») only work for programs without arguments on the command line. Any command line arguments (after the first one, which is the program) are also treated as file names and gobbled up by the slurp function.

We can demonstrate this with a modified «hello» program:

File: hello-eu
#! /usr/bin/env raku-eu

sub MAIN (€name)
{
  say "Hello, €name!"
}

Executing it doesn't go well:

$ ./hello-eu Hans
Failed to open file /home/raku/code/raku-eu/Hans: No such file or directory
in block <unit> at /usr/local/bin/raku-eu line 5
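
The failure follows from how an argument-less slurp finds its input: it reads from $*ARGFILES, which treats every element of @*ARGS as a file name. The sketch below imitates that behaviour with IO::CatHandle (the class that $*ARGFILES is built on) and two temporary files. The file names are made up for the demonstration:

```raku
# Two stand-in files: the "program" and a file named by a further argument.
my $program = $*TMPDIR.add("argfiles-demo-{$*PID}-program");
my $extra   = $*TMPDIR.add("argfiles-demo-{$*PID}-extra");
$program.spurt: "say 'hello';\n";
$extra.spurt:   "not Raku code\n";

# Reading through one concatenating handle glues the contents together,
# which is what happens to all the names in @*ARGS.
my $all = IO::CatHandle.new($program, $extra).slurp;
print $all;

unlink $program, $extra;
```

Here both files exist, so the contents are simply joined. In the «hello-eu» case there is no file called «Hans», hence the «Failed to open file» error.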

The only way to sort this out is to execute a real program. We can do that by writing the translated program to a temporary file, executing that file, and deleting it afterwards.

File: raku-eu2
#! /usr/bin/env raku

my $code = @*ARGS.shift.IO.slurp.trans('€'=> '$'); # [1]

my $tmp  = $*TMPDIR.add('raku-eu.tmp');            # [2]

$tmp.IO.spurt: $code;                              # [3]

run "raku", $tmp, @*ARGS;                         # [4]

unlink $tmp;                                       # [5]

[1] The first argument is the program. We remove it from the argument list (with @*ARGS.shift), read the file (with slurp) and translate (with trans) the «€» characters to «$».

[2] We use $*TMPDIR to get the location (name) of the directory set aside by the operating system for temporary files.

[3] Then we put the modified (pure Raku) program there.

[4] Running the program, with the rest of the arguments.

[5] And finally removing the temporary file.

Running this version works:

$ raku-eu2 hello-eu Hans
Hello, Hans!
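
The same round trip can be sketched as a self-contained script, which may be handy for experimenting. This is not «raku-eu2» itself: the program is an inline string, the temporary file name is made up, and I use $*EXECUTABLE rather than a «raku» found in the path, so that the child runs under the same Raku as the parent:

```raku
# The "program": a one-liner using the «€» sigil, translated up front.
my $code = 'sub MAIN (€name) { say "Hello, €name!" }'.trans('€' => '$');

# A temporary file with a made-up, process-specific name.
my $tmp = $*TMPDIR.add("raku-eu-demo-{$*PID}.raku");
$tmp.spurt: $code;

# Run it with the current Raku executable, capturing standard output.
my $proc   = run $*EXECUTABLE.absolute, $tmp.absolute, 'Hans', :out;
my $output = $proc.out.slurp(:close);
print $output;            # Hello, Hans!

unlink $tmp;
```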

I got rid of the potentially dangerous EVAL, but writing modified code to disk and executing it has exactly the same effect.

Error Handling

If we name the temporary file after the program, the name will include a path if the program was specified with a path (e.g. /home/raku/bin/raku-eu/hello-eu Hans). That will cause the program to fail if those directories are missing in the temporary file location, as they surely will be.

We can fix this by replacing the directory separator with a dash:

File: raku-eu3 (partial)
my $tmp = $*TMPDIR.add($cmd.trans('/' => '-'));

Note that this will not work on Windows, as the directory separator character is a backslash there. We could use $*SPEC.dir-sep to get the separator character, as that gives «\» on Windows and «/» everywhere else.
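
A minimal sketch of that portable variant follows. The path is made up, and it is built with the platform's own separator so that the example behaves the same everywhere:

```raku
my $sep  = $*SPEC.dir-sep;                       # «/» or «\», depending on the platform
my $cmd  = <home raku bin hello-eu>.join($sep);  # a made-up path using that separator
my $flat = $cmd.trans($sep => '-');              # flatten it into a single file name
say $flat;                                       # home-raku-bin-hello-eu
```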

The temporary file is not removed if we terminate the program before it reaches the unlink statement. Adding an END phaser doesn't work, so we hook it into the «SIGINT» signal handler (caused by pressing «Control-C»). It doesn't catch runtime errors, but is the best I can do.

File: raku-eu3
unit sub MAIN ($cmd, *@args);

my $code = $cmd.IO.slurp.trans('€'=> '$');

my $tmp  = $*TMPDIR.add($cmd.trans('/'=> '-'));

signal(SIGINT).tap({ unlink $tmp });
  
$tmp.IO.spurt: $code;

run "raku", $tmp, @args;

unlink $tmp;

File: test-eu3
#! /usr/bin/env raku-eu3

my €name = prompt "What is your name: ";

say "Hello, €name!";

Pressing «Control-C» at the prompt will terminate the program, but the temporary file is removed.

The UK can get a «£»-version of this program, if they pay for it. Pound Sterling is acceptable.

Caveat

The problem with this simplistic approach is that every occurrence of the «€» character in the file will be replaced, even if placed inside a string where it really is meant as the Euro currency symbol.

And the program will fail if we specify a relative module location, e.g. use lib "lib";, as we run the code from a temporary file somewhere else in the file system. (We could fix that by placing the temporary file in the same directory as the original file, but that requires write access.)
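
The currency symbol problem is easy to reproduce. The sketch below shows the blunt translation next to a slightly more careful heuristic of my own, which only replaces a «€» that is directly followed by a letter or an underscore. It is an assumption-laden workaround, not a fix: it still rewrites text such as «€uro» inside a string, and it misses variables with twigils (e.g. €*ARGS).

```raku
my $source = 'my €price = 12; say "Price: €price €";';

# The plain translation also rewrites the currency symbol at the end of the string.
my $blunt = $source.trans('€' => '$');
say $blunt;    # my $price = 12; say "Price: $price $";

# Heuristic: only replace a «€» that is directly followed by a letter or underscore.
my $careful = $source.subst(/ '€' <?alpha> /, '$', :g);
say $careful;  # my $price = 12; say "Price: $price €";
```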

The rest of this article is a dead end that doesn't produce workable code, so feel free to stop reading.

Extending the Grammar

The Raku language is specified as a grammar, written in NQP (Not Quite Perl), and it should be doable to extend the grammar.

A search in the source code reveals line 349 in the Grammar.nqp file:

token sigil { <[$@%&]> }

We can add the missing «€» sigil to the list, recompile NQP and Rakudo, and hope for the best. It will fail in interesting ways, as we have added a way for the «€» character to be accepted by the parser - but it doesn't know what to do with it afterwards.
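
To see what such a token does on its own, here is a toy grammar. It is not Rakudo's grammar; the names ScalarName and identifier are made up. It recognises a sigil followed by a name, with «€» added to the character class:

```raku
# A toy grammar, NOT Rakudo's: it only recognises a sigil followed by a name.
grammar ScalarName {
    token TOP        { <sigil> <identifier> }
    token sigil      { <[$@%&€]> }          # the original character class plus «€»
    token identifier { <.alpha> \w* }
}

my $euro  = ScalarName.parse('€name');
my $pound = ScalarName.parse('£name');

say $euro<sigil>.Str;     # €
say $pound.defined;       # False
```

The parse succeeds for «€name», but that is all it does. Nothing here says what a «€» variable means, which is the part the real compiler would be missing as well.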

Further inspection of the code reveals that the sigil $ is hard-coded in a lot of places, so it would require duplicating all that code with € instead. It is certainly doable, but it would only be for my private version of Raku, and adding £ instead would require the whole change and compile sequence again.

Augmenting the Grammar

The article A Mutable Grammar For Raku, written in 2008, gave me a few ideas and I ended up with this code snippet:

File: raku-eu-grammar
use MONKEY-TYPING;
    
augment grammar Perl
{
  token sigil:sym<$> { '€' | '$' }
}

my $a = 12;      
say "The value is €a.";      

It is supposed to tell the internal Grammar that the scalar sigil (shown as sigil:sym<$>) can be specified as either «€» or «$».

We can use augment to modify existing grammars and classes, but have to add the use MONKEY-TYPING line to allow it.

See docs.raku.org/syntax/augment for more information about augment.
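
For comparison, this is a minimal sketch of augment applied to an ordinary class, where it does take effect. The method name euro-to-dollar is hypothetical:

```raku
use MONKEY-TYPING;

# Add a (hypothetical) method to the built-in Str class.
augment class Str {
    method euro-to-dollar { self.trans('€' => '$') }
}

my $translated = 'my €a = 12;'.euro-to-dollar;
say $translated;          # my $a = 12;
```

Back to the «raku-eu-grammar» program: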

I have used «$» in the declaration, and «€» in the output (inside a string), so that the program doesn't crash if the grammar extension doesn't work. And it doesn't:

$ ./raku-eu-grammar
The value is €a.  

The reason it doesn't work is probably that the grammar change isn't done before Raku has parsed the whole program, and by then it is too late.

We can try to wrap the grammar change with a Phaser. The first to be executed is the BEGIN phaser, but it kicks in after the program has been parsed, and by then Raku has already choked on the «€». So that doesn't help.

See docs.raku.org/language/phasers for more information about Phasers.

We need a Phaser that is triggered before the rest of the file is parsed, but that is a logical impossibility.

We can try moving the code into a module:

File: lib/ScalarEU.rakumod
use MONKEY-TYPING;

augment grammar Perl
{
  token sigil:sym<$> { '€' | '$' }
}

It should work fine without a namespace and EXPORT.

File: hello-eu-lib
use lib "lib";
    
use ScalarEU;

sub MAIN ($name)
{
  say "Hello, €name!";
}

It doesn't quite work:

$ ./hello-eu-lib Tom
Hello, €name!   

Slangs

It turns out that Raku has a mechanism for modifying Grammars at run time, through Slangs.

A Slang (probably short for «Sub Language») is a separate language. In Raku we have several of them; e.g. Raku itself, strings and Regexes. We can define a new Slang, and hook that into the Raku grammar.

Slangs aren't explained in the official documentation, because they are not even specified. They are a «work in progress» feature of Rakudo. But I found one article and a couple of modules.

That led to this code:

File: lib/ScalarEU2.rakumod
use nqp;
use MONKEY-TYPING;

unit module ScalarEU2;

sub EXPORT(|)
{
  my role Euscalar
  {
    token sigil:sym<$> { '€' | '$' }
  }

  my Mu $MAIN-grammar := nqp::atkey(%*LANG, 'MAIN');
  my $grammar := $MAIN-grammar.HOW.mixin($MAIN-grammar, Euscalar);

  $*LANG.define_slang('MAIN', $grammar, $*LANG.actions);

  {}
}

Changing «hello-eu-lib» (available as «hello-eu-lib2») to use this module gives the same result as when we used «ScalarEU».

Rakudo issue 2404 says that $*LANG is obsolete, and to use $?LANG instead. The REPL agrees:

> $*LANG
Dynamic variable $*LANG not found

> $?LANG
(low-level object `Perl6::Grammar`)

Changing %*LANG to %?LANG in the module gives a compilation error («Variable '%?LANG' is not declared»), so that didn't work out. Note that both $*LANG and $?LANG compile in a program (unlike in the REPL), but nothing happens either way.

Commercial Break

This article was written as part of the work on my coming course «Advanced Raku».