DIY Cryptography with Perl 6

Part 3:
Changing Keys

by Arne Sommer

DIY Cryptography with Perl 6

Part 3: Changing Keys

See also: The Introduction | Part 1: Base 36 | Part 2: Base 400.

Very long messages are vulnerable to Letter Frequency Analysis, even when we use Base 400 encryption, as the 10 possible codes for each character will be evenly distributed for large selections. We can avoid that (or at least making it considerably harder) by changing the key after encoding part of the message.

I'll describe the decryption part first:

  1. We start with a key (e.g. "YW") and a base 36 string (e.g. "RB0192LKQJQL98"), giving a sequence of base 400 numbers (0-399)
  2. We decode the first number with the key, giving a base 40 character.
  3. We use the index in the base 40 character set to get an integer. That integer is the number of characters to decode, with that key.
  4. Then we decode that number of characters. (The decoder stops when reaching the end, so the number can be longer than the actual number of numbers to decode.)
  5. If there are more integers, decode the next two as well. They are the next key to use.
  6. Go to step 2

If we specify the wrong key, a modified string, or the secret keys don't match, we'll get gibberish in return. As the key is also specified as two base 40 characters the program will happily go on. The decryption algorithm gives valid output regardless of what we have given it, so an attacker will not get any help when trying to guess the keys. And will have to try all the possibilites and look for recognizable text in them. E.g:

$ perl6 decrypt AA 149N6WI1IUI6ODASEEJRO
YSYI44FE6YUUOH8W

$ perl6 decrypt AB 149N6WI1IUI6ODASEEJRO
4E0DPEVQ8.1AZN Y

The code:

File: lib/Cryptoish.pm6 (partial)
my %base40values = @base40.map( { $_ => $++ } );

our sub multidecrypt (Alphabet $key is copy, Str $value, :$secret)
{
  my &get-secret = load-it($secret);

  sub char-from-key($key, $index)
  {
    return @( &get-secret($key) ).substr($index, 1);
  }

  my $number = $value.parse-base($wrapper);                   # [1]
  my @numbers;                                                # [1]
  while $number                                               # [1]
  {
    @numbers.push($number % $base);                           # [1]
    $number = $number div $base;                              # [1]
  }

  my $result = "";
  
  while (@numbers)
  {
    my $count = %base40values{ char-from-key($key, @numbers.shift) }; # [2][3]

    if $count
    {
      my @partial = @numbers.splice(0, $count);           # [4]

      $result ~= char-from-key($key, $_) for @partial;    # [4]
    }

    if @numbers > 1                                                    # [5] [7]
    {
      my ($a, $b) = @numbers.splice(0, 2);                             # [5]

      my $new-key = char-from-key($key, $a) ~ char-from-key($key, $b); # [5]

      $key = $new-key;                                                 # [5]
    }
  }

  return $result;
}

[7] Note the >1, as we want at least two elemens so that «splice» (which removes the two first elements from the list) doesn't choke. A single element will be left for the next iteration («while @numbers») takes it as a count, but ends up reading zero values as there are none left. That is ok. This is done so that decrypting messages with the wrong key cannot make the program crash, as that would have told an attacker that the key was wrong. An attacker can of course modify the program to discover the fact, and use it to discard the solution. (And I can have fun by adding a single dummy base 400 character to the end of the encrypted string in the encryption program to invalidate this assumption, without messing up the decrypted message.)

The encryption part:

  1. We start with a key (e.g. "AA") and a string (e.g "HELLO, MY NAME IS ARNE")
  2. We make a list of single characters from the text to encrypt
  3. In a loop, as long as there are more characters to encrypt
  4. Set up the mapping for the given key
  5. The program picks a random number between 0 and 39 (both included)
  6. That number is replaced with the base 40 letter (by the index), and added to the output list
  7. The program reads that number of integers (or less, if the number is greater than the length) from the list of integers, converts them to base 400 with the current key, and adds them to the output list. Note that the number 0 is ok (and it is used as dummy noise)
  8. If there are more elements in the list of integers, the program picks a random key (a two letter base 36 string), contverts those two letters to base 400 with the current key, and adds the result to the output list. Then it replaces the key with the new random value
  9. If there are more elements in the list of integers, go to step 3
  10. It converts the list to a number, by adding the digit values
  11. And finally, converting that number to a base 36 string

The code:

File: lib/Cryptoish.pm6 (partial)
our sub multiencrypt (Alphabet $key is copy, Alphabet $text is copy, :$secret)
{                                                         # [1]
  my &get-secret = load-it($secret);

  sub random_key
  {
    return (0 .. 9, "A" .. "Z").flat.roll(2).join;
  }

  my @chars  = $text.comb;      # [2]
  my @return;
  
  while @chars                  # [3] [9]
  {
    my $idx = 0;                                          # [4]
    my %values;                                           # [4]
    %values{$_}.push($idx++) for &get-secret($key).comb;  # [4]

    my $count = (0 .. 39).pick;                    # [5]

    @return.push(%values{@base40[$count]}.pick);   # [6]

    if $count                                      # [7]
    {
      my @partial = @chars.splice(0, $count);      # [7]

      @return.push(%values{$_}.pick) for @partial; # [7]
    }

    if @chars                                            # [8]
    {
      my $new-key = random_key;                          # [8]
      @return.push(%values{$new-key.substr(0,1)}.pick);  # [8]
      @return.push(%values{$new-key.substr(1,1)}.pick);  # [8]
      $key = $new-key;                                   # [8]
    }
  }
  return @return.map( { $_ * $base ** $++ } ).sum.base($wrapper);
                 # [10] ######################### # [11] #######
}

The scripts:

File: multiencrypt
#! /usr/bin/env perl6

use lib "lib";

use Cryptoish;

sub MAIN (Cryptoish::Alphabet $key, Str $text, :$secret)
{
  say Cryptoish::multiencrypt($key, $text, :$secret);
}
File: multidecrypt
#! /usr/bin/env perl6

use lib "lib";

use Cryptoish;

sub MAIN (Cryptoish::Alphabet $key, Str $value, :$secret)
{
  say Cryptoish::multidecrypt($key, $value, :$secret);
}

See the «Verbose» section of Part 4: Postscript for a step by step walk through.

Missing Flip?

You may have noticed the missing flip in «mutliencrypt»/«multidecrypt» compared to «encrypt»/«decrypt». The original purpose was to get the digits in the correct order, as the algorithms works on the digits from the right, and builds them up from the left.

But as we use this for encryption, we don't really need to make it a correct base 40 (or base 400) encoding. So I dropped them both. This change results in different encrypted texts, but that doesn't matter as both the encrypter and decrypter agree on the algorithm.

$ perl6 encrypt AA "THIS IS A TEST. YES."
MPWCOONJ21TNCP7D6JVNGT8YFUZ5L0LAH

$ perl6 decrypt AA MPWCOONJ21TNCP7D6JVNGT8YFUZ5L0LAH
THIS IS A TEST. YES.

$ perl6 multidecrypt AA MPWCOONJ21TNCP7D6JVNGT8YFUZ5L0LAH
SEY .TSET A SI SIHT

The last one shows that the string has been reversed. The first letter in the encrypted string is taken as the count. The character is a «.» (a period), with index 37 (in the base 40 character set). That means that the algorithm reads up to 37 values, and as the message is shorter, we get everything. The period is not part of the message, and is missing.

We can add «flip» as a command line option:

$ perl6 multiencrypt --flip AA "ARNE SOMMER"
FU8WIQ53PP76WL6GACRCXV0XP

$ perl6 multidecrypt AA FU8WIQ53PP76WL6GACRCXV0XP
6QCEMMOS ENRAA

Note that the text is reversed, and somewhat similar. The changes in the next section will further reduce the similarities, so that «flip» actually makes the message unreadable.

Only the changes to the module are shown here,  like this . See the next section for the complete code, including the scripts (or scroll up, and imagine «:$flip» added in the same way as «:$secret» in the programs).

File: lib/Cryptoish.pm6 (changes only)
our sub multiencrypt (Alphabet $key is copy, Alphabet $text is copy,
                      :$secret, Bool :$flip = False)  # [1]
    
return @return.reverse.map( { $_ * $base ** $++ } ).sum.base($wrapper)
  if $flip; # [2]

return @return.map( { $_ * $base ** $++ } ).sum.base($wrapper); # [3]

our sub multidecrypt (Alphabet $key is copy, Str $value is copy,
                      :$secret, Bool :$flip = False)  # [4]

@numbers = @numbers.reverse if $flip; # [5]

[1] Adding the optional named Boolean parameter «flip».

[2] Note the «reverse», if we have enabled «flip». This line is added just before [3]

[3] Without «reverse» otherwise.

[4] As [1].

[5] We reverse the list, if we have enabled «flip».

Repetition is a Bad Thing

It is possible that a character that occurs more than one time get the same encoded value two or more times, especially if the encryption algorithm chooses to encrypt a high number of characters (the maximum is 39) before switching the key.

The easiest way of reducing the possibility for that to happen is sending short messages only. That isn't very user friendly, though.

The problem is that we we use «pick» on a list of ten values. We can prevent the possibility of repetition using «grab» instead, as it removes the value from the list (where as «pick» doesn't change the list). Then we'd have to regenerate the list for the character again, when all ten have been used. This will reduce the possibility for two adjacant identical letters in the original message from ending up with the same value in the encrypted version. (It will not remove the possibility altogether, as the first one can be the tenth, and the second one is then taken from a new random list.)

Or we could switch the key, and go back and update the count before this part of the encrypted message. That is harder.

See docs.perl6.org/routine/grab for more information about «pick».

I have not programmed this, as the next section gives a solution that is much better - and much easier to implement.

Changing Keys more Often

Another approach is reducing the number of characters to encode before changing the key, thus reducing the risk of duplicate values for the same character. We leave the original value (0 - 39) as chosen by the function in the encoded string, but apply modulo something on it to get a lower value. We do this both on encoding and decoding, so that every value in the encoded string still can be a full base 400 value. The value for something can either be hard coded in the program, or we could specify it on the command line to make it even harder to break the encryption. I'll go for harder. If we don't specify a value, we get 40 (and the original value is used unchanged). This means that using the correct key doesn't decrypt the message if it has been encrypted with a module value. The lower the module value the better (and longer encrypted version).

File: lib/Cryptoish.pm6
unit module Cryptoish;

constant @base40  := (0 .. 9, "A" .. "Z", " ", ".", "?", "!").flat;
constant $base    := 400;
constant $wrapper := 36;

our subset Alphabet of Str where { /^@base40+$/   };
our subset Modulo   of Int where { 2 <= $_ <= 40 };  # [1]

sub load-it ($secret = "")
{
  my $name = $secret ?? "Cryptoish::Secret::$secret" !! "Cryptoish::Secret";
  require ::($name) '&get-secret';
  # say "[Loading $name]";
  return &get-secret;
}

our sub encrypt (Alphabet $key, Alphabet $text, :$secret)
{
  my &get-secret = load-it($secret);

  my $idx = 0;
  my %values;
  %values{$_}.push($++) for &get-secret($key).comb;

  return $text.flip.comb.map( { %(%values).{$_}.pick * $base ** $++ } )
    .sum.base($wrapper);
}

our sub decrypt (Alphabet $key, Str $value, :$secret)
{
  my &get-secret = load-it($secret);
  
  my $result = "";

  my $number = $value.parse-base($wrapper);
  while $number
  {
    $result ~= @( &get-secret($key) ).substr($number % $base, 1);
    $number = $number div $base; 
  }
  return $result.flip;
}

our sub multiencrypt (Alphabet $key is copy, Alphabet $text is copy,
  		      Modulo :$modulo = 40, Bool :$flip = False, 
                      :$secret) # [2]
{
  my &get-secret = load-it($secret);
  
  sub random_key
  {
    return (0 .. 9, "A" .. "Z").flat.roll(2).join;
  }

  my @chars = $text.comb;

  my @return;
  
  while @chars
  {
    my $idx = 0;
    my %values;
    %values{$_}.push($idx++) for &get-secret($key).comb;

    my $count = (0 .. 39).pick;

    $count = $count % $modulo;  # [3]
  
    if $count
    {
      my @partial = @chars.splice(0, $count);

      @return.push(%values{$_}.pick) for @partial;
    }
    
    if @chars
    {
      my $new-key = random_key;
      @return.push(%values{$new-key.substr(0,1)}.pick);
      @return.push(%values{$new-key.substr(1,1)}.pick);
      $key = $new-key;
    }
  }
  
  return @return.reverse.map( { $_ * $base ** $++ } ).sum.base($wrapper)
    if $flip;
    
  return @return.map( { $_ * $base ** $++ } ).sum.base($wrapper);
}

my %base40values = @base40.map( { $_ => $++ } );

our sub multidecrypt (Alphabet $key is copy, Str $value, Modulo :$modulo = 40,
                      Bool :$flip = False, :$secret, :$verbose = False) # [4]
{
  my &get-secret = load-it($secret);
  
  sub char-from-key($key, $index)
  {
    say "K: $key I: $index" if $verbose;
    return @( &get-secret($key) ).substr($index, 1);
  }
  
  my $number = $value.parse-base($wrapper);
    
  my @numbers;
  while $number
  {
    @numbers.push($number % $base);
    $number = $number div $base; 
  }

  @numbers = @numbers.reverse if $flip;

  my $result = "";
  
  while @numbers
  {
    my $count = %base40values{ char-from-key($key, @numbers.shift) };

    if $count
    {
    {
      $count = $count % $modulo; # [5]
      my @partial = @numbers.splice(0, $count);

      $result ~= char-from-key($key, $_) for @partial;
    }
    

    if @numbers.elems > 1
    {
      my ($a, $b) = @numbers.splice(0, 2);
      
      my $new-key = char-from-key($key, $a) ~ char-from-key($key, $b);

      $key = $new-key;
    }
  }

  return $result;
}

[1] I have added a «subset» to be used as type constraint on the modulo value. Values 2..40 only are allowed.

[2] The modulo argument, as an optional named parameter. The default value is 40.

[3] Applying modulo on the value, after adding the original one to the output.

[4] The same as #2.

[5] Applying modulo on the value, before using it.

And finally the scripts, with the changes for both modulo and flip highlighted:

File: multiencrypt
#! /usr/bin/env perl6

use lib "lib";

use Cryptoish;

sub MAIN (Cryptoish::Alphabet $key, Str $text :$secret,
          Cryptoish::Modulo :$modulo = 40, Bool :$flip = False)
{
  say Cryptoish::multiencrypt($key, $text :$secret, :$modulo, :$flip);
}
File: multidecrypt
#! /usr/bin/env perl6

use lib "lib";

use Cryptoish;

sub MAIN (Cryptoish::Alphabet $key, Str $value :$secret,
          Cryptoish::Modulo :$modulo = 40, Bool :$flip = False)
{
  say Cryptoish::multidecrypt($key, $value :$secret, :$modulo, :$flip);
}

Testing «modulo»:

$ perl6 multiencrypt --modulo=4 AA "THIS IS A TEST."
6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ

$ perl6 multidecrypt --modulo=4 AA 6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ
THIS IS A TEST.

$ perl6 multidecrypt --modulo=10 AA 6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ
THICKT.J7U1?27DNVIQN78.95TL 5A

$ perl6 multidecrypt AA 6MHTT7KYGD2XV59W84QUZR0QHCTYOLGEKHP1BC22KCNPD2GIN8A5CI39HL4LB85BOAJLK7TR5LRQZ
THICKT.J7U1?0N2I9QBU S8 7?EVND8PTYNPJVD

The first three characters («THI») are decrypted correctly if we don't apply the user specified modulo value (of 4). Note that the program picks a random number, so we get from 0 to 3 characters before switching keys. In this initial case we got the maximum (3), which is pure luck (or unluck).

Testing «flip»:

$ perl6 multiencrypt --modulo=10 AA ABCDEFGHIJKLMNOPQRSTUVW
KISA5EIEWO113MR1NJ4WYYBJ3ZJF60PLW28VVPFTNTQRL7YUAPCGSRQ6NQ25

$ perl6 multidecrypt --modulo=10 AA KISA5EIEWO113MR1NJ4WYYBJ3ZJF60PLW28VVPFTNTQRL7YUAPCGSRQ6NQ25
ABCDEFGHIJKLMNOPQRSTUVW

$ perl6 multidecrypt --modulo=10 --flip AA KISA5EIEWO113MR1NJ4WYYBJ3ZJF60PLW28VVPFTNTQRL7YUAPCGSRQ6NQ25
HM4JDL98.615ZIBZL

$ perl6 multiencrypt --modulo=10 --flip AA ABCDEFGHIJKLMNOPQRSTUVW
1AOIHFGSFLN7VB6MV64TN8U8HFV5HIJUDUO5JO9WYQF7052FQ5RN0RND

$ perl6 multidecrypt --modulo=10 --flip AA 1AOIHFGSFLN7VB6MV64TN8U8HFV5HIJUDUO5JO9WYQF7052FQ5RN0RND
ABCDEFGHIJKLMNOPQRSTUVW

$ perl6 multidecrypt --modulo=10 AA 1AOIHFGSFLN7VB6MV64TN8U8HFV5HIJUDUO5JO9WYQF7052FQ5RN0RND
.9PEWCC1KM.J73M8BW5AUF

Using «modulo=2», the lowest possible value, gives the longest encrypted result (as it encrypts 0 or 1 value with each key, before changing the key). The result should be very hard to crack without the correct Secret File, as any repetition of the base 400 numbers doesn't mean anything whatsoever.

Can Repetition be a Good Thing?

Changing keys can result in the same encrypted value beeing used several times, but from a different key. So they can actually represent different base 40 characters. This makes it possible for an attacker to see a pattern (a repetition) that really isn't there. And as I haven't reduced the possibility of an actual repetition with the same key (as described in the Repetition is a Bad Thing section above), he cannot know if the repetition is real or not. I am rather satisfied with that conundrum.

Part 4: Postscript

See the next and final part; Part 4: Postscript.