Twelve Ways to Style The Twelve Days

By: Kevin Jones
Senior Consultant, Embedded Systems

6th December 2018

Home » Insights » Processing & Programming

At this time of year I sometimes think about a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

#include <stdio.h>

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

#include <stdio.h>

static const char *CIPHER =
     "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
    "\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
    "@n'+,#'/"
    "*{}w+/"
    "w#cdnr/"
    "+,{}r/"
    "*de}+,/"
    "*{*+,/"
    "w{%+,/"
    "w#q#n+,/"
    "#{l+,/"
    "n{n+,/"
    "+#n+,/"
    "#;#q#n+,/"
    "+k#;*+,/"
    "'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/"
    "+k#;q#'r}eKK#}w'r}eKK{nl]'/"
    "#;#q#n'){)#}w'){){nl]'/"
    "+#n';d}rw' i;# ){nl]!/"
    "n{n#'; r{#w'r nc{nl]'/"
    "#{l,+'K {rw' iK{;[{nl]'/"
    "w#q#n'wk nw' iwk{KK{nl]!/"
    "w{%'l##w#' i; :{nl]'/"
    "*{q#'ld;r'}{nlwb!/"
    "*de}'c ;;{nl'-{}rw]'/"
    "+,}##'*}#nc,',#nw]'/"
    "+kd'+e}+;#'rdq#w! nr'/"
    " ') }+}{rl#'{n' ')# }'+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
    "\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
    "@n'+,#'/"  /* On the */
    "*{}w+/"    /* first */
    "w#cdnr/"   /* second */
    "+,{}r/"    /* third */
    "*de}+,/"   /* fourth */
    "*{*+,/"    /* fifth */
    "w{%+,/"    /* sixth */
    "w#q#n+,/"  /* seventh */
    "#{l+,/"    /* eighth */
    "n{n+,/"    /* ninth */
    "+#n+,/"    /* tenth */
    "#;#q#n+,/" /* eleventh */
    "+k#;*+,/"  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    "'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/"

    /* twelve drummers drumming, */
    "+k#;q#'r}eKK#}w'r}eKK{nl]'/"

    /* eleven pipers piping, */
    "#;#q#n'){)#}w'){){nl]'/"

    /* ten lords a-leaping,\n */
    "+#n';d}rw' i;# ){nl]!/"

    /* nine ladies dancing, */
    "n{n#'; r{#w'r nc{nl]'/"

    /* eight maids a-milking, */
    "#{l,+'K {rw' iK{;[{nl]'/"

    /* seven swans a-swimming,\n */
    "w#q#n'wk nw' iwk{KK{nl]!/"

    /* six geese a-laying, */
    "w{%'l##w#' i; :{nl]'/"

    /* five gold rings;\n */
    "*{q#'ld;r'}{nlwb!/"

    /* four calling birds, */
    "*de}'c ;;{nl'-{}rw]'/"

    /* three french hens, */
    "+,}##'*}#nc,',#nw]'/"

    /* two turtle doves\nand */
    "+kd'+e}+;#'rdq#w! nr'/"

    /* a partridge in a pear tree */
    " ') }+}{rl#'{n' ')# }'+}##(!!/";

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

main(t,_,a)
char *a;
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

int main(int t, int _, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

int main(int t, int b, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
    1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
    main(2,b+1,"%s %d %d\n"):9:16:t<0?t<-72?main(b,t,LYRICS)
    :t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a=='/')+t,b,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

int main(int t, int b, const char *a)
{
    return
    !0<t
    ?
        t<3
        ?
            main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a))
        :
            1, t<b
            ?
                main(t+1,b,a)
            :
                3, main(-94,-27+t,a)&&t==2
                ?
                    b<13
                    ?
                        main(2,b+1,"%s %d %d\n")
                    :
                        9
                :
                    16
    :
        t<0
        ?
            t<-72
            ?
                main(b,t,LYRICS)
            :
                t<-50
                ?
                    b==*a
                    ?
                        putchar(31[a])
                    :
                        main(-65,b,a+1)
                :
                    main((*a=='/')+t,b,a+1)
        :
            0<t
            ?
                main(2,2,"%s")
            :
                *a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

int main(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            main(-79, -13, a + main(-87, 1 - b, main(-86, 0, a + 1) + a));
        }
        else
        {
            1;
        }
        if (t < b)
        {
            main(t + 1, b, a);
        }
        else
        {
            3;
        }
        if (main(-94, -27 + t, a) && t == 2)
        {
            if (b < 13)
            {
                r = main(2, b + 1, "%s %d %d\n");
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = main(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(31[a]);
                    }
                    else
                    {
                        r = main(-65, b, a + 1);
                    }
                }
                else
                {
                    r = main((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = main(2, 2, "%s");
            }
            else
            {
                r = *a == '/' || main(0, main(-61, *a, CIPHER), a + 1);
            }
        }
    }
    return r;
}

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

    ...
    r = putchar(a[31]);
    ...

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

    ...
    if (main(-94, -27 + t, a) && (t == 2))
    ...

    ...
    r = ((*a == '/') || (main(0, main(-61, *a, CIPHER), a + 1)));
    ...

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

    ...
    if (b < 13)
    {
        r = main(2, b + 1, 0);
    }
    ...

    ...
    r = main(2, 2, 0);
    ...

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

#include <stdio.h>

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
    "\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
    "@n'+,#'/"  /* On the */
    "*{}w+/"    /* first */
    "w#cdnr/"   /* second */
    "+,{}r/"    /* third */
    "*de}+,/"   /* fourth */
    "*{*+,/"    /* fifth */
    "w{%+,/"    /* sixth */
    "w#q#n+,/"  /* seventh */
    "#{l+,/"    /* eighth */
    "n{n+,/"    /* ninth */
    "+#n+,/"    /* tenth */
    "#;#q#n+,/" /* eleventh */
    "+k#;*+,/"  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    "'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/"

    /* twelve drummers drumming, */
    "+k#;q#'r}eKK#}w'r}eKK{nl]'/"

    /* eleven pipers piping, */
    "#;#q#n'){)#}w'){){nl]'/"

    /* ten lords a-leaping,\n */
    "+#n';d}rw' i;# ){nl]!/"

    /* nine ladies dancing, */
    "n{n#'; r{#w'r nc{nl]'/"

    /* eight maids a-milking, */
    "#{l,+'K {rw' iK{;[{nl]'/"

    /* seven swans a-swimming,\n */
    "w#q#n'wk nw' iwk{KK{nl]!/"

    /* six geese a-laying, */
    "w{%'l##w#' i; :{nl]'/"

    /* five gold rings;\n */
    "*{q#'ld;r'}{nlwb!/"

    /* four calling birds, */
    "*de}'c ;;{nl'-{}rw]'/"

    /* three french hens, */
    "+,}##'*}#nc,',#nw]'/"

    /* two turtle doves\nand */
    "+kd'+e}+;#'rdq#w! nr'/"

    /* a partridge in a pear tree */
    " ') }+}{rl#'{n' ')# }'+}##(!!/";

static int foo(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            foo(-79, -13, a + foo(-87, 1 - b, foo(-86, 0, a + 1) + a));
        }
        if (t < b)
        {
            foo(t + 1, b, a);
        }
        if (foo(-94, -27 + t, a) && (t == 2))
        {
            if (b < 13)
            {
                r = foo(2, b + 1, 0);
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = foo(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(a[31]);
                    }
                    else
                    {
                        r = foo(-65, b, a + 1);
                    }
                }
                else
                {
                    r = foo((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = foo(2, 2, 0);
            }
            else
            {
                r = ((*a == '/') || (foo(0, foo(-61, *a, CIPHER), a + 1)));
            }
        }
    }
    return r;
}

int main(int argc, char *argv[])
{
    return foo(1, 0, 0);
}

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.



[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

At this time of year I sometimes think about a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

#include <stdio.h>

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

#include <stdio.h>

static const char *CIPHER =
     "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
    "\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
    "@n'+,#'/"
    "*{}w+/"
    "w#cdnr/"
    "+,{}r/"
    "*de}+,/"
    "*{*+,/"
    "w{%+,/"
    "w#q#n+,/"
    "#{l+,/"
    "n{n+,/"
    "+#n+,/"
    "#;#q#n+,/"
    "+k#;*+,/"
    "'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/"
    "+k#;q#'r}eKK#}w'r}eKK{nl]'/"
    "#;#q#n'){)#}w'){){nl]'/"
    "+#n';d}rw' i;# ){nl]!/"
    "n{n#'; r{#w'r nc{nl]'/"
    "#{l,+'K {rw' iK{;[{nl]'/"
    "w#q#n'wk nw' iwk{KK{nl]!/"
    "w{%'l##w#' i; :{nl]'/"
    "*{q#'ld;r'}{nlwb!/"
    "*de}'c ;;{nl'-{}rw]'/"
    "+,}##'*}#nc,',#nw]'/"
    "+kd'+e}+;#'rdq#w! nr'/"
    " ') }+}{rl#'{n' ')# }'+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
    "\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
    "@n'+,#'/"  /* On the */
    "*{}w+/"    /* first */
    "w#cdnr/"   /* second */
    "+,{}r/"    /* third */
    "*de}+,/"   /* fourth */
    "*{*+,/"    /* fifth */
    "w{%+,/"    /* sixth */
    "w#q#n+,/"  /* seventh */
    "#{l+,/"    /* eighth */
    "n{n+,/"    /* ninth */
    "+#n+,/"    /* tenth */
    "#;#q#n+,/" /* eleventh */
    "+k#;*+,/"  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    "'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/"

    /* twelve drummers drumming, */
    "+k#;q#'r}eKK#}w'r}eKK{nl]'/"

    /* eleven pipers piping, */
    "#;#q#n'){)#}w'){){nl]'/"

    /* ten lords a-leaping,\n */
    "+#n';d}rw' i;# ){nl]!/"

    /* nine ladies dancing, */
    "n{n#'; r{#w'r nc{nl]'/"

    /* eight maids a-milking, */
    "#{l,+'K {rw' iK{;[{nl]'/"

    /* seven swans a-swimming,\n */
    "w#q#n'wk nw' iwk{KK{nl]!/"

    /* six geese a-laying, */
    "w{%'l##w#' i; :{nl]'/"

    /* five gold rings;\n */
    "*{q#'ld;r'}{nlwb!/"

    /* four calling birds, */
    "*de}'c ;;{nl'-{}rw]'/"

    /* three french hens, */
    "+,}##'*}#nc,',#nw]'/"

    /* two turtle doves\nand */
    "+kd'+e}+;#'rdq#w! nr'/"

    /* a partridge in a pear tree */
    " ') }+}{rl#'{n' ')# }'+}##(!!/";

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

main(t,_,a)
char *a;
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

int main(int t, int _, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

int main(int t, int b, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
    1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
    main(2,b+1,"%s %d %d\n"):9:16:t<0?t<-72?main(b,t,LYRICS)
    :t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a=='/')+t,b,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

int main(int t, int b, const char *a)
{
    return
    !0<t
    ?
        t<3
        ?
            main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a))
        :
            1, t<b
            ?
                main(t+1,b,a)
            :
                3, main(-94,-27+t,a)&&t==2
                ?
                    b<13
                    ?
                        main(2,b+1,"%s %d %d\n")
                    :
                        9
                :
                    16
    :
        t<0
        ?
            t<-72
            ?
                main(b,t,LYRICS)
            :
                t<-50
                ?
                    b==*a
                    ?
                        putchar(31[a])
                    :
                        main(-65,b,a+1)
                :
                    main((*a=='/')+t,b,a+1)
        :
            0<t
            ?
                main(2,2,"%s")
            :
                *a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

int main(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            main(-79, -13, a + main(-87, 1 - b, main(-86, 0, a + 1) + a));
        }
        else
        {
            1;
        }
        if (t < b)
        {
            main(t + 1, b, a);
        }
        else
        {
            3;
        }
        if (main(-94, -27 + t, a) && t == 2)
        {
            if (b < 13)
            {
                r = main(2, b + 1, "%s %d %d\n");
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = main(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(31[a]);
                    }
                    else
                    {
                        r = main(-65, b, a + 1);
                    }
                }
                else
                {
                    r = main((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = main(2, 2, "%s");
            }
            else
            {
                r = *a == '/' || main(0, main(-61, *a, CIPHER), a + 1);
            }
        }
    }
    return r;
}

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

    ...
    r = putchar(a[31]);
    ...

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

    ...
    if (main(-94, -27 + t, a) && (t == 2))
    ...

    ...
    r = ((*a == '/') || (main(0, main(-61, *a, CIPHER), a + 1)));
    ...

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

    ...
    if (b < 13)
    {
        r = main(2, b + 1, 0);
    }
    ...

    ...
    r = main(2, 2, 0);
    ...

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

#include <stdio.h>

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
    "\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
    "@n'+,#'/"  /* On the */
    "*{}w+/"    /* first */
    "w#cdnr/"   /* second */
    "+,{}r/"    /* third */
    "*de}+,/"   /* fourth */
    "*{*+,/"    /* fifth */
    "w{%+,/"    /* sixth */
    "w#q#n+,/"  /* seventh */
    "#{l+,/"    /* eighth */
    "n{n+,/"    /* ninth */
    "+#n+,/"    /* tenth */
    "#;#q#n+,/" /* eleventh */
    "+k#;*+,/"  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    "'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/"

    /* twelve drummers drumming, */
    "+k#;q#'r}eKK#}w'r}eKK{nl]'/"

    /* eleven pipers piping, */
    "#;#q#n'){)#}w'){){nl]'/"

    /* ten lords a-leaping,\n */
    "+#n';d}rw' i;# ){nl]!/"

    /* nine ladies dancing, */
    "n{n#'; r{#w'r nc{nl]'/"

    /* eight maids a-milking, */
    "#{l,+'K {rw' iK{;[{nl]'/"

    /* seven swans a-swimming,\n */
    "w#q#n'wk nw' iwk{KK{nl]!/"

    /* six geese a-laying, */
    "w{%'l##w#' i; :{nl]'/"

    /* five gold rings;\n */
    "*{q#'ld;r'}{nlwb!/"

    /* four calling birds, */
    "*de}'c ;;{nl'-{}rw]'/"

    /* three french hens, */
    "+,}##'*}#nc,',#nw]'/"

    /* two turtle doves\nand */
    "+kd'+e}+;#'rdq#w! nr'/"

    /* a partridge in a pear tree */
    " ') }+}{rl#'{n' ')# }'+}##(!!/";

static int foo(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            foo(-79, -13, a + foo(-87, 1 - b, foo(-86, 0, a + 1) + a));
        }
        if (t < b)
        {
            foo(t + 1, b, a);
        }
        if (foo(-94, -27 + t, a) && (t == 2))
        {
            if (b < 13)
            {
                r = foo(2, b + 1, 0);
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = foo(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(a[31]);
                    }
                    else
                    {
                        r = foo(-65, b, a + 1);
                    }
                }
                else
                {
                    r = foo((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = foo(2, 2, 0);
            }
            else
            {
                r = ((*a == '/') || (foo(0, foo(-61, *a, CIPHER), a + 1)));
            }
        }
    }
    return r;
}

int main(int argc, char *argv[])
{
    return foo(1, 0, 0);
}

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.

[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

Further Reading

Twelve Ways to Style The Twelve Days

Twelve Ways to Style The Twelve Days

Kevin Jones - Senior Consultant, Embedded Systems

By: Kevin Jones
Senior Consultant, Embedded Systems

10th January 2018

Home » Insights » Processing & Programming

During the Christmas period, I was reminded of a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save


#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

#include 

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

#include 

static const char *CIPHER =
     &quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
    &quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
    &quot;@n'+,#'/&quot;
    &quot;*{}w+/&quot;
    &quot;w#cdnr/&quot;
    &quot;+,{}r/&quot;
    &quot;*de}+,/&quot;
    &quot;*{*+,/&quot;
    &quot;w{%+,/&quot;
    &quot;w#q#n+,/&quot;
    &quot;#{l+,/&quot;
    &quot;n{n+,/&quot;
    &quot;+#n+,/&quot;
    &quot;#;#q#n+,/&quot;
    &quot;+k#;*+,/&quot;
    &quot;'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/&quot;
    &quot;+k#;q#'r}eKK#}w'r}eKK{nl]'/&quot;
    &quot;#;#q#n'){)#}w'){){nl]'/&quot;
    &quot;+#n';d}rw' i;# ){nl]!/&quot;
    &quot;n{n#'; r{#w'r nc{nl]'/&quot;
    &quot;#{l,+'K {rw' iK{;[{nl]'/&quot;
    &quot;w#q#n'wk nw' iwk{KK{nl]!/&quot;
    &quot;w{%'l##w#' i; :{nl]'/&quot;
    &quot;*{q#'ld;r'}{nlwb!/&quot;
    &quot;*de}'c ;;{nl'-{}rw]'/&quot;
    &quot;+,}##'*}#nc,',#nw]'/&quot;
    &quot;+kd'+e}+;#'rdq#w! nr'/&quot;
    &quot; ') }+}{rl#'{n' ')# }'+}##(!!/&quot;;

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     &quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
    &quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
    &quot;@n'+,#'/&quot;  /* On the */
    &quot;*{}w+/&quot;    /* first */
    &quot;w#cdnr/&quot;   /* second */
    &quot;+,{}r/&quot;    /* third */
    &quot;*de}+,/&quot;   /* fourth */
    &quot;*{*+,/&quot;    /* fifth */
    &quot;w{%+,/&quot;    /* sixth */
    &quot;w#q#n+,/&quot;  /* seventh */
    &quot;#{l+,/&quot;    /* eighth */
    &quot;n{n+,/&quot;    /* ninth */
    &quot;+#n+,/&quot;    /* tenth */
    &quot;#;#q#n+,/&quot; /* eleventh */
    &quot;+k#;*+,/&quot;  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    &quot;'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/&quot;

    /* twelve drummers drumming, */
    &quot;+k#;q#'r}eKK#}w'r}eKK{nl]'/&quot;

    /* eleven pipers piping, */
    &quot;#;#q#n'){)#}w'){){nl]'/&quot;

    /* ten lords a-leaping,\n */
    &quot;+#n';d}rw' i;# ){nl]!/&quot;

    /* nine ladies dancing, */
    &quot;n{n#'; r{#w'r nc{nl]'/&quot;

    /* eight maids a-milking, */
    &quot;#{l,+'K {rw' iK{;[{nl]'/&quot;

    /* seven swans a-swimming,\n */
    &quot;w#q#n'wk nw' iwk{KK{nl]!/&quot;

    /* six geese a-laying, */
    &quot;w{%'l##w#' i; :{nl]'/&quot;

    /* five gold rings;\n */
    &quot;*{q#'ld;r'}{nlwb!/&quot;

    /* four calling birds, */
    &quot;*de}'c ;;{nl'-{}rw]'/&quot;

    /* three french hens, */
    &quot;+,}##'*}#nc,',#nw]'/&quot;

    /* two turtle doves\nand */
    &quot;+kd'+e}+;#'rdq#w! nr'/&quot;

    /* a partridge in a pear tree */
    &quot; ') }+}{rl#'{n' ')# }'+}##(!!/&quot;;

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

main(t,_,a)
char *a;
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

int main(int t, int _, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

int main(int t, int b, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
    1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
    main(2,b+1,"%s %d %d\n"):9:16:t<0?t<-72?main(b,t,LYRICS)
    :t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a=='/')+t,b,a+1)
    :0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

int main(int t, int b, const char *a)
{
return
!0

<pre> 

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

int main(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            main(-79, -13, a + main(-87, 1 - b, main(-86, 0, a + 1) + a));
        }
        else
        {
            1;
        }
        if (t < b)
        {
            main(t + 1, b, a);
        }
        else
        {
            3;
        }
        if (main(-94, -27 + t, a) && t == 2)
        {
            if (b < 13)
            {
                r = main(2, b + 1, "%s %d %d\n");
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = main(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(31[a]);
                    }
                    else
                    {
                        r = main(-65, b, a + 1);
                    }
                }
                else
                {
                    r = main((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = main(2, 2, "%s");
            }
            else
            {
                r = *a == '/' || main(0, main(-61, *a, CIPHER), a + 1);
            }
        }
    }
    return r;
}

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

    ...
    r = putchar(a[31]);
    ...

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

    ...
    if (main(-94, -27 + t, a) && (t == 2))
    ...

    ...
    r = ((*a == '/') || (main(0, main(-61, *a, CIPHER), a + 1)));
    ...

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

    ...
    if (b < 13)
    {
        r = main(2, b + 1, 0);
    }
    ...

    ...
    r = main(2, 2, 0);
    ...

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

#include 

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     "!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
    "\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
    "@n'+,#'/"  /* On the */
    "*{}w+/"    /* first */
    "w#cdnr/"   /* second */
    "+,{}r/"    /* third */
    "*de}+,/"   /* fourth */
    "*{*+,/"    /* fifth */
    "w{%+,/"    /* sixth */
    "w#q#n+,/"  /* seventh */
    "#{l+,/"    /* eighth */
    "n{n+,/"    /* ninth */
    "+#n+,/"    /* tenth */
    "#;#q#n+,/" /* eleventh */
    "+k#;*+,/"  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    "'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/"

    /* twelve drummers drumming, */
    "+k#;q#'r}eKK#}w'r}eKK{nl]'/"

    /* eleven pipers piping, */
    "#;#q#n'){)#}w'){){nl]'/"

    /* ten lords a-leaping,\n */
    "+#n';d}rw' i;# ){nl]!/"

    /* nine ladies dancing, */
    "n{n#'; r{#w'r nc{nl]'/"

    /* eight maids a-milking, */
    "#{l,+'K {rw' iK{;[{nl]'/"

    /* seven swans a-swimming,\n */
    "w#q#n'wk nw' iwk{KK{nl]!/"

    /* six geese a-laying, */
    "w{%'l##w#' i; :{nl]'/"

    /* five gold rings;\n */
    "*{q#'ld;r'}{nlwb!/"

    /* four calling birds, */
    "*de}'c ;;{nl'-{}rw]'/"

    /* three french hens, */
    "+,}##'*}#nc,',#nw]'/"

    /* two turtle doves\nand */
    "+kd'+e}+;#'rdq#w! nr'/"

    /* a partridge in a pear tree */
    " ') }+}{rl#'{n' ')# }'+}##(!!/";

static int foo(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            foo(-79, -13, a + foo(-87, 1 - b, foo(-86, 0, a + 1) + a));
        }
        if (t < b)
        {
            foo(t + 1, b, a);
        }
        if (foo(-94, -27 + t, a) && (t == 2))
        {
            if (b < 13)
            {
                r = foo(2, b + 1, 0);
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = foo(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(a[31]);
                    }
                    else
                    {
                        r = foo(-65, b, a + 1);
                    }
                }
                else
                {
                    r = foo((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = foo(2, 2, 0);
            }
            else
            {
                r = ((*a == '/') || (foo(0, foo(-61, *a, CIPHER), a + 1)));
            }
        }
    }
    return r;
}

int main(int argc, char *argv[])
{
    return foo(1, 0, 0);
}

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.





[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

During the Christmas period, I was reminded of a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save


#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

#include <stdio.h>

static const char *CIPHER =
&quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
&quot;@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/&quot;;

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

#include 

static const char *CIPHER =
     &quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
    &quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
    &quot;@n'+,#'/&quot;
    &quot;*{}w+/&quot;
    &quot;w#cdnr/&quot;
    &quot;+,{}r/&quot;
    &quot;*de}+,/&quot;
    &quot;*{*+,/&quot;
    &quot;w{%+,/&quot;
    &quot;w#q#n+,/&quot;
    &quot;#{l+,/&quot;
    &quot;n{n+,/&quot;
    &quot;+#n+,/&quot;
    &quot;#;#q#n+,/&quot;
    &quot;+k#;*+,/&quot;
    &quot;'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/&quot;
    &quot;+k#;q#'r}eKK#}w'r}eKK{nl]'/&quot;
    &quot;#;#q#n'){)#}w'){){nl]'/&quot;
    &quot;+#n';d}rw' i;# ){nl]!/&quot;
    &quot;n{n#'; r{#w'r nc{nl]'/&quot;
    &quot;#{l,+'K {rw' iK{;[{nl]'/&quot;
    &quot;w#q#n'wk nw' iwk{KK{nl]!/&quot;
    &quot;w{%'l##w#' i; :{nl]'/&quot;
    &quot;*{q#'ld;r'}{nlwb!/&quot;
    &quot;*de}'c ;;{nl'-{}rw]'/&quot;
    &quot;+,}##'*}#nc,',#nw]'/&quot;
    &quot;+kd'+e}+;#'rdq#w! nr'/&quot;
    &quot; ') }+}{rl#'{n' ')# }'+}##(!!/&quot;;

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     &quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
    &quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
    &quot;@n'+,#'/&quot;  /* On the */
    &quot;*{}w+/&quot;    /* first */
    &quot;w#cdnr/&quot;   /* second */
    &quot;+,{}r/&quot;    /* third */
    &quot;*de}+,/&quot;   /* fourth */
    &quot;*{*+,/&quot;    /* fifth */
    &quot;w{%+,/&quot;    /* sixth */
    &quot;w#q#n+,/&quot;  /* seventh */
    &quot;#{l+,/&quot;    /* eighth */
    &quot;n{n+,/&quot;    /* ninth */
    &quot;+#n+,/&quot;    /* tenth */
    &quot;#;#q#n+,/&quot; /* eleventh */
    &quot;+k#;*+,/&quot;  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    &quot;'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/&quot;

    /* twelve drummers drumming, */
    &quot;+k#;q#'r}eKK#}w'r}eKK{nl]'/&quot;

    /* eleven pipers piping, */
    &quot;#;#q#n'){)#}w'){){nl]'/&quot;

    /* ten lords a-leaping,\n */
    &quot;+#n';d}rw' i;# ){nl]!/&quot;

    /* nine ladies dancing, */
    &quot;n{n#'; r{#w'r nc{nl]'/&quot;

    /* eight maids a-milking, */
    &quot;#{l,+'K {rw' iK{;[{nl]'/&quot;

    /* seven swans a-swimming,\n */
    &quot;w#q#n'wk nw' iwk{KK{nl]!/&quot;

    /* six geese a-laying, */
    &quot;w{%'l##w#' i; :{nl]'/&quot;

    /* five gold rings;\n */
    &quot;*{q#'ld;r'}{nlwb!/&quot;

    /* four calling birds, */
    &quot;*de}'c ;;{nl'-{}rw]'/&quot;

    /* three french hens, */
    &quot;+,}##'*}#nc,',#nw]'/&quot;

    /* two turtle doves\nand */
    &quot;+kd'+e}+;#'rdq#w! nr'/&quot;

    /* a partridge in a pear tree */
    &quot; ') }+}{rl#'{n' ')# }'+}##(!!/&quot;;

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

main(t,_,a)
char *a;
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,&quot;%s&quot;):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

int main(int t, int _, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
    1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
    main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
    :t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
    :0<t?main(2,2,&quot;%s&quot;):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

int main(int t, int b, const char *a)
{
    return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
    1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
    main(2,b+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(b,t,LYRICS)
    :t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a=='/')+t,b,a+1)
    :0<t?main(2,2,&quot;%s&quot;):*a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

int main(int t, int b, const char *a)
{
    return
    !0<t
    ?
        t<3
        ?
            main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a))
        :
            1, t<b
            ?
                main(t+1,b,a)
            :
                3, main(-94,-27+t,a)&&t==2
                ?
                    b<13
                    ?
                        main(2,b+1,&quot;%s %d %d\n&quot;)
                    :
                        9
                :
                    16
    :
        t<0
        ?
            t<-72
            ?
                main(b,t,LYRICS)
            :
                t<-50
                ?
                    b==*a
                    ?
                        putchar(31[a])
                    :
                        main(-65,b,a+1)
                :
                    main((*a=='/')+t,b,a+1)
        :
            0<t
            ?
                main(2,2,&quot;%s&quot;)
            :
                *a=='/'||main(0,main(-61,*a,CIPHER),a+1);
}

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

int main(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            main(-79, -13, a + main(-87, 1 - b, main(-86, 0, a + 1) + a));
        }
        else
        {
            1;
        }
        if (t < b)
        {
            main(t + 1, b, a);
        }
        else
        {
            3;
        }
        if (main(-94, -27 + t, a) && t == 2)
        {
            if (b < 13)
            {
                r = main(2, b + 1, &quot;%s %d %d\n&quot;);
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = main(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(31[a]);
                    }
                    else
                    {
                        r = main(-65, b, a + 1);
                    }
                }
                else
                {
                    r = main((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = main(2, 2, &quot;%s&quot;);
            }
            else
            {
                r = *a == '/' || main(0, main(-61, *a, CIPHER), a + 1);
            }
        }
    }
    return r;
}

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

    ...
    r = putchar(a[31]);
    ...

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

    ...
    if (main(-94, -27 + t, a) && (t == 2))
    ...

    ...
    r = ((*a == '/') || (main(0, main(-61, *a, CIPHER), a + 1)));
    ...

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

    ...
    if (b < 13)
    {
        r = main(2, b + 1, 0);
    }
    ...

    ...
    r = main(2, 2, 0);
    ...

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

#include 

static const char *CIPHER =
    /*
    * Substitution cipher:
    * Replace '!' with '\n'
    * Replace 'e' with 'u'
    * ...
    * Replace ':' with 'y'
    */
     &quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
    &quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
    &quot;@n'+,#'/&quot;  /* On the */
    &quot;*{}w+/&quot;    /* first */
    &quot;w#cdnr/&quot;   /* second */
    &quot;+,{}r/&quot;    /* third */
    &quot;*de}+,/&quot;   /* fourth */
    &quot;*{*+,/&quot;    /* fifth */
    &quot;w{%+,/&quot;    /* sixth */
    &quot;w#q#n+,/&quot;  /* seventh */
    &quot;#{l+,/&quot;    /* eighth */
    &quot;n{n+,/&quot;    /* ninth */
    &quot;+#n+,/&quot;    /* tenth */
    &quot;#;#q#n+,/&quot; /* eleventh */
    &quot;+k#;*+,/&quot;  /* twelfth */

    /* day of Christmas my true love gave to me\n */
    &quot;'r :'d*'3,}{w+K w'K:'+}e#';dq#'l q#'+d'K#!/&quot;

    /* twelve drummers drumming, */
    &quot;+k#;q#'r}eKK#}w'r}eKK{nl]'/&quot;

    /* eleven pipers piping, */
    &quot;#;#q#n'){)#}w'){){nl]'/&quot;

    /* ten lords a-leaping,\n */
    &quot;+#n';d}rw' i;# ){nl]!/&quot;

    /* nine ladies dancing, */
    &quot;n{n#'; r{#w'r nc{nl]'/&quot;

    /* eight maids a-milking, */
    &quot;#{l,+'K {rw' iK{;[{nl]'/&quot;

    /* seven swans a-swimming,\n */
    &quot;w#q#n'wk nw' iwk{KK{nl]!/&quot;

    /* six geese a-laying, */
    &quot;w{%'l##w#' i; :{nl]'/&quot;

    /* five gold rings;\n */
    &quot;*{q#'ld;r'}{nlwb!/&quot;

    /* four calling birds, */
    &quot;*de}'c ;;{nl'-{}rw]'/&quot;

    /* three french hens, */
    &quot;+,}##'*}#nc,',#nw]'/&quot;

    /* two turtle doves\nand */
    &quot;+kd'+e}+;#'rdq#w! nr'/&quot;

    /* a partridge in a pear tree */
    &quot; ') }+}{rl#'{n' ')# }'+}##(!!/&quot;;

static int foo(int t, int b, const char *a)
{
    int r;
    if (!0 < t)
    {
        if (t < 3)
        {
            foo(-79, -13, a + foo(-87, 1 - b, foo(-86, 0, a + 1) + a));
        }
        if (t < b)
        {
            foo(t + 1, b, a);
        }
        if (foo(-94, -27 + t, a) && (t == 2))
        {
            if (b < 13)
            {
                r = foo(2, b + 1, 0);
            }
            else
            {
                r = 9;
            }
        }
        else
        {
            r = 16;
        }
    }
    else
    {
        if (t < 0)
        {
            if (t < -72)
            {
                r = foo(b, t, LYRICS);
            }
            else
            {
                if (t < -50)
                {
                    if (b == *a)
                    {
                        r = putchar(a[31]);
                    }
                    else
                    {
                        r = foo(-65, b, a + 1);
                    }
                }
                else
                {
                    r = foo((*a == '/') + t, b, a + 1);
                }
            }
        }
        else
        {
            if (0 < t)
            {
                r = foo(2, 2, 0);
            }
            else
            {
                r = ((*a == '/') || (foo(0, foo(-61, *a, CIPHER), a + 1)));
            }
        }
    }
    return r;
}

int main(int argc, char *argv[])
{
    return foo(1, 0, 0);
}

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.





[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Further Reading

Programmable Systems on Modules: The ‘New Kid’ on the Embedded Block

Programmable Systems on Modules: The ‘New Kid’ on the Embedded Block

Phil Duff - Lead Consultant, Signal Processing

By: Phil Duff
Lead Consultant, Signal Processing

1st November 2017

Home » Insights » Processing & Programming

As an engineer specialising in high performance real-time embedded systems, I appreciate great ideas, but in the back of my mind I’m always asking “How can we make it a reality?” and “How can we get it off the drawing board?”

One of the technologies I have been an evangelist of is the new breed of programmable system on chip (PSoC) devices. These can often be used in the form of readily accessible Programmable System on Modules (SoM) but first; let’s quickly take a step back.

For more than 20 years of signal processing, I have seen logic-based signal processing combined with Central Processing Units (CPUs) to produce powerful solutions. All this is now delivered from a single central device.

The performance and capability are now both better than ever before and the power consumption is down, however, the complexity is still there. Regardless, these devices comprise of CPU cores and field-programmable gate array (FPGA) logic, both of which are highly programmable.

Fortunately, many vendors now provide SoMs to host these devices, together with their much-needed essentials, such as memory, power supplies, Ethernet and USB ports.

The SoM is hosted on a carrier board. The SoM vendors usually provide development boards which are very useful starting points. We typically start on one of these development boards before building a custom carrier that is specific to the product’s requirements. The PSoC and SoM also allow very flexible interfacing capabilities of programmable logic, interfacing so that almost anything can be realised.

I have seen some applications where the FPGA element is simply used to assist with interfacing, but in others it has been used for high-performance digital signal processing. The latter of which is my domain, and I thoroughly approve!

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

As an engineer specialising in high performance real-time embedded systems, I appreciate great ideas, but in the back of my mind I’m always asking “How can we make it a reality?” and “How can we get it off the drawing board?”

One of the technologies I have been an evangelist of is the new breed of programmable system on chip (PSoC) devices. These can often be used in the form of readily accessible Programmable System on Modules (SoM) but first; let’s quickly take a step back.

For more than 20 years of signal processing, I have seen logic-based signal processing combined with CPUs to produce powerful solutions. All this is now delivered from a single central device.

The performance and capability are now both better than ever before and the power consumption is down, however, the complexity is still there. Regardless, these devices comprise of CPU cores and field-programmable gate array (FPGA) logic, both of which are highly programmable.

Fortunately, many vendors now provide SoMs to host these devices, together with their much-needed essentials, such as memory, power supplies, Ethernet and USB ports.

The SoM is hosted on a carrier board. The SoM vendors usually provide development boards which are very useful starting points. We typically start on one of these development boards before building a custom carrier that is specific to the product’s requirements. The PSoC and SoM also allow very flexible interfacing capabilities of programmable logic, interfacing so that almost anything can be realised.

I have seen some applications where the FPGA element is simply used to assist with interfacing, but in others it has been used for high-performance digital signal processing. The latter of which is my domain, and I thoroughly approve!

Save

Save

Save

Save

Save

Save

Save

Save

Save

Further Reading

Neural Networks: Can Androids Recognise Electric Sheep?

Neural Networks: Can Androids Recognise Electric Sheep?

Damien Clarke - Senior Consultant, Data Exploitation

By: Damien Clarke
Senior Consultant, Data Exploitation

20th September 2017

Home » Insights » Processing & Programming

In 2010, Lt. Gen. David Deptula, the US Air Force deputy chief of staff for intelligence, was quoted as saying:

“We’re going to find ourselves, in the not too distant future, swimming in sensors and drowning in data.”

Since then, this flood of data is showing no signs of slowing. In fact, it is actually accelerating as greater volumes of data are being generated every day. This is just as true in the civilian context as the military context.

For organisations with access to these large volumes of data, it would be profitable to employ data exploitation techniques to convert the raw data into useful information. This can sometimes be achieved by developing custom data processing techniques for specific situations. However, in many cases, it is better to use machine learning techniques to allow computers to learn from data without being explicitly programmed. At Plextek, we’re passionate about developing and implementing the right data exploitation techniques for the application and are working to ensure that humanity stays afloat (and dry) in Deptula’s prediction.

There is a wide range of potential machine learning techniques to choose from, but one approach is to copy nature and mimic biological brains. This was inspired by the fact that one of the primary purposes of a brain is to process sensory inputs and extract useful information for future exploitation. A biological brain can be produced in software form by modelling a large set of connected neurons. This is an artificial neural network.

How does an artificial neural network work?

The basic building block of a neural network is a single neuron. A neuron transforms a set of one or more input values into a single output by applying a mathematical function to the weighted sum of input values. This output value is then passed to one or more connected neurons to be used as a subsequent input value.

The neural network as a whole can, therefore, be defined by three sets of parameters:

  The weight assigned to each input value for each neuron.

  The function which converts the weighted sum of input values into the output value.

  The pattern of connections between neurons.

A simple example neural network consists of three layers. The first layer contains the input values which represent the data being analysed. This layer is then connected to a hidden layer of neurons. The hidden layer then connects to the third and final layer which contains the output neurons whose values represent the processed data. This design allows a complicated relationship between inputs and outputs.



How is a neural network trained?

Just like biological brains, simply creating a neural network is not sufficient to extract information from raw data. It is also necessary to train the network by exposing it to data for which the desired outputs are already known. This process is used to define the weights assigned to each connection throughout the entire network.

As the size and complexity of the neural network increases, the number of weights that must be defined for optimum performance increases significantly. This training process, therefore, requires a large and representative set of labelled data; otherwise, the neural network may not work successfully on future input data. Also, this training process is computationally challenging and may take significant processing time to perform. GPU acceleration can be used to mitigate this; however, the process may still take days for very large data sets.  

Conversely, if large volumes of suitable training data are available, it is possible to create a more complex neural network to improve performance. This can be achieved by increasing the number of hidden layers and therefore the total number of connections within the network. This use of complex neural networks with many layers and connections is called deep learning.



What can neural networks be used for?

With a sufficiently large neural network and suitable training data, it is possible to learn complex non-linear relationships between input and output values. This can reveal insights into data which are not possible when using simple linear mathematical models.

While neural networks are suitable as general purpose problem solvers, they are particularly suited for tasks when an understanding of the underlying relationships in the raw data is neither available nor necessarily required and sufficient data is also available for training.

An important example of this capability is the recognition of objects in images. This is achieved through the use of a neural network which has been trained on a large volume of photos of known objects (e.g. ImageNet). While the training process can take a long time, subsequent object recognition is much faster and can potentially be performed in real time. Due to the large volume of training data and the complexity of the neural network used the resulting object recognition performance is close to human level performance. This can be used to in a military context to recognise different vehicles (e.g. a tank) or in a civilian context to see if computers can distinguish between different animals (Do Androids Dream of Electric Sheep?).



Neural networks are not just limited to processing photos and the same approach can be applied to a wide range of sensor and non-sensor data. The most important requirement is that a suitable volume of labelled training data is available to train the network before it can be used on unknown data.

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

In 2010, Lt. Gen. David Deptula, the US Air Force deputy chief of staff for intelligence, was quoted as saying:

“We’re going to find ourselves, in the not too distant future, swimming in sensors and drowning in data.”

Since then, this flood of data is showing no signs of slowing. In fact, it is actually accelerating as greater volumes of data are being generated every day. This is just as true in the civilian context as the military context.

For organisations with access to these large volumes of data, it would be profitable to employ data exploitation techniques to convert the raw data into useful information. This can sometimes be achieved by developing custom data processing techniques for specific situations. However, in many cases, it is better to use machine learning techniques to allow computers to learn from data without being explicitly programmed. At Plextek, we’re passionate about developing and implementing the right data exploitation techniques for the application and are working to ensure that humanity stays afloat (and dry) in Deptula’s prediction.

There is a wide range of potential machine learning techniques to choose from, but one approach is to copy nature and mimic biological brains. This was inspired by the fact that one of the primary purposes of a brain is to process sensory inputs and extract useful information for future exploitation. A biological brain can be produced in software form by modelling a large set of connected neurons. This is an artificial neural network.

How does an artificial neural network work?

The basic building block of a neural network is a single neuron. A neuron transforms a set of one or more input values into a single output by applying a mathematical function to the weighted sum of input values. This output value is then passed to one or more connected neurons to be used as a subsequent input value.

The neural network as a whole can, therefore, be defined by three sets of parameters:

  The weight assigned to each input value for each neuron.

  The function which converts the weighted sum of input values into the output value.

  The pattern of connections between neurons.

A simple example neural network consists of three layers. The first layer contains the input values which represent the data being analysed. This layer is then connected to a hidden layer of neurons. The hidden layer then connects to the third and final layer which contains the output neurons whose values represent the processed data. This design allows a complicated relationship between inputs and outputs.



How is a neural network trained?

Just like biological brains, simply creating a neural network is not sufficient to extract information from raw data. It is also necessary to train the network by exposing it to data for which the desired outputs are already known. This process is used to define the weights assigned to each connection throughout the entire network.

As the size and complexity of the neural network increases, the number of weights that must be defined for optimum performance increases significantly. This training process, therefore, requires a large and representative set of labelled data; otherwise, the neural network may not work successfully on future input data. Also, this training process is computationally challenging and may take significant processing time to perform. GPU acceleration can be used to mitigate this; however, the process may still take days for very large data sets.  

Conversely, if large volumes of suitable training data are available, it is possible to create a more complex neural network to improve performance. This can be achieved by increasing the number of hidden layers and therefore the total number of connections within the network. This use of complex neural networks with many layers and connections is called deep learning.



What can neural networks be used for?

With a sufficiently large neural network and suitable training data, it is possible to learn complex non-linear relationships between input and output values. This can reveal insights into data which are not possible when using simple linear mathematical models.

While neural networks are suitable as general purpose problem solvers, they are particularly suited for tasks when an understanding of the underlying relationships in the raw data is neither available nor necessarily required and sufficient data is also available for training.

An important example of this capability is the recognition of objects in images. This is achieved through the use of a neural network which has been trained on a large volume of photos of known objects (e.g. ImageNet). While the training process can take a long time, subsequent object recognition is much faster and can potentially be performed in real time. Due to the large volume of training data and the complexity of the neural network used the resulting object recognition performance is close to human level performance. This can be used to in a military context to recognise different vehicles (e.g. a tank) or in a civilian context to see if computers can distinguish between different animals (Do Androids Dream of Electric Sheep?).



Neural networks are not just limited to processing photos and the same approach can be applied to a wide range of sensor and non-sensor data. The most important requirement is that a suitable volume of labelled training data is available to train the network before it can be used on unknown data.

Save

Save

Save

Save

Save

Save

Save

Save

Further Reading

The start-ups using artificial intelligence to solve everyday tasks

The start-ups using artificial intelligence to solve everyday tasks

Dr Matthew Roberts - Senior Consultant, Data Exploitation

By: Matthew Roberts
Senior Consultant, Data Exploitation

5th July 2017

Home » Insights » Processing & Programming

I recently attended the inaugural Cambridge Wireless Artificial Intelligence & Mobility Conference. The event focussed on artificial intelligence (AI), the business use cases enabled by AI, innovative start-up companies, and how start-up companies can gain funding. Unlike the technical conferences that I am used to attending, this event was much more about the business-side of AI.

Like many engineers, I usually like to look at the technical aspects of things, but this event gave me a different, and somewhat refreshing, perspective on the use of AI. I enjoy hearing about how companies, like DeepMind, are using AI to play video games and diagnose medical conditions, but perhaps I don’t pay enough attention to the companies that are using AI to solve everyday tasks. The Cambridge-based event gave start-ups the opportunity to talk and exhibit and gave people like me the chance to learn more about them.

You have probably heard of the driverless car technology being developed by organisations like Google and Uber, but what you might not know about are the driverless cars in the UK. Three driverless car projects were awarded funding by the UK government, and members of the public were given the opportunity ride in driverless cars.

Oxbotica, an Oxford University spinout, was involved in two of the projects. Oxbotica’s Selenium software formed the brains of the vehicles used in both projects. The software almost certainly uses AI to perform two key tasks: understanding the wealth of sensor data that is used to observe the car’s environment and controlling the car.

Another company that is working on self-driving cars is FiveAI. At the event, Stan Boland, CEO of FiveAI, spoke of how FiveAI is aiming to become a customer to large organisations instead of a supplier. FiveAI intends to do this by competing with the likes of Uber, but with self-driving cars. The company is currently part of a consortium that plans to test such cars on public roads in London, and AI will be a key part of making that a success.

Hoxton Analytics is using AI to solve a completely different kind of perception task. It is using cameras combined with AI to measure footfall. The cameras are mounted at ground level in order to avoid privacy concerns. Not only can it be used by shops to determine how many people it can attract, but it can also be used to infer the types of shoppers. This information can be used to help determine which demographics are being lured into shops and at what times. Solving such a task manually can be very labour-intensive.

Another example of the use of AI to solve everyday tasks is the 3D sensor that has been created by Titan Reality. Titan Reality’s sensor can be used in a wide variety of perception and control tasks, from sorting objects to pouring the correct drink based on what kind of glass is placed on the sensor.

This is just a tiny set of examples of where small companies have embraced AI to provide high-tech solutions to everyday tasks that would traditionally be performed by people. It is not just large companies like Google and Netflix that are using AI to make a big impact.

Save

Save

Save

Save

Save

Save

I recently attended the inaugural Cambridge Wireless Artificial Intelligence & Mobility Conference. The event focussed on artificial intelligence (AI), the business use cases enabled by AI, innovative start-up companies, and how start-up companies can gain funding. Unlike the technical conferences that I am used to attending, this event was much more about the business-side of AI.

Like many engineers, I usually like to look at the technical aspects of things, but this event gave me a different, and somewhat refreshing, perspective on the use of AI. I enjoy hearing about how companies, like DeepMind, are using AI to play video games and diagnose medical conditions, but perhaps I don’t pay enough attention to the companies that are using AI to solve everyday tasks. The Cambridge-based event gave start-ups the opportunity to talk and exhibit and gave people like me the chance to learn more about them.

You have probably heard of the driverless car technology being developed by organisations like Google and Uber, but what you might not know about are the driverless cars in the UK. Three driverless car projects were awarded funding by the UK government, and members of the public were given the opportunity ride in driverless cars.

Oxbotica, an Oxford University spinout, was involved in two of the projects. Oxbotica’s Selenium software formed the brains of the vehicles used in both projects. The software almost certainly uses AI to perform two key tasks: understanding the wealth of sensor data that is used to observe the car’s environment and controlling the car.

Another company that is working on self-driving cars is FiveAI. At the event, Stan Boland, CEO of FiveAI, spoke of how FiveAI is aiming to become a customer to large organisations instead of a supplier. FiveAI intends to do this by competing with the likes of Uber, but with self-driving cars. The company is currently part of a consortium that plans to test such cars on public roads in London, and AI will be a key part of making that a success.

Hoxton Analytics is using AI to solve a completely different kind of perception task. It is using cameras combined with AI to measure footfall. The cameras are mounted at ground level in order to avoid privacy concerns. Not only can it be used by shops to determine how many people it can attract, but it can also be used to infer the types of shoppers. This information can be used to help determine which demographics are being lured into shops and at what times. Solving such a task manually can be very labour-intensive.

Another example of the use of AI to solve everyday tasks is the 3D sensor that has been created by Titan Reality. Titan Reality’s sensor can be used in a wide variety of perception and control tasks, from sorting objects to pouring the correct drink based on what kind of glass is placed on the sensor.

This is just a tiny set of examples of where small companies have embraced AI to provide high-tech solutions to everyday tasks that would traditionally be performed by people. It is not just large companies like Google and Netflix that are using AI to make a big impact.

Save

Save

Save

Save

Save

Save

Further Reading