Moreton Hall, COVID Safe, business effectiveness

Moreton Hall and Other Hotspots

Nicholas Hill, Plextek

By: Nicholas Hill

CEO

7th January 2021

5 minute read

Home » Insights » Processing & Programming

As we enter into the UK’s third lockdown, Nick Hill, CEO, reflects on how we have used our unique tech skills during this pandemic to meet staffing challenges. 

Two weeks ago I’d never heard of Moreton Hall, but it suddenly came to my attention thanks to an experimental app that one of our software engineers had just developed. 

Our office is just south of Cambridge, but we have staff who live over quite a wide geographical area.  Since we came out of the original full lockdown earlier this year we have implemented a mixture of office based and home working, with most staff coming into the office around two days a week – the maximum that our office and lab can support while remaining Covid-safe.  While the general incidence of Covid infections in the district where our offices are located has been low throughout the pandemic, we’d been wondering if the same was true for all the places where our staff lived.  Was there a higher risk of certain staff bringing infections into the office because of where they lived? 

Tailor made data 

None of the internet resources we could find displayed the infection rate information we were looking for in a way that was quick and easy to assess.  We wanted a way to quickly review infection rates in the areas where our staff lived, excluding the clutter of data from all other locations.  We wanted to see a highlevel overview, with the ability to drill down and look at the smallest districts for which data was available.  We wanted to track local trends and, as a baseline, to compare them to the national average.   And, of course, we wanted to be always working with the latest government data. 

A bespoke solution 

After a bit of discussion, we set someone in our software team the task of investigating what was possible, which resulted in a plan to develop a bespoke app.  It was quickly determined that we could pull the raw data we needed from an open government source and that we could build the app around the set of postcodes where our staff lived. 

The user interface of our CovidView app starts with a simplified heat map of the counties around Cambridge, divided into the government’s mid-tier (LTLA) regions, which have names like South Cambridgeshire and Uttlesford.  Only LTLA’s where our staff live are shown on the map.  Each LTLA is colour coded with its average infection rate but also has an indication of the range of infection rates within it.  The range indication comes from data for the government’s lowest-tier (MSOA) regions, once again only reported for postcodes where our staff live.  Clicking on an LTLA in the map brings up detail data for the district, a list of MSOAs within it and a graph showing infection rate trends, compared to national average. 

 

CovidView App
view captured 6/01/21 2:45pm

Example Screen Grab of CovidView  

 

It took about two weeks to develop the app, with quite a lot of back and forth between the developer and HR team as we worked out how some of its features were best implemented.  It was once we’d got the basic functions up and running that I started to notice some unexpected features in the infection rates, the first of which was happening in an MSOA called Moreton Hall.  One of the LTLAs on our map is West Suffolk, which showed up in low infection rate green on our heat map.  However, there was a single anomaly within the West Suffolk region – Moreton Hall showed up in red, whereas the other six MSOAs where our staff lived were all shades of green. 

Finding the Hotspots 

We reached out to the member of staff who lives in Moreton Hall and found they were well aware of the problem because the area had recently been subject to a coronavirus alert and West Suffolk Council were working hard to bring the problem under control.  Infection rates in Moreton Hall (and two other hotspots we investigated at the same time) have subsequently reduced, while our CovidView app has since highlighted new hotspots that we should look at carefully. 

Running a business 

Like all businesses, we have been trying hard to maximise our operational efficiency within the constraints imposed on us by the pandemic.  Our experience of home working is that it has a negative impact on support for junior staffcommunications between teamsteam cohesiongroup problem solving and creativity.  These factors only became clear over time and are despite our best efforts to use electronic communications to maximum effect.  To overcome these effects, we have been trying to use our office and lab facilities as much as possible, while remaining Covid-safe. This requires a careful balance between the need for business effectiveness and our responsibility to keep all our staff safe, especially as we now go from limited restrictions into higher lockdown guidelines. 

Our new CovidView app feels like an important tool to help us achieve this balance, as we continue to search for new and better ways to run the business in these pandemic-dominated times. 

As we enter into the UK’s third lockdown, Nick Hill, CEO, reflects on how we have used our unique tech skills during this pandemic to meet staffing challenges. 

Two weeks ago I’d never heard of Moreton Hall, but it suddenly came to my attention thanks to an experimental app that one of our software engineers had just developed. 

Our office is just south of Cambridge, but we have staff who live over quite a wide geographical area.  Since we came out of the original full lockdown earlier this year we have implemented a mixture of office based and home working, with most staff coming into the office around two days a week – the maximum that our office and lab can support while remaining Covid-safe.  While the general incidence of Covid infections in the district where our offices are located has been low throughout the pandemic, we’d been wondering if the same was true for all the places where our staff lived.  Was there a higher risk of certain staff bringing infections into the office because of where they lived? 

Tailor made data 

None of the internet resources we could find displayed the infection rate information we were looking for in a way that was quick and easy to assess.  We wanted a way to quickly review infection rates in the areas where our staff lived, excluding the clutter of data from all other locations.  We wanted to see a highlevel overview, with the ability to drill down and look at the smallest districts for which data was available.  We wanted to track local trends and, as a baseline, to compare them to the national average.   And, of course, we wanted to be always working with the latest government data. 

A bespoke solution

After a bit of discussion, we set someone in our software team the task of investigating what was possible, which resulted in a plan to develop a bespoke app.  It was quickly determined that we could pull the raw data we needed from an open government source and that we could build the app around the set of postcodes where our staff lived. 

The user interface of our CovidView app starts with a simplified heat map of the counties around Cambridge, divided into the government’s mid-tier (LTLA) regions, which have names like South Cambridgeshire and Uttlesford.  Only LTLA’s where our staff live are shown on the map.  Each LTLA is colour coded with its average infection rate but also has an indication of the range of infection rates within it.  The range indication comes from data for the government’s lowest-tier (MSOA) regions, once again only reported for postcodes where our staff live.  Clicking on an LTLA in the map brings up detail data for the district, a list of MSOAs within it and a graph showing infection rate trends, compared to national average. 

CovidView App
view captured 6/01/21 2:45pm

 

Example Screen Grab of CovidView  

 

It took about two weeks to develop the app, with quite a lot of back and forth between the developer and HR team as we worked out how some of its features were best implemented.  It was once we’d got the basic functions up and running that I started to notice some unexpected features in the infection rates, the first of which was happening in an MSOA called Moreton Hall.  One of the LTLAs on our map is West Suffolk, which showed up in low infection rate green on our heat map.  However, there was a single anomaly within the West Suffolk region – Moreton Hall showed up in red, whereas the other six MSOAs where our staff lived were all shades of green. 

Finding the Hotspots 

We reached out to the member of staff who lives in Moreton Hall and found they were well aware of the problem because the area had recently been subject to a coronavirus alert and West Suffolk Council were working hard to bring the problem under control.  Infection rates in Moreton Hall (and two other hotspots we investigated at the same time) have subsequently reduced, while our CovidView app has since highlighted new hotspots that we should look at carefully. 

Running a business

Like all businesses, we have been trying hard to maximise our operational efficiency within the constraints imposed on us by the pandemic.  Our experience of home working is that it has a negative impact on support for junior staffcommunications between teamsteam cohesiongroup problem solving and creativity.  These factors only became clear over time and are despite our best efforts to use electronic communications to maximum effect.  To overcome these effects, we have been trying to use our office and lab facilities as much as possible, while remaining Covid-safe. This requires a careful balance between the need for business effectiveness and our responsibility to keep all our staff safe, especially as we now go from limited restrictions into higher lockdown guidelines. 

Our new CovidView app feels like an important tool to help us achieve this balance, as we continue to search for new and better ways to run the business in these pandemic-dominated times. 

Further Reading

Twelve Ways to Style The Twelve Days

By: Kevin Jones
Senior Consultant, Embedded Systems

6th December 2018

Home » Insights » Processing & Programming

At this time of year I sometimes think about a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

[code language=”c”]
#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n’+,#’/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l \
q#’+d’K#!/+k#;q#’r}eKK#}w’r}eKK{nl]’/#;#q#n’){)#}w’){){nl]’/+#n’;d}rw’ i;# \
){nl]!/n{n#’; r{#w’r nc{nl]’/#{l,+’K {rw’ iK{;[{nl]’/w#q#n’wk nw’ \
iwk{KK{nl]!/w{%’l##w#’ i; :{nl]’/*{q#’ld;r’}{nlwb!/*de}’c \
;;{nl’-{}rw]’/+,}##’*}#nc,’,#nw]’/+kd’+e}+;#’rdq#w! nr’/ ‘) }+}{rl#'{n’ ‘)# \
}’+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}
[/code]

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

[code language=”c”]
#include <stdio.h>

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l \
q#’+d’K#!/+k#;q#’r}eKK#}w’r}eKK{nl]’/#;#q#n’){)#}w’){){nl]’/+#n’;d}rw’ i;# \
){nl]!/n{n#’; r{#w’r nc{nl]’/#{l,+’K {rw’ iK{;[{nl]’/w#q#n’wk nw’ \
iwk{KK{nl]!/w{%’l##w#’ i; :{nl]’/*{q#’ld;r’}{nlwb!/*de}’c \
;;{nl’-{}rw]’/+,}##’*}#nc,’,#nw]’/+kd’+e}+;#’rdq#w! nr’/ ‘) }+}{rl#'{n’ ‘)# \
}’+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

[code language=”c”]
#include <stdio.h>

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
"\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/"
"*{}w+/"
"w#cdnr/"
"+,{}r/"
"*de}+,/"
"*{*+,/"
"w{%+,/"
"w#q#n+,/"
"#{l+,/"
"n{n+,/"
"+#n+,/"
"#;#q#n+,/"
"+k#;*+,/"
"’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/"
"+k#;q#’r}eKK#}w’r}eKK{nl]’/"
"#;#q#n’){)#}w’){){nl]’/"
"+#n’;d}rw’ i;# ){nl]!/"
"n{n#’; r{#w’r nc{nl]’/"
"#{l,+’K {rw’ iK{;[{nl]’/"
"w#q#n’wk nw’ iwk{KK{nl]!/"
"w{%’l##w#’ i; :{nl]’/"
"*{q#’ld;r’}{nlwb!/"
"*de}’c ;;{nl’-{}rw]’/"
"+,}##’*}#nc,’,#nw]’/"
"+kd’+e}+;#’rdq#w! nr’/"
" ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

[code language=”c”]
static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
"\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/" /* On the */
"*{}w+/" /* first */
"w#cdnr/" /* second */
"+,{}r/" /* third */
"*de}+,/" /* fourth */
"*{*+,/" /* fifth */
"w{%+,/" /* sixth */
"w#q#n+,/" /* seventh */
"#{l+,/" /* eighth */
"n{n+,/" /* ninth */
"+#n+,/" /* tenth */
"#;#q#n+,/" /* eleventh */
"+k#;*+,/" /* twelfth */

/* day of Christmas my true love gave to me\n */
"’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/"

/* twelve drummers drumming, */
"+k#;q#’r}eKK#}w’r}eKK{nl]’/"

/* eleven pipers piping, */
"#;#q#n’){)#}w’){){nl]’/"

/* ten lords a-leaping,\n */
"+#n’;d}rw’ i;# ){nl]!/"

/* nine ladies dancing, */
"n{n#’; r{#w’r nc{nl]’/"

/* eight maids a-milking, */
"#{l,+’K {rw’ iK{;[{nl]’/"

/* seven swans a-swimming,\n */
"w#q#n’wk nw’ iwk{KK{nl]!/"

/* six geese a-laying, */
"w{%’l##w#’ i; :{nl]’/"

/* five gold rings;\n */
"*{q#’ld;r’}{nlwb!/"

/* four calling birds, */
"*de}’c ;;{nl’-{}rw]’/"

/* three french hens, */
"+,}##’*}#nc,’,#nw]’/"

/* two turtle doves\nand */
"+kd’+e}+;#’rdq#w! nr’/"

/* a partridge in a pear tree */
" ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/";
[/code]

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

[code language=”c”]
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

[code language=”c”]
int main(int t, int _, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

[code language=”c”]
int main(int t, int b, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
main(2,b+1,"%s %d %d\n"):9:16:t<0?t<-72?main(b,t,LYRICS)
:t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a==’/’)+t,b,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

[code language=”c”]
int main(int t, int b, const char *a)
{
return
!0<t
?
t<3
?
main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a))
:
1, t<b
?
main(t+1,b,a)
:
3, main(-94,-27+t,a)&&t==2
?
b<13
?
main(2,b+1,"%s %d %d\n")
:
9
:
16
:
t<0
?
t<-72
?
main(b,t,LYRICS)
:
t<-50
?
b==*a
?
putchar(31[a])
:
main(-65,b,a+1)
:
main((*a==’/’)+t,b,a+1)
:
0<t
?
main(2,2,"%s")
:
*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

[code language=”c”]
int main(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
main(-79, -13, a + main(-87, 1 – b, main(-86, 0, a + 1) + a));
}
else
{
1;
}
if (t < b)
{
main(t + 1, b, a);
}
else
{
3;
}
if (main(-94, -27 + t, a) && t == 2)
{
if (b < 13)
{
r = main(2, b + 1, "%s %d %d\n");
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = main(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(31[a]);
}
else
{
r = main(-65, b, a + 1);
}
}
else
{
r = main((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = main(2, 2, "%s");
}
else
{
r = *a == ‘/’ || main(0, main(-61, *a, CIPHER), a + 1);
}
}
}
return r;
}
[/code]

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

[code language=”c”]

r = putchar(a[31]);

[/code]

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

[code language=”c”]

if (main(-94, -27 + t, a) && (t == 2))


r = ((*a == ‘/’) || (main(0, main(-61, *a, CIPHER), a + 1)));

[/code]

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

[code language=”c”]

if (b < 13)
{
r = main(2, b + 1, 0);
}


r = main(2, 2, 0);

[/code]

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

[code language=”c”]
#include <stdio.h>

static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
"\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/" /* On the */
"*{}w+/" /* first */
"w#cdnr/" /* second */
"+,{}r/" /* third */
"*de}+,/" /* fourth */
"*{*+,/" /* fifth */
"w{%+,/" /* sixth */
"w#q#n+,/" /* seventh */
"#{l+,/" /* eighth */
"n{n+,/" /* ninth */
"+#n+,/" /* tenth */
"#;#q#n+,/" /* eleventh */
"+k#;*+,/" /* twelfth */

/* day of Christmas my true love gave to me\n */
"’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/"

/* twelve drummers drumming, */
"+k#;q#’r}eKK#}w’r}eKK{nl]’/"

/* eleven pipers piping, */
"#;#q#n’){)#}w’){){nl]’/"

/* ten lords a-leaping,\n */
"+#n’;d}rw’ i;# ){nl]!/"

/* nine ladies dancing, */
"n{n#’; r{#w’r nc{nl]’/"

/* eight maids a-milking, */
"#{l,+’K {rw’ iK{;[{nl]’/"

/* seven swans a-swimming,\n */
"w#q#n’wk nw’ iwk{KK{nl]!/"

/* six geese a-laying, */
"w{%’l##w#’ i; :{nl]’/"

/* five gold rings;\n */
"*{q#’ld;r’}{nlwb!/"

/* four calling birds, */
"*de}’c ;;{nl’-{}rw]’/"

/* three french hens, */
"+,}##’*}#nc,’,#nw]’/"

/* two turtle doves\nand */
"+kd’+e}+;#’rdq#w! nr’/"

/* a partridge in a pear tree */
" ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/";

static int foo(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
foo(-79, -13, a + foo(-87, 1 – b, foo(-86, 0, a + 1) + a));
}
if (t < b)
{
foo(t + 1, b, a);
}
if (foo(-94, -27 + t, a) && (t == 2))
{
if (b < 13)
{
r = foo(2, b + 1, 0);
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = foo(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(a[31]);
}
else
{
r = foo(-65, b, a + 1);
}
}
else
{
r = foo((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = foo(2, 2, 0);
}
else
{
r = ((*a == ‘/’) || (foo(0, foo(-61, *a, CIPHER), a + 1)));
}
}
}
return r;
}

int main(int argc, char *argv[])
{
return foo(1, 0, 0);
}
[/code]

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.



[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

At this time of year I sometimes think about a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

[code language=”c”]
#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n’+,#’/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l \
q#’+d’K#!/+k#;q#’r}eKK#}w’r}eKK{nl]’/#;#q#n’){)#}w’){){nl]’/+#n’;d}rw’ i;# \
){nl]!/n{n#’; r{#w’r nc{nl]’/#{l,+’K {rw’ iK{;[{nl]’/w#q#n’wk nw’ \
iwk{KK{nl]!/w{%’l##w#’ i; :{nl]’/*{q#’ld;r’}{nlwb!/*de}’c \
;;{nl’-{}rw]’/+,}##’*}#nc,’,#nw]’/+kd’+e}+;#’rdq#w! nr’/ ‘) }+}{rl#'{n’ ‘)# \
}’+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}
[/code]

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

[code language=”c”]
#include <stdio.h>

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l \
q#’+d’K#!/+k#;q#’r}eKK#}w’r}eKK{nl]’/#;#q#n’){)#}w’){){nl]’/+#n’;d}rw’ i;# \
){nl]!/n{n#’; r{#w’r nc{nl]’/#{l,+’K {rw’ iK{;[{nl]’/w#q#n’wk nw’ \
iwk{KK{nl]!/w{%’l##w#’ i; :{nl]’/*{q#’ld;r’}{nlwb!/*de}’c \
;;{nl’-{}rw]’/+,}##’*}#nc,’,#nw]’/+kd’+e}+;#’rdq#w! nr’/ ‘) }+}{rl#'{n’ ‘)# \
}’+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

[code language=”c”]
#include <stdio.h>

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
"\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/"
"*{}w+/"
"w#cdnr/"
"+,{}r/"
"*de}+,/"
"*{*+,/"
"w{%+,/"
"w#q#n+,/"
"#{l+,/"
"n{n+,/"
"+#n+,/"
"#;#q#n+,/"
"+k#;*+,/"
"’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/"
"+k#;q#’r}eKK#}w’r}eKK{nl]’/"
"#;#q#n’){)#}w’){){nl]’/"
"+#n’;d}rw’ i;# ){nl]!/"
"n{n#’; r{#w’r nc{nl]’/"
"#{l,+’K {rw’ iK{;[{nl]’/"
"w#q#n’wk nw’ iwk{KK{nl]!/"
"w{%’l##w#’ i; :{nl]’/"
"*{q#’ld;r’}{nlwb!/"
"*de}’c ;;{nl’-{}rw]’/"
"+,}##’*}#nc,’,#nw]’/"
"+kd’+e}+;#’rdq#w! nr’/"
" ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

[code language=”c”]
static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
"\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/" /* On the */
"*{}w+/" /* first */
"w#cdnr/" /* second */
"+,{}r/" /* third */
"*de}+,/" /* fourth */
"*{*+,/" /* fifth */
"w{%+,/" /* sixth */
"w#q#n+,/" /* seventh */
"#{l+,/" /* eighth */
"n{n+,/" /* ninth */
"+#n+,/" /* tenth */
"#;#q#n+,/" /* eleventh */
"+k#;*+,/" /* twelfth */

/* day of Christmas my true love gave to me\n */
"’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/"

/* twelve drummers drumming, */
"+k#;q#’r}eKK#}w’r}eKK{nl]’/"

/* eleven pipers piping, */
"#;#q#n’){)#}w’){){nl]’/"

/* ten lords a-leaping,\n */
"+#n’;d}rw’ i;# ){nl]!/"

/* nine ladies dancing, */
"n{n#’; r{#w’r nc{nl]’/"

/* eight maids a-milking, */
"#{l,+’K {rw’ iK{;[{nl]’/"

/* seven swans a-swimming,\n */
"w#q#n’wk nw’ iwk{KK{nl]!/"

/* six geese a-laying, */
"w{%’l##w#’ i; :{nl]’/"

/* five gold rings;\n */
"*{q#’ld;r’}{nlwb!/"

/* four calling birds, */
"*de}’c ;;{nl’-{}rw]’/"

/* three french hens, */
"+,}##’*}#nc,’,#nw]’/"

/* two turtle doves\nand */
"+kd’+e}+;#’rdq#w! nr’/"

/* a partridge in a pear tree */
" ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/";
[/code]

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

[code language=”c”]
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

[code language=”c”]
int main(int t, int _, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

[code language=”c”]
int main(int t, int b, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
main(2,b+1,"%s %d %d\n"):9:16:t<0?t<-72?main(b,t,LYRICS)
:t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a==’/’)+t,b,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

[code language=”c”]
int main(int t, int b, const char *a)
{
return
!0<t
?
t<3
?
main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a))
:
1, t<b
?
main(t+1,b,a)
:
3, main(-94,-27+t,a)&&t==2
?
b<13
?
main(2,b+1,"%s %d %d\n")
:
9
:
16
:
t<0
?
t<-72
?
main(b,t,LYRICS)
:
t<-50
?
b==*a
?
putchar(31[a])
:
main(-65,b,a+1)
:
main((*a==’/’)+t,b,a+1)
:
0<t
?
main(2,2,"%s")
:
*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}
[/code]

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

[code language=”c”]
int main(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
main(-79, -13, a + main(-87, 1 – b, main(-86, 0, a + 1) + a));
}
else
{
1;
}
if (t < b)
{
main(t + 1, b, a);
}
else
{
3;
}
if (main(-94, -27 + t, a) && t == 2)
{
if (b < 13)
{
r = main(2, b + 1, "%s %d %d\n");
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = main(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(31[a]);
}
else
{
r = main(-65, b, a + 1);
}
}
else
{
r = main((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = main(2, 2, "%s");
}
else
{
r = *a == ‘/’ || main(0, main(-61, *a, CIPHER), a + 1);
}
}
}
return r;
}
[/code]

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

[code language=”c”]

r = putchar(a[31]);

[/code]

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

[code language=”c”]

if (main(-94, -27 + t, a) && (t == 2))


r = ((*a == ‘/’) || (main(0, main(-61, *a, CIPHER), a + 1)));

[/code]

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

[code language=”c”]

if (b < 13)
{
r = main(2, b + 1, 0);
}


r = main(2, 2, 0);

[/code]

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

[code language=”c”]
#include <stdio.h>

static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
"\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/" /* On the */
"*{}w+/" /* first */
"w#cdnr/" /* second */
"+,{}r/" /* third */
"*de}+,/" /* fourth */
"*{*+,/" /* fifth */
"w{%+,/" /* sixth */
"w#q#n+,/" /* seventh */
"#{l+,/" /* eighth */
"n{n+,/" /* ninth */
"+#n+,/" /* tenth */
"#;#q#n+,/" /* eleventh */
"+k#;*+,/" /* twelfth */

/* day of Christmas my true love gave to me\n */
"’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/"

/* twelve drummers drumming, */
"+k#;q#’r}eKK#}w’r}eKK{nl]’/"

/* eleven pipers piping, */
"#;#q#n’){)#}w’){){nl]’/"

/* ten lords a-leaping,\n */
"+#n’;d}rw’ i;# ){nl]!/"

/* nine ladies dancing, */
"n{n#’; r{#w’r nc{nl]’/"

/* eight maids a-milking, */
"#{l,+’K {rw’ iK{;[{nl]’/"

/* seven swans a-swimming,\n */
"w#q#n’wk nw’ iwk{KK{nl]!/"

/* six geese a-laying, */
"w{%’l##w#’ i; :{nl]’/"

/* five gold rings;\n */
"*{q#’ld;r’}{nlwb!/"

/* four calling birds, */
"*de}’c ;;{nl’-{}rw]’/"

/* three french hens, */
"+,}##’*}#nc,’,#nw]’/"

/* two turtle doves\nand */
"+kd’+e}+;#’rdq#w! nr’/"

/* a partridge in a pear tree */
" ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/";

static int foo(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
foo(-79, -13, a + foo(-87, 1 – b, foo(-86, 0, a + 1) + a));
}
if (t < b)
{
foo(t + 1, b, a);
}
if (foo(-94, -27 + t, a) && (t == 2))
{
if (b < 13)
{
r = foo(2, b + 1, 0);
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = foo(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(a[31]);
}
else
{
r = foo(-65, b, a + 1);
}
}
else
{
r = foo((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = foo(2, 2, 0);
}
else
{
r = ((*a == ‘/’) || (foo(0, foo(-61, *a, CIPHER), a + 1)));
}
}
}
return r;
}

int main(int argc, char *argv[])
{
return foo(1, 0, 0);
}
[/code]

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.

[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

Further Reading

Twelve Ways to Style The Twelve Days

Twelve Ways to Style The Twelve Days

Kevin Jones - Senior Consultant, Embedded Systems

By: Kevin Jones
Senior Consultant, Embedded Systems

10th January 2018

Home » Insights » Processing & Programming

During the Christmas period, I was reminded of a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save


#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
#include

static const char *CIPHER =
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l \
q#’+d’K#!/+k#;q#’r}eKK#}w’r}eKK{nl]’/#;#q#n’){)#}w’){){nl]’/+#n’;d}rw’ i;# \
){nl]!/n{n#’; r{#w’r nc{nl]’/#{l,+’K {rw’ iK{;[{nl]’/w#q#n’wk nw’ \
iwk{KK{nl]!/w{%’l##w#’ i; :{nl]’/*{q#’ld;r’}{nlwb!/*de}’c \
;;{nl’-{}rw]’/+,}##’*}#nc,’,#nw]’/+kd’+e}+;#’rdq#w! nr’/ ‘) }+}{rl#'{n’ ‘)# \
}’+}##(!!/";

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
#include

static const char *CIPHER =
&quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
&quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
&quot;@n’+,#’/&quot;
&quot;*{}w+/&quot;
&quot;w#cdnr/&quot;
&quot;+,{}r/&quot;
&quot;*de}+,/&quot;
&quot;*{*+,/&quot;
&quot;w{%+,/&quot;
&quot;w#q#n+,/&quot;
&quot;#{l+,/&quot;
&quot;n{n+,/&quot;
&quot;+#n+,/&quot;
&quot;#;#q#n+,/&quot;
&quot;+k#;*+,/&quot;
&quot;’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/&quot;
&quot;+k#;q#’r}eKK#}w’r}eKK{nl]’/&quot;
&quot;#;#q#n’){)#}w’){){nl]’/&quot;
&quot;+#n’;d}rw’ i;# ){nl]!/&quot;
&quot;n{n#’; r{#w’r nc{nl]’/&quot;
&quot;#{l,+’K {rw’ iK{;[{nl]’/&quot;
&quot;w#q#n’wk nw’ iwk{KK{nl]!/&quot;
&quot;w{%’l##w#’ i; :{nl]’/&quot;
&quot;*{q#’ld;r’}{nlwb!/&quot;
&quot;*de}’c ;;{nl’-{}rw]’/&quot;
&quot;+,}##’*}#nc,’,#nw]’/&quot;
&quot;+kd’+e}+;#’rdq#w! nr’/&quot;
&quot; ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/&quot;;

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
&quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
&quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
&quot;@n’+,#’/&quot; /* On the */
&quot;*{}w+/&quot; /* first */
&quot;w#cdnr/&quot; /* second */
&quot;+,{}r/&quot; /* third */
&quot;*de}+,/&quot; /* fourth */
&quot;*{*+,/&quot; /* fifth */
&quot;w{%+,/&quot; /* sixth */
&quot;w#q#n+,/&quot; /* seventh */
&quot;#{l+,/&quot; /* eighth */
&quot;n{n+,/&quot; /* ninth */
&quot;+#n+,/&quot; /* tenth */
&quot;#;#q#n+,/&quot; /* eleventh */
&quot;+k#;*+,/&quot; /* twelfth */

/* day of Christmas my true love gave to me\n */
&quot;’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/&quot;

/* twelve drummers drumming, */
&quot;+k#;q#’r}eKK#}w’r}eKK{nl]’/&quot;

/* eleven pipers piping, */
&quot;#;#q#n’){)#}w’){){nl]’/&quot;

/* ten lords a-leaping,\n */
&quot;+#n’;d}rw’ i;# ){nl]!/&quot;

/* nine ladies dancing, */
&quot;n{n#’; r{#w’r nc{nl]’/&quot;

/* eight maids a-milking, */
&quot;#{l,+’K {rw’ iK{;[{nl]’/&quot;

/* seven swans a-swimming,\n */
&quot;w#q#n’wk nw’ iwk{KK{nl]!/&quot;

/* six geese a-laying, */
&quot;w{%’l##w#’ i; :{nl]’/&quot;

/* five gold rings;\n */
&quot;*{q#’ld;r’}{nlwb!/&quot;

/* four calling birds, */
&quot;*de}’c ;;{nl’-{}rw]’/&quot;

/* three french hens, */
&quot;+,}##’*}#nc,’,#nw]’/&quot;

/* two turtle doves\nand */
&quot;+kd’+e}+;#’rdq#w! nr’/&quot;

/* a partridge in a pear tree */
&quot; ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/&quot;;

[/c]

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
int main(int t, int _, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
int main(int t, int b, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
main(2,b+1,"%s %d %d\n"):9:16:t<0?t<-72?main(b,t,LYRICS)
:t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a==’/’)+t,b,a+1)
:0<t?main(2,2,"%s"):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
int main(int t, int b, const char *a)
{
return
!0

<pre> [/c]

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

[c]
int main(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
main(-79, -13, a + main(-87, 1 – b, main(-86, 0, a + 1) + a));
}
else
{
1;
}
if (t < b)
{
main(t + 1, b, a);
}
else
{
3;
}
if (main(-94, -27 + t, a) && t == 2)
{
if (b < 13)
{
r = main(2, b + 1, "%s %d %d\n");
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = main(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(31[a]);
}
else
{
r = main(-65, b, a + 1);
}
}
else
{
r = main((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = main(2, 2, "%s");
}
else
{
r = *a == ‘/’ || main(0, main(-61, *a, CIPHER), a + 1);
}
}
}
return r;
}

[/c]

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]

r = putchar(a[31]);

[/c]

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]

if (main(-94, -27 + t, a) && (t == 2))


r = ((*a == ‘/’) || (main(0, main(-61, *a, CIPHER), a + 1)));

[/c]

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]

if (b < 13)
{
r = main(2, b + 1, 0);
}


r = main(2, 2, 0);

[/c]

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
#include

static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:"
"\nuwloca-O;m .vpbks,fxntdCeghiry";

static const char *LYRICS =
"@n’+,#’/" /* On the */
"*{}w+/" /* first */
"w#cdnr/" /* second */
"+,{}r/" /* third */
"*de}+,/" /* fourth */
"*{*+,/" /* fifth */
"w{%+,/" /* sixth */
"w#q#n+,/" /* seventh */
"#{l+,/" /* eighth */
"n{n+,/" /* ninth */
"+#n+,/" /* tenth */
"#;#q#n+,/" /* eleventh */
"+k#;*+,/" /* twelfth */

/* day of Christmas my true love gave to me\n */
"’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/"

/* twelve drummers drumming, */
"+k#;q#’r}eKK#}w’r}eKK{nl]’/"

/* eleven pipers piping, */
"#;#q#n’){)#}w’){){nl]’/"

/* ten lords a-leaping,\n */
"+#n’;d}rw’ i;# ){nl]!/"

/* nine ladies dancing, */
"n{n#’; r{#w’r nc{nl]’/"

/* eight maids a-milking, */
"#{l,+’K {rw’ iK{;[{nl]’/"

/* seven swans a-swimming,\n */
"w#q#n’wk nw’ iwk{KK{nl]!/"

/* six geese a-laying, */
"w{%’l##w#’ i; :{nl]’/"

/* five gold rings;\n */
"*{q#’ld;r’}{nlwb!/"

/* four calling birds, */
"*de}’c ;;{nl’-{}rw]’/"

/* three french hens, */
"+,}##’*}#nc,’,#nw]’/"

/* two turtle doves\nand */
"+kd’+e}+;#’rdq#w! nr’/"

/* a partridge in a pear tree */
" ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/";

static int foo(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
foo(-79, -13, a + foo(-87, 1 – b, foo(-86, 0, a + 1) + a));
}
if (t < b)
{
foo(t + 1, b, a);
}
if (foo(-94, -27 + t, a) && (t == 2))
{
if (b < 13)
{
r = foo(2, b + 1, 0);
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = foo(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(a[31]);
}
else
{
r = foo(-65, b, a + 1);
}
}
else
{
r = foo((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = foo(2, 2, 0);
}
else
{
r = ((*a == ‘/’) || (foo(0, foo(-61, *a, CIPHER), a + 1)));
}
}
}
return r;
}

int main(int argc, char *argv[])
{
return foo(1, 0, 0);
}

[/c]

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.





[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

During the Christmas period, I was reminded of a now-famous piece of obfuscated C code. This was the winning entry in the 1988 International Obfuscated C Code Contest [see reference 1] and, believe it or not, will print the entire lyrics to “The Twelve Days of Christmas”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save


#include <stdio.h>
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,"%s %d %d\n"):9:16:t<0?t<-72?main(_,t,
"@n'+,#'/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/'r :'d*'3,}{w+K w'K:'+}e#';dq#'l \
q#'+d'K#!/+k#;q#'r}eKK#}w'r}eKK{nl]'/#;#q#n'){)#}w'){){nl]'/+#n';d}rw' i;# \
){nl]!/n{n#'; r{#w'r nc{nl]'/#{l,+'K {rw' iK{;[{nl]'/w#q#n'wk nw' \
iwk{KK{nl]!/w{%'l##w#' i; :{nl]'/*{q#'ld;r'}{nlwb!/*de}'c \
;;{nl'-{}rw]'/+,}##'*}#nc,',#nw]'/+kd'+e}+;#'rdq#w! nr'/ ') }+}{rl#'{n' ')# \
}'+}##(!!/")
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a=='/')+t,_,a+1)
:0<t?main(2,2,"%s"):*a=='/'||main(0,main(-61,*a,
"!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry"),a+1);
}

There are plenty of articles online that explain how this works; just use your favourite search engine to find out more [see reference 2]. Rather than repeat those descriptions, I’ll stick to the festive theme that prompted this blog and illustrate twelve ways to improve the readability of this wonderful algorithm.

1) Extract the Constant Strings

There are two strings lurking in the source code. The longer string contains an encoded version of the song lyrics and the shorter string contains a substitution cipher to decode the lyrics. The source code is immediately much clearer if these strings are declared near the beginning of the file. In the following example, “static” limits the scope of the strings to one source file and “const” declares them as read-only. I like to use uppercase symbols for constants but that’s nothing more than personal choice. Here’s the modified source code:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
#include <stdio.h>

static const char *CIPHER =
&quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
&quot;@n’+,#’/*{}w+/w#cdnr/+,{}r/*de}+,/*{*+,/w{%+,/w#q#n+,/#{l+,/n{n+,/+#n+,/#\
;#q#n+,/+k#;*+,/’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l \
q#’+d’K#!/+k#;q#’r}eKK#}w’r}eKK{nl]’/#;#q#n’){)#}w’){){nl]’/+#n’;d}rw’ i;# \
){nl]!/n{n#’; r{#w’r nc{nl]’/#{l,+’K {rw’ iK{;[{nl]’/w#q#n’wk nw’ \
iwk{KK{nl]!/w{%’l##w#’ i; :{nl]’/*{q#’ld;r’}{nlwb!/*de}’c \
;;{nl’-{}rw]’/+,}##’*}#nc,’,#nw]’/+kd’+e}+;#’rdq#w! nr’/ ‘) }+}{rl#'{n’ ‘)# \
}’+}##(!!/&quot;;

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

2) Reformat the Strings

The two strings are recursively indexed to decode and display the lyrics. Reformatting these strings helps to understand the algorithm. In the next step the lyrics and the cipher are declared using C’s alternative string concatenation syntax. The substitution cipher is split after the 31st character, the lyrics are split at each “/” character and the original code’s “\” concatenation in the lyrics string has been eliminated. There’s also some indentation to improve readability:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
#include

static const char *CIPHER =
&quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
&quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
&quot;@n’+,#’/&quot;
&quot;*{}w+/&quot;
&quot;w#cdnr/&quot;
&quot;+,{}r/&quot;
&quot;*de}+,/&quot;
&quot;*{*+,/&quot;
&quot;w{%+,/&quot;
&quot;w#q#n+,/&quot;
&quot;#{l+,/&quot;
&quot;n{n+,/&quot;
&quot;+#n+,/&quot;
&quot;#;#q#n+,/&quot;
&quot;+k#;*+,/&quot;
&quot;’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/&quot;
&quot;+k#;q#’r}eKK#}w’r}eKK{nl]’/&quot;
&quot;#;#q#n’){)#}w’){){nl]’/&quot;
&quot;+#n’;d}rw’ i;# ){nl]!/&quot;
&quot;n{n#’; r{#w’r nc{nl]’/&quot;
&quot;#{l,+’K {rw’ iK{;[{nl]’/&quot;
&quot;w#q#n’wk nw’ iwk{KK{nl]!/&quot;
&quot;w{%’l##w#’ i; :{nl]’/&quot;
&quot;*{q#’ld;r’}{nlwb!/&quot;
&quot;*de}’c ;;{nl’-{}rw]’/&quot;
&quot;+,}##’*}#nc,’,#nw]’/&quot;
&quot;+kd’+e}+;#’rdq#w! nr’/&quot;
&quot; ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/&quot;;

main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

3) Add Some Comments

All good programmers know that comments are useful. Annotating the constant strings immediately helps understand the algorithm even before any changes are made to the function. I prefer block comments for maximum portability but end-of-line comments would work just as well:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
&quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
&quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
&quot;@n’+,#’/&quot; /* On the */
&quot;*{}w+/&quot; /* first */
&quot;w#cdnr/&quot; /* second */
&quot;+,{}r/&quot; /* third */
&quot;*de}+,/&quot; /* fourth */
&quot;*{*+,/&quot; /* fifth */
&quot;w{%+,/&quot; /* sixth */
&quot;w#q#n+,/&quot; /* seventh */
&quot;#{l+,/&quot; /* eighth */
&quot;n{n+,/&quot; /* ninth */
&quot;+#n+,/&quot; /* tenth */
&quot;#;#q#n+,/&quot; /* eleventh */
&quot;+k#;*+,/&quot; /* twelfth */

/* day of Christmas my true love gave to me\n */
&quot;’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/&quot;

/* twelve drummers drumming, */
&quot;+k#;q#’r}eKK#}w’r}eKK{nl]’/&quot;

/* eleven pipers piping, */
&quot;#;#q#n’){)#}w’){){nl]’/&quot;

/* ten lords a-leaping,\n */
&quot;+#n’;d}rw’ i;# ){nl]!/&quot;

/* nine ladies dancing, */
&quot;n{n#’; r{#w’r nc{nl]’/&quot;

/* eight maids a-milking, */
&quot;#{l,+’K {rw’ iK{;[{nl]’/&quot;

/* seven swans a-swimming,\n */
&quot;w#q#n’wk nw’ iwk{KK{nl]!/&quot;

/* six geese a-laying, */
&quot;w{%’l##w#’ i; :{nl]’/&quot;

/* five gold rings;\n */
&quot;*{q#’ld;r’}{nlwb!/&quot;

/* four calling birds, */
&quot;*de}’c ;;{nl’-{}rw]’/&quot;

/* three french hens, */
&quot;+,}##’*}#nc,’,#nw]’/&quot;

/* two turtle doves\nand */
&quot;+kd’+e}+;#’rdq#w! nr’/&quot;

/* a partridge in a pear tree */
&quot; ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/&quot;;

[/c]

4) Indent the Function Contents

This step is straight forward… simply indent the contents of the single function:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
main(t,_,a)
char *a;
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

5) Eliminate Implicit Types

If you’re compiling the examples then you may have noticed that there are plenty of implicit type warnings. The main function uses an old C syntax to declare parameter “a” as pointer-to-char. The remaining parameters are implicit integers and the return type of the main function is also an implicit integer. These are fixed in the next sample while taking the opportunity to include the “const” type qualifier:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
int main(int t, int _, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-_,main(-86,0,a+1)+a)):
1,t<_?main(t+1,_,a):3,main(-94,-27+t,a)&&t==2?_<13?
main(2,_+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(_,t,LYRICS)
:t<-50?_==*a?putchar(31[a]):main(-65,_,a+1):main((*a==’/’)+t,_,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

6) Rename the Underscore Parameter

The second parameter of the main function is a single underscore. While this is perfectly legal in C, it’s a lot harder to read than an alphabetic character. The next sample has replaced all instances of the underscore parameter with the single character “b”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
int main(int t, int b, const char *a)
{
return!0<t?t<3?main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a)):
1,t<b?main(t+1,b,a):3,main(-94,-27+t,a)&&t==2?b<13?
main(2,b+1,&quot;%s %d %d\n&quot;):9:16:t<0?t<-72?main(b,t,LYRICS)
:t<-50?b==*a?putchar(31[a]):main(-65,b,a+1):main((*a==’/’)+t,b,a+1)
:0<t?main(2,2,&quot;%s&quot;):*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

7) Reformat the Ternary Conditional Operator

The obfuscated implementation makes extensive use of the ternary conditional operator. The next sample reformats these ternary operators with some additional whitespace:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
int main(int t, int b, const char *a)
{
return
!0<t
?
t<3
?
main(-79,-13,a+main(-87,1-b,main(-86,0,a+1)+a))
:
1, t<b
?
main(t+1,b,a)
:
3, main(-94,-27+t,a)&&t==2
?
b<13
?
main(2,b+1,&quot;%s %d %d\n&quot;)
:
9
:
16
:
t<0
?
t<-72
?
main(b,t,LYRICS)
:
t<-50
?
b==*a
?
putchar(31[a])
:
main(-65,b,a+1)
:
main((*a==’/’)+t,b,a+1)
:
0<t
?
main(2,2,&quot;%s&quot;)
:
*a==’/’||main(0,main(-61,*a,CIPHER),a+1);
}

[/c]

8) Replace the Ternary Conditional Statements with If-Else

A simple replacement of the ternary conditional statements with if-else statements improves the presentation of the algorithm. In the following sample, there is yet more white space and the multiple return points have been rationalised into a single point of return by using a temporary integer variable “r”:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
int main(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
main(-79, -13, a + main(-87, 1 – b, main(-86, 0, a + 1) + a));
}
else
{
1;
}
if (t < b)
{
main(t + 1, b, a);
}
else
{
3;
}
if (main(-94, -27 + t, a) && t == 2)
{
if (b < 13)
{
r = main(2, b + 1, &quot;%s %d %d\n&quot;);
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = main(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(31[a]);
}
else
{
r = main(-65, b, a + 1);
}
}
else
{
r = main((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = main(2, 2, &quot;%s&quot;);
}
else
{
r = *a == ‘/’ || main(0, main(-61, *a, CIPHER), a + 1);
}
}
}
return r;
}

[/c]

9) Simplify an Array Access

This change is straight forward. The array access “31[a]” is exactly the same as “a[31]” yet the latter is much easier to understand in the context of the 31 character substitution cipher:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]

r = putchar(a[31]);

[/c]

10) Add Brackets to Logical Expressions

Personally, I prefer plenty of brackets in compound logical expressions. The following snips are small improvements:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]

if (main(-94, -27 + t, a) && (t == 2))


r = ((*a == ‘/’) || (main(0, main(-61, *a, CIPHER), a + 1)));

[/c]

11) Remove Misdirection

There are several instances where the implementation is deliberately misleading. Given the reformatted source code so far, it’s easy to see that there are two “else” clauses with no side effects (“else {1;}” and “else {3;}”). Less obvious is that the parameters that look like string formatting specifiers passed to the main function are unused in that iteration of the function and can be replaced with zero:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]

if (b < 13)
{
r = main(2, b + 1, 0);
}


r = main(2, 2, 0);

[/c]

12) Implement a Stand-Alone Recursive Function

The entire algorithm so far uses the single “main” function in an unusual recursive manner. After the last of the twelve changes, the main function looks like a standard C main function and there’s a separate local-scope (static) recursive function which, in time-honoured programmer’s tradition, is called “foo”. This gives the final complete sample after all twelve changes:

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

[c]
#include

static const char *CIPHER =
/*
* Substitution cipher:
* Replace ‘!’ with ‘\n’
* Replace ‘e’ with ‘u’
* …
* Replace ‘:’ with ‘y’
*/
&quot;!ek;dc i@bK'(q)-[w]*%n+r3#l,{}:&quot;
&quot;\nuwloca-O;m .vpbks,fxntdCeghiry&quot;;

static const char *LYRICS =
&quot;@n’+,#’/&quot; /* On the */
&quot;*{}w+/&quot; /* first */
&quot;w#cdnr/&quot; /* second */
&quot;+,{}r/&quot; /* third */
&quot;*de}+,/&quot; /* fourth */
&quot;*{*+,/&quot; /* fifth */
&quot;w{%+,/&quot; /* sixth */
&quot;w#q#n+,/&quot; /* seventh */
&quot;#{l+,/&quot; /* eighth */
&quot;n{n+,/&quot; /* ninth */
&quot;+#n+,/&quot; /* tenth */
&quot;#;#q#n+,/&quot; /* eleventh */
&quot;+k#;*+,/&quot; /* twelfth */

/* day of Christmas my true love gave to me\n */
&quot;’r :’d*’3,}{w+K w’K:’+}e#’;dq#’l q#’+d’K#!/&quot;

/* twelve drummers drumming, */
&quot;+k#;q#’r}eKK#}w’r}eKK{nl]’/&quot;

/* eleven pipers piping, */
&quot;#;#q#n’){)#}w’){){nl]’/&quot;

/* ten lords a-leaping,\n */
&quot;+#n’;d}rw’ i;# ){nl]!/&quot;

/* nine ladies dancing, */
&quot;n{n#’; r{#w’r nc{nl]’/&quot;

/* eight maids a-milking, */
&quot;#{l,+’K {rw’ iK{;[{nl]’/&quot;

/* seven swans a-swimming,\n */
&quot;w#q#n’wk nw’ iwk{KK{nl]!/&quot;

/* six geese a-laying, */
&quot;w{%’l##w#’ i; :{nl]’/&quot;

/* five gold rings;\n */
&quot;*{q#’ld;r’}{nlwb!/&quot;

/* four calling birds, */
&quot;*de}’c ;;{nl’-{}rw]’/&quot;

/* three french hens, */
&quot;+,}##’*}#nc,’,#nw]’/&quot;

/* two turtle doves\nand */
&quot;+kd’+e}+;#’rdq#w! nr’/&quot;

/* a partridge in a pear tree */
&quot; ‘) }+}{rl#'{n’ ‘)# }’+}##(!!/&quot;;

static int foo(int t, int b, const char *a)
{
int r;
if (!0 < t)
{
if (t < 3)
{
foo(-79, -13, a + foo(-87, 1 – b, foo(-86, 0, a + 1) + a));
}
if (t < b)
{
foo(t + 1, b, a);
}
if (foo(-94, -27 + t, a) && (t == 2))
{
if (b < 13)
{
r = foo(2, b + 1, 0);
}
else
{
r = 9;
}
}
else
{
r = 16;
}
}
else
{
if (t < 0)
{
if (t < -72)
{
r = foo(b, t, LYRICS);
}
else
{
if (t < -50)
{
if (b == *a)
{
r = putchar(a[31]);
}
else
{
r = foo(-65, b, a + 1);
}
}
else
{
r = foo((*a == ‘/’) + t, b, a + 1);
}
}
}
else
{
if (0 < t)
{
r = foo(2, 2, 0);
}
else
{
r = ((*a == ‘/’) || (foo(0, foo(-61, *a, CIPHER), a + 1)));
}
}
}
return r;
}

int main(int argc, char *argv[])
{
return foo(1, 0, 0);
}

[/c]

There are plenty of changes that could still be made to improve readability, such as more comments and replacing the recursive algorithm with iterative loops… but I’ve made my promised twelve improvements, leaving you with plenty of opportunity to make some of your own.





[1] At the time of writing this blog, the original code can be found on the International Obfuscated C Code Contest website on the “Winning Entries” web page, listed as “phillipps.c” under the 1988 sub-heading.

[2] Credit goes to an analysis by Mike Markowski and to a similar exercise in reformatting the source code by Michael Nahas.

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Further Reading

Programmable Systems on Modules: The ‘New Kid’ on the Embedded Block

Programmable Systems on Modules: The ‘New Kid’ on the Embedded Block

Phil Duff - Lead Consultant, Signal Processing

By: Phil Duff
Lead Consultant, Signal Processing

1st November 2017

Home » Insights » Processing & Programming

As an engineer specialising in high performance real-time embedded systems, I appreciate great ideas, but in the back of my mind I’m always asking “How can we make it a reality?” and “How can we get it off the drawing board?”

One of the technologies I have been an evangelist of is the new breed of programmable system on chip (PSoC) devices. These can often be used in the form of readily accessible Programmable System on Modules (SoM) but first; let’s quickly take a step back.

For more than 20 years of signal processing, I have seen logic-based signal processing combined with Central Processing Units (CPUs) to produce powerful solutions. All this is now delivered from a single central device.

The performance and capability are now both better than ever before and the power consumption is down, however, the complexity is still there. Regardless, these devices comprise of CPU cores and field-programmable gate array (FPGA) logic, both of which are highly programmable.

Fortunately, many vendors now provide SoMs to host these devices, together with their much-needed essentials, such as memory, power supplies, Ethernet and USB ports.

The SoM is hosted on a carrier board. The SoM vendors usually provide development boards which are very useful starting points. We typically start on one of these development boards before building a custom carrier that is specific to the product’s requirements. The PSoC and SoM also allow very flexible interfacing capabilities of programmable logic, interfacing so that almost anything can be realised.

I have seen some applications where the FPGA element is simply used to assist with interfacing, but in others it has been used for high-performance digital signal processing. The latter of which is my domain, and I thoroughly approve!

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

As an engineer specialising in high performance real-time embedded systems, I appreciate great ideas, but in the back of my mind I’m always asking “How can we make it a reality?” and “How can we get it off the drawing board?”

One of the technologies I have been an evangelist of is the new breed of programmable system on chip (PSoC) devices. These can often be used in the form of readily accessible Programmable System on Modules (SoM) but first; let’s quickly take a step back.

For more than 20 years of signal processing, I have seen logic-based signal processing combined with CPUs to produce powerful solutions. All this is now delivered from a single central device.

The performance and capability are now both better than ever before and the power consumption is down, however, the complexity is still there. Regardless, these devices comprise of CPU cores and field-programmable gate array (FPGA) logic, both of which are highly programmable.

Fortunately, many vendors now provide SoMs to host these devices, together with their much-needed essentials, such as memory, power supplies, Ethernet and USB ports.

The SoM is hosted on a carrier board. The SoM vendors usually provide development boards which are very useful starting points. We typically start on one of these development boards before building a custom carrier that is specific to the product’s requirements. The PSoC and SoM also allow very flexible interfacing capabilities of programmable logic, interfacing so that almost anything can be realised.

I have seen some applications where the FPGA element is simply used to assist with interfacing, but in others it has been used for high-performance digital signal processing. The latter of which is my domain, and I thoroughly approve!

Save

Save

Save

Save

Save

Save

Save

Save

Save

Further Reading

Neural Networks: Can Androids Recognise Electric Sheep?

Neural Networks: Can Androids Recognise Electric Sheep?

Damien Clarke - Senior Consultant, Data Exploitation

By: Damien Clarke
Senior Consultant, Data Exploitation

20th September 2017

Home » Insights » Processing & Programming

In 2010, Lt. Gen. David Deptula, the US Air Force deputy chief of staff for intelligence, was quoted as saying:

“We’re going to find ourselves, in the not too distant future, swimming in sensors and drowning in data.”

Since then, this flood of data is showing no signs of slowing. In fact, it is actually accelerating as greater volumes of data are being generated every day. This is just as true in the civilian context as the military context.

For organisations with access to these large volumes of data, it would be profitable to employ data exploitation techniques to convert the raw data into useful information. This can sometimes be achieved by developing custom data processing techniques for specific situations. However, in many cases, it is better to use machine learning techniques to allow computers to learn from data without being explicitly programmed. At Plextek, we’re passionate about developing and implementing the right data exploitation techniques for the application and are working to ensure that humanity stays afloat (and dry) in Deptula’s prediction.

There is a wide range of potential machine learning techniques to choose from, but one approach is to copy nature and mimic biological brains. This was inspired by the fact that one of the primary purposes of a brain is to process sensory inputs and extract useful information for future exploitation. A biological brain can be produced in software form by modelling a large set of connected neurons. This is an artificial neural network.

How does an artificial neural network work?

The basic building block of a neural network is a single neuron. A neuron transforms a set of one or more input values into a single output by applying a mathematical function to the weighted sum of input values. This output value is then passed to one or more connected neurons to be used as a subsequent input value.

The neural network as a whole can, therefore, be defined by three sets of parameters:

  The weight assigned to each input value for each neuron.

  The function which converts the weighted sum of input values into the output value.

  The pattern of connections between neurons.

A simple example neural network consists of three layers. The first layer contains the input values which represent the data being analysed. This layer is then connected to a hidden layer of neurons. The hidden layer then connects to the third and final layer which contains the output neurons whose values represent the processed data. This design allows a complicated relationship between inputs and outputs.



How is a neural network trained?

Just like biological brains, simply creating a neural network is not sufficient to extract information from raw data. It is also necessary to train the network by exposing it to data for which the desired outputs are already known. This process is used to define the weights assigned to each connection throughout the entire network.

As the size and complexity of the neural network increases, the number of weights that must be defined for optimum performance increases significantly. This training process, therefore, requires a large and representative set of labelled data; otherwise, the neural network may not work successfully on future input data. Also, this training process is computationally challenging and may take significant processing time to perform. GPU acceleration can be used to mitigate this; however, the process may still take days for very large data sets.  

Conversely, if large volumes of suitable training data are available, it is possible to create a more complex neural network to improve performance. This can be achieved by increasing the number of hidden layers and therefore the total number of connections within the network. This use of complex neural networks with many layers and connections is called deep learning.



What can neural networks be used for?

With a sufficiently large neural network and suitable training data, it is possible to learn complex non-linear relationships between input and output values. This can reveal insights into data which are not possible when using simple linear mathematical models.

While neural networks are suitable as general purpose problem solvers, they are particularly suited for tasks when an understanding of the underlying relationships in the raw data is neither available nor necessarily required and sufficient data is also available for training.

An important example of this capability is the recognition of objects in images. This is achieved through the use of a neural network which has been trained on a large volume of photos of known objects (e.g. ImageNet). While the training process can take a long time, subsequent object recognition is much faster and can potentially be performed in real time. Due to the large volume of training data and the complexity of the neural network used the resulting object recognition performance is close to human level performance. This can be used to in a military context to recognise different vehicles (e.g. a tank) or in a civilian context to see if computers can distinguish between different animals (Do Androids Dream of Electric Sheep?).



Neural networks are not just limited to processing photos and the same approach can be applied to a wide range of sensor and non-sensor data. The most important requirement is that a suitable volume of labelled training data is available to train the network before it can be used on unknown data.

Save

Save

Save

Save

Save

Save

Save

Save

Save

Save

In 2010, Lt. Gen. David Deptula, the US Air Force deputy chief of staff for intelligence, was quoted as saying:

“We’re going to find ourselves, in the not too distant future, swimming in sensors and drowning in data.”

Since then, this flood of data is showing no signs of slowing. In fact, it is actually accelerating as greater volumes of data are being generated every day. This is just as true in the civilian context as the military context.

For organisations with access to these large volumes of data, it would be profitable to employ data exploitation techniques to convert the raw data into useful information. This can sometimes be achieved by developing custom data processing techniques for specific situations. However, in many cases, it is better to use machine learning techniques to allow computers to learn from data without being explicitly programmed. At Plextek, we’re passionate about developing and implementing the right data exploitation techniques for the application and are working to ensure that humanity stays afloat (and dry) in Deptula’s prediction.

There is a wide range of potential machine learning techniques to choose from, but one approach is to copy nature and mimic biological brains. This was inspired by the fact that one of the primary purposes of a brain is to process sensory inputs and extract useful information for future exploitation. A biological brain can be produced in software form by modelling a large set of connected neurons. This is an artificial neural network.

How does an artificial neural network work?

The basic building block of a neural network is a single neuron. A neuron transforms a set of one or more input values into a single output by applying a mathematical function to the weighted sum of input values. This output value is then passed to one or more connected neurons to be used as a subsequent input value.

The neural network as a whole can, therefore, be defined by three sets of parameters:

  The weight assigned to each input value for each neuron.

  The function which converts the weighted sum of input values into the output value.

  The pattern of connections between neurons.

A simple example neural network consists of three layers. The first layer contains the input values which represent the data being analysed. This layer is then connected to a hidden layer of neurons. The hidden layer then connects to the third and final layer which contains the output neurons whose values represent the processed data. This design allows a complicated relationship between inputs and outputs.



How is a neural network trained?

Just like biological brains, simply creating a neural network is not sufficient to extract information from raw data. It is also necessary to train the network by exposing it to data for which the desired outputs are already known. This process is used to define the weights assigned to each connection throughout the entire network.

As the size and complexity of the neural network increases, the number of weights that must be defined for optimum performance increases significantly. This training process, therefore, requires a large and representative set of labelled data; otherwise, the neural network may not work successfully on future input data. Also, this training process is computationally challenging and may take significant processing time to perform. GPU acceleration can be used to mitigate this; however, the process may still take days for very large data sets.  

Conversely, if large volumes of suitable training data are available, it is possible to create a more complex neural network to improve performance. This can be achieved by increasing the number of hidden layers and therefore the total number of connections within the network. This use of complex neural networks with many layers and connections is called deep learning.



What can neural networks be used for?

With a sufficiently large neural network and suitable training data, it is possible to learn complex non-linear relationships between input and output values. This can reveal insights into data which are not possible when using simple linear mathematical models.

While neural networks are suitable as general purpose problem solvers, they are particularly suited for tasks when an understanding of the underlying relationships in the raw data is neither available nor necessarily required and sufficient data is also available for training.

An important example of this capability is the recognition of objects in images. This is achieved through the use of a neural network which has been trained on a large volume of photos of known objects (e.g. ImageNet). While the training process can take a long time, subsequent object recognition is much faster and can potentially be performed in real time. Due to the large volume of training data and the complexity of the neural network used the resulting object recognition performance is close to human level performance. This can be used to in a military context to recognise different vehicles (e.g. a tank) or in a civilian context to see if computers can distinguish between different animals (Do Androids Dream of Electric Sheep?).



Neural networks are not just limited to processing photos and the same approach can be applied to a wide range of sensor and non-sensor data. The most important requirement is that a suitable volume of labelled training data is available to train the network before it can be used on unknown data.

Save

Save

Save

Save

Save

Save

Save

Save

Further Reading