Binary representations of interger values have a flaw in certain applications. The difference between the number 15 and 16 is 1, but in the binary representation I need to flip three bits instead of one. Because of this phenomenon binary representation isn't ideal for a couple of genetic algorithms as well as certain communication problems.

Gray coding is an alternative representation to binary that combats this characteristic. It has the characteristic that the levenstein distance between two consequtive binary numbers is always one. The gray coding pattern is also cyclic; ie. the levenstein distance between the lowest and the hightest number also becomes one.

Gray Coding Patterns

As always, it's better to show the pattern before investigating the math. Below I've drawn the connections between numbers that have a levenstein distance of one for both binary numbers as well as gray coded numbers.

Binary Levenstein 16 Nodes

Gray Levenstein 16 Nodes

You should instantly recognize the benefit for genetic algoritms. The gray coding is far less likely to end up in a local optimum. The cyclic aspect can also be welcome, but can only be applied if you have $$$2^n$$$ categories for some $$$n$$$. By increasing the amount of nodes the point becomes even more obvious.

Binary Levenstein 32 Nodes

Gray Levenstein 32 Nodes



Sampling a Tree

Visualising these binary patterns can also be done on a chart instead through a graph. Below are two trees. The xaxis represents the decimal number and the yaxis represents the index on the binary string. If there is a one, a blue square is drawn, otherwise it is left white.

Binary Tree Chart

Gray Tree Chart

Notice the spread of gray coding is more spread out over the entire interval.




Difference Chart

What would happend if we draw the difference between binary and gray digits?

This is where it gets curious but beautiful. The green rectanglues represent the same digit while the orange rectangles represent the difference. The resulting tree looks as if it binary, but it is slightly different. The leaves of the tree are a bit thicker, it looks similar to the binary chart but it is different.

One might wonder why we can see a recursing tree if we look at the difference of the two representations. The answer lies in recursion and it is faily easy to spot.

Translation Table

For extra clarity I've added the translation table between integer, binary and gray code.

int binary grey

The 2-bit case.

Notice that the first bits of both binary and gray representations are all the same. Only for $$$i=2,3$$$ do we see that the second bit is different. If we are interested in what is different, the first bits do not matter.

Translation Table

int binary grey
0 00 00
1 01 01
2 10 11
3 10 11

The 3-bit case.

Notice again that the first bits of both binary and gray representations are all the same. Suppose that we remove this first strip. We will then have a reduced table in which it seems like we have two tables. The green table is exactly the same as the previous table. The red table has a certain L shape in it's body.

Do you see the recursion?

Original Table

int binary grey
0 000 000
1 001 001
2 011 010
3 011 010
4 100 110
5 101 111
6 110 100
7 111 100

Stripped Table

int binary grey
0 000 000
1 001 001
2 010 011
3 011 010
4 100 110
5 101 111
6 110 101
7 111 100

Ending Remarks

If you want to know more about the applications of gray coding, I can recommend checking out the wikipedia page.