Coding *

The Binary Symmetric Channel *

Redundancy Encoding *

Hamming codes *

Hamming Distance and Code capabilities *

Automatic Repeat Request (ARQ) *

Forward Error Correction(FEC) *

Code Efficiency *

Code Implementation *

Generator Polynomials *

Convolutional Codes *

Exercises *

The Binary Symmetric Channel


A fruitful way to look at the digital communications system depicted in Figure 1 of the Data Transmission notes is to view the modulator-channel-demodulator portion simply as a black box whose input and output are streams of binary digits. Because channel noise and bandwidth restrictions, errors are made in conveying these bits from the input to the output. In many situations, it is reasonable to assume that these errors are independent from one bit to the next. Further, it may be assumed that the errors are symmetric, i.e. The proability of either type of error is denoted Under these assumptions, the black box can be represented as the Binary Symmetric Channel(BSC) shown in Figure 1. As shown, the inputs and output are "zeros" and "ones" with error probability . The BSC will serve as a useful model for the action of the Modulator-Channel-Demodulator in the rest of these notes.

Redundancy Encoding

By adding extra bits to a data steam transmission errors can be corrected. There are two kinds of redundancy codes, block codes and convolutional codes. Block codes consist of a set of words, all of the same length. The pair (n,k) indicates, respectively, the length of each of the codewords and the number information bit it carries. A common example of a redundancy code the ubiquitous ASCII code which consists of 128 codewords, each with eight binary digits; accordingly, it is a (8,7) code. The codewords express upper and lower case English letters, the numbers 0 through 9, punctuation and control characters. An example of the last is an eight-bit character indicating the acknowledgement of a received message (ACK). Clearly there is redundancy since only seven bits are required to represent 128 characters. The eighth bit is a parity check bit which is such that the modulo two sum of all of the binary digits in the word is equal to 0. For example, the code word for the letter "e" is 10100110. For the EOT character we have 00100001. In both cases the last bit was set so that there were an even number of ones in the codeword. A single error in a codeword is indicated by the parity check over the codeword which would yield a binary one rather than a binary zero.

A fruitful way of looking at the parity check operation in the preceding paragraph is to examine the Hamming distance between codewords. The Hamming distance is defined as the number of places in which the codewords differ. For example, the codewords for "F" and "V" are 01100011 and 01101010, respectively; consequently, the Hamming distance is two. If there were no eighth parity bit, the distance would be one and it would not be possible to detect an error in transmission.

Hamming codes

In a nutshell error detecting and correcting codes have distance properties between codewords such that errors can be detected and/or corrected. A prime example of such a code is the (7,4) Hamming code shown in Table 1 there are 16 codewords, conveying four bits of information, with each codeword consisting of seven binary digits. The code is shown in systematic form with the first four bit containing the information and the last three being parity bits. Each of these parity bits is set so that there is zero parity over specified subsets of the seven binary digits. The equations for the parity check are



Because of the parity checks, the minimum Hamming distance between any air of codewords is three. For example codewords 14 and 15 are distance three from one another. Some pairs of codewords are at distance four, 10 and 15, for example. The further the distance is between the codewords of a code, the more capabilities the code possesses. For example, the (7,4) Hamming code can be used to correct a single error in transmission. The received seven-bit block is compared with the valid codewords and the transmitted codeword is taken to be the one, which is closest. Since the minimum distance is three and there is only one bit in error there can be only one closest. For example, suppose that 1000110 is transmitted and there is an error in the sixth bit resulting in 1000100 received. Since there is no valid codeword closer than 1000110, this is the one that is decoded.

The distance properties may be used to do error detection. Suppose, example, the channel were such that there could be two errors in transmitting a codeword in Table 1. These errors could put a received seven bit block distance one from an incorrect decoding. If this possibility is known the code may be used to detect two errors with no error correction.

The (7,4) Hamming code may be improved by adding another parity bit over all of the binary digits in the codeword, i.e., in addition to (12) we have


The resulting code is shown in Table 2. Notice that the minimum distance between pairs of codewords is now four. Accordingly, one error may be corrected and two errors detected. Before adding the extra parity bit one error can be detected or two errors detected.

Table 2-Augmented Hamming code


Hamming Distance and Code capabilities

As the following example demonstrates improving error performance by increasing the minimum distance between codewords has a cost in terms of the information carrying ability of codewords. Consider the following two codes 1) A (3,2) code with words 000,011,101,110. 2) A (3,1)code with codewords 000 and 111. These codes may be represented as vertices on a three dimensional cube as indicated in Figure 2. An error in transmission move the received sequence from a codeword to another vertex. For the first of these the minimum distance is two; consequently only single errors can be detected. For the second code a single error can be corrected.

Fig. 2 Simple codes

After the examples that we have just presented, it should come as no surprise that there is a general relation between the minimum distance between codewords and the error detecting and correction capabilities of a code. Let the minimum distance between codewords be denoted as m. The maximum number of errors that can be detected, denoted as D is given by


Similarly, C the maximum number of errors that can be corrected is given by


It is left as an exercise to prove (3a,b).

Automatic Repeat Request (ARQ)

As we see the redundancy in a code can be used either to detect or correct bit errors. The error detecting capability is used in Automatic Repeat Request(ARQ) systems depicted in Figure 2. When an error is detected at the receiver, a negative acknowledgement(NACK) is sent back to the transmitter and the message transmission is repeated. Alternatively, the transmitter waits for a positive acknowledgement(ACK) and if it is not received within a specified time interval, the message is retransmitted. If this goes on for a specified number of retransmissions, it is assumed that there is a failure of some kind and a higher level protocol takes over.

Forward Error Correction(FEC)

ARQ is widely used to give reliable transmission in system where roundtrip delay is small; however, in satellite systems, for example, it is not efficient. In such systems we use Forward Error Correction in which all of the parity check bits are used to correct errors. As the above examples demonstrated the higher percentage of a codeword that is devoted to parity check bits the better will the code’s error performance. We can quantify this by applying some basic ideas from probability theory. Suppose that the probability of a binary digit in a codeword being in error is given by . Suppose also that errors occur independently from bit to bit. The total number of errors in an n-bit codeword is binomially distributed


According to (4b), if there is sufficient minimum distance between codewords, at most C of these errors can be corrected. Now, the codeword will be received erroneously if there are more than C errors. The probability of this event is


When , which is a reasonable operating point, we have


Example: We compare the performance of three different length Hamming codes, (7,4), (15,11) and (31,26) and the (23,12) Golay code. The Hamming codes all have minimum distance three; accordingly, single errors can be corrected. The Golay code has minimum distance seven and can correct up to three errors. The rate at which information bits are transmitted is the number of information bits divided by the length of the codeword. The probability of decoding error is calculated by substituting into (5b).


Data Rate,k/n

Word error probability for

Word error probability for

(7,4) Hamming


(15,11) Hamming


(31,26) Hamming


(23,12) Golay


Table 3

Code Efficiency

The block codes that are used in digital communications systems have deep mathematical structure. The objective of this mathematics is to design codes which have the properties of being efficient and easy to implement. Beginning with the first property, we can view an (n,k) as 2k points on the vertices of an n dimensional hypercube. The objective is to place each of these points on the 2n vertices in such a way that the minimum distance between codewords is maximized while packing the space symmetrically. This point may be illustrated by counting the number of codewords in a code which can correct up to C errors. Because of this capability, we see each code point as being surrounded by a set of vertices which are no more than Hamming distance C away. These vertices "belong" to the codeword. For an n-bit code there are vertices which are exactly at distance i. The total number of vertices which "belong" to a codeword are . Since there are 2k codewords, the total number of vertices associates with one codeword or another is . This must be no more than 2n , the number of vertices in the whole space; accordingly, we have




for an (n,k) code to correct C errors. Equation (6a-b) is known as the sphere packing bound. Codes that satisfy this bound with equality are known as perfect codes. The Hamming and the Golay codes that we have seen above are examples of perfect codes. Codes that are not perfect have vertices that are wasted since they do not contribute to the code error performance.

Code Implementation

The second valuable code property is easy implementation, meaning encoding and decoding with high-speed digital circuitry. A class of codes with this property are the Cyclic codes. The codewords of a cyclic code can be segmented into subgroups such all member can be found by cyclic shift. For example, in the (7,4) Hamming code shown in Table 1, codewords 1,3,4,6,8,10 and 13 are all cyclic shifts of one another.

A codeword can be represented by a polynomial. For example, in the (7,4) Hamming code, the codewords 1 and 14 can be represented as x3+ x2+1 and x6+ x5+ x4+ x, respectively. The conventional operations of addition, multiplication can be carried out with these polynomials with modulo 2 arithmetic for the coefficients. These operations are illustrated in Figure 3. Notice that multiplication by xi has the effect of shifting a sequence. For example, multiplication by x3 shifts 1011 to 1011000. As we shall see presently, each of the operations is easily implementable in digital circuitry. For example, to shift is achieve by passing through a delay.

Each cyclic code has a unique generator polynomial of degree n-k. For example the generator polynomial for the (7,4) Hamming code is x3+ x2+1. This polynomial, which is designated as G(x), is used to generate the codewords of the code. Let D(x) and C(x) be, respectively, the polynomial representations of the k-bit sequence to be transmitted and the encoding of the sequence. The following procedure is used to generate D(x) from C(x)

We run through an example for the (7,4) Hamming code with the sequence to be transmitted 1110 giving the polynomial D(x)= x3+ x2+ x. After multiplication by x3 we have x6+ x5+ x4. The remainder after division by x3+ x2+1 is x. The polynomial representation of the codeword is then x6+ x5+ x4+ x which is just the sequence 1110010.

The property of codewords generated in this fashion is that they are evenly divisible by the generator polynomial. If we divide by the generator polynomial G(x), the result is a quotient, Q(x), and a remainder,R(x). We can write this as


By adding R(x) to both sides, we have the polynomial representation for the codeword.


Thus, if a received sequence has a transmission error it may not be a codeword-a fact that can be checked by simple division by G(x). For example, suppose that 1110010 is transmitted but there is an error in the fourth bit and the received sequence 1111010. Notice that, in terms of the polynomial representations, this error is represented by adding the error polynomial E(x)=x3. Now dividing x6+ x5+ x4+ x3 + x by G(x)=x3+ x2+1 gives the remainder x3.

The application of this mathematics lies in the fact that the arithmetic operations can be realized with digital circuitry. In Figure 4, the implementation division by the generator polyomial is shown. The divider is a linear sequential circuit which is capable of high speed operation.

Fig. 4 Divider Circuit


Generator Polynomials

The generator polynomials for codes determine their error correcting and detecting capabilities. In Figure 5 the generator polynomials for several theoretical codes and for codes that are part of standards are shown. The factors of the polynomials determine their error detecting capability of the corresponding code. For example, the factor x+1 in a generator polynomial insures that single errors will be detected. Because of this factor, we have . But a single error is represented by the polynomial E(x)=xi; i=0,1,…,n-1 ;consequently, . (See exercise 5.)

A second property is that all bursts of errors which have duration (n-k) or less can be detected. The error polynomial for such a burst is where B(x) represents the particular pattern of errors and i is its position within the block. The maximum degree of B(x) is n-k-1 and the degree of G(x) is n-k with a term 1. As a consequence, E(x) is not divisible by G(x).

Convolutional Codes

An alternative form of redundancy encoding is the convolutional code. Rather than encode blocks the entire transmitted sequence is continuously encoded. An example of a convolutional encoder is shown in Figure 6. In this case the redundancy is embodied in the fact that for every input bit there are four output bits. The convolution lies in the fact that each output bit is a function of two previous bits as well as the current input bit. In general, the convolutional encoder has k input and n output bits, with n>k. The term k/n convolutional encoder is used to indicate this redundancy. The amount of memory spanning over previous inputs also characterizes the encoder. During the encoding process, the bits in this memory is the state of the coder. Clearly, the output is a function of the memory and of the current input.

The encoder can be viewed as mapping the input sequence into a path through the states of the encoder. The encoding is efficient when possible output paths are distant from one another. The decoding process involves deciding which of the possible paths is most likely. The Viterbi algorithm is a computationally efficient technique for unraveling these paths.

The convolutional encoder is good for correcting randomly occurring errors. On the other hand, block codes can be designed to correct errors that occur in bursts. In order to take advantage of both capabilities, the two different techniques can be concatinated. This is shown in Figure 7 where the output of the convolutional encoder is fed into the block encoder.


[Gihawe95] R.D. Gitlin, J.F. Hayes, S.B. Weinstein, Data Communication Principles, Plenum, 1995

[Stall] W. Stallings, Data and Computer Communications, Macmillan, 1985


1) Suppose that a block code of length 16 has minimum distance four.

  1. What is the maximum number of bit errors that can be detected?
  2. What is the maximum number of bit errors that can be corrected?
  3. If the code is used for error correction only and the probability of a bit error is .002, what is the probability an error?

2) Prove equations (3a,b)

3) The (23,12) Golay code has minimum distance seven.

a) How many binary digits in a codeword?

b) How many bits of information does a codeword convey?

c) What is the maximum number errors that can be detected?

d) What is the maximum number errors that can be corrected?

e) If the code is required only to correct two errors, what is the maximum number of errors that can be detected?

4) For the Hammings (7,4), find all of the subgroups of codewords which are cyclic shifts of one another.

5) If the generator polynomial for a code has (x+1) as a factor, show that all patterns of odd numbered errors can be detected.