Theory of CRCs ============== The definitive online CRC documentation is by Ross Williams: http://www.ross.net/crc/crcpaper.html ...and I recommend you read that if you want to know what's going on. I wrote the notes here to get my own head around the subject, but if they help anyone else, good. * Fields A field is something that supports multiplication and addition. It always has at least two elements, 0 (the additive identity) and 1 (the multiplicative identity). Adding or multiplying any two elements of the field gives another element of the field (i.e. the field is "closed"). Addition and multiplication in any field are: * commutative - they work both ways around, i.e. a+b=b+a. * associative - you can move brackets around, i.e. a+(b+c)=(a+b)+c. There are other properties we don't care about right now. Note that addition and multiplication in some given field *don't* have to correspond to the familiar concepts of addition and multiplication. An example of a field would be the rational numbers (i.e. all numbers of the form A/B where A an any integer and B any positive integer). In this case addition and multiplication work like you expect. * Polynomials A polynomial over some field is a sum of finitely many terms where each term is a member of that field multiplied by a unique non-negative integral power of a variable. The degree of the polynomial is the highest power that occurs in it. e.g. 5X^7 + 2.5X^2 + 1 is a polynomial of degree 7 over the rational numbers. X is just a placeholder; we could evaluate the polynomial by substituting a value for X but we don't care about that here. * GF(2) GF(2) is the field with exactly two elements: 0 and 1. (If you are familiar fields then it is clear that) addition is equivalent to bitwise XOR; this is obviously convenient for computers. CRC theory involves polynomials over GF(2) such as: P(X) = X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X+1 Note that if we know the degree of a polynomial over GF(2), and its more than 0, we only have to care about storing the coefficients of power *less* than the degree. This will come in handy later. * Division Of Polynomials Polynomials can be divided to produce a quotient and a remainder (both polynomials). This isn't the same as evaluating them at some value and then dividing the results. We always divide a dividend by a divisor, to produce a quotient and a remainder (or "modulus") which are both polynomials. A simple example is X^4 + X^2 --------- = X^3 + X, with remainder 0 X Here is a less simple example: X^4 + X^2 + 1 ------------- = X^3 + X^2, with remainder 1 (in GF(2)) X + 1 You can compute this using the usual long division algorithm that is commonly applied to integers. This is actually easier in the polynomials of GF(2) than in either the integers or the polynomials of the rationals, because each "digit" (i.e. coefficient) is just a 0 or a 1, and the subtraction is an exclusive OR. (It's also more difficult in that you're doing arithmetic using unfamiliar operations.) Here is the working for the division above: X^3 + X^2 + 0 + 0 <-- quotient .---------------------------- X + 1 | X^4 + 0 + X^2 + 0 + 1 X^4 + X^3 --------- X^3 + X^2 X^3 + X^2 --------- 0 + 0 0 ------- 0 + 1 0 ------- 1 <-- remainder You can check that the answer is correct by multiplying the quotient by the divisor and adding the remainder: the answer should be equal to the dividend. Given that all the coefficients are 0 or 1, we can omit the variable and the addition signs and just write the coefficients: 1100 .------ 11 | 10101 11 -- 11 11 -- 00 00 -- 01 00 -- 1 * CRCs A byte can be converted into a polynomial over GF(2) of degree 7 or less. If this is B(X) and P(X) is some fixed polynomial of degree N then the CRC is the remainder of B(X) * X^N/P(X). In practice we are interested in the CRC of a stream of zero or more bytes, but this is just same idea for longer B. (In reality the order of bits in B varies between specifications for different CRCs, and sometimes there are some additional wrinkles; but we'll come to those.) In practice of course we don't do the computation in terms of polynomials, we leave bytes as bytes. A computation might look like this. Here P=1010, or X^3+X, and B=11010101 (or X^7+X^6+X^4+X^2+1). Note that muliplying by X^3 is equivalent to adding three zero bits on the right. 11101 000 (in reality we ignore the quotient) P .------------------ 1010 | 11010101 000 1010 | | | ---- | | | 111 1110| | | 1010| | | ----v | | 100 1001 | | 1010 | | ---- | | 011 0110| | 0000| | ----v | 110 1101 | 1010 | ---- | 111 111 0 | 101 0 | ----- | 100 10 00| 10 10| -----v 010 0 100 0 000 ----- 100 100 (the remainder) Doing this by hand is one thing, we want to formalize it so a computer can do it. * A Basic Algorithm for CRC Computation Let P be the binary representation of the divisor polynomial, of degree N, and so N+1 bits long. Let B be a message, with N extra zero bits added. R will be the remainder, N bits long (really it should be N+1 but it's always safe to ignore the last bit). Algorithm 1. 1) let R be 0. 2) if there are no more bits in B, the answer is R. Stop. 3) shift R left by one. The new bit 0 should be the next bit of the message. We also shift a bit out of the left of R. 4) If this bit is 0 go back to 2. 5) Otherwise it is 1. XOR R with the bottom N bits of P and go back to step 2. If we apply this algorithm to the above division then we get the folowing steps: R=000 B=11010101+000 P=1010 R<-001 B<-1010101+000 shifted 0 R<-011 B<-010101+000 shifted 0 R<-110 B<-10101+000 shifted 0 R<-101 B<-0101+000 shifted 1 R<-111 R<-110 B<-101+000 shifted 1 R<-100 R<-001 B<-01+000 shifted 1 R<-011 R<-110 B<-1+000 shifted 0 R<-101 B<-000 shifted 1 R<-111 R<-110 B<-00 shifted 1 R<-100 R<-000 B<-0 shifted 1 R<-010 R<-100 B<-empty shifted 0 stop with R = 100. Shifting R left corresponds to looking at the next term in the dividend, looking at the bit shifted out the top corresponds to comparing the divisor with the remainder, and the XOR to subtracting the divisor from the remainder. * Scaling Up Normally we would have a much larger P: say degree 32 (so 33 bits long - in practice we store this in a 32 bit word with the topmost bit implicit; we don't need it anyway.) Also it's inefficient and inconvenient to process every bit separately when we only ever handle streams of bytes; we want to process whole bytes at once. In other words we want to somehow encapsulate 8 iterations of the above algorithm into a single step. Imagine we've just shifted 8 bits into R (and so 8 bits out of the top). If the top bit of the byte we've shifted out (call it S) is set then we XOR in the 32 bits of P to the top 25 bits of R, and the bottom 7 bits of S. For the next bit we XOR with the the top 24 bits of R and the bottom 6 bits S; and so on. Note that we only care about XORing things into S for the purposes of determining the next few values of S - which in turn only affects what we XOR in to R; we'll shortly discard S and pick a new one. So if we can take a single value of S and use that to do all the XORs into R at once, we never need alter S. XOR is associative, so we can do this. How do we calculate the table? Given a byte S we're going to XOR R with some value, so if we set R to 0 then the resulting R will be the value to XOR R with for that S. So we iterate over each S and do the calculation "long-hand", and that gives us the lookup table. crc32tab.pl contains an example. If T is the table then our algorithm is then: Algorithm 2. 1. Let R=0. 2. If B is now empty, the answer is R. Stop. 3. Let S be the top 8 bits of R. 4. Shift R left by 8. 5. Replace the bottom 8 bits of R with the next byte of the message. 6. XOR R with T[S]. 7. Goto 2. * Another Optimization The extra N bits that we have to paste onto the end of the message are a nuisance. It turns out that there is an optimization that can get rid of them. First, consider a byte from the message as it is introduced into R (in step 5) and propagated up through it. When it comes out the top it will have been XOR'd with four different values, each taken from T. Second, observe that the four padding bytes at the end just mean that the last byte of the message makes it through R to the top. Thirdly, notice that the first four lookups into T are always for T[0], which is 0. So the first four steps don't do any useful work. If instead on the first step we could deduce what the correct value to XOR with the first byte of the message was, we could eliminate those first four steps. Of course we can: it's 0. We can do this for successive steps as well. The values that a byte is XOR'd with as it goes through R depend solely on the part of R "above" it; in other words if we introduce a 0 into R and only XOR in the message byte at the end, just before the lookup in T, rather than at the beginning, it doesn't actually change which entry of T we look up for the next XOR. (We are again relying on the associtivity of XOR.) The following algorithm implements this, with the same preconditions as before except that the padding bytes are no longer necessary. Algorithm 3. 1. Let R=0. 2. If B is now empty, the answer is R. Stop. 3. Let S be the top 8 bits of R. 4. XOR S with the next byte of the message. 5. Shift R left by 8 bits. 6. XOR R with T[S]. 7. Goto 2. * CRC32 CRC32 has a polynomial of: X^32+X^26+X^23+X^22+X^16+X^12+X^11+X^10+X^8+X^7+X^5+X^4+X^2+X+1 It also has the variation that the representation is backwards, i.e. the MSB or a byte or word contains the coefficient of X^0; the initial value of R is FFFFFFFF and the final result is XOR'd with FFFFFFFF. The binary representation of P comes out as EDB88320 (the X^32 coefficient being implicit). This means that "left" and "right" are reversed, as are "top" and "bottom".

#! /usr/bin/perl -w use strict; use integer; my $P = 0xEDB88320; # this is just algorithm 1 from crc32.c, applied to each possible byte, with # 32 bits of padding, which we handle implicitly - effectively we've bubbled # the byte $b through 32 shifts before we enter the $bit loop. for(my $b = 0; $b < 256; ++$b) { my $r = $b; # we want the highest coefficient first, this is the lowest bit number for(my $bit = 0; $bit < 8; ++$bit) { # bit 0 will be the bit that "pops out" of $r if($r & 1) { # multiplying through by X is a _right_ shift in this representation $r = ($r >> 1) & 0x7FFFFFFF; $r ^= $P; } else { $r = ($r >> 1) & 0x7FFFFFFF; } } if($b % 4 == 0) { print " "; } printf "0x%08X,", $r; if($b % 4 == 3) { print "\n"; } else { print " "; } }