The Modbus CRC Endianness Kerfuffle

Modbus is a quaint protocol. It's one of my favourite protocols -- it's not very convenient to use, but it's pretty convenient to implement and remarkably flexible for an otherwise pretty opinionated protocol. Its specs are very self-contained and easy to follow.

That being said, like all protocols that are a) from an entirely different era of computing and b) royalty-free, there are a lot of non-conforming devices out there. When you run into one, you quickly start to doubt the specs, your documentation, your code and eventually your sanity. My favourite stumbling block? The endianness of the CRC value.

It's not so much that nobody gets it right -- in fact, it's the one thing that even non-conforming devices get right, because their developers end up swapping the bytes until they get the order right, otherwise the device can't talk to anything. It's just that a lot of people don't understand why they got it right.

First, the good news: virtually every useful Modbus development tool out there gets it right. Pymodbus gets it right. Modbus PLC Simulator gets it right. They all do. (If Google got you here and you were looking for one of these, I'll just refer you to Dale Scott's page, which is the page you were really looking for.)

If you're ever in doubt about whether you got it right or not, just look at one of these tools. They get it right.

Modbus and Endianness

Well-designed, or rather well-specced, protocols don't usually have a problem with endianness. The protocol specs say that values are encoded in either big-endian or little-endian format, and that's that.

Modbus is a little special though. For historical reasons that are so fascinating it would be insulting to briefly list them here instead of granting them a detailed post of their own, Modbus is a mixed bag:

  • 16-bit values are always big-endian...
  • ...except for the 16-bit CRC which is little-endian
  • The format of values longer than 16 bits is unspecified. Most manufacturers pick the sane option and output them in big-endian format, too, but some manufacturers output them in little-endian order because why not.
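The three rules above can be sketched as a handful of decoding helpers. This is only an illustration: the names get_u16_be, get_u16_le and get_u32 are made up for this post, and the little_endian flag is my assumption about how you might let the caller pick the order for a device whose documentation tells you which one it uses.

```c
#include <stdint.h>

/* 16-bit Modbus values (addresses, counts, registers): high byte first. */
uint16_t get_u16_be(const uint8_t *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* The CRC is the exception: low byte first. */
uint16_t get_u16_le(const uint8_t *p) {
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* 32-bit values: the order is not specified, so the caller has to say
   which one the device uses (hypothetical flag, not from the specs). */
uint32_t get_u32(const uint8_t *p, int little_endian) {
    if (little_endian) {
        return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
               ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    }
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}
```

Because the helpers assemble the value with shifts instead of casting pointers, they behave the same on little-endian and big-endian hosts.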

That CRC Function in the Appendix

The Modbus protocol specifications helpfully include an implementation of the CRC function (you want to look at Appendix B of the Modbus over Serial Line Specification and Implementation Guide). It's concise, battle-tested code -- so old and battle-tested that it uses K&R syntax.

The specs also include a note that "[the] function performs the swapping of the high/low CRC bytes internally. The bytes are already swapped in the CRC value that is returned from the function. Therefore the CRC value returned from the function can be directly placed into the message for transmission."

So you make a nice frame like 01 03 00 10 00 08, and you compute its CRC on your x86(_64) machine (with that function, or with a CRC calculator like this one) and it comes out as 0xC945.
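If you'd rather check that number yourself than trust a calculator, here is a minimal sketch of the bit-by-bit CRC-16 computation (initial value 0xFFFF, reflected polynomial 0xA001). It is not the table-driven function from the appendix, and the name modbus_crc16 is my own. It returns the plain arithmetic value, with no byte swapping of any kind.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 as used by Modbus RTU; returns the arithmetic CRC value. */
uint16_t modbus_crc16(const uint8_t *data, size_t len) {
    uint16_t crc = 0xFFFF;                  /* initial value */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                     /* fold the next byte into the low byte */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1)
                crc = (uint16_t)((crc >> 1) ^ 0xA001);
            else
                crc >>= 1;
        }
    }
    return crc;
}
```

Fed the six bytes 01 03 00 10 00 08, it returns 0xC945.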

At this point -- and I guarantee you that everyone who's ever had to think about it because their gizmo wasn't working has had the same thought, so don't feel bad -- it's tempting to think that, since the bytes are already swapped internally, that's how you put it out on the line: C9 45.

But the right order is, of course, 45 C9. The bytes are swapped internally, and if you give your "Modbus transmit" function a pointer to the 16-bit unsigned value that the function returns, it will put it out on the line just right. But that's because the underlying representation on a little-endian machine like x86 is already little-endian. It comes out "wrong" when you print it, but the underlying storage is correct. The value is stored in memory as 45 C9.

In trying to figure out if you got it right, there's a good chance that Google's SEO-driven madness gets you to this thread, which doesn't help at all. It's easier to see why this is the right order with this simple snippet that works almost like your favourite "take this buffer and put it out on the serial line" function:

/* Print the first two bytes of buf in the order they sit in memory. */
void output(char *buf) {
        printf("OUT: %02X %02X\n", buf[0] & 0xFF, buf[1] & 0xFF);
}

If you look at the CRC function, you'll notice that the result is obtained by fusing together two one-byte values. Those end up with the same value on any platform: uchCRCLo will be 0x45, uchCRCHi will be 0xC9. But the function then returns uchCRCHi << 8 | uchCRCLo, and that's where the fun begins.

If we pass a pointer to this value to the function above on a little-endian machine, the output will be:

OUT: 45 C9

In other words: uchCRCLo first, uchCRCHi second. Which is actually what we want when outputting the value in little-endian format.
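If you want to see for yourself what your machine keeps in memory for a 16-bit value, a small sketch like the one below will do. The function names are hypothetical, and memcpy is used only to look at the bytes in address order without any pointer-cast trickery.

```c
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 16-bit value into out, in address order. */
void memory_bytes(uint16_t v, uint8_t out[2]) {
    memcpy(out, &v, sizeof v);
}

/* Returns 1 when the host stores the low-order byte at the lowest address. */
int host_is_little_endian(void) {
    uint8_t b[2];
    memory_bytes(0x0001, b);
    return b[0] == 0x01;
}
```

Calling memory_bytes(0xC945, b) on a little-endian host leaves b[0] at 0x45 and b[1] at 0xC9, which is exactly the order the serial line wants.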

Does that mean that the function's output is broken on big-endian machines? Oh yes it does!

OUT: C9 45

The computation is correct: the CRC is indeed 0xC945 -- if you do the math, with the good ol' pen and paper, that's the number that you get. But when the function swaps the bytes on a big-endian platform, they end up swapped for good. If you want to output them over a serial line, you need to swap them again (or to modify the CRC function so that it stops swapping them).
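One way to sidestep the whole question, sketched here under the assumption that you build the frame in a byte buffer before sending it, is to never hand a pointer to the 16-bit value to your transmit function at all. Pick the two bytes apart with shifts and masks instead, which work on the value and not on its storage. The helper name append_crc is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: append a CRC to a frame, low-order byte first.
   buf must have room for at least len + 2 bytes. Returns the new length. */
size_t append_crc(uint8_t *buf, size_t len, uint16_t crc) {
    buf[len]     = (uint8_t)(crc & 0xFF);  /* low-order byte goes out first */
    buf[len + 1] = (uint8_t)(crc >> 8);    /* high-order byte is sent last */
    return len + 2;
}
```

With the frame from earlier and a CRC of 0xC945, the buffer ends in 45 C9 whether the host is little-endian or big-endian.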

Making Sense of the Semantics

The labeling of the two variables may seem misleading, but it actually makes perfect sense: they are, indeed, the high and low bytes of the CRC in ~~sane~~ big-endian format. Since the CRC is transmitted in little-endian order, you send the low-order byte first. In the docs' own words, "The CRC field is appended to the message as the last field in the message. When this is done, the low–order byte of the field is appended first, followed by the high–order byte. The CRC high–order byte is the last byte to be sent in the message."