Printing an unsigned char with %x or %u is not absolutely correct

January 15, 2015

How to correctly match the format string with the corresponding argument

Printing an unsigned char

A mistake in the C language

The C standard library function printf() takes a format string and subsequent arguments of various types. On an architecture where arguments are passed on the stack, the format string tells printf() how to interpret the blob of arguments it finds there. On an architecture where printf()’s arguments are passed in (possibly specialized) registers, the format string tells printf() in which registers to look for them.

For this reason, the arguments of printf() after the first one must match the conversion specifications in the format string. The C11 standard leaves no room for deviation:

7.21.6.1:9 […] If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

 A common C pattern used for printing bytes is the following:

#include <stdio.h>
…
unsigned char u = …;
printf("%02x", u);

 

The format %x is well known for expecting an unsigned int. The problem in the above code snippet is that printf() is not passed an unsigned int. The “default argument promotions” are applied to printf()’s arguments after the format string. On any architecture where int can represent all the values of unsigned char, which is nearly all of them, an unsigned char is promoted to int, not to unsigned int, so %x ends up paired with an int argument.
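The promotion can be observed without calling printf() at all. The sketch below relies on two facts: the unary + operator applies the integer promotions to its operand, and the default argument promotions apply those same integer promotions to integer arguments. The macro TYPE_NAME and both function names are hypothetical, introduced only for this illustration, and _Generic requires a C11 compiler.

```c
/* Hypothetical helper macro: names the type of an expression (C11 _Generic). */
#define TYPE_NAME(e) _Generic((e),          \
    unsigned char: "unsigned char",         \
    int:           "int",                   \
    unsigned int:  "unsigned int",          \
    default:       "other")

/* Type of an unsigned char object before any promotion takes place. */
const char *type_before_promotion(void)
{
    unsigned char u = 0xab;
    return TYPE_NAME(u);
}

/* Type of the same object after the integer promotions.
   Unary + promotes its operand, just as passing u to printf() would. */
const char *type_after_promotion(void)
{
    unsigned char u = 0xab;
    return TYPE_NAME(+u);
}
```

On a platform where int can represent every unsigned char value, type_after_promotion() is expected to name int. On a platform where it cannot, the promoted type would be unsigned int instead.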

This mistake is quite harmless in practice: int and unsigned int have the same size and alignment, every value that both types can represent (and a promoted unsigned char is such a value) has the same representation in each, and it seems unlikely that an ABI would pass int and unsigned int in different registers. However, if you like writing for the C standard instead of for the couple of platforms that happen to be popular at the moment, you had better use the line below instead:

printf("%02x", (unsigned int) u);

 

This time, the type of the second argument clearly matches the format string in the first argument.
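The same cast carries over to the usual case of dumping a whole buffer. The following is a minimal sketch, not a definitive implementation: the function name bytes_to_hex and its signature are hypothetical, and it assumes 8-bit bytes, so that each byte prints as exactly two hexadecimal digits.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes the len bytes of buf into out as lowercase
   hexadecimal. out must have room for 2 * len + 1 characters.
   Assumes CHAR_BIT == 8, so every byte fits in two hex digits. */
void bytes_to_hex(const unsigned char *buf, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        /* The cast makes the argument an unsigned int, which is
           the type that %x is specified to expect. */
        snprintf(out + 2 * i, 3, "%02x", (unsigned int) buf[i]);
    }
    out[2 * len] = '\0';
}
```

A caller would pass a destination of at least 2 * len + 1 characters, for example a char out[7] for a three-byte input.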
