31

I have a simple function testing if two arrays are each others inverse. They are seemingly identical, except for a tmp variable. One works the other doesn't. I can't for the life of me figure out why the compiler would optimize this out - if it indeed is an optimization problem (my compiler is IAR Workbench v4.30.1). Here's my code:

// this works as expected
uint8 verifyInverseBuffer(uint8 *buf, uint8 *bufi, uint32 len)
{
  uint8 tmp;
  for (uint32 i = 0; i < len; i++)
  {
    tmp = ~bufi[i];
    if (buf[i] != tmp)
    {
      return 0;
    }
  }
  return 1;  
}

// this does NOT work as expected (I only removed the tmp!)
uint8 verifyInverseBuffer(uint8 *buf, uint8 *bufi, uint32 len)
{
  for (uint32 i = 0; i < len; i++)
  {
    if (buf[i] != (~bufi[i]))
    {
      return 0;
    }
  }
  return 1;  
}

The first version of the code works, the second does not. Can anyone figure out why? Or come with some tests to probe what is wrong?

chqrlie
  • 98,886
  • 10
  • 89
  • 149
SupAl
  • 481
  • 3
  • 10
  • 2
    What is the input? – Tony Sep 06 '19 at 13:35
  • Did you consider using debug print statements to evaluate the equality yourself? – Trickzter Sep 06 '19 at 13:36
  • @TonyTannous the input is two buffers with verified inverse bytes (4 bytes long). – SupAl Sep 06 '19 at 13:39
  • @SupAl I understand, what are the numbers? – Tony Sep 06 '19 at 13:39
  • [Godbolts compiler explorer](https://godbolt.org) could be useful here. It allows you to see the generated assembly code together with the C or C++ source code, and let you make direct comparisons with different compiler flags (like different optimizer levels). – Some programmer dude Sep 06 '19 at 13:40
  • @Trickzter It's running on an embedded system, so i need to set-up a channel all the way into the function. I suspect that the if statement is somehow skipped. – SupAl Sep 06 '19 at 13:41
  • 1
    buf = [0xC0, 0xA8, 0x03, 0x87] bufi = [0x3F, 0x57, 0xFC, 0x78] – SupAl Sep 06 '19 at 13:42
  • 1
    In case anyone stumbe upon this, the problem was verified to be implicit type promotion. Solve by either casting to specified type after bitwise operation or store in temp variable of the same type. – SupAl Sep 06 '19 at 14:38
  • 3
    @SupAl If an answer addressed your question, you should [accept it](https://stackoverflow.com/help/accepted-answer). – dbush Sep 06 '19 at 16:57
  • There are other tests that would work too. For example: `if (buf[i] != bufi[i] ^ 0xff)`, or `if (buf[i] ^ bufi[i] != 0xff)`, or `if (buf[i] ^ bufi[i] ^ 0xff)`. – Ian Abbott Sep 06 '19 at 17:03
  • 1
    The title of this question is poor: you have made an (incorrect) assumption about what is happening, then asked about that assumption - (an X-Y problem). You question should just be "Why does this not work?". You would also do well to explain how it does not work - the result, and the input that generated the result. – Clifford Sep 07 '19 at 10:35
  • Define "works". What happens? Exactly? – Lightness Races in Orbit Sep 07 '19 at 20:43

3 Answers3

43

What you see happening is a result of the rules of integer promotions. Anytime a variable smaller than an int is used in an expression the value is promoted to type int.

Suppose bufi[i] contains the value 255. The hex representation of this is 0xFF. This value is then operand of the ~ operator. So the value will first be promoted to int which (assuming it is 32 bit) will have the value 0x000000FF, and applying ~ to this gives you 0xFFFFFF00. You then compare this value with buf[i] which is of type uint8_t. The value 0xFFFFFF00 is outside of this range so the comparison will always be false.

If you assign the result of the ~ back to a variable of type uint8_t, the value 0xFFFFFF00 is converted to 0x00. It is this converted value that is then compared against buf[i].

So the behavior you see is not the result of an optimization but the rules of the language. Using a temp variable as you are is one way to address this issue. You could also cast the result to uint8:

if(buf[i] != (uint8)(~bufi[i]))

Or mask out all but the lowest order byte:

if(buf[i] != (~bufi[i] & 0xff))
dbush
  • 162,826
  • 18
  • 167
  • 209
22

The problem is integer promotion. The ~ operator is very dangerous!

In case of ~bufi[i], the operand of ~ gets promoted according to the integer promotions. Making the code equivalent to ~(int)bufi[i].

So in the second case buf[i] != (~bufi[i]) you get something like 0xXX != 0xFFFFFFFFYY, where "XX" and "YY" are the actual values you wish to compare and 0xFFFF is unintended crap placed there by taking the bitwise complement of an int. This will always evaluate to true so the compiler might optimize away parts of the code, creating a very subtle bug.

In case of tmp = ~bufi[i]; you dodge this bug by truncating 0xFFFFFFFFYY into "YY", the value you are interested in.

See Implicit type promotion rules for details. Also consider adopting MISRA-C to dodge subtle bugs like this.

Lundin
  • 155,020
  • 33
  • 213
  • 341
  • MISRA-C unfortunately seems to have been written on the assumption that the Standard would adopt integer promotion rules that differ from what C89 ended up using. The range of expressions for which `expr > (u16a-u16b)` would be accepted, for example, would only really make sense if that expression was equivalent to `expr > (uint16_t)(u16a-u16b)`, which of course it isn't. Adding a tiny bit more complexity to the type system by having separate "wrapping unsigned algebraic ring" versus "unsigned numeric value" types could have made it possible to write code with genuinely-portable... – supercat Sep 07 '19 at 19:15
  • ...16-bit types (the difference between two 16-bit unsigned-algebraic-ring values would be another value of that same type, regardless of the size of `int`, while operations on a "16-bit unsigned numeric value" type would behave as though it was promoted to a signed type capable of handling the full range--again regardless of the size of `int`. Had such a feature been included, MISRA could have required its use, thus ensuring that MISRA-conforming code would behave consistently on all machines. No such thing has yet happened, however. – supercat Sep 07 '19 at 19:18
  • @supercat The very problem is that `expr > (u16a-u16b)` is not equivalent to `expr > (uint16_t)(u16a-u16b)` and it isn't obvious at all, because of C's dangerous and irrational integer promotion rules. MISRA-C only seeks to prevent bugs caused by the broken type promotion system, nothing else. In the best of worlds C would have no implicit promotions at all, instead trusting that programmers are competent enough to consider value ranges when they write code. And since there's no such thing as "undefined integer overflow" on any mainstream CPU, they could fix that part too. – Lundin Sep 08 '19 at 09:42
  • @Lundin: My point was that the rules of MISRA-C assume those expressions are equivalent, even though their behavior isn't. As for integer promotion rules, C was originally designed on the assumption that all integer operations will behave as though performed with a single "largest" integer type, and floating-point values will likewise behave as though performed with a single "largest" floating-point type. That's a great behavioral model if operations on the largest type are as fast as operations on smaller types, and the largest integer type is usable as a "bag of bits". – supercat Sep 09 '19 at 15:04
  • @Lundin: That behavioral model is also the safest default one for code which is dealing with values as numbers rather than bags of bits. The problem is that when C added `long` and was ported to platforms where those original assumptions ceased to apply, it didn't add distinct "number" and "bag of bits" types, but instead used ad hoc rules to make some things act as numbers and others as bags of bits. – supercat Sep 09 '19 at 15:08
5

As already noted by Lundin and dbush, the comparison in the second version always fails because the opposite of any uint8 value promoted to int is different from all uint8 values. In other words, the second version is equivalent to:

// this does NOT work as expected (I only removed the tmp!)
uint8 verifyInverseBuffer(uint8 *buf, uint8 *bufi, uint32 len) {
    if (len) return 0;
    return 1;
}

As can be seen on Godbolt's compiler explorer, both gcc and clang detect this and optimize the code out completely:

verifyInverseBuffer:
    test    edx, edx
    sete    al
    ret

gcc produces a rather cryptic warning, pointing a suspicious signed/unsigned comparison issue which is not the real problem... Close but no banana.

<source>: In function 'verifyInverseBuffer':
<source>:8:16: warning: comparison of promoted bitwise complement of an unsigned value with unsigned [-Wsign-compare]
    8 |     if (buf[i] != (~bufi[i]))
      |                ^~
Compiler returned: 0
chqrlie
  • 98,886
  • 10
  • 89
  • 149