What is wrong with this bit-manipulation code from an interview question?

Question

I was having a look over this page: http://www.devbistro.com/tech-interview-questions/Cplusplus.jsp, and didn't understand this question:

What’s potentially wrong with the following code?
    long value;
    //some stuff
    value &= 0xFFFF;
Note: Hint to the candidate about the base platform they’re developing for. If the person still doesn’t find anything wrong with the code, they are not experienced with C++.

Can someone elaborate on it?

Thanks!

There are potentially a lot of things wrong with this code. On the other hand, it might be just fine. Without context, it is impossible to provide a reasonable answer to this question. — James McNellis, Nov 23 '10 at 01:39
@James: Maybe on some implementations, maybe not. That's a terrible excuse in any case, though. — GManNickG, Nov 23 '10 at 01:47
**Type limits: http://stackoverflow.com/questions/271076/what-is-the-difference-between-an-int-and-a-long-in-c/271132#271132** For the inevitable train-wreck of types, sizes, ranges, and bits. — GManNickG, Nov 23 '10 at 01:56
Apart from using an uninitialized variable, there is a potential issue of promoting an int literal to a long literal, in which case you end up and-ing with 0xffffffff (i.e. doing nothing). — Axn, Nov 23 '10 at 02:00
I deleted my answer about sign extension because it's wrong. I suspect it's what the interviewer is looking for, but if the interviewer is looking for that, they're wrong too. — Omnifarious, Nov 23 '10 at 02:48
One can be "experienced in C++" and never have used the bitwise operators. C++ is such a big language, supporting so many programming paradigms, that (beyond "Hello World!" type things) I'm not sure it's possible to definitively determine whether someone has experience based on one question. — JohnMcG, Nov 23 '10 at 03:52
If there is unspecified stuff between the declaration and the and-with statement, I would assume the stuff initializes 'value'. Given that, I would consider the code as the most straightforward way to mask off all but the bottom 16 bits of 'value'. There are some interesting caveats with the signed/unsigned behavior of hex constants (e.g. the effects of comparing 0xFFFF to -2 or -200000 on a 16-bit machine) but this code does not encounter them. — supercat, Nov 23 '10 at 04:44
Axn: `0xffff` will *not* be promoted to `0xffffffff` in any case - `0xffff` is always the same as `65535`, which means that the constant has type `unsigned int` if `int` is only 16 bits. It will remain `0xffff` / `65535` when promoted to a `long`. — caf, Nov 23 '10 at 05:25
@caf: `0xffff` and `65535` are not exactly the same, however. If `int` has a width of 16 bits, the former is of type `unsigned int` while the latter is of type `long`. They do always have the same value, of course. — James McNellis, Nov 23 '10 at 05:45
@Josh: it's using bit-wise operation. I don't know about you, but that's always a red herring for me... — Matthieu M., Nov 23 '10 at 07:57

score 41 · Accepted Answer · edited Nov 23 '10 at 21:54

Several answers here state that if an int has a width of 16 bits, 0xFFFF is negative. This is not true. 0xFFFF is never negative.

A hexadecimal literal is represented by the first of the following types that is large enough to contain it: int, unsigned int, long, and unsigned long.

If int has a width of 16 bits, then 0xFFFF is larger than the maximum value representable by an int. Thus, 0xFFFF is of type unsigned int, which is guaranteed to be large enough to represent 0xFFFF.
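
The rule above can be checked at compile time. This is only a rough sketch, assuming a C++11 or later compiler (for `decltype` and `static_assert`); the constant names are made up for illustration and are not part of the original question.

```cpp
#include <climits>
#include <type_traits>

// True when int is wider than 16 bits, i.e. when 65535 fits in an int.
constexpr bool int_can_hold_literal = INT_MAX >= 65535L;

// The type the compiler actually picked for the literal.
constexpr bool literal_is_int =
    std::is_same<decltype(0xFFFF), int>::value;
constexpr bool literal_is_unsigned_int =
    std::is_same<decltype(0xFFFF), unsigned int>::value;

// First type in the list (int, unsigned int, ...) that can hold the value.
constexpr bool literal_type_follows_rule =
    int_can_hold_literal ? literal_is_int : literal_is_unsigned_int;

static_assert(literal_type_follows_rule,
              "0xFFFF should be int, or unsigned int when int is 16 bits");
static_assert(0xFFFF > 0, "0xFFFF is never negative");
static_assert(0xFFFF == 65535, "0xFFFF always has the value 65535");
```

On a typical desktop compiler with a 32-bit `int`, the first branch applies and the literal is a plain `int`.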

When the usual arithmetic conversions are performed for evaluation of the &, the unsigned int is converted to a long. The conversion of a 16-bit unsigned int to long is well-defined because every value representable by a 16-bit unsigned int is also representable by a 32-bit long.

There's no sign extension needed because the initial type is not signed, and the result of using 0xFFFF is the same as the result of using 0xFFFFL.
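
A small sketch of that equivalence, assuming C++11 for `constexpr`; the function names `mask_low16` and `mask_low16_long_literal` are hypothetical helpers, not anything from the question.

```cpp
// Mask with the plain literal: int or unsigned int, converted to long.
constexpr long mask_low16(long value) { return value & 0xFFFF; }

// Mask with an explicitly long literal.
constexpr long mask_low16_long_literal(long value) { return value & 0xFFFFL; }

// Both forms keep only the low 16 bits of a non-negative value.
static_assert(mask_low16(0x12345678L) == 0x5678L,
              "low 16 bits of 0x12345678 are 0x5678");
static_assert(mask_low16(0x12345678L) == mask_low16_long_literal(0x12345678L),
              "the L suffix makes no difference to the result");
static_assert(mask_low16(70000L) == 70000L % 65536L,
              "for non-negative values the mask equals value % 65536");
```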

Alternatively, if int is wider than 16 bits, then 0xFFFF is of type int. It is a signed, but positive, number. In this case both operands are signed, and long has the greater conversion rank, so the int is converted to long by the usual arithmetic conversions, again without changing its value.

As others have said, you should avoid performing bitwise operations on signed operands because the numeric result depends on how negative values are represented.
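
To make that dependence concrete, here is a rough sketch that models -1 as raw bit patterns of a hypothetical 32-bit long under the three sign representations that older C++ standards permitted. The constant and function names are invented for illustration, and it assumes `std::uint32_t` is available. Real hardware is almost always two's complement, and C++20 requires it.

```cpp
#include <cstdint>

// Bit patterns that encode -1 in a hypothetical 32-bit signed type.
constexpr std::uint32_t twos_complement_minus_one = 0xFFFFFFFFu;  // all bits set
constexpr std::uint32_t ones_complement_minus_one = 0xFFFFFFFEu;  // ~1
constexpr std::uint32_t sign_magnitude_minus_one  = 0x80000001u;  // sign bit + magnitude 1

// The same mask applied to the raw bits.
constexpr std::uint32_t low16(std::uint32_t bits) { return bits & 0xFFFFu; }

// Three different numeric results for the "same" value -1.
static_assert(low16(twos_complement_minus_one) == 65535u, "two's complement");
static_assert(low16(ones_complement_minus_one) == 65534u, "ones' complement");
static_assert(low16(sign_magnitude_minus_one)  == 1u,     "sign-magnitude");
```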

Aside from that, there's nothing particularly wrong with this code. I would argue that it's a style concern that value is not initialized when it is declared, but that's probably a nit-pick level comment and depends upon the contents of the //some stuff section that was omitted.

It's probably also preferable to use a fixed-width integer type (like uint32_t) instead of long for greater portability, but really that too depends on the code you are writing and what your basic assumptions are.
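
A minimal sketch of the fixed-width variant, assuming `<cstdint>` provides `std::uint32_t` (it is optional in the standard but present on mainstream platforms); `keep_low16` is a hypothetical helper name.

```cpp
#include <cstdint>

// With an unsigned, fixed-width operand the result does not depend on
// the width of long or on how negative numbers are represented.
constexpr std::uint32_t keep_low16(std::uint32_t value) {
    return value & 0xFFFFu;
}

static_assert(keep_low16(0x12345678u) == 0x5678u, "upper 16 bits cleared");
static_assert(keep_low16(0xFFFFFFFFu) == 0xFFFFu, "only the low 16 bits survive");
```

Whether this is better than `long` depends, as noted, on what the surrounding code assumes about the value.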

You're right. I bet an interviewer who asked this question would be shown up as wrong. :-) — Omnifarious, Nov 23 '10 at 03:00
@Omnifarious: That's likely. Either way, I think it's a terrible interview question. — James McNellis, Nov 23 '10 at 03:06
Note though that the numeric result of doing bitwise operations on signed operands only varies if negative numbers / the sign bit are involved. For example, in this case, if `value` is positive then the result is well-defined (the same as `value % 65536`), but if `value` is negative then there are three possible results. — caf, Nov 23 '10 at 05:32
Out of interest, what are the three possible results `caf` mentions? — T ., Nov 25 '10 at 16:39

score 3 · Answer 2 · answered Nov 23 '10 at 01:45

I think that, depending on the size of a long, the 0xffff literal (-1) could be promoted to a larger size and, being a signed value, it would be sign-extended, potentially becoming 0xffffffff (still -1).

answered Nov 23 '10 at 01:45

John Gordon

`0xFFFF` is never negative. I've posted an answer with details. – James McNellis Nov 23 '10 at 02:31

score 2 · Answer 3 · Joe · 2010-11-23T02:26:02.980

I'll assume it's because there's no predefined size for a long, other than it must be at least as big as the preceding size (int). Thus, depending on the size, you might either truncate value to a subset of bits (if long is more than 32 bits) or overflow (if it's less than 32 bits).

Yeah, longs (per the spec, and thanks for the reminder in the comments) must be able to hold at least -2147483647 to 2147483647 (LONG_MIN and LONG_MAX).
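
The minimum ranges mentioned here can be checked at compile time. A minimal sketch, assuming C++11 for `static_assert`; `long_can_hold_mask` is a name invented for this example.

```cpp
#include <climits>

// Minimum ranges guaranteed via the C standard's <limits.h> requirements.
static_assert(LONG_MAX >= 2147483647L, "long reaches at least 2147483647");
static_assert(LONG_MIN <= -2147483647L, "long reaches at least -2147483647");
static_assert(sizeof(long) * CHAR_BIT >= 32, "long occupies at least 32 bits");
static_assert(INT_MAX >= 32767, "int only needs to reach 32767");

// 65535 therefore always fits in a long, whatever the platform.
constexpr bool long_can_hold_mask = LONG_MAX >= 65535L;
static_assert(long_can_hold_mask, "0xFFFF is representable as a long");
```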

edited Nov 23 '10 at 02:26

answered Nov 23 '10 at 01:40


You mean if it's less/greater than 16 bits? – Charles Salvia Nov 23 '10 at 01:42
+1 - I didn't think about the fact that the size of long can change while looking at this question. – James Black Nov 23 '10 at 01:43
I thought it was just int that had an undefined size, long = 32bit, long long = 64bit. But ferenuff, one learns something every day :) – dutt Nov 23 '10 at 01:45
`long` is required to be at least 32bits – Šimon Tóth Nov 23 '10 at 01:47
-1 While your first sentence is true, the rest of your answer doesn't follow. – Omnifarious Nov 23 '10 at 01:47
-1 A `long` is guaranteed at least 32 bits, by reference to the C standard (where it follows from guaranteed value range). Hence there's nothing technically wrong with the statement. It masks all but the lower 16 bits of the value, that's all, and if that's the intention (and a value has been assigned) then the code is correct, and if that isn't the intention then the code is wrong. The interviewer doesn't know C++. Cheers & hth., – Cheers and hth. - Alf Nov 23 '10 at 01:48
@Alf P. Steinbach - Is an int also required to be at least 32 bits by the standard? – Omnifarious Nov 23 '10 at 01:49
@Omnifarious: no, `int` is just required to be at least 16 bits. – Cheers and hth. - Alf Nov 23 '10 at 01:50
The standard (C) doesn't mandate bit sizes but it _does_ mandate minimum ranges. So a long must be (at minimum) 32 bits wide since it must hold from -2147483647 through 2147483647. – paxdiablo Nov 23 '10 at 01:50
@Alf P. Steinbach - So, in truth the code may not actually truncate the lower 16 bits because the 0xffff might be sign extended, as I state in my answer. :-) – Omnifarious Nov 23 '10 at 01:52
@pax: You're right on the first sentence, wrong on the second. It only needs to support ±32767, or 16-bits. – GManNickG Nov 23 '10 at 01:56
@GMan That is true for `int`, `long` has to be at least 32 bits. – Šimon Tóth Nov 23 '10 at 01:59
@Let: Oh, I thought we were talking about `int` (and failed to even read the details.). Well that settles that then, my apologies. – GManNickG Nov 23 '10 at 02:01

score 1 · Answer 4 · answered Nov 23 '10 at 01:44

For one, value isn't initialized before the AND is performed, so I think the behaviour is undefined; value could be anything.

answered Nov 23 '10 at 01:44

dutt


It's not undefined -- the content could be garbage but it can't (For example) format your hard drive. – Billy ONeal Nov 23 '10 at 01:46
@Billy: formally it can do the nasal daemons thing. but in practice, on 32-bit systems an indeterminate value of type `int` is just *some* `int` value. on a 64-bit system, + on some very archaic architectures, there may be checking and a trap. cheers, – Cheers and hth. - Alf Nov 23 '10 at 02:03
@Alf: Yes, the actual **contents** of the variable are undefined. That does NOT mean that it is **undefined behavior**. If it was undefined behavior a conformant compiler could format your hard drive. Whereas in this case it's only allowed to put garbage into a single variable. – Billy ONeal Nov 23 '10 at 02:06
@Billy: It is indeed *undefined behavior* to read (perform an lvalue-to-rvalue conversion on) an uninitialized variable. It can very well reformat your hard-drive. See §4.1. – GManNickG Nov 23 '10 at 02:16

score 0 · Answer 5 · Julio Guerra · 2010-11-23T01:50:38.030

long type size is platform/compiler specific.

What you can say here is:

It is signed.
We can't know the result of value &= 0xFFFF; since it could be, for example, value &= 0x0000FFFF; and will not do what is expected.

edited Nov 23 '10 at 01:50

answered Nov 23 '10 at 01:45


Again, `long` doesn't imply 4 bytes. It only implies >= `sizeof(int)` – Charles Salvia Nov 23 '10 at 01:46
@Billy and @Charles: Julio didn't say four bytes. He implied that the size of `long` *can be* 32 bits (which might be 1, 2 or 4 bytes depending on `CHAR_BIT`). And minimum 32 bit size for `long` is guaranteed by the standard, so it can indeed be 32 bits. Cheers & hth. – Cheers and hth. - Alf Nov 23 '10 at 02:11

score 0 · Answer 6 · answered Nov 23 '10 at 05:52

While one could argue that since it's not a buffer-overflow or some other error that's likely to be exploitable, it's a style thing and not a bug, I'm 99% confident that the answer that the question-writer is looking for is that value is operated on before it's assigned to. The value is going to be arbitrary garbage, and that's unlikely to be what was meant, so it's "potentially wrong".

score -1 · Answer 7 · answered Jun 13 '14 at 19:20

Using MSVC I think that the statement would perform what was most likely intended - that is: clear all but the least significant 16 bits of value, but I have encountered other platforms which would interpret the literal 0xffff as equivalent to (short)-1, then sign extend to convert to long, in which case the statement "value &= 0xFFFF" would have no effect. "value &= 0x0FFFF" is more explicit and robust.
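
A rough sketch contrasting the two readings described above, assuming a C++11 or later compiler; all names are hypothetical. On a standard-conforming compiler the literal has the value 65535 and a leading zero changes neither its value nor its type, so the `(short)-1` reading would only apply to a non-conforming toolchain.

```cpp
#include <type_traits>

// What the (short)-1 reading would produce after widening to long.
constexpr long sign_extended = static_cast<long>(static_cast<short>(-1));

// What a conforming compiler uses for the literal.
constexpr long conforming_mask = 0xFFFF;

constexpr long apply_mask(long value, long mask) { return value & mask; }

static_assert(sign_extended == -1L, "(short)-1 widens to -1");
static_assert(conforming_mask == 65535L, "0xFFFF is 65535");

// With the conforming mask the upper bits are cleared.
static_assert(apply_mask(0x12345678L, conforming_mask) == 0x5678L, "masked");
// With -1 as the mask (on a two's complement machine) nothing changes.
static_assert(apply_mask(0x12345678L, sign_extended) == 0x12345678L, "no effect");

// The leading zero is purely cosmetic on a conforming compiler.
static_assert(0x0FFFF == 0xFFFF, "same value");
static_assert(std::is_same<decltype(0x0FFFF), decltype(0xFFFF)>::value, "same type");
```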
