22

Why is unsigned short * unsigned short converted to int in C++11?

The int is too small to handle max values as demonstrated by this line of code.

cout << USHRT_MAX * USHRT_MAX << endl;

overflows on MinGW 4.9.2

-131071

because (source)

USHRT_MAX = 65535 (2^16-1) or greater*

INT_MAX = 32767 (2^15-1) or greater*

and (2^16-1)*(2^16-1) = ~2^32.
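The arithmetic can be checked without triggering the overflow by doing the multiplication in a wider unsigned type. This is only a sketch, assuming a 16-bit unsigned short and a 32-bit two's-complement int (as on the compilers listed here); `exact_square` and `as_wrapped_int32` are hypothetical helper names, not anything from the standard library.

```cpp
#include <cstdint>

// Exact product, computed in 64 bits so nothing overflows.
constexpr std::uint64_t exact_square(std::uint64_t v) { return v * v; }

// Reinterprets the low 32 bits of v as a 32-bit two's-complement value.
// This models what the overflowing int multiplication tends to print in
// practice; the standard itself promises nothing for that case.
constexpr std::int64_t as_wrapped_int32(std::uint64_t v) {
    return (v & 0xFFFFFFFFull) >= 0x80000000ull
        ? static_cast<std::int64_t>(v & 0xFFFFFFFFull) - 0x100000000ll
        : static_cast<std::int64_t>(v & 0xFFFFFFFFull);
}

static_assert(exact_square(65535) == 4294836225ull,
              "65535 * 65535 is just below 2^32");
static_assert(as_wrapped_int32(exact_square(65535)) == -131071,
              "same value as the one printed by MinGW");
```

The exact product 4294836225 is larger than a 32-bit INT_MAX (2147483647), which is why it cannot be represented in int on such a platform.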


Should I expect any problems with this solution?

unsigned u = static_cast<unsigned>(t*t);

This program

unsigned short t;
cout<<typeid(t).name()<<endl;
cout<<typeid(t*t).name()<<endl;

gives output

t
i

on

gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC)
gcc version 4.8.2 (GCC)
MinGW 4.9.2

with both

g++ p.cpp
g++ -std=c++11 p.cpp

which proves that t*t is converted to int on these compilers.
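The same check can be done at compile time instead of reading mangled names from typeid. A minimal sketch, assuming int can hold every unsigned short value (true for 32-bit int and 16-bit short); `product_type` is a hypothetical alias name.

```cpp
#include <climits>
#include <type_traits>

// Type of the product of two unsigned short values, determined by the compiler.
using product_type = decltype(static_cast<unsigned short>(0) *
                              static_cast<unsigned short>(0));

// Assumption: USHRT_MAX <= INT_MAX. If that does not hold on some platform,
// the first condition makes the assertion pass without claiming anything.
static_assert(USHRT_MAX > INT_MAX || std::is_same<product_type, int>::value,
              "unsigned short * unsigned short has type int here");
```

If the assertion compiles, the product has type int on that compiler, independent of how typeid(...).name() happens to spell it.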


Useful resources:

Signed to unsigned conversion in C - is it always safe?

Signed & unsigned integer multiplication

https://bytes.com/topic/c-sharp/answers/223883-multiplication-types-smaller-than-int-yields-int

http://www.cplusplus.com/reference/climits

http://en.cppreference.com/w/cpp/language/types


Edit: I have demonstrated the problem on the following image.

[image: demonstration of the problem]

Slazer
  • If `int` is 16 bits on your platform, then the result you get is not an `int`. Do note the disclaimer for the values in the table you link to: "the actual value depends on the particular system and library implementation, but shall reflect the limits of these types in the target platform." – Some programmer dude Nov 16 '15 at 08:39
  • Are you sure that `USHRT_MAX` is of type `unsigned short`? In my environment (GCC 4.8 64 bit under Lubuntu), `USHRT_MAX` is actually of type `int` (defined as `(32767 * 2 + 1)`). No wonder then if `USHRT_MAX*USHRT_MAX` overflows. – Paolo M Nov 16 '15 at 08:41
  • There was a typo and I corrected it. Has it changed anything? – Slazer Nov 16 '15 at 08:42
  • confirmed for clang 3.5 on Mac OSX – Walter Nov 16 '15 at 08:43
  • Same here with MSVS2015/MSVC14. – Simon Kraemer Nov 16 '15 at 08:44
  • `USHRT_MAX` is `0xFFFF`, and `0xFFFF * 0xFFFF = 0xFFFE0001` (no overflow). This is equal to `4294836225` or `-131071`, so it's just the final conversion to `int` which throws it off. – Barmak Shemirani Nov 16 '15 at 09:25
  • @BarmakShemirani `INT_MAX` is `0x7FFFFFFF`, so that is in fact overflow. – M.M Nov 16 '15 at 10:02
  • See [Why must a short be converted to an int before arithmetic operations in C and C++?](http://stackoverflow.com/q/24371868/1708801) – Shafik Yaghmour Nov 16 '15 at 10:21

6 Answers

13

You may want to read about implicit conversions, especially the section about numeric promotions where it says

Prvalues of small integral types (such as char) may be converted to prvalues of larger integral types (such as int). In particular, arithmetic operators do not accept types smaller than int as arguments

What the above says is that if you use something smaller than int (like unsigned short) in an expression that involves arithmetic operators (which of course includes multiplication), then the values are promoted to int (or to unsigned int on a platform where int cannot represent every value of the smaller type).
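The effect of promotion is easy to see with values as well as types. A small sketch, assuming an 8-bit char; `sum_as_int` and `sum_as_uchar` are hypothetical names used only for illustration.

```cpp
// Both operands are promoted to int, so the sum is computed in int
// and does not wrap at 255.
constexpr int sum_as_int(unsigned char a, unsigned char b) { return a + b; }

// Converting the int result back to unsigned char reduces it modulo 256
// (assuming CHAR_BIT == 8). This conversion is well defined.
constexpr unsigned char sum_as_uchar(unsigned char a, unsigned char b) {
    return static_cast<unsigned char>(a + b);
}

static_assert(sum_as_int(200, 100) == 300, "arithmetic happened in int");
static_assert(sum_as_uchar(200, 100) == 44, "300 modulo 256");
```

The addition itself never happens in unsigned char; only the final conversion brings the value back into that range.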

Some programmer dude
  • Isn't this a design flaw? Especially as overflow of unsigned types is defined behaviour while overflow of signed types isn't? I would understand that `char` and `short` become promoted to signed `int` but would have expected that `unsigned char` and `unsigned short` would be promoted to `unsigned int` to allow defined overflow... Or am I wrong here and assigning the signed `int` result of an `unsigned short` arithmetic operation can be safely cast to `unsigned short` without provoking UB? – Simon Kraemer Nov 16 '15 at 09:00
  • @SimonKraemer maybe so, but it's far too late to do anything about it now. It came about from the very first days of C when any integer smaller than an `int` was stored in an int-sized register. (Probably this behaviour came from one of C's precursors) – M.M Nov 16 '15 at 09:52
  • @M.M I guess you are right. I just opened up another question to specifically analyze this behaviour: http://stackoverflow.com/questions/33732489/does-multiplying-unsigned-short-cause-undefined-behaviour – Simon Kraemer Nov 16 '15 at 09:56
10

It's the usual arithmetic conversions in action.

Commonly called argument promotion, although the standard uses that term in a more restricted way (the eternal conflict between reasonable descriptive terms and standardese).

C++11 §5/9:

Many binary operators that expect operands of arithmetic or enumeration type cause conversions and yield result types in a similar way. The purpose is to yield a common type, which is also the type of the result. This pattern is called the usual arithmetic conversions […]

The paragraph goes on to describe the details, which amount to conversions up a ladder of more general types, until all arguments can be represented. The lowest rung on this ladder is integral promotion of both operands of a binary operation, so at least that is performed (but the conversion can start at a higher rung). And integral promotion starts with this:

C++11 §4.5/1:

A prvalue of an integer type other than bool, char16_t, char32_t, or wchar_t whose integer conversion rank (4.13) is less than the rank of int can be converted to a prvalue of type int if int can represent all the values of the source type; otherwise, the source prvalue can be converted to a prvalue of type unsigned int

Crucially, this is about types, not arithmetic expressions. In your case the arguments of the multiplication operator * are converted to int. Then the multiplication is performed as an int multiplication, yielding an int result.
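Since the rule is purely about types, it can be probed with a type trait. The following is a sketch only: `promoted_t` and `int_holds_ushort` are hypothetical names, and unary plus is used because it applies the integral promotions to its operand.

```cpp
#include <limits>
#include <type_traits>
#include <utility>

// Hypothetical helper: the type that T is promoted to, taken from unary plus.
template <typename T>
using promoted_t = decltype(+std::declval<T>());

// Whether int can represent every value of unsigned short on this platform.
constexpr bool int_holds_ushort =
    std::numeric_limits<unsigned short>::max() <= std::numeric_limits<int>::max();

// unsigned short goes to int if int can hold all its values, else unsigned int.
static_assert(
    std::is_same<promoted_t<unsigned short>,
                 std::conditional<int_holds_ushort, int, unsigned int>::type>::value,
    "matches the wording of C++11 4.5/1");

// Types whose rank is not below int are left alone.
static_assert(std::is_same<promoted_t<signed char>, int>::value, "signed char -> int");
static_assert(std::is_same<promoted_t<unsigned int>, unsigned int>::value, "unchanged");
static_assert(std::is_same<promoted_t<long>, long>::value, "unchanged");
```

Note that the value of the operand plays no role: the choice between int and unsigned int depends only on the ranges of the types.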

Cheers and hth. - Alf
  • I think OP is safe here, because if int cannot represent short int completely then it is (or at least can be converted, however you interpret that) converted to unsigned int, as is written in your second standard quote. – this Nov 16 '15 at 19:49
  • @this: Well, the `int` multiplication can overflow, which is formally Undefined Behavior. And a compiler may take "advantage" of that. Essentially, the compiler's programmers can reason that it can always be assumed that UB doesn't occur (for if it does occur then any effect is valid behavior), and then rather perplexing behaviors that in some obscure cases will shave off a nanosecond or two, can result from optimizations under such assumption. – Cheers and hth. - Alf Nov 16 '15 at 20:27
6

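As pointed out by Paolo M in comments, USHRT_MAX has type int (this is specified by section 5.2.4.2.1/1 of the C standard: these macros have the type that an object of the corresponding type would have after integer promotion, so all of them have a type at least as big as int).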

So USHRT_MAX * USHRT_MAX is already an int x int, no promotions occur.

This invokes signed integer overflow on your system, causing undefined behaviour.
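The type of the macro can be verified directly, and the product can be computed safely by widening one operand first. A minimal sketch, assuming int is wider than unsigned short; `ushrt_max_squared` is a hypothetical name.

```cpp
#include <climits>
#include <type_traits>

// Assumption: USHRT_MAX <= INT_MAX. On such a platform the macro itself
// already has type int, so no promotion is involved in USHRT_MAX * USHRT_MAX.
static_assert(USHRT_MAX > INT_MAX ||
              std::is_same<decltype(USHRT_MAX), int>::value,
              "USHRT_MAX has type int");

// Well-defined alternative: convert one operand to a wider unsigned type,
// so the other operand is converted too and the product cannot overflow.
constexpr unsigned long long ushrt_max_squared =
    static_cast<unsigned long long>(USHRT_MAX) * USHRT_MAX;

static_assert(USHRT_MAX != 65535 || ushrt_max_squared == 4294836225ull,
              "65535 * 65535 computed without overflow");
```

Printing `ushrt_max_squared` with cout gives the mathematically expected value instead of relying on what an overflowing int multiplication happens to produce.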


Regarding the proposed solution:

unsigned u = static_cast<unsigned>(t*t);

This does not help because t*t itself causes undefined behaviour due to signed integer overflow. As explained by the other answers, t is promoted to int before the multiplication occurs, for historical reasons.

Instead you could use:

auto u = static_cast<unsigned int>(t) * t;

which, after integer promotion, is an unsigned int multiplied by an int; and then, according to the rest of the usual arithmetic conversions, the int is converted to unsigned int, and a well-defined modular multiplication occurs.
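Wrapped in a function, the fix might look like the sketch below. `square` is a hypothetical helper name, and the expected value in the last assertion assumes a 32-bit unsigned int.

```cpp
#include <climits>
#include <type_traits>

// Hypothetical helper: squares an unsigned short using unsigned int arithmetic.
// The cast makes the left operand unsigned int; the right operand is promoted
// to int and then converted to unsigned int, so no signed overflow can occur.
constexpr unsigned int square(unsigned short t) {
    return static_cast<unsigned int>(t) * t;
}

// The multiplication is carried out in unsigned int.
static_assert(std::is_same<decltype(static_cast<unsigned int>(1) *
                                    static_cast<unsigned short>(1)),
                           unsigned int>::value,
              "unsigned int * unsigned short yields unsigned int");

// Assumption: 32-bit unsigned int. On other widths the check is skipped.
static_assert(UINT_MAX != 0xFFFFFFFF || square(65535) == 4294836225u,
              "largest 16-bit input still fits");
```

On a platform where unsigned int has the same range as unsigned short, the result wraps modulo UINT_MAX + 1, which is still well-defined behaviour.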

M.M
  • And what about other operators? Is this ok? `ulong i = x + static_cast<ulong>(y)*m_mapSize.getX()` where `x`, `y` and `getX()` are `unsigned short int`. How exactly is the type of `i` inferred? – Slazer Nov 16 '15 at 13:15
  • @Slazer the type of `i` is `ulong` because you said as much. The type of the result of any operator depends on its two operands. In your code the operands of `*` are `ulong` and `ushort`; according to the promotion rules the latter is promoted to `int` and then `ulong` giving a `ulong` result. Then the operands of `+` are `ushort` and `ulong`, so again the `ushort` is ultimately promoted to `ulong` for a result of `ulong`. – M.M Nov 16 '15 at 20:00
5

With integer promotion rules

The USHRT_MAX value is promoted to int. Then we do the multiplication of two ints (with possible overflow).

Jarod42
4

It seems that nobody has answered this part of the question yet:

Should I expect any problems with this solution?

u = static_cast<unsigned>(t*t);

Yes, there is a problem here: it first computes t*t and allows it to overflow, then it converts the result to unsigned. Integer overflow causes undefined behavior according to the C++ standard (even though it may always work fine in practice). The correct solution is:

u = static_cast<unsigned>(t)*t;

Note that the second t is promoted to unsigned before the multiplication because the first operand is unsigned.

Eugene
  • You should also note that while int is larger than short on most current platforms, it is not guaranteed to be larger by the C standard. – plugwash Nov 16 '15 at 19:04
  • @plugwash In C, it is most definitely guaranteed. ISO/IEC 9899:201x 6.2.5, paragraph 8. – this Nov 16 '15 at 19:29
  • That part of the standard could be clearer but I'm pretty sure it's intended to mean "same size or larger" rather than "strictly larger". If it didn't then every C compiler I've ever seen would be noncompliant. – plugwash Nov 16 '15 at 19:35
  • @plugwash Quote: *the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.* In case that isn't clear enough, the very next paragraph explains exactly what subrange is. – this Nov 16 '15 at 19:37
  • @plugwash I implore you to find a similar paragraph in the C++ Standard, since this question is about C++ and not C. Perhaps C++ has a different rule, but you did mention only C. (Perhaps you meant C++ in the first place, in the first comment?) – this Nov 16 '15 at 19:40
  • If your interpretation was correct it would require long long to be bigger than long, long to be bigger than int and int to be bigger than short. Can you name a single platform where that is the case? – plugwash Nov 16 '15 at 19:48
  • @plugwash You are incorrectly equating rank and ranges. Ranks are defined to be strictly larger or smaller, but ranges are not. A subrange can be equal to the range. The rank of long long is always larger than that of long (6.3.1.1, paragraph 1), but the ranges of those types can be the same. (You might want to use the @ symbol or I won't know you replied.) – this Nov 16 '15 at 19:54
  • Sorry if my original note was not clear, what I was trying to say is that `u = static_cast<unsigned>(t)*t;` (where t is unsigned short) is not guaranteed to be free of overflow on all platforms because on some platforms unsigned int and unsigned short have the same range. Do we agree on that? – plugwash Nov 16 '15 at 20:03
  • @plugwash No, it absolutely cannot overflow, because C and C++ Standards defined unsigned arithmetic to wrap-around. – this Nov 16 '15 at 20:23
3

As it has been pointed out by other answers, this happens due to integer promotion rules.

The simplest way to avoid the conversion from an unsigned type of smaller rank to a signed type of larger rank is to make sure the conversion is done to unsigned int and not int.

This is done by multiplying by the value 1 that is of type unsigned int. Due to 1 being a multiplicative identity, the result will remain unchanged:

unsigned short c = t * 1U * t;

First the operands t and 1U are evaluated. After integer promotion the left operand is a signed int, whose rank is equal to that of the unsigned right operand, so it gets converted to the type of the right operand. Then the operands are multiplied, and the remaining right operand t is converted in the same way before being multiplied with the unsigned result. The last rule in the passage from the Standard cited below is the one used for this conversion.

Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:

- If both operands have the same type, then no further conversion is needed.

- Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.

- Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
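The rules quoted above can be checked for the `t * 1U * t` expression at compile time. This is a sketch with a hypothetical helper name, `square_via_1u`; it returns unsigned int so that the full product is kept.

```cpp
#include <type_traits>

// t * 1U converts the promoted t to unsigned int; the second multiplication
// is then unsigned int times the promoted t, again performed in unsigned int.
constexpr unsigned int square_via_1u(unsigned short t) { return t * 1U * t; }

// The first multiplication already yields unsigned int.
static_assert(std::is_same<decltype(static_cast<unsigned short>(1) * 1U),
                           unsigned int>::value,
              "unsigned short * 1U has type unsigned int");

static_assert(square_via_1u(3) == 9u, "multiplying by 1U leaves the value unchanged");
```

The position of 1U matters: multiplication groups left to right, so `t * t * 1U` would still evaluate `t * t` in int first and only then convert the result.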

this