itoa() c implementation int min underflow

Question

I'm running some test cases against my itoa() function but keep getting

did not allocate memory for the int min value

I'm doing the check but it's something I'm missing here, what is it?

char *ft_itoa(int x) {
    char *s;
    size_t len;
    long int n;

    n = x;
    if (x == -2147483648)
        return (ft_strdup("-2147483648"));

    len = ft_intlen(n) + 1;
    if (!(s = (char*)malloc(sizeof(char) * len)))
        return (NULL);

    if (n == 0)
        s[0] = '0';

    if (n < 0) {
        s[0] = '-';
        n = -n;
    }
    s[len - 1] = '\0';
    while (n) {
        len--;
        s[len - 1] = (n % 10) + '0';
        n /= 10;
    }
    return (s);
}

that is the full code, strdup just allocates the string, ft_intlen just returns length of string, both pass the test cases — franklinexpress, Oct 08 '16 at 07:52
This `... s[len - 1] = '\0'; while (n) { len--; s[len - 1] ...` looks fishy. Too many minuses. — alk, Oct 08 '16 at 08:05
because i did len + 1. size of int is 4 bytes, is that what you mean? — franklinexpress, Oct 08 '16 at 08:06
Also you want to decide whether you use C or C++. The title states C, casting `malloc()` indicates C++, the question is tagged both ... — alk, Oct 08 '16 at 08:08
`ft_intlen` is important, you should show it, but if you don't want to show it then that's that. You also need to show the test cases where this fails. — Barmak Shemirani, Oct 08 '16 at 08:14
You will not get a better answer than mine. You refuse to post a fully working example that actually demonstrates your problem and your +150 bounty doesn't change that fact. How do you want anyone to help you if you don't show your problem? — Bodo Thiesen, Oct 10 '16 at 21:57
Note that `-2147483648` may exceed the range of type `int` because `2147483648` exceeds the range of 32-bit signed integers. Use `-2147483647-1` to compute -2**31 as a 32-bit value. — chqrlie, Oct 11 '16 at 00:44
@chqrlie Your comment is true, yet not an issue with this code. There is no problem with `if (x == -2147483648)`. `-2147483648` and `-2147483647-1` have the same value, but likely different types. Except `-2147483647-1` does have a problem on rare 32-bit non 2's complement platforms. — chux - Reinstate Monica, Oct 13 '16 at 18:21

score 10 · Accepted Answer · answered Oct 10 '16 at 13:45

This line:

if (x == -2147483648)

does not do what you think it does. C does not have negative integer constants. This is an unsigned int constant with the value 2^31, that you apply the unary minus operator on. This means that the expression x == -21... will depend on the C standard your compiler uses.

If you use C99 or C11, you'll be fine. There is a signed type that is big enough - long long is guaranteed to be big enough for this number, so both x and -21... will be converted into long long and then compared. But if you're using a C89 compiler and your machine doesn't have a long enough type, you're hitting implementation-defined behavior here:

When an integer is demoted to a signed integer with smaller size, or an unsigned integer is converted to its corresponding signed integer, if the value cannot be represented the result is implementation-defined.

This is why people are saying to use limits.h. Not because they are being pedantic, but because this is dangerous territory. If you look closely at what limits.h contains, you'll most likely find a line like this:

#define INT_MIN (- INT_MAX - 1)

This expression actually has the correct type and value.

Other than that I can't see any errors in the code you posted. If this is not the problem either ft_intlen or ft_strdup are wrong. Or you're calling your function in testing wrong (the same problems apply to -21... when calling tests).

`if (x == -2147483648)` is fine aside from `INT_MIN` may have a different value. With C99/C11 `2147483648`, `-2147483648` can well fit as a `long long`. With C89 (and 32-bit long), `2147483648` is `unsigned long` "The type of an integer constant is the first of the corresponding list in which its value can be represented. Unsuffixed decimal: int, long int, unsigned long int;" C89 3.1.3.2 Integer constants. The cite does not apply as the `-` and `==` are done with `unsigned long` math. "If either operand has type unsigned long int, the other operand is converted to unsigned long int." 3.2.1.5 — chux - Reinstate Monica, Oct 13 '16 at 16:42
@chux The usual arithmetic conversions don't actually say explicitly how unary operators should behave, but I verified in gcc (funnily enough `_Generic` works in gcc in c86 mode) that the conversion is to signed int. Even if the equality is done with unsigned long, it just means that we're converting a value that can't be represented into an unsigned long and that's once again implementation defined. — Art, Oct 14 '16 at 06:06
Converting an `int` to any unsigned type is well defined, not implementation defined. "Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type." C11 6.3 Conversions So `if (x == -2147483648)` becomes `if (((unsigned long)x) == 2147483648UL)` on 32-bit long C89 machines. — chux - Reinstate Monica, Oct 14 '16 at 14:15

score 2 · Answer 2 · edited Oct 08 '16 at 09:25

Status: RESOLVED INVALID

Reason: WORKS_FOR_ME

Anyways, I improved on some points.

sizeof(char) is always 1, no need for it.
don't cast malloc
if you handle special case 0, then just handle it in one go.
-2147483648 is very very bad. That's what INT_MIN is for.
return is not a function, don't return (value), just return value.
don't s[len - 1] all the time, better decrements len prior to entering the loop. Or, since you need len + 1 only in the malloc call, just have len as intlen returns it and call malloc using len + 1

ft_itoa.c

#include <stdbool.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <btstr.h>

int ft_intlen(int n) {
        char buffer[8192];
        return snprintf(buffer, sizeof buffer, "%i", n);
}

char * ft_itoa(int n) {
        char * s;
        size_t l, len;
        bool fix_int_min = false;

        if (!n) {
                return mstrcpy("0");
        }

        if (-INT_MAX != INT_MIN && n == INT_MIN) {
                ++n;
                fix_int_min = true;
        }

        len = ft_intlen(n);
        if (!(s = malloc(len + 1))) {
                return NULL;
        }
        if (n < 0) {
                s[0] = '-';
                n = -n;
        }
        s[l = len] = '\0';
        while (n) {
                s[--len] = (n % 10) + '0';
                n /= 10;
        }

        if (fix_int_min) {
                --l;
                while (s[l] == '9') {
                        s[l++] = 0;
                }
                if (s[l] == '-') {
                        // realloc +1 and write "-1[0....0]\0"
                } else {
                        ++s[l];
                }
        }

        return s;
}

main.c

#include <limits.h>
#include <stdio.h>

char * ft_itoa(int n);

void check(int n) {
        printf("%i = %s\n", n, ft_itoa(n));
}

int main() {
        check(0);
        check(-1);
        check(1);
        check(23);
        check(42);
        check(4711);
        check(1000);
        check(INT_MAX);
        check(1+INT_MIN);
        check(INT_MIN);
}

Result

$ gcc -W -Wall -Wextra -lBtLinuxLibrary ft_itoa.c main.c -o ft_itoa && ./ft_itoa
0 = 0
-1 = -1
1 = 1
23 = 23
42 = 42
4711 = 4711
1000 = 1000
2147483647 = 2147483647
-2147483647 = -2147483647
-2147483648 = -2147483648

I'm trying to use it without limits.h, and having parenthesis around return is convention at some places. It's perfectly acceptable. — franklinexpress, Oct 08 '16 at 12:21
In (c) you ((can) write (as)) many (parenthesises (as (you (want)))). True. Question is: Is this useful? The thing about limits.h: It's there for a reason. Trying to do it without is not really an acceptable answer. Because, if you're trying to do it without, why not just malloc(12) but ft_intlen() instead? (Please don't tell, because you only need 8 bytes, today, each malloc returns n*16 bytes with n being an integer number. So, you're getting 16 bytes anyways. — Bodo Thiesen, Oct 08 '16 at 12:27
I'm just saying the parenthesis thing shouldn't even be an argument, it's irrelevant — franklinexpress, Oct 08 '16 at 12:44
For me, it's an improvement. If you disagree, that's fine, just ignore that point and keep writing parenthesises around your return statements. It's not an error. — Bodo Thiesen, Oct 08 '16 at 12:47
It's just that I have to, the tests check conventions too lol so I've gotten used to it but don't necessarily love it. I agree with you. — franklinexpress, Oct 08 '16 at 12:57
I guess, my first comment answering franklinexpresses comment offended him. But there is no way to know for sure, (a) who and (b) why (s)he downvoted. — Bodo Thiesen, Oct 10 '16 at 22:00

Yakov Galka · Answer 3 · 2016-10-14T18:25:00.083

2

You don't need that check. Instead convert it to unsigned, that will fit the absolute value :

size_t ft_uintlen(unsigned n)
{
    size_t len = 0;
    do {
        ++len;
        n /= 10;
    } while(n);
    return len;
}

char *ft_itoa(int x)
{
    char    *s;
    size_t  len;
    unsigned n;
    int negative;

    negative = x < 0;
    n = negative ? 0-(unsigned)x : (unsigned)x;
    len = ft_uintlen(n) + negative + 1;
    if (!(s = (char*)malloc(len)))
        return (NULL);

    s[--len] = '\0';
    if (negative)
        s[0] = '-';
    do {
        s[--len] = (n % 10) + '0';
        n /= 10;
    } while(n);
    return (s);
}

Note that this uses a new size_t ft_uintlen(unsigned) function that works on unsigned arguments.

edited Oct 14 '16 at 18:25

answered Oct 10 '16 at 22:02

Yakov Galka

61,035
13
128
192

I'm afraid `n = negative ? -x : x;` invokes undefined behavior if `x == INT_MIN` because of arithmetic overflow. – chqrlie Oct 11 '16 at 00:37
@chqrlie: sorry, forgot to cast to unsigned. Fixed. – Yakov Galka Oct 11 '16 at 07:21
`unsigned` commonly has about 2x range as `int`. Per C spec `INT_MAX == UINT_MAX` is a possibility. It that rare situation, converting to `unsigned` does not form correct results with `INT_MIN`. – chux - Reinstate Monica Oct 13 '16 at 17:55
Simplification: With this code `if (n == 0) s[0] = '0'; else` is not needed. – chux - Reinstate Monica Oct 13 '16 at 17:56
@chux: Thanks! I like such simplifications! Regarding `INT_MAX == UINT_MAX`: I guess it can, but as long as there aren't any such architectures in use today [I consider it a theoretical spec m10n](http://stackoverflow.com/questions/6971886/exotic-architectures-the-standards-committees-care-about). – Yakov Galka Oct 14 '16 at 18:31
The concern about specs that address the theoretical is that C compilers take advantage, such as UB on `int` overflow. That stems from the 3 integer codings formats (2,s, 1,s sign-mag) - even though 2's complement is ubiquitous today. So if the spec allows something, code should take care. – chux - Reinstate Monica Oct 14 '16 at 18:36

score 0 · Answer 4 · edited May 23 '17 at 10:32

Probably problem in your overflow prevention mechanism. You try to assign x of type int to n with type long int. But specification doesn't guarantee that type long int can handle value range large then int. More info can be found "Long Vs. Int".

Use long long int type for n if your compiler supports it. Update your ft_intlen function to int ft_intlen(long long int n). In this case you will be able to handle the whole int type value range and remove the following lines:

if (x == -2147483648)
  return (ft_strdup("-2147483648"));

Also error message did not allocate memory for the int min value is not one of the system error numbers. You need to add more logging into your application, especially if it's not possible to debug it for some reason. Check errno for each system function call, e.g.:

char* errmsg;
// Other code skipped here 
if (!(s = (char*)malloc(sizeof(char) * len)))
{
  errmsg = strerror(errno);          // Use strerror_s if possible 
  printf("Malloc error: %s\n", errmsg);
  return (NULL);
}

Note: specification doesn't guarantee that type `long long int` can handle value range large then `int` either. — chux - Reinstate Monica, Oct 13 '16 at 18:03

chux - Reinstate Monica · Answer 5 · 2016-10-13T22:30:11.543

Potential code failures, in order of suspicion:

ft_strdup() as that code is called with "int min value" and error occurs.
Prototypes lacking for various functions. Especially ft_strdup()/strdup().
Calling/test code is faulty.
"int min value" is larger than -2147483648. (Better to use INT_MIN.)
ft_intlen(n) is coded incorrectly and returns INT_MAX, then code tries malloc(INT_MIN).
int/long both 64-bit. This messes the first s[len - 1] = (n % 10) + '0'; with INT_MIN.

Otherwise, if INT_MIN has the value -2147483648, ft_itoa(int x) is fine.

OP asserts "... strdup just allocates the string, ft_intlen just returns length of string, both pass the test cases – franklinexpress Oct 8 at 7:52"

Passing test cases does not mean it worked without invoking undefined behavior. Best to post ft_intlen(), ft_strdup() and test harness for review.

Candidate portable implementation. No dependency on int/long size or 2's complement. No need for <limits.h> aside from CHAR_BIT which code could assume is 8 without sacrificing too much potability. Works with C89/99/11.

// Buffer size needed to decimal print any `int`
// '-' + Ceiling(value bit size * log10(2)) + \0
#define INT_STR_SIZE (1 + ((CHAR_BIT*sizeof(int) - 1)/3 + 1) + 1)

char *ft_itoa(int x) {
  char buf[INT_STR_SIZE];
  char *s = buf + sizeof buf - 1;  // Set to end of buffer
  *s = '\0';

  int n = x; // no need for wider types like long

  if (n > 0) {
    // fold positive numbers to negative ones
    // This avoids the special code for `INT_MIN` and need for wider types
    n = -n;
  }

  // Using a do loop avoids special code for `x==0`
  do {
    // Use `div()` rather than / % in case we are using C89.
    // / %  has implementation defined results for negative arguments.
    div_t qr = div(n, 10);
    *--s = (char) ('0' - qr.rem);  // Form digit from negative .rem
    n = qr.quot;
  } while (n);

  if (x < 0) {
    *--s = '-';
  }

  // Double check ft_strdup() is coded correctly
  // Insure calling code frees the buffer when done.
  return ft_strdup(s); 
}

score 0 · Answer 6 · edited May 23 '17 at 12:10

The piece of code you gave compile and works on OsX, but with my own ft_stdup and ft_intlen. So you can either show us the code or check them for errors. I made some tests (including 2147483647, -2147483648). It works nicely.

Anyway, the lines:

if (x == -2147483648) return (ft_strdup("-2147483648"));

Are useless as long as you copy your x value into a long long variable (Art) before doing any kind of operation it. So you don't need including types.h (notorious moulinette will not give you -42).

It happens that on OsX it works also on long values, but this is non portable safe.

FloppySoftware · Answer 7 · 2016-10-17T10:09:49.540

0

Just use:

INT_MIN

instead of:

-2147483648

in your test:

if (x == INT_MIN)
    return (ft_strdup("-2147483648"));

The reason for this is that some compilers could have problems understanding that number.

The standard C library limits.h usually define it as:

#define INT_MIN  (-INT_MAX - 1)

to avoid this problem.

edited Oct 17 '16 at 10:09

answered Oct 17 '16 at 09:10

FloppySoftware

111
4

itoa() c implementation int min underflow

7 Answers7

Linked