-1

The following code crashes on a 64-bit system. If the file name is shorter than 3 characters, `len - 3` underflows. The program does not show any segmentation fault on a 32-bit system, but I get a segmentation fault on a 64-bit system. Why is this program not showing any segmentation fault on a 32-bit system?

 DIR *dirp = opendir(dirPath);
 struct dirent *dp;
 while (dirp)
 {
   if ((dp = readdir(dirp)) != NULL)
   {
     unsigned int len = strlen(dp->d_name);
     // underflow happens if the file name is shorter than 3 characters
     if (dp->d_name[len - 3] == 'j')
     {
       /* ... */
     }
   }
   else
   {
     break; // no more directory entries
   }
 }
GS - Apologise to Monica
Deepak
  • Because `dp->d_name[len - 3]` is inside your process on 32 bit (by chance) and not inside your process on 64 bit. Why do you need to do this very dangerous check? – Ilya Jun 24 '14 at 11:12
  • @Ilya **inside your process on 32 bit (by chance) and not inside your process on 64 bit**... could you please elaborate on this statement? – instance Jun 24 '14 at 11:14
  • just wondering, did you include `` ? – Montaldo Jun 24 '14 at 11:17
  • So what do you suppose reading `dp->d_name[len - 3]` does for you when `dp->d_name` holds, say, `"."`? Hmm... It's UB. Your question seems to be asking for something *definitive* about something that *isn't*. – WhozCraig Jun 24 '14 at 11:20
  • I regularly run the program that contains the above code on 32-bit CentOS, and until now I never got a segmentation fault. Today I installed 64-bit CentOS, and now I am getting a segmentation fault. – Deepak Jun 24 '14 at 11:39
  • @user61455 Do you understand that your code is broken? Isn't it easiest just to fix it? It has always been broken. It is broken everywhere. You've just been really unlucky until now and never seen a seg fault. Now you got lucky and have discovered the error. – David Heffernan Jun 24 '14 at 12:00
  • Regarding this line: `if((dp->d_name[len - 3] == 'j'))`. The program is referring to an offset from `d_name`. If the value of `len` is less than 3, the offset would go negative, but `len` is unsigned, so a VERY LARGE offset is the result. When this VERY LARGE offset results in an address outside your process map, the result is a seg fault. – user3629249 Jun 25 '14 at 16:02

3 Answers

5

Your program results in undefined behaviour, as you appear to be aware. You are attempting to access memory outside the bounds of the array. Undefined behaviour is just what it sounds like: the behaviour is not defined, so anything could happen.

You might get a segmentation fault one time you run, and not another time. Or you might see different behaviour under different compilers. Undefined behaviour is by its very nature unpredictable. The fact that you seemed to get away with this error in your code under one compiler does not make your code correct.

Obviously what you should do is to avoid writing programs that result in undefined behaviour.

David Heffernan
4

Why is this program not showing any segmentation fault on a 32-bit system?

Look, this is a slightly simplified version of your program:

int main(int argc, char *argv[])
{
  char name[100];
  unsigned int len = 3;
  name[len-argc] = 1;  /* undefined behaviour: the unsigned index wraps when argc > 3 */
  return 0;
}

So when I build it as a 32-bit program (gcc -m32 -g main.c -o main32), this is how the address space of the process looks under gdb:

$ gdb -q --args ./main32 1 2 3
Reading symbols from /home/main32...done.
(gdb) start

(gdb) info proc mappings
process 28330
Mapped address spaces:

        Start Addr   End Addr       Size     Offset objfile
          0x110000   0x111000     0x1000        0x0 [vdso]
          0x3fa000   0x418000    0x1e000        0x0 /lib/ld-2.12.so
          0x418000   0x419000     0x1000    0x1d000 /lib/ld-2.12.so
          0x419000   0x41a000     0x1000    0x1e000 /lib/ld-2.12.so
          0x41c000   0x5a8000   0x18c000        0x0 /lib/libc-2.12.so
          0x5a8000   0x5aa000     0x2000   0x18c000 /lib/libc-2.12.so
          0x5aa000   0x5ab000     0x1000   0x18e000 /lib/libc-2.12.so
          0x5ab000   0x5ae000     0x3000        0x0
         0x8048000  0x8049000     0x1000        0x0 /home/main32
         0x8049000  0x804a000     0x1000        0x0 /home/main32
        0xf7fdf000 0xf7fe0000     0x1000        0x0
        0xf7ffd000 0xf7ffe000     0x1000        0x0
        0xfffe9000 0xffffe000    0x15000        0x0 [stack]
(gdb) p/x &(name[len-argc])
$2 = 0xffffcfab

As you can see, name[3-4] (the underflow you mention) actually points to a valid address on the stack: 0xffffcfab lies inside the [stack] mapping 0xfffe9000-0xffffe000. This is why your process does not crash.

When I build the same program as 64-bit (gcc -m64 -g main.c -o main64), the resulting address is not valid:

(gdb) info proc mappings
process 29253
Mapped address spaces:

          Start Addr           End Addr       Size     Offset objfile
            0x400000           0x401000     0x1000        0x0 /home/main64
            0x600000           0x601000     0x1000        0x0 /home/main64
        0x3c40a00000       0x3c40a20000    0x20000        0x0 /lib64/ld-2.12.so
        0x3c40c1f000       0x3c40c20000     0x1000    0x1f000 /lib64/ld-2.12.so
        0x3c40c20000       0x3c40c21000     0x1000    0x20000 /lib64/ld-2.12.so
        0x3c40c21000       0x3c40c22000     0x1000        0x0
        0x3c41200000       0x3c41389000   0x189000        0x0 /lib64/libc-2.12.so
        0x3c41389000       0x3c41588000   0x1ff000   0x189000 /lib64/libc-2.12.so
        0x3c41588000       0x3c4158c000     0x4000   0x188000 /lib64/libc-2.12.so
        0x3c4158c000       0x3c4158d000     0x1000   0x18c000 /lib64/libc-2.12.so
        0x3c4158d000       0x3c41592000     0x5000        0x0
      0x7ffff7fdd000     0x7ffff7fe0000     0x3000        0x0
      0x7ffff7ffd000     0x7ffff7ffe000     0x1000        0x0
      0x7ffff7ffe000     0x7ffff7fff000     0x1000        0x0 [vdso]
      0x7ffffffea000     0x7ffffffff000    0x15000        0x0 [stack]
  0xffffffffff600000 0xffffffffff601000     0x1000        0x0 [vsyscall]
(gdb) p/x &name[len-argc]
$5 = 0x8000ffffde3f

One more thing. This is what the assembly looks like for the 64-bit application:

(gdb) disassemble /m
Dump of assembler code for function main:

5         name[len-argc] = 1;
   0x0000000000400472 <+22>:    mov    -0x74(%rbp),%edx
   0x0000000000400475 <+25>:    mov    -0x4(%rbp),%eax
   0x0000000000400478 <+28>:    sub    %edx,%eax
   0x000000000040047a <+30>:    mov    %eax,%eax
=> 0x000000000040047c <+32>:    movb   $0x1,-0x70(%rbp,%rax,1)

This is $eax:

(gdb) p $eax
$1 = -1

But the store addresses memory through rax, since you are in 64-bit mode, and mov %eax,%eax zero-extends eax into rax. This is the value of $rax:

(gdb) p/x $rax
$3 = 0xffffffff

So the program adds a huge positive offset (0xffffffff) to a valid stack address, which results in an invalid address.

I would like to underline that this is undefined behavior in both 32-bit and 64-bit modes. If you want to fix this undefined behavior, you can read my other answer: https://stackoverflow.com/a/24287919/184968.

Community
  • You ought to mention that the code is broken everywhere because of the UB – David Heffernan Jun 24 '14 at 13:58
  • Sure, it is UB in both cases. But as far as I understand, the question was about why the first UB does not crash while the second UB causes a crash. –  Jun 24 '14 at 14:14
  • I'm not sure that the asker really understands that the program is always broken. I think asker believes that it is fine under 32 bit. – David Heffernan Jun 24 '14 at 14:26
  • Yes, I know that the program is always broken. My question was why the first UB does not crash while the second UB causes a crash. – Deepak Jun 25 '14 at 07:57
1

In `dp->d_name[len - 3] == 'j'`, the element at index `len - 3` might lie within your segment on this 32-bit machine and just outside your segment on the 64-bit machine. It has to do with your operating system.

Montaldo