242

I have recently started to learn C and I am taking a class with C as the subject. I'm currently playing around with loops and I'm running into some odd behaviour which I don't know how to explain.

#include <stdio.h>

int main()
{
  int array[10],i;

  for (i = 0; i <=10 ; i++)
  {
    array[i]=0; /*code should never terminate*/
    printf("test \n");

  }
  printf("%zu \n", sizeof(array)/sizeof(int));
  return 0;
}

On my laptop running Ubuntu 14.04, this code does not break. It runs to completion. On my school's computer running CentOS 6.6, it also runs fine. On Windows 8.1, the loop never terminates.

What's even more strange is that when I edit the condition of the for loop to: i <= 11, the code only terminates on my laptop running Ubuntu. It never terminates in CentOS and Windows.

Can anyone explain what's happening in the memory and why the different OSes running the same code give different outcomes?

EDIT: I know the for loop goes out of bounds. I'm doing it intentionally. I just can't figure out how the behaviour can be different across different OSes and computers.

Ziezi
JonCav
  • 149
    Since you are overrunning the array then undefined behaviour occurs. Undefined behaviour means anything can happen including it appearing to work. Thus "code should never terminate" is not a valid expectation. – kaylum Jun 24 '15 at 02:37
  • 38
    Exactly, welcome to C. Your array has 10 elements - numbered 0 to 9. – Yetti99 Jun 24 '15 at 02:38
  • 3
    What I find weird is that you start your loop at `0` and end it at `10`. Thus, it's looping 11 times: `0 1 2 3 4 5 6 7 8 9 10`; count them and see. – Iharob Al Asimi Jun 24 '15 at 02:39
  • 4
    I should add I am trying to break the code. I'm coming from Java. I am forcing an out of bounds behavior and I don't know what is happening exactly in memory. – JonCav Jun 24 '15 at 02:39
  • Depending on your compiler you can add an option for bounds checking. GCC has -fbounds-check – Yetti99 Jun 24 '15 at 02:41
  • 14
    @JonCav You did break the code. You are getting undefined behaviour which is broken code. – kaylum Jun 24 '15 at 02:41
  • 1
    You will probably find that the behavior changes -- on each platform -- if you turn optimization on. – zwol Jun 24 '15 at 02:42
  • 50
    Well, the whole point is that undefined behaviour is exactly that. You can't reliably test it and prove something defined will happen. What's probably going on in your Windows machine, is that the variable `i` is stored right after the end of `array`, and you are overwriting it with `array[10]=0;`. This might not be the case in an optimised build on the same platform, which may store `i` in a register and never refer to it in memory at all. – paddy Jun 24 '15 at 02:42
  • 1
    Well I guess I meant, why isn't it breaking in a predictable manner. – JonCav Jun 24 '15 at 03:09
  • 46
    Because non-predictability is a fundamental property of Undefined Behaviour. You need to understand this... Absolutely all bets are off. – paddy Jun 24 '15 at 03:22
  • 3
    Incidentally, you just learned by yourself what a buffer overrun is! – sleblanc Jun 24 '15 at 08:23
  • 1
    @Yetti99: In the current gcc 5.1 (and older) the option `-fbounds-check` is supported only for Java and Fortran frontends. See https://gcc.gnu.org/onlinedocs/gcc-5.1.0/gcc/Code-Gen-Options.html – pabouk Jun 24 '15 at 08:52
  • 7
    I would have changed "playing around with loops" to "seeing what happens when I buffer overrun" to avoid the instant answer of "don't do what you're doing!". But yes, there's an inherent assumption that "i" will end up next to your array. There is no reason to assume this. – deworde Jun 24 '15 at 11:08
  • [This image](http://image.slidesharecdn.com/seethroughc-140329093101-phpapp01/95/see-through-c-3-638.jpg?cb=1396085504) shows that `array[10]` MAY overwrite stuff next to it. Try and add variables before/after and see if they change (depending on OS/compiler) – WernerCD Jun 24 '15 at 12:23
  • I see this being referred to as "memory stomp". Is that the same as stack smashing? Because this looks like a classic case of stack smashing – somtingwong Jun 24 '15 at 19:08
  • 3
    Aaaugh! You've invoked the [nasal demons](http://www.catb.org/jargon/html/N/nasal-demons.html)! – Mark Jun 24 '15 at 21:52
  • 3
    Undefined behaviour is undefined. I've actually debugged a beginner's C code before that managed to *overwrite its own code*. By using the result of a function that finds the index of a char in string directly in strcpy (or something similar); of course, when the char wasn't found, it returned something like -1, and with a bit more of arithmetic, this found its way to the executing code. It was a lot of fun seeing this in the debugger - as you stepped through, suddenly it started behaving absolutely chaotically, ignoring `if`s and such... fun times. Legacy C code is full of this kind of stuff. – Luaan Jun 25 '15 at 07:19
  • 1
    @o11c: You write `C, where "it seems to work" doesn't mean anything anymore.`. **0)** Did it ever? **1)** "It seems to work" means nothing in _any_ programming language, be it PHP, C or Shakespeare. – Sebastian Mach Jun 25 '15 at 14:36
  • @JonCav: A good memory hook to read the condition is `for as long as`. – Sebastian Mach Jun 25 '15 at 14:37
  • 1
    Dennis Ritchie's true legacy is the buffer overflow. – Mason Wheeler Jun 25 '15 at 15:39
  • @phresnel it's a lot harder to get UB in memory-safe languages, so "it seems to work" is often valid there as long as your input doesn't change *too* much. – o11c Jun 25 '15 at 17:59
  • 1
    Wow, thanks for all the replies! I guess I'm used to Java where I can predictably break code (since there were exception checks and all that). I'm still getting used to C and undefined behavior. But this does help a lot! :) – JonCav Jun 25 '15 at 22:35
  • @phresnel: The behavior of things like `x << n` with negative x used to be pretty consistent on two's-complement machines. I know of no two's-complement compilers where x << n would have any effect other than to multiply x by 2ⁿ for any reason other than a compiler's deciding to use the "undefinedness" of left-shifting negative values as an excuse to assume that x must not be less than zero. So I'd say the fact that `x << n` "seemed to work" would have meant something but for some compiler writers' eagerness to break it. – supercat Jun 25 '15 at 23:46
  • 2
    @o11c: A lot of "undefined behavior" in C has nothing to do with memory safety. The reason things like overflow is undefined behavior is that the early authors of the C standard didn't see any semantic difference between saying an action would invoke Undefined Behavior, versus requiring implementations must specify possible consequences of the action, but could list unconstrained "Undefined Behavior" as being among them in some or all cases where the action was performed. In practice, there are many cases where it would be useful to be able to have *some* guarantees about overflow behavior... – supercat Jun 25 '15 at 23:52
  • 3
    @supercat sure there are other kinds of UB (though memory-safe languages tend to specify those things as well, so I was brief). That said, I would not say early authors were unaware of what UB really meant, rather they were *very* aware of how much platform needs might vary for efficiency's sake. – o11c Jun 25 '15 at 23:56
  • ...even if they were very loose, but hyper-modern compiler writers think there should be none. Many, if not most, programs, have two requirements: 1. When given valid input, produce valid output; 2. Don't spontaneously launch nuclear missiles, even when given invalid input. In C the only way to meet requirement #2 is to ensure that even invalid input can never cause overflow. Having even very loose specifications regarding overflow behavior would reduce the amount of overflow-checking many programs would have to do to meet requirement #2, and thus allow them to run faster. – supercat Jun 25 '15 at 23:57
  • 1
    @o11c: I think the authors of C standards in the 1970s and 1980s expected that many platforms would specify specific behaviors in situations where the standard did not require it, and that programmers whose code wouldn't need to target platforms that couldn't offer such guarantees should feel free to take advantage of them. There's no good reason why a programmer wanting to multiply one `uint16_t` by another to yield a `uint16_t` result should have to write `x = 1u*y*z;` rather than just `x = y*z;`, but if `y*z` exceeds 2147483647 some compilers may do very wacky things with the latter. – supercat Jun 26 '15 at 00:04
  • 1
    @o11c: `harder to get UB in memory-safe languages` Which languages have UB, except C & C++? Anyways, in PHP, "seems to work" is a much stupider argument than in C & C++, but it's memory safe at least. Likewise in Shell-Scripts: It's seriously hard to make a very complex shell script work correctly even only throughout the company. But it's memory safe. I once wrote a C# program that seemed to work. But utterly failed on some machine with 512MiB of RAM. On a daily basis, UB was relevant maybe 2 or 3 times in the last 10 years for me; most of the time, it was other language design decisions. – Sebastian Mach Jun 26 '15 at 14:16
  • 1
    @phresnel, "undefined behavior" is simply that -- behavior that's not defined either way by the standard. Asserting that all other languages are so well-defined as to have no unexplored corners in the standard... well, it doesn't pass a laugh test. (To pick a standard I know well, POSIX.2 is *full* of both explicitly and implicitly undefined behavior, both in the shell language definition -- which I raise here by virtue of familiarity -- and elsewhere). That said, there's no shortage of examples -- hell, until it had a formal language definition, Perl was nothing *but* undefined behavior. – Charles Duffy Jun 26 '15 at 15:32
  • @phresnel: `What languages have UB except C & C++?` Did you check... Java? http://programmers.stackexchange.com/questions/153843/undefined-behaviour-in-java or maybe C#? (second answer) http://stackoverflow.com/questions/1860615/code-with-undefined-behavior-in-c-sharp It's rare, but it's still there. – Mooing Duck Jun 27 '15 at 18:10
  • @CharlesDuffy: Yeah okay, that's right. What I really meant is that C++ and C intentionally and explicitly define undefined behaviour; the standard _defines_ UB, which is unlike other languages I know. UB is integral part of the C and C++ languages. – Sebastian Mach Jun 28 '15 at 19:40
  • @phresnel, I continue to disagree that this is abnormal. Look for the phrases "results are undefined" and "undefined results" in the POSIX sh specification, and you'll find it throughout. Even in Ada -- a rigidly-defined standard if ever there was -- numerous implementation details are "unimportant" per the standard, and thus implementation-defined. (These details have less runtime correctness impact than much of C's undefined behavior, but, well, that's what makes Ada more rigidly defined). – Charles Duffy Jun 29 '15 at 04:42
  • @CharlesDuffy: I see. – Sebastian Mach Jun 29 '15 at 06:32
  • 2
    Actually in Ubuntu 15.04: it gives a error as following `stack smashing detected : ./a.out terminated` It seems Ubuntu has looked into this memory stomping problem also now – sk1pro99 Jul 21 '15 at 16:48
  • Simple: just change `<=` to `<`. – Taylor Ramirez Jul 21 '15 at 17:33
  • Printing the value of `i` in the loop would have helped. – phoxis Jul 21 '15 at 18:50
  • You don't mention what compilers you are using on each machine, but if you were using the Microsoft compiler on Windows, then you might expect that to generate completely different object code to, say, the gcc compiler on Linux. That could be an additional factor as to why you get really different results. – Tim Long Jul 21 '15 at 21:35
  • We can see an interesting unintuitive case where undefined behavior turns a finite loop into an infinite one in: [Why does this loop produce “warning: iteration 3u invokes undefined behavior” and output more than 4 lines?](http://stackoverflow.com/q/24296571/1708801) – Shafik Yaghmour Aug 09 '15 at 01:54

14 Answers

358

On my laptop running Ubuntu 14.04, this code does not break it runs to completion. On my school's computer running CentOS 6.6, it also runs fine. On Windows 8.1, the loop never terminates.

What is more strange is when I edit the conditional of the for loop to: i <= 11, the code only terminates on my laptop running Ubuntu. CentOS and Windows never terminates.

You've just discovered memory stomping. You can read more about it here: What is a “memory stomp”?

When you allocate int array[10],i;, those variables go into memory (specifically, they're allocated on the stack, which is a block of memory associated with the function). array[] and i are probably adjacent to each other in memory. It seems that on Windows 8.1, i is located at array[10]. On CentOS, i is located at array[11]. And on Ubuntu, it's in neither spot (maybe it's at array[-1]?).

Try adding these debugging statements to your code. You should notice that on iteration 10 or 11, &array[i] is the same address as &i.

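#include <stdio.h>
#include <stdint.h>

int main(void)
{
  int array[10],i;

  /* %p expects a void *, so cast explicitly */
  printf ("array: %p, &i: %p\n", (void *)array, (void *)&i);
  /* compare the addresses as integers: subtracting pointers to
     different objects is itself undefined */
  printf ("i is offset %ld from array\n",
          (long)(((intptr_t)&i - (intptr_t)array) / (intptr_t)sizeof(int)));

  for (i = 0; i <=11 ; i++)
  {
    printf ("%d: Writing 0 to address %p\n", i, (void *)&array[i]);
    array[i]=0; /*code should never terminate*/
  }
  return 0;
}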
QuestionC
  • 6
    Hey thanks! That really explained quite a bit. In Windows it states that i is at offset 10 from array, while in both CentOS and Ubuntu, it is -1. What's weirder is if I comment your debugger code out, CentOS cannot run the code (it hangs), but with your debugging code it runs. C seems to be a very language so far X_x – JonCav Jun 24 '15 at 03:44
  • 1
    I'm glad I could help. The things I said about the stack are really just a mnemonic device; it helps to program C with a model of assembly programming in your head, but it's not necessarily true because of optimization. That is almost certainly what you are seeing with the variant behavior. – QuestionC Jun 24 '15 at 03:54
  • 12
    @JonCav "it hangs" can happen if writing to `array[10]` destroys the stack frame, for example. How can there be a difference between code with or without the debugging output? If the address of `i` is never needed, the compiler *may* optimize `i` away into a register, thus changing the memory layout on the stack ... – Hagen von Eitzen Jun 24 '15 at 07:28
  • Just to specify: stack is associated with a _process_ and its presence is not required and so variables don't necessary go there. – edmz Jun 24 '15 at 09:26
  • 2
    I don't think it's hanging, I think it's in an infinite loop because it's reloading the loop counter from memory (which just got zeroed by `array[10]=0`). If you compiled your code with optimization on, this probably wouldn't happen, because C has aliasing rules that limit what kinds of memory accesses must be assumed to potentially overlap other memory. As a local variable that you never take the address of, I think a compiler should be able to assume that nothing aliases it. Anyway, writing off the end of an array is undefined behaviour. Always try hard to avoid depending on that. – Peter Cordes Jun 24 '15 at 15:24
  • 2
    Anyway, one thing I don't think anyone said explicitly is that you have different behaviour on Linux vs. Windows probably because of using different compilers. gcc is probably keeping `i` in a register, and not reloading it from the stack location that `array[10]` references. – Peter Cordes Jun 24 '15 at 15:25
  • @JonCav C is simply a *low level* language for a certain definition of low level (e.g. you may consider a language where you can mess with the memory layout to alter the semantics a low level language). Try to write the "equivalent" code in higher level languages and you won't get this kind of behaviour (you'd probably get an exception instead...). – Bakuriu Jun 24 '15 at 18:44
  • @JonCav Could it be that the CentOS version is compiled for 64-bit? I could imagine it might force alignment for the `int i`, thus moving the value from `array + 10` to `array + 11` (occupying both +10 and +11, but +10 would be zero anyway, of course). – Luaan Jun 25 '15 at 07:14
  • @Luaan on a little endian system (which this probably is) +11 would be zero, not +10 – Steve Cox Jun 25 '15 at 19:57
  • 4
    Another alternative is that an optimizing compiler removes the array completely, as it has no observable effect (in the question’s original code). Hence, the resulting code could just print out that constant string eleven times, followed by printing the constant size and thus make the overflow entirely unnoticeable. – Holger Jun 26 '15 at 11:24
  • 9
    @JonCav I would say in general you *don't* need to know more about memory management and instead simply know not to write undefined code, specifically, don't write past the end of an array... – T. Kiley Jun 26 '15 at 11:24
  • 2
    @T.Kiley, yeah I know going beyond bounds can lead to unexpected behavior, I just like to have that understanding of what's happening in memory. – JonCav Jun 28 '15 at 23:05
  • 2
    @JonCav, ...but you *don't* know what's happening in memory until you also know the exact compiler release, the platform and architecture it's targeted to, and possibly the compile-time flags and pragmas. It's more important to know what you know and what you don't know, than to know some specific examples of what you don't know might be. :) – Charles Duffy Jun 29 '15 at 04:44
  • Could also link to Undefined behaviour, since, well, the behaviour is very much undefined. – Antti Haapala Jul 21 '15 at 16:20
  • The funny fact is that things might even run differently depending on whether you compile this code with debug enabled or not. Please try it out, e.g. -g0 -O0. It might also run differently in these modes on windows/*nix systems, as c/c++ relies heavily on compiler implementation. BTW, it's great that you took programming classes ;) Welcome to the dark side :D – Filip Zymek Jul 21 '15 at 16:52
98

The bug lies between these pieces of code:

int array[10],i;

for (i = 0; i <=10 ; i++)

array[i]=0;

Since array only has 10 elements, in the last iteration array[10] = 0; is a buffer overflow. Buffer overflows are UNDEFINED BEHAVIOR, which means they might format your hard drive or cause demons to fly out of your nose.

It is fairly common for all stack variables to be laid out adjacent to each other. If i is located where array[10] writes to, then the UB will reset i to 0, thus leading to the unterminated loop.
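That scenario can be modelled without invoking undefined behavior, as a rough sketch. Here one int buffer stands in for the stack frame: slots 0 to 9 play the role of array, and one further slot plays the role of i. The names simulate_loop, i_slot and max_steps are made up for this illustration, and the iteration cap exists only so the "infinite" case returns.

```c
#include <assert.h>
#include <stddef.h>

/* Models the stack frame as a single int buffer: slots 0..9 act as `array`,
   and slot `i_slot` (10 or 11) acts as the loop counter `i`.
   Returns the number of iterations executed, capped at max_steps. */
size_t simulate_loop(size_t i_slot, size_t max_steps)
{
    int mem[12] = {0};
    assert(i_slot == 10 || i_slot == 11);

    int *array = mem;        /* the "array" starts at slot 0 */
    int *i = &mem[i_slot];   /* the "counter" lives in the chosen slot */
    size_t steps = 0;

    for (*i = 0; *i <= 10 && steps < max_steps; ++*i) {
        array[*i] = 0;       /* always inside mem, so this is well defined */
        ++steps;
    }
    return steps;
}
```

With i_slot set to 10, the write to array[10] zeroes the counter on every pass, so the loop only stops at the cap. With i_slot set to 11, the counter is untouched and the loop ends after 11 iterations.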

To fix, change the loop condition to i < 10.

o11c
  • 6
    Nitpick: You can't actually format the hard drive on any sane OS on the market unless you're running as root (or the equivalent). – Kevin Jun 24 '15 at 17:55
  • 26
    @Kevin when you invoke UB, you give up any claim to sanity. – o11c Jun 24 '15 at 18:28
  • 7
    It doesn't matter whether your code is sane. The OS won't let you do that. – Kevin Jun 24 '15 at 18:31
  • 2
    @Kevin The example with formatting you hard drive originated long before that was the case. Even the unixes of the time (where C originated) were quite happy at allowing you to do things like that - and even today, a lot of the distros will happily allow you to start deleting everything with `rm -rf /` even when you're not root, not "formatting" the whole drive of course, but still destroying all your data. Ouch. – Luaan Jun 25 '15 at 07:23
  • 5
    @Kevin but undefined behavior can exploit a OS vulnerability and then elevate itself to install a new hard-disk driver and then start scrubbing the drive. – ratchet freak Jun 26 '15 at 10:52
  • 1
    @Kevin there used to be computers where you could change the CPU clock by writing to a particular memory address... on those ones you couldn't format the hard drive but you could start a fire – M.M Oct 11 '15 at 10:02
38

In what should be the last run of the loop, you write to array[10], but there are only 10 elements in the array, numbered 0 through 9. The C language specification says that this is “undefined behavior”. What this means in practice is that your program will attempt to write to the int-sized piece of memory that lies immediately after array in memory. What happens then depends on what does, in fact, lie there, and this depends not only on the operating system but more so on the compiler, on the compiler options (such as optimization settings), on the processor architecture, on the surrounding code, etc. It could even vary from execution to execution, e.g. due to address space randomization (probably not on this toy example, but it does happen in real life). Some possibilities include:

  • The location wasn't used. The loop terminates normally.
  • The location was used for something which happened to have the value 0. The loop terminates normally.
  • The location contained the function's return address. The loop terminates normally, but then the program crashes because it tries to jump to the address 0.
  • The location contains the variable i. The loop never terminates because i restarts at 0.
  • The location contains some other variable. The loop terminates normally, but then “interesting” things happen.
  • The location is an invalid memory address, e.g. because array is right at the end of a virtual memory page and the next page isn't mapped.
  • Demons fly out of your nose. Fortunately most computers lack the requisite hardware.

What you observed on Windows was that the compiler decided to place the variable i immediately after the array in memory, so array[10] = 0 ended up assigning to i. On Ubuntu and CentOS, the compiler didn't place i there. Almost all C implementations do group local variables in memory, on a memory stack, with one major exception: some local variables can be placed entirely in registers. Even if the variable is on the stack, the order of variables is determined by the compiler, and it may depend not only on the order in the source file but also on their types (to avoid wasting memory to alignment constraints that would leave holes), on their names, on some hash value used in a compiler's internal data structure, etc.
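If you'd rather check the layout from inside the program than read assembler, one possible sketch is to compare addresses as integers. The helper name byte_distance is made up here, and the code assumes a platform with intptr_t and a flat address space. Bear in mind that taking &i forces i into memory, which may itself change the layout you are trying to observe.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes from `from` to `to`, computed on the integer values of
   the two addresses. Converting a pointer to intptr_t is
   implementation-defined, unlike subtracting pointers to unrelated
   objects, which is undefined. */
ptrdiff_t byte_distance(const void *from, const void *to)
{
    assert(from != NULL && to != NULL);
    return (ptrdiff_t)((intptr_t)to - (intptr_t)from);
}
```

In the question's main, something like printf("%td\n", byte_distance(array, &i)); would print 40 if i sits right after a 10-element array of 4-byte ints, and -4 if it sits right before it.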

If you want to find out what your compiler decided to do, you can tell it to show you the assembler code. Oh, and learn to decipher assembler (it's easier than writing it). With GCC (and some other compilers, especially in the Unix world), pass the option -S to produce assembler code instead of a binary. For example, here's the assembler snippet for the loop from compiling with GCC on amd64 with the optimization option -O0 (no optimization), with comments added manually:

.L3:
    movl    -52(%rbp), %eax           ; load i to register eax
    cltq
    movl    $0, -48(%rbp,%rax,4)      ; set array[i] to 0
    movl    $.LC0, %edi
    call    puts                      ; printf of a constant string was optimized to puts
    addl    $1, -52(%rbp)             ; add 1 to i
.L2:
    cmpl    $10, -52(%rbp)            ; compare i to 10
    jle     .L3

Here the variable i is 52 bytes below the frame base (the address held in %rbp), while the array starts 48 bytes below it. So this compiler happens to have placed i just before the array; you'd overwrite i if you happened to write to array[-1]. If you change array[i]=0 to array[9-i]=0, you'll get an infinite loop on this particular platform with these particular compiler options.

Now let's compile your program with gcc -O1.

    movl    $11, %ebx
.L3:
    movl    $.LC0, %edi
    call    puts
    subl    $1, %ebx
    jne     .L3

That's shorter! The compiler has not only declined to allocate a stack location for i — it's only ever stored in the register ebx — but it hasn't bothered to allocate any memory for array, or to generate code to set its elements, because it noticed that none of the elements are ever used.

To make this example more telling, let's ensure that the array assignments are performed by providing the compiler with something it isn't able to optimize away. An easy way to do that is to use the array from another file — because of separate compilation, the compiler doesn't know what happens in another file (unless it optimizes at link time, which gcc -O0 or gcc -O1 doesn't). Create a source file use_array.c containing

void use_array(int *array) {}

and change your source code to

#include <stdio.h>
void use_array(int *array);

int main()
{
  int array[10],i;

  for (i = 0; i <=10 ; i++)
  {
    array[i]=0; /*code should never terminate*/
    printf("test \n");

  }
  printf("%zu \n", sizeof(array)/sizeof(int));
  use_array(array);
  return 0;
}

Compile with

gcc -c use_array.c
gcc -O1 -S -o with_use_array1.s with_use_array.c

This time the assembler code looks like this:

    movq    %rsp, %rbx
    leaq    44(%rsp), %rbp
.L3:
    movl    $0, (%rbx)
    movl    $.LC0, %edi
    call    puts
    addq    $4, %rbx
    cmpq    %rbp, %rbx
    jne     .L3

Now the array is on the stack, starting at the top (%rsp), and 44(%rsp) is the address where the loop stops. What about i? It doesn't appear anywhere! But the loop counter is kept in the register rbx. It's not exactly i, but the address of array[i]. The compiler has decided that since the value of i was never used directly, there was no point in performing arithmetic to calculate where to store 0 during each run of the loop. Instead that address is the loop variable, and the arithmetic to determine the boundaries was performed partly at compile time (multiply 11 iterations by 4 bytes per array element to get 44) and partly at run time but once and for all before the loop starts (perform an addition to get the end address).
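Written back in C, that generated loop is roughly a pointer walk like the one below. This is a hand-written sketch, not compiler output: zero_range is a made-up name, and the caller is expected to pass array + 10 as the end (one past the last element, which is a valid pointer to form) rather than the 11 elements of the original loop.

```c
#include <assert.h>
#include <stddef.h>

/* Zeroes every int in the half-open range [begin, end) by advancing a
   pointer, with no index variable at all. */
void zero_range(int *begin, int *end)
{
    assert(begin != NULL && end != NULL);
    for (int *p = begin; p != end; ++p) {
        *p = 0;
    }
}
```

Calling zero_range(array, array + 10) clears exactly the ten elements. The end address is computed once before the loop, just as the leaq instruction does in the assembler above.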

Even on this very simple example, we've seen how changing compiler options (turn on optimization) or changing something minor (array[i] to array[9-i]) or even changing something apparently unrelated (adding the call to use_array) can make a significant difference to what the executable program generated by the compiler does. Compiler optimizations can do a lot of things that may appear unintuitive on programs that invoke undefined behavior. That's why undefined behavior is left completely undefined. When you deviate ever so slightly from the tracks, in real-world programs, it can be very hard to understand the relationship between what the code does and what it should have done, even for experienced programmers.

Gilles 'SO- stop being evil'
25

Unlike Java, C doesn't check array bounds: there is no ArrayIndexOutOfBoundsException, and the job of making sure the array index is valid is left to the programmer. Going out of bounds on purpose leads to undefined behavior; anything could happen.


For an array:

int array[10]

indexes are only valid in the range 0 to 9. However, the loop

for (i = 0; i <=10 ; i++)

also accesses array[10]. Change the condition to i < 10.
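One common way to keep the bound from drifting out of sync with the declaration is to derive it from the array itself instead of typing 10 twice. A minimal sketch, where ARRAY_LEN and zero_fill are names invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Number of elements in a true array. This does NOT work on a pointer,
   e.g. on an array that was passed into a function as a parameter. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Sets the first n elements of a to 0, using a strict `<` bound. */
void zero_fill(int *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; i++) {
        a[i] = 0;
    }
}
```

In the question's main this would be used as zero_fill(array, ARRAY_LEN(array)); and if the declaration later changes to int array[20], the loop bound follows automatically.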

Yu Hao
  • 6
    Doing it not on purpose also leads to undefined behaviour - the compiler can't tell! ;-) – Toby Speight Jun 25 '15 at 21:35
  • 1
    Just use a macro to cast your errors as warnings: #define UNINTENDED_MISTAKE(EXP) printf("Warning: " #EXP " mistake\n"); – lkraider Jun 26 '15 at 04:02
  • 1
    I mean, if you are doing a mistake on purpose you might as well identify it as such and make it safe to avoid the undefined behaviour ;D – lkraider Jun 27 '15 at 23:42
19

You have a bounds violation, and on the non-terminating platforms, I believe you are inadvertently setting i to zero at the end of the loop, so that it starts over again.

array[10] is invalid; array contains 10 elements, array[0] through array[9], and array[10] would be the 11th. Your loop should be written to stop before 10, as follows:

for (i = 0; i < 10; i++)

Where array[10] lands is not defined by the language; it depends on the implementation. Amusingly, on two of your platforms it lands on i, which those platforms apparently lay out directly after array. i is set to zero and the loop continues forever. For your other platforms, i may be located before array, or array may have some padding after it.

Derek T. Jones
  • I don't think valgrind can catch this since it's still a valid location, but ASAN can. – o11c Jun 24 '15 at 02:42
12

Declaring int array[10] means array has indexes 0 to 9 (it can hold 10 integer elements in total). But the following loop,

for (i = 0; i <=10 ; i++)

will loop from 0 to 10, that is, 11 times. Hence when i = 10 it will overflow the buffer and cause undefined behavior.

So try this:

for (i = 0; i < 10 ; i++)

or,

for (i = 0; i <= 9 ; i++)
rakeb.mazharul
7

array[10] is outside the array, and accessing it gives undefined behavior as described before. Think about it like this:

I have 10 items in my grocery cart. They are:

0: A box of cereal
1: Bread
2: Milk
3: Pie
4: Eggs
5: Cake
6: A 2 liter of soda
7: Salad
8: Burgers
9: Ice cream

cart[10] is undefined, and some compilers or tools may report it as out of bounds. But a lot apparently don't. The apparent 11th item is an item not actually in the cart. The 11th index points to what I'm going to call a "poltergeist item." It never existed, but it was there.

The reason some compilers place i at array[10] or array[11] or even array[-1] lies in how they handle your declaration statement. Some compilers interpret it as:

  • "Allocate 10 blocks of ints for array[10] and another int block. To make it easier, put them right next to each other."
  • Same as before, but move it a space or two away, so that array[10] doesn't point to i.
  • Do the same as before, but allocate i at array[-1] (because an index of an array can't, or shouldn't, be negative), or allocate it at a completely different spot because the OS can handle it, and it's safer.

Some compilers want things to go quicker, and some compilers prefer safety. It's all about the context. If I was developing an app for the ancient BREW OS (the OS of a basic phone), for example, it wouldn't care about safety. If I was developing for an iPhone 6, then it could run fast no matter what, so I would need an emphasis on safety. (Seriously, have you read Apple's App Store Guidelines, or read up on the development of Swift and Swift 2.0?)

Lee Taylor
DDPWNAGE
  • Note: I typed the list so it goes "0, 1, 2, 3, 4, 5, 6, 7, 8, 9", but SO's Markup language fixed the positions of my ordered list. – DDPWNAGE Jul 03 '15 at 18:10
6

Since you created an array of size 10, the for loop condition should be as follows:

int array[10],i;

for (i = 0; i <10 ; i++)
{

Currently you are trying to access a location outside the array with array[10], and that causes undefined behavior. Undefined behavior means your program may behave unpredictably, so it can give different outputs in each execution.

Steephen
5

C compilers traditionally do not check array bounds. You can get a segmentation fault if you refer to a location that does not "belong" to your process. However, local variables are allocated on the stack, and depending on how the memory is laid out, the area just beyond the array (array[10]) may still belong to the process's memory segment. In that case no segmentation fault is raised, and that is what you seem to be experiencing. As others have pointed out, this is undefined behavior in C and your code may behave erratically. Since you are learning C, you are better off getting into the habit of checking bounds in your code.
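As one possible shape for that habit, here is a small sketch of a setter that refuses out-of-range indexes instead of writing past the end. The name checked_set and its signature are invented for this example; the length has to be passed in because C does not store it alongside the array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stores value at a[index] only when index is below len.
   Returns true if the write happened, false if the index was rejected. */
bool checked_set(int *a, size_t len, size_t index, int value)
{
    assert(a != NULL);   /* a NULL array is treated as a programming error */
    if (index >= len) {
        return false;
    }
    a[index] = value;
    return true;
}
```

With this, the question's loop running to i <= 10 would get false back on the last iteration and leave memory outside the array untouched, which is about as close to Java's predictable failure as plain C gets without extra tooling.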

unxnut
4

Beyond the possibility that memory might be laid out so that an attempt to write to a[10] actually overwrites i, it would also be possible that an optimizing compiler might determine that the loop test cannot be reached with a value of i greater than ten without code having first accessed the non-existent array element a[10].

Since an attempt to access that element would be undefined behavior, the compiler would have no obligations with regard to what the program might do after that point. More specifically, since the compiler would have no obligation to generate code to check the loop index in any case where it might be greater than ten, it would have no obligation to generate code to check it at all; it could instead assume that the <=10 test will always yield true. Note that this would be true even if the code would read a[10] rather than writing it.

Peter Mortensen
supercat
3

When you iterate past i==9, you assign zero to 'array items' that are actually located past the end of the array, so you're overwriting some other data. Most probably you overwrite the i variable, which is located after a[]. That way you simply reset i to zero and thus restart the loop.

You could discover that yourself if you printed i in the loop:

      printf("test i=%d\n", i);

instead of just

      printf("test \n");

Of course that result strongly depends on the memory allocation for your variables, which in turn depends on a compiler and its settings, so it is generally Undefined Behavior — that's why results on different machines or different operating systems or on different compilers may differ.
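If you would rather inspect the layout without writing out of bounds at all, something like the following sketch may help. The function name i_follows_array is made up for this example. Note that taking the address of i can itself change where a compiler places it, and an optimizer may decide the comparison either way, so treat the result as a hint about one particular build, not as proof.

```c
#include <stdio.h>

/* Prints where the compiler placed the two variables and returns 1
   if i sits exactly where array[10] would be, 0 otherwise.
   Forming the one-past-the-end pointer array + 10 and comparing it
   for equality is allowed; dereferencing it is not. */
int i_follows_array(void)
{
    int array[10], i = 0;

    printf("array starts at %p\n", (void *)array);
    printf("array ends at   %p\n", (void *)(array + 10));
    printf("i lives at      %p\n", (void *)&i);

    return (void *)&i == (void *)(array + 10);
}
```

On a build where the function returns 1, a write to array[10] is likely to land on i. On a build where it returns 0, the stray write hits something else, which is consistent with the differing behaviour the question describes.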

CiaPan
0

The error is in array[10], which here is also the address of i (given int array[10],i;). When array[10] is set to 0, i becomes 0, which restarts the loop and causes the infinite loop. The loop will run forever whenever the value stored in array[10] is between 0 and 10. The correct loop should be for (i = 0; i < 10; i++) {...} rather than the original: int array[10],i; for (i = 0; i <=10 ; i++) array[i]=0;

0

I will suggest something that I didn't find above:

Try assigning array[i] = 20;

I guess this should terminate the code everywhere (given that you keep i <= 10 or 11).

If it does, you can be confident that the answers already given here are correct (for example, the one about memory stomping).

-9

There are two things wrong here. The int i is actually an array element, array[10], as seen on the stack. Because you have allowed the indexing to actually make array[10] = 0, the loop index, i, will never exceed 10. Make it for(i=0; i<10; i+=1).

i++ is, as K&R would call it, 'bad style'. It is incrementing i by the size of i, not 1. i++ is for pointer math and i+=1 is for algebra. While this depends on the compiler, it is not a good convention for portability.

Peter Mortensen
SkipBerne
  • 5
    -1 Completely wrong. Variable `i` is NOT an array element `a[10]`; there's no obligation or even suggestion for a compiler to put it on the stack immediately *after* `a[]` – it can just as well be located before the array, or separated from it by some additional space. It could even be allocated outside the main memory, for example in a CPU register. It's also untrue that `++` is for pointers and not for integers. Completely wrong is 'i++ is incrementing i by the size of i' – read the operator description in the language definition! – CiaPan Jun 25 '15 at 07:40
  • Which is why it works on some platforms and not others. It is the only logical explanation for why it loops forever on Windows. With regard to i++, it is pointer math, not integer. Read the Scriptures ... 'The C Programming Language' by Kernighan and Ritchie. If you want, I have an autographed copy, and have been programming in C since 1981. – SkipBerne Jun 25 '15 at 12:05
  • 1
    Read the source code by OP and find the declaration of variable `i`: it is of `int` type. It is an **integer**, not a pointer; an integer, used as an index to the `array`. – CiaPan Jun 25 '15 at 12:21
  • 1
    I did, and that is why I commented as I did. Maybe you should realize that, unless the compiler includes stack checks (and in this case it would not matter), the stack reference when i=10 would, with some compilers, actually be referencing the array index, and that is within the bounds of the stack region. Compilers can't fix stupid. Compilers might make a fixup, as it appears this one does, but a pure interpretation of the C programming language would not support this convention and would, as the OP said, produce non-portable results. – SkipBerne Jun 25 '15 at 13:56
  • @SkipBerne: Consider deleting your answer before you are "awarded" more negative points. – Peter VARGA Jul 01 '15 at 07:10
  • Read the book. Stop hating. CiaPan actually said the same thing in his answer but then turns around and faults me. I really don't care what you all do. – SkipBerne Jul 01 '15 at 13:45
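For readers following the disagreement above about i++, the difference between incrementing an int and incrementing a pointer can be checked directly. This is only a small sketch, and the function names int_step and pointer_step_bytes are invented for the example.

```c
#include <stddef.h>

/* Returns the value of an int that started at 0 after one i++.
   For an integer, ++ adds exactly 1. */
int int_step(void)
{
    int i = 0;
    i++;
    return i;
}

/* Returns how many bytes an int pointer advances after one p++.
   Pointer arithmetic counts elements, so this is sizeof(int). */
size_t pointer_step_bytes(void)
{
    int array[2] = {0, 0};
    int *p = array;
    p++;
    return (size_t)((char *)p - (char *)array);
}
```

Here int_step() returns 1, while pointer_step_bytes() returns sizeof(int). Since i in the question is declared as int, i++ and i+=1 do the same thing there.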