11

I am not able to understand this. Please explain.

Edit: It prints: 'hello, world!'

#include <stdio.h>

int i;
main()
{
  for(;i["]<i;++i){--i;}"];read('-'-'-',i+++"hell\o, world!\n",'/'/'/'));
  //For loop executes once, calling function read with three arguments.
}

read(j,i,p)
{
  write(j/p+p,i---j,i/i);  //how does it work? like printf?
}
War10ck

3 Answers

25

Breaking it down, you have:

for({initial expr};{conditional expr};{increment expr})

The '{initial expr}' is blank so it does nothing. The '{conditional expr}' is 'i["]<i;++i){--i;}"]'

which is the same as

"]<i;++i){--i;}"[i]

or

const char* str = "]<i;++i){--i;}";
for (; str[i]; )

so it loops until the expression is false (i.e. it hits the null at the end of the string).

The {increment expr} is

read('-'-'-',i+++"hell\o, world!\n",'/'/'/')

If you break down the read parameters, the first one is:

'-' - '-' == char('-') - char('-') == 0

For parameter two you have:

i+++"hell\o, world!\n"

which is the same as: i++ + "hell\o, world!\n"

So it increments the 'i' variable, which means the for loop runs once for each character in the conditional string "]<i;++i){--i;}".

For the first time around you end up with:

0 + "hell\o, world!\n"

The second time around the loop will be 1 + "hell\o, world!\n", etc.

So the second parameter is a pointer into the "hell\o, world!\n".

The third parameter is:

'/'/'/' == '/' / '/' == char('/') / char('/') == 1

So the third parameter is always 1.

Now we break down the read function that calls write:

write(j/p+p,i---j,i/i);

There are three parameters, the first is:

j/p+p where j == 0, p == 1 so 0/1+1 == 1.

According to the documentation for the write function, file descriptor 1 is standard output.

The second parameter to write is

i---j

which is the same as i-- - j, where i is the pointer to the string and j == 0. Since i is post-decremented, the decrement doesn't affect the value used, and '- 0' does nothing, so it simply passes the pointer through to the write function.

The third parameter is 'i / i' which will always be 1.

So for each call to 'read' it writes one character out of the "hell\o, world!\n" string each time.

0x499602D2
Shane Powell
  • ... except that `i` (the global) starts with an undefined value, so what it actually does is randomly access memory. – Tommy Mar 17 '14 at 05:53
  • 11
    'i' is defined in the 'global' area so it will always start off being zero. see [static variable initialization](http://stackoverflow.com/questions/1831290/static-variable-initialization) – Shane Powell Mar 17 '14 at 05:57
  • Can you finger-on-paper the spec? Logic and secondary evidence shows I'm wrong but can you finish that thought? E.g. in the C99 spec at http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf ? – Tommy Mar 17 '14 at 06:05
  • Why is `\o` inside `"hell\o, world!"` printed as simple `o`? I could not find that escape sequence. – halex Mar 17 '14 at 06:06
  • Undefined escape codes are taken by most compilers just to be the character. It's a misdirect. E.g. Microsoft mentions it as a 'Microsoft Specific' in http://msdn.microsoft.com/en-us/library/h21280bw.aspx – Tommy Mar 17 '14 at 06:09
  • @halex: actually, I think that in the original version the line was split in two and that backslash was used as a line continuation character. It works anyhow because unknown escape sequences are usually treated as the character itself (probably not standard, but it's widespread behavior). – Matteo Italia Mar 17 '14 at 06:14
0
read('-'-'-',i+++"hell\o, world!\n",'/'/'/')

Calls read with the first argument:

'-' - '-'

So that's the subtraction of a char from itself, i.e. zero.

The second argument is:

i++ + "hell\o, world!\n"

So that is an address within the string constant "hell\o, world!\n" which will depend on the value of i.

The third argument is:

'/' / '/'

A reprise of the arithmetic on character literals theme, this time producing 1.

Rather than the normal read, that call goes to the function defined at the bottom, which actually performs a write.

Argument 1 to the write is:

j/p+p

Which is 0/1+1 = 1.

Argument 2 is:

i-- - j

Which evaluates back to the pointer that was passed in (the post-decrement yields the original value and j is 0), i.e. the address within "hell\o, world!\n".

The third argument is:

i/i

i.e. 1.

So the net effect of the read is to write one byte from the string passed in to file descriptor 1.

It doesn't return anything, though it should, so the result and therefore the exact behaviour of the earlier loop is undefined.

The subscript on i in the for loop is identical to writing:

*((i) + (the string given))

i.e. it grabs a byte from within that string. As the initial value of i is undefined, this could be an out-of-bounds access.

Note that the i within read is local, not the global. So the global one continues to increment, passing along one character at a time, until it gets to a terminating null in the other string literal.

If i were given 0 as an initial value then this code would be correct.

(EDIT: as has been pointed out elsewhere, I was wrong here: i is initially zero because it's a global. Teleologically, it costs nothing at runtime to give globals defined initial values so C does. It would cost to give anything on the stack an initial value, so C doesn't.)

Tommy
0

First see the syntax of read and write function in C and what they do:

ssize_t read(int fildes, void *buf, size_t nbyte); 

The read() function shall attempt to read nbyte bytes from the file associated with the open file descriptor, fildes, into the buffer pointed to by buf.

ssize_t write(int fildes, const void *buf, size_t nbyte);  

The write() function shall attempt to write nbyte bytes from the buffer pointed to by buf to the file associated with the open file descriptor, fildes.

Now, rewriting your for loop as

for(;i["]<i;++i){--i;}"]; read('-' - '-', i++ + "hell\o, world!\n", '/' / '/'));

Starting with i["]<i;++i){--i;}"];

"]<i;++i){--i;}" is a string. In C, if

char ch;
char *a = "string";

then you can write ch = "string"[i], which is equivalent to i["string"] (since a[i] == i[a]). Both forms add i to the address of the string and read the character there (i is initialized to 0 because it is defined globally). So the condition reads successive characters of "]<i;++i){--i;}", while the same i serves as an offset into the string hell\o, world!\n.
Now the point is that the for loop is not iterating only once!
The expression read('-' - '-', i++ + "hell\o, world!\n", '/' / '/') can be rewritten as (for the sake of convenience);

read(0, i++ + "hell\o, world!\n", 1)  
                                  ^ read only one byte (one character)   

What it actually does is call read and then increment i (the call uses the previous value). i is added to the starting address of the string hell\o, world!\n, so the first call of read prints just h. On the next iteration i has been incremented, so the argument points at the next character, and the call to read prints that character.
This will continue until i["]<i;++i){--i;}"] becomes false (at \0).

Overall the behavior of code is undefined!


EXPLANATION for UB:

Note that a function call f(a,b,c) is not a use of the comma operator and the order of evaluation for a, b, and c is unspecified.

Also C99 states that:

Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be accessed only to determine the value to be stored.

Hence the call

write(j/p+p, i-- -j, i/i);  

invokes UB: i is modified by i-- and read by i/i with no sequence point in between. The compiler should raise a warning

[Warning] operation on 'i' may be undefined [-Wsequence-point]

haccks