Neither the C nor the C++ standard defines the order in which arguments are passed, or how they should be organised in memory. It is up to the compiler developers (usually in cooperation with the OS developers) to come up with something that works on a particular processor architecture.
On MOST architectures, arguments are passed on the stack, in registers, or a mix of both. Again on MOST architectures, the stack grows from "high to low" addresses, and most C implementations pass arguments "left last" (that is, right to left), so if we have a function
void test( int a, int b, int c )
then arguments are passed in the order:
c, b, a
to the function.
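To make the "left last" order concrete, here is a minimal sketch of a toy stack that grows toward lower indices. The toy_stack type and the toy_init, toy_push and toy_pass_args functions are made up for illustration and do not model any real calling convention; they only show that pushing c, b, a in that order leaves a in the lowest slot.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STACK_SLOTS 8

/* Hypothetical toy model of a downward-growing stack; not a real ABI. */
struct toy_stack {
    int slots[TOY_STACK_SLOTS];
    size_t sp;  /* index of the most recently pushed slot */
};

void toy_init(struct toy_stack *s)
{
    s->sp = TOY_STACK_SLOTS;  /* empty stack: sp sits just past the highest slot */
}

/* Pushing moves sp toward lower indices, like a stack growing "high to low". */
size_t toy_push(struct toy_stack *s, int value)
{
    assert(s->sp > 0);        /* toy model: no overflow handling */
    s->sp -= 1;
    s->slots[s->sp] = value;
    return s->sp;             /* the slot ("address") the value landed in */
}

/* Push three arguments right to left and record where each one lands. */
void toy_pass_args(struct toy_stack *s, int a, int b, int c, size_t where[3])
{
    where[2] = toy_push(s, c);  /* rightmost argument is pushed first */
    where[1] = toy_push(s, b);
    where[0] = toy_push(s, a);  /* leftmost argument is pushed last: lowest slot */
}
```

Calling toy_pass_args on a freshly initialised stack puts c in slot 7, b in slot 6 and a in slot 5, so in this model the "addresses" increase from a to c. A real compiler is free to do something different.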
However, what complicates this picture is that the values of the arguments are often passed in registers, while the code using the arguments takes their addresses. Registers don't have addresses, so you can't take the address of a value that lives only in a register. So the compiler generates some code to store each value on the stack, locally to the function, so that there is an address to take. The order in which it does this is entirely up to the compiler, and I'm fairly sure this is what you are seeing.
If we take your code and pass it through clang, we see the following LLVM IR:
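If you want to observe the layout without reading memory next to a parameter, a sketch along these lines may help. The names capture_param_addresses and classify_layout are hypothetical helpers, not part of your code. The only thing the language guarantees here is that the three addresses are distinct; which one is lowest is the compiler's choice and can change with the compiler, the target or the optimisation level.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: record where this call's copies of a, b and c live. */
void capture_param_addresses(int a, int b, int c, uintptr_t out[3])
{
    assert(out != 0);
    /* Taking an address forces each parameter to have a memory location. */
    out[0] = (uintptr_t)(void *)&a;
    out[1] = (uintptr_t)(void *)&b;
    out[2] = (uintptr_t)(void *)&c;
}

/* Returns 1 if the addresses ascend from a to c, -1 if they descend,
   and 0 for any other layout. */
int classify_layout(const uintptr_t addr[3])
{
    if (addr[0] < addr[1] && addr[1] < addr[2]) return 1;
    if (addr[0] > addr[1] && addr[1] > addr[2]) return -1;
    return 0;
}
```

Calling capture_param_addresses(1, 2, 3, out) and then classify_layout(out) tells you which order your particular build chose. Any of the three results is a valid outcome, so code should not rely on one of them.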
define void @test(i32 %a, i32 %b, i32 %c) #0 {
entry:
%a.addr = alloca i32, align 4
%b.addr = alloca i32, align 4
%c.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
store i32 %b, i32* %b.addr, align 4
store i32 %c, i32* %c.addr, align 4
%call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i32 0, i32 0), i32* %a.addr, i32* %b.addr, i32* %c.addr)
%add.ptr = getelementptr inbounds i32, i32* %b.addr, i64 -1
%0 = load i32, i32* %add.ptr, align 4
%add.ptr1 = getelementptr inbounds i32, i32* %b.addr, i64 1
%1 = load i32, i32* %add.ptr1, align 4
%call2 = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([7 x i8], [7 x i8]* @.str.1, i32 0, i32 0), i32 %0, i32 %1)
ret void
}
Although it may not be ENTIRELY trivial to read, you can see that the first few lines of the test function are:
%a.addr = alloca i32, align 4
%b.addr = alloca i32, align 4
%c.addr = alloca i32, align 4
store i32 %a, i32* %a.addr, align 4
store i32 %b, i32* %b.addr, align 4
store i32 %c, i32* %c.addr, align 4
This essentially creates space on the stack (alloca) and stores the variables a, b, and c into those locations.
The assembler code that gcc generates is even harder to read, but you can see a similar thing happening here:
subq $16, %rsp ; <-- "alloca" for 4 integers.
movl %edi, -4(%rbp) ; Store a, b and c.
movl %esi, -8(%rbp)
movl %edx, -12(%rbp)
leaq -12(%rbp), %rcx ; Take address of ...
leaq -8(%rbp), %rdx
leaq -4(%rbp), %rax
movq %rax, %rsi
movl $.LC0, %edi
movl $0, %eax
call printf ; Call printf.
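You may wonder why it allocates space for 4 integers when there are only three. The three ints need 12 bytes, but the x86-64 ABI requires the stack pointer to be aligned to 16 bytes when a function is called, so the compiler rounds the allocation up to 16.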
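The arithmetic behind the 16 in subq $16, %rsp can be sketched as a round-up to the next multiple of the alignment. The helper name round_up is made up for illustration, and the figure of 12 bytes assumes 4-byte ints, as on x86-64. This is not how gcc itself is written, only the same calculation.

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of align.
   align must be a non-zero power of two for the bit trick to work. */
size_t round_up(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + align - 1) & ~(align - 1);
}
```

With three 4-byte ints, round_up(12, 16) gives 16, which matches the space reserved in the listing. A fifth int would push the total to 20 bytes, and round_up(20, 16) gives 32.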