
This question comes out of me writing a toy compiler, but I don't think it's necessary to show the compiler side of it. This is strictly an issue of me not being very experienced with assembly.

I am working on implementing functions in my language. I have generated the following assembly (AT&T syntax), which should print 6. The program segfaults at `call *%rax`, which I assume means that I'm not saving the address of my function correctly. What should happen is that the function `f0` reads its arguments from the stack, multiplies them, and leaves the result in `%rax`. The `print` function is just a wrapper around C's `printf` that prints `%rax`.

I'm running Ubuntu 20.04, and the code was compiled with `gcc test.S -o test.out -no-pie`, in case it matters.

    .data
    numfmt:
        .asciz "%d\n"

    .text
    print:
        push %rbx
        mov %rax, %rsi
        mov $numfmt, %rdi
        xor %rax, %rax
        call printf
        pop %rbx
        ret

    .globl main
    main:
        push %rbp
        mov %rsp, %rbp
        jmp s0
    f0:
        push %rbp
        mov %rsp, %rbp
        mov 16(%rbp), %rax
        push %rax
        mov 8(%rbp), %rax
        pop %rcx
        imul %rcx, %rax
        mov %rbp, %rsp
        pop %rbp
        ret
    s0:
        mov $f0, %rax
        push %rax
        mov $1, %rax
        push %rax
        mov $2, %rax
        push %rax
        mov -8(%rbp), %rax
        call *%rax
        add 16, %rsp
        call print
        mov $0, %rax
        mov %rbp, %rsp
        pop %rbp
        ret
Alex F
  • Shouldn't it be `mov $f0, %rax`? – Nate Eldredge Jul 09 '20 at 22:08
  • Also, I may have lost count of the pushes, but I don't think you've got 16-byte stack alignment when printf is called. – Nate Eldredge Jul 09 '20 at 22:11
  • 1) Changing that line doesn't fix the problem on its own, although you may be correct. 2) I believe that `print` aligns it, but I can't remember where I got that code from, and I might have removed the alignment part a week ago because I didn't realize I needed it for `cdecl` then. – Alex F Jul 09 '20 at 22:17
  • 2
    `print` will maintain proper alignment (that's what `push %rbx` does) if the stack is already aligned correctly when you call it. But by my count it isn't. – Nate Eldredge Jul 09 '20 at 22:18

1 Answer


In AT&T syntax, `mov f0, %rax` is a move from memory: it loads `%rax` with the qword located at address `f0`. But you want to load the address itself. That's an immediate move, so the operand `f0` needs to be prefixed with `$` to mark it as an immediate. Thus it should say `mov $f0, %rax`.

For similar reasons, `add 16, %rsp` is wrong, as it attempts to add to `%rsp` the value located at address 16. That page is not mapped, hence the segfault. Again, you want `add $16, %rsp`.
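To make the difference concrete, here is a minimal sketch that puts the memory and immediate forms side by side. The label `slot` and the file name are made up for illustration, and it assumes a non-PIE build like the one in the question:

```asm
# Build (assumed): gcc operands.S -o operands -no-pie
    .data
slot:
    .quad 42                # a qword stored at the address `slot`

    .text
    .globl main
main:
    mov  slot, %rax         # no $: memory operand, %rax = 42 (the qword at slot)
    mov  $slot, %rcx        # with $: immediate, %rcx = the address of slot
    mov  (%rcx), %rdx       # dereference that address: %rdx = 42 as well

    sub  $16, %rsp          # immediate 16: reserve 16 bytes of stack
    add  $16, %rsp          # immediate 16: release them again
                            # (add 16, %rsp would instead read memory at address 16)

    sub  %rdx, %rax         # 42 - 42 = 0, used as the exit status
    ret
```

If both loads agree, the program exits with status 0.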

Next, in `f0` your frame pointer offsets are wrong. `16(%rbp)` is the second parameter pushed, not the first, and `8(%rbp)` is the return address. Don't forget to account for the effect of `push %rbp` itself on the stack pointer. So you end up computing 2 times the return address instead of 1 times 2. Make those `24(%rbp)` and `16(%rbp)`.
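As a sketch of the corrected layout, here is a cut-down version that uses a direct `call` instead of the indirect one to keep it short; that doesn't change what `f0` sees on the stack. Passing the result back as the exit status is my own shortcut, not something from the question:

```asm
# Build (assumed): gcc frame.S -o frame -no-pie ; ./frame ; echo $?
    .text
f0:
    push %rbp
    mov  %rsp, %rbp
    # Stack as seen from %rbp at this point:
    #   24(%rbp)  first argument pushed  (1)
    #   16(%rbp)  second argument pushed (2)
    #    8(%rbp)  return address pushed by call
    #    0(%rbp)  caller's %rbp saved by the push above
    mov  24(%rbp), %rax     # first argument
    imul 16(%rbp), %rax     # times the second argument
    pop  %rbp
    ret

    .globl main
main:
    push %rbp
    mov  %rsp, %rbp
    push $1                 # first argument
    push $2                 # second argument
    call f0                 # result comes back in %rax
    add  $16, %rsp          # drop the two arguments (note the $)
    pop  %rbp
    ret                     # exit status = 1 * 2 = 2
```

Each `push` and the `call` move `%rsp` down by 8, which is why the arguments start at 16 rather than 8.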

Finally, you need to make sure the stack is aligned to 16 bytes when you call `printf`. Library functions behave unpredictably when it's not. Sometimes they happen not to do anything that requires alignment and then everything appears to work; other times they crash.
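Counting bytes by hand helps here. Below is a minimal sketch of the wrapper with the alignment tracked in comments. It assumes the System V x86-64 convention that `%rsp` is a multiple of 16 immediately before every `call`, and a non-PIE build as in the question:

```asm
# Build (assumed): gcc align.S -o align -no-pie
    .data
numfmt:
    .asciz "%d\n"

    .text
print:                      # entry: %rsp = 16n + 8 (call pushed a return address)
    push %rbx               # 8 more bytes: %rsp is a multiple of 16 again
    mov  %rax, %rsi         # value to print
    mov  $numfmt, %rdi      # format string
    xor  %rax, %rax         # variadic call: no vector registers used
    call printf             # aligned at the call, as the ABI requires
    pop  %rbx
    ret

    .globl main
main:                       # entry: %rsp = 16n + 8
    push %rbp               # now a multiple of 16
    mov  %rsp, %rbp
    mov  $6, %rax
    call print              # aligned here, so print's push %rbx re-aligns it
    xor  %eax, %eax         # return 0
    pop  %rbp
    ret
```

In the question's `main`, one of the three pushed qwords is still on the stack after the `add $16, %rsp`, so `%rsp` is off by 8 at `call print`. Dropping all 24 bytes instead would restore the alignment.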

Nate Eldredge
  • I've changed that line, but the segfault remains. – Alex F Jul 09 '20 at 22:20
  • @AlexF: But I think you'll find it's not the same segfault - the call now goes to `f0` as it should, at least in my test. So you've fixed that bug and can now move on to your next bug. – Nate Eldredge Jul 09 '20 at 22:25
  • I see. Now it doesn't segfault, but prints `12596322`? Hmm... – Alex F Jul 09 '20 at 22:29
  • The final problem wasn't the stack alignment, but that my stack offsets were wrong - they should've been 16 and 24, not 8 and 16. Thanks for your time! – Alex F Jul 09 '20 at 22:51
  • @AlexF: Also note that `mov $f0, %rax` only works in non-PIE code (because it uses a 32-bit absolute address). Use `lea f0(%rip), %rax` to put an address in a register. [How to load address of function or label into register in GNU Assembler](https://stackoverflow.com/q/57212012), unless you're optimizing for non-PIE on purpose, then use `mov $f0, %eax` (32-bit operand-size = smaller instruction), or in your case `push $f0` since you ultimately need the value on the stack, not in a register, it seems. I'm sure you realize that your toy compiler's code-gen is at this point very inefficient :P – Peter Cordes Jul 10 '20 at 04:01
  • @Peter Cordes I'm fully aware that it's inefficient. This is just a toy, and there are exactly zero optimizations being done at the moment. I'm generating this way because it lets me do first-class functions easily in the compiler. About the PIE thing, I already have to compile with `-no-pie` on GCC to get it to work on my machine. Not really sure what the flag does though. – Alex F Jul 10 '20 at 04:35
  • @AlexF: no-pie makes a traditional position-*dependent* ELF executable that isn't ASLRed at load time. Static addresses are a link-time constant, and are in the low 2GiB of virtual address space so can be used as 32-bit immediates (sign- or zero-extended to 64-bit). [32-bit absolute addresses no longer allowed in x86-64 Linux?](https://stackoverflow.com/q/43367427). I know you're not intentionally optimizing anything at this point, but `mov $f0, %rax` has zero advantages; nobody should ever do that except for a non-PIE kernel that loads in the high 2GiB (can sign but not zero extend addrs). – Peter Cordes Jul 10 '20 at 04:40
  • @PeterCordes nvm, I figured out how to compile without `-no-pie` using rip-relative addressing for the format string for printf (although there may be a better way, but that's more of a Code Review question than a SO one.) Thanks for your advice and links! – Alex F Jul 10 '20 at 05:01
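
For reference, a minimal sketch of the RIP-relative approach discussed in the comments above, which should build as a PIE without `-no-pie`. The body of `f0` is a stand-in that just returns 6, and the `printf@PLT` spelling is my choice for the sketch, not something taken from the thread:

```asm
# Build (assumed): gcc pie.S -o pie
    .data
numfmt:
    .asciz "%d\n"

    .text
f0:                         # stand-in function: returns 6 in %eax
    mov  $6, %eax
    ret

    .globl main
main:
    push %rbp               # also makes %rsp a multiple of 16
    mov  %rsp, %rbp
    lea  f0(%rip), %rax     # address of f0, computed relative to %rip
    call *%rax              # indirect call through the register
    mov  %eax, %esi         # value to print
    lea  numfmt(%rip), %rdi # format string address, also RIP-relative
    xor  %eax, %eax         # variadic call: no vector registers used
    call printf@PLT         # call through the PLT so the PIE links cleanly
    xor  %eax, %eax         # return 0
    pop  %rbp
    ret
```

No 32-bit absolute address appears anywhere, which is what lets the linker produce a position-independent executable.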