I am writing a C program that calls an x86 Assembly function which adds two numbers. Below are the contents of my C program (CallAssemblyFromC.c):

#include <stdio.h>
#include <stdlib.h>

int addition(int a, int b);

int main(void) {
    int sum = addition(3, 4);
    printf("%d", sum);
    return EXIT_SUCCESS;
}

Below is the code of the Assembly function (addition.s). My idea is to code the stack frame prologue and epilogue from scratch, and I have added comments to explain the logic of my code:

.text

# Here, we define a function addition
.global addition
addition:
    # Prologue:
    # Push the current EBP (base pointer) to the stack, so that we
    # can reset the EBP to its original state after the function's
    # execution
    push %ebp
    # Move the EBP (base pointer) to the current position of the ESP
    # register
    movl %esp, %ebp

    # Read in the parameters of the addition function
    # addition(a, b)
    #
    # Since we are pushing to the stack, we need to obtain the parameters
    # in reverse order:
    # EBP (return address) | EBP + 4 (return value) | EBP + 8 (b) | EBP + 4 (a)
    #
    # Utilize advanced indexing in order to obtain the parameters, and
    # store them in the CPU's registers
    movzbl 8(%ebp), %ebx
    movzbl 12(%ebp), %ecx

    # Clear the EAX register to store the sum
    xorl %eax, %eax
    # Add the values into the section of memory storing the return value
    addl %ebx, %eax
    addl %ecx, %eax

I am getting a segmentation fault, which seems strange because I think I am allocating memory in accordance with the x86 calling conventions (e.g. assigning the correct memory locations to the function's parameters). If any of you have a solution, I would also greatly appreciate some advice on how to debug an Assembly function called from C. I have been using the GDB debugger, but it only points to the line of the C program where the segmentation fault happens, not to the line in the Assembly code.

Adam Lee
  • Stepping through assembly code: https://stackoverflow.com/questions/2420813/using-gdb-to-single-step-assembly-code-outside-specified-executable-causes-error – Jabberwocky Nov 13 '20 at 10:03
  • `movl %ebp, %esp`: in AT&T syntax, this moves the value in `ebp` to the register `esp`. You want the reverse. – ecm Nov 13 '20 at 11:48
  • Just to check - you're sure you're correctly compiling and assembling the whole program as 32-bit code? On a 64-bit system that normally means using `-m32` when compiling and linking. – Nate Eldredge Nov 14 '20 at 00:21
  • @NateEldredge Yes, I am compiling the program as 32-bit code. – Adam Lee Nov 14 '20 at 01:25

2 Answers

  1. Your function has no epilogue. You need to restore %ebp and pop the stack back to where it was, and then ret. If that's really missing from your code, then that explains your segfault: the CPU will go on executing whatever garbage happens to be after the end of your code in memory.

  2. You clobber (i.e. overwrite) the %ebx register which is supposed to be callee-saved. (You mention following the x86 calling conventions, but you seem to have missed that detail.) That would be the cause of your next segfault, after you fixed the first one. If you use %ebx, you need to save and restore it, e.g. with push %ebx after your prologue and pop %ebx before your epilogue. But in this case it is better to rewrite your code so as not to use it at all; see below.

  3. movzbl loads an 8-bit value from memory and zero-extends it into a 32-bit register. Here the parameters are int so they are already 32 bits, so plain movl is correct. As it stands your function would give incorrect results for any arguments which are negative or larger than 255.

  4. You're using an unnecessary number of registers. You could move the first operand for the addition directly into %eax rather than putting it into %ebx and adding it to zero. And on x86 it is not necessary to get both operands into registers before adding; arithmetic instructions have a mem, reg form where one operand can be loaded directly from memory. With this approach we don't need any registers other than %eax itself, and in particular we don't have to worry about %ebx anymore.
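
As a rough illustration of point 3, the following C sketch mimics the arithmetic effect of loading each argument with movzbl versus movl. It does not call any assembly, and the function names are hypothetical (they are not part of the original program); it only assumes that movzbl keeps the low 8 bits of the int and zero-extends them.

```c
/* Models `movzbl mem, reg`: keep only the low 8 bits of the value and
   zero-extend them to 32 bits. (Hypothetical helper for illustration.) */
int load_movzbl(int x) {
    return (int)(unsigned char)x;
}

/* Models `movl mem, reg`: the full 32-bit value is loaded unchanged. */
int load_movl(int x) {
    return x;
}

/* What the addition would compute if both arguments were loaded with movzbl. */
int addition_with_movzbl(int a, int b) {
    return load_movzbl(a) + load_movzbl(b);
}

/* What the addition computes when both arguments are loaded with movl. */
int addition_with_movl(int a, int b) {
    return load_movl(a) + load_movl(b);
}
```

With the arguments from the question, addition_with_movzbl(3, 4) still gives 7, which is why the problem is easy to miss. addition_with_movzbl(300, 1) gives 45 and addition_with_movzbl(-1, 0) gives 255, whereas the movl version gives 301 and -1.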

I would write:

.text

# Here, we define a function addition
.global addition
addition:
    # Prologue:
    push %ebp
    movl %esp, %ebp

    # load first argument
    movl 8(%ebp), %eax 
    # add second argument
    addl 12(%ebp), %eax

    # epilogue
    movl %ebp, %esp  # redundant since we haven't touched esp, but will be needed in more complex functions 
    pop %ebp
    ret

In fact, you don't need a stack frame for this function at all, though I understand if you want to include it for educational value. But if you omit it, the function can be reduced to

.text
.global addition
addition:
    movl 4(%esp), %eax
    addl 8(%esp), %eax
    ret
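
To see why the version with a stack frame reads the arguments at 8(%ebp) and 12(%ebp) while the frameless version reads them at 4(%esp) and 8(%esp), here is a toy C model of the stack. It is only a sketch under the assumption of the usual 32-bit cdecl convention (arguments pushed right to left, then the return address pushed by call). The ToyStack type and the toy_* functions are hypothetical names made up for this illustration, and the return address is a placeholder value.

```c
#include <stdint.h>

#define STACK_WORDS 16

/* Toy model, NOT real machine state: an array of 4-byte slots.
   esp and ebp are byte offsets into it; the stack grows downward. */
typedef struct {
    uint32_t mem[STACK_WORDS];
    uint32_t esp;
    uint32_t ebp;
} ToyStack;

void toy_init(ToyStack *s) {
    for (int i = 0; i < STACK_WORDS; i++) {
        s->mem[i] = 0;
    }
    s->esp = STACK_WORDS * 4; /* empty stack: esp just past the top slot */
    s->ebp = 0;               /* placeholder for the caller's ebp */
}

/* Models `push`: decrement esp by 4, then store the value there. */
void toy_push(ToyStack *s, uint32_t value) {
    s->esp -= 4;
    s->mem[s->esp / 4] = value;
}

/* Models reading `disp(base)`, e.g. toy_load(s, s->ebp, 8) for 8(%ebp). */
uint32_t toy_load(const ToyStack *s, uint32_t base, uint32_t disp) {
    return s->mem[(base + disp) / 4];
}

/* Models the caller's side of addition(a, b). */
void toy_call_addition(ToyStack *s, int a, int b) {
    toy_push(s, (uint32_t)b);  /* last argument is pushed first */
    toy_push(s, (uint32_t)a);
    toy_push(s, 0xDEADBEEFu);  /* stand-in for the return address */
}

/* Models the prologue: push %ebp; movl %esp, %ebp. */
void toy_prologue(ToyStack *s) {
    toy_push(s, s->ebp);
    s->ebp = s->esp;
}
```

In this model, right after toy_call_addition the slot at 0(%esp) holds the return address, 4(%esp) holds a and 8(%esp) holds b. After toy_prologue, everything sits 4 bytes further from the new base: 0(%ebp) is the saved ebp, 4(%ebp) is the return address, 8(%ebp) is a and 12(%ebp) is b.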
Nate Eldredge
  • Thank you so much for this amazing answer that was so detailed. I was wondering, do you have any tips on debugging C programs that are meshed with Assembly code (in the GDB debugger I can only see the seg fault line in the C program instead of the Assembly program)? – Adam Lee Nov 14 '20 at 02:31
  • The answers in the link that Jabberwocky posted cover most of what there is. Useful commands: `display/i $eip` `info registers` `si` `ni` `break` `disassemble` `print $eax` `x/8xw $ebp` – Nate Eldredge Nov 14 '20 at 02:33
  • I just had one clarifying question about the second point you made: when you say the %ebx register is supposed to be callee-saved, what does that precisely mean? Furthermore, is the %ebx register the only register in x86 that is callee-saved? – Adam Lee Nov 14 '20 at 02:36
  • It means that your function (the one that is called, i.e. the "callee") needs to ensure that its value upon return is the same as upon entry, so that whatever value your caller may have been keeping there is still there, as if it had never changed. See https://stackoverflow.com/questions/9268586/what-are-callee-and-caller-saved-registers. The caller/callee saved registers are part of the calling conventions; if your reference didn't explain this, you could see https://wiki.osdev.org/Calling_Conventions. The registers `%ebx, %esi, %edi, %ebp` are the callee-saved registers on 32-bit x86. – Nate Eldredge Nov 14 '20 at 03:07
  • Gotcha. Last question: Is the EAX register always supposed to store the return value of a function in Assembly? I thought EBP + 4 stores the return value? – Adam Lee Nov 14 '20 at 11:35
  • @AdamLee: No, it's EAX, at least for functions returning 32-bit integers or pointers (other return types have other conventions). I don't know any situation where the return value would be at EBP+4 specifically and am not sure where you might have got that idea. The official reference for all of this is the [System V ABI](https://github.com/hjl-tools/x86-psABI/wiki/intel386-psABI-1.1.pdf) which you should really study if you're going to write assembly programs for Unix-like OSes. – Nate Eldredge Nov 14 '20 at 17:23
  • @AdamLee: Also see https://stackoverflow.com/tags/x86/info for a lot more useful links. – Nate Eldredge Nov 14 '20 at 17:24

You are corrupting the stack here:

movb %al, 4(%ebp)

To return the value, simply put it in eax. Also, why do you need to clear eax? That's inefficient, since you can load the first value directly into eax and then add to it.

Also, EBX must be saved and restored if you intend to use it, but you don't really need it anyway.

Devolus