Assembly x86-32 and some c functions

Question

I never learn C language so it makes me confuse. I just like to know if I did it correctly or where I need to improve. For this code I used assembly x86 32 bit. Thanks

This is what I supposed to do: Write a procedure with the signature

char *strchar(char *s1, char c1)

that returns a pointer to the first occurrence of the character c1 within the string s1 or, if not found, returns a null.

This is what I came out with:

strchar   (char*, char):  
push      ebp
mov       ebp,        esp
mov       dword       ptr     [ebp-24], edi
mov       EAX ,       esi
mov       BYTE PTR            [ebp-28], al

.L5:
mov       EAX ,       dword   ptr  [ebp-24]
movzx     EAX ,       byte ptr  [ EAX ]
test      AL, AL
je .L2
mov      EAX , dword PTR [ebp-24]
movzx   EAX , BYTE PTR [ EAX ]
cmp            BYTE PTR  [ebp-28], al
jne .L3
mov eax,      dword PTR [ebp-24]
jmp .L6

.L3:
add dword PTR [ebp-24], 1
jmp .L5
.L2:
LEA eax, [ebp-9]
MOV  DWORD PTR [EBP-8], eax
MOV  EAX, DWORD PTR [ebp-8]

.L6:    
POP EBP
RET

Do you get any errors when compiling? If not, did you test it -- does your code work? — Jongware, Nov 26 '18 at 11:33
[Parameters are passed on the stack](https://en.wikipedia.org/wiki/X86_calling_conventions#List_of_x86_calling_conventions) in 32-bit code. At least with the most common calling conventions (cdecl and stdcall). — Michael, Nov 26 '18 at 11:34
Did you port x86-64 gcc output to 32-bit by just changing the register names? It looks like unoptimized compiler output, so it's super inefficient and hard to read. And like Michael said, doesn't follow the normal calling convention. Compile with `gcc -m32 -O3` if you want to look at compiler-generated asm. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) — Peter Cordes, Nov 26 '18 at 11:56
Thank you everyone. I actually did port it from x86-64 output to 32bit. I’m just confuse with the c — user10416143, Nov 26 '18 at 18:13
Can you fix the formatting of the assembly code (aligned columns)? — Peter Mortensen, Dec 30 '18 at 17:20

score 0 · Answer 1 · answered Nov 26 '18 at 16:05

The lines:

mov       dword       ptr     [ebp-24], edi
mov       EAX ,       esi
mov       BYTE PTR            [ebp-28], al

assume that a stack frame has been allocated for this function which doesn’t appear true; I think you should have something like:

sub esp, 32

after the

mov   ebp,esp

Also, the three lines after L2 seem confused. The only way to get to L2 is if the nil (0) byte is discovered in the string, at which point, the code should return a NULL pointer. The exit path in the code (L6) leaves eax alone, so all that should be needed is:

L2:
   mov eax, 0

It might make debugging easier if you kept the alias up to date; so:

L2:
   mov eax, 0
   mov [ebp-24], eax

Also, the calling convention used here is a bit odd: the string is passed in edi and the character in esi. Normally, in x86-32, these would both be passed on the stack. This looks like it might have been x86-64 code, converted to x86-32....

A final note; this assembly code looks like the output of a compiler with optimisations disabled. Often, generating the assembly with the optimisations enabled generates easier to understand code. This code, for example, could be much more concisely written as below, without even devolving into weird intel ops:

strchar:
 mov edx, esi
 mov eax, edi
L:
 mov dh, [eax]
 test dh, dh
 jz   null
 cmp dh, dl
 je done
 inc eax
 jmp L
null:
 mov eax, 0
done:
 ret

Thank you this help a lot – user10416143 Nov 27 '18 at 18:30 — user10416143, Nov 27 '18 at 18:30

score 0 · Answer 2 · answered Nov 27 '18 at 17:32

0

Here is one with stack overhead

[global strchar]
strchar:
            push    ebp
            mov     ebp, esp
            mov     dl, byte [ebp + 12]
            mov     ecx, dword [ebp + 8]
            xor     eax, eax
.loop:      mov     al, [ecx]
            or      al, al
            jz      .exit
            cmp     al, dl
            jz      .found
            add     ecx, 1
            jmp     .loop
.found:     mov     eax, ecx
.exit:
            leave
            ret

Here is one without stack overhead

[global strchar]
strchar:
            mov     dl, byte [esp + 8]
            mov     ecx, dword [esp + 4]
            xor     eax, eax
.loop:      mov     al, [ecx]
            or      al, al
            jz      .exit
            cmp     al, dl
            jz      .found
            add     ecx, 1
            jmp     .loop
.found:     mov     eax, ecx
.exit:
            ret

These are using the 'cdecl' calling convention. For 'stdcall' change the last 'ret' to 'ret 8'.

answered Nov 27 '18 at 17:32

W. Chang

484
3
8

Remove the first two `mov`s and change all `EAX/ECX` to `RAX/RCX` in the second version, it becomes a working 64-bit function. – W. Chang Nov 27 '18 at 17:36
Thank you for this – user10416143 Nov 27 '18 at 18:37
Never use `or` when you could use `test`. Especially here, you're lengthening the loop-carried false dependency through EAX on CPUs that don't rename AL separately from EAX. (AMD, and Intel IvyBridge and later.) [Test whether a register is zero with CMP reg,0 vs OR reg,reg?](https://stackoverflow.com/a/33724806). You could avoid the false dependency with `movzx eax, byte [ecx]`. Of course, a more efficient loop structure would only have 2 conditional branches and zero unconditional branches. e.g. with the check-for-zero at the bottom and the .found check in the mid. – Peter Cordes Nov 27 '18 at 19:49
Of course a 1 byte per clock `strchr` is garbage if SSE2 is available, but you could still make the byte loop much more efficient with [Why are loops always compiled into "do...while" style (tail jump)?](https://stackoverflow.com/q/47783926) – Peter Cordes Nov 27 '18 at 19:53

Assembly x86-32 and some c functions

2 Answers2