
I have an initialized string "Hello, World!" from which I would like to extract the first character (i.e. 'H') and compare it with a character that is passed into a register at run time.

I have tried comparing the first character of "Hello, World!" with 'H' through the following code:

global start

section .data
msg: db "Hello, World!", 10, 0

section .text
start:
   mov rdx, msg
   mov rdi, [rdx]
   mov rsi, 'H'
   cmp rdi, rsi
   je equal

   mov rax, 0x2000001
   mov rdi, [rdx]
   syscall

equal:
   mov rax, 0x2000001
   mov rdi, 58
   syscall

However, this code terminates without jumping to the equal label. Moreover, the exit status of my program is 72, which is the ASCII code for 'H'. This made me try passing 72 into rsi instead of 'H', but that too resulted in a program that terminates without jumping to the equal label.

How can I properly compare the first character in "Hello, World!" with a character that is passed to a register?

mooncow
  • You should use byte register parts like `al` and `ah`, not the full-width registers. The `cmp` compares the entire 64-bit register values, and `rdi` holds more than one character. Stepping through the code in a debugger like gdb can help you solve these kinds of issues on your own and increase your understanding. – Rafael Jan 24 '19 at 08:35
  • @Rafael Could you provide a code example of how I should use byte register parts? I am very new to assembly and have never encountered `al` and `ah`. – mooncow Jan 24 '19 at 08:40
  • Those three MOV / CMP instructions should be something like `mov al, [rdx], mov ah, 'H', cmp al, ah` – Rafael Jan 24 '19 at 08:53
  • or simply use `movzx rdi, BYTE [rdx]` – Margaret Bloom Jan 24 '19 at 08:56
  • @Rafael Thanks. Your solution works perfectly. If you don't mind, could you explain why using these registers works, but my implementation doesn't? – mooncow Jan 24 '19 at 09:04
  • Use `default rel` and use `cmp byte [msg], 'H'`. Or if you want the pointer in RDI so you can increment it in a loop, use `lea rdi, [rel msg]`. You normally never want to use `mov rdi, msg` with a 64-bit immediate of the absolute address. [Mach-O 64-bit format does not support 32-bit absolute addresses. NASM Accessing Array](https://stackoverflow.com/q/47300844) – Peter Cordes Jan 24 '19 at 14:15

2 Answers


Your code and @Rafael's answer are both massively over-complicated.

You normally never want to use `mov rdi, msg` with a 64-bit immediate of the absolute address. (See [Mach-O 64-bit format does not support 32-bit absolute addresses. NASM Accessing Array](https://stackoverflow.com/q/47300844).)

Use `default rel` and use `cmp byte [msg], 'H'`. Or if you want the pointer in RDI so you can increment it in a loop, use `lea rdi, [rel msg]`.
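As a rough illustration of the pointer-in-a-loop case, here is a minimal sketch that walks the string one byte at a time and compares each byte against a character held in SIL. It assumes the same macOS / NASM setup as the question (the `start` entry point and the `0x2000001` exit syscall). The hard-coded `'W'` is only a stand-in for a character that would arrive in a register at run time, and the label names are made up for this example.

```nasm
default rel                   ; [label] becomes RIP-relative
global start

section .rodata
msg: db "Hello, World!", 10, 0

section .text
start:
   mov   esi, 'W'             ; hypothetical run-time character (hard-coded for the demo)
   lea   rdi, [msg]           ; pointer to the first byte, RIP-relative via default rel
.next:
   movzx eax, byte [rdi]      ; load exactly one byte, zero-extended into RAX
   test  eax, eax
   jz    .done                ; reached the terminating 0: no match, EAX = 0
   cmp   al, sil              ; 8-bit compare: current byte vs. the wanted character
   je    .done                ; match: EAX holds the matched character
   inc   rdi                  ; advance to the next byte
   jmp   .next
.done:
   mov   edi, eax             ; exit status = matched character, or 0 if none was found
   mov   eax, 0x2000001       ; exit syscall number taken from the question
   syscall
```

If only the first character matters, the loop collapses to a single `cmp [msg], sil`, where the byte operand size is implied by SIL.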

The only thing that's different between your branches is the RDI value. You don't need to duplicate the RAX setup or the syscall, just get the right value in RDI and then have the branches rejoin each other. (Or do it branchlessly.)

@Rafael's answer is still loading 8 bytes from the string for some reason, like both loads in your question. Presumably this is sys_exit and it ignores the upper bytes, only setting process exit status from the low byte, but just for fun let's pretend we actually want all 8 bytes loaded for the syscall while only comparing the low byte.

default rel         ; use RIP-relative addressing modes by default for [label]
global start

section .rodata                       ;; read-only data usually belongs in .rodata
msg: db "Hello, World!", 10, 0

section .text
start:
   mov   rdi, [msg]    ; 8 byte load from a RIP-relative address
   mov   ecx, 'H'

   cmp   dil, cl       ; compare the low byte of RDI (dil) with the low byte of RCX (cl)
   jne   .notequal
   ;; fall through on equal
   mov   edi, 58
.notequal:             ; .labels are local labels in NASM

   ; mov rdi, [msg]    ; still loaded from before; we didn't destroy it.
   mov eax, 0x2000001
   syscall

Avoid writing to AH/BH/CH/DH when possible. It either has a false dependency on the old value of RAX/RBX/RCX/RDX, or it can cause partial-register merging stalls if you later read the full register. @Rafael's answer never reads the full register afterwards, but its `mov ah, 'H'` is dependent on the load into AL on some CPUs.

See "Why doesn't GCC use partial registers?" and "How exactly do partial registers on Haswell/Skylake perform? Writing AL seems to have a false dependency on RAX, and AH is inconsistent". `mov ah, 'H'` has a false dependency on the old value of AH on Haswell/Skylake, even though AH is renamed separately from RAX. But AL isn't renamed separately, so yes, this might well have a false dependency on the load, stopping it from running in parallel and delaying the `cmp` by a cycle.

Anyway, the TL;DR here is that you shouldn't mess around with writing AH/BH/CH/DH if you don't need to. Reading them is often OK, but can have worse latency. And note that `cmp dil, ah` isn't encodable, because DIL is only accessible with a REX prefix and AH is only accessible without one.
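To make the encoding restriction concrete, here is a small sketch of one way around it, assuming a value already sits in AH for some reason. The `mov eax, 'H' << 8` line is only demo setup invented for this example (it places 'H' in AH by writing the full register); the rest follows the same macOS / NASM conventions as the code above.

```nasm
default rel
global start

section .rodata
msg: db "Hello, World!", 10, 0

section .text
start:
   movzx edi, byte [msg]   ; EDI = first character, zero-extended
   mov   eax, 'H' << 8     ; demo setup only: AH = 'H', written via the full register

   ; cmp dil, ah           ; not encodable: DIL requires a REX prefix, AH cannot be used with one
   movzx ecx, ah           ; reading AH into ECX needs no REX prefix
   cmp   edi, ecx          ; both values are zero-extended, so a 32-bit compare is safe
   jne   .exit
   mov   edi, 58           ; first character matched the byte that was in AH
.exit:
   mov   eax, 0x2000001    ; exit syscall number taken from the question
   syscall
```

Copying AH out with `movzx` reads the high-byte register but never writes it, which stays within the advice above.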

I picked RCX instead of RSI because CL doesn't need a REX prefix, but since we need to look at the low byte of RDI (`dil`) we need a REX prefix on the `cmp` anyway. I could have used `mov cl, 'H'` to save code size, because there's probably no problem with a false dependency on the old value of RCX.

BTW, `cmp dil, 'H'` would work just as well as `cmp dil, cl`.

Or if we load the byte with zero-extension into the full RDI, we can use `cmp edi, 'H'` instead of the low-8 version of it. (Zero-extending loads are the normal / recommended way to deal with bytes and 16-bit integers on modern x86-64. Merging into the low byte of the old register value is usually worse for performance, which is also the reason behind the behavior discussed in "Why do x86-64 instructions on 32-bit registers zero the upper part of the full 64-bit register?".)

And instead of branching, we could CMOV. This is sometimes better, sometimes not, for code-size and performance.

Version 2, only actually loading 1 byte:

start:
   movzx   edi, byte [msg]    ; 1 byte load, zero extended to 4 (and implicitly to 8)

   mov     eax, 58            ; ASCII ':'
   cmp     edi, 'H'
   cmove   edi, eax           ; edi =  (edi == 'H') ? 58 : edi

   ; rdi = 58 or the first byte,
   ; unlike in the other version where it had 8 bytes of string data here
   mov eax, 0x2000001
   syscall

(This version looks a lot shorter, but most of the extra lines were whitespace, comments, and labels. Optimizing to cmp-immediate makes this 4 instructions instead of 5 before the mov eax / syscall, but other than that they're equal.)

Peter Cordes

I'll explain the changes side-by-side (hopefully that's easier to follow):

global start

section .data
msg: db "Hello, World!", 10, 0

section .text
start:
   mov rdx, msg
   mov al, [rdx] ; moves one byte from msg, H to al, the 8-bit lower part of ax
   mov ah, 'H'   ; move constant 'H' to the 8-bit upper part of ax
   cmp al, ah    ; compares H with H
   je equal      ; yes, they are equal, so go to address at equal

   mov rax, 0x2000001
   mov rdi, [rdx]
   syscall

equal:           ; here we are
   mov rax, 0x2000001
   mov rdi, 58
   syscall

If the use of `al`, `ah`, and `ax` is unfamiliar, please see General-Purpose Registers.
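As for why the full-width version in the question never takes the jump, the sketch below puts the two compares next to each other. It assumes the same macOS / NASM setup as the code above, and the label name is invented for this example. `mov rdi, [msg]` loads eight bytes, "Hello, W", so RDI holds `0x57202C6F6C6C6548` (x86 is little-endian, so 'H' = `0x48` lands in the low byte). That value can never equal `0x48` as a 64-bit number, while its low byte does.

```nasm
default rel
global start

section .data
msg: db "Hello, World!", 10, 0

section .text
start:
   mov   rdi, [msg]        ; 8-byte load: RDI = 0x57202C6F6C6C6548 ("Hello, W")
   mov   esi, 'H'          ; RSI = 0x0000000000000048

   cmp   rdi, rsi          ; 64-bit compare: all eight bytes take part, so not equal
   je    .exit             ; not taken

   cmp   dil, sil          ; 8-bit compare: only the low bytes, 0x48 vs 0x48
   jne   .exit
   mov   edi, 58           ; reached: the first character is 'H'
.exit:
   mov   eax, 0x2000001    ; exit syscall number taken from the question
   syscall
```

The same low byte explains the exit status of 72 seen in the question: RDI still held the eight string bytes, and only its low byte, `0x48` = 72, showed up as the process status.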

Rafael