How to isolate byte and word array elements in a 64-bit register

Question

I can tell this is a super simple problem, but I have yet to figure it out. Basically, I just want to take one element of an array, add and subtract some numbers from it using registers, and then put the result into my result variable.

segment .data
  a      dw  4, 234, -212
  b      db  112, -78, 50
  result dq  0
segment .text       
  global main
main:
  mov   rax, [a]

I know the solution has something to do with offsets and indexing, but I don't get how I am supposed to be able to get just one array element into a register.

What can I do?

Use the proper sized register (or convert sizes) with the proper offset. E.g. `mov ax, [a+2]` would load `234` and `mov al, [b+1]` would load `-78`. — Jester, Oct 30 '19 at 00:52
For example, use `movzx rax, word ptr [a]` (MASM syntax) or a similar expression to get the value `4` from the array and fill the upper bits of RAX with zeroes. — zx485, Oct 30 '19 at 00:57

Peter Cordes · Answer 1 · 2019-10-31T13:17:41.477

If you want to treat your values as signed, you want movsx. Assuming NASM syntax:

default rel
; ... declarations and whatever    

    movsx   rax, word [a + 1*2]    ; a is an array of dw = words
    movsx   rcx, byte [b + 1*1]    ; b is an array of db = bytes

    add     rax, rcx
    mov     [result], rax         ; result is a qword

(MASM or GNU .intel_syntax would use word ptr instead of word; just add ptr to the size specifier for the memory operand.)

The 1 can be a register, like [a + rsi*2] or [b + rsi], so you can easily loop over your arrays. See "Referencing the contents of a memory location. (x86 addressing modes)".

I wrote 1*2 instead of just 2 to indicate that it's index 1 (the 2nd array element), scaled by the element size. The assembler will evaluate the constant expression and just use the same (RIP-relative) addressing mode it would for [a] but with a different offset.

If you need it to work in position-independent code (where you can't use a [disp32 + register] addressing mode with a 32-bit absolute address for the symbol), use lea rdi, [a] (a RIP-relative LEA) first, then address the element as [rdi + rsi*2].
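The indexed forms above are roughly what a compiler emits for an ordinary array loop. As a minimal sketch in C, assuming the arrays from the question and a hypothetical helper named sum_pairs (not part of the original code), each element is widened with sign extension before the 64-bit add:

```c
#include <stddef.h>
#include <stdint.h>

/* Same data as the question: a holds words (dw), b holds bytes (db). */
const int16_t a[] = {4, 234, -212};
const int8_t  b[] = {112, -78, 50};

/* Hypothetical helper: sums a[i] + b[i] for the first n elements.
 * Each element is sign-extended to 64 bits before the add, which is
 * the job movsx does in the assembly version. */
int64_t sum_pairs(size_t n)
{
    int64_t total = 0;
    for (size_t i = 0; i < n; i++) {
        int64_t wide_a = a[i];   /* index scaled by sizeof(int16_t) == 2 */
        int64_t wide_b = b[i];   /* index scaled by sizeof(int8_t)  == 1 */
        total += wide_a + wide_b;
    }
    return total;
}
```

The index i plays the role of RSI: the compiler scales it by the element size, just as [a + rsi*2] and [b + rsi] do by hand.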

If you wanted zero-extension, you'd use movzx:

    movzx   eax, word [a + 1*2]    ; a is an array of dw = words
    movzx   ecx, byte [b + 1*1]    ; b is an array of db = bytes
    ; word and byte zero-extended into 64-bit registers:
    ; explicitly to 32-bit by MOVZX, and implicitly to 64-bit by writing a 32-bit reg

    ; add     eax, ecx              ; can't overflow 32 bits, still zero-extended to 64
    sub     rax, rcx              ; want the full width 64-bit signed result 
    mov     [result], rax         ; result is a qword

If you knew the upper bits of your full result would always be zero, just use EAX (32-bit operand-size) except at the end. See "The advantages of using 32bit registers/instructions in x86-64".

This code corresponds to C like

#include <stdint.h>

static uint16_t a[] = {...};
static uint8_t b[] = {...};
static int64_t result;

void foo(){
    int64_t rax = a[1] - (int64_t)b[1];
    result = rax;    // why not just return this like a normal person instead of storing?
}
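To see how the two extension modes differ on the question's data, here is a small sketch in C. The array names a_bits and b_bits and both function names are assumptions for illustration. The arrays hold the raw bit patterns the assembler stores for the dw and db lines, and the casts choose between sign extension and zero extension:

```c
#include <stdint.h>

/* Raw bit patterns stored by:  a dw 4, 234, -212   and   b db 112, -78, 50 */
const uint16_t a_bits[] = {4, 234, 0xFF2C};  /* -212 as a 16-bit pattern */
const uint8_t  b_bits[] = {112, 0xB2, 50};   /* -78 as an 8-bit pattern  */

/* movsx path: reinterpret each element as signed, then widen to 64 bits.
 * The unsigned-to-signed narrowing casts assume the usual two's-complement
 * behavior (implementation-defined before C23, but universal in practice). */
int64_t signed_sum(int i)
{
    return (int64_t)(int16_t)a_bits[i] + (int64_t)(int8_t)b_bits[i];
}

/* movzx path: widen each element as unsigned, then subtract in 64 bits. */
int64_t unsigned_diff(int i)
{
    return (int64_t)a_bits[i] - (int64_t)b_bits[i];
}
```

For index 1, signed_sum gives 234 + (-78) = 156, while unsigned_diff sees the byte as 178 and gives 234 - 178 = 56.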

Speaking of which, you can look at compiler output on the Godbolt compiler explorer and see these instructions and addressing modes.

Note that mov al, [b + 1] would load a byte and merge it into the low byte of RAX.

You normally don't want this; movzx is the normal way to load a byte in modern x86. Modern x86 CPUs decode x86 to RISC-like internal uops for register renaming + Out-of-Order execution. movzx avoids any false dependency on the old value of the full register. It's analogous to ARM ldrb, MIPS lbu, and so on.

Merging into the low byte or word of RAX is a weird CISC thing that x86 can do but RISCs can't.

You can safely read 8-bit and 16-bit registers (and you need to for a word store), but generally avoid writing partial registers unless you have a good reason and you understand the possible performance implications (see "Why doesn't GCC use partial registers?"). One good reason: you've xor-zeroed the full destination ahead of cmp + setcc al.
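The difference between a merging byte load and a zero-extending one can be modeled in C. This is only a sketch of the architectural result, not of how a CPU implements it, and both function names are hypothetical:

```c
#include <stdint.h>

/* Models  mov al, [mem]:  only bits 0..7 of the 64-bit register change,
 * so the result depends on the old register value. */
uint64_t load_merge_low_byte(uint64_t old_rax, uint8_t mem_byte)
{
    return (old_rax & ~(uint64_t)0xFF) | mem_byte;
}

/* Models  movzx eax, byte [mem]:  the whole register is overwritten,
 * so the result does not depend on the old register value. */
uint64_t load_zero_extend(uint8_t mem_byte)
{
    return mem_byte;
}
```

With an old value of 0x1122334455667788 and the byte 0xB2 (the stored pattern for -78), the merging load yields 0x11223344556677B2, while the zero-extending load yields 0xB2. The extra input to the first function is the dependency on the old register value that movzx avoids.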
