I am struggling with a problem in assembly, where I have to take the lowest byte (FF) of a hex value and copy it into every byte of the result:

0x045893FF      input
0xFFFFFFFF      output

What I did is:

movl $0x045893FF, %eax
shl $24, %eax     # keeps only the low byte, now in the top position: 0xFF000000

Now I want to copy this byte into the rest of the register.

Peter Cordes
juliensaad

3 Answers

You could do it like this, for instance (starting from the original value in EAX, not the shifted one):

mov %al, %ah    #0x0458FFFF
mov %ax, %bx    #0xFFFF
shl $16, %eax   #0xFFFF0000
mov %bx, %ax    #0xFFFFFFFF

Another way would be:

movzx %al, %eax
imul $0x1010101, %eax

The second version is probably faster on modern architectures, since it avoids the partial-register writes of the first.
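To see why the multiplication replicates the byte, here is a minimal C sketch of the same arithmetic. It is only a model of the two instructions, not the answer's code; the function names `broadcast_low_byte` and `check_broadcast` are hypothetical. Multiplying a byte `b` by `0x01010101` adds `b`, `b << 8`, `b << 16` and `b << 24`, and since `b` fits in 8 bits the four copies never overlap or carry.

```c
#include <assert.h>
#include <stdint.h>

/* Model of: movzx %al, %eax ; imul $0x1010101, %eax (hypothetical helper) */
uint32_t broadcast_low_byte(uint32_t x)
{
    uint32_t b = x & 0xFFu;   /* movzx: zero-extend the low byte */
    return b * 0x01010101u;   /* imul: b + (b << 8) + (b << 16) + (b << 24) */
}

/* Small self-check using the value from the question */
void check_broadcast(void)
{
    assert(broadcast_low_byte(0x045893FFu) == 0xFFFFFFFFu);
    assert(broadcast_low_byte(0x12345678u) == 0x78787878u);
}
```

Unlike a sign-extending move, this works for any low byte, not just `0xFF`.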

Gunther Piez
  • great answer, that was exactly what I was looking for! Thank you very much – juliensaad Mar 20 '12 at 00:43
  • Why not `movsx %al, %eax` - after all, `0xff`, treated as signed, will extend to `0xffffffff` directly. No need for the "replicate bytes by multiplication" trick. – FrankH. Mar 20 '12 at 17:30
  • Agreed though that of course this only works for the special case of the byte being `0xff`. – FrankH. Mar 20 '12 at 17:31
  • @FrankH. As an assembly programmer, you are entitled to do code obfuscation at will. So I take advantage of that and insert a multiplication for an innocent move operation where it comes unexpected and hurts the reader the most. – Gunther Piez Mar 20 '12 at 18:46
  • The multiply way is *much* faster on Intel P6-family (partial register stalls), and still likely better on modern x86. Certainly for throughput, especially given the gratuitous partial-register write in `mov %ax, %bx` instead of `mov %eax, %ebx`. (Writing AX at the end is needed to merge into EAX, or with BMI2 and fast SHLD you could replace the last 3 insns with `rorx $16, %eax, %edx` + `shld $16, %ebx, %eax`. But then you'd still have [an AH merge on Intel](https://stackoverflow.com/questions/45660139/how-exactly-do-partial-registers-on-haswell-skylake-perform-writing-al-seems-to).) – Peter Cordes Feb 16 '21 at 11:58
  • See also [Why doesn't GCC use partial registers?](https://stackoverflow.com/q/41573502). (And note the multiply version could save 1 cycle of latency on Intel SnB-family by using a different temporary, allowing mov-elimination to work: `movzbl %al, %ecx` / `imul $0x1010101, %ecx, %eax`) – Peter Cordes Feb 16 '21 at 12:00

Another solution using one register (EAX) only:

            ; eax = dcba (one letter per byte, a = AL)
mov ah, al  ; eax = dcaa
ror eax, 8  ; eax = adca
mov ah, al  ; eax = adaa
ror eax, 8  ; eax = aada
mov ah, al  ; eax = aaaa

This is a bit slower than the above solution, though.
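The byte movements above can be checked with a small C model. This is only a sketch that simulates the register contents, not the assembly itself; the helper names `mov_ah_al`, `ror8`, `broadcast_by_rotation` and `check_rotation` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models "mov ah, al": copy bits 0..7 into bits 8..15, keep the rest */
uint32_t mov_ah_al(uint32_t eax)
{
    uint32_t al = eax & 0xFFu;
    return (eax & 0xFFFF00FFu) | (al << 8);
}

/* Models "ror eax, 8": rotate right by one byte */
uint32_t ror8(uint32_t eax)
{
    return (eax >> 8) | (eax << 24);
}

uint32_t broadcast_by_rotation(uint32_t eax)
{
    eax = mov_ah_al(eax);  /* dcba -> dcaa */
    eax = ror8(eax);       /* dcaa -> adca */
    eax = mov_ah_al(eax);  /* adca -> adaa */
    eax = ror8(eax);       /* adaa -> aada */
    eax = mov_ah_al(eax);  /* aada -> aaaa */
    return eax;
}

/* Small self-check using the value from the question */
void check_rotation(void)
{
    assert(broadcast_by_rotation(0x045893FFu) == 0xFFFFFFFFu);
    assert(broadcast_by_rotation(0x12345678u) == 0x78787878u);
}
```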

Luis Paris

I am used to NASM assembly syntax, but this should be fairly simple.

; this is a comment btw
mov eax, 0x045893FF ; mov to, from

mov ah, al
mov bx, ax
shl eax, 16
mov ax, bx

; eax = 0xFFFFFFFF
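
For readers who want to trace the four instructions step by step, here is a rough C model of the register contents. It is a sketch under the assumption that BX is free to clobber, as in the listing; the function names `broadcast_via_bx` and `check_via_bx` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Models: mov ah, al / mov bx, ax / shl eax, 16 / mov ax, bx */
uint32_t broadcast_via_bx(uint32_t eax)
{
    uint32_t al = eax & 0xFFu;
    uint32_t bx;

    eax = (eax & 0xFFFF00FFu) | (al << 8);  /* mov ah, al  : AX now holds the byte twice */
    bx  = eax & 0xFFFFu;                    /* mov bx, ax  : save the doubled byte */
    eax <<= 16;                             /* shl eax, 16 : move it to the upper half */
    eax = (eax & 0xFFFF0000u) | bx;         /* mov ax, bx  : fill the lower half */
    return eax;
}

/* Small self-check using the value from the question */
void check_via_bx(void)
{
    assert(broadcast_via_bx(0x045893FFu) == 0xFFFFFFFFu);
    assert(broadcast_via_bx(0x12345678u) == 0x78787878u);
}
```
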
Kendall Frey