Can someone help me translate this pseudocode into x86 assembly?
if (eax > ebx)
mov dl, 5;
else
mov dl, 6;
the simple version:
CMP EAX,EBX
JG L1
MOV DL,6
JMP L2
L1:
MOV DL,5
L2:
the optimized version:
CMP EAX,EBX ; dl = (eax > ebx) ? 5 : 6
SETLE DL
ADD DL,5
Another way is to use conditional move. You didn't specify whether the comparison is signed or unsigned but in case it's signed:
cmp   eax, ebx
mov   edx, 6      ; cmov can't take an immediate or an 8-bit register,
mov   ecx, 5      ; so load both constants into 32-bit registers first
cmovg edx, ecx    ; edx = (eax > ebx) ? 5 : 6, result in dl
See also Unpredictable Conditional Branches on 32-Bit Intel® Architecture
If you're doing an unsigned comparison, the compare result goes into the carry flag. "Below" is the same condition as carry-set (e.g. a-b produces a carry-out, so a was strictly below b, not equal or above). Notice that jb is the same opcode as jc, just a different mnemonic for a different semantic meaning of the same flag condition.
Unlike most flags, there are special instructions that use the carry flag, like adc and sbb, letting us add or subtract the carry flag to/from another register.
cmp ebx, eax ; set CF if ebx <unsigned eax
mov dl, 6 ; or "mov edx, 6" to avoid a false dependency on the old EDX
sbb dl, 0 ; 6 - (eax > ebx)
;; dl = eax > ebx ? 5 : 6
The logic can be tricky to figure out: cmp in the other order will reverse the sense of the comparison, and you can use mov dl,5 / adc dl,0 instead of sbb to do other unsigned conditions like <=.
In this case, ebx - eax (done by CMP) sets CF if ebx < eax, i.e. CF = (eax > ebx). So we need to subtract CF from 6.
However, for an 8-bit register that you need to initialize, setcc + add may be a better choice: it's fewer uops on more CPUs. adc reg,0 is 2 uops on Intel P6-family, and adc reg,imm8 is 2 uops on Intel SnB through Haswell as well for non-zero imm8. setcc, by contrast, is a single efficient uop, but it's only available with an 8-bit destination. (For wider outputs you normally want to xor-zero a wider register before the flag-setting operation, which costs an extra uop.)
But in this version, the mov dl,5 can execute without waiting for the cmp, so if the cmp inputs were on the critical path for latency, this only has an adc after them. That's only 1 uop on AMD and on Intel SnB-family (because we used an immediate operand of 0). So the critical-path latency from EAX or EBX to the result in DL is only 2 cycles (cmp + adc), vs. 3 cycles for cmp/setcc/add.
See also https://agner.org/optimize/ for more uop-count / latency info.
Also beware partial-register effects: Why doesn't GCC use partial registers?
Another option would be to use MASM32, where such if/else constructs are legal...