I'm learning assembly right now and I'm completely lost on how the sign extension is used. I understand that if you have a value in AX and do MUL value it multiplies the value times AX and the result is in DX:AX, but how do I access and use the upper portion? It makes no sense to me whatsoever.

Peter Cordes
  • Calling it _"sign extension"_ is a bit of a misnomer, especially since `MUL` does unsigned multiplication. It's simply a 32-bit product split into two parts. – Michael Mar 18 '21 at 19:59
  • I'm still confused about how I would use this, though. Say I do a multiplication, the result is in DX:AX, and I need to use that result for some other math. I have no idea how I would use it. I know how to access both DX and AX, and I know how to tell whether the result extends into DX. I just don't know how I can use the result. Every example I'm shown just does the multiplication, but how do I actually use the result? Right now I have 2 numbers and no way to really use the result – Sean Meyer Mar 18 '21 at 20:12
  • If you need the full 32 bits you can just combine them into one register, e.g. `shl edx, 16` `mov dx, ax` (there may be more efficient ways, but that should work). – Michael Mar 18 '21 at 20:15
  • That seems like a waste of time; it makes me wonder what the people who made it this way were thinking, or if they were thinking. Thank you though, I didn't think of shifting it. – Sean Meyer Mar 18 '21 at 20:17
  • They were probably thinking of making it work in the absence of 32-bit registers, since those weren't added until the 80386 while `MUL r/m16` was available before that. – Michael Mar 18 '21 at 20:24
  • If you want a simple 32x32 => 32-bit one-register multiply result, use `imul ecx, esi` (https://www.felixcloutier.com/x86/imul) or whatever [like compilers do](https://stackoverflow.com/q/38552116). (With your inputs sign-extended into 32-bit registers). You can't get a widening multiply into a single register (ignoring upper halves of the input). If you don't want your 32-bit result split between 16-bit registers, use 32-bit multiply instead of 16-bit. x86 has that because it evolved out of 16-bit-only 8086, but it also has more useful 32-bit instructions. – Peter Cordes Mar 18 '21 at 20:27
  • So what exactly is your real question? You have two numbers that are only 16-bit, and you want their 32-bit product in a single register? Or can you just as easily work with only 32-bit numbers? The answer to the question you asked (how to access the upper half?) is that it's in DX, but you apparently already know that. – Peter Cordes Mar 18 '21 at 20:32
  • It's in DX, but I didn't know how to use that information in a useful way. If I have 1 number split across 2 registers, it's not exactly straightforward how I would use the full result. I didn't think of shifting it into 1 register, though. My real question was how to use 2 numbers as 1 number because my result was split up, but that was answered by someone else. – Sean Meyer Mar 18 '21 at 20:47
  • The way to use the result depends on what you want to do with it. If you want to print it, for example, push both halves onto the stack and call your print function. – prl Mar 18 '21 at 21:54
  • On an 8086 where you don't have 32-bit registers, you'd do 32-bit add with `add ax, si` / `adc dx, di` to do DX:AX += DI:SI. If you do have 32-bit registers (i.e. 32-bit code) and don't want a single number split up between DX and AX (which sounds like your real question), don't get yourself into that situation in the first place by not using `mul r/m16`. Use `imul r32, r/m32`. – Peter Cordes Mar 19 '21 at 11:08

2 Answers

1

16bit registers hold unsigned numbers in the range 0000h..FFFFh. The result of adding or subtracting such numbers may overflow this range by one bit; that extra bit is reported in CarryFlag (CF).
For instance FFFFh + FFFFh = 1FFFEh, which leaves FFFEh in the register and CF=1.

The situation is different with multiplication: it may overflow by sixteen bits,
for instance FFFFh * FFFFh = FFFE0001h.
This is why another register (hardwired as DX) is used to hold the upper 16 bits of the result.
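To make the split concrete, here is a minimal C model of what `MUL r/m16` does. It is only a sketch: the `dx_ax` struct and the `mul16` function are hypothetical names invented for illustration, standing in for the DX:AX register pair and the instruction.

```c
#include <stdint.h>

/* Hypothetical model of the register pair DX:AX written by MUL r/m16. */
typedef struct {
    uint16_t dx;  /* upper 16 bits of the product */
    uint16_t ax;  /* lower 16 bits of the product */
} dx_ax;

/* Unsigned 16x16 => 32-bit multiply, split the way MUL r/m16 splits it. */
dx_ax mul16(uint16_t ax, uint16_t src)
{
    uint32_t product = (uint32_t)ax * (uint32_t)src;
    dx_ax r;
    r.dx = (uint16_t)(product >> 16);      /* high half goes to DX */
    r.ax = (uint16_t)(product & 0xFFFFu);  /* low half stays in AX */
    return r;
}
```

Calling `mul16(0xFFFF, 0xFFFF)` gives `dx == 0xFFFE` and `ax == 0x0001`, which are the two halves of FFFE0001h.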

Even the old 16bit processor 8086 can add|subtract 32bit numbers using only legacy 16bit registers, but each operation requires two machine instructions, and CarryFlag must be taken into account. The lower halves of 32bit numbers are calculated with ADD|SUB and the upper halves with ADC|SBB.

Let's calculate for instance (a*b)+(c*d) where a,b,c,d are 16bit numbers 000Ah, 000Bh, 000Ch, 000Dh, respectively.

MOV AX,000Ah ; Let AX=000Ah
MOV CX,000Bh ; Let CX=000Bh
MUL CX       ; Let DX:AX= a * b = 0000:006Eh
MOV SI,AX  ; Save temporary result DX:AX
MOV DI,DX  ;   into another pair DI:SI.
MOV AX,000Ch
MOV CX,000Dh
MUL CX     ; Let DX:AX = c * d = 0000:009Ch
; Add the temporary result DI:SI to DX:AX.
ADD AX,SI  ; Let AX = 009Ch + 006Eh = 010Ah. CF = 0.
ADC DX,DI  ; Let DX = 0000h + 0000h + CF = 0. CF = 0.

Result of (a*b)+(c*d), which is 0000:010Ah, is now in DX:AX, CF=0.
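The same calculation can be modelled in C to check the carry handling. This is a sketch under assumptions: `pair16`, `mul16_pair`, `add32_pair` and `sum_of_products` are hypothetical names, and the carry variable plays the role of CF between the ADD and the ADC.

```c
#include <stdint.h>

/* Hypothetical model of a 32-bit value held in two 16-bit registers. */
typedef struct {
    uint16_t hi;  /* DX or DI */
    uint16_t lo;  /* AX or SI */
} pair16;

/* MUL: unsigned 16x16 => 32-bit product, split into two halves. */
pair16 mul16_pair(uint16_t a, uint16_t b)
{
    uint32_t p = (uint32_t)a * (uint32_t)b;
    pair16 r;
    r.hi = (uint16_t)(p >> 16);
    r.lo = (uint16_t)p;
    return r;
}

/* ADD lo,lo followed by ADC hi,hi. */
pair16 add32_pair(pair16 x, pair16 y)
{
    uint32_t low_sum = (uint32_t)x.lo + (uint32_t)y.lo;  /* ADD AX,SI */
    uint16_t carry = (uint16_t)(low_sum >> 16);          /* CF after ADD */
    pair16 r;
    r.lo = (uint16_t)low_sum;
    r.hi = (uint16_t)(x.hi + y.hi + carry);              /* ADC DX,DI */
    return r;
}

/* (a*b)+(c*d), keeping only the low 32 bits like the assembly does. */
pair16 sum_of_products(uint16_t a, uint16_t b, uint16_t c, uint16_t d)
{
    return add32_pair(mul16_pair(c, d), mul16_pair(a, b));
}
```

With the values from the listing, `sum_of_products(0x000A, 0x000B, 0x000C, 0x000D)` yields `hi == 0x0000` and `lo == 0x010A`. An input such as `sum_of_products(0x0100, 0x00FF, 0x0100, 0x0001)` exercises the carry: the low halves FF00h and 0100h sum to 10000h, so the result is 0001:0000h.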

We may also treat a 16bit value as a signed integer in the range -32768..+32767, i.e. 8000h..FFFFh for the negative values and 0000h..7FFFh for the non-negative ones. The 8086 already supports signed multiplication with the one-operand form of IMUL, which leaves the product in DX:AX just like MUL; only the two- and three-operand forms arrived later (the immediate form with the 80186, IMUL r,r/m with the 80386). Without IMUL we would have to convert each number to its absolute value, perform the unsigned multiplication and, if exactly one of the multiplicands was negative, negate the unsigned product.
Conversion between a negative value and its absolute (positive) value is provided by the instruction NEG. To negate a 32bit number held in a pair of 16bit registers, CarryFlag must be taken into account again:

; Negate DX:AX
NEG DX
NEG AX
SBB DX,0

or, alternatively and less efficiently:

; Negate DX:AX
NOT AX
NOT DX
ADD AX,1
ADC DX,0
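As a rough check of the sign-handling recipe (absolute values, unsigned multiply, conditional negate), here is a C sketch. The names `reg_pair`, `neg_pair` and `signed_mul16` are hypothetical, and `neg_pair` mirrors the NEG DX / NEG AX / SBB DX,0 sequence rather than any particular library routine.

```c
#include <stdint.h>

/* Hypothetical model of the register pair DX:AX. */
typedef struct {
    uint16_t dx;
    uint16_t ax;
} reg_pair;

/* Negate DX:AX the way NEG DX / NEG AX / SBB DX,0 does. */
reg_pair neg_pair(reg_pair v)
{
    reg_pair r;
    uint16_t cf = (uint16_t)(v.ax != 0);  /* NEG AX sets CF unless AX was 0 */
    r.dx = (uint16_t)(0u - v.dx);         /* NEG DX */
    r.ax = (uint16_t)(0u - v.ax);         /* NEG AX */
    r.dx = (uint16_t)(r.dx - cf);         /* SBB DX,0 */
    return r;
}

/* Signed 16x16 => 32-bit multiply built from an unsigned multiply. */
reg_pair signed_mul16(int16_t a, int16_t b)
{
    /* Absolute values as unsigned numbers; -32768 maps to 8000h. */
    uint16_t ua = (a < 0) ? (uint16_t)(0u - (uint16_t)a) : (uint16_t)a;
    uint16_t ub = (b < 0) ? (uint16_t)(0u - (uint16_t)b) : (uint16_t)b;
    uint32_t p = (uint32_t)ua * (uint32_t)ub;  /* unsigned MUL */
    reg_pair r;
    r.dx = (uint16_t)(p >> 16);
    r.ax = (uint16_t)p;
    if ((a < 0) != (b < 0)) {
        r = neg_pair(r);  /* exactly one operand was negative */
    }
    return r;
}
```

For example, `signed_mul16(-3, 5)` produces FFFF:FFF1h, which is -15 as a 32-bit two's complement number.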

However, all Intel/AMD processors made in the last three decades can compute signed multiplication directly with IMUL, so you'd better use it, as Peter Cordes suggests in his comment.

vitsoft
  • This is a good example of how you'd do things in 16-bit mode, when simply using one 32-bit register for each product isn't an option. So it shows how you had to do things back in the day, but you left out the part where you do `imul ecx, 000Ah` / `imul edi, 000Dh` / `add ecx, edi` the normal way for 32-bit code. – Peter Cordes Mar 19 '21 at 11:10
  • The one-operand form of [IMUL](https://ulukai.org/ecm/doc/insref.htm#insIMUL) actually is available on 8086. Only the two- and three-operand forms are not. – ecm Mar 19 '21 at 13:11
1

On an 8086 where you don't have 32-bit registers, you'd do 32-bit add with add ax, si / adc dx, di to do DX:AX += DI:SI, for example. This is how you work with integers larger than the widest register.

But normally you don't want your number split up between multiple registers in the first place, when a single register is wide enough. Since you're writing 32-bit code, don't create this DX:AX split problem in the first place: avoid mul r/m16.

Instead, simply use 32-bit operand-size for imul r, r/m or the 3-operand immediate form, like a compiler would. (zero-extend or sign-extend narrow inputs to the desired destination width).

Then you're not even forced to use EDX:EAX, and you don't destroy EDX for no reason when you only want a 32x32 => 32-bit multiply, not 32x32 => 64-bit. (Only the high half of a full multiply depends on signedness, which is why the non-widening 2- and 3-operand forms exist only as imul, with no mul equivalents.)

For example:

  movsx   edx, word ptr [esp+4]    ; sign-extend a 16-bit stack arg
  movzx   ecx, word ptr [esp+8]    ; zero-extend another 16-bit arg
  imul    edx, ecx                 ; edx = i16 * (int)u16
  imul    edx, [esp+12]            ; and multiply by a 32-bit stack arg, edx *= an int

  mov  eax, 123
  imul ecx,  eax, 456            ; ecx = eax*456, leaving EAX unmodified.

  add   edx, ecx

  lea   eax, [edx + edx*4 + 789]   ; eax = edx * 5 + 789 ; more math using the power of 32-bit addressing modes.
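The claim that only the high half of a full multiply depends on signedness can be checked with a small C model. This is just a sketch; the four function names are hypothetical and the 64-bit intermediate stands in for the full EDX:EAX product.

```c
#include <stdint.h>

/* Low 32 bits of the signed 32x32 => 64 product (what imul r32, r/m32 keeps). */
uint32_t low_half_signed(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (uint32_t)(uint64_t)wide;
}

/* Low 32 bits of the unsigned 32x32 => 64 product. */
uint32_t low_half_unsigned(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    return (uint32_t)wide;
}

/* High 32 bits of the signed product (EDX after one-operand imul). */
uint32_t high_half_signed(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;
    return (uint32_t)((uint64_t)wide >> 32);
}

/* High 32 bits of the unsigned product (EDX after mul). */
uint32_t high_half_unsigned(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a * (uint64_t)b;
    return (uint32_t)(wide >> 32);
}
```

Taking the bit pattern FFFFFFFEh (which is -2 signed) times 3, both low halves come out as FFFFFFFAh, while the high halves differ: FFFFFFFFh for the signed product and 00000002h for the unsigned one.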

BTW, as @Michael pointed out in comments, you can do something like this to merge DX:AX into a single 32-bit int in EDX.

   shl edx, 16
   mov dx, ax

If you know the upper 16 bits of EAX were zero, you'd use or edx, eax after the shift instead of the partial-register mov dx, ax. (See also: Why doesn't GCC use partial registers?)
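Both merge variants can be modelled in C to see when they agree. This is a sketch with hypothetical function names; the 32-bit arguments stand in for the full EDX and EAX registers, including whatever happens to be in their upper halves.

```c
#include <stdint.h>

/* shl edx, 16 / mov dx, ax: works whatever is in the upper half of EAX. */
uint32_t merge_shl_mov(uint32_t edx, uint32_t eax)
{
    edx <<= 16;                                   /* shl edx, 16 */
    edx = (edx & 0xFFFF0000u) | (eax & 0xFFFFu);  /* mov dx, ax  */
    return edx;
}

/* shl edx, 16 / or edx, eax: only correct if the upper 16 bits of EAX are 0. */
uint32_t merge_shl_or(uint32_t edx, uint32_t eax)
{
    edx <<= 16;   /* shl edx, 16 */
    edx |= eax;   /* or edx, eax */
    return edx;
}
```

With DX = FFFEh, AX = 0001h and clean upper halves, both return FFFE0001h. If the upper half of EAX holds garbage, only `merge_shl_mov` still gives the right answer.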

Peter Cordes