I was reading tutorials regarding inline assembly within C, and they tried a simple variable assignment with

int a=10, b;
asm ("movl %1, %%eax;\n\t"
     "movl %%eax, %0;"
     :"=r"(b)        /* output */
     :"r"(a)         /* input */
     :"%eax"         /* clobbered register */
     );

which made sense to me (move the input into eax, then move eax to the output). But when I removed the movl %%eax, %0 line (which is supposed to move the proper value to the output), the variable b was still assigned the proper value from the inline assembly.

My main question is how does the output 'know' to read from this %eax register?

Peter Cordes
Tom
    BTW, the asm statement in the question is actually fully safe and correct. At least I don't see any problems. That's rare in questions about inline asm so I thought it was worth mentioning! – Peter Cordes Feb 17 '20 at 06:02

1 Answer

An inline-assembly statement is not a function call.

The "return in EAX" thing is for functions; it's part of the calling convention that lets compilers make code that can interact with other code even when they're compiled separately. A calling convention is defined as part of an ABI doc.

As well as defining how to return values (e.g. small non-FP objects in EAX, floating point in XMM0 or ST0), calling conventions also define where callers put args, which registers you can use without saving/restoring (call-clobbered), and which you can't (call-preserved). See https://en.wikipedia.org/wiki/Calling_convention in general, and https://www.agner.org/optimize/calling_conventions.pdf for more about x86 calling conventions.

This rigid set of rules doesn't apply to inline asm because it doesn't have to: the compiler necessarily sees the asm statement as part of the surrounding C code, and forcing a fixed convention on it would defeat the whole point of inline. Instead, in GNU C inline asm you write operands / constraints that describe the asm to the compiler, effectively creating a custom calling convention for each asm statement. (Parts of that convention are left up to the compiler's choice, such as which register to use for an "=r" output. Use "=a" if you want to force it to pick AL/AX/EAX/RAX.)

If you want to write asm that returns in EAX without having to tell the compiler about it, write a stand-alone function. (e.g. in a .s file, or an asm("") statement as the body of an __attribute__((naked)) C function. Either way you have to write the ret yourself and get args via the calling convention, too.)

Falling off the end of a non-void function after running an asm statement that leaves a value in EAX may appear to work with optimization disabled, but it's totally unsafe and will break as soon as you enable optimization and the compiler inlines it.


My main question is how does the output 'know' to read from this %eax register?

The compiler probably just happened to pick EAX for the "=r" output when you compiled with optimization disabled. EAX is always GCC's first choice for evaluating expressions. Look at the compiler-generated asm output (gcc -S -fverbose-asm) to see what asm it generated around your asm, and which register it substituted into your asm template. You probably have mov %eax, %eax ; mov %eax, %eax.

Using mov as the first or last instruction of an asm template almost always means you're doing it wrong and should have used better constraints to tell the compiler where to put or where to find your data.

e.g. asm("" : "=r"(b) : "0"(a)) will make the compiler put the input into the same register where it expects to find the output operand, so the empty template copies a value. (It also forces the compiler to materialize the value in a register and forget anything it knows about the current value, defeating constant-propagation and value-range optimizations, as well as stopping the compiler from optimizing away that temporary entirely.)

Why does issuing empty asm commands swap variables? describes that happening by chance, same as your case with the compiler picking the same reg for input and output "r" operands. It also illustrates using asm comments inside the asm template to print out what the compiler chose for any %0 or %1 operands you don't otherwise reference explicitly.

See also segmentation fault(core dumped) error while using inline assembly for more about the basics of using input and output constraints.

Also related: What happens to registers when you manipulate them using asm code in C++? for another example and writeup of how compilers handle registers in GNU C inline asm statements.

Peter Cordes
  • This makes a lot more sense and clears things up, appreciate the in-depth response! – Tom Feb 17 '20 at 07:18
  • @Tom: if that fully answers your question, you can mark it accepted with the checkmark under the up/down vote arrows. Glad I could help; your question was clear enough about exactly what your misunderstanding was that it was possible to write an answer :) – Peter Cordes Feb 17 '20 at 07:25
  • @Tom: just noticed I linked the wrong Q&A: meant to link [segmentation fault(core dumped) error while using inline assembly](//stackoverflow.com/q/60237447) instead of [Inline assembly multiplication "undefined reference" on inputs](//stackoverflow.com/q/60252065). Perhaps the buggy code in the latter question was what misled you into using `$1` instead of `%1` in your next question. – Peter Cordes Feb 17 '20 at 07:33