Say I have a simple program in C:
#include <stddef.h>
typedef struct Point {
    struct Point *a;
    struct Point *b;
    int x;
    int y;
} Point;
int main() {
    Point p1 = {NULL, NULL, 3, 5};
    return 0;
}
Godbolt compiles this to:
main:
push rbp
mov rbp, rsp
mov QWORD PTR [rbp-32], 0
mov QWORD PTR [rbp-24], 0
mov DWORD PTR [rbp-16], 3
mov DWORD PTR [rbp-12], 5
mov eax, 0
pop rbp
ret
A tiny step further and we have:
int main() {
    Point v = {NULL, NULL, 3, 5};
    Point m = {NULL, NULL, 7, 9};
    Point s = {&v, &s, 11, 12};
    return 0;
}
Compiled to:
main:
push rbp ; save the caller's base pointer on the stack.
mov rbp, rsp ; copy the current stack pointer into the base pointer.
mov QWORD PTR [rbp-32], 0
mov QWORD PTR [rbp-24], 0
mov DWORD PTR [rbp-16], 3
mov DWORD PTR [rbp-12], 5
mov QWORD PTR [rbp-64], 0
mov QWORD PTR [rbp-56], 0
mov DWORD PTR [rbp-48], 7
mov DWORD PTR [rbp-44], 9
mov QWORD PTR [rbp-96], 0
mov QWORD PTR [rbp-88], 0
mov QWORD PTR [rbp-80], 0
mov DWORD PTR [rbp-80], 11
mov DWORD PTR [rbp-76], 12
lea rax, [rbp-32]
mov QWORD PTR [rbp-96], rax
lea rax, [rbp-96]
mov QWORD PTR [rbp-88], rax
mov eax, 0
pop rbp
ret
I can't exactly tell what's going on yet, but this helps (a little). Could someone explain what is happening in the last example? I don't quite understand what the base pointer is, though I know what the stack pointer is. I am not sure what QWORD PTR [...] does; it seems to say the operand is quad-word sized and lives at the address in the brackets. But why is it picking those specific offsets from rbp? I don't understand why it chose them.
Then the second part is the lea rax, [rbp-32]. It looks like it's handling the part where I did {&v, &s}.
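To check my reading of the offsets, here is a minimal sketch that asks the compiler where each field of Point sits. It assumes a 64-bit target with 8-byte pointers and 4-byte ints; `rbp_offset` and `print_point_layout` are names I made up for illustration, not anything from the compiler output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef struct Point {
    struct Point *a;
    struct Point *b;
    int x;
    int y;
} Point;

/* Hypothetical helper: given where a struct starts relative to rbp,
   return where one of its fields sits relative to rbp. */
long rbp_offset(long struct_base, size_t field_offset) {
    return struct_base + (long)field_offset;
}

/* Print the layout the compiler chose for Point on this machine. */
void print_point_layout(void) {
    /* The first member of a struct always starts at offset 0. */
    assert(offsetof(Point, a) == 0);
    printf("a=%zu b=%zu x=%zu y=%zu size=%zu\n",
           offsetof(Point, a), offsetof(Point, b),
           offsetof(Point, x), offsetof(Point, y), sizeof(Point));
}
```

Under those assumptions, `rbp_offset(-32, offsetof(Point, x))` gives -16, which lines up with `mov DWORD PTR [rbp-16], 3` if v starts at rbp-32. The same arithmetic with a base of -96 seems to account for the stores into s.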
So my question is:
- What is the QWORD/DWORD PTR loading into? Is this loading into the heap, the stack, or something else?
- Why is it choosing an offset from rbp?
- Does the order of operations always go from the smallest object (most primitive object) to the most complex object? Or can you think of a case where the assembly code would first construct the complex object and then construct the more primitive objects?
I am wondering because I'm trying to wrap my head around how to create a tree in assembly. In functional programming or in JavaScript, you have a(b(c(), d(), e(f(g(), h()), ...))). The deepest functions get evaluated first, then a gets evaluated last, with the results passed in as its arguments. But I'm having a hard time visualizing how this would look in assembly.
More specifically, I am trying to create a simple key/value store in assembly, to get a deeper understanding of how "objects" are created at this low level. It's easy in JavaScript:
db[key] = value
But this is because value already exists somewhere in memory. The question I have is, should I be creating this directly in the key-value store up-front? Or do you always create it in a random free spot in memory (like the offsets from rbp) and then later move them to the correct position (or point them to the right places)? I keep thinking I should be creating the tree leaf node directly on the branch, like I am pasting a leaf on the branch (visually). But the leaf already exists! Where does it exist before it is on the branch!? Can it ever exist on the branch before it is constructed elsewhere? I am getting confused.
So, start with a leaf.
Paste it on a branch.
/
\ | |
\|/
|
|
Where is the leaf being created in the first place? That's what I was trying to see with the assembly example.
Basically I'm wondering how it looks to directly create something on the heap, rather than the stack.
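For comparison, this is roughly how I imagine the heap version in C, as a sketch only. `point_new` and `point_free_tree` are hypothetical helpers, and the free function assumes a plain tree with no cycles or shared nodes (so not the self-referencing s from my example).

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Point {
    struct Point *a;
    struct Point *b;
    int x;
    int y;
} Point;

/* Ask malloc for room for one Point, then fill the fields in place.
   Returns NULL if the allocation fails. */
Point *point_new(Point *a, Point *b, int x, int y) {
    Point *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->a = a;
    p->b = b;
    p->x = x;
    p->y = y;
    return p;
}

/* Free a node and everything below it (children first). */
void point_free_tree(Point *p) {
    if (p == NULL) return;
    point_free_tree(p->a);
    point_free_tree(p->b);
    free(p);
}
```

With this, `point_new(point_new(NULL, NULL, 3, 5), point_new(NULL, NULL, 7, 9), 11, 12)` builds both leaves before the branch, because the inner calls have to return before the outer call can run. Each leaf lives in its own heap block, and the branch only stores the leaf's address.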