
I've recently been learning how to write a boot sector. Here is the complete code I am studying:

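org 07c00h                  ; the BIOS loads the boot sector at offset 7C00h
    mov ax, cs
    mov ds, ax              ; make DS and ES match CS
    mov es, ax
    call DispStr            ; print the message
    jmp $                   ; loop forever

DispStr:
    mov ax, BootMessage
    mov bp, ax              ; ES:BP = address of the string
    mov cx, 10              ; CX = string length ("Hello, OS!" is 10 characters)
    mov ax, 01301h          ; AH = 13h (write string), AL = 01h (move the cursor)
    mov bx, 000ch           ; BH = page 0, BL = attribute 0Ch (bright red on black)
    mov dl, 0               ; DL = column 0 (DH, the row, is left as-is)
    int 10h                 ; BIOS video service
    ret

BootMessage: db "Hello, OS!"
times 510-($-$$) db 0       ; pad with zeros up to byte 510

dw 0xaa55                   ; boot signature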

The code is very simple if you know how a system boots. The result is the line Hello, OS! displayed on the screen. The only thing I don't understand is the first line: org 07c00h. The book says this line lets the assembler locate the address at 7c00h, but that explanation is ambiguous, and I don't see what the line is for here. What does org 07c00h actually do? I tried removing the line, assembling a bin file with nasm, and booting that bin file in bochs. Nothing changed: "Hello, OS!" was still displayed on the screen. Can I say that the first line does nothing here? What is org xxxx for?

Prof. Falken
Searene
  • It means exactly what the book says. If you don't understand it, you should probably review the basics again. In particular, you need to understand how memory works. – Karl Knechtel Apr 24 '12 at 15:52
  • As the [nasm manual](http://www.nasm.us/doc/nasmdoc7.html#section-7.1.1) says: "The function of the ORG directive is to specify the origin address which NASM will assume the program begins at when it is loaded into memory." I.e. you're telling the assembler something it can't figure out on its own: at what address the program will be loaded. – user786653 Apr 24 '12 at 16:32
  • @Karl: And you should understand first what helping and being kind is and how to answer people to enlighten them instead of just pissing off. – SasQ Jun 17 '12 at 21:47
  • So you could help him understand those fundamentals. If you had the time to write such an uninformative comment, you had (the same) time to write something more enlightening. There's no need to write a book; it's enough to throw in a link to some explanation of memory segmentation somewhere on the Net. Why comment only to not help? – SasQ Jun 17 '12 at 22:49

2 Answers


The assembler translates each line of your source code into a processor instruction and emits these instructions in sequence, one after another, into the output binary file. While doing that, it maintains an internal counter that tracks the current address of each instruction, starting from 0 and counting upwards.

If you're assembling a normal program, these instructions end up in the code section of some object file, with blank slots left for the addresses. The linker fills in the proper addresses afterwards, so it's not a problem.

But when you assemble a flat binary file without any sections, relocations or other formatting, just raw machine instructions, the assembler has no information about where your labels point or what the addresses of your data are. So, for example, when you have an instruction mov si, someLabel, the assembler can only calculate the offset of this label counting from 0 at the beginning of the binary file. That is, it assumes that your code will be located in memory starting at offset 0 of your code segment.

If that's not true, and you want your machine instructions to begin at some other address in memory, e.g. 7C00, then you need to tell the assembler that the starting address of your program is 7C00 by writing org 0x7C00 at the beginning of your source. This directive tells the assembler to start its internal address counter at 7C00 instead of 0. The result is that all addresses used in such a program are shifted by 7C00: the assembler simply adds 7C00 to the address it calculates for each label. The effect is as if the label were located in memory at the address, say, 7C48 (7C00 + 48) instead of just 0048 (0000 + 48), even though it is only 48 bytes from the beginning of the binary image file (which, once the image is loaded at offset 7C00, gives the proper address).
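The shift described above can be tried with a minimal sketch. The labels `start` and `msg` and the padding to offset 48h are made up for illustration. The idea is to assemble it with `nasm -f bin` once as written and once with the `org` line removed, then compare the operand of `mov si, msg`.

```nasm
; Sketch only: 'start', 'msg' and the 48h padding are hypothetical.
bits 16
        org 0x7C00                  ; address counter starts at 7C00 instead of 0

start:
        mov si, msg                 ; operand is 7C48 with org, 0048 without it
        jmp $                       ; hang here

        times 0x48-($-$$) db 0      ; pad so that msg sits 48h bytes into the file
msg:    db "Hi", 0
```

The file layout is identical in both builds, since `$-$$` is a distance and does not depend on `org`. Only the address encoded in the instruction changes, which can be checked by disassembling both outputs with `ndisasm -b 16`.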

As to your other question: 7C00 is the physical address of the bootloader. You can represent this physical address as a logical address (segment:offset) in several different ways, because segments overlap (each segment starts 16 bytes, 10 in hex, after the previous one). For example, you can use the logical address 0000:7C00, which is the simplest configuration: you use segment 0, starting at the beginning of your RAM, and offset 7C00 from that 0. Or you can use the logical address 07C0:0000, which is segment 7C0. Remember that segments start 16 bytes apart from each other? So you simply multiply this 7C0 by 10 (16 in decimal) and you get 7C00. See? It's a matter of shifting the hexadecimal number one digit to the left! :-) Now you just add your offset, which is 0 this time, so it's still 7C00 physically: byte 0 of segment 07C0, which starts at 7C00 in memory.
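The segment arithmetic above can be checked at assembly time. This is only a sketch: `PHYS` is a hypothetical helper macro (not a NASM built-in) that computes segment * 10h + offset, and the `%if` blocks stop the build with an error if a segment:offset pair does not land on 7C00.

```nasm
; Sketch only: PHYS is a hypothetical helper macro, not part of NASM.
; It computes the real-mode physical address: segment * 10h + offset.
%define PHYS(segm,offs) (((segm) << 4) + (offs))

%if PHYS(0x0000,0x7C00) != 0x7C00
  %error "0000:7C00 is not physical 7C00"
%endif
%if PHYS(0x07C0,0x0000) != 0x7C00
  %error "07C0:0000 is not physical 7C00"
%endif
%if PHYS(0x0234,0x58C0) != 0x7C00
  %error "0234:58C0 is not physical 7C00"
%endif

        dw PHYS(0x07C0,0x0000)      ; emits the 16-bit value 7C00
```

If any pair is changed to something that does not add up to 7C00, the build stops at the matching `%error` line.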

Of course you can also use more complicated addresses, for example 0234:58C0, which means that the segment starts at 2340, and when you add the offset 58C0 to it, you get 7C00 again :-) But doing that could be confusing. It all depends on what configuration you need. If you want to treat the physical address 7C00 as the start of your segment, just use segment 07C0 and your first instruction will be at offset 0, so you don't need an org directive, or you can write org 0. But if you need to read or write data below the 7C00 address (for example, to peek at the BIOS data or fiddle with interrupt vectors), then use segment 0 and offset 7C00, which means your first instruction (byte 0 of your binary file) will be located at physical address 7C00 in memory; then you have to add the org 0x7C00 directive, for the reasons described above.
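The two configurations described above could look roughly like this. It is a sketch under assumptions, not the asker's program: `SEG_07C0` is a hypothetical switch (defined with `nasm -f bin -DSEG_07C0 ...`), and the string is printed with the BIOS teletype function (AH = 0Eh) rather than with AH = 13h as in the question, so that only DS needs to be set.

```nasm
; Sketch only: SEG_07C0 is a hypothetical switch, set with -DSEG_07C0.
bits 16
%ifdef SEG_07C0
        org 0                       ; labels are offsets from the image start
%else
        org 0x7C00                  ; labels are offsets from physical address 0
%endif

start:
%ifdef SEG_07C0
        mov ax, 0x07C0              ; segment 07C0 begins at physical 7C00
%else
        xor ax, ax                  ; segment 0 begins at physical 0
%endif
        mov ds, ax                  ; DS:msg points at the string either way
        mov si, msg
        cld                         ; make lodsb walk forward
.next:
        lodsb                       ; AL = [DS:SI], then SI = SI + 1
        test al, al
        jz .done                    ; stop at the terminating 0
        mov ah, 0x0E                ; BIOS teletype output
        mov bx, 0x0007              ; BH = page 0, BL = colour (graphics modes only)
        int 0x10
        jmp .next
.done:
        jmp $                       ; hang

msg:    db "Hello, OS!", 0
        times 510-($-$$) db 0       ; pad with zeros up to byte 510
        dw 0xAA55                   ; boot signature
```

Both builds should print the same text, because in each case the segment base plus the label offset gives the same physical address. Mixing them (segment 07C0 with org 0x7C00, or segment 0 with org 0) makes `msg` point 7C00 bytes away from the string.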

SasQ
  • You mentioned "Remember that segments start 16 bytes apart from each other?", is this always the case? Or can you let me know when it would differ? – supmethods Aug 31 '18 at 01:03
  • Yes, in real mode, it's always the case. On old computers it was done this way in hardware (the segment value was shifted left by 4 bits when addresses were calculated). When Intel introduced protected mode in their CPUs (from the 286 up), segment registers got different behaviour: they index into special tables in memory (the GDT or LDT), and these tables tell the CPU where the particular segments start. However, in real mode, segments are still 16 bytes apart (and overlapping), for backward compatibility. – SasQ Sep 02 '18 at 11:22

This is a case where the assembler and the linker are one step. The org tells the assembler, which in turn tells the linker (in these cases often the same program), where in the memory space to put the code that follows. When you use a C compiler or some other high-level language compiler, you often have separate compile and link steps (although the compiler often calls the linker for you behind the scenes). The source is compiled to a relocatable object file, with the address-dependent parts of some instructions left unresolved until the link step. The linker takes the objects and a linker script, or information from the user describing the memory space, and then encodes the instructions for that memory space.

User786653 said it quite well: it tells the assembler something it can't figure out on its own, namely the memory address where these instructions are going to live, in case it needs to make position-dependent encodings in the instructions. The assembler also uses that information in the output file if the format includes address information, for example elf, srec, ihex, etc.

old_timer
  • thx, but what does `org 7c00h` mean? Is the segment 7c00h, or is the offset 7c00h? Which tool can I use to check the address? – Searene Apr 25 '12 at 07:53
  • All right, using the bochs debugger, I found that 7c00h is added to the offset part of the address. Without the first line, `org 07c00h`, the system loads the wrong address for the string `BootMessage`. thx a lot, I learned a lot. – Searene Apr 25 '12 at 08:13