Basic OS boot question

I have a fairly basic question about the boot process of a computer, specifically the part where the bootloader calls the OS.

So I know that the BIOS copies the first 512 bytes from a bootable drive into memory and executes the code - so that's the boot block.

But how does that small assembler bootloader load the commands from the OS?
Does the bootloader continue running and still act as a "transmitter" between software and hardware? Or is control given completely to the OS?
Why are all bootloaders written in assembler?
And why do you have to go back from C++ to C when writing an OS?

Best regards, lamas

1) The bootloader normally contains some simple instructions to load yet more data from the disk and to execute it.

2) No. Once the OS has been started, control passes to it completely.

3) To minimise the space they take up.

4) You don't.

But how does that small assembler bootloader load the commands from the OS?

Yes, the bootloader is small, but the BIOS is not, and it implements (and has always implemented) the low-level I/O "system calls" that DOS relied on. Back in the DOS and early Windows days, this I/O system handled I/O for the whole OS. Now it is just a console responsible for loading the real OS, which then supplies all of its own drivers. For the boot loader, it is a kind of device-driver library and original-IBM-PC emulator.

Does the bootloader continue running and still act as a "transmitter" between software and hardware? Or is control given completely to the OS?

The bootloader is toast once the OS is running. This is a good question though, because in the original PC concept the BIOS did I/O for the OS as well as the bootloader, and so a part of the system did survive loading the OS.

Why are all bootloaders written in assembler?

Several reasons: they need to be small, they have fixed-address layout constraints, they have to do int $x style BIOS calls, and given their size and the fact that some of it must be in assembly, there isn't much to gain by taking 128 or so bytes and saying "ok, this part you can write in C, try not to write more than 10 or so statements".

And why do you have to go back from C++ to C when writing an OS?

C++ is fine today; back when today's big kernels were started things were different.

But how does that small assembler bootloader load the commands from the OS?

Since you cannot do much in 512 bytes of code (although, in fact, a bootloader is not strictly limited to 512 bytes), a bootloader usually does little more than load a larger block of code from disk into RAM and then execute it.
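To make that hand-off concrete, here is a minimal C++ simulation. It is not real boot code, which would be 16-bit assembly calling the BIOS. It assumes two facts that are not stated above: a PC BIOS places the boot sector at address 0x7C00, and it only boots a sector whose last two bytes are 0x55 0xAA. The names `Sector`, `load_sectors` and `simulate_boot`, and the choice to put the second stage directly behind the boot sector, are hypothetical and only there for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kSectorSize = 512;       // one classic disk sector
constexpr std::size_t kBootLoadAddr = 0x7C00;  // where a PC BIOS places sector 0
// Assumption for this sketch: stage 2 goes right behind the boot sector.
constexpr std::size_t kStage2Addr = kBootLoadAddr + kSectorSize;

using Sector = std::array<std::uint8_t, kSectorSize>;

// A BIOS treats a sector as bootable when its last two bytes are 0x55, 0xAA.
bool has_boot_signature(const Sector& sector) {
    return sector[510] == 0x55 && sector[511] == 0xAA;
}

// Copy `count` sectors, starting at sector index `first`, from the disk image
// into `memory` at offset `dest`. Returns false if the request does not fit.
bool load_sectors(const std::vector<Sector>& disk, std::size_t first,
                  std::size_t count, std::vector<std::uint8_t>& memory,
                  std::size_t dest) {
    if (first > disk.size() || count > disk.size() - first) return false;
    if (dest > memory.size()) return false;
    if (count > (memory.size() - dest) / kSectorSize) return false;
    for (std::size_t i = 0; i < count; ++i) {
        const Sector& src = disk[first + i];
        std::copy(src.begin(), src.end(),
                  memory.data() + dest + i * kSectorSize);
    }
    return true;
}

// Simulated boot: the "BIOS" step places sector 0 at kBootLoadAddr, then the
// "stage 1" step loads the next `stage2_sectors` sectors behind it.
bool simulate_boot(const std::vector<Sector>& disk,
                   std::vector<std::uint8_t>& memory,
                   std::size_t stage2_sectors) {
    if (disk.empty() || !has_boot_signature(disk[0])) return false;
    if (!load_sectors(disk, 0, 1, memory, kBootLoadAddr)) return false;
    return load_sectors(disk, 1, stage2_sectors, memory, kStage2Addr);
}
```

Calling `simulate_boot` with a three-sector disk image and a 64 KiB memory buffer should leave sector 0 at 0x7C00 and sectors 1 and 2 starting at 0x7E00. A real bootloader would then jump to that address instead of returning.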

Does the bootloader continue running and still act as a "transmitter" between software and hardware? Or is control given completely to the OS?

I think that once the bootloader code has done its work and jumped to the additional code that it has loaded into memory, it can be overwritten, since it's no longer needed.

Why are all bootloaders written in assembler?

I suppose this is mainly for one reason: If you write a bootloader in a high-level language, the produced code would quite probably rely on some sort of runtime library, containing essential functions. Those are, however, often not needed for a bootloader, and would therefore bloat its code size.

And why do you have to go back from C++ to C when writing an OS?

You don't strictly have to. It's just that C code is closer to the machine than C++. With C++ you can't always guess what code will be produced, and whether that will be as efficient as you would like.

Edit: I've also heard the argument that some OS developers stick to C because it offers fewer programming paradigms and styles to choose from than, for example, C++. That makes it easier for a team to work on a common code base, because everyone writes more "similar" code. (Since I haven't been involved in any open-source or OS development myself, I can't judge from experience whether this is a valid statement or not.)

To answer your last question: kernels do not need to be written in C. But C makes things a lot easier when it comes to controlling memory allocation.

C++ has a lot of different cases where memory is implicitly allocated for temporary variables and the like, and these are not always evident upon code inspection. This makes writing a kernel more difficult, as you have to avoid allocating memory in some contexts.

However, it doesn't completely preclude writing a kernel in C++. There are ways around this.
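As a rough illustration of the implicit allocations mentioned above, the following C++ sketch replaces the global `operator new` with a counting version and compares passing a `std::string` by value with passing it by reference. The helper names (`g_allocations`, `length_by_value`, `allocations_by_value` and so on) are made up for this example. The exact counts depend on the standard library and on optimisation settings, so treat the numbers as indicative only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <string>

// Counts every call to the global allocator so hidden allocations show up.
static std::size_t g_allocations = 0;

// Replacement for the global operator new: count, then defer to malloc.
void* operator new(std::size_t size) {
    ++g_allocations;
    if (void* p = std::malloc(size == 0 ? 1 : size)) return p;
    throw std::bad_alloc{};
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Passing by value copies the string. For text longer than the library's
// small-string buffer, that copy needs heap memory.
std::size_t length_by_value(std::string s) { return s.size(); }

// Passing by const reference only binds to the existing object.
std::size_t length_by_reference(const std::string& s) { return s.size(); }

// Number of allocator calls made while calling length_by_value(s).
std::size_t allocations_by_value(const std::string& s) {
    const std::size_t before = g_allocations;
    const std::size_t n = length_by_value(s);
    assert(n == s.size());
    (void)n;  // keeps release builds (NDEBUG) free of unused warnings
    return g_allocations - before;
}

// Number of allocator calls made while calling length_by_reference(s).
std::size_t allocations_by_reference(const std::string& s) {
    const std::size_t before = g_allocations;
    const std::size_t n = length_by_reference(s);
    assert(n == s.size());
    (void)n;
    return g_allocations - before;
}
```

With a 256-character string, `allocations_by_reference` should report zero and `allocations_by_value` at least one on common implementations, although the two call sites look almost identical. In a kernel path where allocation is forbidden, that difference is exactly what is easy to miss during review.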

The bootstrap loader loads the operating system from disk.

That's why it's called a bootstrap loader - it pulls itself up by its own bootstraps, loading the disk operating system from disk before there is any disk operating system to load anything from disk.

When the operating system is loaded, it takes over and the loader is discarded.

A bootstrap loader is usually written in assembler because it's a good balance between total control and readability. It could be written in plain machine code (and the first ones probably were), but that is harder to read. It could be written in a low-level language like C, but then it's hard to reach the low-level BIOS routines that you need to do the loading, and it's also hard to stay within the size constraints, as compiled code tends to have a lot of overhead.

An operating system is usually written in a low-level language like C (with things like bootstrap loaders and hardware drivers still in assembler). You could write it in C++ as it builds on C, but that would be mostly pointless as you wouldn't use much of what C++ added. Object orientation is simply not very useful when writing an operating system.

There is one important point missing from all the answers on why it is written in assembly:

a. The MMU is not set up yet.

b. There is no loader process running yet to load your compiled binaries. The memory region is not homogeneous (some parts are hardwired for DMA, BIOS purposes and so on), so a lot of remapping work is needed.

So when you write it in assembly, you have to specify all your memory locations explicitly (where the data is located, where the executable is located, and so on) and ensure that these constraints do not conflict.
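The kind of conflict check described above can be sketched in C++ as an overlap test between a planned load region and the areas that are already taken. The reserved ranges listed here are the well-known real-mode areas of a BIOS PC, which the answer itself does not enumerate. The `Region` type and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct Region {
    std::uint32_t start;   // first physical address
    std::uint32_t length;  // size in bytes, assumed to be nonzero
    const char* name;      // label for diagnostics
};

// Two half-open ranges [start, start + length) overlap when each one begins
// before the other one ends. 64-bit ends avoid wrap-around near 4 GiB.
bool overlaps(const Region& a, const Region& b) {
    const std::uint64_t a_end = std::uint64_t{a.start} + a.length;
    const std::uint64_t b_end = std::uint64_t{b.start} + b.length;
    return a.start < b_end && b.start < a_end;
}

// Returns the first reserved region that `candidate` collides with,
// or nullptr when the candidate is free of conflicts.
const Region* first_conflict(const Region& candidate,
                             const std::vector<Region>& reserved) {
    for (const Region& r : reserved) {
        if (overlaps(candidate, r)) return &r;
    }
    return nullptr;
}

// Areas below 1 MiB that a boot-time program on a BIOS PC should not overwrite.
std::vector<Region> real_mode_reserved() {
    return {
        {0x00000, 0x00400, "interrupt vector table"},
        {0x00400, 0x00100, "BIOS data area"},
        {0x07C00, 0x00200, "boot sector"},
        {0xA0000, 0x60000, "video memory and ROM"},
    };
}
```

For example, a second stage planned at 0x7E00 with length 0x2000 passes the check, while one planned at 0x7D00 collides with the boot sector itself. In assembly this bookkeeping is done by hand, which is the point the answer makes.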

If you write in C, gcc will compile it into some ABI-standard format, and a pre-existing loader (it may be just a library) is expected to load the binary into memory. Since the memory a binary uses is usually more than what is available as physical memory, and certain parts of memory are reserved for platform-specific hardware use, it is important to set up the MMU before loading these binaries.

ELF loading is just one aspect of the complexity:

http://wiki.osdev.org/ELF#Loading_ELF_Binaries
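As a small taste of what such a loader has to do first, here is a C++ sketch that inspects only the identification bytes at the start of an image: the magic number 0x7F 'E' 'L' 'F' and the class byte (1 for 32-bit, 2 for 64-bit). The `ElfClass` enum and `classify_elf` are names invented for this example. A real loader would go on to parse the program headers and map each segment, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

enum class ElfClass { Invalid, Elf32, Elf64 };

// Inspect the identification bytes at the start of a file image.
// Returns Invalid when the magic number or the class byte is not recognised.
ElfClass classify_elf(const std::uint8_t* data, std::size_t size) {
    if (data == nullptr || size < 5) return ElfClass::Invalid;

    // Bytes 0..3: magic number 0x7F 'E' 'L' 'F'.
    const bool magic_ok = data[0] == 0x7F && data[1] == 'E' &&
                          data[2] == 'L' && data[3] == 'F';
    if (!magic_ok) return ElfClass::Invalid;

    // Byte 4 (EI_CLASS): 1 means 32-bit objects, 2 means 64-bit objects.
    switch (data[4]) {
        case 1:  return ElfClass::Elf32;
        case 2:  return ElfClass::Elf64;
        default: return ElfClass::Invalid;
    }
}
```

Even this first step already assumes that the whole image is somewhere in memory and that some code exists to interpret it, which is exactly what a 512-byte boot sector does not have room for.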

But there do exist distributions like uClinux which do away with the need for an MMU to be set up:

http://www.uclinux.org/

Someone will have to teach me more about that.