diff --git a/log/2024-03-26.md b/log/2024-03-26.md
index b33c825..2daaf68 100644
--- a/log/2024-03-26.md
+++ b/log/2024-03-26.md
@@ -1,6 +1,365 @@
 # Writing an OS in rust
 # Introduction
-I have tried multiple times to write an OS, but have always failed. Time to fail again :). I will be writing down my progress here, and maybe share it some day.
+I have tried multiple times to write an OS, but have always failed. Time to fail again :). I will be writing down my progress here, and maybe share it some day. I have previously used [osdev](https://osdev.org/) to learn, and now I also use [OS in Rust](https://os.phil-opp.com/).
+
+# Cross-compilation
+The first step to making an OS is of course installing the correct compilers.
+
+## binutils
+Binutils is needed for linking and assembling. To install binutils I began by downloading the latest version from [sourceware](https://sourceware.org/pub/binutils/snapshots/) into the repository root directory.
+After that I ran these commands in the repository:
+```bash
+tar -xf binutils-x.y.z.tar.xz # Extract the source code (tar detects the xz compression itself)
+mv binutils-x.y.z binutils # Move source into a binutils directory
+cd binutils
+mkdir build-binutils && cd build-binutils # Create and enter the build directory
+../configure --target=i686-elf --prefix="$PWD/../../opt" --with-sysroot --disable-nls --disable-werror
+make # Compile
+make install # Install
+```
+
+## gcc
+Gcc is needed for eventual C code and as a frontend for the linker. To install gcc I began by downloading the source code from [mirrorservice](https://mirrorservice.org/sites/sourceware.org/pub/gcc/snapshots/) into the repository directory.
After I downloaded gcc I ran these commands:
+```bash
+tar -xf gcc-x.y.z.tar.xz # Extract the source code (tar detects the xz compression itself)
+mv gcc-x.y.z gcc # Move source into a gcc directory
+cd gcc
+mkdir build-gcc && cd build-gcc # Create and enter the build directory
+../configure --target=i686-elf --prefix="$PWD/../../opt" --disable-nls --enable-languages=c,c++ --without-headers
+make all-gcc # Compile gcc
+make all-target-libgcc # Compile libgcc
+make install-gcc # Install gcc
+make install-target-libgcc # Install libgcc
+```
+In order to add gcc and binutils to `$PATH` I added this line to `shellHook` in `shell.nix`:
+```bash
+export PATH="$PATH:$PWD/opt/bin"
+```
+
+## rustup
+[Rustup](https://rustup.rs/) is needed to install the correct target. So that anyone can replicate my toolchain, I used [nix](https://nixos.org). Following [this](https://nixos.wiki/wiki/Rust) tutorial, I got rustup installed. I also added
+```bash
+rustup component add rust-analyzer
+rustup component add rust-src
+```
+to `shellHook` to add the necessary components for IDEs and for compiling to custom targets.
+
+In the end, `shell.nix` looked like this:
+```nix
+{ pkgs ?
import <nixpkgs> {} }:
+
+pkgs.mkShell rec {
+  buildInputs = with pkgs; [
+    clang
+    llvmPackages_17.bintools
+    rustup
+    qemu
+    grub2
+    libisoburn
+    bison
+    flex
+    gmp
+    libmpc
+    mpfr
+    texinfo
+    isl
+  ];
+  RUSTC_VERSION = "nightly"; # Required for some experimental cargo features
+  hardeningDisable = [ "all" ]; # Required to compile gcc
+  LIBCLANG_PATH = pkgs.lib.makeLibraryPath [ pkgs.llvmPackages_latest.libclang.lib ];
+  shellHook = ''
+    rustup component add rust-analyzer
+    rustup component add rust-src # Needed for compiling to a custom target
+    export PATH=$PATH:''${CARGO_HOME:-~/.cargo}/bin
+    export PATH=$PATH:''${RUSTUP_HOME:-~/.rustup}/toolchains/nightly-x86_64-unknown-linux-gnu/bin/
+    export PATH="$PATH:$PWD/opt/bin"
+  '';
+  # Add glibc, clang, glib, and other headers to bindgen search path
+  BINDGEN_EXTRA_CLANG_ARGS =
+    # Includes normal include path
+    (builtins.map (a: ''-I"${a}/include"'') [
+      # add dev libraries here (e.g. pkgs.libvmi.dev)
+      pkgs.glibc.dev
+    ])
+    # Includes with special directory paths
+    ++ [
+      ''-I"${pkgs.llvmPackages_latest.libclang.lib}/lib/clang/${pkgs.llvmPackages_latest.libclang.version}/include"''
+      ''-I"${pkgs.glib.dev}/include/glib-2.0"''
+      ''-I${pkgs.glib.out}/lib/glib-2.0/include/''
+    ];
+}
+```
+
+# Bootstrap assembly
+To boot the operating system you need a bootloader. For this project I will just be using GRUB. GRUB needs to know how to boot the OS, and it learns this from a Multiboot header in the kernel binary. To provide that header I made an assembly file named `boot.s` containing this:
+```asm
+/* Declare constants for the multiboot header.
*/
+.set ALIGN,    1<<0             /* align loaded modules on page boundaries */
+.set MEMINFO,  1<<1             /* provide memory map */
+.set FLAGS,    ALIGN | MEMINFO  /* this is the Multiboot 'flag' field */
+.set MAGIC,    0x1BADB002       /* 'magic number' lets bootloader find the header */
+.set CHECKSUM, -(MAGIC + FLAGS) /* checksum of above, to prove we are multiboot */
+
+/*
+Declare a multiboot header that marks the program as a kernel. These are magic
+values that are documented in the multiboot standard. The bootloader will
+search for this signature in the first 8 KiB of the kernel file, aligned at a
+32-bit boundary. The signature is in its own section so the header can be
+forced to be within the first 8 KiB of the kernel file.
+*/
+.section .multiboot
+.align 4
+.long MAGIC
+.long FLAGS
+.long CHECKSUM
+
+/*
+The multiboot standard does not define the value of the stack pointer register
+(esp) and it is up to the kernel to provide a stack. This allocates room for a
+small stack by creating a symbol at the bottom of it, then allocating 16384
+bytes for it, and finally creating a symbol at the top. The stack grows
+downwards on x86. The stack is in its own section so it can be marked nobits,
+which means the kernel file is smaller because it does not contain an
+uninitialized stack. The stack on x86 must be 16-byte aligned according to the
+System V ABI standard and de-facto extensions. The compiler will assume the
+stack is properly aligned and failure to align the stack will result in
+undefined behavior.
+*/
+.section .bss
+.align 16
+stack_bottom:
+.skip 16384 # 16 KiB
+stack_top:
+
+/*
+The linker script specifies _start as the entry point to the kernel and the
+bootloader will jump to this position once the kernel has been loaded. It
+doesn't make sense to return from this function as the bootloader is gone.
+*/
+.section .text
+.global _start
+.type _start, @function
+_start:
+    /*
+    The bootloader has loaded us into 32-bit protected mode on a x86
+    machine. Interrupts are disabled.
Paging is disabled. The processor
+    state is as defined in the multiboot standard. The kernel has full
+    control of the CPU. The kernel can only make use of hardware features
+    and any code it provides as part of itself. There's no printf
+    function, unless the kernel provides its own <stdio.h> header and a
+    printf implementation. There are no security restrictions, no
+    safeguards, no debugging mechanisms, only what the kernel provides
+    itself. It has absolute and complete power over the
+    machine.
+    */
+
+    /*
+    To set up a stack, we set the esp register to point to the top of the
+    stack (as it grows downwards on x86 systems). This is necessarily done
+    in assembly as languages such as C cannot function without a stack.
+    */
+    mov $stack_top, %esp
+
+    /*
+    This is a good place to initialize crucial processor state before the
+    high-level kernel is entered. It's best to minimize the early
+    environment where crucial features are offline. Note that the
+    processor is not fully initialized yet: Features such as floating
+    point instructions and instruction set extensions are not initialized
+    yet. The GDT should be loaded here. Paging should be enabled here.
+    C++ features such as global constructors and exceptions will require
+    runtime support to work as well.
+    */
+
+    /*
+    Enter the high-level kernel. The ABI requires the stack is 16-byte
+    aligned at the time of the call instruction (which afterwards pushes
+    the return pointer of size 4 bytes). The stack was originally 16-byte
+    aligned above and we've pushed a multiple of 16 bytes to the
+    stack since (pushed 0 bytes so far), so the alignment has thus been
+    preserved and the call is well defined.
+    */
+    call kernel_main
+
+    /*
+    If the system has nothing more to do, put the computer into an
+    infinite loop. To do that:
+    1) Disable interrupts with cli (clear interrupt enable in eflags).
+       They are already disabled by the bootloader, so this is not needed.
+       Mind that you might later enable interrupts and return from
+       kernel_main (which is sort of nonsensical to do).
+    2) Wait for the next interrupt to arrive with hlt (halt instruction).
+       Since they are disabled, this will lock up the computer.
+    3) Jump to the hlt instruction if it ever wakes up due to a
+       non-maskable interrupt occurring or due to system management mode.
+    */
+    cli
+1:  hlt
+    jmp 1b
+
+/*
+Set the size of the _start symbol to the current location '.' minus its start.
+This is useful when debugging or when you implement call tracing.
+*/
+.size _start, . - _start
+```
+This file defines the multiboot header, sets up a stack, and calls `kernel_main`.
+
+To assemble this I used the GNU assembler I compiled earlier:
+
+`i686-elf-as -o boot.o boot.s`
+
+# Linking
+## What is it?
+To combine `boot.s` and the kernel we need to do something called "linking", which merges all the object files into a single executable (here an ELF file). An object file is like a half-finished executable: it refers to external symbols such as `printf` by name. It can't be run directly, because it doesn't know where `printf` is. Linking combines your object file with the library that defines `printf` (the C standard library) and fills in the missing addresses.
+
+## For an operating system
+I have to combine `boot.s` and the kernel in a specific way to preserve the multiboot header. A *linker script* tells the linker how to lay out the object files, so I created a file called `linker.ld` in the repository root, which contains this:
+```linker
+/* The bootloader will look at this image and start execution at the symbol
+   designated as the entry point. */
+ENTRY(_start)
+
+/* Tell where the various sections of the object files will be put in the final
+   kernel image. */
+SECTIONS
+{
+    /* Start offset */
+    . = 2M;
+
+    /* First put the multiboot header, as it is required to be put very early
+       in the image or the bootloader won't recognize the file format.
+       Next we'll put the .text section.
*/
+    .text BLOCK(4K) : ALIGN(4K)
+    {
+        *(.multiboot)
+        *(.text)
+    }
+
+    /* Read-only data. */
+    .rodata BLOCK(4K) : ALIGN(4K)
+    {
+        *(.rodata)
+    }
+
+    /* Read-write data (initialized) */
+    .data BLOCK(4K) : ALIGN(4K)
+    {
+        *(.data)
+    }
+
+    /* Read-write data (uninitialized) and stack */
+    .bss BLOCK(4K) : ALIGN(4K)
+    {
+        *(COMMON)
+        *(.bss)
+    }
+
+    /* The compiler may produce other sections, by default it will put them in
+       a segment with the same name. Simply add stuff here as needed. */
+}
+
+```
+I specified that I want to use this script in `i686-bare-metal.json`.
+
+# Kernel
+The last thing `boot.s` does is call `kernel_main`, which is defined in a Rust file. To set up the Rust project I used `cargo new kernel`. This creates a (not entirely) empty Rust project called `kernel`.
+
+## Target
+By default Rust compiles for the host target (x86_64-unknown-linux-gnu on my machine), but to make it compile to x86 without an OS and with my own linker script I needed to define a custom target. In `kernel/i686-bare-metal.json` I put this:
+```json
+{
+    "llvm-target": "i686-unknown-none",
+    "data-layout": "e-m:e-p:32:32-p270:32:32-p271:32:32-p272:64:64-i128:128-f64:32:64-f80:32-n8:16:32-S128",
+    "arch": "x86",
+    "target-endian": "little",
+    "target-pointer-width": "32",
+    "target-c-int-width": "32",
+    "os": "none",
+    "executables": true,
+    "linker-flavor": "gcc",
+    "linker": "i686-elf-gcc",
+    "panic-strategy": "abort",
+    "disable-redzone": true,
+    "features": "-sse,+soft-float",
+    "dynamic-linking": false,
+    "relocation-model": "pic",
+    "code-model": "kernel",
+    "exe-suffix": ".elf",
+    "has-rpath": false,
+    "no-default-libraries": true,
+    "position-independent-executables": false,
+    "pre-link-args": {
+        "gcc": ["-T", "../linker.ld", "-ffreestanding", "-nostdlib", "-lgcc", "../build/boot.o"]
+    }
+}
+```
+This tells the compiler to compile to 32-bit x86:
+```json
+"llvm-target": "i686-unknown-none",
+"arch": "x86",
+"target-pointer-width": "32",
+"target-c-int-width": "32",
+```
+To use
gcc for linking:
+```json
+"linker-flavor": "gcc", # Tells the compiler that we're using gcc
+"linker": "i686-elf-gcc", # Tells the compiler what gcc executable we want to use
+```
+And the `pre-link-args` entry tells Rust to link with `boot.o` (the assembled `boot.s`), freestanding and without a standard library, using my linker script `../linker.ld`.
+
+To use this custom target I needed a cargo configuration, so I wrote this in `kernel/.cargo/config.toml`:
+```toml
+[build]
+target = "i686-bare-metal.json" # Use the custom target
+
+[unstable]
+build-std-features = ["compiler-builtins-mem"] # To use a custom target, I need to manually compile the standard library.
+build-std = ["core", "compiler_builtins"]
+```
+
+## Code
+Just to test the kernel I put this in `kernel/src/main.rs`:
+```rust
+#![no_std] // Don't compile with std
+#![no_main] // Don't compile with a main function
+
+#[no_mangle]
+pub extern "C" fn kernel_main() -> ! { // C ABI and unmangled name so boot.s can call it; never returns
+    let vga_buffer = 0xb8000 as *mut u8; // Make a pointer to address 0xB8000
+
+    let string: &[u8] = b":3"; // Define a string to print
+    for (i, &byte) in string.iter().enumerate() { // For every character in the string
+        unsafe {
+            *vga_buffer.offset(i as isize * 2) = byte; // Print the character
+            *vga_buffer.offset(i as isize * 2 + 1) = 0xf; // Set the color to white
+        }
+    }
+
+
+    loop {} // End of program
+}
+
+#[panic_handler]
+fn panic(_info: &core::panic::PanicInfo) -> ! {
+    loop {}
+}
+```
+
+# Building the kernel
+To build the kernel I just run `cargo build` in the `kernel` directory. This will produce a binary called `kernel/target/i686-bare-metal/debug/kernel.elf`. I now have a multiboot-enabled kernel :)
+
+# File structure
+```
+geos/
+| opt/
+| | gcc and binutils binaries and lib.
+| kernel/
+| | rust project
+| gcc/
+| | gcc source code
+| binutils/
+| | binutils source code
+```
-# Assembler