Posts with category - Code Snippets

Quick And Dirty Memory Inspection

Ever wish you could inspect N bytes of memory starting from a certain address? Back in the day, when I was a kid programming in BASIC on an Apple II, we had two BASIC instructions called PEEK and POKE, which seemed like witchcraft at the time; you used PEEK and POKE to load a magic number and stuff happened! Looking back on it now, they were just instructions to inspect and set values at addresses in memory.
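For the curious, here is roughly what PEEK and POKE boil down to in C. This is only a sketch: the names peek and poke are my own, and it assumes the address you pass in points at memory your process is actually allowed to read and write.

```c
#include <stdint.h>

/* Hypothetical C analogues of BASIC's PEEK and POKE (names are illustrative).
 * The address must refer to memory that is mapped for this process. */
uint8_t peek(uintptr_t address)
{
    /* volatile forces a real load from memory on every call */
    return *(volatile const uint8_t *)address;
}

void poke(uintptr_t address, uint8_t value)
{
    /* volatile forces a real store to memory on every call */
    *(volatile uint8_t *)address = value;
}
```

Calling poke((uintptr_t)&cell, 0x2A) on a local uint8_t named cell and then peek((uintptr_t)&cell) should hand the same value back.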

Nowadays, in C, we can dump memory using printf, which has a conversion specifier for hexadecimal, and people usually dump the more compact hex values. Occasionally, though, we want a bitwise dump of each byte, and printf has traditionally had no standard conversion for that. I find myself rewriting this every time I need it, so I coded up a quick one and stuck it here for my own future reference. It’s really simple, but you’re welcome to use it if you like; I hereby declare it public domain code.

#include <stdio.h>
#include <stdint.h>

/* Print one byte as eight binary digits, most significant bit first. */
void print_byte(const uint8_t x)
{
    int i;

    for (i = 7; i >= 0; i--) {
        /* shift bit i down to position 0 and isolate it */
        printf("%d", (x >> i) & 1);
    }
}

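/* Dump length bytes starting at mem, one "address: bits" line per byte. */
void dump_memory(const void *mem, const size_t length)
{
    /* keep the pointer const: we only ever read through it */
    const uint8_t *byte_ptr = (const uint8_t *)mem;
    size_t i;

    for (i = 0; i < length; i++) {
        printf("%p:\t", (const void *)(byte_ptr + i));
        print_byte(byte_ptr[i]);
        printf("\n");
    }
}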

int main(void)
{
    char sentence[] = "Hello World";

    /* sizeof includes the terminating NUL, so this dumps 12 bytes */
    dump_memory(sentence, sizeof sentence);

    return 0;
}

Before you go peeking at addresses in memory, make sure they’re mapped into your process; otherwise you’ll get a fault (typically a segmentation fault).
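For comparison with the bitwise dump, here is a sketch of the more compact hex form mentioned earlier. The function name hex_line and its buffer-based interface are my own choices for illustration; it formats into a caller-supplied buffer instead of printing directly, which makes it easy to check.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Format length bytes from mem as space-separated two-digit hex values.
 * Returns the number of characters written, or -1 if out is too small. */
int hex_line(const void *mem, size_t length, char *out, size_t out_size)
{
    const uint8_t *bytes = (const uint8_t *)mem;
    size_t used = 0;
    size_t i;

    if (out_size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < length; i++) {
        /* two hex digits per byte, with a separator before all but the first */
        int n = snprintf(out + used, out_size - used, "%s%02x",
                         i ? " " : "", (unsigned)bytes[i]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

With a 32-byte buffer buf, hex_line("Hi!", 3, buf, sizeof buf) leaves "48 69 21" in buf.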


Booting An Operating System On x86

As I am writing my tiny toy operating system, I’ll document some of the things I discover along the way. I’ll gloss over things that are better documented in places like the OSDev Wiki, but will go into the little details that were non-obvious (to me) and that I had to find out through trial and error, poring over documents and asking on IRC.

So, where does one start to write an OS? We can choose to start at the very beginning: booting. Example code here is from the OS I am working on called Treehouse.

What happens when an x86 system boots? It powers on and runs the boot firmware (UEFI or BIOS); the firmware then loads a bootloader, which in turn loads your kernel. A bootloader can be a simple thing that just loads your kernel from a specific place on disk, or it can be pretty sophisticated, performing hardware initialization, reading different filesystems, or presenting a user menu that lets you select a kernel to boot.

Intel machines boot in real mode, the legacy mode with no memory protection and a 20-bit address space that gets you a whopping 1MB of addressable memory. Why? “Historical reasons”, as real mode rarely has any modern uses. We’ll need protected mode, which all current operating systems need for things like being able to use the entire address space, virtual memory, paging, etc. Now, we can switch to protected mode ourselves in our kernel, but it’s way easier to just let a bootloader like GRUB do it for us, so we can assume protected mode when the system gets handed over to our kernel. It’s a cheat but it makes life simpler, and dammit Jim, we’re writing a kernel not a bootloader.
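To make the 20-bit figure concrete: in real mode a physical address is formed from a 16-bit segment and a 16-bit offset as segment * 16 + offset. The sketch below works through that arithmetic. The function name is my own, and the 20-bit mask assumes the A20 line is disabled, so addresses wrap around at 1MB.

```c
#include <stdint.h>

/* Real-mode address translation: physical = segment * 16 + offset.
 * The mask models a 20-bit address bus (A20 disabled), which is an
 * assumption of this sketch; without it the sum can reach 0x10FFEF. */
uint32_t real_mode_physical(uint16_t segment, uint16_t offset)
{
    uint32_t linear = ((uint32_t)segment << 4) + offset;
    return linear & 0xFFFFFu;
}
```

For instance, segment 0x07C0 with offset 0x0000 maps to physical address 0x7C00, and the largest address the mask lets through is 0xFFFFF, one byte short of 1MB.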

The boot.S file is the first Treehouse code that the bootloader will call. Most of it is from the OSDev Wiki Bare Bones Tutorial, which documents it well, so here I’ll only attempt to explain the bits that puzzled me and that the tutorial doesn’t cover. It’s written in AT&T syntax assembly and it’s pretty short, so I’ll just plonk it down here:

# Boot.S taken mostly from the OSDEV wiki.

# Constants for the multiboot header
.set MULTIBOOT_ALIGN, 1<<0
.set MULTIBOOT_MEMINFO, 1<<1
.set MULTIBOOT_MAGIC, 0x1BADB002
.set MULTIBOOT_FLAGS, MULTIBOOT_ALIGN | MULTIBOOT_MEMINFO
.set MULTIBOOT_CHECKSUM, -(MULTIBOOT_MAGIC + MULTIBOOT_FLAGS)

# The multiboot header
.section .multiboot
.global _multiboot

.align 4
.long MULTIBOOT_MAGIC
.long MULTIBOOT_FLAGS
.long MULTIBOOT_CHECKSUM

.section .bootstrap_stack, "aw", @nobits
stack_bottom:
.skip 16384 # 16 KiB
stack_top:

.section .text
.global _start
.type _start, @function
_start:
movl $stack_top, %esp
pushl %ebx

call kernel_main

cli
hlt

.failsafeloop:
jmp .failsafeloop

.size _start, . - _start

Lines 4-8 (the five .set directives) assign expressions to symbols for the multiboot header. The .set assembler directive works a little like #define in C, except that instead of a textual search-and-replace it actually assigns a value to the symbol, so it expects an expression there. Next, we plonk down the multiboot header in its own section, with the intent of placing that section at the very beginning of the binary in our linker script. Sections are just named parts of the source which can be called anything you like, but depending on what you’re building you’re going to need a few “canonical” sections like .text (if you’re building an executable), .data and .bss.
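The checksum expression is the one that puzzled me most: the Multiboot specification requires that magic, flags and checksum add up to zero as unsigned 32-bit values, which is why the checksum is the negated sum of the other two. Here is the same arithmetic as a C sketch; the two function names are my own and are not part of boot.S.

```c
#include <stdint.h>

#define MULTIBOOT_ALIGN    (1u << 0)
#define MULTIBOOT_MEMINFO  (1u << 1)
#define MULTIBOOT_MAGIC    0x1BADB002u
#define MULTIBOOT_FLAGS    (MULTIBOOT_ALIGN | MULTIBOOT_MEMINFO)

/* Unsigned arithmetic wraps modulo 2^32, matching the header's 32-bit fields. */
uint32_t multiboot_checksum(uint32_t magic, uint32_t flags)
{
    return (uint32_t)0 - (magic + flags);
}

/* A header passes when the magic matches and the three fields sum to zero. */
int multiboot_header_ok(uint32_t magic, uint32_t flags, uint32_t checksum)
{
    return magic == MULTIBOOT_MAGIC
        && (uint32_t)(magic + flags + checksum) == 0;
}
```

With the flags used here (value 3), the checksum works out to 0xE4524FFB.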

The next section is the bootstrap stack, which isn’t a “canonical” section in binaries, but the assembler lets you create any kind of section that you like for your own nefarious ends, and you can describe the properties of that section with flags and a type. A bootstrap stack is the stack you’re going to use while the kernel is bootstrapping itself. Code usually needs a stack for things like stack variables and function calls, so we’re going to have to reserve some space for it. Here we use the flags “aw” and the type @nobits. The “aw” means “allocatable and writable”. Writable (which implies readable too) because you’d want to write to it (obviously), and allocatable means it’s loaded into memory at runtime. The @nobits type indicates that the section’s contents are not stored on disk; it exists only at runtime and is expected to be initialised to zeroes when starting up, much like .bss. The syntax is funky if you’ve never seen it before, but you won’t need to dabble with assembler syntax very much if you’re writing most of your code in C, like I’m doing here.

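Next up is the .text section, which starts with the _start function, the literal starting point of execution for our kernel. As you can see, all it does is load the address of stack_top into the stack pointer register (the stack grows downwards on x86, hence the top), push the EBX register onto the stack, and call the main function (kernel_main), which we’re going to implement in C. A Multiboot bootloader leaves the address of its boot information structure in EBX, so pushing it makes it available to kernel_main as an argument.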
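On the C side, the pushed EBX value could be picked up something like this. It’s a sketch under assumptions: the kernel_main signature and the helper name are hypothetical, not necessarily what Treehouse uses, and only the first three fields of the Multiboot (version 1) information structure are declared.

```c
#include <stdint.h>

/* First three fields of the Multiboot (version 1) information structure. */
struct multiboot_info_head {
    uint32_t flags;      /* bit 0 set means mem_lower/mem_upper are valid */
    uint32_t mem_lower;  /* KiB of lower memory, starting at address 0 */
    uint32_t mem_upper;  /* KiB of upper memory, starting at 1 MiB */
};

/* Hypothetical helper: rough reported memory in KiB, or 0 if unavailable. */
uint32_t reported_memory_kib(const struct multiboot_info_head *info)
{
    if ((info->flags & 1u) == 0)
        return 0;
    return info->mem_lower + info->mem_upper;
}

/* With the cdecl convention arguments live on the stack, so the value
 * pushed by _start arrives here as the first parameter. */
void kernel_main(uint32_t multiboot_info_address)
{
    const struct multiboot_info_head *info =
        (const struct multiboot_info_head *)(uintptr_t)multiboot_info_address;
    uint32_t kib = reported_memory_kib(info);

    (void)kib; /* a real kernel would record or print this */
    for (;;) { /* never return to _start */ }
}
```

The memory fields are only meaningful because boot.S sets MULTIBOOT_MEMINFO in its flags, which asks the bootloader to fill them in.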

Now, we don’t ever expect to return from kernel_main, but in case it does, we have a couple of failsafes. First come the cli and hlt instructions, which disable interrupts and halt the CPU; in case the CPU wakes up anyway (a non-maskable interrupt can still do that), I have an infinite loop there in the form of .failsafeloop.

Finally, the last line sets the size of the _start symbol to the current location (.) minus the address of _start. This comes from the OSDev tutorial and will be useful further down when I do call tracing and debugging, so we’ll talk about it when it comes up.

Next up, we’ll talk about the linker script!
