Building an OS - 3 - FAT12

Hello, welcome to Nanobyte. Today we will learn about filesystems, what they are, what are they good for, how they work, we will learn the inner workings of FAT, and we will make our bootloader load the kernel from the disk.

Filesystems

What is a file system, and why would you ever need something like that?

Let's imagine going to a big library, and we want to read “The lord of the rings” by J. R. R. Tolkien. We go to the nice lady at the front desk, and ask her about the book. The lady will look it up in a catalog, and she will tell us to go to the 2nd floor where the Fantasy Fiction section is, and look into the shelf 4A. We walk up the stairs, and find dozens of shelves full of books. We go to the 4A shelf, and notice that the books are ordered alphabetically by the author's name. We finally found the book we were looking for.

A file system is just like the library, a system of organizing pieces of data on some form of storage. If all the books in the library were put on random shelves, finding anything would be impossible, the same holds true for data on a disk: if you didn't have a filesystem to help you organize all this data, you wouldn't be able to find anything.

The term “file system” is typically used in the context of organizing data in nested structures of “files” and “folders”, but there can be other ways of organizing data on a disk. For example, relational database systems organize data in entities, attributes and relationships between these attributes.

If you find this concept interesting, you can experiment with writing an operating system that uses some other form of organizing data that doesn't use files and folders, it certainly sounds tempting to me. But for this course, we will stick to the traditional “file systems” that we're used to.

There is a huge number of file systems that exist out there, and they differ in how they work, what features they offer and their limits. Some of the most commonly used file systems are:

NTFS - the standard file system used on Windows machines. It is quite an advanced file system, has many features including permission controls, journaling, encryption, compression and many others
FAT* family - older file systems that were the default for the 9x versions of Windows (95, 98, Me). One of the simplest file systems that is still used in modern applications, writing drivers for it is very simple. FAT32 and exFAT are almost always used for external storage devices like SD cards and USB flash drives, where portability between operating systems is important. Also, they are also preferred in embedded applications, where computing power is severely limited.
APFS [apple fs] - used by Apple in MacOS. Also a very advanced file system, has many features, and it was pretty recently introduced to improve the older HFS+.
HFS+ - the older file system used on MacOS, an extended version of HFS
ext* family - used on Linux systems

FAT file system

Because it's one of the simplest file systems and it is supported even by toasters, I decided to use FAT, in particular FAT12 as it is the flavor used on floppy disks. Moving to FAT16 or FAT32 is not very difficult, they work basically the same except for some data structures which have been changed.

So let's dive straight into it, and see how it works.

A typical FAT disk is organized into 4 regions:

The “reserved sectors” region
The File Allocation Table region
The root directory region
The data region

1. The "reserved sectors" region

Begins with the boot sector, with which we are already pretty familiar. Contains some really important parameters, like the size of a sector, the size and location of each other region, as well as some metadata about the disk, like the volume ID, and serial number. Our bootloader is also stored in this region.

FAT32 uses an additional sector for storing these headers called the “filesystem information sector”, but this is not present in FAT12 and FAT16.

There can be other things stored in this region, depending on the file system implementation that created it; for example, in older versions of Windows, additional sectors were added to this region used to store the bootloader.

2. File allocation table region

Contains 2 copies of the file allocation table. This is a simple lookup table which will give us the location of the next block of data. We will get back to this in a minute.

3. Root directory region

The root directory is basically the “table of contents” of the disk: it contains a table of each file or folder located in the root of the disk, which includes things like the file name, the location of the file on the disk, the size, the attributes, and some other metadata.

4. Data region

The data region contains the actual contents of the files, as well as the other directories.

Reading procedure

To understand how files are read, let's go through an actual example. I created a disk image, and we will go step by step to read a file in the root directory named “test.txt”.

First, we need to figure out where the root directory region begins, this region contains the “table of contents” where our file should be. We know that the root directory is the third region, the first 2 being the reserved region and the FAT region, so if we can calculate the sizes of both of these regions, we will know where the root directory begins.The boot sector contains a header called “reserved sectors”, this gives us exactly the size of the reserved region, measured in sectors. In our case, this is just 1 sector.We don't have a header for the FAT region size, but we have the “file allocation table count” and the number of “sectors per fat”, multiplying these 2 numbers will give us the size of the FAT region. In our case, we have 2 file allocation tables, each being 9 sectors long, so the total size of this region is 18.If we add these up, we have the sector at which the root directory begins, that is 1 + 18 = 19. We should also calculate the size of the root directory, so we know how many sectors to read, and when to stop searching. For that, we need to use the “directory entry count” header; looking at the specification, we know that a directory entry is 32 bytes, so the root directory is 224 multiplied by 32, which is 7168 bytes. If we divide this by the “bytes per sector”, we get 14, which is the number of sectors that we need to read. If we had 225 entries, we would have gotten 7200 bytes, or 14.06 sectors. In this case, we need to round up to the next integer number, so we need to read 15 sectors. A trick we can use to achieve that result is to add the number of “bytes per sector” - 1 to the total byte count, before performing the division.
Now that we know where the root directory is, we read it into memory from disk.
Next, we need to go through the directory entries, and figure out where our file is. This is what a directory entry looks like; the most important fields for us are the file name and the “first cluster” number. You can see here that the filename is only 11 characters long… this is how things worked in the MS-DOS and Windows 3.1 era, you couldn't have file names any longer than that. Long file names first appeared in Windows 95, and these were done in a hackish way, by having a special value in the “attribute” field, and then using the other fields to store parts of the long file name. We won't be discussing these in this episode, since the 11 characters are enough for us to load the kernel, but we will in a future episode about FAT.
So this part is pretty straight forward, we compare the file name that we have with the file name field, and if it matches, that means we have found the directory entry. In our case, the 3rd entry matches the “test.txt” file. From this entry, we only need the “first cluster” number, which in our case is 3. You may have noticed that we have a “first cluster high” and a “first cluster low” fields, these are used together to get a 32-bit cluster number in FAT32, but in our case since we are working with FAT12, we only need the low 16 bits.So what are these clusters exactly? They are simply the blocks used by the FAT file system; just like disks are stored in blocks called “sectors”, FAT uses blocks called “sectors”, and the “Sectors per Cluster” header in the boot sector gives us the size of a cluster.
Since we already know where the first cluster of the file is located, we can already read the first block of the file into memory. The cluster number gives us the location in the “data region”, and it starts from 2… so to convert it to a sector number, we first take the total size of all the first 3 regions, and then add the (cluster number - 2) multiplied by the number of sectors per cluster. In our case, we already know the sizes of the first 3 regions which are 1, 18, and 14. Then we subtract 2 from the cluster number, 3, and multiply by the number of sectors per cluster, which is 2. Adding it all together gives us 35, which is the sector number where the data begins.
Now we can read a cluster, which is 2 sectors, starting at sector 35, this will give us the first block of the file contents.
After doing that, we need to figure out what the next cluster is. This is where the file allocation table comes into play; this is a simple lookup table, where the index of an entry corresponds to a cluster number, and the entries indicate the next cluster. The size of each entry depends on the FAT type; for FAT12, each entry is 12 bits wide, that is 1 byte and a half. Similarly, for FAT16 each entry is 16-bits wide, and for FAT32 each entry is 32 bits wide. In our example, the first cluster was 3, which would correspond to the 4th entry in the table; the 4th entry contains the value “4”, so this is our next cluster.Same as before, we calculate the sector number for cluster 4, read the data, and move on to the next cluster. This procedure will repeat until the cluster number has a value above or equal to FF8, these are special values which mark the end of the chain. Once we've reached the end of the chain, we've also reached the end of the file.

Now what if the file we want to find is not in the root directory, but in a folder? In that case, we need to split the path into components, and starting with the first component, we follow the same steps as above. Folders and files are read in the same way, but the contents of a “folder” have the same structure as the “root directory”. Once we finish reading the directory, we search for the next component in the path, and we keep doing that until we reach the file.

Write in C

Let's try to implement this algorithm in C. We will rewrite it later in assembly, as it will be much easier once we understand all the steps.

Continuing from where we left off in part 2, I will start by creating a “tools” folder, and another “fat” folder inside. I will also create a “fat.c” file, which will contain our implementation.

For now, we don't need to integrate this code into our operating system, so I will write it as a standalone program. This is a good practice when working on your operating system, whenever possible to write and test your code on an existing platform, like Windows or Linux, where you have all the debugging tools available, and integrate it into the operating system later.

We can start by writing the main function, where our program will take 2 command line arguments, the FAT disk image and then the file name of the file we want to read. Here we will check if the user introduced the correct number of parameters, and print the syntax otherwise.

Before going any further, let's integrate this in our Makefile. The rules I added are really simple, we just call the GCC compiler like we normally would, and the -g flag adds symbols for debugging. I also added an “all” target which will trigger all the components to be built.

Now back to the program, the next thing we need to do is to read the boot sector, and put that in a data structure. I created a struct called “BootSector”, and then I simply copied all the headers from the bootloader, and converted the syntax. We don't care about the bootloader code, so I simply ignored that part, because it doesn't contain any useful information.

Next, I created a function called “readBootSector” which reads the boot sector from the disk, and stores its data in a global variable. To keep things simple, I am simply returning a boolean value indicating the success. And of course, C doesn't have boolean types, so I added a typedef for that, and some defines for the “true” and “false” constants.

To implement this method, we simply need to call fread, and give it a pointer to a “BootSector” global variable to store the data.

Something we need to keep in mind is that modern compilers may add padding bytes to data structures; the reason is that aligning those to 4 or 8 bytes will improve the performance of certain operations. In our case, this would be a bad thing, because the BootSector structure wouldn't match to what's actually on the disk, so we need to tell the compiler to not do that. In GCC, this can be done by adding attribute(packed) .

Next, I called this function from main, and handled any potential errors.

Then I implemented another function, “readSectors”; I wanted this function to be similar to the disk routine we wrote in assembly in Part 2, so when we have to translate this code into Assembly it will be much easier. So this function takes as parameters the FILE handle, the sector number or LBA, the count, and a pointer to where we want to store the data. In the implementation, we first seek to the right position in the file, which is the sector number multiplied by the sector size, and then we read “count” sectors from that location.

Using this function, I implemented another function, “readFAT” which will read the file allocation table into memory. As we discussed earlier, the “Fat” region begins right after the “reserved” region, and the “reserved” region's size can be found in the “reserved sectors” field from the boot sector. After allocating enough memory, we just need to call the readSectors function to read the FAT. After that, I added the call to the “main” function and handled any errors.

We also need to read the root directory into memory, so let's first create a “DirectoryEntry” struct which contains all the fields from the FAT specification. Let's also add a global variable which will contain the root directory, and is an array of directory entries.

In the “readRootDirectory” function, we start by calculating the beginning position. As we discussed earlier, this is the sum of the sizes of the previous 2 regions, the reserved sectors + the file allocation tables. Now we need to determine how many sectors to read, which would be the total size of the root directory in bytes, divided by the size of a sector; also let's not forget to round up. After that, we can allocate the memory for the root directory; notice that I allocated using the “sector” count instead of the “size”, this is because the “readSectors” function can only read full sectors, so we need to make sure we allocate enough memory not to overflow any buffers. Finally, we read the sectors, and add this to the “main” function as well.

The next step is to find the file in the root directory, this is what the “findFile” function (that I initially called “readFile” and then changed my mind) will do, and it will return a pointer to the corresponding “DirectoryEntry”. The implementation is pretty simple, we just iterate over all the entries, and compare the “name” parameter to the “name” directory entry field. Finally, we can add this to main as well.

At this point, I wanted to test the code I have written so far, so I created a “test.txt” file that I then added to the disk image, by modifying the makefile. When building, I got some errors that I forgot to include some headers, which I quickly fixed. When running the program, we need to provide the “test.txt” file name in the format used by FAT, which is 11 characters, all caps, and padded with spaces. And it looks like it found the file, since we didn't get any error messages. If I run it with a bad file name, it will tell me that it couldn't find the file.

Next, I implemented the readFile method. We pass the file as a DirectoryEntry, that we got from the “findFile” function. First, I modified the function that reads the root directory, so that it saves the sector number where the root directory ends and the data region begins. This way, we don't have to do the calculation again, we can just use the saved value.

I will create a “currentCluster” variable which will keep track of the current cluster, and it's initial value will be the “first cluster” from the directory entry. Then, we begin the main loop where we read each cluster, and get the next cluster from the chain. The formula for converting from a cluster to a sector is… the root directory end, which is the same as the size of all previous regions combined, plus the current cluster - 2, multiplied by the number of sectors per cluster. Now that we know the sector to read, we read 1 cluster using the “readSectors” function, and then advance the position in the output buffer. Next, we need to determine what the next cluster is, which is a simple lookup in the file allocation table. What's a bit strange is the fact that each entry of the table is 12 bits wide, so we need to calculate the index by multiplying the current cluster number by 3 and dividing by 2. This gives us the byte index into the table, and then we need to select the correct bits. If the remainder of the division by 2 is 0 (or currentCluster is an even number), then we need to take the bottom 12 bits, so we apply a bitmask to remove the top 4 bits. If the remainder of the division is 1, or currentCluster is an odd number, we need to take the upper 12 bits, so we shift the value to the right by 4 bits. This will give us the next cluster in the chain, that we put in the “currentCluster” variable. Finally, we add the loop exit condition, we need to keep reading until the currentCluster is greater or equal to FF8, which marks the end of the chain.

Now that the readFile method is done, we can call it from main. When allocating the memory, make sure to allocate at least an extra sector, so we don't overwrite anything or get a segmentation fault. After checking for errors, I will print the contents of the file. To do this, I loop over each character, and if it's printable I print it, otherwise I print the hex number.

Let's give this a try, and see what happens. Looks like I have to add an include for “isprint”, ctype.h. And look at that, it works perfectly :) I can even print the contents of “kernel.bin”, and it matches what the hex viewer shows.

This is awesome, we have implemented our very own FAT driver in C. This will be really handy in the future, but right now we are working on the bootloader, so we need to convert it to Assembly so it can fit inside the boot sector. Easy, right?

I started by making some small improvements to the code we wrote in the previous episode; first, I wanted to make sure that the code segment is 0, as some bios-es might actually run our bootloader using the code segment 07C0, which could be unexpected. We can't manipulate the code segment register directly, we have to perform a far jump to achieve that. I used a trick to perform the jump, by pushing the segment and offset to the stack ,and then performing a far return, but you can do this using the jump instruction as well.

The second change I made was to read the drive parameters using the BIOS routine int13h ah=08, which should give us the number of heads and sectors. This was probably not necessary, since we already have this information stored in the formatted disk.

I continued by calculating the root directory's location and size, and then reading it into memory. The calculations are pretty straight forward, we just need to use the arithmetic instructions, and shuffle stuff around registers, until we get to the correct values. We already wrote the function which reads stuff into memory, we just need to set its parameters and call it, the sector to start reading from in AX, the number of sectors to read in CL, the drive number in DL, and the memory address to write to in es:bx.

The next step is to search for the “kernel.bin” file through the directory entries. I will use BX to count how many entries we've already checked, and the DI register will point to the current directory entry. Because the 'file name' field is the first field in the structure, this means that DI will also point directly to the “file name” field.

I added the “search_kernel” label to mark the beginning of the loop. Next, I created a string that contains the “kernel.bin” file name in the format expected by FAT, and I stored it in the SI register, the length which is 11 in CX. After saving the value of DI, I called the “repe cmpsb” instruction. The CMPSB instruction, which is shorthand for “Compare string bytes”, can be used to compare 2 bytes in memory, one stored in DS:SI and the second stored in ES:DI. Additionally, the instruction also increments or decrements both, SI and DI, depending on whether direction flag is cleared or set respectively. The REPE instruction is a shorthand for “repeat while equal”, and it will repeat the compare instruction as long as the values are equal, up to CX times (all this time CX is also being decremented). As you can see, this construct makes it much easier to compare the 2 strings, without having to write an additional loop. At the end of the REPE instruction, we can restore DI to its previous value, and if the strings are equal, we can jump out of the main loop, to the “found_kernel label”. Otherwise, we move to the next directory entry by adding 32 to DI, which is the size of a directory entry. We also increment the directory entry count we already checked, which is stored in BX, and then check if we've checked all of them or not. If there are more entries to check, we jump to the beginning of the loop. Otherwise, we display an error message that we haven't found the “kernel.bin”, after searching through the entire root directory.

When we find the kernel file, we want first to save the “first cluster” value. After exiting the loop, DI should still point to the “directory entry” structure. If you look at a FAT specification, you will notice that the offset of the “lower first cluster” field is “26”, so we need to grab the value from the [DI + 26] to get the first cluster.

The next step is to read the File allocation table, this is a pretty straight forward process, we just need to set the proper parameters and call the disk read method.

After reading the FAT, we can start reading the file, and process the cluster chain.

Since we want to read the file, we need to decide where to put it. What we need to do right now is look for a lower memory map, and pick a location which will maximize the amount of memory we can use. The reason why this map exists is because some memory regions aren't free, they are occupied by important stuff, and we don't want to overwrite any of those. Also, we can't use more than 1MB yet, since we are in 16-bit real mode. Looking at this map, it looks like the area between our bootloader (7E00) and the extended bios data area at (80000) is the largest contiguous section of memory we can find, at roughly 480KB.

Since we are using some memory at the end of the bootloader to store the file allocation table, we should leave some room, so I picked the address 0x20000. This should let the bootloader use whatever it needs, and it will still leave us with 380kb. It's not ideal, but that will probably be enough for us to do whatever we need to do until we can switch to protected 32-bit mode, and use all the memory we want.

I added 2 constants for the segment and offset, so if we change our minds, we can easily change them. Note that I used EQU here, what this means is that no memory will be allocated for the constant, it will be replaced with the value at build time, this is equivalent to a #DEFINE in C.

Going back to the loading process, I added the “load kernel loop” label to mark the beginning of this second loop. To read the data, we need to convert from the cluster number to a sector; Here, because I was lazy, I did something awful, that I will need to fix in the future… I hardcoded the offset to 31. For our 1.44Megs floppy disk, this will work, but it will not work on another type of disk, so I will definitely want to fix this in the future. Once we know the sector we need to read, we can call the disk_read function, and then increment the value by the “bytes per sector”. Here I did another mistake that I will want to fix in the future; This add will overflow if the kernel.bin file is larger than 64KB, in which case the read file will be corrupted, since we will be overwriting the first part of it. To fix this issue, I will need to detect this case, and increment the segment as well.

Getting the location of the next cluster means calculating the index in the weird 12-bit allocation table, we do here the same calculations we did in the C implementation. After all of that, we check if the cluster number we got is above or equal to FF8, and if it is, that means we have successfully read the entire file.

If not, that means that there are more clusters to read, in which case we jump back to the beginning of the loop.

Once we finished reading, we will pass the boot device in DL, just like we received it, we will set up the data registers, and then do a far jump into the beginning of the kernel.

Let's see what happens if we try to build… and I have a few compile errors to fix.

To test our code, we need to make a few changed to the kernel code. First, the 'org' directive needs to be changed to '0', since our code is loaded at offset '0'. I also rearranged the code a bit, so it looks cleaner. We no longer need to setup the data segments and the stack, for now we are using the values setup by the bootloader. I also removed the padding and the AA55 suffix that the bootloader required.

Now let's test, and see what happens. And it seems to work correctly!, Yeey.

Whew, this was a lot of work, but we finally made it. Before finishing, let me show you something interesting; I opened the bootloader in a hex editor, and look at this: there are only 3 padding bytes with the value “0”. What this means is that we used almost all the 512 bytes that we could use for the boot sector. It's great that we managed to fit everything, but if we want to add anything else, like fixing that “hardcoded” calculation, we actually need to remove something else, or refactor the code so it uses less space.

Let me demonstrate… I tried to add a few bytes, and now when I try to compile, I'm getting an error that TIMES is negative. This means that my code no longer fits in the 512 bytes boot sector, so I need to shorten it.

This should put in perspective how little 512 bytes actually is.

Outro:

With this, we end part 3 in which we learned a lot of interesting things about FAT12. Thank you for your attention, you can find links to everything we talked about in the description, as well as a link to the github repo which contains all the code. If you enjoyed the video don't forget to like, share and subscribe.

Table of Contents