transcripts:building-an-os-1-hello-world [2023/09/01 21:53] – [Tools] Tiberiu Chibici
transcripts:building-an-os-1-hello-world [2023/09/09 16:57] (current) – [Building an OS - 1 - Hello world] Tiberiu Chibici
Line 1: Line 1:
 ====== Building an OS - 1 - Hello world ======
  
-Note: This is verbatim transcript of the [[https://youtu.be/9t-SPC7Tczc| Building an OS - 1 - Hello world ]] video. If you want to follow the new and improved text tutorial, [[:os-tutorial|check here]].
+>Note: This is an almost verbatim transcript of the [[https://youtu.be/9t-SPC7Tczc| Building an OS - 1 - Hello world ]] video (some minor changes have been made where it would make things clearer). If you want to follow the new and improved text tutorial, [[:building-an-os|check here]].
  
 ===== Introduction =====
Line 11: Line 12:
 First, let's see what tools we will need. For editing text, any text editor will do; I will be using the //[[https://micro-editor.github.io/|micro editor]]//, which I really like because it uses common keyboard shortcuts. However, you can use anything you like.
  
-We will use //make// to build our project. //NASM// will be our assembler, which we will use to assemble our assembly code.
+We will use //[[https://www.gnu.org/software/make/manual/make.html|make]]// to build our project. //[[https://www.nasm.us/|nasm]]// will be our assembler, which we will use to assemble our assembly code.
  
-We will also need some virtualization software. I will use qemu, but you can use any virtualization software you like such as VirtualBox, VMware or anything else.
+We will also need some virtualization software. I will use [[https://www.qemu.org/|qemu]], but you can use any virtualization software you like, such as [[https://www.virtualbox.org/|VirtualBox]], [[https://www.vmware.com/products/workstation-player.html|VMware]] etc.
  
-I will be using Ubuntu in this tutorial; if you want to follow on Windows, your best option would be to use the Windows Subsystem for Linux. You can find a tutorial on how to set that up in the video description. The second option would be cygwin which has most of the tools that we need. On MacOS you shouldn't have any trouble finding these tools using homebrew.
+I will be using Ubuntu in this tutorial; if you want to follow on Windows, your best option would be to use the [[:Windows Subsystem for Linux|Windows Subsystem for Linux]]. You can find a tutorial on how to set that up in the video description. The second option would be [[:Cygwin|Cygwin]], which has most of the tools that we need. On macOS you shouldn't have any trouble finding these tools using [[https://brew.sh/|Homebrew]].
  
 ===== Getting started =====
  
-Let's create a //source// folder in our project, and inside we'll make a file called //main.asm//. The first part of our operating system will be written in a programming language called //assembly//. Later, we will be using C, but right now we don't really have a choice and we really have to use assembly.
+Let's create a //''src''// folder in our project, and inside we'll make a file called //''main.asm''//. The first part of our operating system will be written in a programming language called //assembly//. Later, we will be using C, but right now we don't really have a choice and we really have to use assembly.
  
 So what exactly is this //assembly// thing? In short, the //assembly language// is the human readable interpretation of machine code. When you compile a program in a programming language such as C or C++, it gets converted into machine code, which is the language that the processor understands. Assembly makes it easier for us, humans, to read and write machine code. Higher level programming languages, such as C or C++, need to be translated by the compiler into machine code. This involves many steps, like building an abstract syntax tree, and a huge amount of optimization. Assembly is much simpler, because the instructions simply need to be converted into their machine code representation by a tool called an //assembler//.
  
-An assembly instruction is made up of a //mnemonic//, or a keyword, and the parameters which are called //operands//, typically zero one or two operands. An important thing I'd like to mention about the assembly language is that there are differences between different processors. The instructions supported on an x86 processor, that you find in laptops and desktops, are different than instructions supported on an ARM CPU, which is found in smartphones or tablets. Even different processors with the same architecture might have differences; for example SSE is a feature introduced in the Pentium 3 processor line that didn't exist in the Pentium 2 line. However, newer processors easily keep backwards compatibility so that all the programs can still run without any modification. Backwards compatibility for the x86 architecture goes so far back that it can still run software on a modern computer that was designed for the 8086 CPU, the first CPU ever made, using the x86 architecture which was at least 40 years ago.
+An assembly instruction is made up of a //mnemonic//, or a keyword, and a number of parameters which are called //operands//, typically zero, one or two operands.
 +<code asm> 
 +add ax, 7 
 +mov bx, ax 
 +inc ax 
 + 
 +mnemonic operand1, operand2, ... 
 +</code> 
 + 
 +An important thing I'd like to mention about the assembly language is that there are differences between different processors. The instructions supported on an x86 processor, which you find in laptops and desktops, are different from the instructions supported on an ARM CPU, which is found in smartphones or tablets. Even different processors with the same architecture might have differences; for example, SSE is a feature introduced in the Pentium 3 processor line that didn't exist in the Pentium 2 line. However, newer processors generally keep backwards compatibility, so that older programs can still run without any modification. Backwards compatibility for the x86 architecture goes so far back that a modern computer can still run software that was designed for the 8086 CPU, the first CPU ever made using the x86 architecture, which was released at least 40 years ago.
  
 So, to be clear, we will be writing our operating system for the x86 architecture. This means that we'll be using the x86 assembly language.
Line 30: Line 40:
 ===== The boot process =====
  
 +So, what happens when you turn on your computer? First the BIOS kicks in performing all sorts of tests, showing a fancy logo and then, the part that's the most important to us... It starts the operating system. How does it do that?
  
+There are actually two ways in which the BIOS can load an operating system. In the first method, which is now called **//legacy booting//**, the BIOS loads the first block of data (the first sector) from each boot device into memory, until it finds a certain signature (0xAA55). Once it finds that signature, it jumps to the first instruction in the loaded block, and this is where our operating system starts. The second method is called **//EFI//**, which works a bit differently. In this mode, the BIOS looks for a certain EFI partition on each device, which contains special EFI programs. For the moment, we won't be covering EFI; we will only look at legacy mode.

-So what happens when you power on your computer? First, the BIOS kicks in, performing all sorts of tests and showing a fancy logo, and then, the part that's most important to us, it starts the operating system. So how does it do that?

How the BIOS finds an OS

There are actually two ways in which the BIOS can load an operating system. In the first method, which is now called legacy booting, the BIOS loads the first block of data, or the first sector, from each boot device into memory until it finds a certain signature. Once it finds that signature, it jumps to the first instruction in the loaded block, and this is where our operating system starts. The second method is called EFI, which works a bit differently. In this mode, the BIOS looks for a certain EFI partition on each device, which contains some special EFI programs. For the moment we won't be covering EFI; we will only look at legacy mode.

Now that we know how the BIOS loads our operating system, here's what we need to do: we will write some code, assemble it, and then put it in the first sector of a floppy disk. We also need to somehow add that signature that the BIOS requires. After that, we can test our operating system. So let's begin coding.

We know that the BIOS always puts our operating system at address 0x7C00, so the first thing we need to do is give our assembler this information. This is done using the ''org'' directive, which tells the assembler to calculate all memory offsets starting at 0x7C00. Changing this line to another number won't make the BIOS load our code at a different address; it will only tell the assembler that the variables and labels from our code should be calculated with the offset 0x7C00.

Before we continue, I need to explain the difference between a directive and an instruction. A directive is a way of giving the assembler a clue about how to interpret our code. While an instruction is translated into a machine code instruction, a directive won't get translated; it only gives a clue to the assembler.
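To make the directive versus instruction distinction concrete, here is a minimal sketch. The ''start'' label is a hypothetical name used only for this illustration, and the byte values in the comments are the standard x86 encodings of ''nop'' and ''hlt''.

<code asm>
org 0x7C00      ; directive: emits no bytes, only sets the base used for label offsets

start:          ; hypothetical label, used only for this illustration
    nop         ; instruction: assembles to the single byte 0x90
    hlt         ; instruction: assembles to the single byte 0xF4
</code>

Assembled with ''nasm -f bin'', this should produce a file containing only the two instruction bytes; the ''org'' line contributes nothing to the output.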
Next, we tell our assembler to emit 16-bit code. As I mentioned before, any x86 CPU must be backwards compatible with the original 8086 CPU, so if an operating system that was designed for the 8086 is run on a modern CPU, it still needs to think that it's running on an 8086. Because of this, the CPU always starts in 16-bit mode. ''bits'' is also a directive, which tells the assembler to emit 16-bit code. Writing ''bits 32'' won't make the processor run in 32-bit mode; it is only a directive which tells the assembler to emit 32-bit code.

Now I'll define the main label to mark where our code begins. For now, we just want to know that the BIOS loads our operating system correctly, so I'll only write a halt instruction, which halts the processor. In certain cases, the CPU can start executing again, so I'll just create another halt label and then jump to it. This way, if the CPU ever starts again, it will be stuck in an infinite loop. It's not a good idea to allow the processor to continue executing beyond the end of our program.

Our program is almost done. All that's left to do is add that signature that the BIOS requires. We will be putting our program on a standard 1.44 MB floppy disk, where one sector has 512 bytes. The BIOS requires that the last two bytes of the first sector are 0xAA55. We can ask NASM to emit bytes directly by using the ''db'' directive, which stands for "declare constant byte". The ''times'' directive can be used to repeat instructions or data. Here, we use it to pad our program so that it fills up to 510 bytes, after which we declare the two-byte signature. In NASM, the dollar sign (''$'') can be used to obtain the assembly position of the beginning of the current line, and the double dollar sign (''$$'') gives us the position of the beginning of the current section. In our case, ''$-$$'' gives us the length of the program so far, measured in bytes. Finally, we declare the signature. ''dw'' is a directive similar to ''db'', but it declares a two-byte constant, which is generally referred to as a word.
With this, we have successfully written our first operating system. So far, it doesn't really do anything but stop the processor. Let's test whether it works. I created a build directory to keep things organized. For building the project, I created a Makefile. I added a rule to build the ''main.asm'' code using NASM and output it in a binary format. I also added a rule to build the floppy image, where I simply took the binary file previously built and padded it with zeros until it has 1.44 MB.

Finally, we can test our little operating system. You can use any virtualization software you want, such as VirtualBox or VMware; I use qemu because it's really easy to set up and it can be used from the command line. As you can see, the system boots from the floppy and then it does nothing, exactly as we expected. So far, our operating system does nothing, and does it perfectly.

Now that we know it works, let's go back to the code and print a "hello world" message to the screen. Before I start explaining how you can do that, I need to explain some basic concepts about the x86 architecture.
x86 CPU Registers

All processors have a number of registers, which are really small pieces of memory that can be written and read very fast and are built into the CPU. Here's a diagram of all the registers on an x86 CPU. There are several types of registers. The general-purpose registers can be used for almost any purpose. The index registers are usually used for keeping indices and pointers; they can also be used for other purposes. The program counter is a special register which keeps track of the memory location where the current instruction begins. The segment registers are used to keep track of the currently active memory segments, which I will explain in just a moment. There is also a flags register, which contains some special flags that are set by various instructions. There are a few more special-purpose registers, but we'll only introduce them when we need them.

Memory segmentation

Now we'll talk a bit about RAM. The 8086 CPU had a 20-bit address bus. This meant that you could access up to 2^20 bytes, which is about 1 MB of memory. At the time, typical computers had around 64 to 128 KB of memory, so the engineers at Intel thought this limit was huge. For various reasons, they decided to use a segment and offset addressing scheme for memory. In this scheme, you address memory by using two 16-bit values, the segment and the offset. Each segment contains 64 KB of memory, where each byte can be accessed using the offset value. Segments overlap every 16 bytes. This means that you can convert a segment:offset address to an absolute address by shifting the segment four bits to the left (that is, multiplying it by 16) and then adding the offset. This also means that there are multiple ways of addressing the same location in memory. For example, the absolute address 0x7C00, which is where the BIOS loads our operating system, can be written as any combination that you can see on the screen.

There are some special registers which are used to specify the actively used segments. CS contains the code segment, which is the segment the processor executes code from. The IP, or program counter, register only gives us the offset. The DS and ES registers are data segments. Newer processors introduced additional data segments, FS and GS. SS contains the current stack segment. In order to access memory outside one of these active segments, we need to load that segment into one of these registers. The code segment can only be modified by performing a jump.

Referencing a memory location

Now, how do you reference a memory location from assembly? You use this syntax: a segment register, followed by a colon, followed by an expression which gives the offset, put in brackets. The segment register can be omitted, in which case the DS register will be used. The processor is capable of doing some arithmetic for us, as long as we use this expression. The base and index operands can be any general-purpose registers. In 16-bit mode there are a few limitations, however: only BP and BX can be used as base registers, and only SI and DI can be used as index registers. These limitations exist because of how the 8086 CPU was originally designed, where they had to put such limitations in place to keep the complexity down. Another example of one such limitation is that we can't write constants to the segment registers directly; we have to use an intermediary register.

With the introduction of the 386 processor just a few years later, 32-bit mode was introduced, which pretty much rendered 16-bit mode obsolete. A lot of newer CPU features were simply not added to 16-bit mode, because it is obsolete and only exists for backwards compatibility. It is still useful to learn, because most of the things that apply to 16-bit mode also apply to 32-bit or 64-bit mode, and it is much simpler. Its main use today is in the startup sequence; most operating systems switch to 32-bit or 64-bit mode immediately after starting up. We will do the same thing in a future video, but we can't just yet. For now, we are limited to the first sector of a floppy disk, that is, 512 bytes, which is very little space. Once we are able to load data from the floppy disk, we can do a lot more. All operating systems have to do the same thing in order to boot.

But until we get there, let's get back to referencing our memory locations. I already talked about the base and index operands. The scale and displacement operands are numerical constants. The scale can only be used in 32-bit and 64-bit modes, and it can only have a value of 1, 2, 4 or 8. The displacement can be any signed integer constant. All the operands in a memory reference expression are optional, so you only have to use whatever you need.

So here's an example. First, I defined a label which points to a word having the value 100. The first instruction puts the offset of the label into the AX register. The second instruction puts the memory contents that our label points to into AX. Since we didn't specify a segment register, DS is going to be used. We haven't used the base, index or scale, but only a constant, which is the offset our label points to. In assembly, labels are simply constants which point to a specific offset.

Here's a more complicated example, where we want to read the third element in an array. In this example, we put the offset of the array into BX and the index of the third element into SI. Since we use zero-based indexing, the third element is array[2], and each element in the array is a word, which is two bytes wide, so we put the value 4 in SI. You can see here that we use the multiplication symbol; the assembler is capable of calculating the result of constant expressions and putting the result in the resulting machine code. However, you can't write something like ''mov bx, ax * 2'': AX is not known at compile time, so it is not a constant. For that, you have to use the multiply instruction. Referencing memory is the only place where you can put registers in an expression. Finally, we put the third element of the array into AX by referencing the memory location at BX + SI; BX is our base register and SI is our index register.

Now, back to our operating system. The code segment register has been set up for us by the BIOS, and it points to segment 0. There are some BIOSes out there which actually jump to our code using a different segment and offset, such as segment 0x07C0, offset 0, but the standard behavior is to use segment 0, offset 0x7C00. We don't know whether the data segment and extra segment are properly initialized, so this is what we have to do next. Since we can't write a constant directly to a segment register, we have to use an intermediary register; we will use AX. The ''mov'' instruction copies data from the source on the right side to the destination on the left side. We also set up the stack segment to 0 and the stack pointer to the beginning of our program.

The stack

So what exactly is the stack? Basically, the stack is a piece of memory that we can access in a last in, first out manner using the ''push'' and ''pop'' instructions. The stack also has a special purpose when using functions. When you call a function, the return address is added to the stack. When you return from a function, the processor will read the return address from the stack and then jump to it. Another thing to note about the stack is that it grows downwards. SP points to the top of the stack. When you push something, SP is decremented by the number of bytes pushed, and then the data is written to memory. This is why we set up the stack to point to the start of our operating system: because it grows downwards, if we set it up at the end of our program, it would overwrite our program. We don't want that, so we just put it somewhere where it won't overwrite anything. The beginning of our operating system is a pretty safe spot.

Now we'll start coding a ''puts'' function, which prints a string to the screen. Always document your assembly functions. Our function will receive a pointer to a string in DS:SI, and it will print characters until it encounters a null character. Because I decided to write the function above main, I have to add a jump instruction above it so main is still the entry point to our program. First, I push the registers that I'm going to modify to the stack.
After that, we enter the main loop. The ''lodsb'' instruction loads a byte from the address DS:SI into the AL register and then increments SI. Next, I wrote the loop exit condition. The ''or'' instruction performs a bitwise OR and stores the result in the left-hand side operand, in this case AL. ORing a value with itself won't modify the value at all, but what it will modify is the flags register: if the result is zero, the zero flag will be set. The next instruction is a conditional jump, which jumps to the done label if the zero flag is set. So essentially, if the next character is null, we jump outside the loop. There's something I forgot when I recorded the video: a jump instruction back to the loop label, so that the code will loop. After exiting the loop, we pop the registers we previously pushed, in reverse order, and then we return from the function.

So far, our function takes a string, iterates over every character until it encounters the null character, and then exits. What's left to do is to print the character to the screen. The way we can do that is by using the BIOS. As the name suggests, the BIOS, or the basic input/output system, does more than just start the system; it also provides some very basic functions which allow us to do some very basic stuff, such as writing text to the screen. So how exactly do we call the BIOS to print a character for us? The answer is that we use interrupts.

So what are interrupts? An interrupt is basically a signal which makes the processor stop whatever it is doing to handle that event. There are three possible ways of triggering an interrupt. The first way is through an exception. An exception is generated by the CPU if a critical error is encountered and it cannot continue executing; for example, dividing by zero will trigger an interrupt. Operating systems can use these interrupts to stop the misbehaving process or to attempt to restore it to working order. Hardware can also trigger interrupts, for example when a key is pressed on the keyboard or when the disk controller has finished performing an asynchronous read. The third way in which interrupts can be triggered is through the ''int'' instruction. Interrupts are numbered from 0 to 255, so the instruction requires a parameter indicating the interrupt number to trigger. The BIOS installs some interrupt handlers for us so that we can use its functionality.
Examples of BIOS interrupts

Typically, the BIOS reserves an interrupt number for a category of functions, and the value in the AH register is used to choose between the available functions in that category. To print text to the screen, we will need to call interrupt 0x10, which contains the video services category. By setting AH to 0x0E, we will call the "write text in teletype mode" function. Here's a detailed description of this function. So what we need to do in order to call this function is to set the AH register to 0x0E, AL to the ASCII character that we want to print, and BH to the page number. The BL parameter is only used in graphics mode, so we can ignore it because we're currently running in text mode. When I recorded the video, I forgot to set the page number to zero. After that, we call interrupt 0x10.

Finally, let's add a string containing the text "hello world" followed by a new line. To add a new line, you need to print both the line feed and the carriage return characters. I created a NASM macro so that I don't have to remember the hex codes for these characters every time. To declare a string, we use the ''db'' directive, which conveniently allows us to write as many characters as we want. All that's left to do is to set DS:SI to the address of the string and then call the ''puts'' function.

Let's now test our program. Because I forgot to put that jump instruction, I only got the "H". After fixing the issue, the message "hello world" is displayed. Great! So we have successfully written a tiny operating system which can print text to the screen.

This was a lot of work, and we learned a lot of new stuff about how computers work. We'll continue next time, when we will improve our assembly skills and learn some new stuff by extending our operating system to print numbers to the screen. After that, we'll get into the complex task of loading stuff from the disk. Thank you for watching, and see you next time. Bye bye!
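Putting the steps from the transcript together, the finished boot sector might look roughly like the sketch below. This is a reconstruction from the spoken description, not the code shown in the video: the names ''start'', ''msg_hello'' and ''ENDL'' are assumptions chosen for illustration, as are the ''.loop'' and ''.done'' local label spellings.

<code asm>
org 0x7C00                  ; the BIOS loads the boot sector at 0x7C00
bits 16                     ; the CPU starts in 16-bit mode

%define ENDL 0x0D, 0x0A     ; assumed macro name: carriage return + line feed

start:
    jmp main                ; skip over puts so main stays the entry point

;
; Prints a null-terminated string to the screen.
; Params:
;   - ds:si points to the string
;
puts:
    push si                 ; save the registers we are going to modify
    push ax
    push bx

.loop:
    lodsb                   ; al = byte at ds:si, then si is incremented
    or al, al               ; sets the zero flag if al is 0
    jz .done                ; null character: leave the loop

    mov ah, 0x0E            ; BIOS video service: write text in teletype mode
    mov bh, 0               ; page number 0
    int 0x10                ; call the BIOS video interrupt

    jmp .loop               ; the jump that was forgotten in the recording

.done:
    pop bx                  ; restore registers in reverse order
    pop ax
    pop si
    ret

main:
    mov ax, 0               ; segment registers can't take constants directly
    mov ds, ax
    mov es, ax

    mov ss, ax              ; stack segment 0
    mov sp, 0x7C00          ; stack grows downwards from the start of our code

    mov si, msg_hello       ; ds:si = address of the string
    call puts

    hlt

.halt:
    jmp .halt               ; stay here if the CPU ever resumes

msg_hello: db 'Hello world!', ENDL, 0

times 510-($-$$) db 0       ; pad up to byte 510
dw 0AA55h                   ; boot signature
</code>

The order of the ''pop'' instructions mirrors the ''push'' instructions because the stack is last in, first out; swapping them would restore the wrong values into the wrong registers.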
  
-REFORMATTED BY CHATGPT:
+Now that we know how the BIOS loads our operating system, here's what we need to do: we will write some code, assemble it, and then we will put it in the first sector of a floppy disk. We also need to somehow add that signature that the BIOS requires, after which we can test our operating system. So, let's begin coding...
-Introduction+
  
-Hello and welcome! In this tutorial, I'll show you what it takes to build an operating system from scratch. So let's jump straight into it. First, let's talk about the tools we will need.
+===== Writing the assembly code =====
  
-For editing text, any text editor will do. I will be using the Micro editor, which I really like because it uses common keyboard shortcuts. However, you can use any text editor you prefer.
+We know that the BIOS always puts our operating system at address 0x7C00, so the first thing we need to do is give our assembler this information. This is done using the //''org''// directive, which tells the assembler to calculate all memory offsets starting at 0x7C00.
  
-We will use Make to build our project. NASM will be our assembler, which we will use to assemble our assembly code. Additionally, we will need some virtualization software. I will be using QEMU, but you can use any virtualization software you prefer, such as VirtualBox, VMware, or others.
+<code asm>
 +org 0x7C00 
 +</code>
  
-I will be using Ubuntu in this tutorial. If you want to follow along on Windows, your best option would be to use the Windows Subsystem for Linux. You can find a tutorial on how to set that up in the video description. Another option for Windows would be Cygwin, which has most of the tools we need. On macOS, you shouldn't have any trouble finding these tools using Homebrew.
+Note: changing this line to another number won't make the BIOS load our code at a different address! It will only tell the assembler that the variables and labels from our code should be calculated with the offset 0x7C00.
  
-Assembly
+Before we continue, I need to explain the difference between a **//directive//** and an **//instruction//**. A **//directive//** is a way of giving the assembler a clue about how to interpret our code. While an instruction is translated into a machine code instruction, a directive won't get translated; it is only giving a clue to the assembler.
  
-Let's create a source folder in our project and inside, we'll make a file called main.asm. The first part of our operating system will be written in a programming language called assembly. Later, we will be using C, but right now, we don't have a choice and we have to use assembly.
+Next, we need to tell our assembler to emit 16-bit code. As I mentioned before, any x86 CPU must be backwards compatible with the original 8086 CPU. So, if an operating system that was designed for the 8086 is run on a modern CPU, it still needs to think that it's running on an 8086. Because of this, the CPU always starts in 16-bit mode. //''bits''// is also a directive which tells the assembler to emit 16-bit code.
  
-So, what exactly is assembly? Assembly language is the human-readable interpretation of machine code. When you compile a program in a higher-level programming language like C or C++, it gets converted into machine code, which is the language that the processor understands. Assembly makes it easier for us humans to read and write machine code.
+<code asm>
 +bits 16 
 +</code>
  
-Assembly instructions are made up of a mnemonic or a keyword and the number of parameters, called operands. It's important to note that there are differences between assembly languages for different processors. For example, x86 processors in laptops and desktops have different instructions compared to ARM CPUs in smartphones or tablets. However, newer processors maintain backwards compatibility to ensure that all programs can still run without modification.
+Note: Writing ''bits 32'' won't make the processor run in 32-bit mode! It is only a directive which tells the assembler to emit 32-bit code.
  
-The x86 Boot process
+Now, I'll define the ''main'' label to mark where our code begins. For now, we just want to know that the BIOS loads our operating system correctly, so I'll only write a ''hlt'' (halt) instruction which halts (stops) the processor. In certain cases, the CPU can start executing again, so I'll just create another ''.halt'' label, and then jump to it. This way, if the CPU ever starts again, it will be stuck in an infinite loop. It's not a good idea to allow the processor to continue executing beyond the end of our program.
  
-Now, let's understand how the BIOS loads an operating system. The BIOS can load an operating system using two methodslegacy booting and EFI. In legacy booting, the BIOS loads the first sector from each boot device into memory until it finds a certain signature. Once it finds the signature, it jumps to the first instruction in the loaded block, which is where our operating system starts. In EFI mode, the BIOS looks for a certain EFI partition on each device, which contains special EFI programs.+<code asm> 
 +main: 
 +    hlt
  
-Knowing how the BIOS loads our operating system, we need to write some code, assemble it, and put it in the first sector of a floppy disk. We also need to add the signature that the BIOS requires. Once we've done that, we can test our operating system.+.halt: 
 +    jmp .halt 
 +</code>
  
-To get started, we'll inform our assembler about the memory offset by using the "org" directive. In our case, the BIOS always puts our operating system at address 7000. We also need to emit 16-bit code since x86 CPUs are backwards compatible with the original 8086 CPU.+Our program is almost done. All that's left to do is add that signature that the BIOS requires. The BIOS expects that the last two bytes of the first sector are 0xAA55 (stored in memory as the byte 0x55 followed by 0xAA). We will be putting our program on a standard 1.44 MB floppy disk image, where one sector has 512 bytes. We can ask //nasm// to emit bytes directly by using the **''db''** directive, which stands for "declare constant byte". The ''times'' directive can be used to repeat instructions or data. Here, we use it to pad our program so that it fills up to 510 bytes, after which we declare the two byte signature. In nasm, the **''$''** symbol can be used to obtain the assembly position of the beginning of the current line, and the **''$$''** sign gives us the position of the beginning of the current section. In our case, **''$-$$''** will give us the length of the program so far, measured in bytes. Finally, we declare the signature. **''dw''** is a directive similar to **''db''**, but it declares a two byte constant, which is generally referred to as a "word".
  
-Next, we define the main labels to mark where our code begins. For now, we'll simply use a "halt" instruction to hold the processor. We want to create an infinite loop, so we'll create a "loop" label and jump to it.+<code asm> 
 +times 510-($-$$) db 0 
 +dw 0AA55h 
 +</code>
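The layout produced by the padding and the signature can be checked with a small C sketch. This is only an illustration of the sector layout described above, not part of the tutorial's code; the names ''build_boot_sector'' and ''has_boot_signature'' are hypothetical.

```c
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Fill a 512-byte buffer the way `times 510-($-$$) db 0` followed by
 * `dw 0AA55h` would: code first, zero padding up to byte 510, then the
 * signature word stored little-endian (low byte first).
 * Returns 0 on success, -1 if the code does not fit in 510 bytes. */
int build_boot_sector(uint8_t sector[SECTOR_SIZE],
                      const uint8_t *code, size_t code_len)
{
    if (code_len > SECTOR_SIZE - 2)
        return -1;
    memset(sector, 0, SECTOR_SIZE);
    memcpy(sector, code, code_len);
    sector[510] = 0x55;   /* low byte of 0xAA55 */
    sector[511] = 0xAA;   /* high byte of 0xAA55 */
    return 0;
}

/* The signature check described above: last two bytes must be 55 AA. */
int has_boot_signature(const uint8_t sector[SECTOR_SIZE])
{
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

Note that the word 0xAA55 ends up on disk as 0x55 followed by 0xAA, because x86 stores multi-byte values low byte first.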
  
-To meet the BIOS requirements, we need to add the signature at the end of the first sector. We can use the "DB" directive to emit bytes directly and the "times" directive to pad our program to fill up to 510 bytes. Finally, we declare the two-byte signature using the "DW" directive.+With this, we have successfully written our first operating system! So far, it doesn't really do anything but stop the processor. Let's test if it works!
  
-With this, we have successfully written our first operating system. So far, it doesn't really do anything but stop the processor. Let's test it.+===== Writing the makefile =====
  
-I created a "build" directory to keep things organized for building the project. I created a make file and added a rule to build the "main.asm" code using NASM and output it in a binary format.+I created a **''build''** directory to keep things organized. For building the project, I will create a **''Makefile''**. I added a rule to build the **''main.asm''** code using nasm, and output it in a binary format. I also added a rule to build the disk image, where I simply take the binary file previously built, and pad it with zeros (with the ''truncate'' command) until it has 1.44 megabytes.
  
-I also added a rule to build "Kimmage" where I simply took the binary file previously built and padded it with zeros until it has 1.44 megabytes.+<code make Makefile> 
 +ASM=nasm
  
-Finally, we can test our little operating system. You can use any virtualization software you want, such as VirtualBox or VMWare. I use QEMU because it's really easy to set up and it can be used from the command line.+SRC_DIR=src 
 +BUILD_DIR=build
  
-As you can see, the system boots from the floppy and then it does nothing, exactly as we expectedSo far, our operating system does nothing, and it does it perfectly.+$(BUILD_DIR)/main_floppy.img: $(BUILD_DIR)/main.bin 
 +    cp $(BUILD_DIR)/main.bin $(BUILD_DIR)/main_floppy.img 
 +    truncate -s 1440k $(BUILD_DIR)/main_floppy.img 
 +     
 +$(BUILD_DIR)/main.bin: $(SRC_DIR)/main.asm 
 +    $(ASM) $(SRC_DIR)/main.asm -f bin -o $(BUILD_DIR)/main.bin 
 +</code>
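The ''1440k'' size passed to ''truncate'' matches the geometry of a standard 3.5" high-density floppy. A minimal C sketch of the arithmetic, assuming the ''k'' suffix means 1024 bytes (as in GNU ''truncate''); the constant and function names are made up for illustration:

```c
#include <stdint.h>

enum {
    BYTES_PER_SECTOR  = 512,
    SECTORS_PER_TRACK = 18,
    HEADS             = 2,
    CYLINDERS         = 80
};

/* Total capacity of a standard 3.5" high-density floppy, in bytes. */
uint32_t floppy_size_bytes(void)
{
    return (uint32_t)CYLINDERS * HEADS * SECTORS_PER_TRACK * BYTES_PER_SECTOR;
}
```

So the "1.44 MB" image is really 1440 KB, or 2880 sectors of 512 bytes each.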
  
-Now that we know it works, let's go back to the code and print a "Hello, World!" message to the screen.+===== The first test =====
  
-Before I start explaining how you can do that, I need to explain some basic concepts about the x86 architecture.+Finally, we can test our little operating system. You can use any virtualization software you want, such as VirtualBox, VMWare etc. I use ''qemu'' because it's really easy to set up, and it can be used from the command line.
  
-x86 CPU Registers:+<code bash> 
 +$ qemu-system-i386 -fda build/main_floppy.img 
 +</code>
  
-All processors have a number of registers, which are really small pieces of memory that can be written and read very fast and are built into the CPU. Here's a diagram of the general-purpose registers on an x86 CPU:+As you can see, the system boots from floppy, and then it does nothing, exactly as we expected! So far, our operating system does nothing, and does it perfectly!!!
  
-todo...+{{ :transcripts:pasted:20230904-191922.png?600 }}
  
-There are several types of registers. The general-purpose registers can be used for almost any purpose. The index registers are usually used for keeping indices and pointers, but they can also be used for other purposes. The program counter (IP) is a special register that keeps track of which memory location the current instruction begins.+===== Hello world =====
  
-The segment registers are used to keep track of the currently active memory segments, which I will explain in just a moment. There is also a Flags register, which contains some special flags that are set by various instructions.+Now that we know it works, let's go back to the code and print a "Hello world" message to the screen. Before I start explaining how you can do that, I need to explain some basic concepts about the x86 architecture.
  
-There are a few more special-purpose registers, but we will only introduce them when we need them.+==== CPU registers ====
  
-Now we'll talk a bit about RAM memory.+All processors have a number of registers, which are really small pieces of memory that can be written and read very fast, and are built into the CPU. Here is a diagram of all the registers on an x86 CPU:
  
-Memory Segmentation:+{{ osdev:media:table_of_x86_registers.svg?1000 |Table of x86 registers. By Immae - Own work, CC BY-SA 3.0, https://commons.wikimedia.org/w/index.php?curid=32745525}}
  
-The 8086 CPU had a 20-bit address bus. This meant that you could access up to 2^20, which means about one megabyte of memory. At the time, typical computers had around 64 to 128 kilobytes of memory.+There are several types of registers:
  
-So, the engineers at Intel thought this limit was huge. For various reasons, they decided to use a segment and offset addressing scheme for memory in this scheme. You address memory by using two 16-bit values: the segment and the offset.+  * the general-purpose registers can be used for almost any purpose (RAX, RBX, RCX, RDX, R8-R15, including their smaller counterparts EAX, AX, AL, AH etc.) 
 +  * the index registers (RSI, RDI) are usually used for keeping indices and pointers; they can also be used for other purposes 
 +  * the program counter (RIP) is a special register which keeps track of which memory location the current instruction begins at 
 +  * the segment registers (CS, DS, ES, FS, GS, SS) are used to keep track of the currently active memory segments (which I will explain in just a moment) 
 +  * there is also a flags register (RFLAGS) which contains some special flags set by various instructions 
 +  * there are a few more special purpose registers, but I will only introduce them when we need them
  
-Each segment contains 64 kilobytes of memory, where each byte can be accessed using the offset value. Segments overlap every 16 bytes. This means that you can convert a segment-offset address to an absolute address by shifting the segment four bits to the left (or multiplying it by 16) and then adding the offset.+==== Real memory model ====
  
-This also means that there are multiple ways of addressing the same location in memory. For example, the absolute address 7000, which is where the BIOS loads our operating system, can be written in any combination shown on the screen.+Now let's talk a bit about RAM. The 8086 CPU had a 20-bit address bus, which meant that you could access up to 2<sup>20</sup> bytes, or 1 MB, of memory. At the time, typical computers had around 64 to 128 KB, so the engineers at Intel thought this limit was huge. For various reasons, they decided to use a //segment and offset addressing scheme// for addressing memory.
  
-There are some special registers which are used to specify the actively used segments. CS contains the code segment, which is the segment the processor executes code from. The IP (or program counter) register only gives us the offset. The DS and ES registers are data segments. Newer processors introduced additional data segmentsFS and GS. SS contains the current stack register.+<code -> 
 +               0x1234:0x5678 
 +              segment:offset 
 +</code>
  
-In order to access the memory outside one of these active segments, we need to load that segment into one of these registers. The code segment can only be modified by performing a jump.+In this scheme, you use two 16-bit values, the **//segment//** and the **//offset//**. Each segment contains 64 KB of memory, where each byte can be accessed by using the offset value. Segments overlap every 16 bytes.
  
-Referencing Memory Location:+{{ :transcripts:pasted:20230904-195813.png?500 }} This means that you can convert a segment:offset address to an absolute address by shifting the segment four bits to the left (or multiplying it by 16), and then adding the offset.
  
-To reference a memory location in assembly, you use this syntax: a segment register followed by a colon followed by an expression which gives the offset, put between square brackets. The segment register can be omitted, in which case the DS register will be used.+<code c> 
 +linear_address = (segment << 4) + offset;
 +// or 
 +linear_address = segment * 16 + offset; 
 +</code>
  
-The processor is capable of doing some arithmetic for us as long as we use this expression. The base and index operands can be any general-purpose processor registers. In 16-bit mode, there are a few limitations, however. Only BP and BX can be used as base registers, and only SI and DI can be used as index registers. These limitations exist because of how the 8086 CPU was originally designed. They had to put such limitations to keep the complexity down.+This also means that there are multiple ways of addressing the same location in memory. For example, the absolute address 0x7C00 (where the BIOS loads our operating system) can be written as any of the following combinations:
  
-Another example of one such limitation is that we can't write constants to the segment registers directly. We have to use an intermediary register.+<code -> 
 +segment:offset     linear_address 
 + 0x0000:0x7C00         0x7C00 
 + 0x0001:0x7BF0         0x7C00 
 + 0x0010:0x7B00         0x7C00 
 + 0x00C0:0x7000         0x7C00 
 + 0x07C0:0x0000         0x7C00 
 +</code>
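The rows of this table can be verified with a short C sketch of the conversion formula. The function name ''linear_address'' is only an illustrative choice:

```c
#include <stdint.h>

/* Real-mode address translation: linear = segment * 16 + offset.
 * The result is kept in 32 bits because it needs more than 16 bits. */
uint32_t linear_address(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}
```

Calling it with any of the segment:offset pairs listed above gives 0x7C00. The parentheses around the shift matter in C, because ''+'' binds more tightly than ''<<''.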
  
-With the introduction of the 386 processor just a few years later, 32-bit mode was introduced, which pretty much rendered 16-bit mode obsolete. A lot of newer CPU features were simply not added to 16-bit mode because it is absolute, and it only exists for backward compatibility. It is still useful to learn because most of the things that apply to 16-bit mode apply to 32-bit or 64-bit mode, and it is much simpler. Its main use today is in the startup sequence. Most operating systems switch to 32-bit or 64-bit mode immediately after starting up. We will do the same thing in a future video, but we can't just yet. For now, we are limited to the first sector of a floppy disk, which is 512 bytes, which is very little space.+There are some special registers which are used to specify the actively used segments:
  
-Once we are able to load a boot loader from the floppy disk, we can do a lot more. All operating systems have to do the same thing in order to boot, but until we get there, let's get back to referencing our memory locations.+  * **''CS''** contains the code segment, which is the segment the processor executes code from. The **''IP''** register (the program counter) only gives us the offset! 
 +  * **''DS''** and **''ES''** are data segments. Newer processors introduced additional data segments, **''FS''** and **''GS''**. 
 +  * **''SS''** contains the current stack segment
  
-So, I already talked about the base and index operands. The scale and displacement operands are numerical constants. The scale can only be used in 32and 64-bit modes, and it can only have a value of 1, 2, 4or 8. The displacement can be any signed integer constant.+In order to access (read or write) any memory location, its segment needs to be loaded into one of these registers, by setting the corresponding register. The code segment can only be modified by performing a jump. 
 + 
 +Now, how do we reference a memory location from assembly? We use this syntax: 
 + 
 +<code asm> 
 +[segment : base + index * scale + displacement] 
 +</code> 
 + 
 +Where: 
 + 
 +  * segment: one of CS, DS, ES, FS, GS, SS. Default: DS (SS if BP is used as base) 
 +  * base 
 +    * 16-bit: BP or BX 
 +    * 32/64-bit: any general purpose register 
 +  * index: 
 +    * 16-bit: SI or DI 
 +    * 32/64-bit: any general purpose register 
 +  * scale (32/64-bit only): 1, 2, 4 or 8 
 +  * displacement: a signed constant number 
 + 
 +The processor is capable of doing some arithmetic for us, as long as we use this expression. 
 + 
 +In 16-bit mode, there are a few limitations because that's how the 8086 CPU was originally designed. This was probably done to keep the complexity and cost down. Another example of one such limitation is that we can't write constants to the segment registers directly; we have to use an intermediary register. With the introduction of the 386 processor just a few years later, 32-bit mode was introduced, which pretty much rendered 16-bit mode obsolete. A lot of newer CPU features were simply not added to 16-bit mode, because it is obsolete and only exists for backwards compatibility. However, it is still useful to learn, because most of the things that apply to 16-bit mode also apply to 32-bit and 64-bit modes. The main use of 16-bit mode today is in the startup sequence; most operating systems switch to 32 or 64-bit mode immediately after starting up. We will do the same thing in a future video, but we can't just yet, as we are limited to the first sector of a floppy disk (512 bytes), which is very little space. Once we are able to load a boot loader from the disk, we can do a lot more. 
 + 
 +All operating systems have to do the same thing in order to boot, but until we get there, let's get back to referencing our memory locations. So, I already talked about the base and index operands. The scale and displacement operands are numerical constants; the scale can only be used in 32 and 64-bit modes, and it can only have a value of 1, 2, 4 or 8. The displacement can be any signed integer constant.
  
 All the operands in a memory reference expression are optional, so you only have to use whatever you need. All the operands in a memory reference expression are optional, so you only have to use whatever you need.
  
-Here's an example: First, I defined a label which points to a word having the value 100. The first instruction puts the offset of the label into the AX register. The second instruction puts the memory contents that our label points to into the AX register.+=== Examples === 
 + 
 +== Example 1== 
 + 
 +<code asm> 
 +var: dw 100 
 + 
 +    mov ax, var     ; copy offset to ax 
 +    mov ax, [var]   ; copy memory contents of ds:var to ax 
 +</code> 
 + 
 +First, I defined a label which points to a word having the value ''100.'' 
 + 
 +The first instruction ''mov ax, var'' puts the offset of the label into the ax register. 
 + 
 +The second instruction ''mov ax, [var]'' copies the memory contents that our label points to. Since we didn't specify a segment register, DS is going to be used. We haven't used the base, index or scale, but only a constant, which is the offset denoted by the "var" label. In assembly, labels are simply constants which point to specific memory offsets. 
 + 
 +== Example 2: == 
 + 
 +<code asm> 
 +array: dw 100, 200, 300 
 + 
 +    ; read third element in array 
 +    mov bx, array 
 +    mov si, 2 * 2 
 +    mov ax, [bx + si] 
 +</code> 
 + 
 +Here's a more complicated example, where we want to read the third element in an array. We put the offset of the array into BX, and the index of the third element in SI. Since we use zero-based indexing, the third element is at ''array[2]''; each element in the array is a word, which is 2 bytes wide, so we put in SI the value 4. 
 + 
 +Note: You can see here that we use the multiplication symbol. The assembler is capable of calculating the result of constant expressions, and putting the result in the resulting machine code. However, you can't write ''mov bx, ax * 2''. ''AX'' is not known at compile time, so it is not a constant. To perform this multiplication, you have to use the ''MUL'' (multiply) instruction. Referencing memory is the only place where you can put registers in an expression! 
 + 
 +Finally, we put into AX the third element in the array, by referencing the memory location at BX + SI. BX is our base register, and SI is our index register. 
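To make the address arithmetic concrete, here is a C sketch that models the offset calculation of a memory reference and the word read from Example 2. It is a simplified model under stated assumptions (a flat byte array stands in for the DS segment), and the names ''effective_offset'' and ''read_word'' are hypothetical:

```c
#include <stdint.h>

/* Offset part of [base + index * scale + displacement].
 * In 16-bit mode the sum wraps around at 64 KB and scale is always 1. */
uint16_t effective_offset(uint16_t base, uint16_t index,
                          uint16_t scale, int16_t displacement)
{
    return (uint16_t)(base + index * scale + displacement);
}

/* Read a little-endian 16-bit word from a byte-addressed memory model. */
uint16_t read_word(const uint8_t *memory, uint16_t offset)
{
    return (uint16_t)(memory[offset] | (memory[offset + 1] << 8));
}
```

With the array stored at some offset in ''memory'', ''read_word(memory, effective_offset(array, 2 * 2, 1, 0))'' corresponds to ''mov ax, [bx + si]'' with BX holding the array offset and SI holding 4.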
 + 
 +==== Back to the OS - the initialization ==== 
 + 
 +Back to our operating system, the code segment register has been set up for us by the BIOS and it points to segment 0. There are some BIOSes out there which actually jump to our code using a different segment and offset, such as 0x07C0:0x0000, but the standard behavior is to use 0x0000:0x7C00. We don't know if DS and ES are properly initialized, so this is what we have to do next. Since we can't write a constant directly to a segment register, we have to use an intermediary register; we will use AX. The MOV (move) instruction copies data from the source on the right side to the destination on the left side. 
 + 
 +<code asm> 
 +main: 
 +    ; setup data segments 
 +    mov ax, 0           ; can't set ds/es directly 
 +    mov ds, ax 
 +    mov es, ax 
 +     
 +    ; setup stack 
 +    mov ss, ax 
 +    mov sp, 0x7C00      ; stack grows downwards from where we are loaded in memory 
 +</code> 
 + 
 +We also set up the stack segment (SS) to 0, and the stack pointer (SP) to the beginning of our program. So what exactly is this stack? 
 + 
 +The stack is a piece of memory that we can access in a "first in last out" manner, using the PUSH and POP instructions. The stack also has a special purpose when using functions. When you call a function, the return address is added to the stack, and when you return from a function, the processor will read the return address from the stack and then jump to it. 
 + 
 +Another thing to note about the stack is that it grows downwards! SP points to the top of the stack. When you push something, SP is decremented by the number of bytes pushed, and then the data is written to memory. This is why we set up the stack to point to the start of our operating system: because it grows downwards. If we set it up to the end of our program, it would overwrite our program. We don't want that, so we just put it somewhere where it won't overwrite anything. The beginning of our operating system is a pretty safe spot. 
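The downward growth can be modeled in a few lines of C. This is a hypothetical model of one 64 KB segment, not how the CPU is implemented; ''Machine'', ''push16'' and ''pop16'' are names chosen for this sketch:

```c
#include <stdint.h>

/* Hypothetical model of a real-mode stack living in segment 0. */
typedef struct {
    uint8_t  memory[0x10000]; /* one 64 KB segment */
    uint16_t sp;              /* stack pointer, an offset into memory */
} Machine;

/* PUSH: decrement SP by 2 first, then store the word at the new SP. */
void push16(Machine *m, uint16_t value)
{
    m->sp = (uint16_t)(m->sp - 2);
    m->memory[m->sp] = (uint8_t)(value & 0xFF);
    m->memory[(uint16_t)(m->sp + 1)] = (uint8_t)(value >> 8);
}

/* POP: load the word at SP, then increment SP by 2. */
uint16_t pop16(Machine *m)
{
    uint16_t value = (uint16_t)(m->memory[m->sp] |
                                (m->memory[(uint16_t)(m->sp + 1)] << 8));
    m->sp = (uint16_t)(m->sp + 2);
    return value;
}
```

Starting with ''sp = 0x7C00'', the first push writes to 0x7BFE and 0x7BFF, just below the program, so the byte at 0x7C00 (the first byte of our code) is never touched.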
 + 
 +Now we'll start coding a ''puts'' function which prints a string to the screen. 
 + 
 +Note: Always document your assembly functions! 
 + 
 +<code asm> 
 +start: 
 +    jmp main 
 + 
 +
 +; Prints a string to the screen 
 +; Params: 
 +;   - ds:si points to string 
 +
 +puts: 
 + 
 +    ; ....... 
 + 
 + 
 +main: 
 +</code> 
 + 
 +Our function will receive a pointer to a string in ''DS:SI'' and it will print characters until it encounters a null character. Because I decided to write the function above ''main'', I have to add a jump instruction above, so ''main ''is still the entry point to our program. 
 + 
 +First, we push the registers that we're going to modify to the stack, after which we enter the main loop. 
 + 
 +<code asm> 
 +puts: 
 +    ; save registers we will modify 
 +    push si 
 +    push ax 
 +    push bx 
 + 
 +.loop: 
 +    lodsb               ; loads next character in al 
 +</code> 
 + 
 +The ''lodsb'' (load string byte) instruction loads a byte from the address ''DS:SI'' into the AL register, and then increments ''SI.'' 
 + 
 +Next, I wrote the loop exit condition; the ''or'' instruction performs a bit-wise "or" and stores the result in the left operand, in this case ''AL''. OR-ing a value with itself won't modify the value at all; what it will modify is the ''FLAGS'' register. If the result is 0, the "zero" flag (ZF) will be set. 
 + 
 +<code asm> 
 +    or al, al           ; verify if next character is null? 
 +    jz .done            ; exit condition 
 + 
 +    ; todo ..... 
 + 
 +    jmp .loop 
 + 
 +.done: 
 +    pop bx 
 +    pop ax 
 +    pop si     
 +    ret 
 +</code> 
 + 
 +The next instruction, ''JZ'', is a conditional jump which will jump to the ''.done'' label if the zero flag is set. So, essentially, if the next character is ''null'', we jump outside the loop. 
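For readers more familiar with C, the control flow of the loop can be sketched as follows. This is only a model of the logic: the callback type ''put_char_fn'' stands in for whatever prints a character, and ''Capture'' and ''capture_char'' are hypothetical helpers used to observe the output.

```c
#include <stddef.h>

/* Hypothetical stand-in for the routine that prints one character. */
typedef void (*put_char_fn)(char c, void *context);

/* Walks the string the way the assembly loop does: load a byte and
 * advance the pointer (lodsb), stop on the null byte (or al, al and
 * jz .done), otherwise print it and loop again.
 * Returns the number of characters printed. */
size_t puts_model(const char *si, put_char_fn put_char, void *context)
{
    size_t printed = 0;
    for (;;) {
        char al = *si++;        /* lodsb */
        if (al == 0)            /* or al, al ; jz .done */
            break;
        put_char(al, context);  /* the part still marked "todo" */
        printed++;
    }
    return printed;
}

/* Example callback: appends each character to a fixed-size buffer. */
typedef struct {
    char   text[64];
    size_t length;
} Capture;

void capture_char(char c, void *context)
{
    Capture *cap = context;
    if (cap->length < sizeof cap->text - 1) {
        cap->text[cap->length++] = c;
        cap->text[cap->length] = '\0';
    }
}
```

The null byte itself is never printed; it only ends the loop, which is why the string has to be terminated with a 0.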
 + 
 +After exiting the loop, we pop the registers we previously pushed in reverse order, and then we'll return from this function. So far, our function takes a string, iterates every character until it encounters the ''null'' character, and then exits. What's left to do is to print the character to the screen. The way we can do that is by using the BIOS. As the name suggests, the **//BIOS //**or the **//Basic Input/Output System//** does more than just start the computer. It  also provides some very basic functions, which allow us to do some very basic stuff, such as writing text to the screen. So, how exactly do we call the BIOS to print the character for us? The answer is that we use **//interrupts//**. 
 + 
 +===== Interrupts ===== 
 + 
 +An interrupt is a signal which makes the processor stop whatever it is doing to handle that event. There are 3 possible ways of triggering an interrupt: 
 + 
 +  - Through **//an exception//**; an exception is generated by the CPU if a critical error is encountered, and it cannot continue executing. For example, dividing by zero will trigger an interrupt. Operating systems can use these interrupts to stop the misbehaving process, or to attempt to restore it to working order. 
 +  - **//Hardware//** can also trigger interrupts. For example, when a key is pressed on the keyboard, or when the disk controller finishes performing an asynchronous read. 
 +  - From code, through the **//INT instruction//**. Interrupts are numbered from 0 to 255, so the ''INT'' instruction requires a parameter indicating the interrupt number to trigger. 
 + 
 +The BIOS installed some interrupt handlers for us, so that we can use its functionality. Typically, the BIOS reserves an interrupt number for a category of functions, and the value in the ''AH ''register is used to choose between the available functions in that category. 
 + 
 +<code -> 
 +Examples of BIOS interrupts: 
 + 
 +INT 10h -- Video 
 +INT 11h -- Equipment check 
 +INT 12h -- Memory size 
 +INT 13h -- Disk I/O 
 +INT 14h -- Serial communication 
 +INT 15h -- Cassette 
 +INT 16h -- Keyboard 
 + 
 +............ 
 +</code> 
 + 
 +To print text to the screen, we will need to call [[https://en.wikipedia.org/wiki/INT_10H|interrupt 10h]], which contains the video services category. By setting ''AH'' to ''0Eh'', we will call the "write text in teletype mode" function. Here's a detailed description of this function: 
 + 
 +<code -> 
 +VIDEO - TELETYPE OUTPUT 
 + 
 +AH = 0Eh 
 +AL = character to write 
 +BH = page number 
 +BL = foreground color (graphics modes only) 
 + 
 +Return: 
 +Nothing 
 + 
 +Desc: Display a character on the screen, advancing the cursor and scrolling the screen as necessary
  
-We didn't specify a segment register, so DS is going to be used. We haven't used the base, index, or scale, but only a constant, which is the offset the label points to. In assembly, labels are simply constants that point to a specific offset. Here's a more complicated example where we want to read the third element in an array:+Notes: Characters 07h (BEL), 08h (BS), 0Ah (LF), and 0Dh (CR) are interpreted and do the expected things. 
 +IBM PC ROMs dated 1981/4/24 and 1981/10/19 require that BH be the same as the current active page
  
-In this example, we put the offset of the array into BX and the index of the third element in SI. Since we use zero-based indexing, the third element is at an offset of 2. Each element in the array is a word (two bytes), so we put the value 4 in SI. You can see here that we use the multiplication symbol. The assembler is capable of calculating the result of constant expressions and putting the result in the resulting machine code. However, you can't try to move BX, AX times 2 because AX is not known at compile time, so it is not a constant. For that, you have to use the multiply instruction.+BUG: If the write causes the screen to scroll, BP is destroyed by BIOSes for which AH=06h destroys BP 
  
-Referencing memory is the only place where you can put registers in an expressionFinally, we put the third element in the array into AX by referencing the memory location at BX+SIBX is our base register, and SI is our index register.+Source: http://www.ctyme.com/intr/rb-0106.htm 
 +</code>
  
-Now back to our operating system. The code segment register has been set up for us by the BIOS, and it points to segment 0. There are some BIOSes out there which actually jump to our code using a different segment and offset, such as segment 7C0, offset 0, but the standard behavior is to use segment 0, offset 7000. We don't know if the data segment and extra segment are properly initialized, so this is what we have to do next.+What we need to do in order to call this function is to set:
  
-Since we can't write a constant directly to a segment register, we have to use an intermediary register. We will use AX. The move instruction copies a value from the source on the left side to the destination on the right side. We also set up the stack segment to and the stack pointer to the beginning of our program.+  * AH to 0Eh 
 +  * AL to the ASCII character that we want to print 
 +  * BH to the page number (which is 0) 
 +  * BL (the foreground color) is only used in graphics mode, so we can ignore it because we're currently running in text mode.
  
-Now, let's start coding a function that prints a string to the screen. Always document your assembly functions. So, our function will receive a pointer to a string in DS:SI, and it will print characters until it encounters a null character.+<code asm> 
 +    mov ah, 0x0E        ; call bios interrupt 
 +    ; al is already set by lodsb 
 +    mov bh, 0           ; set page number to 0 
 +    int 0x10 
 +</code>
  
-Because I decided to write the function above main, I have to add a jump instruction above so that main is still the entry point to our program. First, I push the registers that I'm going to modify to the stack, after which we enter the main loop.+Finally, let's add a string containing "Hello world", followed by a new line. To add a new line, you need to print both the line feed and the carriage return characters. I created an awesome macro so that I don't have to remember the hex codes for these characters every time. To declare the string we use the DB directive.
  
-The load SB instruction loads the byte at the address DS:SI into the AL register and then increments SI. Next, I wrote the loop exit condition. The OR instruction performs a bitwise OR and stores the result in the left-hand side operand. In this case, AL ORing a value to itself will modify the value at ALbut what it will modify is the flags register. If the result is zerothe zero flag will be set.+<code asm> 
 +%define ENDL 0x0D, 0x0A 
 +msg_hello: db 'Hello world!', ENDL, 0
 +</code>
  
-The next instruction is a conditional jump, which jumps to the label "down" if the zero flag is set. Essentially, if the next character is null, we jump outside the loop. There's something I forgot when I recorded the video, an OR jump instruction to the "loop" label, so that the code will loop.+All that's left to do is to set DS:SI to the address of the string, and then call ''puts''.
  
-After exiting the loopwe pop the registers we previously pushed in reverse order, and then we return from this function. So far, our function takes a string, iterates through every character until it encounters the null character, and then exits.+<code asm> 
 +    ; print hello world message 
 +    mov si, msg_hello 
 +    call puts 
 +</code>
  
-What's left to do is to print the character to the screen. We can do that by using the BIOS. As the name suggests, the BIOS (Basic Input/Output System) does more than just start a system; it also provides some very basic functions that allow us to do some basic stuff, such as writing text to the screen.+Let's now test our program:
  
-So, how exactly do we call the BIOS to print the character for us? The answer is that we use interruptsAn interrupt is basically a signal that makes the processor stop whatever it is doing to handle that event. There are three possible ways of triggering an interrupt:+<code bash> 
 +$ make 
 +$ qemu-system-i386 -fda build/main_floppy.img 
 +</code>
  
-Through an exception: An exception is generated by the CPU if a critical error is encountered and it cannot continue executing. For example, dividing by zero will trigger an interrupt. Operating systems can use these interrupts to stop misbehaving processes or attempt to restore them to working order. +And the result:
-Hardware interruptsHardware can also trigger interrupts. For example, when a key is pressed on the keyboard or when the disk controller finishes performing an asynchronous read. +
-Using the INT instruction: Interrupts are numbered from 0 to 255, so the INT instruction requires a parameter indicating the interrupt number to trigger.+
  
-The BIOS installs some interrupt handlers for us so that we can use its functionality. Examples of BIOS interrupts typically reserve an interrupt number for a category of functions, and the value in the AH register is used to choose between the available functions in that category.+{{ :transcripts:pasted:20230909-161027.png?600 }}
  
-To print text to the screen, we will need to call interrupt 10h (hexadecimal), which contains the video services category. By setting AH to 0Eh (hexadecimal), we will call the "Write Text in Teletype Mode" function. Here's a detailed description of this function.+===== Conclusion =====
  
-To call this function, we need to set the AH register to 0Eh (hexadecimal), AL to the ASCII character that we want to print, and BH to the page number. The BH parameter is only used in graphics mode, so we can ignore it because we're currently running in text mode. When I recorded the video, I forgot to set the page number to zero.+Great! So, we have successfully written a tiny operating system which can print text to the screen! This was a lot of work, and we learned a lot of new stuff about how computers work. We'll continue the next time when we will improve our assembly skills and learn some new stuff, by extending our operating system to print numbers to the screen. After that, we will get into the complex task of loading stuff from the disk.
  
-After that, we call interrupt 10h (hexadecimal). Finally, let's add a string containing the text "Hello, World!" followed by a new line.+Thank you for watching and see you the next time! Bye bye!
  
-To add a new line, you need to print both the line feed and carriage return characters. I created an ASCII macro so that I don't have to remember the hex codes for these characters every time. To declare a string, we use the DB directive, which conveniently allows us to write as many characters as we want. 
  
-All that's left to do is to set DS:SI to the address of the string and then call our print function. Now, let's test our program!