transcripts:building-an-os-1-hello-world [2023/09/04 20:47] – Tiberiu Chibici | transcripts:building-an-os-1-hello-world [2023/09/09 16:57] (current) – [Building an OS - 1 - Hello world] Tiberiu Chibici |
---|
====== Building an OS - 1 - Hello world ====== | ====== Building an OS - 1 - Hello world ====== |
| |
>Note: This is a verbatim transcript of the [[https://youtu.be/9t-SPC7Tczc| Building an OS - 1 - Hello world ]] video. If you want to follow the new and improved text tutorial, [[:building-an-os|check here]]. | >Note: This is an almost verbatim transcript of the [[https://youtu.be/9t-SPC7Tczc| Building an OS - 1 - Hello world ]] video (some minor changes have been made where it would make things more clear). If you want to follow the new and improved text tutorial, [[:building-an-os|check here]]. |
| |
===== Introduction ===== | ===== Introduction ===== |
Finally, we put into AX the third element in the array, by referencing the memory location at BX + SI. BX is our base register, and SI is our index register. | Finally, we put into AX the third element in the array, by referencing the memory location at BX + SI. BX is our base register, and SI is our index register. |
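| |
| The base + index access described above could look roughly like the sketch below. The label ''array'', its contents and the 2-byte element size are assumptions made for illustration, not taken from the video. |
| |
| <code asm> |
| array: dw 100, 200, 300 ; hypothetical array of 16-bit words |
| |
| ; ... |
| |
|     mov bx, array ; bx = offset of the array (base) |
|     mov si, 2 * 2 ; si = index 2 * 2 bytes per element |
|     mov ax, [bx + si] ; ax = third element (300 in this sketch) |
| </code> |
| |
| Note that the index is scaled by hand: the 8086 addressing modes add the registers together as they are, so the element size has to be factored into SI. |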
| |
==== Back to the operating system - the initialization ==== | ==== Back to the OS - the initialization ==== |
| |
Back to our operating system, the code segment register has been set up for us by the BIOS and it points to segment 0. There are some BIOSes out there which actually jump to our code using a different segment and offset, such as 0x07C0:0x0000, but the standard behavior is to use 0x0000:0x7C00. We don't know if DS and ES are properly initialized, so this is what we have to do next. Since we can't write a constant directly to a segment register, we have to use an intermediary register; we will use AX. The MOV (move) instruction copies data from the source on the right side to the destination on the left side. | Back to our operating system, the code segment register has been set up for us by the BIOS and it points to segment 0. There are some BIOSes out there which actually jump to our code using a different segment and offset, such as 0x07C0:0x0000, but the standard behavior is to use 0x0000:0x7C00. We don't know if DS and ES are properly initialized, so this is what we have to do next. Since we can't write a constant directly to a segment register, we have to use an intermediary register; we will use AX. The MOV (move) instruction copies data from the source on the right side to the destination on the left side. |
We also set up the stack segment (SS) to 0, and the stack pointer (SP) to the beginning of our program. So what exactly is this stack? | We also set up the stack segment (SS) to 0, and the stack pointer (SP) to the beginning of our program. So what exactly is this stack? |
| |
The stack is a piece of memory that we can accesse in a "first in last out" manner, using the PUSH and POP instructions. The stack also has a special purpose when using functions. When you call a function, the return address is added to the stack, and when you return from a function, the processor will read the return address from the stack and then jump to it. | The stack is a piece of memory that we can access in a "first in last out" manner, using the PUSH and POP instructions. The stack also has a special purpose when using functions. When you call a function, the return address is added to the stack, and when you return from a function, the processor will read the return address from the stack and then jump to it. |
| |
Another thing to note about the stack is that it grows downwards! SP points to the top of the stack. When you push something, SP is decremented by the number of bytes pushed, and then the data is written to memory. This is why we set up the stack to point to the start of our operating system: because it grows downwards. If we set it up to the end of our program, it would overwrite our program. We don't want that, so we just put it somewhere where it won't overwrite anything. The beginning of our operating system is a pretty safe spot. | Another thing to note about the stack is that it grows downwards! SP points to the top of the stack. When you push something, SP is decremented by the number of bytes pushed, and then the data is written to memory. This is why we set up the stack to point to the start of our operating system: because it grows downwards. If we set it up to the end of our program, it would overwrite our program. We don't want that, so we just put it somewhere where it won't overwrite anything. The beginning of our operating system is a pretty safe spot. |
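| |
| The initialization described above can be sketched as follows. This is a minimal sketch which assumes the program is loaded at 0x0000:0x7C00 (the standard behavior mentioned earlier); the final ''push''/''pop'' pair is only there to illustrate how SP moves. |
| |
| <code asm> |
|     ; set up data segments |
|     mov ax, 0 ; can't write a constant to ds/es directly |
|     mov ds, ax |
|     mov es, ax |
| |
|     ; set up stack |
|     mov ss, ax |
|     mov sp, 0x7C00 ; stack grows downwards from where we are loaded |
| |
|     ; illustration only |
|     push ax ; sp: 0x7C00 -> 0x7BFE, word written at ss:0x7BFE |
|     pop bx ; word read back into bx, sp: 0x7BFE -> 0x7C00 |
| </code> |
| |
| Because the first push writes just below 0x7C00, the stack and our code never overlap. |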
| |
| Now we'll start coding a ''puts'' function which prints a string to the screen. |
| |
| Note: Always document your assembly functions! |
| |
| <code asm> |
| start: |
| jmp main |
| |
| ; |
| ; Prints a string to the screen |
| ; Params: |
| ; - ds:si points to string |
| ; |
| puts: |
| |
| ; ....... |
| |
| |
| main: |
| </code> |
| |
| Our function will receive a pointer to a string in ''DS:SI'' and it will print characters until it encounters a null character. Because I decided to write the function above ''main'', I have to add a jump instruction above it, so ''main'' is still the entry point to our program. |
| |
| First, we push the registers that we're going to modify to the stack, after which we enter the main loop. |
| |
| <code asm> |
| puts: |
| ; save registers we will modify |
| push si |
| push ax |
| push bx |
| |
| .loop: |
| lodsb ; loads next character in al |
| </code> |
| |
| The ''lodsb'' (load string byte) instruction loads a byte from the address ''DS:SI'' into the ''AL'' register, and then increments ''SI''. |
| |
| Next, I wrote the loop exit condition; the ''or'' instruction performs a bit-wise "or" and stores the result in the left operand, in this case ''AL''. OR-ing a value with itself won't modify the value at all; what it will modify is the ''FLAGS'' register. If the result is 0, the "zero" flag (ZF) will be set. |
| |
| <code asm> |
| or al, al ; verify if next character is null? |
| jz .done ; exit condition |
| |
| ; todo ..... |
| |
| jmp .loop |
| |
| .done: |
| pop bx |
| pop ax |
| pop si |
| ret |
| </code> |
| |
| The next instruction, ''JZ'', is a conditional jump which will jump to the ''.done'' label if the zero flag is set. So, essentially, if the next character is ''null'', we jump outside the loop. |
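| |
| For comparison, here are two other common ways of writing the same null check. They are not used in the video; the sketch only shows that all three forms set the zero flag when ''AL'' is 0, so ''jz'' behaves the same after each of them. The label ''check_null'' is hypothetical. |
| |
| <code asm> |
| check_null: |
|     or al, al ; al = al | al (unchanged), ZF = 1 if al is 0 |
|     test al, al ; computes al & al, only updates the flags |
|     cmp al, 0 ; computes al - 0, only updates the flags |
|     jz .is_null ; taken when ZF = 1, i.e. when al is 0 |
|     ret |
| |
| .is_null: |
|     ret |
| </code> |
| |
| In real code you would pick just one of the three; ''or al, al'' and ''test al, al'' encode in 2 bytes each, which matters when everything has to fit in a 512-byte boot sector. |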
| |
| After exiting the loop, we pop the registers we previously pushed in reverse order, and then we return from the function. So far, our function takes a string, iterates over every character until it encounters the ''null'' character, and then exits. What's left to do is to print each character to the screen. The way we can do that is by using the BIOS. As the name suggests, the **//BIOS//** or the **//Basic Input/Output System//** does more than just start the computer. It also provides some very basic functions, which allow us to do some very basic stuff, such as writing text to the screen. So, how exactly do we call the BIOS to print the character for us? The answer is that we use **//interrupts//**. |
| |
| ===== Interrupts ===== |
| |
| An interrupt is a signal which makes the processor stop whatever it is doing to handle that event. There are 3 possible ways of triggering an interrupt: |
| |
| - Through **//an exception//**; an exception is generated by the CPU if a critical error is encountered, and it cannot continue executing. For example, dividing by zero will trigger an interrupt. Operating systems can use these interrupts to stop the misbehaving process, or to attempt to restore it to working order. |
| - **//Hardware//** can also trigger interrupts. For example, when a key is pressed on the keyboard, or when the disk controller has finished performing an asynchronous read. |
| - From code, through the **//INT instruction//**. Interrupts are numbered from 0 to 255, so the ''INT'' instruction requires a parameter indicating the interrupt number to trigger. |
| |
| The BIOS installs some interrupt handlers for us, so that we can use its functionality. Typically, the BIOS reserves an interrupt number for a category of functions, and the value in the ''AH'' register is used to choose between the available functions in that category. |
| |
| <code -> |
| Examples of BIOS interrupts: |
| |
| INT 10h -- Video |
| INT 11h -- Equipment check |
| INT 12h -- Memory size |
| INT 13h -- Disk I/O |
| INT 14h -- Serial communication |
| INT 15h -- Cassette |
| INT 16h -- Keyboard |
| |
| ............ |
| </code> |
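| |
| To illustrate how ''AH'' selects a function inside a category, here is a small sketch using the keyboard services. It is not part of the video's code; it only shows the general calling pattern (set ''AH'', then issue ''INT'' with the category number). |
| |
| <code asm> |
|     mov ah, 0x00 ; INT 16h, function 00h: wait for a key press |
|     int 0x16 ; returns: al = ASCII character, ah = scan code |
| </code> |
| |
| Changing the value in ''AH'' before the same ''int 0x16'' would select a different keyboard function. |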
| |
| To print text to the screen, we will need to call [[https://en.wikipedia.org/wiki/INT_10H|interrupt 10h]], which contains the video services category. By setting ''AH'' to ''0Eh'', we will call the "write text in teletype mode" function. Here's a detailed description of this function: |
| |
| <code -> |
| VIDEO - TELETYPE OUTPUT |
| |
| AH = 0Eh |
| AL = character to write |
| BH = page number |
| BL = foreground color (graphics modes only) |
| |
| Return: |
| Nothing |
| |
| Desc: Display a character on the screen, advancing the cursor and scrolling the screen as necessary |
| |
| Notes: Characters 07h (BEL), 08h (BS), 0Ah (LF), and 0Dh (CR) are interpreted and do the expected things. |
| IBM PC ROMs dated 1981/4/24 and 1981/10/19 require that BH be the same as the current active page |
| |
| BUG: If the write causes the screen to scroll, BP is destroyed by BIOSes for which AH=06h destroys BP |
| |
| Source: http://www.ctyme.com/intr/rb-0106.htm |
| </code> |
| |
| What we need to do in order to call this function is to set: |
| |
| * AH to 0Eh |
| * AL to the ASCII character that we want to print |
| * BH to the page number (which is 0) |
| * BL (the foreground color) is only used in graphics mode, so we can ignore it because we're currently running in text mode. |
| |
| <code asm> |
| mov ah, 0x0E ; call bios interrupt |
| ; al is already set by lodsb |
| mov bh, 0 ; set page number to 0 |
| int 0x10 |
| </code> |
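| |
| Putting the fragments shown so far together, the whole function would look roughly like this. It is assembled from the snippets above, with the BIOS call placed where the ''todo'' comment was. |
| |
| <code asm> |
| ; |
| ; Prints a string to the screen |
| ; Params: |
| ; - ds:si points to string |
| ; |
| puts: |
|     ; save registers we will modify |
|     push si |
|     push ax |
|     push bx |
| |
| .loop: |
|     lodsb ; loads next character in al |
|     or al, al ; verify if next character is null |
|     jz .done ; exit condition |
| |
|     mov ah, 0x0E ; teletype output function |
|     mov bh, 0 ; set page number to 0 |
|     int 0x10 ; call bios video interrupt |
| |
|     jmp .loop |
| |
| .done: |
|     ; restore registers in reverse order |
|     pop bx |
|     pop ax |
|     pop si |
|     ret |
| </code> |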
| |
| Finally, let's add a string containing "Hello world", followed by a new line. To add a new line, you need to print both the line feed and the carriage return characters. I created an awesome macro so that I don't have to remember the hex codes for these characters every time. To declare the string, we use the ''DB'' directive, which lets us write as many bytes as we want. |
| |
| <code asm> |
| %define ENDL 0x0D, 0x0A |
| msg_hello: db 'Hello world!', ENDL, 0 |
| </code> |
| |
| All that's left to do is to set DS:SI to the address of the string, and then call ''puts''. |
| |
| <code asm> |
| ; print hello world message |
| mov si, msg_hello |
| call puts |
| </code> |
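| |
| One detail worth keeping in mind: once ''puts'' returns, the CPU simply keeps executing whatever bytes come next, so ''main'' should not fall through into the string data. A minimal sketch of one way to end ''main'' is shown below; the ''.halt'' loop is an assumption for illustration and is not shown in this part of the transcript. |
| |
| <code asm> |
| main: |
|     ; ... segment and stack setup goes here ... |
| |
|     ; print hello world message |
|     mov si, msg_hello |
|     call puts |
| |
| .halt: |
|     hlt ; stop the CPU until the next interrupt |
|     jmp .halt ; if an interrupt wakes us up, halt again |
| </code> |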
| |
| Let's now test our program: |
| |
| <code bash> |
| $ make |
| $ qemu-system-i386 -fda build/main_floppy.img |
| </code> |
| |
| And the result: |
| |
| {{ :transcripts:pasted:20230909-161027.png?600 }} |
| |
| ===== Conclusion ===== |
| |
| Great! So, we have successfully written a tiny operating system which can print text to the screen! This was a lot of work, and we learned a lot of new stuff about how computers work. We'll continue the next time when we will improve our assembly skills and learn some new stuff, by extending our operating system to print numbers to the screen. After that, we will get into the complex task of loading stuff from the disk. |
| |
| Thank you for watching and see you the next time! Bye bye! |
| |
| |