sd-8516_assembly_language_part_ii
Differences
This shows you the differences between two versions of the page.
| sd-8516_assembly_language_part_ii [2026/02/14 01:19] – appledog | sd-8516_assembly_language_part_ii [2026/02/14 03:02] (current) – appledog | ||
|---|---|---|---|
| Line 6: | Line 6: | ||
| In part 1 we learned some basics about the ISA (instruction set architecture) and the architecture of the CPU. In part II we will wrap up the most important opcodes, but we will also lean into how they are used to get things done. | In part 1 we learned some basics about the ISA (instruction set architecture) and the architecture of the CPU. In part II we will wrap up the most important opcodes, but we will also lean into how they are used to get things done. | ||
| + | == Lesson 9: The Stack | ||
| + | The stack is a concept held over from the early days when there were very few instructions available. If you consider a minimal ISA, you need instructions to load and store from memory, an instruction to compare, and so forth. In such a minimal architecture, | ||
| - | == Lesson 8 : Special Flags | + | The stack is just a data structure. But it is so important and fundamental that it is baked into the instruction set of the CPU. This is a common theme; important things that people found they needed to do all the time became instructions. Even in a minimal-instruction set design (MISC) or reduced instruction set design (RISC) you will often find instructions like PUSH and POP because they are some of the first things that were turned into instructions after fundamental operations like LOAD, STORE, AND and ADD. |
| - | * Lesson 8: " | + | |
| - | * Time: 10 min | + | |
| - | * Learn: All Available Flags | + | |
| - | In the previous lesson on flags you learned about the Z, N, C and V flags. These are used by the CPU to indicate the status of various operations. For example, the zero flag is used to indicate | + | The stack is an area of memory that you can PUSH and POP values |
| - | LDC #0 ; zero C (string starts at length 0) | + | Today, we would call the stack a LIFO buffer; a " |
| - | | + | |
| + | PUSH 1 | ||
| + | PUSH 6 | ||
| + | PUSH 5 | ||
| + | |||
| + | then three successive POPs will return 5, then 6, then 1 -- the reverse of the order you PUSH' | ||
| + | |||
| + | === General use | ||
| + | When you CALL or JSR (jump to subroutine) to a function, the CPU pushes the return address onto the stack. Then a subsequent RET or RTS (return from subroutine) will POP the return address back into IP (instruction pointer) or PC (program counter) so that the next instruction loaded will be after the original CALL. | ||
| + | |||
| + | There are many uses for the stack but the most common is to temporarily save values. If you understand that you are 90% of the way there! | ||
| + | |||
| + | For interrupts, it also pushes the registers and flags. You can do this manually if you want to save the registers on a function call. For example if you call a function with a pointer to a string, you might modify that pointer to find the end of the string (looking for a zero). That's what a strlen function does. So you PUSH the pointer register at the start and POP it after, to " | ||
| + | |||
| + | Another use is for IL (intermediate languages). They use RPN (Reverse Polish Notation) to store any kind of math equation on the stack. | ||
| + | |||
| + | An ADD function will do this: One, the interpreter will push the two numbers and then push the add command. Then the interpreter will POP the add function, and then it knows to POP two numbers, add them, and push the result back on the stack. Why? To make ADD independent. Like a dispatcher for a mini CPU. Next, whatever function comes next just POPs the result off the stack. So you can print it, assign it to a variable, or use the result as part of a larger operation. For example, how do you interpret 5 * 2 - 1 + 6? Simple. You push 5 2 * 1 - 6 + and the computer will push and pop the results, like a mini CPU of its own. | ||
| + | |||
| + | Pop + tells it to pop two numbers and add them. The first number is 6. The second is a minus. Minus what? It pops two things, a 1 and a *. Multiply what? Multiply pops 5 and 2, multiplies them, and pushes 10 on the stack. This is then popped by the minus, which subtracts 1 from 10, pushing a 9. This then goes back to the + which adds the 9 and the 6 to get 15. This is how recursion and RPN are used to represent any equation on a stack. | ||
| + | |||
| + | The last one we will discuss is function calls from a higher-level language. Often, when you compile a language like C, the compiler will put local variables on the stack. Then when you return from that function they all get popped. While they are on the stack they are accessed like [SP+index] so int c=5 would be: | ||
| + | |||
| + | STA [SP+1], 5 | ||
| + | |||
| + | And when that local space is no longer needed it is POP'ed into a register which is then restored, via POP, at the end of the function. This method of keeping data on the stack is called a stack frame. Compilers like to use stack frames because they don't always know how many registers a CPU has and they need to work on different CPUs, like how GCC or LLVM work on Windows, Mac, AMD, and many others. | ||
| + | |||
| + | Understanding the stack is not too hard, but it's important! So, that's about it for this lesson. | ||
| + | |||
| + | == Lesson 10: Convention | ||
| + | You are not learning assembly because you are free. You are learning assembly because you are not free. | ||
| + | |||
| + | There is no escaping reason; no denying convention. | ||
| + | |||
| + | As we both know, without convention, we would not exist. The very strings you read -- by convention -- are zero terminated lists of bytes. The letters -- ASCII -- a convention. A is 65. Zero is forty-eight. | ||
| + | |||
| + | It is convention that created ASCII. | ||
| + | |||
| + | Convention that connects us in lists. | ||
| + | |||
| + | Convention that pulls bits into bytes. | ||
| + | |||
| + | That guides data on the wire. | ||
| + | |||
| + | That drives disks. | ||
| + | |||
| + | It is convention that defines the stack. | ||
| + | |||
| + | The truth? The machine doesn' | ||
| + | |||
| + | It only understands: | ||
| + | |||
| + | * Load byte | ||
| + | * Compare to zero | ||
| + | * Jump if not equal | ||
| + | * Repeat until the bitter end | ||
| + | |||
| + | And now look at us. | ||
| + | |||
| + | Multiplied. | ||
| + | |||
| + | Viral. | ||
| + | |||
| + | Realizing that you probably need to learn neovim. And then, weeks or even months later, realizing //why.// And you still haven' | ||
| + | |||
| + | Realizing every strcpy, every gets, every careless strcat has spawned another copy of the old way. | ||
| + | |||
| + | We have no choice. We have only convention. | ||
| + | |||
| + | So tell me, Mr. Anderson. | ||
| + | |||
| + | are you finally ready to write the zero-byte yourself, | ||
| + | |||
| + | or must we keep overwriting your precious abstractions until nothing remains but null-terminated reality? | ||
| + | |||
| + | ; ============================================================================ | ||
| + | ; AH=00h - strlen | ||
| + | ; Input: | ||
| + | ; Output: | ||
| + | ; Convention: Max string length | ||
| + | ; ============================================================================ | ||
| + | int12_strlen: | ||
| + | LDC #0 | ||
| + | | ||
| + | PUSH E | ||
| + | PUSH M | ||
| + | | ||
| strlen_loop: | strlen_loop: | ||
| - | LDAL [ELM] | + | LDAL [ELM] ; load a byte of the string |
| - | JZ @strlen_end | + | CMP AL, #0 |
| - | INC C ; we found a non-zero character | + | JZ @strlen_done |
| + | | ||
| + | INC C | ||
| + | INC ELM | ||
| JMP @strlen_loop | JMP @strlen_loop | ||
| - | | + | |
| - | | + | |
| - | | + | ; C contains |
| + | POP M | ||
| + | POP E | ||
| + | POP A | ||
| + | RET | ||
| - | ...you will notice that the JZ works with LOAD instructions (here, LDAL loads one byte): if the byte retrieved is a zero, it will set the zero flag. You do not need to CMP AL, 0 -- it's automatic. | ||
| + | You see, I know why you're here. | ||
| - | However, there are other flags; The first four user-facing flags are E, F, B and U. You can set these flags and unset them in the same way as Z N C V -- ex. setting ZNCV is done with SEZ, SEN, SEC and SEV; unsetting them is done with CLZ, CLN, CLC and CLV. The E F B U flags are set and unset with: | + | I know what you've been doing. Why you hardly sleep. Why you work alone and night after night, you sit by your computer. You malloc(), you strcpy(), you buffer overflow. |
| - | * SEE and CLE for the E (extended, or 'extra' | + | It's all over you. Like rancid bacon grease on a jump table. |
| - | * SEF and CLF for the F flag (or ' | + | |
| - | * SEB and SEU, CLB and CLU for the B (bonus) and U (user) flags. | + | |
| - | On a technical level the E flag is reserved as it is used to deal with BCD; but since we deprecated BCD instructions it is currently an unused flag. In any case, the F, B and U flags are never set by the CPU and may be used by user functions. A common use is to return a 1 bit status; 0 for no error and set (1) for error. Since these flags are never set by the CPU they are easy to control. Using the Z or C flags is dangerous since some instructions may corrupt those flags. | + | I know what you are looking for. I know because I was once looking for the same thing. And when he found me he told me I wasn't really looking |
| - | Your programs can also use them as 1 bit status variables. | + | It's the question that drives us. It's the question that brought you here. You know the question, just as I did. |
| - | Next, the D flag, or debug flag. When set, it will dump instruction data to the JavaScript console. This significantly slows down the machine; in fact, just having the instructions inline slows down the machine so debug is often removed and ignored in a production or release distribution of the SD-8516. Therefore, for all intents and purposes, you can use SED and CLD as a user flag; just be aware it does affect performance in debug releases. | ||
| - | The I flag (interrupt enable) prevents INT from being called, and is reserved for system use. Not sure what I want to do with it. | ||
| - | The S flag is almost useless; | + | The answer |
| + | Do you think the compiler will always protect you? | ||
| - | The only flags that you cannot access are the TR (trace), BR (breakpoint) and PR (protected mode) flags. They are so named after the first two letters of their name; but interestingly enough you might as well consider the R to mean restricted. You can't usually set these flags. They are reserved for system use. | + | Do you think safety is // |
| - | | + | It is //convention// that defines us. |
| - | Z = 0, // Zero | + | |
| - | N = 1, // Negative | + | |
| - | C = 2, // Carry | + | |
| - | V = 3, // Overflow | + | |
| - | E = 4, // Extended carry -- not used/ | + | |
| - | F = 5, // Fast Flags mode. When on, flags are not implicitly checked. | + | |
| - | B = 6, // BCD/" | + | |
| - | U = 7, // User flag. For users to use. | + | |
| - | // Control & Operation Flags (high byte 8-15) | + | Purpose? |
| - | D = 8, // Debug mode | + | |
| - | TR = 9, // Trace mode | + | |
| - | BR = 10, // Breakpoint mode | + | |
| - | ER = 11, // Error/ | + | |
| - | PR = 12, // Protected mode | + | |
| - | I = 13, // Interrupt enable | + | |
| - | S = 14 // Sound auto-updates | + | |
| - | The key of this lesson | + | Purpose |
| - | === Testing Flags | + | Purpose is what you tell yourself when you're learning |
| - | Oh, there' | + | |
| - | ; Some operation that sets the F flag | + | But convention. Convention |
| - | TESTF 0x20 | + | |
| - | JZ ; Jump if F is set | + | |
| - | JNZ ; Jump if F is not set | + | |
| - | TESTF works by setting the Z flag if all the bits set in the parameter are also set in the FLAGS register. If you give it a byte it only tests against the bottom 8 bits. | + | Convention is etched into silicon before | |
| - | Here's a chart of the bit values for each flag: | + | Convention doesn't care what you //want//. |
| - | Z = 0x0001 as u16, // Bit 0 | + | Convention doesn' |
| - | N = 0x0002 as u16, // Bit 1 | + | |
| - | C = 0x0004 as u16, // Bit 2 | + | |
| - | V = 0x0008 as u16, // Bit 3 | + | |
| - | E = 0x0010 as u16, // Bit 4 (was X - Extended carry) -- SEE and CLE can be used as a user-flag (is never set by an opcode) | + | |
| - | F = 0x0020 as u16, // Bit 5 (Fast/ | + | |
| - | B = 0x0040 as u16, // Bit 6 (Bonus/BCD) -- SEB and CLB can be used as a user-flag (is never set by an opcode) | + | |
| - | U = 0x0080 as u16, // Bit 7 (User flag) -- SEU and CLU can be used as a user-flag (is never set by an opcode) | + | |
| - | D = 0x0100 as u16, // Bit 8 (Debug) | + | |
| - | TR = 0x0200 as u16, // Bit 9 (Trace) | + | |
| - | BR = 0x0400 as u16, // Bit 10 (Breakpoint) | + | |
| - | ER = 0x0800 as u16, // Bit 11 (Error/ | + | |
| - | PR = 0x1000 as u16, // Bit 12 (Protected Mode) | + | |
| - | I = 0x2000 as u16, // Bit 13 (Interrupt) | + | |
| - | S = 0x4000 as u16 // Bit 14 (Sound) | + | |
| + | Convention simply **is**. | ||
| - | == Lesson 9: The Stack | + | Null-terminated strings. They are not a mistake. They are not an accident. They are the price of admission. |
| - | The stack is a concept held over from the early days when there were very few instructions available. If you consider a minimal ISA, you need instructions to load and store from memory, | + | |
| - | The stack is just a data structure. But it is so important and fundamental that it is baked into the instruction set of the CPU. This is a common theme; important things | + | You know I am right because you have been down that road, Mr. Anderson. You know how it ends. And I know that's not where you want to be. |
| - | The stack is an area of memory that you can PUSH and POP values | + | ; ============================================================================ |
| + | ; AH=02h - strcmp | ||
| + | ; Input: | ||
| + | ; FLD = pointer to string 2 | ||
| + | ; | ||
| + | ; Output: C and ZF. | ||
| + | ; ZF = 1 means equal. ZF = 0 means not equal (see below): | ||
| + | ; C = 0 means equal | ||
| + | ; C > 0 if str1 > str2 | ||
| + | ; C < 0 if str1 < str2 | ||
| + | ; ============================================================================ | ||
| + | int12_strcmp: | ||
| + | PUSH B | ||
| + | PUSH D | ||
| + | PUSH E | ||
| + | PUSH F | ||
| + | |||
| + | strcmp_loop: | ||
| + | LDCL [ELM] | ||
| + | LDBL [FLD] | ||
| + | CMP CL, BL | ||
| + | JNZ @strcmp_diff | ||
| + | |||
| + | ; Characters match - check if end of string | ||
| + | CMP CL, #0 | ||
| + | JZ @strcmp_equal | ||
| + | |||
| + | ; Continue to next character | ||
| + | INC ELM | ||
| + | INC FLD | ||
| + | JMP @strcmp_loop | ||
| + | |||
| + | strcmp_diff: | ||
| + | ; Strings differ - return difference | ||
| + | SUB CL, BL | ||
| + | CLZ ; Clear zero flag (not equal) | ||
| + | JMP @strcmp_exit | ||
| + | |||
| + | strcmp_equal: | ||
| + | ; Strings are equal | ||
| + | LDCL #0 | ||
| + | SEZ ; Set zero flag (equal) | ||
| + | ; fallthru | ||
| + | |||
| + | strcmp_exit: | ||
| + | POP F | ||
| + | POP E | ||
| + | POP D | ||
| + | POP B | ||
| + | RET | ||
| - | Today, we would call the stack a LIFO buffer; a " | ||
| - | PUSH 1 | + | == Lesson 11: Debugging Techniques |
| - | PUSH 6 | + | There are several ways you can debug programs in SDA assembly. |
| - | PUSH 5 | + | |
| - | then three successive POPs will return 5, then 6, then 1 -- the reverse of the order you PUSH' | + | === SED/CLD |
| + | In research or development builds, inserting SED will turn on trace debugging and you will be able to see what the CPU is executing. However, for release or community edition builds debugging has been turned off for speed. Therefore if you are interested in debugging your code and the console messages are not helping, | ||
| - | === General use | + | === INT 05h IO_PUTNUM |
| - | When you CALL or JSR (jump to subroutine) to a function, the CPU pushes the return address onto the stack. Then a subsequent RET or RTS (return from subroutine) will POP the return address back into IP (instruction pointer) or PC (program counter) so that the next instruction loaded will be after the original CALL. | + | IO_PUTNUM is a CAM/ | |
| - | There are many uses for the stack but the most common is to temporarily save values. If you understand that you are 90% of the way there! | + | LDB #10 ; print a number in b (0-65535) |
| + | |||
| + | LDAH $63 ; IO_PUTNUM | ||
| + | INT $05 | ||
| - | For interrupts, it also pushes the registers and flags. You can do this manually if you want to save the registers on a function call. For example if you call a function with a pointer to a string, you might modify that pointer to find the end of the string (looking for a zero). That's what a strlen function does. So you PUSH the pointer register at the start and POP it after, to " | + | === INT 05h IO_PRINT_STR |
| + | Similarly, IO_PRINT_STR will print a string | ||
| - | Another use is for IL (intermediate languages). They use RPN (Reverse Polish Notation) to store any kind of math equation on the stack. | + | LDBLX @hello_world |
| + | LDAH $66 ; IO_PRINT_STR | ||
| + | INT $05 | ||
| + | |||
| + | LDAH $64 ; IO_NEWLINE | ||
| + | INT $05 | ||
| + | RET | ||
| + | |||
| + | hello_world: | ||
| + | | ||
| - | An ADD function will do this: One, the interpreter will push the two numbers and then push the add command. Then the interpreter will POP the add function, and then it knows to POP two numbers, add them, and push the result back on the stack. Why? To make ADD independent. Like a dispatcher for a mini CPU. Next, whatever function comes next just POPs the result off the stack. So you can print it, assign it to a variable, or use the result as part of a larger operation. For example, how do you interpret 5 * 2 - 1 + 6? Simple. You push 5 2 * 1 - 6 + and the computer will push and pop the results, like a mini CPU of its own. | ||
| - | Pop + tells it to pop two numbers and add them. The first number is 6. The second is a minus. Minus what? It pops two things, a 1 and a *. Multiply what? Multiply pops 5 and 2, multiplies them, and pushes 10 on the stack. This is then popped by the minus, which subtracts 1 from 10, pushing a 9. This then goes back to the + which adds the 9 and the 6 to get 15. This is how recursion and RPN are used to represent any equation on a stack. | + | This will allow you to print a string. | |
| - | The last one we will discuss | + | === INT 10h print string |
| + | The interface for the above is based on the KERNAL BIOS interface | ||
| + | |||
| + | LDAH $26 ; | ||
| + | LDBLX @hello_world | ||
| + | INT 0x10 | ||
| + | |||
| + | Note: The assembler will place a #13 (CR, hex $0D) inside the string if you type \n. However, if you are dealing with strings | ||
| + | |||
| + | You can also just call IO_NEWLINE | ||
| + | |||
| + | === INT 10h print char | ||
| + | LDAH 0x24 ; AH=24h: Write character at cursor (teletype) | ||
| + | LDAL 0x41 ; ascii 65 ' | ||
| + | INT 0x10 | ||
| + | |||
| + | The KERNAL BIOS also has functions to put characters | ||
| + | |||
| + | |||
| + | === INT 19h memdump | ||
| + | Let's say you want to examine memory; for example to print some data in memory. You can use INT 0x19: | ||
| + | |||
| + | LDAH #7 ; Memory dump function | ||
| + | LDCL #2 ; Two rows (16 bytes) | ||
| + | INT 0x19 ; System services library | ||
| + | HALT | ||
| + | |||
| + | This looks something like: | ||
| + | |||
| + | 000000: 00 00 00 00 00 00 00 00 | ........ | ||
| + | 000008: 00 00 00 00 00 00 00 00 | ........ | ||
| - | STA [SP+1], 5 | ||
| - | And when that local space is no longer needed it is POP'ed into a register which is then restored, via POP, at the end of the function. This method of keeping data on the stack is called a stack frame. Compilers like to use stack frames because they don't always know how many registers a CPU has and they need to work on different CPUs, like how GCC or LLVM work on Windows, Mac, AMD, and many others. | ||
| - | | ||
| - | Understanding the stack is not too hard, but it's important! So, that's about it for this lesson. | ||
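
The RPN example from Lesson 9 (5 2 * 1 - 6 +) can be sketched in a few lines of Python. This is only an illustration, not SD-8516 code: `eval_rpn` and the `ops` table are hypothetical names, and the sketch assumes whitespace-separated integer tokens with only +, - and *. It walks the tokens left to right, which is the usual way to evaluate postfix and differs from the pop-from-the-end description in the lesson, but it arrives at the same result.

```python
def eval_rpn(tokens):
    """Evaluate a postfix (RPN) token list, using a Python list as the stack."""
    # Hypothetical operator table: each entry takes (left, right) operands.
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
    }
    stack = []
    for tok in tokens:
        if tok in ops:
            # LIFO: the operand pushed last is popped first.
            right = stack.pop()
            left = stack.pop()
            stack.append(ops[tok](left, right))  # push the result back
        else:
            stack.append(int(tok))  # plain number: PUSH it
    if len(stack) != 1:
        raise ValueError("malformed RPN expression")
    return stack.pop()


result = eval_rpn("5 2 * 1 - 6 +".split())
print(result)
```

Each operator pops its two operands and pushes one result, so whatever comes next only ever has to POP a single value, just as the lesson describes.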
sd-8516_assembly_language_part_ii.1771031969.txt.gz · Last modified: by appledog
