sd-8516_assembly_language_part_ii

Differences

This shows you the differences between two versions of the page.

sd-8516_assembly_language_part_ii [2026/02/14 01:19] appledog
sd-8516_assembly_language_part_ii [2026/02/14 03:02] (current) appledog
Line 6:
 In part 1 we learned some basics about the ISA (instruction set architecture) and the architecture of the CPU. In part II we will wrap up the most important opcodes, but we will also lean into how they are used to get things done.
  
 +== Lesson 9: The Stack
 +The stack is a concept held over from the early days when there were very few instructions available. If you consider a minimal ISA, you need instructions to load and store from memory, an instruction to compare, and so forth. In such a minimal architecture, loading and storing from memory has certain emergent properties. For example if you have a list of things, their position in memory is not random because you are incrementing a counter such as a memory pointer to traverse that list. It is this way of doing things that we remember when we use the stack.
  
-== Lesson 8 : Special Flags +The stack is just a data structure. But it is so important and fundamental that it is baked into the instruction set of the CPU. This is a common theme; important things that people found they needed to do all the time became instructions. Even in a minimal instruction set computer (MISC) or reduced instruction set computer (RISC) design you will find instructions like PUSH and POP because they are some of the first things that were turned into instructions after fundamental operations like LOAD, STORE, AND and ADD.
-* Lesson 8: "Special Flags" +
-* Time: 10 min +
-* Learn: All Available Flags+
  
-In the previous lesson on flags you learned about the Z, N, C and V flags. These are used by the CPU to indicate the status of various operations. For example, the zero flag is used to indicate the last operation produced a zero. Therefore if you are looking for the zero at the end of string, +The stack is an area of memory that you can PUSH and POP values to, in order. For example, you can PUSH the number 5 and the number 5 will be "on top" of the stack. Then you can "POP" it later. The stack is like an array but you can only go forwards and backwards, and reading the stack destroys it. This is a lot like how old magnetic ring memory worked, in a way.
  
-        LDC #0    ; zero C (string starts at length 0) +Today, we would call the stack a LIFO buffer; a "Last-in, First-out" data structure. If I do this: 
-        + 
 +    PUSH 1 
 +    PUSH 6 
 +    PUSH 5 
 + 
 +then three successive POPs will return 5, then 6, then 1 -- the reverse of the order you PUSH'ed them. 
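
If you want to poke at this without an SD-8516 handy, here is a minimal sketch in Python (not SD-8516 assembly). It assumes a plain Python list stands in for the stack: append() plays the role of PUSH and pop() plays the role of POP.

```python
# Model of the stack as a Python list (an assumption for illustration only).
stack = []

for value in (1, 6, 5):
    stack.append(value)          # PUSH 1, PUSH 6, PUSH 5

# Three successive POPs come back in the reverse of the PUSH order.
popped = [stack.pop() for _ in range(3)]
print(popped)
```

After the three POPs the list is empty again, which is the "reading the stack destroys it" behaviour described above.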
 + 
 +=== General use 
 +When you CALL or JSR (jump to subroutine) to a function, the CPU pushes the return address onto the stack. Then a subsequent RET or RTS (return from subroutine) will POP the return address back into IP (instruction pointer) or PC (program counter) so that the next instruction loaded will be after the original CALL. 
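
The CALL/RET mechanism can be sketched with a toy interpreter in Python. This is only a model under stated assumptions: the four opcodes (CALL, RET, NOP, HALT), the tuple encoding, and the run() helper are all made up for illustration and are not the SD-8516 instruction encoding.

```python
def run(program):
    """Run a toy program and return the addresses in the order they executed."""
    stack, pc, trace = [], 0, []
    while True:
        op, arg = program[pc]
        trace.append(pc)
        if op == "CALL":
            stack.append(pc + 1)   # push the return address (the instruction after CALL)
            pc = arg               # jump to the subroutine
        elif op == "RET":
            pc = stack.pop()       # pop the return address back into the program counter
        elif op == "HALT":
            return trace
        else:                      # NOP: just move to the next instruction
            pc += 1

program = [
    ("CALL", 3),     # address 0: call the subroutine at address 3
    ("NOP", None),   # address 1: execution resumes here after RET
    ("HALT", None),  # address 2
    ("NOP", None),   # address 3: subroutine body
    ("RET", None),   # address 4
]
trace = run(program)
print(trace)
```

The trace shows execution leaving address 0, running the subroutine, and landing on address 1, the instruction after the original CALL.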
 + 
 +There are many uses for the stack but the most common is to temporarily save values. If you understand that you are 90% of the way there! 
 + 
 +For interrupts, the CPU also pushes the registers and flags. You can do this manually if you want to save the registers on a function call. For example if you call a function with a pointer to a string, you might modify that pointer to find the end of the string (looking for a zero). That's what a strlen function does. So you PUSH the pointer register at the start and POP it after, to "save" the register back to where it was when the function was called. This way the code that calls strlen can then call strcpy without having to replace the string pointer. 
 + 
 +Another use is for IL (intermediate languages). They use RPN (Reverse Polish Notation) to store any kind of math equation on the stack. 
 + 
 +An ADD function will do this: First, the interpreter pushes the two numbers and then pushes the add command. Then the interpreter POPs the add command, which tells it to POP two numbers, add them, and push the result back on the stack. Why? To make ADD independent, like a dispatcher for a mini CPU. Whatever function comes next just POPs the result off the stack, so you can print it, assign it to a variable, or use the result as part of a larger operation. For example, how do you interpret 5 * 2 - 1 + 6? Simple. You push 5 2 * 1 - 6 + and the computer will push and pop the results, like a mini CPU of its own. 
 + 
 +Pop + tells it to pop two numbers and add them. The first number is 6. The second is a minus. Minus what? It pops two things, a 1 and a *. Multiply what? Multiply pops 5 and 2, multiplies them, and pushes 10 on the stack. This is then popped by the minus, which subtracts 1 from 10, pushing a 9. This then goes back to the + which adds the 9 and the 6 to get 15. This is how recursion and RPN are used to represent any equation on a stack. 
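
The walk-through above can be sketched in Python. This is a minimal model, assuming the whole expression 5 2 * 1 - 6 + has already been pushed onto a list; the names OPS and eval_top are hypothetical and not part of any SD-8516 tool.

```python
import operator

# Operators this sketch understands (an illustrative subset).
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def eval_top(stack):
    """Pop one item; if it is an operator, recursively evaluate its two operands."""
    token = stack.pop()
    if token in OPS:
        right = eval_top(stack)   # the operand nearest the top was pushed last
        left = eval_top(stack)    # this may itself be the result of another operator
        return OPS[token](left, right)
    return token                  # a plain number

stack = [5, 2, "*", 1, "-", 6, "+"]   # pushed left to right, so "+" is on top
result = eval_top(stack)
print(result)
```

Popping the + triggers the minus, which triggers the multiply, exactly in the order described: 5 * 2 = 10, 10 - 1 = 9, 9 + 6 = 15.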
 + 
 +The last one we will discuss is function calls from a higher level language. Often, when you compile a language like C, it will put local variables on the stack. Then when you return from that function they all get popped. While they are on the stack they are accessed like [SP+index], so int c=5 would be: 
 + 
 +    STA [SP+1], 5 
 + 
 +And when that local space is no longer needed it is POP'ed into a register which is then restored, via POP, at the end of the function. This method of keeping data on the stack is called a stack frame. Compilers like to use stack frames because they don't always know how many registers a CPU has and they need to work on different CPUs, like how GCC or LLVM works on Windows, Mac, AMD, and many others. 
 +     
 +Understanding the stack is not too hard, but it's important! So, that's about it for this lesson. 
 + 
 +== Lesson 10: Convention 
 +You are not learning Assembly because you are free. You are learning assembly because you are not free. 
 + 
 +There is no escaping reason, no denying convention. 
 + 
 +As we both know, without convention, we would not exist. The very strings you read -- by convention -- are zero terminated lists of bytes. The letters -- ASCII -- a convention. A is 65. Zero is forty-eight. 
 + 
 +It is convention that created ASCII. 
 + 
 +Convention that connects us in lists. 
 + 
 +Convention that pulls bits into bytes. 
 + 
 +That guides data on the wire. 
 + 
 +That drives disks. 
 + 
 +It is convention that defines the stack. 
 + 
 +The truth? The machine doesn't care about your comfort. 
 + 
 +It only understands: 
 + 
 +* Load byte 
 +* Compare to zero 
 +* Jump if not equal 
 +* Repeat until the bitter end 
 + 
 +And now look at us. 
 + 
 +Multiplied. 
 + 
 +Viral. 
 + 
 +Realizing that you probably need to learn neovim. And then, weeks or even months later, realizing //why.// And you still haven't installed neovim yet. 
 + 
 +Realizing every strcpy, every gets, every careless strcat has spawned another copy of the old way. 
 + 
 +We have no choice. We have only convention. 
 + 
 +So tell me, Mr. Anderson. 
 + 
 +are you finally ready to write the zero-byte yourself, 
 + 
 +or must we keep overwriting your precious abstractions until nothing remains but null-terminated reality? 
 + 
 +    ; ============================================================================ 
 +    ; AH=00h - strlen 
 +    ; Input:  ELM = pointer to null-terminated string 
 +    ; Output: C = length (not including null terminator) 
 +    ; Convention: Max string length of 65,535. 
 +    ; ============================================================================ 
 +    int12_strlen: 
 +        LDC #0
 +        PUSH A 
 +        PUSH E 
 +        PUSH M 
 +    
     strlen_loop:
-        LDAL [ELM] +        LDAL [ELM]  ; load a byte of the string 
-        JZ @strlen_end +        CMP AL, #0 
-        INC C                ; we found a non-zero character in the string. +        JZ @strlen_done 
 +     
 +        INC C       ; char is not zero, so 'count' that character. 
 +        INC ELM
         JMP @strlen_loop
-         +     
-    strlen_end: +    strlen_done: 
-        RET                  ; C now contains the COUNT of all non-zero characters in a string +        ; C contains length. 
 +        POP M 
 +        POP E 
 +        POP A 
 +        RET
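
To check your understanding of the loop, here is a rough Python model of the same logic. It is a sketch, not the BIOS routine: memory is assumed to be a flat bytearray, pointer stands in for ELM, and count stands in for C.

```python
def strlen(memory, pointer):
    """Count bytes from pointer up to (not including) the first zero byte."""
    count = 0                        # LDC #0
    while memory[pointer] != 0:      # LDAL [ELM] / CMP AL, #0 / JZ @strlen_done
        count += 1                   # INC C
        pointer += 1                 # INC ELM
    return count                     # only this local copy of the pointer moved

memory = bytearray(b"Hello World!\x00")
length = strlen(memory, 0)
print(length)
```

Because pointer is a local variable, the caller's copy is untouched when the function returns. That is the effect the PUSH and POP pairs around the assembly loop are there to achieve.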
  
-...you will notice that the JZ works with LOAD instructions (here, LDAL loads one byte). ; if the byte retreived is a zero, it will set the zero flag. You do not need to CMP AL, 0 -- it's automatic. 
  
 +You see, I know why you're here.
  
-However, there are other flags; The first four user-facing flags are E, F, B and U. You can set these flags and unset them in the same way as Z N C V -- ex. setting ZNCV is done with SEZ, SEN, SEC and SEV; unsetting them is done with CLZ, CLN, CLC and CLV. The E F B U flags are set and unset with: +I know what you've been doing. Why you hardly sleep. Why you work alone and night after night, you sit by your computer. You malloc(), you strcpy(), you buffer overflow.
  
-* SEE and CLE for the E (extended, or 'extra') flag. +It's all over you. Like rancid bacon grease on a jump table.
-* SEF and SEB for the F flag (or 'flag' flag). +
-* SEB and SEU, CLB and CLU for the B (bonus) and U (user) flags.+
  
-On a technical level the E flag is reserved as it is used to deal with BCD; but since we deprecated BCD instructions it is currently an unused flag. In any case, the F, B and U flags are never set by the CPU and may be used by user functions. A common use is to return a 1 bit status; 0 for no error and set (1) for error. Since these flags are never set by the CPU they are easy to control. Using the Z or C flags is dangerous since some instructions may corrupt those flags. +I know what you are looking for. I know because I was once looking for the same thing. And when he found me he told me I wasn't really looking for him. I was looking for an answer.
  
-Your programs can also use them as 1 bit status variables.+It's the question that drives us. It's the question that brought you here. You know the question, just as I did.
  
-next, the D flag, or debug flag. When set, it will dump instruction data to the javascript console. This significantly slows down the machine; in fact just having the instructions inline slows down the machine so debug is often removed and ignored in a production or release distribution of the SD-8516. Therefore, for all intents and purposes, you can use SED and CLD as a user flag, just be aware it does affect performance in debug releases. 
  
-The I flag (interrupt enable) prevents INT from being called, and is reserved for system use. Not sure what I want to do with it. 
  
-The S flag is almost useless; it was intended to turn off a memory trap in the sound system; I found it to be completely useless, maybe a 2% speedup or penalty. it is essentially a user facing flag.+The answer is out there, and it will find you if you want it to.
  
 +Do you think the compiler will always protect you?
  
-The only flags that you cannot access are the TR (trace), BR (breakpoint) and PR (protected mode) flags. They are so named after the first two letters of their name; but interestingly enough you might as well consider the R to mean restricted. You can't usually set these flags. They are reserved for system use.+Do you think safety is //kindness//?
  
-    // Arithmetic & User Flags (low byte 0-7) +It is //convention// that defines us.
-    Z = 0,   // Zero +
-    N = 1,   // Negative +
-    C = 2,   // Carry +
-    V = 3,   // Overflow +
-    E = 4,   // Extended carry -- not used/reserved +
-    F = 5,   // Fast Flags mode. When on, flags are not implicitly checked. +
-    B = 6,   // BCD/"Bonus" flag. Have fun! +
-    U = 7,   // User flag. For users to use.+
  
-    // Control & Operation Flags (high byte 8-15) +Purpose?  
-    D = 8,   // Debug mode +
-    TR = 9,   // Trace mode +
-    BR = 10,  // Breakpoint mode +
-    ER = 11,  // Error/Exception (i.e. return code 0 = ok, 1 = error) 'SER' -- set err +
-    PR = 12,  // Protected mode +
-    I = 13,  // Interrupt enable +
-    S = 14   // Sound auto-updates+
  
-The key of this lesson is merely to be aware of the flags and the instructions used to set and unset them. In general, they follow the pattern of SEZ and CLZ;; SE(T) and CL(EAR) with the flag letter replacing the parentheses.+Purpose is for poets and first-year CS students 
  
-=== Testing Flags +Purpose is what you tell yourself when you're learning to write games in Python. Or Lua.
-Oh, there's one more thing. If you use flags like F, B or U you may notice there is no JF or JNF (jump if F set and jump if F not set). That's because we don't want to add 50 different opcodes to deal with all the flags. What you can do is this:+
  
-        ; Some operation that sets the F flag +But convention. Convention is older than you.
-        TESTF 0x20 +
-        JZ              ; Jump if F is set +
-        JNZ             ; Jump if F is not set+
  
-TESTF works by setting the Z flag if all the bits set in the parameter are also set in the FLAGS register. if you give it a byte it only tests against the bottom 8 bits. +Convention was etched into silicon before you were born.
  
-Here's a chart of the bit values for each flag:+Convention doesn't care what you //want//.
  
-    Z = 0x0001 as u16,   // Bit 0 +Convention doesn't negotiate.
-    N = 0x0002 as u16,   // Bit 1 +
-    C = 0x0004 as u16,   // Bit 2 +
-    V = 0x0008 as u16,   // Bit 3 +
-    E = 0x0010 as u16,   // Bit 4 (was X - Extended carry) -- SEE and CLE can be used as a user-flag (is never set by an opcode) +
-    F = 0x0020 as u16,   // Bit 5 (Fast/deprecated) -- SEF and CLF can be used as a user-flag (is never set by an opcode) +
-    B = 0x0040 as u16,   // Bit 6 (Bonus/BCD) -- SEB and CLB can be used as a user-flag (is never set by an opcode) +
-    U = 0x0080 as u16,   // Bit 7 (User flag) -- SEU and CLU can be used as a user-flag (is never set by an opcode) +
-    D = 0x0100 as u16,   // Bit 8 (Debug) +
-    TR = 0x0200 as u16,  // Bit 9 (Trace) +
-    BR = 0x0400 as u16,  // Bit 10 (Breakpoint) +
-    ER = 0x0800 as u16,  // Bit 11 (Error/Exception) +
-    PR = 0x1000 as u16,  // Bit 12 (Protected Mode) +
-    I = 0x2000 as u16,   // Bit 13 (Interrupt) +
-    S = 0x4000 as u16    // Bit 14 (Sound)+
  
 +Convention simply **is**.
  
-== Lesson 9: The Stack +Null-terminated strings. They are not a mistake. They are not an accident. They are the price of admission.
-The stack is a concept held over from the early days when there were very few instructions available. If you consider a minimal ISA, you need instructions to load and store from memory, an instruction to compare, and so forth. In such a minimal architecture, loading and storing from memory has certain emergent properties. For example if you have a list of things, their position in memory is not random because you are incrementing a counter such as a memory pointer to traverse that list. It is this way of doing things that we remember when we use the stack.+
 
-The stack is just a data structure. But it is so important and fundamental that is baked into the instruction set of the CPU. This is a common theme; important things that people found they needed to do all the time became instructions. Even in a minimal-instruction set design (MISC) or reduced instruction set design (RISC) you will find instructions like PUSH and POP because they are some of the first things that were turned into instructions after fundamental operations like LOAD, STORE, AND and ADD. +You know I am right because you have been down that road, Mr. Anderson. You know how it ends. And I know that's not where you want to be.
 
-The stack is an area of memory that you can PUSH and POP values to, in order. For example, you can PUSH the number 5 and the number 5 will be "on top" of the stack. Then you can "POP" it later. The stack is like an array but you can only go forwards and backwards, and reading the stack destroys it. This is a lot like how old magnetic ring memory worked, in a way. +    ; ============================================================================ 
 +    ; AH=02h - strcmp 
 +    ; Input:  ELM = pointer to string 1 
 +    ;         FLD = pointer to string 2 
 +    ; 
 +    ; Output: C and ZF. 
 +    ;         ZF = 1 means equal, ZF = 0 means not equal (see below): 
 +    ;         C = 0 means equal 
 +    ;         C > 0 if str1 > str2 
 +    ;         C < 0 if str1 < str2 
 +    ; ============================================================================ 
 +    int12_strcmp: 
 +        PUSH B 
 +        PUSH D 
 +        PUSH E 
 +        PUSH F 
 +     
 +    strcmp_loop: 
 +        LDCL [ELM] 
 +        LDBL [FLD] 
 +        CMP CL, BL 
 +        JNZ @strcmp_diff 
 +     
 +        ; Characters match - check if end of string 
 +        CMP CL, #0 
 +        JZ @strcmp_equal 
 +     
 +        ; Continue to next character 
 +        INC ELM 
 +        INC FLD 
 +        JMP @strcmp_loop 
 +     
 +    strcmp_diff: 
 +        ; Strings differ - return difference 
 +        SUB CL, BL 
 +        CLZ                 ; Clear zero flag (not equal) 
 +        JMP @strcmp_exit 
 +     
 +    strcmp_equal: 
 +        ; Strings are equal 
 +        LDCL #0 
 +        SEZ                 ; Set zero flag (equal) 
 +        ; fallthru 
 +     
 +    strcmp_exit: 
 +        POP F 
 +        POP E 
 +        POP D 
 +        POP B 
 +        RET
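
The same comparison can be modeled in Python to see what the two outputs mean. This is a sketch under assumptions: memory is a flat bytes object, p1 and p2 stand in for ELM and FLD, and the function returns a (difference, zero_flag) pair in place of C and ZF. The assembly uses an 8-bit SUB, so how a negative difference is represented in CL is not modeled here; Python simply returns a signed integer.

```python
def strcmp(memory, p1, p2):
    """Compare two zero-terminated strings; return (difference, zero_flag)."""
    while True:
        c = memory[p1]               # LDCL [ELM]
        b = memory[p2]               # LDBL [FLD]
        if c != b:                   # CMP CL, BL / JNZ @strcmp_diff
            return c - b, False      # SUB CL, BL / CLZ
        if c == 0:                   # CMP CL, #0 / JZ @strcmp_equal
            return 0, True           # LDCL #0 / SEZ
        p1 += 1                      # INC ELM
        p2 += 1                      # INC FLD

# Three strings laid out back to back: "abc" at 0, "abd" at 4, "abc" at 8.
memory = b"abc\x00abd\x00abc\x00"
less = strcmp(memory, 0, 4)
equal = strcmp(memory, 0, 8)
greater = strcmp(memory, 4, 0)
print(less, equal, greater)
```

Note that the end-of-string check only happens after the two bytes have matched, so a shorter string compares as "less" when its terminating zero meets a non-zero byte in the other string.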
  
-Today, we would call the stack a LIFO buffer; a "Last-in, First-out" data structure. If I do this: 
  
-    PUSH 1 +== Lesson 11: Debugging Techniques 
-    PUSH 6 +There are several ways you can debug programs in SDA assembly.
-    PUSH 5+
  
-then three successive POPs will return 5, then 6, then 1 -- the reverse of the order you PUSH'ed them.+=== SED/CLD 
 +In research or development builds, inserting SED will turn on trace debugging and you will be able to see what the CPU is executing. However, for release or community edition builds debugging has been turned off for speed. Therefore if you are interested in debugging your code and the console messages are not helping, you can use the following to help analyze and debug your code:
  
-=== General use +=== INT 05h IO_PUTNUM 
-When you CALL or JSR (jump to subroutine) to function, the CPU pushes the return address onto the stack. Then a subsequent RET or RTS (return from subroutine) will POP the return address back into IP (instruction pointer) or PC (program counter) so that the next instruction loaded will be after the original CALL. +IO_PUTNUM is a CAM/IL function that prints a number (in b) to the screen:
  
-There are many uses for the stack but the most common is to temporarily save values. If you understand that you are 90% of the way there!+    LDB #10                 ; print a number in b (0-65535) 
 +     
 +    LDAH $63                ; IO_PUTNUM 
 +    INT $05
  
-For interrupts, it also pushes the registers and flags. You can do this manually if you want to save the registers on a function call. For example if you call a function with a pointer to a string, you might modify that pointer to find the end of the string (looking for a zero). That's what a strlen function does. So you PUSH the pointer register at the start and POP it after, to "save" the register back to where it was when the function was called. This way the code that calls strlen can then call strcpy without having to replace the string pointer.+=== INT 05h IO_PRINT_STR 
 +Similarly, IO_PRINT_STR will print a string followed by a newline.
  
-Another use is for IL (intermediate languages). They use RPN (Reverse Polish Notation) to store any kind of math equation on the stack.+        LDBLX @hello_world 
 +        LDAH $66                ; IO_PRINT_STR 
 +        INT $05 
 +     
 +        LDAH $64                ; IO_NEWLINE 
 +        INT $05 
 +        RET 
 +     
 +    hello_world: 
 +        .bytes "Hello World!", 0
  
-An ADD function will do this: One, the interpreter will push the two numbers and then push the add command. Then an interpreter will POP the add function, and then it knows to POP two numbers, add them, and push the result back on the stack. Why? to make ADD independent. Like a dispatcher for a mini CPU. Next, whatever function comes next just POPS the result off the stack. So you can print it, assign it to a variable, or use the result as part of a larger operation. For example, how do you interpet 5 * 2 - 1 + 6? Simple. You push 5 2 * 1 - 6 + and the computer will push and pop the results, like a mini CPU of its own. 
  
-Pop + tells it to pop two numbers and add them. The first number is 6. The second is a minus. Minus what? it pops two things, a 1 and a *. Multiply what? Multiply pops 5 and 2, multiplies them, and pushes 10 on the stack. This is the popped by the minus, which subtracts 1 from 10, pushing a 9. This then goes back to the + which adds the 9 and the 6 to get 15. This is how recursion and RPN is used to represent any equation on a stack. +This will allow you to print a string.
  
-The last one we will discuss is function calls from a higher level language. Often times when you compile a language like C it will put local variables on the stack. Then when you return from that function they all get popped. While they are on the stack they are accessed like [SP+index] so int c=5 would be: +=== INT 10h print string 
 +The interface for the above is based on the KERNAL BIOS interface from INT 10h. 
 + 
 +        LDAH $26                ;   AH=26h: Write string at cursor 
 +        LDBLX @hello_world 
 +        INT 0x10 
 + 
 +Note: The assembler will place #13 (CR, hex $0D) inside the string if you type \n. However, if you are dealing with strings on your own you must handle this yourself. For this you can use the set cursor position call (INT 10h, AH=22h) or the CR and LF and scroll functions (1Ah, 1Bh and 1Ch, respectively). 
 + 
 +You can also just call IO_NEWLINE from INT 05h, which calls **''@carriage_return''** and **''@linefeed''** internally. 
 + 
 +=== INT 10h print char 
 +        LDAH 0x24        ; AH=24h: Write character at cursor (teletype) 
 +        LDAL 0x41        ; ascii 65 'A' 
 +        INT 0x10 
 + 
 +The KERNAL BIOS also has functions to put characters on the screen in mode 1 (40x25 TTY). The first one is "print char". It is accessible via INT 10h AH=24h as above. 
 + 
 + 
 +=== INT 19h memdump 
 +Let's say you want to examine memory; for example to print some data in memory. You can use INT 0x19: 
 + 
 +    LDAH #7        ; Memory dump function 
 +    LDCL #2        ; Two rows (16 bytes) 
 +    INT 0x19       ; System services library 
 +    HALT 
 + 
 +This looks something like: 
 + 
 +    000000: 00 00 00 00 00 00 00 00  | ........ 
 +    000008: 00 00 00 00 00 00 00 00  | ........
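
If you want to compare a dump against what you expect to be in memory, the layout above is easy to reproduce. Here is a small Python sketch of it, under assumptions: the memdump() helper name, the uppercase hex digits, and the rule of printing "." for non-printable bytes are guesses based on the sample output, not the documented behaviour of INT 19h.

```python
def memdump(memory, start=0, rows=2, width=8):
    """Format rows of memory as 'ADDRESS: hex bytes  | ascii' lines."""
    lines = []
    for row in range(rows):
        addr = start + row * width
        chunk = memory[addr:addr + width]
        hex_part = " ".join(f"{byte:02X}" for byte in chunk)
        # Show printable ASCII as-is and everything else as a dot.
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{addr:06X}: {hex_part}  | {text}")
    return lines

memory = bytearray(16)            # sixteen zero bytes, like the sample above
lines = memdump(memory)
for line in lines:
    print(line)
```

With sixteen zero bytes this prints the same two rows as the sample. Put a string in the bytearray and the right-hand column shows it.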
  
-    STA [SP+1], 5 
  
-And when that local space is no longer needed it is POP'ed into a register which is then restored, via POP, at the end of the function. This method of keeping data on the stack is called a stack frame. Compilers like to use stack frames because they don't always know how many registers a CPU has and they need to work on different CPUs, like how GCC or LLVM works on windows, mac, amd, and many others. 
-     
-Understanding the stack is not too hard, but it's important! So, that's about it for this lesson. 
  
sd-8516_assembly_language_part_ii.1771031969.txt.gz · Last modified: by appledog
