31 January 2025
So today I want to revisit the la vs li vs lui confusion that I am experiencing about how I’m initializing the stack pointer. Having thought that the matter was settled, I deleted the other ‘control’ instructions that I was using for comparison. Luckily for me, I jotted it all down in my previous notes, so it was there when I needed it. Here is the reconstructed version:
la sp, END_OF_RAM # initialize the stack pointer
la sp, 0x20005000 # *** debug *** for comparison
lui sp, %hi(END_OF_RAM) # initialize the stack pointer
la t0, END_OF_RAM # *** debug *** for comparison
la t0, 0x20005000 # *** debug *** for comparison
And here is what it gets crunched down to after assembly:
la sp, END_OF_RAM # initialize the stack pointer
0: 20005137 lui sp,0x20005
la sp, 0x20005000 # *** debug *** for comparison
4: 20005137 lui sp,0x20005
lui sp, %hi(END_OF_RAM) # initialize the stack pointer
8: 20005137 lui sp,0x20005
la t0, END_OF_RAM # *** debug *** for comparison
c: 200052b7 lui t0,0x20005
la t0, 0x20005000 # *** debug *** for comparison
10: 200052b7 lui t0,0x20005
So I had noticed that the first three instructions all assembled to the same machine word, 0x20005137, while the last two gave me a puzzling 0x200052b7. In the clear light of day I see that this is not the catastrophe I originally thought it was: the last two are not initializing the stack pointer at all, but the t0 register, for comparison.
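To convince myself that the only difference between the two machine words is the destination register, here is a minimal C sketch that builds a lui instruction word by hand. The function name encode_lui and the REG_ constants are my own inventions for illustration; the field layout (immediate in bits 31:12, rd in bits 11:7, opcode 0x37) is the standard RISC-V U-type format.

```c
#include <stdint.h>

/* Register numbers from the standard RISC-V register file. */
enum { REG_SP = 2, REG_T0 = 5 };

/* Encode a U-type LUI: imm[31:12] | rd[11:7] | opcode[6:0].
   encode_lui is a hypothetical helper, not part of any toolchain. */
uint32_t encode_lui(uint32_t rd, uint32_t imm20)
{
    const uint32_t OPCODE_LUI = 0x37u;
    return ((imm20 & 0xFFFFFu) << 12) | ((rd & 0x1Fu) << 7) | OPCODE_LUI;
}
```

With rd = 2 (sp) the low twelve bits come out as 0x137, and with rd = 5 (t0) they come out as 0x2b7, which matches the listing.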
Just to be doubly sure, I debug the code a line at a time and watch the registers in real time. Again, I spy the stack pointer holding the mysterious value 0x20002800 before any of the code is executed in this debug session.
The first instruction does what I want, which, in this particular instance, was what I told it to do. The second instruction also seems to work properly, but since there was no net observable change of state, I might want to introduce intentional intermediate values just to make sure that the correct value is really being re-written to the stack pointer. The third instruction follows the pattern, leaving sp standing at 0x20005000.
The fourth instruction sets t0 to decimal value 536891392, which translates to 0x20005000, so that’s good. The fifth and final instruction does the same.
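The reason la and li collapse to a lone lui here is that the low 12 bits of 0x20005000 are all zero, so there is nothing left for a following addi to add. A small C sketch of the usual %hi/%lo split shows this; hi20 and lo12 are names I made up, and the rounding by 0x800 is how I understand the assembler compensates for addi sign-extending its immediate.

```c
#include <stdint.h>

/* Upper 20 bits, rounded so that adding the sign-extended
   low 12 bits back gives the original value. */
uint32_t hi20(uint32_t value)
{
    return (value + 0x800u) >> 12;
}

/* Low 12 bits, sign-extended to 32 bits as addi would see them. */
int32_t lo12(uint32_t value)
{
    int32_t low = (int32_t)(value & 0xFFFu);
    return (low >= 0x800) ? low - 0x1000 : low;
}
```

For 0x20005000 (decimal 536891392) the split is 0x20005 and 0, so a single lui does the whole job. An address such as 0x20004FFC would instead need lui with 0x20005 followed by an addi of -4.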
So the only mystery remaining is why I forgot that I had intentionally introduced a wrinkle into the testing methodology without following up on it correctly. It helps to write things down.
All this work to understand how to succinctly initialize the stack pointer when I’m not even using it yet. That, I think, is about to change.
Now I can get on to the important business of creating not one but two fake reset buttons for the development board of a chip that has no external reset signal available.
So the plan is to use an external interrupt line to trigger a routine that initiates a self-reset of the chip. So I will start at the very end and write a short routine that resets the chip and then call it when the user types a particular key on the console. And to do that, I need to modify the present code to wait for a character to arrive from the USART, echo it and then check if the reset key has been pressed and jump to the reset routine if it has. It would also be nice if the LED continued to blink, if not all by itself then perhaps every time a character was received.
Right now the USART initialization code spits out a single ‘!’ character to show that the thing is working.
I have just discovered that the GNU assembler treats ‘//’ as the single-line comment pattern, in addition to ‘#’. The old-school /* comment */ format is also supported. I only saw this because MRS2, which evolved from Eclipse, will prepend ‘// ‘ (including a space) to selected lines of code when you press the command+/ keys, similar to control+/ on other OSes.
So I will create a function that toggles the LED, so that I don’t have to keep track of two peripheral pointers at once.
The first function is always the most interesting. I vaguely remember how to do this in RISC-V assembly. First you decrement the stack pointer by how many bytes of memory your function will need, then preserve any registers that you need to preserve for the caller, if any. Then you do whatever is appropriate for the function to do and then back out the way you came in: ‘pop’ any preserved values off the stack and then return the stack pointer to its original value.
Now this makes me start thinking about whether my use of t0/t1 in the initialization code was the best choice. Technically, since no functions were called and no interrupts were suspected, it didn’t matter which registers I used. But now that we’re entering the grown-up world of proper functions and accountability, perhaps I should switch over to the ‘saved’ registers, s0-s11. They are also known as:
x8 s0/fp saved register 0, frame pointer
x9 s1 saved register 1
x18 s2 saved register 2
x19 s3 saved register 3
x20 s4 saved register 4
x21 s5 saved register 5
x22 s6 saved register 6
x23 s7 saved register 7
x24 s8 saved register 8
x25 s9 saved register 9
x26 s10 saved register 10
x27 s11 saved register 11
Note that s2-s11 are not present at all on the RV32EC devices, such as the CH32V003.
Again, this is only if I want to stay reasonably aligned with the published ABI, which I am under no obligation whatsoever to observe. I have zero intention at this point of ‘cooperating’ with any other software on this project, so I have the dizzying freedom to do as I think best. And what I think is best does tend to shift a bit over time.
For reference, here are the RISC-V calling conventions, straight from the source:
https://riscv.org/wp-content/uploads/2024/12/riscv-calling.pdf
There is one caveat to that statement, however. Both the QingKeV2 and QingKeV4 processors have a feature called Hardware Prologue and Epilogue (HPE). This is described in the V2 PM in Section 3.4, p. 14 and V4 PM p. 17. This mechanism is triggered by either an interrupt or an exception and saves and restores either 10 (V2) or 16 (V4) of the ‘caller saved’ registers to and from the stack.
The V2 decrements the stack pointer by 48 before the push and adds it back afterward. The V4 saves the list of registers to an internal stack in a single cycle, then restores them when appropriate.
V2 saved registers:
x1 ra
x5 t0
x6 t1
x7 t2
x10 a0
x11 a1
x12 a2
x13 a3
x14 a4
x15 a5
V4 saved registers:
x1 ra
x5 t0
x6 t1
x7 t2
x10 a0
x11 a1
x12 a2
x13 a3
x14 a4
x15 a5
x16 a6
x17 a7
x28 t3
x29 t4
x30 t5
x31 t6
Updating to the latest Processor Manual (PM) for the QingKeV2, V1.2, I am reminded of questions I have yet to solve, such as what is the EABI mode on/off option controlled from the Interrupt System Control Register (INTSYSCR), at CSR address 0x804?
There is a typo in the PM V4 in Table 1-2 RISC-V Registers, where registers x18-27 are referred to as a2-11, when they should be called s2-11. Oopsie.
I’m not 100% sure how I can leverage this hardware capability to my advantage yet, but it’s nice to know it’s there. How much should it affect my choice of working registers?
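One way to look at that question is to turn the two register lists above into bit masks and ask which registers HPE actually covers. This is only a sketch in C; the macro and function names are mine, and the masks are copied from the lists in these notes rather than checked against silicon.

```c
#include <stdint.h>

#define X(n) (1u << (n))   /* one bit per register x0..x31 */

/* Registers stacked by HPE, per the V2 and V4 lists above. */
#define HPE_V2_MASK (X(1) | X(5) | X(6) | X(7) | \
                     X(10) | X(11) | X(12) | X(13) | X(14) | X(15))
#define HPE_V4_MASK (HPE_V2_MASK | X(16) | X(17) | \
                     X(28) | X(29) | X(30) | X(31))

/* Return 1 if register x<reg> is in the given HPE mask, else 0. */
int hpe_saves(uint32_t mask, unsigned reg)
{
    return (int)((mask >> reg) & 1u);
}

/* Count how many registers a mask covers. */
int hpe_count(uint32_t mask)
{
    int count = 0;
    while (mask != 0u) {
        count += (int)(mask & 1u);
        mask >>= 1;
    }
    return count;
}
```

The counts come out to 10 and 16, as advertised. Neither list includes s0 (x8) or s1 (x9), so if an interrupt handler uses them it would presumably still have to save them itself.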
I think I’m going to switch over to using s0/s1 for the initialization code and thenceforth into the future. That seems to work OK, so I can now write a function to toggle the LED, call it from a loop and see what breaks next.
Here’s what the program looks like now with an endless loop that just calls the led_toggle function forever:
loop: # an endless loop
call led_toggle # toggle the LED
j loop # do it again
led_toggle: # toggle LED1 on A0
# on entry: none
# on exit: none
addi sp, sp, -16 # allocate space on the stack
sw s0, 12(sp) # preserve s0
sw s1, 8(sp) # preserve s1
la s0, GPIOA_BASE
lh s1, GPIO_OUTDR(s0) # read present value of GPIO_OUTDR
xori s1, s1, (1 << 0) # toggle bit 0
sh s1, GPIO_OUTDR(s0) # write inverted value back
lw s0, 12(sp) # restore s0
lw s1, 8(sp) # restore s1
addi sp, sp, 16 # restore stack pointer
ret # return from function
It’s nice to leave a comment in the function as to what is expected on entry and exit. In this example, no arguments are passed into the function and no return value is expected.
Alas, TextEdit does not allow me to indent a block of text. I can do it in the MRS2 code editor, copy it to the clipboard and then un-indent it in MRS2, then paste it into TextEdit.
Note that I am allocating four words of space on the stack in preparation for saving the registers, when I only need 2 words. This is because the ABI says to allocate space on the stack in 16 byte (128 bit) blocks. Now the ABI is not the boss of me, but sometimes I find that I need one more register and having allocated a bigger-than-needed block comes in handy. It doesn’t cost anything in execution time.
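The rounding itself is simple enough to sketch in C. The helper name frame_size is hypothetical; it just rounds the number of bytes a function needs up to the next multiple of 16.

```c
#include <stdint.h>

/* Round a byte count up to the next multiple of 16. */
uint32_t frame_size(uint32_t bytes_needed)
{
    return (bytes_needed + 15u) & ~15u;
}
```

Two saved words need 8 bytes, which rounds up to the 16 used in led_toggle. A function that saved five words (20 bytes) would get a 32-byte frame.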
Now I need a function to see if the USART has received any characters.
usart_rxne: # return USART1 RXNE receive register not empty status
# on entry: none
# on exit: a0[0] = USART1_RXNE
addi sp, sp, -16 # allocate space on the stack
sw s0, 12(sp) # preserve s0
la s0, USART1_BASE
lh a0, USART_STATR(s0) # read status register
andi a0, a0, USART_RXNE # isolate RXNE receive register not empty status bit
snez a0, a0 # set not equal to zero
lw s0, 12(sp) # restore s0
addi sp, sp, 16 # restore stack pointer
ret # return from function
Since I am going to use the s0 register as the peripheral pointer to USART1, I preserve its current value on the stack. Setting s0 to USART1_BASE, I can read in the status register STATR and mask out all the bits except USART_RXNE, leaving only the status bit. Now this particular status bit happens to be in bit position 5, leaving either a zero (receive register is empty) or a 0x20 (receive register not empty). Using the pseudoinstruction snez, I set the whole register to either 1 or 0, depending on whether the masked value was non-zero. This makes it easier for the calling function to interpret the results, i.e., true or false, than expecting it to know which bits mean what in every kind of peripheral status register.
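The mask-then-normalize step is easy to model in C. This is a desktop sketch only: usart_rxne_model is a made-up name, it takes the status value as an argument instead of reading hardware, and USART_RXNE is defined here as bit 5 to match the description above.

```c
#include <stdint.h>

#define USART_RXNE (1u << 5)   /* bit position 5, as noted above */

/* Same two steps as the assembly: andi to isolate the bit,
   then snez to turn "non-zero" into exactly 1. */
uint32_t usart_rxne_model(uint32_t statr)
{
    uint32_t masked = statr & USART_RXNE;   /* 0 or 0x20 */
    return (uint32_t)(masked != 0u);        /* 0 or 1    */
}
```

Whatever else is set in the status word, the caller only ever sees 0 or 1.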
Now I can write another function to actually receive and return a single character from the USART, using the usart_rxne function to tell if there’s anything there yet or not. In truth it would also be possible to structure this another way. I prefer to do it this way, as I might want to have a separate routine to tell if a key has been pressed, e.g., kb_hit().
Here is the code to reset the system:
system_reset: # reset the system
# on entry: none
# on exit: does not return
la s0, PFIC_BASE
li s1, 0xFA050000 # key 1
sw s1, PFIC_CFGR(s0)
li s1, 0xBCAF0000 # key 2
sw s1, PFIC_CFGR(s0)
li s1, 0xBEEF0000 | PFIC_RESETSYS # key 3 + system reset request bit
sw s1, PFIC_CFGR(s0)
1: j 1b # loop here until reset occurs
I don’t really know if the loop at the end does any good or not, but there’s no specification on how long it takes for the system reset to take hold. We don’t want any more code executing after this point.
I added some code to get a character from the USART and echo it back to the console, then check to see if it was a 0x00 character (control+space) that was received. If it was, it just jumps to the system_reset function. The code there just feeds the three key values to the PFIC_CFGR register, with the last key having the SYSRST bit set (bit 7). The SVD calls it PFIC_RESETSYS.
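For the record, here is how the three words written to PFIC_CFGR are put together, as a small C sketch. The helper pfic_cfgr_word is my own name; the key values and the bit 7 position of the reset request are the ones used in system_reset above.

```c
#include <stdint.h>

#define PFIC_RESETSYS (1u << 7)   /* system reset request, bit 7 */

/* Place a 16-bit key in the upper half-word and any control
   flags in the lower half-word of the value to be written. */
uint32_t pfic_cfgr_word(uint16_t key, uint32_t flags)
{
    return ((uint32_t)key << 16) | (flags & 0xFFFFu);
}
```

Keys 0xFA05 and 0xBCAF go out with no flags, and key 0xBEEF combined with PFIC_RESETSYS gives 0xBEEF0080, the word that actually requests the reset.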
This seems to work. The chip seems to reset and print a ‘!’ character, then just echoes back anything else typed in, until you type control+space, then you see another ‘!’ appear. Also the LED is toggled after every keystroke, and gets reset to ‘on’ after a reset.
I added a usart_puts function to print a nul-terminated string via USART1. You can declare a constant text string to print like this:
announce_string: .asciz "G8-asm\r\n"
And the code to print it is this:
la a0, announce_string # announce
call usart_puts
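The logic of usart_puts is just a walk along the string until the terminating nul. Here is a rough C model of that loop that runs on a desktop; fake_usart_putc and the capture buffer are hypothetical stand-ins for the real transmit routine, and puts_model is not the actual assembly function.

```c
#include <stddef.h>

/* Hypothetical stand-in for the USART transmitter: it captures
   bytes in a buffer so the sketch can run without hardware. */
char   tx_capture[64];
size_t tx_count = 0;

void fake_usart_putc(char c)
{
    if (tx_count < sizeof tx_capture) {
        tx_capture[tx_count++] = c;
    }
}

/* Send characters until the terminating nul; the nul itself is
   not sent. Returns the number of characters sent. */
size_t puts_model(const char *s)
{
    size_t sent = 0;
    while (*s != '\0') {
        fake_usart_putc(*s++);
        sent++;
    }
    return sent;
}
```

Feeding it "G8-asm\r\n" sends eight characters: six for the name plus the carriage return and line feed.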
So now all I have to do is to wire an external interrupt to trigger the system_reset code.
The first target is the ‘Download’ button on the board. It is connected between Vcc and PC17. So I need to set up PC17 in the GPIO initialization section as an input with a pull-down resistor. I also have to enable the peripheral clock for GPIOC.
Enabling the GPIOC peripheral clock was easy. I just added the RCC_IOPCEN value to the list:
# enable peripheral clocks
li s1, RCC_USART1EN | RCC_IOPCEN | RCC_IOPBEN | RCC_IOPAEN
sw s1, RCC_APB2PCENR(s0)
Configuring PC17 is going to be a little different than all the other GPIO pins I’ve initialized so far. First of all, you might have noticed that its number (17) is out of the ‘normal’ range of 0-15 that most of the GPIO in this series as well as the STM32 devices share. So its configuration is handled through the USB PD peripheral. PC17 is also used as the USB-PD ‘PDM’ signal.
Further reading of the RM (always rewarding, even if not right away) reveals that there are a couple of ‘expansion’ registers for the higher-order GPIO bits, like our little friend PC17.
Ah, I have mis-read the documentation. It’s not the USB PD peripheral that’s connected to PC17, it’s the normal, regular, standard USB-DM line of the USB full speed device. As that is presently connected to the i8 wireless dongle, pushing the button is not going to be immediately detectable by my simple software techniques.
I don’t think it will be possible to use the ‘Download’ button as an alternative reset button.
What I do think is that I ought to think about it some more and get back to it tomorrow.
In other news, the i8 battery is still running.