Notes on RISC-V Assembly Language Programming - Part 18

12 February 2025

I can fit the scalar values for each glyph into a one-dimensional array. Then I need an array of pointers to variable-length arrays of coordinates. Others have been able to do all this with a single array, but I see a lot of wasted space in there.

I’m trying to decide ahead of time if I need to reproduce the left column and right column values in the representation array, or if I can just get away with character widths. Or do I even need to keep track of the character widths? I could just treat these as monospaced characters and just pick a number.

Here are the leftmost and rightmost columns from the plain set, each given as (value, glyph number):

Max left = (-2, 9)
Min left = (-8, 1241)
Max right = (9, 1273)
Min right = (2, 9)

And here are the same statistics from the simplex set:

Max left = (-4, 509)
Min left = (-15, 613)
Max right = (15, 613)
Min right = (4, 509)

After I’m ‘done’ with these scalable fonts, there’s one more bit-mapped font trick I want to try. I can take my existing 5×8 font and double or triple it in size, giving a blocky character. That might better represent the types of letters and numbers seen on temporary highway signs, as those still tend to be composed of 5×7 (or so) LED matrices. But when am I ever ‘done’ with anything?

So I am going to assume that I need all this data for now, and incorporate it into some data structures and try to port them over to the project and see if I can plot some nice looking characters onto the little OLED screen.

The first array encodes the ASCII value of the character as the index, so that doesn’t need an actual slot in the data file. In reality, since the first 32 ASCII characters are technically unprintable, our array index [0] points to ASCII value 32, the space, which, ironically, while a ‘printable’ character, does not print anything. This offset is just something that has to be remembered.
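A minimal sketch of that offset, assuming the table stops at the tilde (ASCII 126). The function name and both constants are hypothetical, not taken from the project:

```c
#define FIRST_PRINTABLE 32   /* ASCII space lives at array index [0] */
#define LAST_PRINTABLE  126  /* ASCII tilde: assumed last table entry */

/* Map an ASCII code to its slot in the glyph table.
 * Returns -1 for anything outside the assumed printable range. */
int glyph_index(char c)
{
    unsigned char u = (unsigned char)c;

    if (u < FIRST_PRINTABLE || u > LAST_PRINTABLE) {
        return -1;
    }
    return (int)u - FIRST_PRINTABLE;
}
```

Returning -1 for out-of-range input keeps the offset in one place, so the caller only has to decide what to draw for an unprintable character.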

Each entry in the array will be a typedef’d structure containing the requisite information:

Information         Plain (min, max)  Simplex (min, max)
------------------  ----------------  ------------------
Number of vertices  (0, 38)           (0, 56)
Left hand column    (-8, -2)          (-15, -4)
Right hand column   (2, 9)            (4, 15)
Coordinate index
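One way such an entry might look, as a sketch only. The field names and widths are assumptions sized to the plain and simplex ranges in the table, and the coordinate index is assumed to be a 16-bit offset into the coordinate array:

```c
#include <stdint.h>

/* Hypothetical per-character record; names and widths are assumptions. */
typedef struct {
    uint8_t  vertex_count;  /* 0..56 fits comfortably in 8 bits   */
    int8_t   left;          /* left hand column, -15..-2          */
    int8_t   right;         /* right hand column, 2..15           */
    uint16_t first_vertex;  /* index into the coordinate array    */
} GLYPH_t;

/* Character width is the distance between the two columns. */
int glyph_width(const GLYPH_t *g)
{
    return (int)g->right - (int)g->left;
}
```

With the simplex extremes from glyph 613 (left -15, right 15) this gives a width of 30, so keeping both columns also answers the character-width question without a separate field.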

Note that these sampled data values only represent the two subsets, roman plain and roman simplex. The other styles will have different values. Just for completeness, here are the statistics for the entire occidental glyph set:

Statistic           Value   Character
------------------- ------  ---------
Max vertices        143     3323
Max left            0       197
Min left            -41     907
Max right           41      907
Min right           0       197
Max character width 82      907
Max x               41      907
Min x               -41     907
Max y               41      907
Min y               -48     2411
Max dx              40      796
Min dx              -29     2825
Max dy              78      2405
Min dy              -80     2411
------------------- ------
Total vertices      47,465

Just looking at the total number of vertices, and remembering that each vertex will require a minimum of two bytes for storage, we see that this little device with its 62K of flash memory will not be big enough to render every one of these characters without adding an external memory device of some sort. So for now, I’ll content myself with the plain and simplex roman variations.
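Putting numbers on that: 47,465 vertices at two bytes each is 94,930 bytes, against 63,488 bytes if 62K is read as 62 × 1024. A small sketch of the check, with hypothetical names, counting vertex data only and ignoring the program itself:

```c
#include <stdint.h>

#define TOTAL_VERTICES   47465u         /* from the statistics table    */
#define BYTES_PER_VERTEX 2u             /* the minimum estimated above  */
#define FLASH_BYTES      (62u * 1024u)  /* assumes 62K means 62 * 1024  */

/* Bytes needed for the vertex data alone. */
uint32_t vertex_bytes_needed(void)
{
    return (uint32_t)TOTAL_VERTICES * BYTES_PER_VERTEX;
}

/* Nonzero when the vertex data alone would fit in flash. */
int vertex_data_fits(void)
{
    return vertex_bytes_needed() <= FLASH_BYTES;
}
```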

The vertex encoding gets tantalizingly close to a single byte per coordinate pair. However, I want to also encode the ‘pen up’ information, which I use to distinguish ‘move to’ and ‘draw to’ commands. If I felt like running histograms on these data sets, I might be able to see a further pattern or trend that would allow me to use a look-up table for these values. But I am going to leave that as an exercise for you, my Dear Reader. I have to draw the line, somewhere.

So it looks like our benefactor, Dr. Hershey, was on to something when he originally encoded his coordinates as pairs of single digits. I’m not going to use his precise technique, although it will still end up as 16 bits of data per vertex. I’m just folding in the out-of-band ‘pen up’ condition to each coordinate pair.
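As a point of comparison, here is a minimal sketch of the original scheme, assuming the commonly circulated ASCII form of the data, in which each coordinate is a single character stored as an offset from the letter ‘R’ and the pair ‘ R’ marks pen up. The names DECODED_t and hershey_decode are made up for illustration:

```c
#include <stdbool.h>

/* One decoded coordinate pair. Names are illustrative only. */
typedef struct {
    int  x;
    int  y;
    bool pen_up;  /* true when the pair is the ' R' pen-up marker */
} DECODED_t;

/* Decode two characters of vertex data.
 * Assumption: value = character - 'R', and " R" means pen up. */
DECODED_t hershey_decode(char cx, char cy)
{
    DECODED_t d = { 0, 0, false };

    if (cx == ' ' && cy == 'R') {
        d.pen_up = true;   /* out-of-band marker, carries no coordinates */
    } else {
        d.x = cx - 'R';
        d.y = cy - 'R';
    }
    return d;
}
```

In this form the pen-up marker occupies a whole two-character slot of its own, which is the cost that folding the flag into a neighbouring coordinate pair avoids.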

Reviewing the summary, it looks like our friend character 907 is bringing home all the gold medals. It’s the ‘very large circle’ glyph, and I’m going to disqualify it for being an outlier. This is the one that broke my Python script and simplistic transmission encoding. It’s a lovely pentacontagon, or fifty-sided polygon, and therefore the smoothest of the approximated circles in the repertory.

Statistic           Value   Character
------------------- ------  ---------
Max vertices        143     3323
Max left            0       197
Min left            -27     2411
Max right           24      2381
Min right           0       197
Max character width 46      992
Max x               22      906
Min x               -24     2411
Max y               39      2403
Min y               -48     2411
Max dx              40      796
Min dx              -29     2825
Max dy              78      2405
Min dy              -80     2411
------------------- ------
Total vertices      47,415

So for the vertex array, each vertex will be a typedef’d struct holding the x and y coordinates as signed integers, as well as a boolean ‘pen up’ flag to distinguish ‘move to’ from ‘draw to’. Since the x axis shows a slightly smaller range of values, I’ll squeeze the ‘pen up’ flag into the x side, perhaps like this:

typedef struct { // vertex data
    signed int  x:7;        // x coordinate, -64..63 (plain 'int' bit-field signedness is implementation-defined)
    PEN_UP_t    pen_up:1;   // 'pen up' flag
    signed int  y:8;        // y coordinate, -128..127
} VERTEX_t;
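Bit-field layout is left to the compiler, and a struct built from int-typed bit-fields may occupy more than two bytes on some ABIs. An alternative sketch packs the same three fields into a uint16_t by hand. The bit positions and function names here are assumptions, not the layout the project actually uses:

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed layout: bits 0-6 x, bit 7 pen up, bits 8-15 y. */
uint16_t vertex_pack(int x, int y, bool pen_up)
{
    return (uint16_t)(((unsigned)x & 0x7Fu)
                    | (pen_up ? 0x80u : 0u)
                    | (((unsigned)y & 0xFFu) << 8));
}

int vertex_x(uint16_t v)
{
    int x = (int)(v & 0x7Fu);
    return (x & 0x40) ? x - 0x80 : x;    /* sign-extend from 7 bits */
}

int vertex_y(uint16_t v)
{
    int y = (int)(v >> 8);
    return (y & 0x80) ? y - 0x100 : y;   /* sign-extend from 8 bits */
}

bool vertex_pen_up(uint16_t v)
{
    return (v & 0x80u) != 0;
}
```

The trade-off is a little more code at each access in exchange for a storage size that is 16 bits by construction.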

So I’ll need to add some more to my little Python script to generate the data for these two arrays, then emit it in a close approximation of my C coding style.

It took a bit of fiddling and also some back-and-forth to get the data structures ‘just right’, but I was able to port over both the plain and simplex roman character sets and have them plot out on the OLED screen. One thing that tripped me up was the vertex count. The original definition file described a ‘vertex count’ that also included the left and right column data as an additional vertex. Also, it counted, as it should, the ‘PEN_UP’ codes. These two little deviations that I introduced into the True Form sure made things look weird on the little screen for a while. But I eventually realized the error of my ways and corrected the code. Now it runs through either the plain set or the simplex set with the greatest of ease. Drawing a single character at a time happens so quickly, it seems almost instantaneous. I’ll have to try printing out a whole screen of text and see if I can tell how long it’s taking.
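A rough sketch of the kind of loop involved, to show why the count matters: it is the only thing that stops the walk through the vertex data. Everything here is hypothetical, including the unpacked POINT_t, the LINE_FN callback standing in for the display driver, and the assumption that a flagged vertex means ‘move to’ that point:

```c
#include <stdbool.h>
#include <stddef.h>

/* Unpacked vertex for illustration; the stored data is bit-packed. */
typedef struct {
    int  x;
    int  y;
    bool pen_up;  /* assumed: true means 'move to' this point */
} POINT_t;

/* Hypothetical line-drawing callback supplied by the display driver. */
typedef void (*LINE_FN)(int x0, int y0, int x1, int y1);

/* Walk 'count' vertices, drawing one segment per 'draw to'.
 * Returns the number of segments drawn. 'line' may be NULL. */
int glyph_draw(const POINT_t *v, size_t count, LINE_FN line)
{
    int  segments = 0;
    int  px = 0, py = 0;
    bool have_pos = false;

    for (size_t i = 0; i < count; i++) {
        if (have_pos && !v[i].pen_up) {
            if (line != NULL) {
                line(px, py, v[i].x, v[i].y);
            }
            segments++;
        }
        px = v[i].x;   /* remember the current pen position */
        py = v[i].y;
        have_pos = true;
    }
    return segments;
}
```

If the count is one too high, the loop draws an extra stroke into whatever data follows; if it is too low, the last stroke of the character goes missing.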

Next I’ll need to see about scaling these ‘scalable’ fonts to fit my imagined sizes for the different formats I’d like to support. I also need to look at the big-blocky font I suggested previously.
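The big-blocky idea amounts to pixel replication: every source pixel becomes a scale × scale square. A minimal sketch, assuming the 5×8 glyph is stored as five column bytes with bit 0 as the top row. That storage order and the function name are guesses for illustration, not the project’s actual format:

```c
#include <stdint.h>
#include <stdbool.h>

#define FONT_COLS 5
#define FONT_ROWS 8

/* Is the pixel at (sx, sy) of the enlarged glyph lit?
 * Assumes one byte per column, bit 0 at the top. */
bool scaled_pixel(const uint8_t glyph[FONT_COLS], unsigned scale,
                  unsigned sx, unsigned sy)
{
    if (scale == 0) {
        return false;
    }
    unsigned col = sx / scale;   /* which source column covers sx */
    unsigned row = sy / scale;   /* which source row covers sy    */

    if (col >= FONT_COLS || row >= FONT_ROWS) {
        return false;
    }
    return ((glyph[col] >> row) & 1u) != 0;
}
```

Looking pixels up on demand like this means no enlarged copy of the font has to be stored in flash.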