CH32V003 driving WS2812B LEDs with SPI – Part 4

1 March 2025

I am both pleased and a little disappointed to find my experiment still running today, with over 40,000 successful loops and zero errors.

In order to help convince myself that this is not just a statistically unlikely run of “good luck”, I will eliminate the delays between color changes that I had inserted into the code to make the color changes more obvious to us slow(er)-brained humans.

Once upon a time, a long, long time ago, I worked on a project that needed to work perfectly every time, and be able to check that it had worked perfectly every time. We were running millions of encryption and decryption cycles on blocks of data and were seeing some very rare cases of mistakes creeping in. We were able to catch the mistakes, but were ever so curious as to what was causing them. Sound familiar? So I set up a test to run over the weekend and let the test machine run full speed ahead. Returning on Monday, we found it had caught five errors in just over 50,000,000 transactions. Unacceptable!

We contacted the manufacturer of one of the critical components of the device and were told that, because of the way we had configured the circuit, the component could encounter a “race condition” and double-clock itself if a particular signal arrived within a one-nanosecond window that varied plus or minus five nanoseconds over temperature. That’s a tiny window! But we were hitting it on a reproducible scale.

The solution in that case was to clock the device synchronously by providing our own clock signal to the chip instead of depending on its internal clock. That way all the transactions would have been rigidly in step and not exploring the spaces of all possible timing combinations. Unfortunately, we had already committed to a PCB design and were on the verge of production when the whole outfit went south. Printed circuit board design and production were things to be Taken Seriously back in the day.

And by “went south” I mean the owner cleaned out the bank account and disappeared, literally leaving us at the office saying, “Stay here and I’ll go get your paychecks from the bank myself.”

But away from past mistakes and back to present mistakes. Over five million loops with no errors seems to indicate to me that the code, when properly enhardwared, works as designed. Now I need to run it up on the lift and swap the original circuit back in and see if we can continue to reproduce the error states we were previously seeing.
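As a rough sanity check on how much a clean run actually proves, here is a minimal sketch of the statistical "rule of three": after N passes with zero failures, the per-loop failure probability is below roughly 3/N with about 95% confidence. The function names are mine, invented for illustration, and the sketch assumes each loop is an independent trial, which an intermittent connection may well not honor.

```c
/* Hypothetical helpers for illustration; not part of the project code. */

/* Approximate 95% upper bound on the per-loop failure probability after
   `loops` consecutive passes with zero failures (the "rule of three").
   Assumes the loops are independent trials. */
double zero_failure_upper_bound(unsigned long long loops)
{
    return loops ? 3.0 / (double)loops : 1.0;
}

/* Expected number of failures in `loops` trials at a given per-loop rate. */
double expected_failures(double rate, unsigned long long loops)
{
    return rate * (double)loops;
}
```

For five million clean loops the bound works out to about 6e-7 per loop. For comparison, the old encryption project's rate of 5 in 50,000,000 would be expected to produce only about half a failure in five million loops, so a clean run of this length could not rule out something that rare. It can rule out the far more frequent bursts of errors seen earlier.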

So at first it looked like the impossible was happening: everything now worked and yet I had changed nothing. But patience won by asking me to take a break and come back in a few minutes. When I did, I saw, just as I was sitting down, a run of errors being logged on the console.

So the next variation on the testing got underway. I wanted to try disconnecting the SPI output from the PA2 line and driving a different WS2812B external to the board. I set about finding another suitable LED module and building another little test cable for it. Again, when I sat back down at the desk, another run of errors was occurring at that very moment. What are the odds? one might ask.

A judicious tap-tap-tapping on the little breadboard circuit rewarded me with my answer: 100% guaranteed to fail when vigorously agitated. An intermittent connection is somehow to blame for all this mess.

Here is my list of possible candidates for where the issue lies, in decreasing order of probability (in my mind):

1.  Janky test cables made from exceedingly economical jumper wires
2.  Interconnects in the no-name solderless breadboard hosting the circuit
3.  My soldering of the header pins to the PCB
4.  Manufacturing error or tolerances in the board itself

Before I completely disassemble this prototype and build it up again in a more resilient form factor, I will go ahead and try the new LED module. Same problem: errors continue to be encountered, even with the PA2-PC6 bridge disconnected. Disconnecting the new LED module completely, while at the same time not re-connecting the onboard LED, still produces errors. I really thought it would have no measurable effect, and I seem to be right this one time.

Usually in these situations, the first thing I look for is some sort of power interruption or brown-out condition on the power supply. The reason I don’t think this is the culprit is that the chip does not seem to be resetting itself when these errors occur: both the loop and error counts persist across these error states. Additionally, I feel that both the CH32V003 and the WS2812B are correctly and adequately decoupled using their respective manufacturers’ suggested values of capacitors.
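The "counters persist, so no reset" reasoning could be cross-checked against the chip's reset-cause flags. Here is a minimal, host-runnable sketch that decodes a raw reset-status register value. The bit positions are my assumption from reading the reference manual's description of the RCC_RSTSCKR register, and every name prefixed with sketch_ or SKETCH_ is hypothetical, so verify against the manual before trusting it.

```c
#include <stdint.h>

/* ASSUMED reset-flag bit positions in RCC_RSTSCKR; verify before use. */
#define SKETCH_PINRSTF   (1u << 26)  /* external NRST pin reset          */
#define SKETCH_PORRSTF   (1u << 27)  /* power-on / power-down reset      */
#define SKETCH_SFTRSTF   (1u << 28)  /* software reset                   */
#define SKETCH_IWDGRSTF  (1u << 29)  /* independent watchdog reset       */
#define SKETCH_WWDGRSTF  (1u << 30)  /* window watchdog reset            */
#define SKETCH_LPWRRSTF  (1u << 31)  /* low-power management reset       */

typedef enum {
    SKETCH_RESET_NONE,
    SKETCH_RESET_POWER,
    SKETCH_RESET_WATCHDOG,
    SKETCH_RESET_SOFTWARE,
    SKETCH_RESET_LOW_POWER,
    SKETCH_RESET_PIN
} sketch_reset_t;

/* Classify the most significant cause recorded in a raw register value.
   Power reset is checked first because the pin flag may be set alongside
   it; the pin flag is checked last for the same reason. */
sketch_reset_t sketch_reset_cause(uint32_t rstsckr)
{
    if (rstsckr & SKETCH_PORRSTF)                      return SKETCH_RESET_POWER;
    if (rstsckr & (SKETCH_IWDGRSTF | SKETCH_WWDGRSTF)) return SKETCH_RESET_WATCHDOG;
    if (rstsckr & SKETCH_SFTRSTF)                      return SKETCH_RESET_SOFTWARE;
    if (rstsckr & SKETCH_LPWRRSTF)                     return SKETCH_RESET_LOW_POWER;
    if (rstsckr & SKETCH_PINRSTF)                      return SKETCH_RESET_PIN;
    return SKETCH_RESET_NONE;
}
```

On the hardware, the idea would be to read the register once at startup (I believe the WCH headers expose it as RCC->RSTSCKR), print the decoded cause alongside the loop and error counts, and then clear the flags. A supply dip deep enough to reset the chip should then announce itself on the next boot instead of having to be inferred from the counters.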

Now one thing I have not yet done is to activate the chip’s inbuilt power monitoring circuitry. Perhaps that could tell me whether there are variations in the chip’s internal power distribution large enough to make individual peripherals misbehave without triggering a complete system reset.

Reading about the power control peripheral, I see that it can monitor the system voltage and potentially trigger an interrupt if the supply crosses a configured threshold. Using the SDK, I see the first reasonable thing to do is to ‘de-initialize’ the peripheral using the PWR_DeInit() function, declared in /Peripheral/inc/ch32v00x_pwr.h and defined in /Peripheral/src/ch32v00x_pwr.c. Here’s what the function does:

RCC_APB1PeriphResetCmd(RCC_APB1Periph_PWR, ENABLE);
RCC_APB1PeriphResetCmd(RCC_APB1Periph_PWR, DISABLE);

But… wait a minute. Isn’t that the “brick myself so hard” sequence I found earlier? Let’s find out!

And the answer is… yes. Yes, it does. Recovery consists of the following steps.

In the MRS2 IDE, go to the Flash -> Download Configuration menu option.
In the “Download Parameters” panel, go to the bottom and check (enable) these options:

1.  Turn off WCH-Link Power Output
2.  Clear CodeFlash by Power-Off
3.  Disable MCU Code-Protect

Now take that PWR_DeInit() function call out of your program! Your program should download properly and run again.

Now having configured and enabled the “programmable voltage detector” circuit (and don’t forget to enable the PWR peripheral’s clock, like I did!), I see that the chip thinks its supply voltage is just fine. I set it to the highest threshold, ~4.4 VDC, and actually measured 4.75 VDC at the board. The chip is rated to run at full speed all the way down to 2.7 V, or 2.8 V if’n you’re wanting ADC function, so it should still be running, and able to flag a voltage anomaly in this manner, long before the supply sags far enough to matter. Of course, the next step is to make the voltage monitoring an asynchronous process and have it trigger an interrupt, but we both know it’s my wiring.
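For the curious, here is a minimal register-level sketch of what that configuration amounts to, written so it can be compiled and poked at on a desktop machine. It is not the SDK's code. The bit layout (enable bit at position 4, a 3-bit threshold select in bits 7:5 of the control register, and a status bit at position 2 of the status register) is my assumption from the reference manual, and all SKETCH_/sketch_ names are hypothetical.

```c
#include <stdint.h>

/* ASSUMED bit layout of the PWR control and status registers. */
#define SKETCH_PVDE       (1u << 4)   /* voltage detector enable          */
#define SKETCH_PLS_SHIFT  5           /* threshold select, 3 bits wide    */
#define SKETCH_PLS_MASK   (7u << SKETCH_PLS_SHIFT)
#define SKETCH_PVDO       (1u << 2)   /* 1 = supply is below threshold    */

/* Return the control-register value with the detector enabled at the
   given threshold level (0..7, where 7 is assumed to be the ~4.4 V
   setting). All other bits of `ctlr` are left untouched. */
uint32_t sketch_pvd_enable(uint32_t ctlr, unsigned level)
{
    ctlr &= ~SKETCH_PLS_MASK;
    ctlr |= ((uint32_t)(level & 7u)) << SKETCH_PLS_SHIFT;
    ctlr |= SKETCH_PVDE;
    return ctlr;
}

/* Nonzero when the status register reports the supply below threshold. */
int sketch_pvd_below_threshold(uint32_t csr)
{
    return (csr & SKETCH_PVDO) != 0;
}
```

On the real chip, the PWR peripheral's clock has to be enabled first (RCC_APB1Periph_PWR, as seen in the de-init code earlier), or writes to the control register simply don't stick. After that, polling the status bit inside the test loop would show whether the supply ever dips under the threshold while errors are being logged.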

I’ll wire up a more robust test fixture on the morrow.