Recap
I'm building a smart solar monitoring system that uses three panels, one of them a clean reference, to eliminate weather effects and directly measure dust-induced losses in real time. One panel stays pristine as a baseline, and comparing the soiled panels against it under identical sky conditions gives a performance ratio that reveals soiling immediately. The goal is to use environmental sensors and edge AI to predict exactly when cleaning is needed, before efficiency drops enough to impact revenue. This beats fixed-schedule cleaning or waiting for output to degrade. It is also complementary to my Master's thesis on a tiny shape-memory-alloy-based solar panel cleaning robot.
Previous posts:
- SolarSense - Part 1 - Introduction, The POC Built and The Plan
- SolarSense - Part 2 - Can Arduino CAN?
- SolarSense - Part 3 - PCB Schematics Walkthrough
Why CAN?
Because I CAN.. Alright, I will stop with the pun. Why use CAN just to say whether a panel is soiled or clean? The real challenge comes during the AI training phase. I need to experiment with different time windows of statistical compression (avg, max, etc.), so I need to collect massive amounts of raw sensor data from the three panels over weeks or months to train an edge AI model that can predict soiling. That data volume cannot flow over regular wireless. I am talking about 10 sensor types updating at different rates, with measurements from multiple sample points. LoRa bandwidth would be a bottleneck. WiFi demands power and gets unreliable outdoors. Both add latency and packet loss that corrupt the data stream. Come to think of it, it is like stock trading: you need to identify the right time window to calculate the performance indicators.
My plan is to eventually deploy the measurement station at a remote location on a rooftop or pole - away from the gateway for safety and unobstructed sky access. The CAN bus with shielded twisted pair can run 50+ meters reliably without the noise and dropout issues of wireless. That cable will carry all the raw telemetry at full fidelity back to the gateway for logging and later analysis.
CAN also gives me a standardized, well understood protocol. Building custom serial handling for this much data would be error prone and hard to debug. CAN peripheral support is built into the STM32 hardware. The protocol itself enforces frame integrity with a CRC and acknowledgments. If a frame gets corrupted, I know about it instead of silently logging garbage.
Stress testing my implementation early on a controlled bus lets me validate reliability before field deployment. If the CAN implementation fails during a rainstorm in Kerala, I will lose months of data. Better to find problems in the lab.
Once the model is trained and deployed remotely, the picture changes completely. At that point, the edge device runs the AI inference on all the raw sensor data locally and only sends back the results - a simple alert saying clean or soiled, confidence level, maybe a status update every hour. That traffic is tiny and LoRa is perfect for it. A few kilobytes per day over LoRa is completely practical. So the architecture splits nicely: CAN for the high bandwidth data collection phase, LoRa for the low bandwidth deployment phase.
The Protocol Design
The protocol uses classic CAN - standard 11-bit IDs and payloads of up to 8 bytes at 500 kbps. I avoided CAN FD because older Nucleo boards like the F103 do not support it, and 500 kbps is more than enough for my use case.
I organized the data into three traffic classes by update frequency:
Fast (1 Hz): Panel voltages and currents - these change every second as clouds pass. Four frames, one for each panel and one for UV.
Medium (0.2 Hz): Thermocouple and battery data. These are stable unless the environment shifts significantly.
Slow (0.1 Hz): Air quality, pressure, humidity, temperature, and rain detection. These sensors are sluggish and do not need to be read faster anyway.
The diagnostics channel runs at 1 Hz in both directions - time sync from the gateway to the panel, and a heartbeat from the panel back with sequence numbers and error flags. Of course, these frequencies are quite aggressive, and I hope to reduce them after a day or two of data collection. Aha, that gives me an idea: I should be able to set the frequency over the CAN bus rather than reprogram the board. Why did I not think of that?
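As a rough illustration of what the heartbeat sequence numbers buy, here is a minimal sketch of gap detection on the gateway side. The tracker struct, the function name, and the choice of an 8-bit counter are all assumptions for this example; the real heartbeat layout may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tracker: names and the 8-bit counter width are assumptions. */
typedef struct {
    bool     seen_first;  /* false until the first heartbeat arrives */
    uint8_t  last_seq;    /* sequence number of the previous heartbeat */
    uint32_t missed;      /* running total of heartbeats that never arrived */
} SS_SeqTracker_t;

/* Feed one received sequence number; returns how many were skipped. */
uint8_t ss_seq_update(SS_SeqTracker_t *t, uint8_t seq)
{
    uint8_t gap = 0;
    if (t->seen_first) {
        /* The uint8_t cast wraps modulo 256, so 255 -> 0 counts as a step of 1. */
        uint8_t delta = (uint8_t)(seq - t->last_seq);
        if (delta != 0u) {          /* delta == 0 is a duplicate, not a gap */
            gap = (uint8_t)(delta - 1u);
        }
    }
    t->seen_first = true;
    t->last_seq = seq;
    t->missed += gap;
    return gap;
}
```

Starting from a zeroed tracker, the sequence 254, 255, 1, 4 reports gaps of 0, 0, 1 and 2, so a wraparound is not mistaken for a loss.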
The Data Structures
Looking at the code, each sensor type has its own struct. For example, a panel measurement looks like this:
typedef struct {
    uint16_t v_mv;  // voltage in millivolts
    int32_t  i_ua;  // current in microamps
    int32_t  p_uw;  // power in microwatts (calculated at gateway)
} SS_Panel_t;
All the fixed point scaling is documented in one place so there is no guessing. Voltage is in mV, current in microamps, temperature in degrees times 10 or times 100 depending on the sensor. This keeps everything as integers and avoids sending floating point over the bus.
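To make the "degrees times 10" scaling concrete, here is a small sketch of the conversion on the sensor side. The helper name and the int16_t width are assumptions for illustration; only the times-10 convention comes from the protocol.

```c
#include <stdint.h>

/* Hypothetical helper: convert degrees Celsius to tenths of a degree,
   rounding half away from zero. Assumes the result fits in int16_t. */
int16_t ss_celsius_to_c10(float deg_c)
{
    float scaled = deg_c * 10.0f;
    return (int16_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
}
```

The receiver only has to divide by 10 for display, so the float math stays at the edges and never crosses the bus.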
The frame layout is compact. A panel reading takes 6 bytes: 2 for voltage, 4 for current. The gateway computes power on arrival, so we do not waste payload space transmitting it. Total bus load at worst case is only 0.15 percent of 500 kbps capacity - we could handle backfill replays of thousands of missed frames per second if the SD card goes offline temporarily.
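The bus load figure can be sanity checked with a little arithmetic. This sketch assumes every frame carries the full 8 bytes (an upper bound, since panel frames only use 6) and assumes a split of 4 fast, 2 medium, 2 slow and 2 diagnostic frames. The medium and slow counts are my guess at how the 10 frame types divide up. That gives 40 + 4 + 2 + 20 = 66 frames every 10 seconds.

```c
#include <stdint.h>

/* Bits in a classic CAN data frame with an 11-bit ID, including the
   3-bit interframe space, before any stuff bits are inserted. */
uint32_t can_frame_bits_nominal(uint32_t dlc)
{
    return 47u + 8u * dlc;
}

/* Same frame with the maximum possible number of stuff bits.
   Stuffing applies to the 34 + 8*dlc bits from SOF through the CRC. */
uint32_t can_frame_bits_worst(uint32_t dlc)
{
    return can_frame_bits_nominal(dlc) + (34u + 8u * dlc - 1u) / 4u;
}

/* Bus load in parts per million for a given frame count per 10 seconds. */
uint32_t bus_load_ppm(uint32_t frames_per_10s, uint32_t bits_per_frame,
                      uint32_t bitrate)
{
    uint64_t bits_per_10s = (uint64_t)frames_per_10s * bits_per_frame;
    return (uint32_t)(bits_per_10s * 1000000u / ((uint64_t)bitrate * 10u));
}
```

With these assumptions, bus_load_ppm(66, can_frame_bits_nominal(8), 500000) comes to 1465 ppm, close to the 0.15 percent quoted above, and the fully stuffed case comes to 1782 ppm, or about 0.18 percent. Either way the bus is almost idle.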
The Sender - F103 Nucleo
The Nucleo-F103RB is the first Nucleo board I purchased, some 10 years ago when I started my STM32 journey. It has a CAN peripheral, the older bxCAN type limited to classic 8-byte frames, but that is all I need. The sender firmware is split into two modes.
Loopback mode runs a stress test. All 10 frame types are sent as fast as the CAN peripheral can push them - roughly one frame per millisecond - and the firmware waits for each echo before sending the next. If the byte stream matches exactly, the LED stays on. If anything corrupts, it goes dark. Running this for roughly an hour gives me confidence the pack and unpack functions are symmetric. Truth be told, I stress tested because I wanted to see the limits out of pure curiosity.
Normal mode transmits at the actual needed frequency. A one-second tick fires at 1 Hz and sends all four fast frames. Every five seconds the medium frames go out. Every ten seconds come the slow frames. This matches the sensors, which do not need constant reads anyway.
The pack functions are straightforward - shift and mask each struct field into the eight byte buffer in little-endian order. Here is what the pack function actually does under the hood:
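The normal mode timing boils down to a modulo check on a seconds counter. Here is a minimal sketch of that idea; the enum and function names are hypothetical, and the real firmware may structure its timer handling differently.

```c
#include <stdint.h>

/* Hypothetical bit flags, one per traffic class. */
enum { SS_SEND_FAST = 1, SS_SEND_MEDIUM = 2, SS_SEND_SLOW = 4 };

/* Given the elapsed seconds since start, return which classes are due. */
uint8_t ss_classes_due(uint32_t second)
{
    uint8_t due = SS_SEND_FAST;                      /* every second */
    if (second % 5u == 0u) {
        due = (uint8_t)(due | SS_SEND_MEDIUM);       /* every 5 s */
    }
    if (second % 10u == 0u) {
        due = (uint8_t)(due | SS_SEND_SLOW);         /* every 10 s */
    }
    return due;
}
```

Keeping the periods as plain numbers like this would also make it easy to change them at runtime later, instead of reflashing.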
void can_pack_panel(const SS_Panel_t *p, uint8_t *buf)
{
    // Shift the current as unsigned so negative values pack predictably
    uint32_t i = (uint32_t)p->i_ua;

    buf[0] = (uint8_t)(p->v_mv & 0xFF);        // voltage low byte
    buf[1] = (uint8_t)((p->v_mv >> 8) & 0xFF); // voltage high byte
    buf[2] = (uint8_t)(i & 0xFF);              // current byte 0
    buf[3] = (uint8_t)((i >> 8) & 0xFF);       // current byte 1
    buf[4] = (uint8_t)((i >> 16) & 0xFF);      // current byte 2
    buf[5] = (uint8_t)((i >> 24) & 0xFF);      // current byte 3
}
And here is how it is called:
SS_Panel_t p = { .v_mv = 5500, .i_ua = 175000 };
can_pack_panel(&p, tx_buf);
tx_frame(SS_ID_PANEL1, 6);
This generates 6 bytes on the wire: voltage takes 2 bytes, current takes 4 bytes, all in little-endian byte order. So a voltage of 5500 mV becomes 0x7C 0x15 on the wire (5500 = 0x157C, with the low byte sent first for little-endian).

The Receiver - Arduino Q with STM32U585
As I covered in an earlier part, the Arduino Q was jealous of my relationship with the STM32, so it had Zephyr and arduino-router locking me out of the CAN peripheral. The solution was to take control of the STM32U585 directly using the STM32 HAL. Once I disabled the arduino-router service, I could flash bare metal firmware. Cut out the middle man, I say.
The receiver listens passively on the bus. When a frame arrives, it checks the ID and dispatches to an unpack function. The unpack function mirrors the pack logic - it extracts the bytes in little-endian order and reconstructs the struct:
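void can_unpack_panel(const uint8_t *buf, SS_Panel_t *p)
{
    // voltage from bytes 0-1
    p->v_mv = (uint16_t)(buf[0] | (buf[1] << 8));
    // current from bytes 2-5; widen each byte first so the shift by 24 is safe
    p->i_ua = (int32_t)((uint32_t)buf[2] | ((uint32_t)buf[3] << 8) |
                        ((uint32_t)buf[4] << 16) | ((uint32_t)buf[5] << 24));
    // compute power in 64-bit so mV * uA cannot overflow before the divide
    p->p_uw = (int32_t)(((int64_t)p->v_mv * p->i_ua) / 1000);
}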
When 6 bytes arrive from the CAN bus, the unpack function reassembles them back into the struct. So the bytes 0x7C 0x15 become 5500 mV again, and 0x98 0xAB 0x02 0x00 becomes 175000 microamps. Then it computes power on the fly by multiplying voltage times current and dividing by 1000, which gives 962500 microwatts here.
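The symmetry of pack and unpack can also be checked on a PC before touching the hardware. The sketch below is a host-side check under my own assumptions: the host_ prefixed functions are stand-ins written for this example rather than the firmware sources, and the power calculation is left out to keep the focus on byte order.

```c
#include <stdint.h>
#include <string.h>

typedef struct {
    uint16_t v_mv;  /* voltage in millivolts */
    int32_t  i_ua;  /* current in microamps */
    int32_t  p_uw;  /* power in microwatts, unused in this check */
} SS_Panel_t;

/* Stand-in pack: little-endian, 2 bytes voltage then 4 bytes current. */
void host_pack_panel(const SS_Panel_t *p, uint8_t *buf)
{
    uint32_t i = (uint32_t)p->i_ua;
    buf[0] = (uint8_t)(p->v_mv & 0xFFu);
    buf[1] = (uint8_t)(p->v_mv >> 8);
    buf[2] = (uint8_t)(i & 0xFFu);
    buf[3] = (uint8_t)((i >> 8) & 0xFFu);
    buf[4] = (uint8_t)((i >> 16) & 0xFFu);
    buf[5] = (uint8_t)((i >> 24) & 0xFFu);
}

/* Stand-in unpack: mirror image of host_pack_panel. */
void host_unpack_panel(const uint8_t *buf, SS_Panel_t *p)
{
    uint32_t raw = (uint32_t)buf[2] | ((uint32_t)buf[3] << 8) |
                   ((uint32_t)buf[4] << 16) | ((uint32_t)buf[5] << 24);
    p->v_mv = (uint16_t)(buf[0] | (buf[1] << 8));
    p->i_ua = (int32_t)raw;  /* two's complement on the usual compilers */
    p->p_uw = 0;
}

/* Returns 1 when a reading survives pack -> unpack unchanged. */
int panel_round_trips(uint16_t v_mv, int32_t i_ua)
{
    SS_Panel_t in = { .v_mv = v_mv, .i_ua = i_ua, .p_uw = 0 };
    SS_Panel_t out;
    uint8_t buf[8] = {0};
    host_pack_panel(&in, buf);
    host_unpack_panel(buf, &out);
    return out.v_mv == in.v_mv && out.i_ua == in.i_ua;
}

/* Returns 1 when 5500 mV / 175000 uA produce the expected wire bytes. */
int worked_example_bytes_ok(void)
{
    static const uint8_t expected[6] = { 0x7C, 0x15, 0x98, 0xAB, 0x02, 0x00 };
    SS_Panel_t p = { .v_mv = 5500, .i_ua = 175000, .p_uw = 0 };
    uint8_t buf[8] = {0};
    host_pack_panel(&p, buf);
    return memcmp(buf, expected, sizeof expected) == 0;
}
```

Negative currents are worth including in such a check, since they exercise the top byte and the sign handling that a positive-only test never touches.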
The dispatch loop in the receiver routes each frame type to its unpack function:
switch (id) {
    case SS_ID_PANEL1: {
        SS_Panel_t p;
        can_unpack_panel(rxData, &p);
        printf("[%06lu] RX 0x101 P1_REF V=%umV I=%ldµA P=%ldµW\r\n",
               tick, p.v_mv, p.i_ua, p.p_uw);
        break;
    }
    case SS_ID_PANEL2: {
        SS_Panel_t p;
        can_unpack_panel(rxData, &p);
        printf("[%06lu] RX 0x102 P2_SOIL V=%umV I=%ldµA P=%ldµW\r\n",
               tick, p.v_mv, p.i_ua, p.p_uw);
        break;
    }
    case SS_ID_THERMO: {
        SS_Thermo_t t;
        can_unpack_thermo(rxData, &t);
        printf("[%06lu] RX 0x201 TC surface=%.1fC cold=%.1fC\r\n",
               tick, t.surface_c10 / 10.0f, t.cold_c10 / 10.0f);
        break;
    }
}
Each frame ID has a unique handler. When a panel frame arrives, it unpacks to a panel struct and prints the voltage, current, and computed power. When a thermocouple frame arrives, it unpacks to a thermo struct and prints temperatures. This is where the computation happens - the sender does not transmit power at all.
Transceiver Hardware
The F103 Nucleo uses the Waveshare RS485/CAN Shield, which sits on top and breaks out the CAN TX and RX lines. This shield has a 120 ohm termination resistor built in.
The Arduino Q runs the MAX33041EVAL shield, which also has CAN transceiver circuitry. When I first connected the two boards, nothing worked. Frames were not getting through at all.
I added a common ground wire between the two boards. CAN is a differential bus, but the transceivers still need a shared reference to stay within their common-mode range. That did not fix the communication, but I kept it anyway.
The real issue was the termination resistors. I pulled out the multimeter and checked. The Waveshare board measured 120 ohms. The MAX33041EVAL measured 60 ohms. With both connected, the bus sees about 40 ohms instead of the usual 60 ohms that two 120 ohm terminators give. Google Search AI said "Having different termination values on the same bus creates an impedance mismatch - the signal reflections get mangled and frames corrupt or disappear entirely."
The MAX33041EVAL has jumpers that can change its termination. I adjusted them to disable the 60 ohm termination and enable 120 ohm instead, matching the Waveshare side. Once both sides had 120 ohm, communication became rock solid immediately.
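The numbers make more sense once the two terminations are treated as resistors in parallel across CAN_H and CAN_L. This is only a sketch of that arithmetic, assuming each multimeter reading reflects the termination of that board alone; the function name is made up for this example.

```c
#include <stdint.h>

/* Parallel combination of two resistances in ohms, using integer math. */
uint32_t parallel_ohms(uint32_t r1, uint32_t r2)
{
    if (r1 == 0u || r2 == 0u) {
        return 0u;  /* a short on either side dominates */
    }
    return (r1 * r2) / (r1 + r2);
}
```

parallel_ohms(120, 120) gives the standard 60 ohms for a correctly terminated bus, while parallel_ohms(120, 60) gives 40 ohms, which is what the transceivers were driving before the jumper change.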
Here are the multimeter readings that showed the problem and the fix:
On the Waveshare CAN Shield

On MAX33041EVAL Board with default Jumpers

On MAX33041EVAL Board with changed Jumpers
Jumper Details as per: https://www.analog.com/media/en/technical-documentation/data-sheets/max33041eshld.pdf
The star mark refers to the default jumper position as shipped. So I just had to change the positions of JU7 and JU8.


Putting It Together
To test the whole system:

- Mount the Waveshare RS485/CAN Shield on the F103 Nucleo
- Mount the MAX33041EVAL shield on the Arduino Q
- Configure the MAX33041EVAL jumpers for 120 ohm termination to match the Waveshare side (JU7 and JU8 in my case, but check the datasheet)
- Wire CAN_H and CAN_L from the Waveshare board to the MAX33041EVAL board via shielded twisted pair
- Compile and flash both boards
- Open two serial terminals, one directly and the other via an SSH terminal on the Arduino Q. I am still using the same Python script as in Part 2.
The F103 prints what it is sending. The Arduino Q prints what it is receiving. If both match line by line, the protocol is working. If frames are getting dropped or corrupted, check that the jumpers on the MAX33041EVAL are set correctly.
Here is a video captured using OBS Studio, another app I am experimenting with. It let me capture just the windows I am interested in without showing the others.
What Comes Next
The protocol and firmware are solid. The next step is to integrate the actual sensor readings - right now I am transmitting dummy test data. The STM32L476 on the data collection PCB will read from the INA219B current sensors, the MAX31855 thermocouple interface, the BME280, SEN66, and HX94C humidity sensor, then pack them into CAN frames.
Once that is working, I need to add SD card logging on the collection board as a backup, and implement the time sync heartbeat so the two boards stay in sync. Then the real fun begins - taking all that sensor data and training an edge AI model to predict when the panels need cleaning. The infrastructure is getting solid. The CAN bus is proving to be the right choice - reliable, well understood, and not overkill for the data rates we need.
Final Notes
Building a custom protocol is not something to take lightly. But for a closed system like this where I control both ends, it is worth the effort. The payoff is a bus that handles exactly what I need and nothing more. The stress test gave me the confidence that the implementation works. The next post will be about actually running the sensor board and seeing real data flow across the CAN bus.
There is also a problem where the Arduino Q's STM32 stops accepting CAN frames after a few flash attempts. The problem went away when I restarted the Arduino Q. It is sporadic, and I have not been able to find the root cause yet.