I have built and shipped close to 50 of these boards (a DMX-based animatronics controller) for one of my clients. I usually ship the boards to him in lots of ten; he uses them as needed and sends payment as they are used. From time to time a board fails, and he holds on to the failures and returns them once he has collected a few. 90% of the time, the returned boards are fine and the problem is a bad configuration (erroneous settings, bad address, etc.). I can usually 'fix' these boards by forcing the configuration back to factory settings and then reloading his settings file. Recently I received three failed boards, which turned out to be actual failures. One of the boards had a blown MOSFET that acts as a programmable fuse for the servo power rails (programmable current level and trip delay). This failure has occurred before, caused by a dead short at the servo connector: the trip delay was set too long, and the MOSFET overheated before the delay expired and the MOSFET was turned off.
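For readers unfamiliar with the idea, here is a minimal sketch of how a programmable fuse with a current level and a trip delay could be handled in firmware. It is not the code from this board: the `efuse_t` struct, the `efuse_tick` function, the 1 ms tick and the milliamp units are all assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fuse state; names and units are illustrative only. */
typedef struct {
    uint16_t trip_level_ma;  /* programmable current threshold */
    uint16_t trip_delay_ms;  /* how long an overcurrent is tolerated */
    uint16_t over_ms;        /* consecutive time spent above the threshold */
    bool     tripped;        /* latched once the fuse has opened */
} efuse_t;

/* Call once per millisecond (assumed tick) with the latest current sample.
   Returns true while the MOSFET should stay on, false once it must be off. */
bool efuse_tick(efuse_t *f, uint16_t current_ma)
{
    if (f->tripped) {
        return false;                 /* stays off until explicitly reset */
    }
    if (current_ma > f->trip_level_ma) {
        f->over_ms++;
        if (f->over_ms >= f->trip_delay_ms) {
            f->tripped = true;        /* delay expired: open the fuse */
            return false;
        }
    } else {
        f->over_ms = 0;               /* overcurrent cleared: restart the delay */
    }
    return true;
}
```

The sketch also shows where the failure mode comes from: during a dead short the MOSFET carries the full fault current for the whole of `trip_delay_ms`, so a delay that is too long lets it overheat before the fuse ever opens.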
It was the other two failures (both boards seemed to have identical issues) that puzzled me greatly. In this case, the device was mostly functional: the DMX receiver was able to receive commands (serial port on the processor/RS485 transceiver), but the servos (12 mux'ed drivers) and the LEDs (RGB) were not responding to commands correctly. After resetting the configuration settings and reflashing the firmware, the problems were still there. I started probing around and found multiple strange occurrences. First, the timer/PWM signals were non-responsive (a fixed signal on one channel and none on the other). The select lines that I use to multiplex the PWM signals into the individual servo channels were not sequencing. Also, the GPIO pin that I use to strobe the LEDs was not being driven correctly. All of this pointed to a problem with the microprocessor (ATmega328PB). It is not often that I come across a microprocessor that mostly works, so my first thought was to double-check the soldering on the processor pins (TQFP-32). Everything looked fine.
After thinking and delaying (I was not really happy about having to desolder chips), I started to weigh the cost and labor of a repair against just scrapping the boards. I decided that I would attempt to rescue the boards, and I started the removal process on the first one. What a total pain! Working on the processor on a fully populated board (I usually solder all of the ICs onto the board first, passives next, and then connectors and such last) was going to be a lot more difficult than the original assembly. First I needed to remove the 2x3 header used to program the device. My default approach to multi-pin IC removal is to clip all of the pins first, then remove the chip and clean up all of the pads (removing any of the clipped leads). With limited access, it was almost impossible to clip the leads. Instead, I used solder wick to remove as much solder as possible and then 'popped' each pin off of its pad by using an X-Acto knife as a pry tool while heating the solder joint.
I was able to remove the processor, damaging only two of the pads (luckily, they were not used; see the upper left of the second photo). Note: these images were shot through the eyepiece of the microscope that I use for soldering.
I then reloaded the firmware on the new processor and re-tested the assembly. The device worked as expected, with full functionality restored. When reviewing my notes/contacts with this client, I found information that might explain the failures. I guess the client was attempting to control a much larger linear actuator that required 12 volts (an automotive wiper motor), and as I remember there were multiple failures during his process (he had an external driver between the DMX controller's servo output and the external H-bridge that drives the actuator). I can only imagine that there were issues sequencing the power (5V on my board and 12V on everything else) that might have taken out part (but not all) of the functionality of my board/processor.
Just thought that I would share this debug/repair story.