High Throughput SPI traffic part 3a - SPI with DMA

27 May 2016

We know SPI as a 4*-wire protocol. But it doesn't have to be.

I'm checking high speed SPI data transfer with buffers, DMA and parallel data lines.

In this blog, I use DMA to hand over data to SPI and let it do its own thing.

SPI Performance will NOT Increase

In the previous blog, we already achieved maximum SPI performance. We're sending all data via SPI in one shot.

In that example we already achieved the maximum yield possible for our baud rate. The actual traffic speed can't go up by adding DMA.
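To put a number on that ceiling, here is a minimal sketch of the arithmetic. The 128 x 128 pixel frame, the 16 bits per pixel and the 10 MHz clock are assumed example values, not measurements from this project, and `wire_time_us` is a hypothetical helper name.

```c
#include <stdint.h>

/* Time on the wire, in microseconds, for a number of bits at a given SPI
 * clock. Assumes one data line and no gaps between words: the best case
 * that neither buffers nor DMA can improve on. baud_hz must not be 0. */
uint64_t wire_time_us(uint64_t bits, uint64_t baud_hz)
{
    return (bits * 1000000u) / baud_hz;
}

/* Assumed example: 128 x 128 pixels, 16 bits each, 10 MHz SPI clock.
 * 262144 bits at 10 MHz take 26214 us (rounded down). */
uint64_t example_frame_time_us(void)
{
    return wire_time_us(128u * 128u * 16u, 10000000u);
}
```

Whatever moves the data into the SPI module, this figure stays the same for a given baud rate and a single data line.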

Why would we use DMA then?

DMA to Increase the Throughput and Optimise the Data Channel

So the SPI Baud speed doesn't change, does it? Indeed.

The effective 'speed per bit' is at its maximum (for our current physical implementation with one SPI data line).

What changes though, is that we've freed up our microcontroller core during the SPI call.

In the previous Buffered example, the controller ticks were used to move buffer memory to the SPI module.

During that operation, we don't have the exclusive attention of the controller core.

With DMA, the SPI is running all by itself, and gets the data spoon fed by the DMA module.

void mibspiDmaConfig(mibspiBASE_t *mibspi,uint32 channel, uint32 txchannel)
{
  uint32 bufid  = 0;
  uint32 icount = 0;


  /* setting transmit and receive channels */
  mibspi->DMACTRL[channel] |= ((txchannel) << 16);


  /* enabling transmit and receive dma */
  mibspi->DMACTRL[channel] |=  0x8000C000; // todo: disable receive


  /* setting Initial Count of DMA transfers and the buffer utilized for DMA transfer */
  mibspi->DMACTRL[channel] |=  (icount << 8) |(bufid<<24);

}
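To make the shifts in mibspiDmaConfig() easier to follow, here is a host-side sketch that builds the same word in a plain variable. `dmactrl_word` is a hypothetical helper, and the field descriptions in the comments only repeat what the comments in the listing say; I haven't checked them against the reference manual. It assumes the register starts out as 0.

```c
#include <stdint.h>

/* Builds the value that mibspiDmaConfig() ORs into DMACTRL[channel],
 * assuming the register was 0 beforehand. */
uint32_t dmactrl_word(uint32_t txchannel, uint32_t icount, uint32_t bufid)
{
    uint32_t word = 0u;

    word |= txchannel << 16;                /* transmit DMA channel         */
    word |= 0x8000C000u;                    /* enable constant from listing */
    word |= (icount << 8) | (bufid << 24);  /* initial count and buffer id  */
    return word;
}
```

With `txchannel`, `icount` and `bufid` all 0, the result is just the enable constant, 0x8000C000. Because the original function uses `|=`, any bits that are already set in the register stay set.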

void dmaConfigCtrlPacket(uint32 sadd,uint32 dadd,uint32 dsize)
{
   g_dmaCTRLPKT.SADD      = sadd;              /* source address             */
   g_dmaCTRLPKT.DADD      = dadd;              /* destination address        */
   g_dmaCTRLPKT.CHCTRL    = 0;                 /* channel control            */
   g_dmaCTRLPKT.FRCNT     = 1;                 /* frame count                */
   g_dmaCTRLPKT.ELCNT     = dsize;             /* element count              */
   g_dmaCTRLPKT.ELDOFFSET = 4;                 /* element destination offset */
   g_dmaCTRLPKT.ELSOFFSET = 0;                 /* element source offset      */
   g_dmaCTRLPKT.FRDOFFSET = 0;                 /* frame destination offset   */
   g_dmaCTRLPKT.FRSOFFSET = 0;                 /* frame source offset        */
   g_dmaCTRLPKT.PORTASGN  = 4;                 /* port b                     */
   g_dmaCTRLPKT.RDSIZE    = 0;                 /* read size                  */
   g_dmaCTRLPKT.WRSIZE    = ACCESS_16_BIT;     /* write size                 */
   g_dmaCTRLPKT.TTYPE     = FRAME_TRANSFER;    /* transfer type              */
   g_dmaCTRLPKT.ADDMODERD = ADDR_INC1;         /* address mode read          */
   g_dmaCTRLPKT.ADDMODEWR = ADDR_OFFSET;       /* address mode write         */
   g_dmaCTRLPKT.AUTOINIT  = AUTOINIT_ON;       /* autoinit                   */
}

void _writeData64DMA() {
    gioSetBit(_portDataCommand, _pinDataCommand, 1);
    mibspiTransfer(mibspiREG3, 2 );
    while(!(mibspiIsTransferComplete(mibspiREG3, 2))) {
    }
}

If we provide more than one buffer, we can set up a round robin system.

We can fill pixels in one buffer while the SPI is outputting the previous one.

And that allows us to gain time. We don't have to spend time to prepare all data up front to have the buffered speed.

We can prepare just enough, and hand over to SPI and DMA.

Meanwhile, we prepare the next chunk.
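The round robin idea can be sketched with two buffers on a host PC. This is not the code from the attached project: `dma_start`, `dma_wait`, `send_stream` and the 64-word chunk size are hypothetical, and the DMA is simulated by a function that only reads the buffer when we wait for completion. On the Hercules, the wait would be a poll like the `mibspiIsTransferComplete()` loop in the listing above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CHUNK_WORDS 64u    /* assumed chunk size, in 16-bit words */
#define WIRE_WORDS  1024u  /* capacity of the simulated output log */

uint16_t g_buf[2][CHUNK_WORDS];   /* the two round robin buffers */
uint16_t g_wire[WIRE_WORDS];      /* what the simulated SPI has sent */
size_t   g_wire_len = 0;

/* Simulated transfer in flight (stand-in for DMA + SPI). */
const uint16_t *g_pending = NULL;
size_t          g_pending_len = 0;

/* Start a transfer: only records the request, the data stays in the buffer. */
void dma_start(const uint16_t *src, size_t n)
{
    assert(g_pending == NULL);    /* one transfer at a time */
    g_pending = src;
    g_pending_len = n;
}

/* Wait for completion: the simulation reads the buffer only now, so a buffer
 * overwritten while in flight would show up as corrupt output. */
void dma_wait(void)
{
    if (g_pending != NULL) {
        assert(g_wire_len + g_pending_len <= WIRE_WORDS);
        memcpy(&g_wire[g_wire_len], g_pending,
               g_pending_len * sizeof(uint16_t));
        g_wire_len += g_pending_len;
        g_pending = NULL;
    }
}

/* Send a pixel stream in chunks, preparing chunk N+1 while chunk N is out. */
void send_stream(const uint16_t *pixels, size_t count)
{
    unsigned fill = 0u;           /* buffer the CPU may write to */
    size_t sent = 0;

    while (sent < count) {
        size_t n = count - sent;
        if (n > CHUNK_WORDS) {
            n = CHUNK_WORDS;
        }
        /* Prepare the next chunk while the previous one is still in flight. */
        memcpy(g_buf[fill], &pixels[sent], n * sizeof(uint16_t));
        dma_wait();               /* previous chunk must be finished */
        dma_start(g_buf[fill], n);
        sent += n;
        fill ^= 1u;               /* swap the roles of the two buffers */
    }
    dma_wait();                   /* flush the last chunk */
}
```

The point of the ordering is that the CPU fills one buffer before it waits for the other, so the preparation time overlaps with the time the previous chunk spends on the wire.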

The Example Program

As you can see in the photo at the beginning of the blog, and in the logic analyser (LA) output, it has a bug.

The image is corrupted. I don't send the full data.

That doesn't matter for this blog, because it shows the mechanism. But it's not something to be proud of.

Hercules grey beards, please chime in if you know my error.

Part 3b shows the working version.

The Series
0 - Buffers and Parallel Data Lines
1a - Buffers and DMA
1b - SPI without Buffers
2 - SPI with Buffers
3a - SPI with DMA
3b - SPI with DMA works
4a - SPI Master with DMA and Parallel Data Lines
Hercules microcontrollers, DMA and Memory Cache

Attachments:

RM46_BOOSTXL-EDUMKII-LCD_BITMAP 20160527.zip
3058.lcd_hx8353e.zip
RM46_BOOSTXL-EDUMKII-LCD_BITMAP_20181009.zip

Top Comments

Jan Cumps over 7 years ago

I've attached the latest version, with SPI DMA fully working. It uses DMA to draw a bitmap half a line at a time, instead of pixel per pixel.
I've also attached the CCS project for the LCD Driver. It's available on GitHub but the attached project is easier to start from.

Jan Cumps over 7 years ago

Two years after the fact, I finally have the DMA functionality working. Half a line is now sent to the SPI module without CPU involvement.
The SPI module of the Hercules fetches the image data itself (direct memory access) and transmits it to the LCD.

I still need to do some fine-tuning, but glad it works.
fmilburn over 7 years ago

Thanks for outlining this and showing some code. I have used DMA a few times but should look into using it more.