element14 Community
Transportation & Automotive
  • Author: Jan Cumps
  • Date Created: 2 Jan 2016 9:17 PM
  • hercules_launchpad
  • texas_instrments
  • cache
  • arm_cortex
  • sci
  • automotive
  • memory_map
  • dma

Hercules microcontrollers, DMA and Memory Cache

Jan Cumps
2 Jan 2016

DMA and memory cache don't always play nicely together.

I had an issue when trying to use serial communication and DMA on a TI Hercules controller. The DMA data wasn't appearing in my read buffers.

TI's application specialists helped me to resolve my issues. It was related to ARM memory cache settings.

I've written a step-by-step guide on the hackster.io forum on that subject.

 

[Video: quick overview]

 

Due to the subject, this hasn't turned into an easy-to-read novel. I hope it may help fellow developers struggling with similar issues.

 

If you're working with DMA, everything seems to be set up correctly, and yet no data shows up in your variables, this post may be worth a look.
It's based on an SCI example, but applicable in many DMA situations.

 

The default cache configuration is write-back:

image

For DMA-relevant buffers, you can set aside a chunk of RAM with a write-through configuration.

image

There's more to do than just setting aside a portion of the memory. Check the video for a quick overview, and the hackster.io blog post for a step-by-step explanation with a working project attached.
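
To make the difference between the two policies concrete, here is a minimal sketch in plain C that runs on a PC. It is a toy model, not Hercules code: `region_t`, `cpu_write`, `cache_clean`, `dma_read` and `dma_sees_cpu_data` are made-up names, and the "cache" is just a second array. It only models the transmit direction, where the CPU fills a buffer and a DMA engine then reads RAM.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BUF_LEN 8u

/* Toy model of one buffer: its contents in RAM and the CPU's cached copy. */
typedef struct {
    uint8_t ram[BUF_LEN];    /* what a DMA engine reads */
    uint8_t cache[BUF_LEN];  /* what the CPU reads and writes */
    bool    write_through;   /* cache policy assumed for this region */
} region_t;

void region_init(region_t *r, bool write_through) {
    memset(r, 0, sizeof *r);
    r->write_through = write_through;
}

/* CPU store: always lands in the cache; reaches RAM at once only
 * when the region is write-through. */
void cpu_write(region_t *r, size_t i, uint8_t v) {
    assert(i < BUF_LEN);
    r->cache[i] = v;
    if (r->write_through) {
        r->ram[i] = v;
    }
}

/* Explicit cache clean: push the cached copy out to RAM. */
void cache_clean(region_t *r) {
    memcpy(r->ram, r->cache, BUF_LEN);
}

/* DMA read: bypasses the cache and sees RAM only. */
uint8_t dma_read(const region_t *r, size_t i) {
    assert(i < BUF_LEN);
    return r->ram[i];
}

/* Returns true when the DMA side reads exactly what the CPU wrote. */
bool dma_sees_cpu_data(bool write_through, bool clean_first) {
    region_t r;
    region_init(&r, write_through);
    for (size_t i = 0; i < BUF_LEN; i++) {
        cpu_write(&r, i, (uint8_t)(0xA0u + i));
    }
    if (clean_first) {
        cache_clean(&r);
    }
    for (size_t i = 0; i < BUF_LEN; i++) {
        if (dma_read(&r, i) != (uint8_t)(0xA0u + i)) {
            return false;
        }
    }
    return true;
}
```

In this model a write-back region only hands the right data to the DMA side after an explicit clean, while a write-through region does so straight away. The receive direction, where DMA writes RAM behind the CPU's cached copy, needs its own handling and is not modelled here.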

 

The Series
0 - Buffers and Parallel Data Lines
1a - Buffers and DMA
1b - SPI without Buffers
2 - SPI with Buffers
3a - SPI with DMA
3b - SPI with DMA works
4a - SPI Master with DMA and Parallel Data Lines

Top Comments

  • clem57
    clem57 over 9 years ago +2
    Thank you for describing cache coherence with DMA. This is a common occurrence, since the microcontroller owns the cache on the chip, while the external DMA circuits can only see RAM. Clem
  • clem57
    clem57 over 9 years ago in reply to Jan Cumps +2
    i found him! Clem
  • DAB
    DAB over 9 years ago

    Great post Jan.

     

    DMA can be very useful, but I want to make sure that everyone understands that DMA works by stealing CPU cycles.

    Each DMA action will slow down your normal algorithm processing, usually by a small amount, but if you use a lot of DMA, it can add up to a significant effect on your timing budgets.

     

    DAB

  • Jan Cumps
    Jan Cumps over 9 years ago in reply to DAB

    It seems that in the Hercules family, the memory DMA transfer doesn't use any CPU cycles:

     

    Application Report: SCI With DMA

    Page 5:
    Without the DMA the CPU would have to move data to the SCI after every buffer sized bytes. With the DMA the CPU is free for the entire message transmission.

    I couldn't find a firm source outside of that document.

  • Jan Cumps
    Jan Cumps over 9 years ago in reply to Jan Cumps

    By the way, I've used a little-endian processor. It seems to swap the most significant and least significant halves of a word. I'll retry the same program with a big-endian TMS570...
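
The half-word swap is easier to see with a concrete word. The sketch below is plain C for a PC, not Hercules code; `store_word`, `load_half` and `first_half_seen` are hypothetical helpers. It lays out a 32-bit word in a byte array in either byte order and then reads the 16-bit half at the lowest address, which is what a 16-bit wide transfer starting at the buffer's base address would pick up first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Lay out a 32-bit word in memory the way a little- or big-endian CPU would. */
void store_word(uint8_t mem[4], uint32_t w, bool little_endian) {
    for (int i = 0; i < 4; i++) {
        int shift = little_endian ? 8 * i : 8 * (3 - i);
        mem[i] = (uint8_t)(w >> shift);
    }
}

/* Read the 16-bit half-word at byte offset 0 or 2, using the same byte order. */
uint16_t load_half(const uint8_t mem[4], int offset, bool little_endian) {
    assert(offset == 0 || offset == 2);
    uint16_t first = mem[offset];       /* byte at the lower address */
    uint16_t second = mem[offset + 1];  /* byte at the higher address */
    return little_endian ? (uint16_t)(first | (second << 8))
                         : (uint16_t)((first << 8) | second);
}

/* Which half of w sits at the lowest address for a given byte order? */
uint16_t first_half_seen(uint32_t w, bool little_endian) {
    uint8_t mem[4];
    store_word(mem, w, little_endian);
    return load_half(mem, 0, little_endian);
}
```

For 0x12345678, the little-endian layout yields 0x5678 at the lowest address and the big-endian layout yields 0x1234, so the two halves appear to trade places when the same program moves between the two kinds of device.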

  • Jan Cumps
    Jan Cumps over 6 years ago in reply to DAB

    DAB  wrote:

     

    It would take an interesting hardware design to make it happen without affecting overall CPU performance.

    The easiest way would be to copy the data into a separate memory buffer, but the copying would still use CPU cycles.

     

    If you find out if they indeed can do DMA without CPU cycle stealing, let me know.  They should have a very interesting architecture solution to make it happen.

     

    DAB

    The DMA gives the peripheral controller direct access to a chunk of memory. I tell the DMA logic and the peripheral controller where that memory is. The peripheral can then work on that memory while the main CPU does other jobs.

  • DAB
    DAB over 6 years ago in reply to Jan Cumps

    It still sounds like it steals CPU cycles unless there is a full isolation with separate address and data lines into the memory.

    I have seen designs with that approach so that the I/O processor can access the memory exclusively while the CPU is locked out.

    They use a dual buffer approach so that each processor has exclusive access while the other processor works on the other.

     

    DAB
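
The dual buffer (ping-pong) idea described above can be sketched as follows. This is a single-threaded PC simulation with hypothetical names (`pingpong_t`, `dma_side`, `cpu_side`, `pingpong_swap`, `demo_pingpong`); a plain loop stands in for the DMA engine, and on real hardware the swap would typically happen when a block-complete event fires.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HALF_LEN 4u

/* Two equally sized halves; at any moment one belongs to the DMA side
 * and the other to the CPU side. */
typedef struct {
    uint16_t buf[2][HALF_LEN];
    int      dma_owner;  /* index (0 or 1) of the half the DMA side may touch */
} pingpong_t;

uint16_t *dma_side(pingpong_t *p) { return p->buf[p->dma_owner]; }
uint16_t *cpu_side(pingpong_t *p) { return p->buf[1 - p->dma_owner]; }

/* Hand the freshly filled half to the CPU and give the other one to the DMA. */
void pingpong_swap(pingpong_t *p) {
    assert(p->dma_owner == 0 || p->dma_owner == 1);
    p->dma_owner = 1 - p->dma_owner;
}

/* Simulate three blocks: the "DMA" fills its half with samples 1, 2, 3, ...
 * and after each swap the CPU sums the half it now owns. */
uint32_t demo_pingpong(void) {
    pingpong_t p = { .dma_owner = 0 };
    uint32_t total = 0;
    uint16_t sample = 1;
    for (int block = 0; block < 3; block++) {
        uint16_t *d = dma_side(&p);
        for (size_t i = 0; i < HALF_LEN; i++) {
            d[i] = sample++;          /* stands in for the DMA transfer */
        }
        pingpong_swap(&p);
        const uint16_t *c = cpu_side(&p);
        for (size_t i = 0; i < HALF_LEN; i++) {
            total += c[i];            /* CPU works on the completed half */
        }
    }
    return total;
}
```

Here the samples 1 to 12 pass through the two halves in three blocks, so `demo_pingpong()` returns 78. The sketch says nothing about whether the two sides compete for the memory bus; it only shows how ownership of the halves alternates.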

  • Jan Cumps
    Jan Cumps over 6 years ago in reply to DAB


    From http://www.ti.com/lit/an/spna213/spna213.pdf :

     

    ABSTRACT
    This application report summarizes the necessary steps to setup the direct memory access controller (DMA) to transfer data between the SCI and the data RAM of the microcontroller, freeing the CPU during the entire message transmission.

  • DAB
    DAB over 6 years ago in reply to Jan Cumps

    Hi Jan,

     

    The key words are "freeing the CPU during message transmission".

    Yes, but the DMA controller must still stop the CPU when it transfers the 32-bit word into memory.

    The DMA controller takes care of each byte transfer and builds the 32-bit word without interfering with the CPU during each byte.

    So the best you can say about this method is that it does 32-bit transfers where traditional DMA would steal a cycle every byte.

    From that perspective, this method needs 75% fewer memory accesses than traditional byte-by-byte DMA.

     

    It is still cycle stealing.

     

    DAB
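
The 75% figure follows from simple counting, sketched here in C. `bus_accesses` and `percent_saved` are illustrative helpers, and the model assumes one memory access per transferred unit, which is a simplification rather than a statement about the Hercules bus.

```c
#include <assert.h>
#include <stdint.h>

/* Memory accesses needed to move n_bytes when each access moves access_bytes. */
uint32_t bus_accesses(uint32_t n_bytes, uint32_t access_bytes) {
    assert(access_bytes > 0);
    return (n_bytes + access_bytes - 1u) / access_bytes;  /* round up */
}

/* Percentage of accesses saved by widening each access from narrow to wide. */
uint32_t percent_saved(uint32_t n_bytes, uint32_t narrow, uint32_t wide) {
    uint32_t before = bus_accesses(n_bytes, narrow);
    uint32_t after = bus_accesses(n_bytes, wide);
    assert(before > 0 && after <= before);
    return 100u * (before - after) / before;
}
```

For a 128-byte message, byte-wide transfers need 128 accesses and 32-bit transfers need 32, so `percent_saved(128, 1, 4)` gives 75.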

  • Jan Cumps
    Jan Cumps over 6 years ago in reply to DAB

    I've rechecked, DAB.

    In my example, I send 64 chunks of 16 bits:

    image

     

    I have an example and a few Hercules controllers that support DMA and buffered SPI. I could set up an example and do some measurements ...

    That could be a winter project ...

  • DAB
    DAB over 6 years ago in reply to Jan Cumps

    Hi Jan,

     

    While I can see how the controller can buffer the data, I still think at some point it needs to steal CPU cycles to move the data into memory.

    I agree that it is probably less intrusive than the traditional DMA on each byte, but there is still going to be a defined cycle cost in the end.

     

    No rush on testing. As I have said before, I used to analyze computer architectures and have found many instances where DMA devices had more effect on the CPU than advertised.

     

    DAB

  • Jan Cumps
    Jan Cumps over 6 years ago in reply to DAB


    My curiosity is triggered. It's also a chance to get better at working with DMA on these controllers - I'm particularly bad at it.

     

    I've been able to port a TI example to the controller that I have, but it does not work. Not a single bit comes out of the SPI controller.

    I've created a ticket over at TI: https://e2e.ti.com/support/microcontrollers/hercules/f/312/t/736834

  • Jan Cumps
    Jan Cumps over 6 years ago in reply to Jan Cumps


    The ticket at the TI E2E community is solved and I have that speedy mechanism working now:

     

    • reads 64 16-bit values from RAM in chunks of 64 bits in DMA (4 values in a single read),
    • then writes these in chunks of 16 bits to SPI buffer (maximum write size. DMA controller does the folding from 64 to 16)
    • 4 parallel SPI output lines to send 4 bits per clock tick - this is called parallel pin mode (or Quad mode in some documents for this 4-pin setup). The receiver has to support this. The Hercules supports up to 4 parallel SPI pins in and out.
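
The numbers in the list above can be checked with a little arithmetic. The helper names below are hypothetical and the sketch only counts transfers of payload bits; it does not touch any DMA or SPI registers.

```c
#include <assert.h>
#include <stdint.h>

/* Number of chunk_bits-sized pieces needed to cover total_bits (rounded up). */
uint32_t chunks(uint32_t total_bits, uint32_t chunk_bits) {
    assert(chunk_bits > 0);
    return (total_bits + chunk_bits - 1u) / chunk_bits;
}

/* DMA reads from RAM when each read fetches read_bits. */
uint32_t ram_reads(uint32_t values, uint32_t value_bits, uint32_t read_bits) {
    return chunks(values * value_bits, read_bits);
}

/* DMA writes to the SPI buffer when each write delivers write_bits. */
uint32_t spi_buffer_writes(uint32_t values, uint32_t value_bits, uint32_t write_bits) {
    return chunks(values * value_bits, write_bits);
}

/* SPI clock ticks for the payload when `lines` data lines each carry
 * one bit per tick. */
uint32_t spi_clock_ticks(uint32_t values, uint32_t value_bits, uint32_t lines) {
    return chunks(values * value_bits, lines);
}
```

With 64 values of 16 bits this gives 16 reads from RAM, 64 writes to the SPI buffer and 256 clock ticks on 4 lines, against 1024 ticks on a single line.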

     

    Here's a capture. Every CS block clocks out 64 16-bit values plus checksum/parity info.

    Only SIMO 0 and 2 are captured by the logic analyser in the image below. I did not connect 1 and 3.

    image

     

    image

  • DAB
    DAB over 6 years ago in reply to Jan Cumps

    Ok, so you get four 16-bit values in one DMA pull from memory, which is more efficient than pulling them one at a time.

     

    It will make the SPI transfer mostly transparent, unless you are moving a lot of data over your SPI busses.

     

    DAB
