Author: Jan Cumps
Date Created: 25 Aug 2018 1:49 PM
Tags: clusteredmcuch, texas_instruments, project14, functional_safety, hercules, ti_rt

Project14 | Clustered MCUs:  Functional Safety with Lockstep CPUs

The Project14 theme for September '18 is Clustered MCUs. The theme page lists a number of multi-CPU configurations. What's not mentioned is lockstep CPUs, a design where multiple physical CPUs are used to check the integrity of the processing.

Lockstep is a technique to validate the process integrity in hardware, with minimal performance and energy consumption impact. It will detect inconsistencies generated by external and internal events. There's no practical software solution that can offer the same level of verification.

I'm using a TI Hercules Safety Microcontroller to show how this works.

Lockstep

In a lockstep configuration, two identical cores are placed on the same silicon.

One is the main processor. That's the one we use the results from. The other one is used to verify the main one's correctness.

Both of them execute the same instructions, but not at the same time. The second processor (the checker) executes everything a few clock cycles later.

Upon each clock tick, the result of the previous activity is compared between processors. If no external events have made the controller glitch, the results have to be the same.

If not, an alert is raised.
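
The main/checker arrangement described above can be modelled on a PC. This is only a toy sketch, not Hercules code: the `step` function stands in for "one instruction", `run_lockstep` and its parameters are made-up names, and the delay is simplified to a whole number of cycles. It shows the checker replaying the same inputs a few cycles later, with a compare on every cycle.

```python
from collections import deque


def step(state, value):
    """One deterministic 'instruction': fold an input into the core state."""
    return (state * 31 + value) & 0xFFFFFFFF


def run_lockstep(inputs, delay=2, fault_cycle=None):
    """Return the first cycle at which the compare logic sees a mismatch, or None."""
    main_state = checker_state = 0
    pending_inputs = deque()   # inputs waiting for the delayed checker core
    pending_outputs = deque()  # main core results waiting to be compared
    for cycle in range(len(inputs) + delay):
        if cycle < len(inputs):
            main_state = step(main_state, inputs[cycle])
            if cycle == fault_cycle:
                main_state ^= 1 << 7  # simulated single-bit upset in the main core
            pending_inputs.append(inputs[cycle])
            pending_outputs.append(main_state)
        if cycle >= delay:
            # The checker executes the same instruction 'delay' cycles later.
            checker_state = step(checker_state, pending_inputs.popleft())
            if checker_state != pending_outputs.popleft():
                return cycle  # this is where an alert would be raised
    return None


print(run_lockstep([1, 2, 3, 4, 5, 6]))                 # no fault injected
print(run_lockstep([1, 2, 3, 4, 5, 6], fault_cycle=3))  # fault in the main core
```

In this model, a fault injected into the main core at cycle 3 is flagged at cycle 5, when the delayed checker reaches the same instruction.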

 

Because external events (a power spike, magnetic or electric field interference, what have you) can impact both cores, there are some physical measures to keep a single event from corrupting both cores in the same way (avoiding common mode errors).

  • Running 1.5 to 2 clock cycles apart ensures that the same event doesn't hit both cores while they are executing exactly the same instruction.
  • Flipping one upside-down in relation to the other ensures that the same impulse doesn't hit the same spot if it travels from bottom to top or vice versa.
  • Rotating one 90° in relation to the other ensures that an external event doesn't go through the cores in the same direction when it's an impact from the side.
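
To see why the time offset matters, here is a small simulation sketch. It rests on simplifying assumptions: a glitch flips the same bit in both cores during the same clock cycle, the offset is a whole number of cycles (the real parts use 1.5 or 2), and `detects_common_glitch` is a hypothetical helper name. The physical flip and rotate measures are not modelled.

```python
def step(state, value):
    """One deterministic 'instruction': fold an input into the core state."""
    return (state * 31 + value) & 0xFFFFFFFF


def detects_common_glitch(inputs, delay, glitch_cycle, bit=7):
    """True if the comparator notices a glitch that hits both cores in one cycle."""
    main = checker = 0
    main_trace = []  # main core result for each instruction index
    for cycle in range(len(inputs) + delay):
        # The glitch corrupts whatever each core computes in this clock cycle.
        glitch = (1 << bit) if cycle == glitch_cycle else 0
        if cycle < len(inputs):
            main = step(main, inputs[cycle]) ^ glitch
            main_trace.append(main)
        idx = cycle - delay  # instruction the checker executes this cycle
        if 0 <= idx < len(inputs):
            checker = step(checker, inputs[idx]) ^ glitch
            if checker != main_trace[idx]:
                return True
    return False


program = [1, 2, 3, 4, 5, 6]
print(detects_common_glitch(program, delay=0, glitch_cycle=3))  # False
print(detects_common_glitch(program, delay=2, glitch_cycle=3))  # True
```

With no offset, both cores are corrupted identically and the comparison still passes. With a two-cycle offset, the glitch lands on different instructions in each core, so the results diverge and the error is caught.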

 

image

source: Texas Instruments paper: Introduction to Hercules™ ARM® Cortex™-R4F MCUs

From the whitepaper Hercules™ Microcontrollers: Real-time MCUs for safety-critical products:

 

The lockstep CPU scheme implements a checker CPU, which is hardwired to be fed the same input as the functional CPU.
Two blocks of the same logic, fed the same input, should in theory produce the same output.
A core compare module monitors the outputs of the two CPU cores on a cycle-by-cycle basis and signals any errors to the system.
This near instant fault detection in the Hercules MCU’s comes with little penalty in power consumption and no impact to CPU performance.

Also, in comparison to other elements on the MCU, the size overhead of the lockstep mechanism is minimal.

 

Whenever logic is duplicated, there is always a concern of common mode failure.
To combat common mode failure, TI has implemented multiple best practices on the lockstep CPU subsystem.
Temporal diversity of the two CPU cores is implemented, such that the CPU cores operate 1.5 or 2 cycles out of phase in order to mitigate common mode failure in clocking.
A voltage guard ring is implemented around the CPU cores.
Physical design diversity is implemented by flipping and rotating the checker CPU with respect to the functional CPU.

 

Lockstep Test Project Source and Details

The project is described in detail, with full source attached, in the element14 blog post Hercules Safety MCU Demo with Educational BoosterPack.

It exercises the lockstep cores in two ways.

  • It performs a full self-test at the start of the program.
  • It deliberately injects a core compare error to validate the detection and alerting system.

 

Self-test example

The self-test (LBIST in Hercules lingo: CPU Logic Built-In Self-Test) runs a set of tests on both CPUs at device startup.

These tests are comparable to the tests that are run on each Hercules MCU at production time.

image

source: Texas Instruments paper: Introduction to Hercules™ ARM® Cortex™-R4F MCUs

Error Injection example

 

This test deliberately forces a core compare error situation (testing error scenarios is an important part of safety validation).

It's not easy to create a core error in one of the CPUs. Lucky for us, the manufacturer has provided on-silicon functionality to do that.

When you set one of the MCU's registers to a particular value, the core compare error is triggered. The detection trap should activate and we can test our handling logic.
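
The idea of "write a special value to a register, and the error path fires" can be mocked up on a PC to show the shape of such a test. Everything in this sketch is hypothetical: `MockCoreCompare`, `write_key` and the key values are placeholders for illustration, not the Hercules register map or the code from the referenced project. The real register names and values are in the device's technical reference manual.

```python
class MockCoreCompare:
    """Toy stand-in for a core compare module (not the real hardware interface)."""

    MODE_LOCKSTEP = 0x0     # placeholder value: normal compare operation
    MODE_FORCE_ERROR = 0x1  # placeholder value: deliberately force a compare error

    def __init__(self, on_error):
        self.key = self.MODE_LOCKSTEP
        self.error_flag = False
        self._on_error = on_error

    def write_key(self, value):
        """Model a register write; the 'force error' key trips the error path."""
        self.key = value
        if value == self.MODE_FORCE_ERROR:
            self._raise_error()

    def compare(self, main_out, checker_out):
        """Model the per-cycle comparison of the two core outputs."""
        if main_out != checker_out:
            self._raise_error()

    def _raise_error(self):
        self.error_flag = True
        self._on_error()  # stands in for the trap / error handler


events = []
ccm = MockCoreCompare(on_error=lambda: events.append("compare error trapped"))
ccm.compare(0x1234, 0x1234)                      # healthy cores: no error
ccm.write_key(MockCoreCompare.MODE_FORCE_ERROR)  # deliberate error injection
print(events, ccm.error_flag)
```

The point of the test is the last two lines: the handler must run when the error is forced, and must not run while the cores agree.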

 

As written above, the source code for this project is available in the blog post Hercules Safety MCU Demo with Educational BoosterPack.

The full Code Composer Studio project can also be downloaded from there.

 

Action Video

The video shows the functional safety features in action. The lockstep CPU integrity check is shown at the beginning of the demo.

 

This is not typical Project14 content, but I hope this little explanation and demo explain the concept of this niche Clustered MCU architecture.

Top Comments

  • balearicdynamics
    balearicdynamics over 5 years ago

    Jan, a very good project and explanation. As you say, I am just one of those who didn't know anything about this technique. Until now.

    Enrico
  • 14rhb
    14rhb over 5 years ago in reply to Jan Cumps

    Hi Jan,

    That is really useful; software faults can still be risk-mitigated using internal/external watchdog timers (albeit with a knock on effect to the actual functionality being undertaken at that time, but at least it cannot hang permanently).

    The Lockstep technique must double the processor power consumption, although I guess it is still less overall as the peripheral/external port power consumption will remain the same.

    Another great technique to know about, thank you.

    Rod
  • Jan Cumps
    Jan Cumps over 5 years ago in reply to 14rhb

    14rhb wrote:

    ... Am I correct in understanding this detects hardware faults or issues where the hardware glitches rather than poorly designed software (which I assume would run on both cores and so they would both fail simultaneously, or at least a few clock cycles apart)?

    Rod

    Yes, correct. Anything that can generate a non-code related issue in a controller core.

    From the manufacturer: "Hard, transient, and AC fault types can be detected".

    This doesn't detect firmware issues.

    The MCU has more hardware checks (memory, peripherals, ...), several of them also with redundant hardware to enable on-silicon traps. The blog referred to in the above post contains some additional checks that I've tested out.
  • 14rhb
    14rhb over 5 years ago

    Jan,

    This is really interesting stuff. Am I correct in understanding this detects hardware faults or issues where the hardware glitches rather than poorly designed software (which I assume would run on both cores and so they would both fail simultaneously, or at least a few clock cycles apart)?

    Rod