element14 Community
RoadTests & Reviews
Engagement
  • Author: coolbox
  • Date Created: 22 Oct 2014 5:00 AM
  • Last Updated: 11 Oct 2021 3:01 PM
  • Views: 3113
  • Likes: 0
  • Comments: 13

DSP vs. Cortex-M4 with DSP & FPU

Although the boundary between DSPs and MCUs has largely vanished, I still want to hear from you: what is your first choice when working with digital signals? Is it a dedicated DSP like the TI TMS320F28xx, or a high-performance Cortex-M4 core MCU?

  • cortex-m4
  • fpu
  • dsp

Top Comments

  • dougw
    dougw over 9 years ago in reply to vsluiter +3
    Hi Victor, The Cortex-M7 has a 6-stage superscalar pipeline which can execute two instructions simultaneously. The Cortex-M4 can execute just one instruction at one time. The Cortex-M7 is built on a 28…
  • dougw
    dougw over 9 years ago +2
    TI DSPs have a lot of historical support and many examples to help get started. Cortex M4 and OMAP CPUs are great to minimize components in a system. Also check out the new Cortex M7 - much better DSP…
  • DAB
    DAB over 6 years ago

    As a systems engineer, I always like combining functional control with DSP capability.

    I used this approach back in 1982 when I combined an Intel 8085 with a 9811 math coprocessor chip so that I could maintain a FLIR camera using a Doppler radar on board a helicopter to support search and rescue operations. I used the 8085 to control all of the sensors while keeping the 9811 chip fully occupied doing floating-point trigonometric calculations to derive the position angles for the servos.

    Lots of fun, and the pilots loved the result.

    DAB

  • michaelkellett
    michaelkellett over 6 years ago

    I haven't voted because I don't really think the basic question has a simple answer, but it's certainly interesting to talk about.

    I do lots of DSP on Cortex Mx processors. I've done designs for clients that are based on DSP on M0+, M3 and M4, and I'll do M7 too when the dust has settled on the STM32H7 parts.

    In the past I've used proper DSP chips (mainly AD) and I still support a customer project based on the 320F2808.

    As ever, you have to use the right part for the job.

    Most of my current DSP work is done on FPGA, which gives you a step up in both performance and flexibility, but with a terrible hit on development cost and time.

    But I agree with Linas: DSP chips are optimised for DSP work and, clock for clock, massively outperform general-purpose processors at it. On the other hand, general-purpose processors (I'm thinking Cortex Mx here) have got pretty good and can often eat up the work of small DSPs. My 320F2808 design would go on a Cortex-M4 if I did it today.

    M
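
The clock-for-clock gap described above comes down largely to the multiply-accumulate (MAC) loop at the heart of most filters. Below is a minimal sketch of that loop in portable C, assuming Q15 fixed-point samples. The names `fir_q15` and `sat_q15` are illustrative and not taken from any vendor library. A dedicated DSP typically has hardware built around exactly this inner loop, and the Cortex-M4's DSP extension adds dual 16-bit MAC instructions such as SMLAD, so how fast it runs depends on what the compiler or a hand-tuned library makes of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clamp a wide accumulator to the signed 16-bit (Q15) range. */
static int16_t sat_q15(int64_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/*
 * Q15 FIR filter, valid region only:
 *   y[m] = sum over k of h[k] * x[m + taps - 1 - k],  m = 0 .. len - taps
 * The caller provides y with room for len - taps + 1 samples.
 */
void fir_q15(const int16_t *x, size_t len,
             const int16_t *h, size_t taps, int16_t *y)
{
    assert(x && h && y && taps > 0 && len >= taps);

    for (size_t n = taps - 1; n < len; ++n) {
        int64_t acc = 0;                      /* wide accumulator, no overflow */
        for (size_t k = 0; k < taps; ++k) {
            /* One MAC per tap: this is the operation DSP hardware targets. */
            acc += (int32_t)h[k] * (int32_t)x[n - k];
        }
        /* Q30 product sum back to Q15. Assumes an arithmetic right shift
           for negative values, as GCC and Clang implement it. */
        y[n - (taps - 1)] = sat_q15(acc >> 15);
    }
}
```

With `h = {16384, 16384}` (two taps of 0.5 in Q15) this behaves as a two-point moving average.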

  • shabaz
    shabaz over 7 years ago in reply to dougw

    Hi Doug,

    I agree. A DSP is a fully independent processor (standalone, or multiple cores, with or without other processor cores). NEON just takes advantage of the fact that it is possible to fill wide registers (relatively) rapidly from contiguous memory locations and then execute some operations in parallel using these registers. NEON doesn't have a separate program counter or control unit; it shares them with the main CPU. A DSP, by contrast, can run autonomously, reading in data, processing it and writing it out. NEON is 'fed' with wide data as part of the normal execution of the main CPU. Some notes on it are here: BBB, NEON and making Tintin bigger

    Basically it is an accelerator, like an FPU for example. (I'm just a beginner with NEON, so I may have made mistakes in my understanding.)

    The only point I think I was trying to make at the time was regarding this:

    "The compiler had an option to use the NEON core, and the code did run faster with this option enabled, so my guess is that NEON did have some impact on execution speed."

    However, one can't turn on a flag and expect C code to actually use NEON, because no C operation maps to NEON instruction(s), nor does gcc today get a 'hint' that some part of the code could be translated into NEON instructions. The only known ways of invoking NEON functionality in C are (a) to explicitly type NEON 'intrinsic' functions into the code, or (b) to include code or link to libraries that have been written to take advantage of NEON (and if that code is written in C, then it will be using NEON intrinsics). Parts or all of the code could also be written in assembler (e.g. inline assembler) using the NEON instruction set.

    Since ARM is popular nowadays, many libraries have been written to use NEON (i.e. they explicitly have #ifdefs and #includes with NEON-specific code when compiling for ARM+NEON), so one could be using NEON without being aware of it. But that applies only if you are using such libraries (usually easy to tell: there would be some information on the project page, or you can 'grep' the source code).
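
The library pattern described above can be sketched as follows: a NEON path written with intrinsics from `arm_neon.h`, guarded by a preprocessor check, plus a plain C fallback. This is a minimal illustration and not code from any particular library. `add_sat_u8` and `HAVE_NEON` are hypothetical names, and the guard assumes the compiler defines `__ARM_NEON` (or the older `__ARM_NEON__`) when NEON is enabled, for example with GCC's `-mfpu=neon` on 32-bit ARM.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compile-time check: only pull in the NEON header when targeting NEON. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#define HAVE_NEON 1
#else
#define HAVE_NEON 0
#endif

/* dst[i] = min(255, a[i] + b[i]) for n bytes (saturating add). */
void add_sat_u8(const uint8_t *a, const uint8_t *b, uint8_t *dst, size_t n)
{
    assert(a && b && dst);
    size_t i = 0;

#if HAVE_NEON
    /* NEON path: 16 bytes per iteration in one 128-bit register. */
    for (; i + 16 <= n; i += 16) {
        uint8x16_t va = vld1q_u8(a + i);       /* load 16 contiguous bytes */
        uint8x16_t vb = vld1q_u8(b + i);
        vst1q_u8(dst + i, vqaddq_u8(va, vb));  /* saturating add, store */
    }
#endif

    /* Scalar path: handles the tail, or everything when NEON is absent. */
    for (; i < n; ++i) {
        unsigned sum = (unsigned)a[i] + (unsigned)b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

Built for a target without NEON, the same source still compiles and gives the same results through the scalar loop, which is why a caller may never notice which path is in use.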

     

    With a DSP you can code in normal C and it will execute on the DSP (since it has a control unit and program counter like any usual processor), but DSPs have more sophisticated compilers nowadays (I'm no expert) which may take hints from certain C code structures. To make full use of the DSP's performance and features, the usual intrinsics or inline assembler are still possible too. The only DSP I've used is the 56k architecture, which is quite ancient.

  • dougw
    dougw over 7 years ago in reply to shabaz

    I gather the NEON instruction capability in M4 and M7 is not competing directly with a full DSP - it is to allow the M series to handle "DSP" tasks a little better than previous ARM instruction sets. A modern DSP should still significantly out-perform a Cortex M7 at DSP tasks.

  • shabaz
    shabaz over 9 years ago in reply to Kilohercas

    Hi!

    You do need the flag, but GCC won't do anything with it unless your code (or any library code you're linking in) uses the special functions (intrinsics), i.e. the code needs to be specifically designed to use NEON.

    So perhaps your linked-in code was designed for NEON.

    If you can explicitly arrange your data as (say) 8-bit values and can parallelize the work, then using the special functions for basic operators should speed things up 5-10 times, from what I understand (and I also saw this in limited experiments at the time). Perhaps it still won't compare with a DSP if the use case is fractal generation (I have no idea).

    Software libraries (video/image libraries, I suppose) designed to take advantage of NEON where it exists are available.

  • Kilohercas
    Kilohercas over 9 years ago in reply to shabaz

    The compiler had an option to use the NEON core, and the code did run faster with this option enabled, so my guess is that NEON did have some impact on execution speed.

  • shabaz
    shabaz over 9 years ago in reply to Kilohercas

    Hi Linas,

    Unless your code had special functions to support NEON (or used a library which uses such functions), your code didn't use the NEON capability. GCC doesn't make use of NEON automatically (even though SHARC's compiler may), so code needs to use NEON explicitly (there are functions that GCC understands) for SIMD acceleration to occur.

    I investigated NEON a while back for the BBB (I'm no expert on NEON) and I had to explicitly call special functions that GCC understood and replaced with NEON instructions.

  • Kilohercas
    Kilohercas over 9 years ago in reply to vsluiter

    There are multiple problems with each processor.

    This test involves complex math inside a VERY long loop. The STM32F407 is a low-clocked processor that is not efficient at math, so it is very slow.
    The BeagleBone does have a NEON core, which is a small DSP for parallel computation. I don't know whether my code made use of this core. The compiler was set up correctly, so NEON should have been used, since SIMD can be used to multiply several numbers at once.

    The ADSP-21489 is very strong at for loops, with good compiler performance; that's why, even at half the clock, it gives a much higher FPS on a bigger screen.

    This test simply means that if you want to do math, use a DSP, not a Cortex-M/A/R core.
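
The thread does not show the benchmark source. Assuming it is a Mandelbrot-style escape-time fractal (suggested only by the mentions of "fractals" and "complex math inside a very long loop"), the per-pixel work would look roughly like the sketch below. `mandel_iter`, `render`, and the coordinate window are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Iterate z <- z^2 + c from z = 0; return the number of steps taken
   before |z| exceeds 2, capped at max_iter. */
unsigned mandel_iter(float cr, float ci, unsigned max_iter)
{
    float zr = 0.0f, zi = 0.0f;
    unsigned n = 0;
    while (n < max_iter && zr * zr + zi * zi <= 4.0f) {
        float t = zr * zr - zi * zi + cr;   /* real part of z^2 + c */
        zi = 2.0f * zr * zi + ci;           /* imaginary part of z^2 + c */
        zr = t;
        ++n;
    }
    return n;
}

/* Fill a width x height buffer covering re in [-2, 1), im in [-1, 1).
   Returns the total number of inner-loop iterations performed. */
unsigned long render(uint8_t *buf, unsigned width, unsigned height,
                     unsigned max_iter)
{
    assert(buf && width > 0 && height > 0);
    unsigned long total = 0;
    for (unsigned y = 0; y < height; ++y) {
        for (unsigned x = 0; x < width; ++x) {
            float cr = -2.0f + 3.0f * (float)x / (float)width;
            float ci = -1.0f + 2.0f * (float)y / (float)height;
            unsigned n = mandel_iter(cr, ci, max_iter);
            buf[y * width + x] = (uint8_t)(n & 0xFFu);  /* low 8 bits as shade */
            total += n;
        }
    }
    return total;
}
```

In a loop of this shape the number of `mandel_iter` calls equals the pixel count, so a larger screen means proportionally more calls. Each pixel also exits its inner loop at a different, data-dependent point, which is one plausible reason such code gains little from SIMD unless it is restructured by hand.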

  • vsluiter
    vsluiter over 9 years ago in reply to Kilohercas

    ... But then what is your comparison? Is the Beaglebone the weakest link, or the STM32? In the first case, you're comparing a DSP to an Application processor.

  • Kilohercas
    Kilohercas over 9 years ago

    A DSP at the same clock frequency will destroy an ARM Cortex-M4 MCU.

    The ARM architecture is very flexible for complex tasks. A DSP, on the other hand, has hardware specially designed for doing a lot of calculations in a very short time.

    My ADSP-21489 running at 450 MHz is much, much faster than the BeagleBone (ARM Cortex-A8 with a NEON core (small DSP)) running at 1.2 GHz.
    Take a look (note that the screen for the Cortex is 320x240, while for the DSP it is much larger, 480x272, and it is still producing much higher FPS; I also have a video showing how fast the Nios II/f soft-core processor is).

    [two embedded videos]

    Note that for the STM32F407 and the BeagleBone you can clearly see the calculation speed from the line that is updating the screen. For the ADSP it is invisible, since it is pushing a lot of FPS to a much larger screen, around 2x larger, meaning about 2x more calculations.
