element14 Community

What does 24GFLOPs GPU speed mean?

Former Member
Former Member over 12 years ago

The 3rd bullet at: 

http://www.element14.com/community/docs/DOC-52938/l/raspberry-pi-model-a--now-available-in-eu

 

says GPU is capable of 24GFLOPs. 

Is this single-precision or double-precision?

Is it sustained or burst-mode only?

Does it rely on operands being in cache memory?

What clock frequency is the GPU?

How many parallel functional units?

What benchmark is used to measure the speed?

 

  • shabaz
    shabaz over 12 years ago

    I found this definition. As an example, apparently the Nvidia GeForce 8400GS (a 2007-era entry-level desktop GPU) had approx. 40 GFLOPS capability.

    Many GPUs have hundreds of processors on the chip apparently. (I'm no expert in this area).

  • jason721
    jason721 over 12 years ago

    Per Wikipedia, GFLOPS stands for giga (billion) floating-point operations per second. It can be calculated with this formula:

    [image: FLOPS formula from the Wikipedia article linked below]

    http://en.wikipedia.org/wiki/FLOPS

     

    All info provided by Broadcom about the BCM2835, which uses the VideoCore 4 GPU:

    http://www.broadcom.com/products/BCM2835

     

    The link below states that the default GPU clock is set at 250MHz:

    http://elinux.org/RPiconfig

     

    That is about all I was able to dig up. I will move this thread over to the Raspberry Pi Group to see if anyone there can provide further input.

     

    Thanks,

    Jason G.

  • Former Member
    Former Member over 12 years ago in reply to jason721

    Hi Jason,

    I really did intend this as a question for Element14 to answer via Feedback and Support, asking what Element14 means when they claim 24GFLOPS performance, rather than a question for discussion between RPi users as to what we think Element14 is claiming.

    I don't see anything at the Broadcom page you mentioned that claims 24GFLOPS.

    User shabaz cites a definition requiring 64-bit precision, which I agree is standard, but I don't think I've seen any claims that the GPU can even do 64-bit floating point.

    I presume the claim is that a single RPi does 24GFLOPS, so I'm not sure why you selected Wikipedia's long formula that includes sockets/node and nodes/chassis.

    There is a claim here:

    http://www.element14.com/community/groups/raspberry-pi/blog/2012/02/29/raspberry-pi-gpu-performance-dominates-the-iphone-4s-and-tegra-2-based-smartphones

    that the RPi VideoCore IV is "dual core", but I think that's a mistake.

  • johnbeetem
    johnbeetem over 12 years ago in reply to Former Member

    It's been a long time since I was active in supercomputing, but here's my understanding.  AFAIK, there's no requirement that FLOPS have to be double-precision.  There's no reason for a GPU to do double precision, so I can't imagine why Broadcom or anyone else would do so.

     

    Generally, GFLOPS is a marketing number, so the specs are as optimistic as possible.  It's generally a "peak" number, and according to one wag it's "a guarantee from the manufacturer that you won't go faster than this".  So they generally just multiply the number of FLOPs the hardware can do per clock cycle by the clock rate, and say "sure, you could go this fast if you could get the operands to the arithmetic hardware".  That's usually very hard to do in general, so it's only with highly structured computations like inner products that you can actually get close to the peak.  If the problem is embarrassingly parallel enough and you do some hand-tuning, you can get pretty close.

     

    Another spec game: the main processing element of a floating-point engine is usually multiply-accumulate, since it's easy to do both at the same time.  These are counted as two FLOPs, so the GFLOPS rating immediately doubles.
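
A rough sketch of the peak arithmetic described above, in Python. The function name, the unit count and the clock rate are all made up for illustration and do not describe any real chip; the only point is how counting a multiply-accumulate as two FLOPs doubles the headline number.

```python
def peak_gflops(units: int, flops_per_unit_per_cycle: int, clock_hz: int) -> float:
    """Theoretical peak in GFLOPS, assuming every unit is busy on every cycle."""
    return units * flops_per_unit_per_cycle * clock_hz / 1e9


# Hypothetical engine (assumption, not a real part): 8 arithmetic units at 500 MHz.
UNITS = 8
CLOCK_HZ = 500_000_000

# Counting each multiply-accumulate as a single operation...
as_one = peak_gflops(UNITS, 1, CLOCK_HZ)
# ...versus counting it as a multiply plus an add, i.e. two FLOPs.
as_two = peak_gflops(UNITS, 2, CLOCK_HZ)

print(f"MAC counted as 1 FLOP:  {as_one:.1f} GFLOPS")
print(f"MAC counted as 2 FLOPs: {as_two:.1f} GFLOPS")
```

Nothing here says how close real code gets to the peak; that depends on whether operands can be delivered to the arithmetic units fast enough.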

     

    As far as benchmarks are concerned, I note that chez RasPi a certain moderator often claims that VideoCore 4 is higher performance than its competitors, but I've never seen him quote an independent study.  If anyone has seen any, let us know.

  • Former Member
    Former Member over 12 years ago

    The core has a 3D hardware accelerator which consists of a bundle of processors, each capable of running SIMD (single instruction, multiple data) 32-bit floating-point calculations. If you multiply the processors, the data words and the clock, you get the 24 GFLOPS.
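
The multiplication above can be sketched in Python as follows. Only the 24 GFLOPS figure and the 250MHz default GPU clock come from this thread. The processor count, the data words per processor and the operations per word are assumptions chosen so that the product works out, and should not be read as a confirmed hardware description.

```python
CLAIMED_GFLOPS = 24        # figure quoted in the thread
CLOCK_HZ = 250_000_000     # default GPU clock quoted earlier in the thread

# FLOPs the hardware would have to complete on every clock cycle.
flops_per_cycle = CLAIMED_GFLOPS * 10**9 // CLOCK_HZ

# ASSUMPTION: illustrative factorization, not stated anywhere in the thread.
processors = 12       # hypothetical number of SIMD processors
lanes = 4             # hypothetical 32-bit data words per processor per cycle
flops_per_lane = 2    # hypothetical: one add and one multiply per word per cycle

peak = processors * lanes * flops_per_lane * CLOCK_HZ / 1e9

print(f"required FLOPs per cycle: {flops_per_cycle}")
print(f"assumed configuration peak: {peak:.0f} GFLOPS")
```

Any other combination whose product is 96 FLOPs per cycle would give the same headline figure at 250MHz, so the number alone does not pin down the architecture.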

     

    Post edit: I forgot that this is alongside the GPU itself, which has a vector floating-point processor which does 32-, 64- or 80-bit precision floating-point operations.

    Which again is alongside two general-purpose (integer) cores. All of that works with special hardware-accelerated H.264 video encode/decode and a plethora of other special multimedia cores.

  • Former Member
    Former Member over 12 years ago in reply to Former Member

    Gert,

      Thanks for the reply.

    We've been told previously that the GPU has a vector unit, but that it is integer only, so it wouldn't be possible to develop floating-point libraries, such as BLAS.  James wrote:

     

      "BLAS has lots of floating point, and the vector core is integer only unfortunately.

      I'll take a look at the interfaces posted though - should save a bit of specing time."

    http://www.raspberrypi.org/phpBB3/viewtopic.php?p=14633

     

    So is there a vector floating-point unit in addition to a vector integer unit?  And if so, would it be theoretically possible to develop libraries like BLAS that could demonstrate something close to 24GFLOPS?

  • Former Member
    Former Member over 12 years ago

    I was part of the team that built the 3D engine, so I know it has vector floating point.

    There are a number of issues with that:

    1/ You are unlikely to get access to the details of how to program it. It is very special and very valuable IP.

    2/ The core is designed for graphics (pixel) calculations, so it has only very few of the IEEE exceptions built in.

  • Former Member
    Former Member over 12 years ago in reply to Former Member

    I think it is well understood that the GPU IP is proprietary. Many people have asked whether it was possible for Broadcom to develop an interface that exposes the 24GFLOPS of performance, but the answer has always been that it could perhaps be done for integers but not for floats.

    So this is an interesting revelation if in fact a floating-point array library could in theory be developed, even without IEEE exceptions.
