element14 Community
Has everyone exPired in here?

Former Member
Former Member over 13 years ago

Well, all the schools are back. Raspberry Pi shipments seem to be going pretty swimmingly now. I really expected this place to be hot enough with discussions to bake a Pi. (Sorry all)

 

Seriously, what is everybody up to? I am still working hard on improving my soldering skills and pushing the Pi to its limits whenever I can.

 

Dark nights drawing in now so peering up to the night sky takes over from solder school soon.

 

Have fun all.

 

Ray

  • GreenYamo
    GreenYamo over 13 years ago

    I've been on holiday with a lack of internet connectivity - a sad thing indeed !

     

    I'm thinking of making one of my Pis a permanent media player (using RaspBMC), and for the other, I need to build my Slice of Pi from Ciseco and then I will be doing a bit of Arduino-like interfacing.

     

    I know this is a Pi forum, but I'll also be getting my Nanode (Networked Arduino) up and running again now the colder times are here and working in my computer room isn't like working in a small furnace :-)

     

    Steve

  • Former Member
    Former Member over 13 years ago in reply to GreenYamo

    Good to see you back with an Internet connection again then, Steve. Funny you should mention furnaces. I have just been reading on the RPi.org site about the new ability to monitor the internal temperature of the SoC on the Pi, and about "official" overclocking. So we can now use our Pis to monitor their own internal furnace temps. It all looks quite useful actually.
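For anyone wanting to try the temperature monitoring, here is a minimal sketch. It assumes the kernel exposes the SoC sensor at `/sys/class/thermal/thermal_zone0/temp` in millidegrees Celsius; that path is an assumption about the installed image, not something stated in the announcement, and the function names are only illustrative.

```python
from pathlib import Path
from typing import Optional

# Assumed sysfs location; it may differ or be absent depending on kernel and firmware.
THERMAL_PATH = Path("/sys/class/thermal/thermal_zone0/temp")


def parse_millidegrees(raw: str) -> float:
    """Convert a sysfs reading in millidegrees Celsius to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_soc_temp(path: Path = THERMAL_PATH) -> Optional[float]:
    """Return the temperature in degrees Celsius, or None if it cannot be read."""
    try:
        return parse_millidegrees(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    temp = read_soc_temp()
    print("SoC temperature:", "unavailable" if temp is None else f"{temp:.1f} C")
```

Returning None instead of raising keeps the script usable on machines where the sensor file is missing.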

     

    Ray

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    The overclocking story is a hoot.  They expect us to believe that a 43% overclock (1000/700=1.43) results in a 75% benchmark improvement for LU decomposition (135.12/77.374=1.75).

    I suspect there's something they aren't telling us, like maybe they used a newer version of gcc for the faster timings.
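The two ratios quoted in this post can be checked with a few lines of Python. This is only the arithmetic from the post; the helper name `percent_gain` is made up for illustration.

```python
def percent_gain(new: float, old: float) -> float:
    """Percentage increase going from old to new."""
    return (new / old - 1.0) * 100.0


clock_gain = percent_gain(1000, 700)    # ARM clock in MHz, turbo vs. default
lu_gain = percent_gain(135.12, 77.374)  # LU decomposition scores quoted in the post

print(f"clock: +{clock_gain:.0f}%  LU benchmark: +{lu_gain:.0f}%")
```

The benchmark gain comes out well above the clock gain, which is the gap the post is questioning.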

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    > I suspect there's something they aren't telling us, like maybe they used a newer version of gcc for the faster timings.

    While I agree that they could be massaging the figures with a newer gcc or hardfloat vs softfloat, looking at the ARM clock increase in isolation will never tell the whole story. I suspect the DRAM clock increasing from 400 to 500 at the same time will lead to a significant improvement as well, simply by reducing a bottleneck for anything that's bound by RAM, like these benchmarks.

    Also as we know the Arm is really just a bolt-on to the GPU and the GPU is likely arbitrating access to everything, the core clock increase could have interesting effects. Without knowledge of the internal architecture it's not really possible for us to gauge how much performance increase a given clock increase could make.

     

    If you look at the commit logs for their kernel source, not the firmware, it's interesting to see that there are quite a few performance-related changes recently: they've switched off tracing options for a 20% increase and to some degree fixed the 20% CPU that running USB was taking, etc. So whatever actual difference you see is probably down to a combination of factors.

     

    You'd need to look at each change in isolation to see where the increases come from. Looking at IO to get a feeling of how much the Arm clock increase really helps may be a better way than pure CPU/RAM benchmarks.

     

     

    For anyone interested, it's probably also possible to change the cpu governor from 'ondemand' to 'performance' along with some other setting to pick the max clock speed and have the increase enabled permanently. Ondemand is great for increasing battery life on a power hungry laptop, but does add latency to the speed transitions.

    There are various tuneable parameters under /sys/devices/system/cpu/ that can be tweaked to alter some of this though.
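As a rough sketch of how to inspect those parameters, the snippet below reads a few standard cpufreq files. It assumes they live under `/sys/devices/system/cpu/cpu0/cpufreq/`; the exact set of files depends on the kernel and cpufreq driver, so anything missing is reported as None.

```python
from pathlib import Path
from typing import Dict, Optional

# Assumed location of the cpufreq files for the first CPU core.
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")
FIELDS = (
    "scaling_governor",
    "scaling_available_governors",
    "scaling_cur_freq",
    "scaling_max_freq",
)


def read_cpufreq(base: Path = CPUFREQ_DIR) -> Dict[str, Optional[str]]:
    """Read each cpufreq field as text; unreadable or missing files map to None."""
    info: Dict[str, Optional[str]] = {}
    for name in FIELDS:
        try:
            info[name] = (base / name).read_text().strip()
        except OSError:
            info[name] = None
    return info


if __name__ == "__main__":
    for key, value in read_cpufreq().items():
        print(f"{key}: {value}")
```

Changing the governor means writing a name from `scaling_available_governors` into `scaling_governor`, which normally needs root; the sketch deliberately only reads.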

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    Selsinork,

    You wrote:

    > I suspect the DRAM clock increasing from 400 to 500 at the same time will lead to a significant improvement as well, simply by reducing a bottleneck for anything that's bound by RAM, like these benchmarks.

    Yes, increasing the DRAM clock by 25% will result in improved times, but the improvement is not additive.  If you increase the CPU by 43% and the DRAM by 25% (and do nothing else), then your expected improvement would be between 43% and 25%, not 43+25.  If the benchmark spends equal amounts of time waiting on the CPU and the DRAM, then it would improve by the average of 43% and 25%, which is 34%.

    The benefit from the 100% GPU speedup is hard to estimate, depending on what percent of the time the benchmark spends waiting on the GPU, which includes the L2 cache.

    But it's easy enough to measure the benefit from turbo mode.  Just run the same benchmark with and without turbo mode.  But they haven't done that, or if they have, they haven't shown the results.  Instead they've implied that turbo mode is responsible for the 75% LU benchmark speedup, which I don't think is plausible.

    If gcc turns out to be a significant factor, then that will also benefit any other competing ARM boards.
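One way to make the "between 43% and 25%" estimate concrete is to weight each speedup by the share of the original run time it applies to. The sketch below assumes the run time splits cleanly into a CPU-bound part and a DRAM-bound part with no overlap, which is a simplification; the function name and the 50/50 split are illustrative only.

```python
def overall_speedup(fractions, speedups):
    """Combine per-component speedups, weighted by each component's share of the original run time."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    # Each component's time shrinks by its own speedup factor.
    new_time = sum(f / s for f, s in zip(fractions, speedups))
    return 1.0 / new_time


cpu = 1000 / 700   # ARM clock ratio, about 1.43
dram = 500 / 400   # DRAM clock ratio, 1.25

even_split = overall_speedup([0.5, 0.5], [cpu, dram])
print(f"50/50 split: +{(even_split - 1) * 100:.0f}%")
```

With an even split this gives roughly a one-third improvement, close to the simple average in the post, and any other split stays between the two individual figures.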

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    > Yes, increasing the DRAM clock by 25% will result in improved times, but the improvement is not additive.  If you increase the CPU by 43% and the DRAM by 25% (and do nothing else), then your expected improvement would be between 43% and 25%, not 43+25.  If the benchmark spends equal amounts of time waiting on the cpu and the DRAM, then it would improve by the average of 43% and 25%, which is 34%.

    Sorry, but it's really not as simple as you make out. Increasing a clock somewhere can have very counterintuitive effects, including causing things to slow down. Trying to use averages is just not going to work.

     

    To illustrate the point, take a step back in time to a simpler processor without L1 or L2 cache, pipelining, branch prediction etc and use some obviously contrived numbers:

    The CPU has a memory access cycle of 10 ns, but RAM has an access speed of 90 MHz (11.1 ns). The CPU has to insert a wait state as a memory access can't be completed in a single CPU cycle, so the total memory cycle is now 20 ns. A very slight increase in RAM speed to 100 MHz, or 11%, leads to what's effectively a 100% improvement.

    The inverse happens if you now increase the CPU speed and drop its memory access cycle to 9 ns: the 10 ns RAM access no longer fits in one CPU cycle, a wait state comes back, and you suffer a huge performance loss.
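The contrived numbers above can be reproduced with a toy model in which a memory access always takes a whole number of CPU cycles. This is only a sketch of that illustration, not a model of the Pi; times are kept in integer picoseconds to avoid rounding surprises, and the function names are invented here.

```python
def mhz_to_ps(mhz: int) -> int:
    """RAM access time in picoseconds for a given access rate in MHz."""
    return round(1_000_000 / mhz)


def access_time_ps(cpu_cycle_ps: int, ram_access_ps: int) -> int:
    """Time for one memory access when it must occupy whole CPU cycles (extra cycles are wait states)."""
    cycles = -(-ram_access_ps // cpu_cycle_ps)  # ceiling division
    return cycles * cpu_cycle_ps


slow_ram = access_time_ps(10_000, mhz_to_ps(90))   # 10 ns CPU cycle, 90 MHz RAM
fast_ram = access_time_ps(10_000, mhz_to_ps(100))  # 10 ns CPU cycle, 100 MHz RAM
fast_cpu = access_time_ps(9_000, mhz_to_ps(100))   # 9 ns CPU cycle, 100 MHz RAM

for label, ps in [("90 MHz RAM", slow_ram), ("100 MHz RAM", fast_ram), ("9 ns CPU", fast_cpu)]:
    print(f"{label}: {ps / 1000:.0f} ns per access")
```

The step from two cycles to one is what turns an 11% RAM change into a halved access time, and the faster CPU falls back onto the wrong side of that step.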

     

    Now that simplistic illustration doesn't really apply to the Pi, the architecture is too complex. It does however show that some tiny improvement in the right place can have drastic effects that are not simple to predict and that simply increasing a clock can actually be the wrong thing to do.

     

    In more modern systems it's more likely that you need to trade off odd ratios of clocks between CPU, L2, RAM and things like pipeline length, burst speed of RAM, cache associativity level, cache line size etc. You may find that RAM is optimised to be able to fill an 8 byte CPU L2 cache line very quickly but then requires some recovery time (which could hurt if your cache line is 16 bytes).

    That leads to things like your supposed compiler improvement coming into play - if the compiler better optimises the code so that it doesn't thrash data into and out of cache then the code runs faster overall. 

    Even better is if the code and data can be squeezed into a footprint that fits inside L2 cache as in that case the GPU clock increase is what will help and as you point out the GPU clock gets a rather large bump. So think what happens when you double the cache speed and also make the benchmark fit inside cache...

     

    gcc optimisations won't necessarily help other competing boards. Ever compiled the Linux kernel? Notice that there are options under x86 to compile for various different generations of both Intel and AMD CPUs. The trick is that compiling something optimised for your CPU's specific quirks gets the most advantage; run the same binary on something slightly different and performance may well suck.

     

    Anyway, as shown in my obviously contrived example, it's certainly possible to get a big increase by going from a (hopefully corner case) pathologically badly optimised setup to the one that sits in some sweet spot where everything aligns correctly and that it might take a relatively minor change to achieve that.

     

    Synthetic benchmarks like those tend to be useless as a real-world indication of actual performance; they're a marketing tool. As such I don't expect to see anything other than their implied 75% increase. It makes for good hype, after all.

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    > Sorry, but it's really not as simple as you make out.

     

    Agreed, but I think it's the best first approximation, and the second-order effects are unquantifiable.

    It's sort of like driving down a street with traffic lights. Sometimes increasing your speed doesn't help at all, because you just get to the next red light faster. And sometimes it helps a lot, because you might get through a green light that you otherwise would have missed. But it's impossible to quantify these effects, and over a long drive they tend to average out.

    On a simple architecture like the ARMv6, where there isn't a lot of out-of-order activity going on, the "relay-race" model is pretty good, where you average the potential speedups of the various participants to get an overall potential improvement.

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    It strikes me that this question is amenable to experiment, should anyone want real numbers :-). With the new 'raspi-config' tool, changing your overclock setting is a matter of a few seconds, plus maybe 45 seconds for the reboot.
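For anyone who does want numbers, a self-timing script is enough: run it once at each overclock setting and compare. The sketch below is a stand-in, not the LU benchmark or nbench discussed in this thread; it times a small pure-Python LU decomposition, and the matrix size and repeat count are arbitrary assumptions.

```python
import time


def make_matrix(n):
    """Build a diagonally dominant n x n matrix so no pivoting is needed."""
    return [[float(n) if i == j else 1.0 / (1 + abs(i - j)) for j in range(n)]
            for i in range(n)]


def lu_decompose(a):
    """In-place Doolittle LU decomposition without pivoting; L (unit diagonal) and U share the matrix."""
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        for i in range(k + 1, n):
            a[i][k] /= pivot          # multiplier, stored in the L part
            factor = a[i][k]
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return a


def time_lu(n=60, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    best = float("inf")
    for _ in range(repeats):
        m = make_matrix(n)
        start = time.perf_counter()
        lu_decompose(m)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    print(f"best of 5: {time_lu() * 1000:.1f} ms")
```

Taking the best of several runs reduces noise from background activity; keeping the compiler, kernel and image identical between runs is what isolates the clock change.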

     

    I haven't measured that benchmark because it doesn't matter to me, but I did post my measurements of the USB speedup here:

    http://www.raspberrypi.org/phpBB3/viewtopic.php?f=28&t=12097&start=500#p179692

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    According to the RPi website...

     

    "Introducing turbo mode: up to 50% more performance for free"

     

    and

     

    "Comparing the new image with 1GHz turbo enabled, against the previous image at 700MHz, nbench reports 52% faster on integer, 64% faster on floating point and 55% faster on memory."

     

    and

     

    "We have enabled Gordon’s “FIQ Fix” in the USB driver, which reduces the USB interrupt rate, improving general performance by about 10%."

     

    All taken together, that seems a decent increase in performance, for free. And they do use nbench, which is a standard performance tester, so it is easy to compare if people can be bothered.

  • Former Member
    Former Member over 13 years ago in reply to Former Member

    > agreed, but I think it's the best first approximation, and the second-order effects are unquantifiable.

    Sure, unquantifiable for us, much less so if you happen to work for the chip designer and have access to the internals of the architecture.

     

    Only other thing I'd add is that in computing things tend to come in big steps when there's a favourable alignment of whatever factors. Outside of those alignment points there's often little to be gained.

  • Former Member
    Former Member over 13 years ago in reply to Former Member
    > All taken together, that seems a decent increase in performance, for free. And they do use nbench, which is a standard performance tester so easy to compare if people can be bothered.

    I think that's the way you have to look at it, as the combination of several different things all happening at once.  Lots of people will simply see the overclocking and assume that's the only factor.

     

    Unfortunately still no accelerated X driver, so all the nbench improvements in the world make no difference when screen redraw 'feels' slow...
