RoadTest: RoadTest the Raspberry Pi 4 Model B (2GB)
Author: bratoff
Creation date:
Evaluation Type: Development Boards & Tools
Did you receive all parts the manufacturer stated would be included in the package?: True
What other parts do you consider comparable to this product?: I do not know of another current SBC with comparable performance.
What were the biggest problems encountered?: Lack of setup information.
Detailed Review:
My plan for this road test was to run a series of benchmarks for various use cases, particularly those which have been a performance challenge for earlier generations of the Pi. As a point of reference, I ran the same set of tests against an identically-configured Raspberry Pi 3B. I also did some comparison testing using common activities that most of us have done on the Pi, to get a subjective feel for the difference in user experience.
It is reasonable to ask why I chose a Pi 3B as my base rather than the newer Pi 3B+. There are several reasons for this. First of all, while I own a couple of 3B’s, I do not own a 3B+. But more importantly, there are very few differences between these two half-generations that I expected to be of significance to the conclusions of my testing. The 3B+ uses essentially the same BCM2837 SoC as the 3B, albeit at a slightly higher clock rate and in a metal package to help with the additional heat dissipation. While the 3B+ does have upgraded components for dual-band WiFi and gigabit Ethernet, the lack of additional I/O lanes in the SoC forced the designers to route these interfaces through the 3B+’s USB 2.0, preventing their true performance potential from being realized. By comparison, the 4B’s BCM2711 SoC represents a true 2-generation jump in core technology (Cortex A53 vs Cortex A72), as well as a 2-generation jump in GPU (VideoCore IV vs VideoCore VI). In addition, the newer SoC’s expanded I/O capability allows all the Pi 4’s peripherals to have their own lanes, enabling the prospect of achieving maximum benefit from the upgraded I/O components.
Each of the test Pi cards was fitted with a brand-new 32 GB SanDisk Ultra micro SD card, loaded with the latest Raspbian Buster “full” distribution. Each model was powered by its own 2-amp power supply. The same 1080 HD monitor and USB wireless mouse/keyboard combo were used on both cards. (On the Pi 4, the HDMI0 interface was used and the mouse/keyboard dongle was in one of the USB 2.0 ports.) Both Raspbian installs were given the latest updates, and the CPU and temperature monitors were added to the taskbar. The splash screen was disabled so that I could observe the boot-up and shutdown logs on the HDMI display. As I was aware of the potential heat issues with these models, I affixed one of the commonly available 15x15x5mm thermal adhesive backed heat sinks to each processor chip. No enclosures or active cooling were used. Network connectivity for both gigabit wired and dual-band WiFi was provided by a Netgear R7000 located approximately 5 feet from the Pi under test. The wired Ethernet cable remained connected except during the WiFi-specific tests.
The packaging of the Raspberry Pi is famously Spartan, and the Pi 4 continues this tradition, perhaps even more so. The lightweight box contains only the Pi 4 card, a fold-out “Safety and User Guide” with all the usual warnings and caveats in multiple languages, and a little reminder card with no words but six cryptic images. As nearly as I can figure out, the images are:
Not only was there no anti-static packaging, but there was also nothing alerting the buyer to the fact that the power and HDMI connectors differ from prior models. While many prospective buyers might have already become aware of this through product advertising or the purchasing process, of greater concern is the lack of any information in the box regarding minimum power requirements or which of the two HDMI ports should be used during initial start-up and installation.
That last point – the lack of information on the two HDMI ports – proved to be an issue during the initial installation of Raspbian from a fresh image. The first thing Raspbian Buster does at power-on is to check for and attempt to update the boot loader on the Pi 4. This means that there is no output on either HDMI port for just long enough to make the user think the Pi has failed. It is also not indicated anywhere that if a boot loader update does occur, the Pi must be power-cycled in order to boot up Raspbian. During my initial testing, all of this was determined by trial and error and the Pi 4 did eventually boot up to the initial configuration tool with the monitor connected to HDMI0.
The remainder of the installation went very smoothly, with no further surprises. The new configuration dialog works well and I soon had a fully working Pi 4. It should be noted that even during this initial setup, the new card’s improved speed and responsiveness were already in evidence. The overall experience felt much more like a small desktop than a single board computer.
It was at this time that the question of thermal issues first arose. After just a few minutes of routine operation, not just the processor but the entire card and all of the connectors get quite warm. I surmise from this that the ground layer of the circuit board is somehow part of the on-board passive cooling. It was at this point that I decided to apply a small heat sink to each Pi’s processor chip and to monitor temperature during all of my tests.
The next step was to install all the test software on both machines.
glxgears – this is a simple graphics benchmark that can be installed using apt.
glmark2 – a much more elaborate graphics benchmark, retrieved from GitHub and built on the Pi. In addition to it being a good comparison test for the VideoCore graphics coprocessors, the lengthy build process gave me the opportunity to benchmark another common activity.
sysbench – a popular Linux benchmark suite, installable using apt.
hardinfo – this is mainly a system information utility, but it contains some useful CPU and FPU benchmarks. It’s installed using apt.
Phoronix Test Suite – this might be the Mac Daddy of all testing and benchmarking tools. It pulls together several hundred possible tests and benchmarks under a common UI. I chose a modest selection of 15 benchmarks, based on a combination of the tool’s recommendation feature and their availability on the Raspbian platform. PTS comes as a .deb file that must be downloaded from the Phoronix home page and installed using dpkg.
The first test run was glxgears. It runs from a shell prompt and creates an additional window in which it displays an animation of three meshed, rotating gears. Meanwhile the original text window displays the current frame rate. This is supposed to be a pass/fail test – either the frame rate matches the refresh rate of the monitor (60 Hz in this case) or it doesn’t. For each Pi, I ran the test long enough to get 10 frame rate reports and then averaged the results.
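The averaging step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used in the review: it assumes glxgears prints its usual report lines of the form "301 frames in 5.0 seconds = 60.123 FPS", and the function name and sample text are hypothetical.

```python
import re
from statistics import mean

# Matches glxgears report lines such as:
#   "301 frames in 5.0 seconds = 60.123 FPS"
FPS_LINE = re.compile(r"=\s*([0-9]+(?:\.[0-9]+)?)\s*FPS")


def average_fps(output: str, reports: int = 10) -> float:
    """Average the first `reports` frame-rate values found in glxgears output."""
    values = [float(m.group(1)) for m in FPS_LINE.finditer(output)]
    if len(values) < reports:
        raise ValueError(f"expected {reports} reports, found {len(values)}")
    return mean(values[:reports])


# Hypothetical sample output for illustration only, not measured data.
sample = "\n".join(
    f"{300 + i} frames in 5.0 seconds = {60.0 + i * 0.01:.3f} FPS"
    for i in range(10)
)
print(f"average: {average_fps(sample):.3f} FPS")
```

In practice the captured terminal text from a glxgears run would be passed in place of `sample`.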
I initially encountered an interesting anomaly in running this simple test: the Pi 4B produced the expected 60 frames per second, but the Pi 3B was displaying an absurd 255 fps. After a bit of investigating I discovered that while OpenGL was enabled by default on the 4B, I needed to go into raspi-config and turn it on (it’s under Advanced Settings) on the 3B. After making that change and rebooting, the 3B also passed at 60 fps.
To take things a small step further, I decided to re-run the test with the graphic window maximized, with the text window overlaying and partially obscuring it. This works the rest of the system a bit harder by bringing a few more X operations into play. It was here that I found my first performance difference, although it was much less than expected. The 3B ran full-screen at 32.13 fps, while the 4B managed 36.65, a smaller than expected but measurable 14% improvement. Some of the other tests produced a much larger performance difference, as you’ll see below.
Next I ran the glmark2 tests. As previously mentioned, this benchmark needs to be downloaded in source form and its makefile run to produce the benchmark executable. The fairly lengthy build compiles 72 separate source modules and constructs two object libraries before linking the final executable. Based on another user’s comments, I built the glmark2-es2 variant, as that was the one which he had successfully built and run on Raspbian. The timings for the build can be found below under System Testing.
The glmark2-es2 benchmark generates a number of different 3D images and applies a variety of rotation, lighting, texture and anti-aliasing effects to each of them. This gives the VideoCore GPU a thorough workout. Each test is repeated multiple times, and the program then combines the results to produce an overall score, where higher is better. It is here that the 2-generation difference in GPUs became apparent, with the 3B’s VideoCore IV scoring a 25 and the 4B’s VideoCore VI thoroughly crushing it with a score of 134 – a difference of over 400 percent! The 3D motion effects on the 4B were completely fluid, with no sign of stuttering or artifacts, while on the 3B you could clearly see the jerkiness caused by the much lower frame rate.
The first CPU test I ran was the sysbench cpu test, which I ran first in single-thread mode and then with 4 threads, in order to see the difference when all four cores in each chip are in play. The increased performance of the Pi 4’s A72 cores was evident in both tests. It completed the single-core test in an average of 92.78 seconds, compared to the Pi 3’s average of 144.56 seconds. This is an improvement of nearly 36%, and therefore cannot be explained solely by the 25% difference in system clock speed. In the 4-thread test, both results scaled more or less consistently with the additional active cores, giving average results of 23.28 seconds for the Pi 4 versus 37.8 seconds for the Pi 3, a difference of approximately 38%.
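The arithmetic behind these sysbench figures can be checked with a short worked example. The run times are the ones quoted above; the stock clock speeds (1.2 GHz for the Pi 3B, 1.5 GHz for the Pi 4B) are an assumption consistent with the 25% clock difference mentioned in the text, and the function names are illustrative.

```python
def time_reduction(old_s: float, new_s: float) -> float:
    """Percent reduction in run time going from old_s to new_s."""
    return (old_s - new_s) / old_s * 100


def speedup(old_s: float, new_s: float) -> float:
    """How many times faster the new run is."""
    return old_s / new_s


# Run times in seconds, taken from the review text.
PI3_SINGLE, PI4_SINGLE = 144.56, 92.78
PI3_QUAD, PI4_QUAD = 37.8, 23.28

# ASSUMPTION: stock clocks of 1.2 GHz (Pi 3B) and 1.5 GHz (Pi 4B).
CLOCK_RATIO = 1.5 / 1.2

print(f"single-thread reduction: {time_reduction(PI3_SINGLE, PI4_SINGLE):.1f}%")
print(f"4-thread reduction:      {time_reduction(PI3_QUAD, PI4_QUAD):.1f}%")

# Speedup left over after dividing out the clock difference.
per_clock = speedup(PI3_SINGLE, PI4_SINGLE) / CLOCK_RATIO
print(f"single-thread speedup per clock: {per_clock:.2f}x")

# How well each board scales from 1 to 4 threads (ideal would be 4.0).
print(f"Pi 3B thread scaling: {speedup(PI3_SINGLE, PI3_QUAD):.2f}x")
print(f"Pi 4B thread scaling: {speedup(PI4_SINGLE, PI4_QUAD):.2f}x")
```

The per-clock figure comes out above 1.0, which is the point made in the text: the gain is larger than the clock increase alone would give.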
The hardinfo utility contains five CPU tests and two FPU tests which each stress the compute cores in different ways. The degree of improvement varied significantly from one CPU test to the next, but in all but one case continued to demonstrate the superior core efficiency of the Pi 4. The results are summarized in this table:
Test name | Better is | Pi 3B Score | Pi 4B Score | % Difference |
---|---|---|---|---|
Blowfish | Lower | 10.65 | 7.19 | 32% |
Cryptohash | Higher | 128.90 | 334.37 | 159% |
Fibonacci | Lower | 4.01 | 2.43 | 39% |
N-Queens | Lower | 9.39 | 11.27 | -20% (Pi 4B slower) |
Zlib | Higher | 0.12 | 0.22 | 83% |
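The "% Difference" column appears to follow a simple convention, sketched below: for lower-is-better tests it is the reduction relative to the Pi 3B score, and for higher-is-better tests it is the increase relative to the Pi 3B score. This convention is inferred from the numbers rather than stated in the review; it reproduces every row of the table, with N-Queens coming out negative because the Pi 4B was slower on that test.

```python
def percent_difference(pi3: float, pi4: float, better: str) -> float:
    """Signed improvement of the Pi 4B over the Pi 3B, in percent.

    Positive means the Pi 4B did better, negative means it did worse.
    """
    if better == "Lower":
        return (pi3 - pi4) / pi3 * 100
    if better == "Higher":
        return (pi4 - pi3) / pi3 * 100
    raise ValueError(f"unknown direction: {better!r}")


# Scores copied from the hardinfo CPU table: (better is, Pi 3B, Pi 4B).
HARDINFO_CPU = {
    "Blowfish": ("Lower", 10.65, 7.19),
    "Cryptohash": ("Higher", 128.90, 334.37),
    "Fibonacci": ("Lower", 4.01, 2.43),
    "N-Queens": ("Lower", 9.39, 11.27),
    "Zlib": ("Higher", 0.12, 0.22),
}

for name, (better, pi3, pi4) in HARDINFO_CPU.items():
    print(f"{name:<11} {percent_difference(pi3, pi4, better):+.0f}%")
```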
The results from the FPU (floating point computation) tests were even more impressive:
Test Name | Better is | Pi 3B Score | Pi 4B Score | % Difference |
---|---|---|---|---|
FFT | Lower | 12.01 | 6.11 | 49% |
Ray Tracing | Lower | 7.01 | 3.09 | 56% |
Three CPU tests from the Phoronix suite were also run. While the difference in performance was similar, it was in these tests that I first encountered temperature warnings and potential thermal limiting on both boards, in the 2-part “john-the-ripper” test. The first part of this benchmark repeatedly executes a brute-force encryption-cracking algorithm which is very compute-intensive. Since Phoronix also manages the repetitions and averaging of results, all passes are run back-to-back with no time for the processor to cool off. In this part of the testing, the 4B reached a CPU temperature of 82 degrees Celsius and exhibited signs of thermal throttling. The 3B got to 78 degrees and also showed signs of throttling.
The second part of “john-the-ripper” uses another compute-intensive activity, the MD5 hashing algorithm. This pushed both boards to their thermal limits, with the 3B reaching a whopping 83 degrees and the 4B not far behind at 82 degrees Celsius. We must conclude from this experience that active cooling is needed when either of these boards is to be used in a compute-intensive application.
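For readers who want to log temperature and throttling during a run like this, here is a minimal sketch. It is not how the readings above were taken (those came from the taskbar monitor). It assumes a Linux system that exposes the CPU temperature in millidegrees at `/sys/class/thermal/thermal_zone0/temp`, and it decodes the hexadecimal value that the Raspberry Pi `vcgencmd get_throttled` command prints; the function names are hypothetical.

```python
from pathlib import Path
from typing import List

# ASSUMPTION: the SoC temperature is exposed here, in millidegrees Celsius.
THERMAL_ZONE = Path("/sys/class/thermal/thermal_zone0/temp")

# Bit meanings of the value reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def parse_temp(raw: str) -> float:
    """Convert a millidegree reading such as '82000' to degrees Celsius."""
    return int(raw.strip()) / 1000.0


def read_cpu_temp() -> float:
    """Read the current CPU temperature (only works where THERMAL_ZONE exists)."""
    return parse_temp(THERMAL_ZONE.read_text())


def decode_throttled(line: str) -> List[str]:
    """Decode a line such as 'throttled=0x40004' into readable flags."""
    value = int(line.strip().split("=", 1)[1], 16)
    return [text for bit, text in sorted(THROTTLE_FLAGS.items())
            if value & (1 << bit)]


# Illustrative inputs, not captured from the boards under test.
print(parse_temp("82000\n"))
print(decode_throttled("throttled=0x40004"))
```

Calling `read_cpu_temp()` once a second alongside a benchmark and saving the values would give a temperature trace to set against the scores.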
The third CPU benchmark runs an MP3 encoding algorithm repeatedly. There were no signs of thermal distress from either board while running this test. CPU utilization did not exceed 25% in this test, while it maxed out at 100% when running the two john-the-ripper tests. This suggests a difference in the ability to use multiple cores between the three tests.
Here are the results from the Phoronix CPU tests:
Test Name | Better is | Pi 3B Score | Pi 4B Score | % Difference |
---|---|---|---|---|
John-the-ripper blowfish | Higher | 638 | 857 | 34% |
John-the-ripper MD5 | Higher | 12921 | 21926 | 70% |
Encode-mp3 | Lower | 80.87 | 41.55 | 49% |
Putting it all together, here is our CPU/FPU performance chart:
I have indicated the tests where a higher score is better. For all others, lower is better. In order to make the chart more readable, I have also normalized some of the results so that all test numbers are within the same order of magnitude. This is indicated by the multiplier or divider after the test name.
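One way to do this kind of normalization is sketched below. It is only an illustration of the idea, not the method used to produce the chart: each test's pair of scores is scaled by a single power of ten so that the larger score lands between 10 and 100, and the multiplier or divider is appended to the test name. The target range and the function name are assumptions.

```python
import math


def normalize(name: str, pi3: float, pi4: float, target_exp: int = 1):
    """Scale both scores by one power of ten for charting.

    With the default target_exp=1 the larger score lands in [10, 100).
    Returns the relabeled test name and the two scaled scores.
    """
    exp = math.floor(math.log10(max(pi3, pi4)))
    shift = target_exp - exp
    if shift > 0:
        factor = 10 ** shift
        return f"{name} (x{factor})", pi3 * factor, pi4 * factor
    if shift < 0:
        factor = 10 ** (-shift)
        return f"{name} (/{factor})", pi3 / factor, pi4 / factor
    return name, pi3, pi4


# Scores copied from the tables in this review.
for row in [
    ("Blowfish", 10.65, 7.19),
    ("Cryptohash", 128.90, 334.37),
    ("Zlib", 0.12, 0.22),
    ("John-the-ripper MD5", 12921, 21926),
]:
    label, a, b = normalize(*row)
    print(f"{label:<28} {a:8.2f} {b:8.2f}")
```

Scaling both boards' scores by the same factor keeps their ratio intact, so the relative bar heights in a chart still reflect the measured difference.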
Five tests were chosen from the “System” category in the Phoronix suite. As opposed to the straight CPU tests, these benchmarks are intended to exercise both the CPU and system memory, giving additional perspective on how a system will perform on real-world applications. The big difference in the Pi 4’s performance on these tests suggests that not just the CPU but also memory speed and bus efficiency have been greatly improved.
Of the five tests chosen, “pybench” was originally designed to benchmark one python implementation against another, by performing operations commonly found in the compilation process. The other four tests each perform a different graphics transformation and are based on the popular “gimp” graphics utility. These graphics tests use large amounts of memory and computation time, stressing CPU, RAM and the memory interface.
Here are the results:
Test Name | Better is | Pi 3B Score | Pi 4B Score | % Difference |
---|---|---|---|---|
Pybench | Lower | 16904 | 5259 | 69% |
Unsharp mask | Lower | 23.34 | 4.12 | 82% |
Resize | Lower | 8.74 | 4.79 | 45% |
Rotate | Lower | 8.61 | 5.26 | 39% |
Auto levels | Lower | 12.21 | 4.36 | 64% |
The Pi 4B has an unfair advantage in pure network testing because of its onboard gigabit wired Ethernet and dual-band WiFi, compared to the Pi 3B’s 100 megabit wired and 2.4 GHz-only WiFi. What we can learn from these tests, however, is how close each board comes to its theoretical maximum network capability. As it turns out they both do quite well, with the Pi 4’s only advantage being attributable to its faster interfaces.
I chose three tests that stress common network activities. The first test initiates 20 TCP connections and sends a continuous stream of packets over each concurrently for a period of 360 seconds. The next test is similar, but uses only a single connection. The difference between the two should provide some indication of lost throughput due to the overhead of multiple connections. The third test does not transfer data packets, but simply attempts to establish and terminate as many TCP connections as it can in the allotted test duration. This test is concurrently executed in four threads. All three of these tests were repeated using both the wired and the WiFi interface. As with the other tests, multiple runs were performed and the results averaged to produce the numbers here.
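To make the third test concrete, here is a rough sketch of what a connection-rate measurement does. It is not the benchmark tool used in this review and its numbers are not comparable with the table: it runs in a single thread rather than four, and the demonstration connects to a throwaway listener on the local machine instead of a remote server. All names are hypothetical.

```python
import socket
import threading
import time
from typing import Tuple


def start_local_listener() -> Tuple[socket.socket, int]:
    """Start a TCP listener on localhost that accepts and closes connections."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    srv.listen(128)

    def accept_loop() -> None:
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:  # listener socket was closed
                return
            conn.close()

    threading.Thread(target=accept_loop, daemon=True).start()
    return srv, srv.getsockname()[1]


def connection_rate(host: str, port: int, duration_s: float) -> float:
    """Open and close TCP connections for duration_s; return connections/second."""
    count = 0
    start = time.perf_counter()
    while time.perf_counter() - start < duration_s:
        # No payload is sent: each iteration is just connect, then close.
        with socket.create_connection((host, port), timeout=5):
            count += 1
    return count / (time.perf_counter() - start)


listener, port = start_local_listener()
rate = connection_rate("127.0.0.1", port, duration_s=0.2)
print(f"{rate:.0f} connections/second over loopback")
```

Pointing `connection_rate` at a listener on another machine would exercise the network interface instead of the loopback path.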
The test server used in these benchmarks was a high-end Windows 10 PC with a gigabit wired interface, located on the same wired network segment as the Pi under test. Thus, on the WiFi tests only the Pi end was in fact wireless.
Here are the numbers:
Test Name | Better is | Pi 3B Score | Pi 4B Score |
---|---|---|---|
TCP 20 threads wired | Higher | 94.90 | 942 |
TCP 20 threads WiFi | Higher | 38.87 | 65.97 |
TCP 1 thread wired | Higher | 94.90 | 946 |
TCP 1 thread WiFi | Higher | 39.97 | 62.77 |
Connection rate wired | Higher | 4563 | 1707 |
Connection rate WiFi | Higher | 1093 | 1587 |
You will notice one surprising anomaly in the data: The Pi 3B wired connection seems to be capable of a much higher connection rate than the Pi 4B. I do not have an explanation for this. In all the other tests, we see each board pretty much living up to its theoretical maximum capability.
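The "living up to its theoretical maximum" observation can be put into numbers for the wired tests. The sketch below rests on one loudly stated assumption: that the TCP scores in the table are throughput in Mbit/s, which the table itself does not say. The nominal link rates (100 Mbit/s and 1 Gbit/s) come from the description of the two boards above. The WiFi rows are left out because their theoretical maximum depends on link conditions not recorded here.

```python
def link_utilization(measured: float, link_rate: float) -> float:
    """Measured throughput as a percentage of the nominal link rate."""
    return measured / link_rate * 100


# ASSUMPTION: TCP scores are in Mbit/s; the table does not state units.
# Nominal wired link rates: 100 Mbit/s (Pi 3B), 1000 Mbit/s (Pi 4B).
WIRED = {
    "Pi 3B": {"link": 100.0, "20 threads": 94.90, "1 thread": 94.90},
    "Pi 4B": {"link": 1000.0, "20 threads": 942.0, "1 thread": 946.0},
}

for board, row in WIRED.items():
    for test in ("20 threads", "1 thread"):
        pct = link_utilization(row[test], row["link"])
        print(f"{board} wired, {test}: {pct:.1f}% of nominal link rate")
```

Under that assumption both boards reach roughly 94 to 95 percent of their nominal wired rate.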
My final testing category is what I am calling use case testing. I have chosen a collection of basic activities that most Pi users will do numerous times while developing their application or in day to day use of their Pi. The tests are:
Build a large application – as mentioned earlier, the glmark2 benchmark must be downloaded in source form from GitHub and built on the target computer. This is a lengthy build that involves compiling 72 separate source modules, linking them into two libraries, and then linking the final executable. Anyone developing on the Pi will be familiar with this process. It provides a good workout for the processor, memory and filesystem.
Load chromium browser – one of the irritations on the Pi (to me anyway) is how long Chromium takes to load. It’s a very large executable and we all use it all the time, so there’s a real value to any improvement here. In order to eliminate the effects of file system caching, each timed load was done from a fresh reboot. Chromium was loaded via the default taskbar icon. As with the other tests, multiple passes were averaged to provide the final result.
Copy a large file from NAS to desktop – I have a WD MyCloud which houses my extensive movie collection, mostly in ISO format. For this test, I chose a 2 GB file containing a concert video and measured the time to copy it from my NAS to the Pi desktop using the File Manager. This test was repeated via both the wired and the WiFi connections.
Boot to Raspbian desktop – as previously mentioned, I disabled the splash screen so that I could observe the boot-up log. I measured the time from application of power until the desktop wallpaper first appeared.
Reboot to steady state – you are probably aware that even after the desktop appears, Raspbian is still running several startup tasks for the first few seconds. This may be detected by the fact that upon boot-up, you will see CPU utilization momentarily spike and then drop into the single digits. My reboot test measures the time from clicking the reboot button on the shutdown dialog until the system reboots and the CPU utilization (as measured by the task bar widget) spikes and then drops below 10%.
Copy a large file tree to USB – after running the build, the glmark2 project folder tree contains a few hundred files totaling just over 130 megabytes. For this test, I timed the copying of the entire project tree to the root directory of an empty USB 3.0 memory stick. While this seemed at first like a good way to measure the benefit of the Pi 4B’s USB 3.0 ports, the marginal difference I observed in actual testing was probably due to limitations in the USB stick itself. The test was performed at the command line using the “cp -r” command, using a shell script that captured the time before and after the copy.
Copy a large file tree to another folder – This is essentially the same test as above, except that the target directory is the home folder of the pi user, which is of course located on the SD card. One of the reported improvements in the Pi 4B is the change to a double data rate (DDR) interface to the system SD card. In theory this should double the on-card filesystem bandwidth, and my testing supports that conclusion.
ISO playback using VLC player – this is really a subjective rather than a quantitative test. I played the local desktop copy of the ISO concert video for a few minutes, watching for stutters, stalls, dropped frames, etc.
There was definitely a difference between the Pi 3B and the Pi 4B. For starters, the 3B just didn’t “feel” as smooth during playback, suggesting a lower frame rate. When playing back in a window, there were only occasional stutters, usually at a sudden camera or lighting change. When I switched to full screen mode, the stutter at scene change became more consistent and the frame rate appeared to drop even farther. The Pi 4B on the other hand was completely smooth in windowed playback and appeared to be rendering the full 30 frames per second of the test video without any difficulty. Upon switching to full screen, the 4B did stutter occasionally on scene changes, but not as consistently. The perceived frame rate remained consistently higher than the 3B. The overall experience was definitely much more watchable on the 4B.
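The "Reboot to steady state" criterion described in the test list (utilization spikes, then drops below 10%) was judged by eye from the taskbar widget. For anyone who would rather automate it, here is a minimal sketch under stated assumptions: utilization is derived from the aggregate `cpu` line of `/proc/stat` on Linux, with iowait counted as idle time, which may not match how the taskbar widget computes its figure. The function names are hypothetical.

```python
import time
from pathlib import Path
from typing import Iterable, Optional, Tuple


def cpu_times(stat_text: str) -> Tuple[int, int]:
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    fields = stat_text.splitlines()[0].split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line first")
    # user, nice, system, idle, iowait, irq, softirq, steal
    # (guest time is already included in user, so it is not added again)
    values = [int(v) for v in fields[1:9]]
    idle = values[3] + (values[4] if len(values) > 4 else 0)
    return idle, sum(values)


def utilization(before: str, after: str) -> float:
    """Percent CPU busy between two snapshots of /proc/stat."""
    idle0, total0 = cpu_times(before)
    idle1, total1 = cpu_times(after)
    total_delta = total1 - total0
    if total_delta <= 0:
        return 0.0
    return (1 - (idle1 - idle0) / total_delta) * 100


def steady_state_index(samples: Iterable[float],
                       threshold: float = 10.0) -> Optional[int]:
    """Index of the first sample below threshold that follows a spike, else None."""
    spiked = False
    for i, pct in enumerate(samples):
        if pct >= threshold:
            spiked = True
        elif spiked:
            return i
    return None


def sample_utilization(interval_s: float = 1.0) -> float:
    """Live reading on Linux: utilization over one interval."""
    before = Path("/proc/stat").read_text()
    time.sleep(interval_s)
    return utilization(before, Path("/proc/stat").read_text())


# Hypothetical one-second samples after the desktop appears, not measured data.
trace = [3.0, 85.0, 60.0, 12.0, 7.5, 4.0]
print(steady_state_index(trace))
```

The early low reading in `trace` is ignored because no spike has been seen yet, which mirrors the "spikes and then drops" wording of the test definition.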
Here are the numbers for the quantifiable use case tests:
Test Name | Better is | Pi 3B Score | Pi 4B Score | % Difference |
---|---|---|---|---|
Build glmark2 | Lower | 220 | 107 | 51% |
Load chromium | Lower | 11.40 | 7.55 | 34% |
Power on to desktop | Lower | 21.01 | 20.89 | 1% |
Reboot to idle | Lower | 29 | 28.07 | 3% |
Copy to USB | Lower | 107.75 | 102.5 | 5% |
Copy to folder | Lower | 17.4 | 7 | 60% |
Copy from NAS wired | Lower | 469 | 131.67 | 72% |
Copy from NAS WiFi | Lower | 707.67 | 301.67 | 57% |
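The two tree-copy rows were timed with a shell script wrapped around `cp -r`. The same measurement can be sketched in Python, shown below as an approximation rather than the script actually used: `shutil.copytree` stands in for `cp -r`, and a small throwaway tree is generated so the example runs anywhere. The helper names are hypothetical.

```python
import shutil
import tempfile
import time
from pathlib import Path


def timed_copy(src: Path, dst: Path) -> float:
    """Recursively copy src to dst (which must not exist yet); return seconds."""
    start = time.perf_counter()
    shutil.copytree(src, dst)
    return time.perf_counter() - start


def make_sample_tree(root: Path, files: int = 20, size: int = 4096) -> None:
    """Create a small tree of dummy files to stand in for a project folder."""
    for i in range(files):
        sub = root / f"dir{i % 4}"
        sub.mkdir(parents=True, exist_ok=True)
        (sub / f"file{i}.bin").write_bytes(b"\0" * size)


with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "project"
    target = Path(tmp) / "copy"
    make_sample_tree(source)
    elapsed = timed_copy(source, target)
    copied = sum(1 for p in target.rglob("*") if p.is_file())
    print(f"copied {copied} files in {elapsed:.3f} s")
```

One caveat applies to any timing of this kind: the copy can return before the data has actually reached the SD card or USB stick, because the kernel buffers writes. Flushing with `sync` before taking the end time gives a more conservative figure.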
Based on my testing and overall experience working with the Raspberry Pi 4B, I believe we finally have the major upgrade that the Raspberry Pi team has been hinting at for some time. I observed dramatic improvements in performance in nearly every aspect that I was able to measure. The increases in raw compute performance coupled with increased graphics and memory performance should make possible applications that would have previously been considered unsuitable for the Pi. While users have been building media servers using the Pi for quite some time, the upgraded networking and the availability of USB 3.0 for attached storage make such an application far more realistic and achievable. Although some non-demanding games such as Minecraft have long been available on the Pi, the jump in graphics performance alongside these other improvements raises the question of when we might see full-fledged 3D gaming and animation on the Pi.
Even for those working on more prosaic applications, the Pi 4B provides a smoother and faster experience. Running a large build is no longer relegated to lunch break or overnight. Edits, file copies and other routine tasks now operate at a pace that feels more like a desktop machine than a single board computer. While I have always considered the “Pi Top” concept to be somewhat silly given the abundance of cheap laptops, I now believe that a Pi 4 embedded in a Pi Top would give a number of laptops and chromebooks a run for their money at a competitive cost.
Does it sound like I’m impressed? I look forward to putting my Pi 4B to work. The biggest question for me is where to use it first!
I want to thank the Element 14 team for giving me the opportunity to test drive the Raspberry Pi 4. It has been a most rewarding and enlightening experience.