Try out the Raspberry Pi Model 3 B Plus! - Review


RoadTest: Try out the Raspberry Pi Model 3 B Plus!

Author: Nitin_Bhaskar

Creation date:

Evaluation Type: Development Boards & Tools

Did you receive all parts the manufacturer stated would be included in the package?: True

What other parts do you consider comparable to this product?: Raspberry Pi, Raspberry Pi 2B, Dragonboard 410c

What were the biggest problems encountered?: Native compilation performance is bad. SSH disconnects during long runs.

Detailed Review:

Over the years, the Raspberry Pi has proved to be an ideal platform for education and quick prototyping. I was lucky enough to be selected as one of the roadtesters for the new Raspberry Pi 3B+. My review focuses on the improvements over previous versions, benchmarking, and the ease of developing and running ML/AI frameworks such as TensorFlow and the ARM NN SDK.

 

The Rpi 3B+ arrived in the box shown below.

imageimage

 

The box contained a safety instruction sheet with a quick start manual and the Rpi 3B+ (of course!).

image

Roadtest prerequisites:

I used the following for this roadtest:

  • Laptop running Ubuntu
  • 16 GB micro SD card with Raspbian Stretch with desktop (April 2018 version) for the Raspberry Pi
  • Ethernet cable and a 5V/2.5A phone charger/adapter with micro USB cable
  • Android phone with 802.11ac capabilities

 

Below is an image of the Raspberry Pi with the micro SD card used.

image

 

 

 

 

CPU

This Raspberry Pi uses the Broadcom BCM2837B0, a quad-core Cortex-A53 (ARMv8) 64-bit SoC clocked at 1.4GHz. The processor comes in a new package with a heat spreader for better thermal control (see the image below).

image

 

Below is the console output from "/proc/cpuinfo",

processor      : 0
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor      : 1
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor      : 2
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part        : 0xd03
CPU revision    : 4

processor      : 3
model name      : ARMv7 Processor rev 4 (v7l)
BogoMIPS        : 38.40
Features        : half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt vfpd32 lpae evtstrm crc32
CPU implementer : 0x41
CPU architecture: 7
CPU variant    : 0x0
CPU part        : 0xd03
CPU revision    : 4

Hardware        : BCM2835
Revision        : a020d3
Serial          : 00000000379948e6

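As can be seen, the hardware is reported as BCM2835 instead of BCM2837, as was also the case with previous versions of the Rpi. The model name reads ARMv7 because Raspbian runs the Cortex-A53 cores in 32-bit mode.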

Some information on CPU frequency scaling,

pi@raspberrypi:~ $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
600000
pi@raspberrypi:~ $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq
1400000
pi@raspberrypi:~ $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq
600000
pi@raspberrypi:~ $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
600000 1400000
pi@raspberrypi:~ $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq
600000
pi@raspberrypi:~ $ sudo cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
ondemand
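
The same values can be collected in one pass with a short script. This is a minimal sketch, not part of the original test setup: the helper name `read_cpufreq` and its `base` parameter are illustrative assumptions, and reading `cpuinfo_cur_freq` generally needs root, as the `sudo` in the commands above suggests.

```python
from pathlib import Path

# Default sysfs location used in the commands above (cpu0 only).
CPUFREQ_DIR = Path("/sys/devices/system/cpu/cpu0/cpufreq")

FIELDS = (
    "cpuinfo_cur_freq",
    "cpuinfo_max_freq",
    "cpuinfo_min_freq",
    "scaling_available_frequencies",
    "scaling_cur_freq",
    "scaling_governor",
)


def read_cpufreq(base=CPUFREQ_DIR):
    """Return a dict of cpufreq attributes; unreadable files map to None."""
    values = {}
    for name in FIELDS:
        try:
            values[name] = (Path(base) / name).read_text().strip()
        except OSError:
            # Missing file or insufficient permissions (run as root if needed).
            values[name] = None
    return values


if __name__ == "__main__":
    for key, value in read_cpufreq().items():
        print(f"{key}: {value}")
```

Frequencies are reported in kHz, so 1400000 corresponds to 1.4GHz.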

The sysfs values show that only 600MHz and 1400MHz are supported and that the scaling governor is "ondemand". Below is the variation of CPU temperature and frequency logged during "sysbench" profiling, with the frequency dropping back at the end,

 

temp=54.2'C
1400000
temp=53.7'C
1400000
temp=53.7'C
1400000
temp=54.8'C
1400000
temp=54.2'C
1400000
temp=53.7'C
1400000
temp=53.7'C
600000
temp=53.7'C
600000
^C
pi@raspberrypi:~ $

Note that the logged temperature is the die temperature, which reached a maximum of 54.8 degrees Celsius. The CPU frequency changed according to demand and returned to 600MHz once the test was over.
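
A log in the format shown above (a `temp=...'C` line followed by a frequency in kHz) can be summarised with a few lines of Python. This is only a sketch, assuming the log contains nothing but those two kinds of lines plus shell noise; the function name `summarise_log` is illustrative and was not part of the original test.

```python
import re

TEMP_RE = re.compile(r"temp=([0-9.]+)'C")


def summarise_log(text):
    """Return (max_temp_celsius, set_of_frequencies_khz) from the log text."""
    temps = []
    freqs = set()
    for line in text.splitlines():
        line = line.strip()
        match = TEMP_RE.fullmatch(line)
        if match:
            temps.append(float(match.group(1)))
        elif line.isdigit():
            freqs.add(int(line))
        # Anything else (shell prompts, ^C) is ignored.
    return (max(temps) if temps else None), freqs


# A few line pairs copied from the log above.
sample = """temp=54.2'C
1400000
temp=54.8'C
1400000
temp=53.7'C
600000
"""

max_temp, freqs = summarise_log(sample)
print(max_temp, sorted(freqs))
```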

 

I ran a few benchmark tests on the Rpi 3B+, and my findings are below,

Sysbench:

pi@raspberrypi:~ $ sysbench --num-threads=4 --test=cpu --cpu-max-prime=20000 --validate run
sysbench 0.4.12:  multi-threaded system evaluation benchmark
 
Running the test with following options:
Number of threads: 4
Additional request validation enabled.
 
 
Doing CPU performance benchmark
 
Threads started!
Done.

 
Maximum prime number checked in CPU test: 20000
 
 
Test execution summary:
    total time:                          186.6278s
    total number of events:              10000
    total time taken by event execution: 746.3490
    per-request statistics:
         min:                                 73.83ms
         avg:                                 74.63ms
         max:                                 97.98ms
         approx.  95 percentile:              76.14ms
 
Threads fairness:
    events (avg/stddev):           2500.0000/33.25
    execution time (avg/stddev):   186.5872/0.03

 

Below is the result from the Rpi 2B+,

Test execution summary:
    total time:                          431.4357s
    total number of events:              10000
    total time taken by event execution: 1725.2872
    per-request statistics:
         min:                                 76.31ms
         avg:                                172.53ms
         max:                                357.22ms
         approx.  95 percentile:             188.58ms
 
Threads fairness:
    events (avg/stddev):           2500.0000/12.51
    execution time (avg/stddev):   431.3218/0.04

 

Comparison table:

                  Rpi 2B+      Rpi 3B+
Execution time    431.3218s    186.5872s

 

It can clearly be seen that the execution time of the Rpi 3B+ is less than half that of the Rpi 2B+ (about 2.3 times faster).

RAM

The RAM used in this version of the Raspberry Pi is the same as the one used in the previous version of the Rpi 3B+ and in the Rpi 2B+, namely the B8132B4PB-8D-F RAM chip (1GB) from Elpida.

image

Below are some benchmark numbers,

Sysbench:

pi@raspberrypi:~ $ sysbench --test=memory --memory-block-size=1K --memory-total-size=1G --num-threads=1 run
sysbench 0.4.12:  multi-threaded system evaluation benchmark
 
Running the test with following options:
Number of threads: 1
 
Doing memory operations speed test
Memory block size: 1K
 
Memory transfer size: 1024M
 
Memory operations type: write
Memory scope type: global
Threads started!
Done.

 
Operations performed: 1048576 (276646.74 ops/sec)
 
1024.00 MB transferred (270.16 MB/sec)

 
 
Test execution summary:
    total time:                          3.7903s
    total number of events:              1048576
    total time taken by event execution: 3.0114
    per-request statistics:
         min:                                  0.00ms
         avg:                                  0.00ms
         max:                                  3.63ms
         approx.  95 percentile:               0.00ms
 
Threads fairness:
    events (avg/stddev):           1048576.0000/0.00
    execution time (avg/stddev):   3.0114/0.00

 

The output below is from the Rpi 2B+,

Test execution summary:
    total time:                          8.7923s
    total number of events:              1048576
    total time taken by event execution: 6.7600
    per-request statistics:
         min:                                  0.00ms
         avg:                                  0.01ms
         max:                                 20.28ms
         approx.  95 percentile:               0.00ms
 
Threads fairness:
    events (avg/stddev):           1048576.0000/0.00
    execution time (avg/stddev):   6.7600/0.00

 

Comparison table:

                  Rpi 2B+    Rpi 3B+
Execution time    6.76s      3.0114s

 

Memtester:

pi@raspberrypi:~ $ time memtester 128M 1
memtester version 4.3.0 (32-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
 
pagesize is 4096
pagesizemask is 0xfffff000
want 128MB (134217728 bytes)
got  128MB (134217728 bytes), trying mlock ...locked.
Loop 1/1:
  Stuck Address       : ok
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok
  Block Sequential    : ok
  Checkerboard        : ok
  Bit Spread          : ok
  Bit Flip            : ok
  Walking Ones        : ok
  Walking Zeroes      : ok
  8-bit Writes        : ok
  16-bit Writes       : ok
 
Done.

 
real    6m8.503s
user    6m7.887s
sys     0m0.590s

 

The output below is from the Rpi 2B+,

real    20m6.274s
user    13m22.910s
sys     0m1.940s

 

Comparison table:

 

                           Rpi 2B+      Rpi 3B+
Time taken by memtester    20m6.274s    6m8.503s

 

 

 

In both tests, the Rpi 2B+ took more than twice as long as the Rpi 3B+.
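
To put numbers on these comparisons, the ratios can be worked out from the figures reported above. The following is a small sketch of that arithmetic; the timings are copied from the benchmark outputs, while the dictionary layout and the helper `to_seconds` are illustrative choices.

```python
def to_seconds(minutes, seconds):
    """Convert a `time`-style minutes/seconds pair to seconds."""
    return minutes * 60 + seconds


# (Rpi 2B+ time, Rpi 3B+ time) in seconds, copied from the outputs above.
results = {
    "sysbench cpu": (431.3218, 186.5872),
    "sysbench memory": (6.7600, 3.0114),
    "memtester (real)": (to_seconds(20, 6.274), to_seconds(6, 8.503)),
}

# Ratio > 1 means the Rpi 3B+ finished faster.
speedups = {name: old / new for name, (old, new) in results.items()}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.2f}x")
```

This gives roughly 2.3x for the sysbench CPU test, 2.2x for the sysbench memory test and 3.3x for memtester.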

PMIC

There is an onboard power management IC from MaxLinear, part number MxL7704. It is a five-output PMIC optimized for powering low-power microprocessors.

image

USB + Ethernet

This version of the Raspberry Pi uses the LAN7515, which provides both USB hub and Gigabit Ethernet functionality, hence the common section for USB and Ethernet. Although the LAN7515 supports Gigabit Ethernet, it is attached over USB 2.0, so throughput is limited to roughly 300Mbps. The LAN7515 is an upgrade over the LAN9514 used in the previous version of the Rpi.

 

image

My throughput tests cover the following aspects,

  • USB transfer speed
  • Ethernet throughput over UDP and TCP using iperf
  • USB transfer along with Ethernet for TCP Tx and Rx using iperf

 

USB speed test:

Time taken to transfer 0.99GB file: 1m16s

 

Ethernet throughput:

TCP:

Rx - 203Mbps

Tx - 327Mbps

 

UDP:

Rx - 212Mbps

Tx - 265Mbps

 

USB + Ethernet:

TCP Rx - 166Mbps; time taken for USB transfer of the 0.99GB file during simultaneous TCP Rx - 1m32s

TCP Tx - 244Mbps; time taken for USB transfer of the 0.99GB file during simultaneous TCP Tx - 1m22s
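
The USB transfer times can be converted into throughput figures for easier comparison with the Ethernet numbers. This is a rough sketch that assumes "0.99GB" means 0.99 x 10^9 bytes (the review does not say whether decimal or binary units were meant); the function name `throughput` is illustrative.

```python
FILE_SIZE_BYTES = 0.99e9  # assumption: "0.99GB" means 0.99 * 10**9 bytes


def throughput(seconds, size_bytes=FILE_SIZE_BYTES):
    """Return (megabytes per second, megabits per second) for one transfer."""
    bytes_per_second = size_bytes / seconds
    return bytes_per_second / 1e6, bytes_per_second * 8 / 1e6


# Transfer durations in seconds: 1m16s, 1m32s and 1m22s from the results above.
transfers = {
    "USB only": 76,
    "USB during TCP Rx": 92,
    "USB during TCP Tx": 82,
}

for label, seconds in transfers.items():
    mbytes, mbits = throughput(seconds)
    print(f"{label}: {mbytes:.1f} MB/s ({mbits:.0f} Mbps)")
```

Under that assumption the standalone USB transfer works out to about 13 MB/s, dropping to about 11 MB/s and 12 MB/s when Ethernet Rx or Tx traffic runs at the same time.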

WiFi

This revision of the Raspberry Pi received a major upgrade on the WiFi front: the new Rpi 3B+ supports 802.11ac. The wireless solution used here is the CYW43455 from Cypress, which supports single-stream IEEE 802.11b/g/n/ac wireless LAN as well as Bluetooth 4.2/BLE. This is the first Raspberry Pi that supports both the 2.4GHz and 5GHz bands. The CYW43455 supports WPA/WPA2 and WPS.

 

As far as the Linux driver is concerned, the open-source FMAC driver from Cypress is used. Below are the throughput numbers,

Band      TCP Rx      TCP Tx      UDP Rx      UDP Tx
2.4GHz    53.3Mbps    55.2Mbps    66.4Mbps    72.2Mbps
5GHz      92.4Mbps    96.1Mbps    115Mbps     128Mbps

 

For BT/BLE, the drivers (kernel modules) used are,

hci_uart               36864  1
btbcm                  16384  1 hci_uart
serdev                 20480  1 hci_uart
bluetooth             368640  29 hci_uart,bnep,btbcm,rfcomm
ecdh_generic           28672  1 bluetooth

 

Below is the output during BT pairing,

pi@raspberrypi:~ $ bluetoothctl
[NEW] Controller B8:27:EB:33:xx:xx raspberrypi [default]
[bluetooth]# agent KeyboardOnly
Agent registered
[bluetooth]# default-agent
Default agent request successful
[bluetooth]# power on
Changing power on succeeded
[bluetooth]# scan on
Discovery started
[CHG] Controller B8:27:EB:33:xx:xx Discovering: yes
[NEW] Device 04:D1:3A:8D:xx:xx Mi 5
[bluetooth]# pair 04:D1:3A:8D:xx:xx
Attempting to pair with 04:D1:3A:8D:xx:xx
[CHG] Device 04:D1:3A:8D:xx:xx Connected: yes
Request passkey
[agent] Enter passkey (number in 0-999999): 773283
[CHG] Device 04:D1:3A:8D:xx:xx Modalias: bluetooth:v001Dp1200d1436
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001105-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 0000110a-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 0000110c-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 0000110e-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001112-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001115-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001116-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 0000111f-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 0000112d-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 0000112f-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001132-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001200-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001800-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx UUIDs: 00001801-0000-1000-8000-00805f9b34fb
[CHG] Device 04:D1:3A:8D:xx:xx ServicesResolved: yes
[CHG] Device 04:D1:3A:8D:xx:xx Paired: yes
Pairing successful
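
The UUIDs listed during pairing are 16-bit service identifiers embedded in the Bluetooth base UUID (`0000xxxx-0000-1000-8000-00805f9b34fb`). The sketch below extracts the 16-bit value and maps it to a service name. The name table is written from memory of the Bluetooth assigned numbers and should be treated as an assumption to verify against the official list; the function names are illustrative.

```python
# Bluetooth base UUID: 16-bit IDs are embedded as 0000xxxx + this suffix.
BASE_SUFFIX = "-0000-1000-8000-00805f9b34fb"

# Names for the 16-bit values seen in the pairing output above
# (assumed from the Bluetooth assigned numbers; verify before relying on them).
KNOWN = {
    0x1105: "OBEX Object Push",
    0x110A: "Audio Source",
    0x110C: "A/V Remote Control Target",
    0x110E: "A/V Remote Control",
    0x1112: "Headset Audio Gateway",
    0x1115: "PANU",
    0x1116: "NAP",
    0x111F: "Handsfree Audio Gateway",
    0x112D: "SIM Access",
    0x112F: "Phonebook Access Server",
    0x1132: "Message Access Server",
    0x1200: "PnP Information",
    0x1800: "Generic Access",
    0x1801: "Generic Attribute",
}


def short_uuid(uuid):
    """Return the 16-bit value of a base-derived UUID, or None otherwise."""
    uuid = uuid.lower()
    if len(uuid) == 36 and uuid.startswith("0000") and uuid.endswith(BASE_SUFFIX):
        return int(uuid[4:8], 16)
    return None


def describe(uuid):
    """Return a readable service name for a UUID string."""
    value = short_uuid(uuid)
    if value is None:
        return "not a 16-bit base-derived UUID"
    return KNOWN.get(value, f"unknown 0x{value:04x}")


print(describe("0000110a-0000-1000-8000-00805f9b34fb"))
```

Read this way, the phone in the pairing log advertises audio, remote control, networking and phonebook/message access services alongside the generic GATT entries.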

Running Tensorflow

Running TensorFlow on the Rpi 3B+ is very simple. First, download and install TensorFlow using pip,

 

sudo pip install tensorflow

 

Download the sample ImageNet example, which classifies an image.

wget https://raw.githubusercontent.com/tensorflow/models/master/tutorials/image/imagenet/classify_image.py

 

Run the downloaded example. On the first run it downloads the Inception model, so the profiling below was done on the second run.

pi@raspberrypi:~ $ time python classify_image.py
2018-06-06 14:52:55.854013: W tensorflow/core/framework/op_def_util.cc:332] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().
giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89107)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00779)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00296)
custard apple (score = 0.00147)
earthstar (score = 0.00117)


real    0m50.943s
user    0m54.723s
sys     0m3.696s

Running ARM NN SDK

ARM has released its neural network SDK, which is optimized for OpenCL and NEON. Since the Rpi 3B+ has NEON support (the "neon" flag appears in the /proc/cpuinfo features above), I will try running an ARM NN example optimized for NEON.

 

First let's create a folder,

mkdir project
cd project

 

Download the AlexNet model,

wget https://developer.arm.com//-/media/developer/technologies/Machine%20learning%20on%20Arm/Tutorials/Running%20AlexNet%20on%20Pi%20with%20Compute%20Library/compute_library_alexnet.zip

 

Unzip the AlexNet model,

unzip compute_library_alexnet.zip -d assets_alexnet

 

Install scons,

sudo apt-get install scons

 

Clone the Compute Library SDK,

git clone https://github.com/Arm-software/ComputeLibrary.git

 

Build using scons with option "neon=1" to enable NEON optimization

cd ComputeLibrary
scons Werror=1 debug=0 asserts=0 neon=1 opencl=0 examples=1 build=native -j4
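
When scripting the build, it can help to assemble the command as an argument list so that the option dash stays a plain ASCII hyphen and the job count follows the number of cores. This is a minimal sketch; the function name `scons_command` and its parameters are illustrative, and only the scons options shown above are used.

```python
import os
import shlex


def scons_command(jobs=None, neon=True, opencl=False):
    """Assemble the Compute Library build command as an argument list."""
    if jobs is None:
        jobs = os.cpu_count() or 1  # four cores on the Rpi 3B+
    return [
        "scons",
        "Werror=1",
        "debug=0",
        "asserts=0",
        f"neon={int(neon)}",
        f"opencl={int(opencl)}",
        "examples=1",
        "build=native",
        f"-j{jobs}",
    ]


print(shlex.join(scons_command(jobs=4)))
```

The list could then be handed to `subprocess.run(cmd, cwd="ComputeLibrary", check=True)`. Native compilation on the Pi is slow, so expect the build to take a long time.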

 

Running the AlexNet example,

pi@raspberrypi:~/project/ComputeLibrary $ export LD_LIBRARY_PATH=build/
pi@raspberrypi:~/project/ComputeLibrary $ PATH_ASSETS=../assets_alexnet
 
pi@raspberrypi:~/project/ComputeLibrary $ time ./build/examples/graph_alexnet 0 $PATH_ASSETS $PATH_ASSETS/go_kart.ppm $PATH_ASSETS/labels.txt
 
./build/examples/graph_alexnet
 
Usage: ./build/examples/graph_alexnet 0 ../assets_alexnet ../assets_alexnet/go_kart.ppm ../assets_alexnet/labels.txt [fast_math_hint]
 
No fast math info provided: disabling fast math
 
---------- Top 5 predictions ----------
 
0.9736 - [id = 573], n03444034 go-kart
0.0118 - [id = 518], n03127747 crash helmet
0.0108 - [id = 751], n04037443 racer, race car, racing car
0.0022 - [id = 817], n04285008 sports car, sport car
0.0006 - [id = 670], n03791053 motor scooter, scooter

---------- Top 5 predictions ----------
 
0.9736 - [id = 573], n03444034 go-kart
0.0118 - [id = 518], n03127747 crash helmet
0.0108 - [id = 751], n04037443 racer, race car, racing car
0.0022 - [id = 817], n04285008 sports car, sport car
0.0006 - [id = 670], n03791053 motor scooter, scooter

 
Test passed
 
real    0m5.827s
user    0m14.469s
sys     0m1.583s
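
The `time` figures also show how well the example uses the four cores: user plus sys time divided by real time gives the average number of busy cores. Below is a small sketch of that calculation; the function names are illustrative and the regular expression assumes the bash `time` format shown above.

```python
import re

TIME_RE = re.compile(r"^(real|user|sys)\s+(\d+)m([\d.]+)s$", re.MULTILINE)


def parse_time_output(text):
    """Parse bash `time` output into seconds keyed by real/user/sys."""
    return {
        kind: int(minutes) * 60 + float(seconds)
        for kind, minutes, seconds in TIME_RE.findall(text)
    }


def average_busy_cores(times):
    """CPU time (user + sys) divided by wall-clock time."""
    return (times["user"] + times["sys"]) / times["real"]


# Figures copied from the AlexNet run above.
alexnet = parse_time_output("real    0m5.827s\nuser    0m14.469s\nsys     0m1.583s\n")
print(f"{average_busy_cores(alexnet):.2f}")
```

For the AlexNet run this comes to about 2.75 cores on average. The same calculation on the TensorFlow run (50.943s real, 54.723s user, 3.696s sys) gives about 1.15, which suggests that run was mostly single-threaded.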

ZRAM

The performance of the Raspberry Pi can be improved slightly using ZRAM, a compressed swap device kept in RAM.

 

First, download the script from https://wiki.debian.org/ZRam

Let's experiment with the ARM NN example,

Without ZRAM,

pi@raspberrypi:~/project/ComputeLibrary $ export LD_LIBRARY_PATH=build/
pi@raspberrypi:~/project/ComputeLibrary $ PATH_ASSETS=../assets_alexnet
 
pi@raspberrypi:~/project/ComputeLibrary $ time ./build/examples/graph_alexnet 0 $PATH_ASSETS $PATH_ASSETS/go_kart.ppm $PATH_ASSETS/labels.txt
 
./build/examples/graph_alexnet
 
Usage: ./build/examples/graph_alexnet 0 ../assets_alexnet ../assets_alexnet/go_kart.ppm ../assets_alexnet/labels.txt [fast_math_hint]
 
No fast math info provided: disabling fast math
 
---------- Top 5 predictions ----------
 
0.9736 - [id = 573], n03444034 go-kart
0.0118 - [id = 518], n03127747 crash helmet
0.0108 - [id = 751], n04037443 racer, race car, racing car
0.0022 - [id = 817], n04285008 sports car, sport car
0.0006 - [id = 670], n03791053 motor scooter, scooter

---------- Top 5 predictions ----------
 
0.9736 - [id = 573], n03444034 go-kart
0.0118 - [id = 518], n03127747 crash helmet
0.0108 - [id = 751], n04037443 racer, race car, racing car
0.0022 - [id = 817], n04285008 sports car, sport car
0.0006 - [id = 670], n03791053 motor scooter, scooter

 
Test passed
 
real    0m5.827s
user    0m14.469s
sys     0m1.583s

 

Now start ZRAM compression,

pi@raspberrypi:~/project/ComputeLibrary $ sudo /home/pi/zram.sh start
Setting up swapspace version 1, size = 173.9 MiB (182292480 bytes)
no label, UUID=358dc673-1764-4897-9ab7-27e4b19d1a62
Setting up swapspace version 1, size = 173.9 MiB (182292480 bytes)
no label, UUID=596434c2-483e-453b-933b-816d44c4587f
Setting up swapspace version 1, size = 173.9 MiB (182292480 bytes)
no label, UUID=e2cf31ed-d23b-45a3-af5d-fa10a0754220
Setting up swapspace version 1, size = 173.9 MiB (182292480 bytes)
no label, UUID=88836e77-647a-4470-94b2-c7357c8d84d9
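
The script output above reports four swap devices of equal size. The total ZRAM swap can be added up from the byte counts, as in this small sketch (the function name `total_swap_bytes` is illustrative; the line format is taken from the output above).

```python
import re

SWAP_RE = re.compile(r"size = [\d.]+ MiB \((\d+) bytes\)")


def total_swap_bytes(text):
    """Sum the byte counts reported by every 'Setting up swapspace' line."""
    return sum(int(size) for size in SWAP_RE.findall(text))


line = "Setting up swapspace version 1, size = 173.9 MiB (182292480 bytes)\n"
total = total_swap_bytes(line * 4)  # four devices, as in the output above
print(f"{total} bytes = {total / 2**20:.1f} MiB")
```

That is about 695 MiB of compressed swap in total on a board with 1GB of RAM.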

 

Rerun the ARM NN SDK example

pi@raspberrypi:~/project/ComputeLibrary $ time ./build/examples/graph_alexnet 0 $PATH_ASSETS $PATH_ASSETS/go_kart.ppm $PATH_ASSETS/labels.txt
 
./build/examples/graph_alexnet
 
Usage: ./build/examples/graph_alexnet 0 ../assets_alexnet ../assets_alexnet/go_kart.ppm ../assets_alexnet/labels.txt [fast_math_hint]
 
No fast math info provided: disabling fast math
 
---------- Top 5 predictions ----------
 
0.9736 - [id = 573], n03444034 go-kart
0.0118 - [id = 518], n03127747 crash helmet
0.0108 - [id = 751], n04037443 racer, race car, racing car
0.0022 - [id = 817], n04285008 sports car, sport car
0.0006 - [id = 670], n03791053 motor scooter, scooter

---------- Top 5 predictions ----------
 
0.9736 - [id = 573], n03444034 go-kart
0.0118 - [id = 518], n03127747 crash helmet
0.0108 - [id = 751], n04037443 racer, race car, racing car
0.0022 - [id = 817], n04285008 sports car, sport car
0.0006 - [id = 670], n03791053 motor scooter, scooter

 
Test passed
 
real    0m5.836s
user    0m14.186s
sys     0m1.600s

 

Comparison table:

                            Without ZRAM    ZRAM enabled
User mode execution time    14.469s         14.186s

 

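A small improvement of about 0.28s (roughly 2%) in user time is seen, while the real (wall-clock) time is essentially unchanged (5.827s versus 5.836s).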

 

Overall, I found the Rpi 3B+ a neat piece of hardware with pretty stable OS support. I would have liked to see a RAM upgrade, hopefully in the next version of the Rpi.
