Introduction:
The final project is based on AI/Machine Learning within the Core Technology Path. The aim of the project was to accelerate object detection and localisation on the Ultra96-V2. I have some prior experience in FPGA design, but this is my first project applying AI/Machine Learning on an FPGA platform, so I am using it as an opportunity to learn the new application area and, at the same time, share my learnings and experiences so that the wider community can benefit from the effort.
Top-level Block diagram:
The above block diagram shows a simplified view of what I had in mind. I wanted to ingest data either as real-time video from a USB camera, as stored video, or as a sequence of images. The reason for having three input modes was flexibility from a design and debug point of view; my approach was to start from the bottom (sequence of images) and work up to the top (real-time video). The resulting images are processed by the object detection and localisation engine. Ideally, the model has to localise the position of the object we are looking for in the image and identify it. We then generate two bounding boxes: a green bounding box indicating the actual/original location of the object, and a red bounding box indicating the location predicted by the model.
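Overlaying the two boxes on a frame is simple to do by hand. The sketch below is a toy illustration in plain NumPy (the box coordinates and helper name are made up for the example, not taken from the project code): it paints a green ground-truth rectangle and a red predicted rectangle onto an RGB frame.

```python
import numpy as np

def draw_box(img, box, color):
    """Draw a 1-pixel rectangle outline (x0, y0, x1, y1) on an RGB image array."""
    x0, y0, x1, y1 = box
    img[y0, x0:x1 + 1] = color      # top edge
    img[y1, x0:x1 + 1] = color      # bottom edge
    img[y0:y1 + 1, x0] = color      # left edge
    img[y0:y1 + 1, x1] = color      # right edge
    return img

# white background with hypothetical ground-truth and predicted boxes
frame = np.full((120, 160, 3), 255, dtype=np.uint8)
GREEN, RED = (0, 255, 0), (255, 0, 0)
frame = draw_box(frame, (30, 20, 80, 70), GREEN)   # actual location
frame = draw_box(frame, (35, 25, 85, 75), RED)     # model prediction
```

In the real application the same idea would be applied with OpenCV-style drawing calls on each decoded frame.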
Below are the actual test inputs and the model I narrowed down for the project. Please note that the model below is based on TensorFlow. The dataset plays an important role in deciding the properties of the model. For all the example designs, already existing datasets are provided. This is a good starting point, but it doesn't help us understand what happens with the dataset. For object classification problems (like cat-or-dog detection), the dataset can be partitioned into two folders and we can train the model to infer whether a given image is a cat or a dog and provide feedback accordingly. For object detection, the dataset needs to be labelled (for example, to identify a dog in an image amidst a lot of cats, we need to manually mark the pixel boundary around the dog). This complicates design iteration, as the labelling process is laborious and time consuming. I was looking for ways to avoid manual labelling of data and came across the following approach (ML-DL-Algorithms/Object Detection and Localization using Tensorflow.ipynb at main · kaneelgit/ML-DL-Algorithms (github.com)), which is a simple way to overlay an image (the object to be detected) over another image (the background) at a fixed/random location. In this way we can generate many labelled images automatically. For this project, we will be using the famous "Where's Waldo" images for object detection, based on the example design.
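The overlay trick can be sketched in a few lines. This is a simplified stand-in (toy arrays instead of the real Waldo crop and background, and a hypothetical helper name), not the notebook's actual code, but it shows how the bounding-box label falls out for free: since we choose where to paste the object, we already know its ground-truth location.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_sample(background, obj):
    """Paste `obj` onto a copy of `background` at a random location and
    return the composite image plus its (x, y, w, h) bounding-box label."""
    bh, bw, _ = background.shape
    oh, ow, _ = obj.shape
    x = int(rng.integers(0, bw - ow + 1))
    y = int(rng.integers(0, bh - oh + 1))
    img = background.copy()
    img[y:y + oh, x:x + ow] = obj
    return img, (x, y, ow, oh)

# toy stand-ins for a Waldo crop and a scene background
background = np.zeros((128, 128, 3), dtype=np.uint8)
waldo = np.full((16, 16, 3), 200, dtype=np.uint8)
image, bbox = make_sample(background, waldo)
```

Calling `make_sample` in a loop yields an arbitrarily large labelled dataset with no manual annotation.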
I did not try to install TensorFlow or other machine-learning libraries on my system because of space restrictions. Instead, I used Google Colab (Colab is a hosted Jupyter Notebook service that requires no setup to use and provides free access to computing resources, including GPUs and TPUs). I had to modify/update the above design to run in Google Colab (as some of the libraries were outdated) to verify that the model can perform the detection accurately before trying to accelerate it on hardware. The code can be found at the link below.
As training runs across epochs, we can see that the model converges: initially there is a big difference/offset between the original location of the object and the predicted location, but this gradually reduces.
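The convergence behaviour can be mimicked with a tiny toy: a predicted box nudged toward the target on each "epoch" by a gradient-style update, mirroring how the red box closes in on the green one during training. The numbers here are illustrative only, not from the actual run.

```python
import numpy as np

target = np.array([40.0, 30.0, 90.0, 80.0])   # ground-truth box (x0, y0, x1, y1)
pred = np.array([5.0, 5.0, 25.0, 25.0])       # initial (poor) prediction

for epoch in range(10):
    pred += 0.5 * (target - pred)             # step down the L2-loss gradient
    offset = float(np.abs(target - pred).mean())
    print(f"epoch {epoch}: mean corner offset = {offset:.2f}")
```

The mean corner offset shrinks geometrically, which is the qualitative pattern seen in the training plots.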
As we approach epochs 7 to 10, the model's convergence looks as below.
The test results based on the test images are also quite impressive. The difference between the original location (green bounding box) and the location predicted by the model (red bounding box) is very narrow, indicating that the model fits the provided dataset well.
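A standard way to quantify how close the red box is to the green one is intersection-over-union (IoU): the overlap area divided by the combined area, with 1.0 meaning a perfect match. The helper below is a generic illustration, not taken from the project code.

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

print(iou((0, 0, 10, 10), (0, 0, 10, 10)))   # 1.0  (identical boxes)
print(iou((0, 0, 10, 10), (5, 5, 15, 15)))   # 25/175, about 0.143
```

An IoU close to 1.0 on held-out test images is a good numeric summary of the "very narrow" gap seen in the plots.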
Board setup
- I first wanted to verify that the Ultra96-V2 board was capable of running the demo projects. I observed frequent board restarts due to excess power consumption while running the demos, and I had to update the PMIC firmware on the board I received.
The build image includes a PMIC application that can read the current PMIC registers, check whether they differ from the newest version, and program them accordingly. I used this application to update the PMIC firmware and was then able to run the example demos without any issues. Section 14 of the Ultra96-V2 Getting Started Guide v2.0 on the element14 Community provides detailed information on how to check your current version and update to the latest one. - The other issue I had was with the DP-to-HDMI cable. I tried some DP-to-HDMI adapters and could not get the display working; after a lot of probing and reading documentation, I learned that I needed an Active DP-to-HDMI adapter.
- Changing the display resolution was also an annoying problem, but I got around it.
Workflow of a Machine Learning Application
After having my model and board ready, I wanted to see what steps were involved in converting the model into something that can run on the DPU.
Model definition -> training -> trained weights (.hdf5/.pth) -> quantization (calib and test) -> compilation -> .xmodel -> run on target platform
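The quantization stage deserves a closer look. The sketch below is not the Vitis AI quantizer itself, just a toy illustration of what the "calib" step conceptually does: observe representative values, pick a scale from them, then map floats to int8 so the DPU can run fixed-point arithmetic.

```python
import numpy as np

def calibrate_scale(samples):
    """Choose a scale so the largest observed magnitude maps near int8 max."""
    return float(np.max(np.abs(samples))) / 127.0

def quantize_int8(x, scale):
    """Symmetric int8 quantization of a float array."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

calib = np.array([-0.9, 0.3, 0.7, 1.27])   # stand-in for calibration data
scale = calibrate_scale(calib)
q = quantize_int8(calib, scale)            # int8 values the accelerator sees
deq = q.astype(np.float32) * scale         # what the "test" pass compares
```

The "test" step then re-runs inference with the quantized values to confirm accuracy has not degraded; the real flow (vai_q and vai_c) does all of this per-layer with power-of-two scales.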
The first step was to run a simple example design and make sure the above flow works. So I started with the Vitis-AI-Tutorials repository; the project of interest was 09-mnist_pyt, as it is a simple model and based on Python (PyTorch).
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96$ git clone https://github.com/Xilinx/Vitis-AI-Tutorials.git
Cloning into 'Vitis-AI-Tutorials'...
remote: Enumerating objects: 3626, done.
remote: Counting objects: 100% (735/735), done.
remote: Compressing objects: 100% (543/543), done.
remote: Total 3626 (delta 172), reused 696 (delta 155), pack-reused 2891
Receiving objects: 100% (3626/3626), 1.61 GiB | 10.61 MiB/s, done.
Resolving deltas: 100% (1169/1169), done.
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96$ ls
Vitis-AI-Tutorials
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96$ cd Vitis-AI-Tutorials/
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ git branch
* master
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ git checkout 1.4
Checking out files: 100% (1053/1053), done.
Branch '1.4' set up to track remote branch '1.4' from 'origin'.
Switched to a new branch '1.4'
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ git branch
* 1.4
master
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ ls
Design_Tutorials Feature_Tutorials index.rst Introduction README.md
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ cd Design_Tutorials/
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials$ ls
01-caffe_cats_vs_dogs 11-tf2_var_autoenc
02-MNIST_classification_tf 12-Alveo-U250-TF2-Classification
03-using_densenetx 13-vdpu-pre-post-pl-acc
04-Keras_GoogleNet_ResNet 14-caffe-ssd-pascal
05-Keras_FCN8_UNET_segmentation 15-caffe-segmentation-cityscapes
07-yolov4-tutorial 16-profiler_introduction
08-tf2_flow 17-PyTorch-CityScapes-Pruning
09-mnist_pyt 18-mpsocdpu-pre-post-pl-acc
10-RF_modulation_recognition
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials$ cd 09-mnist_pyt/
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials/09-mnist_pyt/files$ ls
application common.py compile.sh docker_run.sh img PROMPT.txt quantize.py run_all.sh setup.sh target.py train.py
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials/09-mnist_pyt/files$ ./docker_run.sh xilinx/vitis-ai-gpu:latest
NOTICE: BY INVOKING THIS SCRIPT AND USING THE SOFTWARE INSTALLED BY THE
SCRIPT, YOU AGREE ON BEHALF OF YOURSELF AND YOUR EMPLOYER (IF APPLICABLE)
TO BE BOUND TO THE LICENSE AGREEMENTS APPLICABLE TO THE SOFTWARE THAT YOU
INSTALL BY RUNNING THE SCRIPT.
Press any key to continue...
BY ELECTING TO CONTINUE, YOU WILL CAUSE THIS SCRIPT FILE TO AUTOMATICALLY
INSTALL A VARIETY OF SOFTWARE COPYRIGHTED
BY XILINX AND THIRD PARTIES THAT IS SUBJECT TO VARIOUS LICENSE AGREEMENTS
THAT APPEAR UPON INSTALLATION, ACCEPTANCE AND/OR ACTIVATION OF THE
SOFTWARE AND/OR ARE CONTAINED OR DESCRIBED IN THE CORRESPONDING RELEASE
NOTES OR OTHER DOCUMENTATION OR HEADER OR SOURCE FILES. XILINX DOES NOT
GRANT TO LICENSEE ANY RIGHTS OR LICENSES TO SUCH THIRD-PARTY SOFTWARE.
LICENSEE AGREES TO CAREFULLY REVIEW AND ABIDE BY THE TERMS AND CONDITIONS
OF SUCH LICENSE AGREEMENTS TO THE EXTENT THAT THEY GOVERN SUCH SOFTWARE.
Press any key to continue...
BY ELECTING TO CONTINUE, YOU WILL CAUSE THE FOLLOWING SOFTWARE TO BE DOWNLOADED
AND INSTALLED ON YOUR SYSTEM. BY ELECTING TO CONTINUE, YOU UNDERSTAND THAT THE
INSTALLATION OF THE SOFTWARE LISTED BELOW MAY ALSO RESULT IN THE INSTALLATION
ON YOUR SYSTEM OF ADDITIONAL SOFTWARE NOT LISTED BELOW IN ORDER TO OPERATE
(SUCH SOFTWARE IS HEREAFTER REFERRED TO AS ‘DEPENDENCIES’)
XILINX DOES NOT GRANT TO LICENSEE ANY RIGHTS OR LICENSES TO SUCH DEPENDENCIES
LICENSEE AGREES TO CAREFULLY REVIEW AND ABIDE BY THE TERMS AND CONDITIONS
OF ANY LICENSE AGREEMENTS TO THE EXTENT THAT THEY GOVERN SUCH DEPENDENCIES
BY ELECTING TO CONTINUE, YOU WILL CAUSE THE FOLLOWING SOFTWARE PACKAGES
(AND THEIR RESPECTIVE DEPENDENCIES, IF APPLICABLE) TO BE DOWNLOADED FROM
UBUNTU'S MAIN REPO AND INSTALLED ON YOUR SYSTEM:
http://us.archive.ubuntu.com/ubuntu/dists/bionic/
Press any key to continue...
1. sudo
2. git
3. zstd
4. tree
5. vim
6. wget
7. bzip2
8. ca-certificates
9. curl
10. unzip
11. python3-minimal
12. python3-opencv
13. python3-venv
14. python3-pip
15. python3-setuptools
16. g++
17. make
18. cmake
19. build-essential
20. autoconf
21. libgoogle-glog-dev
22. libgflags-dev
23. libunwind-dev
24. libtool
25. libgtk2.0-dev
26. libavcodec-dev
27. libavformat-dev
28. libavdevice-dev
BY ELECTING TO CONTINUE, YOU WILL CAUSE THE FOLLOWING SOFTWARE PACKAGES
(AND THEIR RESPECTIVE DEPENDENCIES, IF APPLICABLE) TO BE DOWNLOADED FROM
ANACONDA REPO AND INSTALLED ON YOUR SYSTEM:
https://anaconda.org
Press any key to continue...1. absl-py
2. astor
3. attrs
4. backcall
5. backports
6. backports.weakref
7. blas
8. bleach
9. boost
10. bzip2
11. ca-certificates
12. cairo
13. c-ares
14. certifi
15. cffi
16. chardet
17. cloudpickle
18. conda
19. conda-package-handling
20. cryptography
21. cycler
22. cytoolz
23. dask-core
24. dbus
25. decorator
26. defusedxml
27. dill
28. dpuv1_compiler
29. dpuv1-rt
30. dpuv1-rt-ext
31. dpuv1-rt-neptune
32. entrypoints
33. expat
34. ffmpeg
35. fontconfig
36. freeglut
37. freetype
38. fribidi
39. gast
40. gettext
41. gflags
42. giflib
43. glib
44. glog
45. gmp
46. gnutls
47. google-pasta
48. graphite2
49. graphviz
50. grpcio
51. gst-plugins-base
52. gstreamer
53. h5py
54. harfbuzz
55. hdf5
56. icu
57. idna
58. imageio
59. importlib_metadata
60. importlib-metadata
61. intel-openmp
62. ipykernel
63. ipython
64. ipython_genutils
65. ipywidgets
66. jasper
67. jedi
68. jinja2
69. joblib
70. jpeg
71. json-c
72. jsoncpp
73. jsonschema
74. jupyter
75. jupyter_client
76. jupyter_console
77. jupyter_core
78. keras
79. keras-applications
80. keras-base
81. keras-preprocessing
82. kiwisolver
83. krb5
84. lame
85. ld_impl_linux-64
86. leveldb
87. libblas
88. libboost
89. libcblas
90. libedit
91. libffi
92. _libgcc_mutex
93. libgcc-ng
94. libgfortran-ng
95. libglu
96. libiconv
97. liblapack
98. liblapacke
99. libopenblas
100. libopencv
101. libopus
102. libpng
103. libprotobuf
104. libsodium
105. libssh2
106. libstdcxx-ng
107. libtiff
108. libtool
109. libuuid
110. libvpx
111. libwebp
112. libxcb
113. libxml2
114. lmdb
115. lz4-c
116. markdown
117. markupsafe
118. marshmallow
119. matplotlib
120. matplotlib-base
121. mistune
122. mkl
123. mkl_fft
124. mkl_random
125. mkl-service
126. mock
127. more-itertools
128. nbconvert
129. nbformat
130. ncurses
131. nettle
132. networkx
133. notebook
134. numpy
135. numpy-base
136. olefile
137. openblas
138. opencv
139. openh264
140. openssl
141. opt_einsum
142. packaging
143. pandas
144. pandoc
145. pandocfilters
146. pango
147. parso
148. pexpect
149. pickleshare
150. pillow
151. pip
152. pixman
153. pluggy
154. progressbar2
155. prometheus_client
156. prompt_toolkit
157. prompt-toolkit
158. protobuf
159. ptyprocess
160. py
161. pybind11
162. py-boost
163. pycosat
Press any key to continue...
164. pycparser
165. pydot
166. pygments
167. py-opencv
168. pyopenssl
169. pyparsing
170. pyqt
171. pyrsistent
172. pysocks
173. pytest
174. pytest-runner
175. python
176. python-dateutil
177. python-gflags
178. python-graphviz
179. python-leveldb
180. python-utils
181. pytz
182. pywavelets
183. pyyaml
184. pyzmq
185. qt
186. qtconsole
187. qtpy
188. readline
189. requests
190. ruamel_yaml
191. scikit-image
192. scikit-learn
193. scipy
194. send2trash
195. setuptools
196. sip
197. six
198. snappy
199. sqlite
200. tensorboard
201. tensorflow
202. tensorflow-base
203. tensorflow-estimator
204. termcolor
205. terminado
206. testpath
207. _tflow_select
208. threadpoolctl
209. tk
210. toolz
211. tornado
212. tqdm
213. traitlets
214. urllib3
215. wcwidth
216. webencodings
217. werkzeug
218. wheel
219. widgetsnbextension
220. wrapt
221. x264
222. xcompiler
223. xorg-libice
224. xorg-libsm
225. xorg-libx11
226. xorg-libxext
227. xorg-libxpm
228. xorg-libxrender
229. xorg-libxt
230. xorg-renderproto
231. xorg-xextproto
232. xorg-xproto
233. xz
234. yaml
235. yaml-cpp
236. zeromq
237. zipp
238. zlib
239. zstd
BY ELECTING TO CONTINUE, YOU ACKNOWLEDGE AND AGREE, FOR YOURSELF AND ON BEHALF
OF YOUR EMPLOYER (IF APPLICABLE), THAT XILINX IS NOT DISTRIBUTING TO YOU IN
THIS FILE ANY OF THE AFORMENTIONED SOFTWARE OR DEPENDENCIES, AND THAT YOU ARE
SOLELY RESPONSIBLE FOR THE INSTALLATION OF SUCH SOFTWARE AND DEPENDENCIES ON
YOUR SYSTEM AND FOR CAREFULLY REVIEWING AND ABIDING BY THE TERMS AND CONDITIONS
OF ANY LICENSE AGREEMENTS TO THE EXTENT THAT THEY GOVERN SUCH SOFTWARE AND DEPENDENCIES
Press any key to continue...
Do you agree to the terms and wish to proceed [y/n]? y
Unable to find image 'xilinx/vitis-ai-gpu:latest' locally
docker: Error response from daemon: pull access denied for xilinx/vitis-ai-gpu, repository does not exist or may require 'docker login': denied: requested access to the resource is denied.
See 'docker run --help'.
I couldn't succeed in getting it to work. The issue was that the GPU version of the Vitis AI Docker image is not provided on Docker Hub because of GPU licensing requirements, so I had to try an alternate approach (which required a lot more disk space, ~20 GB).
After that, I tried the Vitis-AI repo v2.0, as that release was best supported for the Ultra96-V2.
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ cd .. && git clone -b v2.0 https://github.com/Xilinx/Vitis-AI
Cloning into 'Vitis-AI'...
remote: Enumerating objects: 90613, done.
remote: Counting objects: 100% (9647/9647), done.
remote: Compressing objects: 100% (3747/3747), done.
remote: Total 90613 (delta 5321), reused 9166 (delta 5248), pack-reused 80966
Receiving objects: 100% (90613/90613), 2.11 GiB | 9.70 MiB/s, done.
Resolving deltas: 100% (44980/44980), done.
Note: checking out 'd02dcb6041663dbc7ecbc0c6af9fafa087a789de'.
You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.
If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:
git checkout -b <new-branch-name>
Checking out files: 100% (35569/35569), done.
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96$ cd Vitis-AI-Tutorials/
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ ls
Design_Tutorials Feature_Tutorials index.rst Introduction README.md
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials$ cd Design_Tutorials/
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials$ ls
01-caffe_cats_vs_dogs 07-yolov4-tutorial 12-Alveo-U250-TF2-Classification 17-PyTorch-CityScapes-Pruning
02-MNIST_classification_tf 08-tf2_flow 13-vdpu-pre-post-pl-acc 18-mpsocdpu-pre-post-pl-acc
03-using_densenetx 09-mnist_pyt 14-caffe-ssd-pascal
04-Keras_GoogleNet_ResNet 10-RF_modulation_recognition 15-caffe-segmentation-cityscapes
05-Keras_FCN8_UNET_segmentation 11-tf2_var_autoenc 16-profiler_introduction
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials$ cd 09-mnist_pyt/
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials/09-mnist_pyt$ cd files/
user@user-Precision-7920-Tower:/media/user/095D522E4363D493/ultra96/Vitis-AI-Tutorials/Design_Tutorials/09-mnist_pyt/files$ ./docker_run.sh xilinx/vitis-ai:latest
[...the script prints the same license notice and Ubuntu/Anaconda package lists as in the previous docker_run.sh invocation; omitted here for brevity...]
Do you agree to the terms and wish to proceed [y/n]? y
Setting up user 's environment in the Docker container...
usermod: no changes
Running as vitis-ai-user with ID 0 and group 0
After some initial effort, I succeeded in getting the docker image to work.
==========================================
__ ___ _ _ _____
\ \ / (_) | (_) /\ |_ _|
\ \ / / _| |_ _ ___ ______ / \ | |
\ \/ / | | __| / __|______/ /\ \ | |
\ / | | |_| \__ \ / ____ \ _| |_
\/ |_|\__|_|___/ /_/ \_\_____|
==========================================
Docker Image Version: 2.5.0.1260 (CPU)
Vitis AI Git Hash: 502703c
Build Date: 2022-06-12
For TensorFlow 1.15 Workflows do:
conda activate vitis-ai-tensorflow
For PyTorch Workflows do:
conda activate vitis-ai-pytorch
For TensorFlow 2.8 Workflows do:
conda activate vitis-ai-tensorflow2
For WeGo Tensorflow 1.15 Workflows do:
conda activate vitis-ai-wego-tf1
For WeGo Tensorflow 2.8 Workflows do:
conda activate vitis-ai-wego-tf2
For WeGo Torch Workflows do:
conda activate vitis-ai-wego-torch
Vitis-AI /workspace > conda activate vitis-ai-pytorch
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py compile.sh docker_run.sh img PROMPT.txt quantize.py run_all.sh setup.sh target.py train.py
(vitis-ai-pytorch) Vitis-AI /workspace > source ./run_all.sh
-----------------------------------------
PyTorch version : 1.10.1
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
-----------------------------------------
Command line options:
--build_dir : ./build
--batchsize : 100
--learnrate : 0.001
--epochs : 3
-----------------------------------------
No CUDA devices available..selecting CPU
Downloading http://yann.lecun.com/exdb/mnist/https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found
Traceback (most recent call last):
File "train.py", line 131, in <module>
run_main()
File "train.py", line 124, in run_main
train_test(args.build_dir, args.batchsize, args.learnrate, args.epochs)
File "train.py", line 72, in train_test
transform=train_transform)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torchvision/datasets/mnist.py", line 87, in __init__
self.download()
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torchvision/datasets/mnist.py", line 190, in download
raise RuntimeError("Error downloading {}".format(filename))
RuntimeError: Error downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[VAIQ_NOTE]: Loading NNDCT kernels...
-----------------------------------------
PyTorch version : 1.10.1
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
-----------------------------------------
Command line options:
--build_dir : ./build
--quant_mode : calib
--batchsize : 100
-----------------------------------------
No CUDA devices available..selecting CPU
Traceback (most recent call last):
File "quantize.py", line 125, in <module>
run_main()
File "quantize.py", line 118, in run_main
quantize(args.build_dir,args.quant_mode,args.batchsize)
File "quantize.py", line 61, in quantize
model.load_state_dict(torch.load(os.path.join(float_model,'f_model.pth')))
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
with _open_file_like(f, 'rb') as opened_file:
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './build/float_model/f_model.pth'
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[VAIQ_NOTE]: Loading NNDCT kernels...
-----------------------------------------
PyTorch version : 1.10.1
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
-----------------------------------------
Command line options:
--build_dir : ./build
--quant_mode : test
--batchsize : 100
-----------------------------------------
No CUDA devices available..selecting CPU
Traceback (most recent call last):
File "quantize.py", line 125, in <module>
run_main()
File "quantize.py", line 118, in run_main
quantize(args.build_dir,args.quant_mode,args.batchsize)
File "quantize.py", line 61, in quantize
model.load_state_dict(torch.load(os.path.join(float_model,'f_model.pth')))
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/serialization.py", line 594, in load
with _open_file_like(f, 'rb') as opened_file:
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/serialization.py", line 230, in _open_file_like
return _open_file(name_or_buffer, mode)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torch/serialization.py", line 211, in __init__
super(_open_file, self).__init__(open(name, mode))
FileNotFoundError: [Errno 2] No such file or directory: './build/float_model/f_model.pth'
-----------------------------------------
COMPILING MODEL FOR ZCU102..
-----------------------------------------
[UNILOG][FATAL][XCOM_FILE_NOT_EXISTS][The file is not exists] "/workspace/./build/quant_model/CNN_int.xmodel" doesn't exist.
*** Check failure stack trace: ***
This program has crashed!
Aborted (core dumped)
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
-----------------------------------------
MODEL COMPILED
-----------------------------------------
-----------------------------------------
COMPILING MODEL FOR ZCU104..
-----------------------------------------
[UNILOG][FATAL][XCOM_FILE_NOT_EXISTS][The file is not exists] "/workspace/./build/quant_model/CNN_int.xmodel" doesn't exist.
*** Check failure stack trace: ***
This program has crashed!
Aborted (core dumped)
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
-----------------------------------------
MODEL COMPILED
-----------------------------------------
-----------------------------------------
COMPILING MODEL FOR ALVEO U50..
-----------------------------------------
[UNILOG][FATAL][XCOM_FILE_NOT_EXISTS][The file is not exists] "/workspace/./build/quant_model/CNN_int.xmodel" doesn't exist.
*** Check failure stack trace: ***
This program has crashed!
Aborted (core dumped)
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
-----------------------------------------
MODEL COMPILED
-----------------------------------------
-----------------------------------------
COMPILING MODEL FOR VCK190..
-----------------------------------------
[UNILOG][FATAL][XCOM_FILE_NOT_EXISTS][The file is not exists] "/workspace/./build/quant_model/CNN_int.xmodel" doesn't exist.
*** Check failure stack trace: ***
This program has crashed!
Aborted (core dumped)
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
-----------------------------------------
MODEL COMPILED
-----------------------------------------
------------------------------------
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
------------------------------------
Command line options:
--build_dir : ./build
--target : zcu102
--num_images : 10000
--app_dir : application
------------------------------------
Copying application code from application ...
Copying compiled model from ./build/compiled_model/CNN_zcu102.xmodel ...
Traceback (most recent call last):
File "target.py", line 121, in <module>
main()
File "target.py", line 117, in main
make_target(args.build_dir, args.target, args.num_images, args.app_dir)
File "target.py", line 83, in make_target
shutil.copy(model_path, target_dir)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 248, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: './build/compiled_model/CNN_zcu102.xmodel'
------------------------------------
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
------------------------------------
Command line options:
--build_dir : ./build
--target : zcu104
--num_images : 10000
--app_dir : application
------------------------------------
Copying application code from application ...
Copying compiled model from ./build/compiled_model/CNN_zcu104.xmodel ...
Traceback (most recent call last):
File "target.py", line 121, in <module>
main()
File "target.py", line 117, in main
make_target(args.build_dir, args.target, args.num_images, args.app_dir)
File "target.py", line 83, in make_target
shutil.copy(model_path, target_dir)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 248, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: './build/compiled_model/CNN_zcu104.xmodel'
------------------------------------
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
------------------------------------
Command line options:
--build_dir : ./build
--target : vck190
--num_images : 10000
--app_dir : application
------------------------------------
Copying application code from application ...
Copying compiled model from ./build/compiled_model/CNN_vck190.xmodel ...
Traceback (most recent call last):
File "target.py", line 121, in <module>
main()
File "target.py", line 117, in main
make_target(args.build_dir, args.target, args.num_images, args.app_dir)
File "target.py", line 83, in make_target
shutil.copy(model_path, target_dir)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 248, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: './build/compiled_model/CNN_vck190.xmodel'
------------------------------------
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
------------------------------------
Command line options:
--build_dir : ./build
--target : u50
--num_images : 10000
--app_dir : application
------------------------------------
Copying application code from application ...
Copying compiled model from ./build/compiled_model/CNN_u50.xmodel ...
Traceback (most recent call last):
File "target.py", line 121, in <module>
main()
File "target.py", line 117, in main
make_target(args.build_dir, args.target, args.num_images, args.app_dir)
File "target.py", line 83, in make_target
shutil.copy(model_path, target_dir)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 248, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/shutil.py", line 120, in copyfile
with open(src, 'rb') as fsrc:
FileNotFoundError: [Errno 2] No such file or directory: './build/compiled_model/CNN_u50.xmodel'
I tried to debug further to find out why the scripts were failing, and re-ran the training step on its own. It turned out there were issues with the download hyperlinks in the scripts, so I had to download the dataset files individually, extract them, and then train the model.
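Before the transcript, it is worth noting what the broken hyperlinks actually do. The doubled URL in the log below (mirror prefix glued onto a full URL) is consistent with torchvision building each download URL by concatenating a mirror prefix with a resource entry, so a resources entry patched to a full URL produces a malformed address. This is my assumption about the cause, sketched minimally:

```python
# Sketch of the suspected cause of the doubled download URL (assumption:
# torchvision joins each mirror prefix with each resource entry).
mirrors = [
    "http://yann.lecun.com/exdb/mnist/",
    "https://ossci-datasets.s3.amazonaws.com/mnist/",
]

filename = "train-images-idx3-ubyte.gz"  # what the join expects: a bare filename
full_url = "https://ossci-datasets.s3.amazonaws.com/mnist/" + filename  # a bad edit

good = [m + filename for m in mirrors]  # well-formed URLs
bad = [m + full_url for m in mirrors]   # mirror prefix + full URL = doubled URL

print(bad[0])
# → http://yann.lecun.com/exdb/mnist/https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
```

The fix is therefore to keep only bare filenames in the script's resource list (or, as I did, fetch the four `.gz` files manually with wget).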
(vitis-ai-pytorch) Vitis-AI /workspace > vi run_all.sh
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > pwd
/workspace
(vitis-ai-pytorch) Vitis-AI /workspace > export BUILD=./build
(vitis-ai-pytorch) Vitis-AI /workspace > export LOG=${BUILD}/logs
(vitis-ai-pytorch) Vitis-AI /workspace > mkdir -p ${LOG}
(vitis-ai-pytorch) Vitis-AI /workspace > python -u train.py -d ${BUILD} 2>&1 | tee ${LOG}/train.log
-----------------------------------------
PyTorch version : 1.10.1
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
-----------------------------------------
Command line options:
--build_dir : ./build
--batchsize : 100
--learnrate : 0.001
--epochs : 3
-----------------------------------------
No CUDA devices available..selecting CPU
Downloading http://yann.lecun.com/exdb/mnist/https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found
Downloading https://ossci-datasets.s3.amazonaws.com/mnist/https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Failed to download (trying next):
HTTP Error 404: Not Found
Traceback (most recent call last):
File "train.py", line 131, in <module>
run_main()
File "train.py", line 124, in run_main
train_test(args.build_dir, args.batchsize, args.learnrate, args.epochs)
File "train.py", line 72, in train_test
transform=train_transform)
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torchvision/datasets/mnist.py", line 87, in __init__
self.download()
File "/opt/vitis_ai/conda/envs/vitis-ai-pytorch/lib/python3.7/site-packages/torchvision/datasets/mnist.py", line 190, in download
raise RuntimeError("Error downloading {}".format(filename))
RuntimeError: Error downloading https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
(vitis-ai-pytorch) Vitis-AI /workspace > vim train.py
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > cd build/
(vitis-ai-pytorch) Vitis-AI /workspace/build > ls
dataset logs target_u50 target_vck190 target_zcu102 target_zcu104
(vitis-ai-pytorch) Vitis-AI /workspace/build > cd dataset/
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > ls
MNIST
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > cd MNIST/
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset/MNIST > ls
raw
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset/MNIST > cd raw/
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset/MNIST/raw > ls
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset/MNIST/raw > cd ../
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset/MNIST > ls
raw
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset/MNIST > cd ../
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > rm -rf MNIST/
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > wget https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
--2023-08-26 06:16:58-- https://ossci-datasets.s3.amazonaws.com/mnist/train-images-idx3-ubyte.gz
Resolving ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)... 52.217.49.220, 52.217.126.225, 52.217.128.161, ...
Connecting to ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)|52.217.49.220|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9912422 (9.5M) [application/x-gzip]
Saving to: ‘train-images-idx3-ubyte.gz’
train-images-idx3-ubyte.gz 100%[================================================================>] 9.45M 7.25MB/s in 1.3s
2023-08-26 06:16:59 (7.25 MB/s) - ‘train-images-idx3-ubyte.gz’ saved [9912422/9912422]
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > ls
train-images-idx3-ubyte.gz
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > cd ../
(vitis-ai-pytorch) Vitis-AI /workspace/build > ls
dataset logs target_u50 target_vck190 target_zcu102 target_zcu104
(vitis-ai-pytorch) Vitis-AI /workspace/build > cd ../
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > head train.py
'''
Copyright 2020 Xilinx Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
(vitis-ai-pytorch) Vitis-AI /workspace > vi train.py
(vitis-ai-pytorch) Vitis-AI /workspace > cd build/
(vitis-ai-pytorch) Vitis-AI /workspace/build > ls
dataset logs target_u50 target_vck190 target_zcu102 target_zcu104
(vitis-ai-pytorch) Vitis-AI /workspace/build > cd dataset/
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > ls
train-images-idx3-ubyte.gz
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > wget https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
--2023-08-26 06:21:31-- https://ossci-datasets.s3.amazonaws.com/mnist/train-labels-idx1-ubyte.gz
Resolving ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)... 52.217.132.65, 52.217.44.180, 16.182.42.57, ...
Connecting to ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)|52.217.132.65|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 28881 (28K) [application/x-gzip]
Saving to: ‘train-labels-idx1-ubyte.gz’
train-labels-idx1-ubyte.gz 100%[================================================================>] 28.20K --.-KB/s in 0.09s
2023-08-26 06:21:31 (323 KB/s) - ‘train-labels-idx1-ubyte.gz’ saved [28881/28881]
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > wget https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
--2023-08-26 06:22:24-- https://ossci-datasets.s3.amazonaws.com/mnist/t10k-images-idx3-ubyte.gz
Resolving ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)... 52.216.43.105, 3.5.28.118, 52.217.108.84, ...
Connecting to ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)|52.216.43.105|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1648877 (1.6M) [application/x-gzip]
Saving to: ‘t10k-images-idx3-ubyte.gz’
t10k-images-idx3-ubyte.gz 100%[================================================================>] 1.57M 2.76MB/s in 0.6s
2023-08-26 06:22:25 (2.76 MB/s) - ‘t10k-images-idx3-ubyte.gz’ saved [1648877/1648877]
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > wget https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
--2023-08-26 06:22:46-- https://ossci-datasets.s3.amazonaws.com/mnist/t10k-labels-idx1-ubyte.gz
Resolving ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)... 52.216.239.187, 52.216.205.51, 16.182.72.41, ...
Connecting to ossci-datasets.s3.amazonaws.com (ossci-datasets.s3.amazonaws.com)|52.216.239.187|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4542 (4.4K) [application/x-gzip]
Saving to: ‘t10k-labels-idx1-ubyte.gz’
t10k-labels-idx1-ubyte.gz 100%[================================================================>] 4.44K --.-KB/s in 0s
2023-08-26 06:22:47 (12.2 MB/s) - ‘t10k-labels-idx1-ubyte.gz’ saved [4542/4542]
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > ls
t10k-images-idx3-ubyte.gz t10k-labels-idx1-ubyte.gz train-images-idx3-ubyte.gz train-labels-idx1-ubyte.gz
(vitis-ai-pytorch) Vitis-AI /workspace/build/dataset > cd ../
(vitis-ai-pytorch) Vitis-AI /workspace/build > ls
dataset logs target_u50 target_vck190 target_zcu102 target_zcu104
(vitis-ai-pytorch) Vitis-AI /workspace/build > cd ../
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > echo $BUILD
./build
Training:
(vitis-ai-pytorch) Vitis-AI /workspace > python -u train.py -d ${BUILD} 2>&1 | tee ${LOG}/train.log
-----------------------------------------
PyTorch version : 1.10.1
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
-----------------------------------------
Command line options:
--build_dir : ./build
--batchsize : 100
--learnrate : 0.001
--epochs : 3
-----------------------------------------
No CUDA devices available..selecting CPU
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw/train-images-idx3-ubyte.gz
9913344it [00:00, 11805072.48it/s]
Extracting ./build/dataset/MNIST/raw/train-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw/train-labels-idx1-ubyte.gz
29696it [00:00, 15193224.15it/s]
Extracting ./build/dataset/MNIST/raw/train-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw/t10k-images-idx3-ubyte.gz
1649664it [00:00, 11593782.05it/s]
Extracting ./build/dataset/MNIST/raw/t10k-images-idx3-ubyte.gz to ./build/dataset/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz
5120it [00:00, 8294645.22it/s]
Extracting ./build/dataset/MNIST/raw/t10k-labels-idx1-ubyte.gz to ./build/dataset/MNIST/raw
Epoch 1
Test set: Accuracy: 9838/10000 (98.38%)
Epoch 2
Test set: Accuracy: 9868/10000 (98.68%)
Epoch 3
Test set: Accuracy: 9892/10000 (98.92%)
Trained model written to ./build/float_model/f_model.pth
Quantization (calibration and testing):
The Vitis-AI quantizer converts floating-point models into fixed-point models that require less memory bandwidth, providing faster speed and higher computing efficiency. This is achieved in two steps: calibration and testing.
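To make the two-step idea concrete, here is a toy sketch (this is NOT the Vitis-AI NNDCT API; the function names and the power-of-two scaling scheme are my own illustration): calibration picks a fixed-point scale from sample data, then the "test" pass re-runs inference with every value rounded to that scale to check the accuracy drop.

```python
# Toy illustration of calibrate-then-test quantization (not the Vitis-AI API).

def calibrate(samples, bits=8):
    """Smallest power-of-two scale so max(|sample|) fits in `bits` signed bits."""
    max_abs = max(abs(s) for s in samples)
    scale = 2.0 ** -16
    while max_abs / scale > 2 ** (bits - 1) - 1:
        scale *= 2.0
    return scale

def quantize(x, scale, bits=8):
    """Round x to the fixed-point grid and saturate to the signed int range."""
    q = round(x / scale)
    lo, hi = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return max(lo, min(hi, q)) * scale

acts = [0.12, -0.5, 0.83, -0.96]                 # pretend calibration batch
scale = calibrate(acts)                          # step 1: calibration
quantized = [quantize(a, scale) for a in acts]   # step 2: test with quantized values
print(scale, quantized)
```

In the real flow the quantizer does this per-tensor across the whole network, which is why the calib run below exports a `quant_info.json` config and the test run then reports the quantized accuracy (98.86% vs the float model's 98.92%).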
(vitis-ai-pytorch) Vitis-AI /workspace > python -u quantize.py -d ${BUILD} --quant_mode calib 2>&1 | tee ${LOG}/quant_calib.log
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[VAIQ_NOTE]: Loading NNDCT kernels...
-----------------------------------------
PyTorch version : 1.10.1
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
-----------------------------------------
Command line options:
--build_dir : ./build
--quant_mode : calib
--batchsize : 100
-----------------------------------------
No CUDA devices available..selecting CPU
[VAIQ_WARN]: CUDA is not available, change device to CPU
[VAIQ_NOTE]: Quant config file is empty, use default quant configuration
[VAIQ_NOTE]: Quantization calibration process start up...
[VAIQ_NOTE]: =>Quant Module is in 'cpu'.
[VAIQ_NOTE]: =>Parsing CNN...
[VAIQ_NOTE]: Start to trace model...
[VAIQ_NOTE]: Finish tracing.
[VAIQ_NOTE]: Processing ops...
Processing ops: 100%|██████████| 14/14 [00:00<00:00, 2169.68it/s, OpInfo: name = return_0, type = Return]
[VAIQ_NOTE]: =>Doing weights equalization...
[VAIQ_NOTE]: =>Quantizable module is generated.(./build/quant_model/CNN.py)
[VAIQ_NOTE]: =>Get module with quantization.
Test set: Accuracy: 9886/10000 (98.86%)
[VAIQ_NOTE]: =>Exporting quant config.(./build/quant_model/quant_info.json)
(vitis-ai-pytorch) Vitis-AI /workspace > python -u quantize.py -d ${BUILD} --quant_mode test 2>&1 | tee ${LOG}/quant_test.log
No CUDA runtime is found, using CUDA_HOME='/usr/local/cuda'
[VAIQ_NOTE]: Loading NNDCT kernels...
-----------------------------------------
PyTorch version : 1.10.1
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
-----------------------------------------
Command line options:
--build_dir : ./build
--quant_mode : test
--batchsize : 100
-----------------------------------------
No CUDA devices available..selecting CPU
[VAIQ_WARN]: CUDA is not available, change device to CPU
[VAIQ_NOTE]: Quant config file is empty, use default quant configuration
[VAIQ_NOTE]: Quantization test process start up...
[VAIQ_NOTE]: =>Quant Module is in 'cpu'.
[VAIQ_NOTE]: =>Parsing CNN...
[VAIQ_NOTE]: Start to trace model...
[VAIQ_NOTE]: Finish tracing.
[VAIQ_NOTE]: Processing ops...
Processing ops: 100%|██████████| 14/14 [00:00<00:00, 2187.71it/s, OpInfo: name = return_0, type = Return]
[VAIQ_NOTE]: =>Doing weights equalization...
[VAIQ_NOTE]: =>Quantizable module is generated.(./build/quant_model/CNN.py)
[VAIQ_NOTE]: =>Get module with quantization.
Test set: Accuracy: 9885/10000 (98.85%)
[VAIQ_NOTE]: =>Converting to xmodel ...
[VAIQ_NOTE]: =>Successfully convert 'CNN' to xmodel.(./build/quant_model/CNN_int.xmodel)
Compiling:
The Vitis-AI compiler maps the AI model to a highly efficient instruction set and dataflow graph. It also performs sophisticated optimizations to reuse on-chip memory and resources as much as possible.
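The `compile.sh` script being inspected and edited in the transcript below presumably wraps AMD's `vai_c_xir` compiler driver. A typical invocation, with paths taken from this project's build tree (flags assumed from the standard Vitis-AI CLI, so treat this as a sketch rather than the exact script contents), looks roughly like:

```shell
# Sketch only: flags assumed from the vai_c_xir CLI; adjust paths to your tree.
vai_c_xir \
    --xmodel     ./build/quant_model/CNN_int.xmodel \
    --arch       /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json \
    --net_name   CNN_zcu102 \
    --output_dir ./build/compiled_model
```

The `--arch` JSON selects the DPU target, which is exactly why the earlier U50/VCK190 runs crashed: they were invoked before `CNN_int.xmodel` existed, so there was nothing to compile.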
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > vim com
common.py compile.sh
(vitis-ai-pytorch) Vitis-AI /workspace > vim compile.sh
(vitis-ai-pytorch) Vitis-AI /workspace > vim /opt/vitis_ai/compiler/arch/DPUCAHX8H/U50/arch.json
(vitis-ai-pytorch) Vitis-AI /workspace > #cd /opt/vitis_ai/compiler/arch/DPUCAHX8H
(vitis-ai-pytorch) Vitis-AI /workspace > pushd .
/workspace /workspace
(vitis-ai-pytorch) Vitis-AI /workspace > cd /opt/vitis_ai/compiler/arch/DPUCAHX8H
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCAHX8H > ls
U280 U280-DWC U50 U50LV U50LV-DWC U55C-DWC
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCAHX8H > tree -L 2
.
├── U280
│ └── arch.json
├── U280-DWC
│ └── arch.json
├── U50
│ └── arch.json
├── U50LV
│ └── arch.json
├── U50LV-DWC
│ └── arch.json
└── U55C-DWC
└── arch.json
6 directories, 6 files
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCAHX8H > cd ../
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch > ls
DPUCADF8H DPUCAHX8H DPUCAHX8L DPUCVDX8G DPUCVDX8H DPUCZDX8G
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch > cd DPUCZDX8G/
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G > ls
KV260 ZCU102 ZCU104
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G > cd ../
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch > ls
DPUCADF8H DPUCAHX8H DPUCAHX8L DPUCVDX8G DPUCVDX8H DPUCZDX8G
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch > cd ../
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler > ls
arch
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler > cd ../
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai > ls
compiler conda scripts
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai > cd conda/
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/conda > ls
bin condabin envs include libexec man sbin shell x86_64-conda_cos6-linux-gnu
compiler_compat conda-meta etc lib LICENSE.txt pkgs share ssl x86_64-conda-linux-gnu
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/conda > cd ../
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai > ls
compiler conda scripts
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai > cd compiler/arch/
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch > ls
DPUCADF8H DPUCAHX8H DPUCAHX8L DPUCVDX8G DPUCVDX8H DPUCZDX8G
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch > cd ../
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler > ls
arch
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler > popd
/workspace
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > source compile.sh zcu102 ${BUILD} ${LOG}
-----------------------------------------
COMPILING MODEL FOR ZCU102..
-----------------------------------------
[UNILOG][INFO] Compile mode: dpu
[UNILOG][INFO] Debug mode: function
[UNILOG][INFO] Target architecture: DPUCZDX8G_ISA1_B4096
[UNILOG][INFO] Graph name: CNN, with op num: 31
[UNILOG][INFO] Begin to compile...
[UNILOG][INFO] Total device subgraph number 3, DPU subgraph number 1
[UNILOG][INFO] Compile done.
[UNILOG][INFO] The meta json is saved to "/workspace/./build/compiled_model/meta.json"
[UNILOG][INFO] The compiled xmodel is saved to "/workspace/./build/compiled_model/CNN_zcu102.xmodel"
[UNILOG][INFO] The compiled xmodel's md5sum is b9f201ffff4a5fb3d9b3a3a32e915fa9, and has been saved to "/workspace/./build/compiled_model/md5sum.txt"
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
-----------------------------------------
MODEL COMPILED
-----------------------------------------
(vitis-ai-pytorch) Vitis-AI /workspace > ls build/compiled_model/
CNN_zcu102.xmodel md5sum.txt meta.json
Compiling for a custom target based on the Ultra96-V2:
When trying to run the custom model on the Ultra96-V2 platform, I had issues generating the xmodel for the target Zynq UltraScale+ MPSoC device because of a mismatched DPU fingerprint. I have been investigating this issue, which has so far prevented me from running the example and my object detection and localisation model on the DPU for acceleration. The official Vitis-AI examples have no built-in support for the Ultra96-V2 board, and the Avnet repositories follow their own custom scripts to target its DPU, so an issue like the one observed below is hard to debug when starting from the AMD Vitis-AI tutorials, as I did. It would be good if a common/standard approach were adopted across both repositories at some point; the alternative is to spend more time understanding both projects, which might lead to a fix.
(vitis-ai-pytorch) Vitis-AI /workspace > vi build/compiled_model/meta.json
(vitis-ai-pytorch) Vitis-AI /workspace > cd /opt/vitis_ai/
compiler/ conda/ scripts/
(vitis-ai-pytorch) Vitis-AI /workspace > cd /opt/vitis_ai/compiler/arch/DPUC
DPUCADF8H/ DPUCAHX8H/ DPUCAHX8L/ DPUCVDX8G/ DPUCVDX8H/ DPUCZDX8G/
(vitis-ai-pytorch) Vitis-AI /workspace > cd /opt/vitis_ai/compiler/arch/DPUCZDX8G/
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G > ls
KV260 ZCU102 ZCU104
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G > mkdir ULTRA96V2
mkdir: cannot create directory ‘ULTRA96V2’: Permission denied
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G > sudo mkdir ULTRA96V2
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G > ls
KV260 ULTRA96V2 ZCU102 ZCU104
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G > cd ULTRA96V2/
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2 > ls
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2 > cp ../
KV260/ ULTRA96V2/ ZCU102/ ZCU104/
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2 > cp ../KV260/arch.json .
cp: cannot create regular file './arch.json': Permission denied
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2 > sudo cp ../KV260/arch.json .
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2 > ls
arch.json
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2 > vi arch.json
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2 > cd ../../../../
(vitis-ai-pytorch) Vitis-AI /opt/vitis_ai > cd
(vitis-ai-pytorch) Vitis-AI ~ > cd /workspace/
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > python -u target.py --target zcu102 -d ${BUILD} 2>&1 | tee ${LOG}/target_zcu102.log
------------------------------------
3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 06:08:53)
[GCC 9.4.0]
------------------------------------
Command line options:
--build_dir : ./build
--target : zcu102
--num_images : 10000
--app_dir : application
------------------------------------
Copying application code from application ...
Copying compiled model from ./build/compiled_model/CNN_zcu102.xmodel ...
100%|██████████| 10000/10000 [00:10<00:00, 981.85it/s]
(vitis-ai-pytorch) Vitis-AI /workspace >
(vitis-ai-pytorch) Vitis-AI /workspace > ls
application common.py docker_run.sh PROMPT.txt quantize.py setup.sh train.py
build compile.sh img __pycache__ run_all.sh target.py
(vitis-ai-pytorch) Vitis-AI /workspace > vim compile.sh
(vitis-ai-pytorch) Vitis-AI /workspace > cat /opt/vitis_ai/compiler/arch/DPUCZDX8G/ZCU102/arch.json
{
"target": "DPUCZDX8G_ISA1_B4096"
}
(vitis-ai-pytorch) Vitis-AI /workspace > cat /opt/vitis_ai/compiler/arch/DPUCZDX8G/
KV260/ ULTRA96V2/ ZCU102/ ZCU104/
(vitis-ai-pytorch) Vitis-AI /workspace > cat /opt/vitis_ai/compiler/arch/DPUCZDX8G/
cat: /opt/vitis_ai/compiler/arch/DPUCZDX8G/: Is a directory
(vitis-ai-pytorch) Vitis-AI /workspace > ls /opt/vitis_ai/compiler/arch/DPUCZDX8G/
KV260 ULTRA96V2 ZCU102 ZCU104
(vitis-ai-pytorch) Vitis-AI /workspace > ls /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2/
arch.json
(vitis-ai-pytorch) Vitis-AI /workspace > cat /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2/arch.json
{
"target": "DPUCZDX8G_ISA1_B4096"
}
(vitis-ai-pytorch) Vitis-AI /workspace > vi /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2/arch.json
(vitis-ai-pytorch) Vitis-AI /workspace > sudo vi /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2/arch.json
(vitis-ai-pytorch) Vitis-AI /workspace > vim compile.sh
(vitis-ai-pytorch) Vitis-AI /workspace > vim compile.sh
(vitis-ai-pytorch) Vitis-AI /workspace > cat /opt/vitis_ai/compiler/arch/DPUCZDX8G/ULTRA96V2/arch.json
{
"target": "DPUCZDX8G_ISA0_B2304_MAX_BG2"
}
(vitis-ai-pytorch) Vitis-AI /workspace > source compile.sh ultra96v2 ${BUILD} ${LOG}
-----------------------------------------
COMPILING MODEL FOR ULTRA96V2..
-----------------------------------------
[UNILOG][FATAL][TARGET_FACTORY_UNREGISTERED_TARGET][Unregistered target!] Cannot find target with name DPUCZDX8G_ISA0_B2304_MAX_BG2, valid names are: {DPUCADF8H_ISA0=>0x700000000000000,DPUCAHX8H_ISA2=>0x20200000010002a,DPUCAHX8H_ISA2_DWC=>0x20200000010002b,DPUCAHX8H_ISA2_ELP2=>0x20200000000002e,DPUCAHX8L_ISA0=>0x30000000000001d,DPUCAHX8L_ISA0_SP=>0x30000000000101d,DPUCVDX8G_ISA3_C32B1=>0x603000b16011811,DPUCVDX8G_ISA3_C32B3=>0x603000b16011831,DPUCVDX8G_ISA3_C32B3_PSMNET=>0x603000b16026831,DPUCVDX8G_ISA3_C32B6=>0x603000b16011861,DPUCVDX8G_ISA3_C64B1=>0x603000b16011812,DPUCVDX8G_ISA3_C64B3=>0x603000b16011832,DPUCVDX8G_ISA3_C64B5=>0x603000b16011852,DPUCVDX8H_ISA1_F2W2_8PE=>0x501000000140fee,DPUCVDX8H_ISA1_F2W4_4PE=>0x5010000001e082f,DPUCVDX8H_ISA1_F2W4_6PE_aieDWC=>0x501000000160c2f,DPUCVDX8H_ISA1_F2W4_6PE_aieMISC=>0x5010000001e082e,DPUCZDI4G_ISA0_B4096_DEMO_SSD=>0x400002003220206,DPUCZDI4G_ISA0_B8192D8_DEMO_SSD=>0x400002003220207,DPUCZDX8G_ISA1_B1024=>0x101000016010402,DPUCZDX8G_ISA1_B1152=>0x101000016010203,DPUCZDX8G_ISA1_B1600=>0x101000016010404,DPUCZDX8G_ISA1_B2304=>0x101000016010405,DPUCZDX8G_ISA1_B3136=>0x101000016010406,DPUCZDX8G_ISA1_B4096=>0x101000016010407,DPUCZDX8G_ISA1_B512=>0x101000016010200,DPUCZDX8G_ISA1_B800=>0x101000016010201}
*** Check failure stack trace: ***
This program has crashed!
Aborted (core dumped)
**************************************************
* VITIS_AI Compilation - Xilinx Inc.
**************************************************
-----------------------------------------
MODEL COMPILED
-----------------------------------------
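The unregistered-target error above suggests the compiler only accepts target names baked into its registry (hand-editing `arch.json` with a symbolic name like `DPUCZDX8G_ISA0_B2304_MAX_BG2` is not enough). My understanding is that for a custom DPU configuration such as the B2304 used on the Ultra96-V2, the `arch.json` generated by the DPU hardware build carries a numeric fingerprint rather than a symbolic name, and pointing `compile.sh` at that file is the usual route. A sketch of the shape (the fingerprint value must come from your own DPU build output, not be guessed):

```json
{
    "fingerprint": "0x<value-from-your-DPU-build>"
}
```

This is the direction I intend to investigate next, reconciling the fingerprint reported by the Avnet Ultra96-V2 DPU design with what this Vitis-AI release's compiler accepts.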
Conclusion:
The Path to Programmable 3 has been an interesting journey. I have learned a lot about the AMD AI/ML stack and about machine learning in general. It has also been very challenging, as I had to expand my skillset to pick up additional tools such as Docker and the Python APIs for TensorFlow, Keras, etc. I hope to continue exploring the AI/ML offerings from AMD in future blogs, fix some of the issues mentioned above, and measure the resulting performance improvements.