element14 Community
Eye On Intelligence Challenge
EC Blog #5

venkat01
18 Nov 2024

Using Tensil.ai for CNN Models on the Arty Z7

Tensil.ai is an open-source platform for efficiently deploying convolutional neural networks (CNNs) on FPGA hardware, and it works well with development boards like the Arty Z7. This article provides a detailed, technical guide to running CNN models on the Arty Z7 with Tensil.ai, covering environment setup, model compilation, and execution.

Setting Up the Environment

The first step in utilizing Tensil.ai on the Arty Z7 is to set up the PYNQ environment. Begin by downloading the appropriate SD card image for the Arty Z7 from the official PYNQ repository. The image typically includes a preconfigured Linux environment optimized for FPGA development.

  1. Flash the SD Card: Use a tool like Balena Etcher or Win32 Disk Imager to write the downloaded image to your SD card. Ensure that the SD card is at least 16 GB for optimal performance.

  2. Boot the Arty Z7: Insert the SD card into the Arty Z7 board, connect it to a power source, and establish a network connection (via Ethernet or Wi-Fi). Use a serial console or SSH to access the board.

  3. Kernel Configuration: Once logged into the PYNQ environment, you may need to adjust the kernel configuration to increase the size of the Contiguous Memory Allocator (CMA). This is crucial for handling large data transfers between the CPU and FPGA. You can modify the boot parameters by editing the bootargs in the /boot/uEnv.txt file to include:

    bootargs=console=ttyPS0,115200 root=/dev/mmcblk0p2 rw rootwait cma=128M

    After saving the changes, reboot the board to apply the new settings.
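Once the board is back up, it is worth confirming that the kernel actually reserved the requested CMA region before loading any bitstreams. A minimal sketch that checks /proc/meminfo (the helper function here is illustrative, not part of Tensil or PYNQ):

```python
import re

def cma_total_kib(meminfo_text):
    """Return the CmaTotal value in KiB from /proc/meminfo text, or None if absent."""
    match = re.search(r"^CmaTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    return int(match.group(1)) if match else None

try:
    # On the board, read the live file; cma=128M should report as 131072 kB
    with open("/proc/meminfo") as f:
        total = cma_total_kib(f.read())
    print(f"CMA reserved: {total} KiB")
except OSError:
    pass  # /proc/meminfo is only available on Linux
```

If the reported value does not match the `cma=` boot argument, double-check that the edited bootargs line is the one actually being picked up at boot.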

Installing Tensil Driver and Artifacts

With the environment set up, the next step is to install the Tensil driver and necessary artifacts. This involves cloning the Tensil repository and transferring files to the Arty Z7.

  1. Clone the Tensil Repository: On your local machine, clone the Tensil GitHub repository:

    $ git clone https://github.com/tensil-ai/tensil.git
  2. Transfer the Drivers: Use SCP (Secure Copy Protocol) to copy the Tensil driver package to the board, substituting your board's IP address:

    $ scp -r tensil/drivers/tcu_arty xilinx@<board-ip>:/home/xilinx/
  3. Bitstream and Model Files: After compiling your model, transfer the generated bitstream and model files to the Arty Z7 board. For instance:

    $ scp my_model.bit xilinx@<board-ip>:/home/xilinx/
    $ scp my_model.onnx xilinx@<board-ip>:/home/xilinx/
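If you find yourself re-copying artifacts after every rebuild, the transfers can be scripted. A small sketch, where the board address is a placeholder and passwordless SSH keys are assumed to be set up:

```python
import subprocess

BOARD = "xilinx@192.168.2.99"  # placeholder: substitute your board's address

def scp_command(sources, dest_dir="/home/xilinx/"):
    """Build the scp argument list for copying one or more local paths to the board."""
    return ["scp", "-r", *sources, f"{BOARD}:{dest_dir}"]

cmd = scp_command(["my_model.bit", "my_model.onnx"])
# subprocess.run(cmd, check=True)  # uncomment on a machine that can reach the board
print(" ".join(cmd))
```

Building the argument list separately makes it easy to extend the same helper to the compiler artifacts (.tmodel, .tprog, .tdata) generated later.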

Compiling the CNN Model

Compiling the CNN model is a critical step in the deployment process. Tensil provides a compiler that converts your high-level model definition into a format that can be executed on the FPGA.

  1. Prepare Your Model: Ensure that your CNN model is in ONNX format. If you have a model in another format (like TensorFlow or PyTorch), you can convert it to ONNX using the respective libraries.

  2. Compile the Model: Use the Tensil compiler to compile your ONNX model. The command typically looks like this:

    $ tensil compile -a /path/to/arty.tarch -m /path/to/my_model.onnx -o "Identity:0" -s true
    In this command:

    • -a specifies the architecture file for the Arty Z7.
    • -m points to your ONNX model file.
    • -o specifies the output node of your model.
    • -s indicates whether to include the softmax operation in the compilation.

    The compilation will generate several artifacts, including a manifest file (.tmodel), a program file (.tprog), and weights data (.tdata). These files are essential for running the model on the FPGA.

Running the Model on Arty Z7

After compiling the model, you can proceed to execute it on the Arty Z7. This involves initializing the PYNQ overlay, loading the model, and running inference.

  1. Initialize the PYNQ Overlay: In your Python environment on the Arty Z7, you need to load the PYNQ overlay and instantiate the Tensil driver. Here’s how you can do it:

from pynq import Overlay
from tcu_arty.driver import Driver

# Load the bitstream onto the FPGA fabric
overlay = Overlay("path/to/your/overlay.bit")

# Initialize the Tensil driver against the loaded overlay
driver = Driver(overlay)
  2. Load and Preprocess Data: Prepare your input data for inference. If you are using the CIFAR dataset, you can load and preprocess the images as follows:

import numpy as np
from PIL import Image

def load_and_preprocess_image(image_path):
    image = Image.open(image_path).resize((32, 32))  # Resize to CIFAR dimensions
    image_array = np.array(image, dtype=np.float32) / 255.0  # Normalize pixel values
    return image_array.flatten()  # Flatten the image for input

input_data = load_and_preprocess_image("path/to/image.png")
  3. Execute the Model: With the model loaded and data prepared, you can run inference. The following code snippet demonstrates how to execute the model and retrieve the results:

# Load the model into the driver
driver.load_model("path/to/my_model.tmodel")

# Run inference
output = driver.run(input_data)

# Process the output
predicted_class = np.argmax(output)
print(f"Predicted class: {predicted_class}")
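The raw output vector can be turned into a human-readable prediction. A sketch of the post-processing for a 10-class CIFAR-10 model, where the logits below are made-up stand-ins for the array returned by driver.run:

```python
import numpy as np

CIFAR10_CLASSES = [
    "airplane", "automobile", "bird", "cat", "deer",
    "dog", "frog", "horse", "ship", "truck",
]

def postprocess(logits):
    """Convert raw model logits to (label, confidence) via a numerically stable softmax."""
    logits = np.asarray(logits, dtype=np.float64)
    exp = np.exp(logits - logits.max())  # subtract max for numerical stability
    probs = exp / exp.sum()
    idx = int(np.argmax(probs))
    return CIFAR10_CLASSES[idx], float(probs[idx])

# Stand-in logits; in practice pass the output of driver.run(input_data)
label, confidence = postprocess([0.1, 0.0, 0.2, 3.5, 0.0, 0.3, 0.0, 0.1, 0.0, 0.2])
print(f"Predicted: {label} ({confidence:.1%})")
```

If the model was compiled with -s true, the softmax is already applied on the accelerator and only the argmax/label lookup is needed.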
  4. Performance Optimization: To maximize performance, consider optimizing the data transfer between the CPU and FPGA. Use DMA (Direct Memory Access) for efficient data handling, and ensure that the input data is aligned with the memory requirements of the FPGA.

  5. Debugging and Monitoring: Utilize the PYNQ Jupyter Notebooks for debugging and monitoring the performance of your model. You can visualize the data flow and check for bottlenecks in the processing pipeline.
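For the performance monitoring mentioned above, a simple way to spot bottlenecks is to time each stage of the pipeline separately. A sketch with a stubbed-out inference call; on the board, swap the stub for driver.run:

```python
import time
import numpy as np

def time_stage(fn, *args, repeats=10):
    """Return the mean wall-clock seconds for fn over several repeats."""
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats

def fake_inference(x):
    # Stand-in for driver.run(x); replace with the real call on the board
    return np.tanh(x).sum()

data = np.random.rand(3 * 32 * 32).astype(np.float32)
mean_s = time_stage(fake_inference, data)
print(f"mean inference time: {mean_s * 1e3:.3f} ms")
```

Timing preprocessing, transfer, and inference separately usually makes it obvious whether the accelerator or the data path is the limiting factor.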
