Sci-Pi GPT - RPi 4B Limits with GPT4ALL V2

vlasov01
26 Jun 2023

General Considerations for Using LLM

Large language models are very powerful, but using that power well requires a thoughtful approach. Here are some technical considerations that I think are very important:

  • Context window limit - most current models limit the length of their input text and generated output, measured in tokens. As a rough estimate, an average word takes up to two tokens. The current limit of GPT4ALL is 2048 tokens. Why is this important? Current LLMs are stateless and can't create new memories. One workaround is to provide the previous dialogue, or a summary of it, as input. Otherwise, the model will not remember anything from the discussion. As a result, the limit significantly constrains their usefulness for scientific research, which requires a large amount of data.
  • Model and context size - both the size of the LLM and the size of the context affect processing performance, and the impact can be very significant. This is a particular concern for edge devices with limited compute and power. The model needs to be loaded into RAM, and significant extra memory is required for its calculations. The bigger the input, the more processing time it requires.
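The workaround of feeding previous dialogue back as input can be sketched as follows. This is only an illustration, not GPT4ALL's own logic: `estimate_tokens` and `build_prompt` are hypothetical helpers, the two-tokens-per-word figure is the rough estimate from above rather than a real tokenizer, and the 512-token reserve for the answer is an arbitrary assumption.

```python
CONTEXT_LIMIT = 2048   # token limit quoted above for GPT4ALL
TOKENS_PER_WORD = 2    # rough estimate from the text, not a real tokenizer


def estimate_tokens(text):
    """Approximate the token count from the word count."""
    return len(text.split()) * TOKENS_PER_WORD


def build_prompt(history, question, limit=CONTEXT_LIMIT, reserve=512):
    """Keep the most recent dialogue turns that fit in the context window.

    `reserve` tokens are left free for the generated answer (assumed value).
    """
    budget = limit - reserve - estimate_tokens(question)
    kept = []
    # Walk backwards so the newest turns are considered first.
    for turn in reversed(history):
        cost = estimate_tokens(turn)
        if cost > budget:
            break
        kept.append(turn)
        budget -= cost
    kept.reverse()  # restore chronological order
    return "\n".join(kept + [question])
```

Older turns that no longer fit are simply dropped here. Replacing them with a short summary, as suggested above, would preserve more of the discussion at the cost of an extra model call.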

GPT4ALL Performance Findings on RPi 4B

I was aware of the memory considerations for LLMs, so I created a swap disk on the RPi to add more memory for processing. This workaround comes with a huge performance impact, as my SD card is much slower than my RAM.

I modified the Python code to add some timer output, and then ran a few tests to measure performance.
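The kind of timer output described here could look roughly like the sketch below. It is not the code used for these tests: `timed` is a hypothetical helper, and `load_model` and `generate` are placeholder stand-ins for the real model calls.

```python
import time


def timed(label, func, *args, **kwargs):
    """Run func, print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.1f} s ({elapsed / 60:.1f} min)")
    return result, elapsed


# Placeholder stand-ins; replace them with the real model calls.
def load_model():
    return object()


def generate(model, prompt):
    return f"(answer to: {prompt})"


model, load_seconds = timed("Model load", load_model)
answer, gen_seconds = timed("Generation", generate, model, "What is a token?")
```

Timing the load and the generation separately makes it clear which of the two phases dominates the total run time.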

image

Based on this test, the model load time was ~90 seconds. Answering my question and generating the output took much longer: 63 minutes. I ran another test and asked a short question.

image

It took ~12 minutes to generate the answer.

I used the htop utility to look at resource usage. The RPi CPUs were not too busy during these tests.

image

But RAM and swap were used extensively.

image
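The same RAM and swap figures can also be logged from Python next to the timer output. The sketch below assumes a Linux system that exposes `/proc/meminfo` with a `MemAvailable` field; the function names are hypothetical and this was not part of the original tests.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_summary(info):
    """Return used RAM and used swap in MiB."""
    ram_used = info["MemTotal"] - info["MemAvailable"]
    swap_used = info["SwapTotal"] - info["SwapFree"]
    return {"ram_used_mib": ram_used / 1024, "swap_used_mib": swap_used / 1024}


def read_memory_summary(path="/proc/meminfo"):
    """Read the current memory usage from the running system."""
    with open(path) as f:
        return memory_summary(parse_meminfo(f.read()))
```

Calling `read_memory_summary()` before loading the model and again during generation would show how much of the working set ends up in swap.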

Potential Optimization Options

As this RPi 4B has only 4 GB of RAM, I started looking at what can be done to reduce memory usage.

There are several techniques for LLM optimization, such as distillation, pruning, and quantization. These techniques reduce the model size and memory usage.
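To see why quantization matters on a 4 GB board, a back-of-the-envelope estimate of the weight storage helps. The parameter count below is a hypothetical example, not the size of the model used in these tests, and the estimate covers the weights only, ignoring the extra memory needed for calculations and for the operating system.

```python
def weights_size_gib(n_params, bits_per_weight):
    """Size of the weights alone, in GiB, at a given precision."""
    return n_params * bits_per_weight / 8 / 2**30


N_PARAMS = 7_000_000_000  # hypothetical 7-billion-parameter model
RAM_GIB = 4               # RPi 4B used in these tests

for bits in (16, 8, 4):
    size = weights_size_gib(N_PARAMS, bits)
    verdict = "fits" if size < RAM_GIB else "does not fit"
    print(f"{bits:>2}-bit weights: {size:.1f} GiB, {verdict} in {RAM_GIB} GiB")
```

Going from 16-bit to 4-bit weights cuts the storage by a factor of four, which can be the difference between running from RAM and running mostly from swap.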

Another direction is to find a model that has a smaller size and needs less RAM.

There are also some other ideas for improving performance without changing hardware:

  • Use the free ArmPL library, which is optimized for Arm CPUs, for ML
  • Use AIMET, the edge-optimized ML toolkit from Qualcomm

Some options from a hardware perspective:

  • Increase RAM to 8 GB to fit the model in RAM and avoid using swap
  • Use faster storage to reduce the negative impact of swapping
  • Use hardware ML accelerators
