Ubuntu
I've decided to try using Ubuntu as the OS for my project. Here are a few reasons for my decision:
- I'm using it extensively
- It officially supports the RPi4
- Ubuntu is widely used by the ML/AI community
- It was referenced in one of the videos where a large language model (LLM) was successfully installed on an RPi4 with 8GB of RAM
- RPi Imager supports Ubuntu image creation
- Ubuntu 22.04 comes with zswap (RAM compression), which may help me run an LLM on 4GB of RAM
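Whether zswap is actually active can be checked from sysfs; a quick sketch (standard sysfs paths on recent kernels, and the `/boot/firmware/cmdline.txt` location is where Ubuntu's RPi images keep the kernel command line):

```shell
# Check whether the zswap module is present and enabled.
cat /sys/module/zswap/parameters/enabled 2>/dev/null \
  || echo "zswap module not loaded"
# To enable it at boot, append zswap.enabled=1 to the kernel command line
# (on Ubuntu RPi images: /boot/firmware/cmdline.txt), then reboot.
```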
Installation
It was easy to flash the SD card using RPi Imager. But I got stuck trying to log in, as it didn't like my credentials. I've flashed it again and used the Ubuntu default userid/password this time. I've named my RPi host scipi.
{gallery}Flashing Ubuntu 22.04 LTS on RPi 4B |
---|
RPi Imager Download |
Ubuntu 22.04 64-bit selection |
OS image settings |
Image flashing |
Image flashing completed |
Ubuntu 22.04 on RPi4
I've used a free Windows utility, Advanced IP Scanner 2.5, to find the IP address of the RPi on my network.
I've updated Ubuntu and created a swap file in preparation for testing the LLM.
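Creating the extra swap can be sketched as follows (the 12 GB size matches what I used later; the `/swapfile` path is my choice, and the commands are standard Ubuntu ones that need root, so they are wrapped in a function here):

```shell
# Create and enable a 12 GB swap file (run as root; path and size are my choices).
make_swap() {
  fallocate -l 12G /swapfile                       # reserve the space
  chmod 600 /swapfile                              # swap must not be world-readable
  mkswap /swapfile                                 # format as swap
  swapon /swapfile                                 # enable it now
  echo '/swapfile none swap sw 0 0' >> /etc/fstab  # persist across reboots
}

# Check what is currently active:
swapon --show
free -h
```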
First Attempt to Run Alpaca LLM
I've followed an instruction on installing an LLM on an RPi4B with 8GB of RAM shared at
. The author specifically mentioned that it will not work with 4GB of RAM. But as I had created an extra 12 GB of swap, I was hoping that it might work.
The instruction was very simple:
git clone https://github.com/antimatter15/alpaca.cpp
cd alpaca.cpp
make chat
curl -o ggml-alpaca-7b-q4.bin -C - https://ipfs.io/ipfs/QmQ1bf2BTnYxq73MFJWu1B7bQ2UD6qG7D7YDCxhTndVkPC
./chat
But the first challenge was to download the model: curl was not able to download the file in one go. I've created a script to restart curl until the download completed.
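My retry script was essentially the loop below (a sketch: the file name and IPFS URL come from the instruction above, while the function wrapper and the 10-second pause are my choices). curl's `-C -` flag resumes a partial download instead of starting over:

```shell
# Keep restarting curl until it exits successfully; -C - resumes the
# partial file, -f makes curl fail on HTTP errors so the loop retries.
fetch_model() {
  url="https://ipfs.io/ipfs/QmQ1bf2BTnYxq73MFJWu1B7bQ2UD6qG7D7YDCxhTndVkPC"
  until curl -f -o ggml-alpaca-7b-q4.bin -C - "$url"; do
    echo "download interrupted, retrying in 10s..." >&2
    sleep 10
  done
}
```

Recent curl versions also have a built-in `--retry N` option, which can replace a loop like this for transient failures.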
But then I got a Segmentation Fault error:
Then I tried adjusting the memory parameter with a few different values, but I got the same issue:
It is possible that the model file was corrupted, but I was not sure if it had a published checksum. And it seemed that the program was validating the file too.
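One way to rule out corruption would be to hash the file and compare the result against a checksum published alongside the model. I did not find an official checksum for this mirror, so the sketch below only shows how to compute the hash:

```shell
# Print the SHA-256 of the downloaded model for comparison with a
# known-good checksum, if one can be found.
f="ggml-alpaca-7b-q4.bin"
if [ -f "$f" ]; then
  sha256sum "$f"
else
  echo "$f not found" >&2
fi
```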
MPT-7B Model
I've looked for other models and found a new model from Mosaic. I've downloaded it and was able to run it on 4GB of RAM:
It was quite a slow process, though.
The first test
I've asked a wrong question (misspelled Musk as Mask) and got a very imaginative answer. It took more than an hour for the RPi 4 to generate it. Its 4 cores were not very busy at ~25% load. Most likely it was swapping memory back and forth. Next time I'll try to identify the bottleneck. I'll start with measuring the use of swap and zswap.
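A first pass at that measurement could look like this (standard procps tools; the zswap counters need root and a mounted debugfs):

```shell
free -h      # overall RAM and swap usage
vmstat 1 3   # watch the si/so columns: KiB swapped in/out per second
# zswap counters (root, debugfs mounted):
# sudo grep -H . /sys/kernel/debug/zswap/*
```

If the si/so columns stay high while CPU load sits around 25%, that would confirm the cores are mostly waiting on swap I/O rather than computing.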