Bee Healthy - Blog 5: Edge Impulse Model

19 Feb 2023

I've selected the Buzz3 Audio Dataset to test model development and deployment to the Nicla Vision board using the Edge Impulse toolset. I discussed this dataset in my previous post, Bee Healthy - Blog 4: Audio Dataset.

The Buzz3 dataset contains 11,746 two-second samples, split as follows:

train
class	hour	min	samples
bee	1	36	2880
cricket	2	0	3600
noise	1	24	2520

test
class	min	sec	samples
bee	35	42	1071
cricket	19	14	577
noise	36	36	1098
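As a quick sanity check on the tables above, here is a minimal Python sketch that recomputes the durations and the class balance from the sample counts. The counts and the 2-second clip length come from the tables; the function names are just illustrative.

```python
CLIP_SECONDS = 2  # every Buzz3 sample is a 2-second clip

# Sample counts copied from the train/test tables above.
train = {"bee": 2880, "cricket": 3600, "noise": 2520}
test = {"bee": 1071, "cricket": 577, "noise": 1098}


def format_duration(seconds):
    """Render a whole number of seconds as hours, minutes, seconds."""
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours}h {minutes:02d}m {secs:02d}s"


def class_shares(split):
    """Return each class's fraction of the samples in one split."""
    total = sum(split.values())
    return {label: count / total for label, count in split.items()}


if __name__ == "__main__":
    for name, split in (("train", train), ("test", test)):
        shares = class_shares(split)
        for label, count in split.items():
            duration = format_duration(count * CLIP_SECONDS)
            print(f"{name:5s} {label:8s} {count:5d} {duration} {shares[label]:.1%}")
    print("total samples:", sum(train.values()) + sum(test.values()))
```

The training split works out to 32% bee, 40% cricket and 28% noise, so it is mildly unbalanced rather than badly skewed.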

I may end up rebalancing the training data, but I'll try the dataset as-is first. I already have an Edge Impulse account, so I just needed to create a new project, which I named Bee_Present.

A great feature of the Edge Impulse Studio is the ability to upload existing data files provided they are in an accepted data format or file type.

Data Acquisition Formats:

CBOR
JSON
CSV

File Types:

The audio files in this dataset are all WAV files, so uploading the files with the appropriate labels was easy.
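Before uploading, it can be worth confirming that the files really are 2-second WAV clips. Below is a minimal sketch using Python's standard `wave` module. It assumes a hypothetical folder layout of `<root>/<label>/*.wav` (for example `train/bee/…`), which may not match how your copy of the dataset is organized.

```python
import wave
from pathlib import Path


def wav_duration_seconds(path):
    """Return the length of a PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def summarize_by_label(root):
    """Count files and total seconds per label.

    Assumes a hypothetical layout of <root>/<label>/*.wav, where the
    folder name is the label to apply on upload.
    """
    summary = {}
    for wav_path in sorted(Path(root).glob("*/*.wav")):
        label = wav_path.parent.name
        count, total = summary.get(label, (0, 0.0))
        summary[label] = (count + 1, total + wav_duration_seconds(wav_path))
    return summary
```

Calling `summarize_by_label("train")` on such a folder returns a dictionary of `(file_count, total_seconds)` per label, which can be compared with the tables above.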

Data Acquisition

Impulse Design

For non-voice audio data it is generally recommended to use a Spectrogram processing block, either linear or MFE (Mel-filterbank energy). MFE spaces its frequency bands on the mel scale, which approximates human pitch perception. One might expect MFE to perform better here because the data was labeled by human listeners.
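To illustrate the difference between the two spacings, here is a minimal sketch that compares linear and mel-spaced band edges. It uses the widely used formula mel = 2595 * log10(1 + f/700); I have not confirmed that the Edge Impulse MFE block uses exactly this variant, and the band count and frequency range are illustrative values, not the settings from my project.

```python
import math


def hz_to_mel(hz):
    """Convert frequency in Hz to mels (common 2595 * log10 form)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def band_edges(num_bands, low_hz, high_hz, scale):
    """Return num_bands + 2 edge frequencies in Hz.

    Overlapping triangular filters need two more edges than bands.
    scale is "linear" (even steps in Hz) or "mel" (even steps in mels).
    """
    points = num_bands + 2
    if scale == "linear":
        step = (high_hz - low_hz) / (points - 1)
        return [low_hz + i * step for i in range(points)]
    if scale == "mel":
        low_mel, high_mel = hz_to_mel(low_hz), hz_to_mel(high_hz)
        step = (high_mel - low_mel) / (points - 1)
        return [mel_to_hz(low_mel + i * step) for i in range(points)]
    raise ValueError(f"unknown scale: {scale}")


if __name__ == "__main__":
    # Illustrative values only: 8 bands between 0 Hz and 8 kHz.
    for scale in ("linear", "mel"):
        edges = band_edges(8, 0.0, 8000.0, scale)
        print(scale, [round(edge) for edge in edges])
```

The mel edges are packed closely at low frequencies and spread out at high frequencies, so an MFE front end spends more of its resolution on the lower part of the spectrum.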

I decided to try it both ways.

Audio (MFE) Impulse

Train the Neural Network:

The Quantized (int8) model accuracy is reasonable when validated against the training data:

But it performs less well against the test data, as would be expected:

Linear Spectrogram Impulse

Train the Neural Network:

Again, the accuracy is reasonable against the training data:

But performance is worse against the test data:

Summary of Project versions:

The inference time of the MFE impulse (35 ms) is about half that of the linear spectrogram impulse (71 ms). It also has better memory efficiency and accuracy with this dataset, so that's the model I will use.

My first test when I get the Nicla Vision hardware will be to use the microphone to verify that I can do "Live classification" before I try to deploy the model.

ralphjy over 1 year ago in reply to beacon_dave

I was actually discussing this project with a friend and discovered that his wife has started raising Mason bees. In about a month she'll probably put out her cocoons saved from last year.

I might try to get some audio recordings, but I think it will be difficult with an open structure with solitary bees (not sure where I would put the microphone).

It would probably be easier to use image data with a "bee hotel".

The other problem with audio data is the labeling. I would probably need to do image capture correlated to the sound recording. I'd need to do all of that using the Nicla Vision WiFi since LoRa does not have enough bandwidth. I think that is out of scope for this challenge (at least for me).
beacon_dave over 1 year ago in reply to ralphjy

Do you think you would get any buzzing at something like a 'bee hotel'? A bee hotel might be a bit more doable as an experiment at home than a hive.

Spotted some on sale at the supermarket here yesterday but they were even more shallow than the ones already discussed here. Although to be fair they were being marketed as insect hotels rather than being solely for solitary bees. Might get a wider range of insect sounds however.
ralphjy over 1 year ago in reply to javagoza

I agree with your comments and I may try to turn on data augmentation during training, but with this dataset I don't think that it will improve much. It will be interesting to see how well live classification will work using audio data from this dataset captured by the Nicla Vision microphone. There is a large problem with using out of context data (different hardware and setup), but it's not avoidable for this challenge. And I don't think that I will have the opportunity to try this on an actual hive to get my own data.

That being said, I'm amazed at how much effort is being made to capture beehive audio data. It's just been hard to find usable data.
javagoza over 1 year ago

When I have made live audio classifications I have had a problem with environmental noise and in the case of interiors with reverberation. It helps a lot to expand the samples with data augmentation and mixing samples with recordings of the ambient sound in which the system is going to be deployed.