Author: ralphjy
Date Created: 18 Feb 2023 5:55 AM
Bee Healthy - Blog 4: Audio Dataset

ralphjy
18 Feb 2023

I had contemplated using either the camera or microphone on the Nicla Vision to do image or audio classification to monitor the health of a bee hive.  The challenge in either case is to build or obtain a labeled dataset in order to create an ML model for inferencing.  In the timeframe of this Design Challenge I can't realistically acquire enough data using the Nicla Vision to build a dataset.  For my project I decided on using bee hive sounds, so I set about searching for a suitable audio dataset.

ChatGPT

This seemed like a good opportunity to try out ChatGPT, so I submitted the query "find dataset of bee sounds to classify health".  It returned results that looked very promising:

"  There are several datasets available online that contain bee sounds that could be used to classify the health of bees. Some examples include:

  1. The "BeeSounds" dataset: This dataset, published by researchers at Aberystwyth University, contains recordings of honeybees and bumblebees, along with annotations indicating the species and the presence of disease. The dataset can be found here: https://data.mendeley.com/datasets/5ctjnf2c6k/1
  2. The "BeeHealth" dataset: This dataset, created by researchers at the University of Sussex, contains recordings of honeybees, bumblebees, and solitary bees, along with annotations indicating the health status of the bees. The dataset can be found here: https://data.mendeley.com/datasets/v5d5c5z5c9/1
  3. The "BeeSounds-2018" dataset: This dataset contains recordings of honeybees, bumblebees, and solitary bees, along with annotations indicating the species and the presence of disease. The dataset can be found here: https://data.mendeley.com/datasets/kxncwwcxzc/1

It is important to note that using bee sounds to classify the health of bees is a relatively new and challenging task, so it is likely that there are not many datasets available yet and you may need to create your own dataset as well."

Unfortunately, none of the data.mendeley.com links work, and it is not clear to me whether that data is publicly accessible.  At least ChatGPT acknowledged that it gave me a bad link:

" I apologize for the confusion. I apologize for the error in my previous response, the link https://data.mendeley.com/datasets/kxncwwcxzc/1 seems to be unavailable. It is possible that the dataset is no longer accessible, or that the link has been changed.

I suggest you to look for other datasets that might have bee sounds, and also check if they are labeled as healthy or unhealthy. Some other websites that might help you are:

  • https://zenodo.org/
  • https://datadryad.org/
  • https://figshare.com/
  • https://data.world/

You can also look for research papers that have used bee sound dataset and contact the authors to see if they can share the dataset with you.

Please let me know if there is anything else I can assist you with. "


I tried searching for the datasets using some of the keywords, but had no success.  The dataset from the University of Sussex sounds like exactly what I need, so I tried ChatGPT again with a more prescriptive query and it responded with an even more promising link:

" The University of Sussex BeeHealth dataset is a collection of data related to bee health and behavior, including measurements of bee activity and behavior, as well as information on the health and survival of bee colonies.

To access the BeeHealth dataset, you can visit the University of Sussex's BeeLab website at the following URL: https://www.sussex.ac.uk/beehealth/data. On this page, you can find links to download the BeeHealth dataset, as well as information on how the data was collected and processed.

The BeeHealth dataset consists of two main components: the Hive Monitoring dataset and the Apiary Survey dataset. The Hive Monitoring dataset contains data on bee activity and behavior, as well as environmental conditions within the hive, collected using sensors placed within beehives. The Apiary Survey dataset contains information on the health and survival of bee colonies, as well as data on the characteristics of the bees and their environment, collected through surveys of beekeepers.

Both datasets are available for download in CSV format. The Hive Monitoring dataset contains data from 24 beehives over a period of several years, while the Apiary Survey dataset contains data from over 500 beekeepers across the UK. "

 

Unfortunately, that link did not work either.  So, I've resorted to sending an email query to the Laboratory of Apiculture and Social Insects (LASI) at the University of Sussex.  I hope that I get a response.  ChatGPT seems to be able to dredge up a lot more info than a normal Google search.  It's too bad that it seems to find a lot of broken links.
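
Since every link ChatGPT suggested turned out to be dead, it can save time to test a batch of candidate URLs before following them up by hand.  The following is a minimal Python sketch and not something I used at the time: `http_status` and `check_links` are hypothetical helper names, and it relies only on the standard-library `urllib`.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if no response arrived."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # the server answered, but with an error such as 404
    except (OSError, ValueError):
        return None  # DNS failure, refused connection, timeout, malformed URL


def check_links(urls, fetch=http_status):
    """Map each URL to 'ok', 'broken (<code>)', or 'unreachable'.

    `fetch` is injectable so the logic can be tested without network access.
    """
    report = {}
    for url in urls:
        status = fetch(url)
        if status is None:
            report[url] = "unreachable"
        elif 200 <= status < 400:
            report[url] = "ok"
        else:
            report[url] = f"broken ({status})"
    return report
```

Passing the three data.mendeley.com links and the sussex.ac.uk link to `check_links` gives a quick first verdict on each.  An "ok" only means the server answered: some sites return a 200 status for a placeholder page and some reject HEAD requests, so a manual look is still needed.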

Zenodo

I did find a dataset on Zenodo, "To bee or not to bee: An annotated dataset for beehive sound recognition", that I think I can use to build and deploy an ML model to the Nicla Vision.  It is a simpler dataset: it only distinguishes the sound of bees in the hive from external sounds (not bee) recorded in the hive.  Not quite what I wanted, but hopefully I can use it to demonstrate building and deploying a model.

 

This data was taken from hives in the Open Source Beehive (OSBH) and NU-Hive projects.  
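
An annotated recording usually has to be cut into short labeled clips before it can be used for training.  Below is a rough Python sketch of that step using only the standard-library `wave` module.  It assumes the annotations have already been loaded as `(start_seconds, end_seconds, label)` tuples; that tuple layout, the `cut_segments` name, and the output folder naming are my own assumptions, so check the dataset's documentation for the actual annotation format.

```python
import wave
from pathlib import Path


def cut_segments(wav_path, annotations, out_dir):
    """Write one WAV clip per (start_s, end_s, label) annotation.

    Clips are saved as <out_dir>/<label>/<source stem>_<index>.wav.
    Returns the list of clip paths that were written.
    """
    wav_path, out_dir = Path(wav_path), Path(out_dir)
    written = []
    with wave.open(str(wav_path), "rb") as source:
        params = source.getparams()
        rate = source.getframerate()
        total = source.getnframes()
        for index, (start_s, end_s, label) in enumerate(annotations):
            first = max(0, int(start_s * rate))
            last = min(total, int(end_s * rate))
            if last <= first:
                continue  # skip empty or out-of-range annotations
            source.setpos(first)
            frames = source.readframes(last - first)
            target = out_dir / label / f"{wav_path.stem}_{index:03d}.wav"
            target.parent.mkdir(parents=True, exist_ok=True)
            with wave.open(str(target), "wb") as clip:
                clip.setparams(params)    # same channels, sample width, rate
                clip.writeframes(frames)  # frame count in header is fixed on close
            written.append(target)
    return written
```

Sorting the clips into one folder per label keeps the class of each clip visible from its path, which should make it straightforward to upload them class by class later.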

Utah State University

I found a paper by Vladimir Kulyukin of Utah State University, "Audio, Image, Video, and Weather Datasets for Continuous Electronic Beehive Monitoring", that describes Raspberry Pi based monitoring systems collecting data from Utah beehives.  Even though this paper was published in 2021, I still had problems with the data links, but I contacted the author and was able to acquire some of the audio datasets.  These datasets are similar to the "To bee or not to bee" dataset in that they use only a few coarse classes, in this case three: bee, cricket, and noise.  The way these datasets are segmented will make them easier to use with Edge Impulse, so I think I will start with one of them to test the model build and deployment process.

Here is the file structure for a dataset:

[image: dataset file structure]

And the segment file labeling:

[image: segment file labeling]
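
Before training on a dataset like this it is worth checking how many clips each class contains, since a heavily unbalanced set can skew the model.  Here is a small Python sketch for that check.  It assumes a hypothetical layout with one folder per class (for example `bee`, `cricket`, `noise`) directly under the dataset root; the real layout is the one in the screenshot above and may differ, and both function names are made up for illustration.

```python
from collections import Counter
from pathlib import Path


def count_clips_per_class(dataset_root, extension=".wav"):
    """Count audio files under each immediate subfolder of dataset_root.

    Assumes a layout of <dataset_root>/<class name>/.../<clip>.wav.
    """
    counts = Counter()
    for class_dir in sorted(Path(dataset_root).iterdir()):
        if not class_dir.is_dir():
            continue  # ignore loose files such as a README at the top level
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() == extension
        )
    return counts


def imbalance_ratio(counts):
    """Largest class size divided by smallest non-empty class size."""
    sizes = [n for n in counts.values() if n > 0]
    return max(sizes) / min(sizes) if sizes else float("nan")
```

A ratio close to 1.0 means the classes are roughly balanced; a much larger value suggests thinning out the biggest class or collecting more of the smallest one before training.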

I'll describe the Edge Impulse process in a separate post.  I received a UPS notification that a delivery from Newark is scheduled for February 21, so hopefully I'll have some hardware to try this on soon.

 

Comments
  • shabaz, over 2 years ago

    Glad you stuck with it, and eventually found a dataset! Bee/Cricket/Noise seems pretty good!

    Nice idea using ChatGPT to assist : ) 
