Our company currently works with bird sound data sources and needs an AI model to do the following tasks:
1. Pre-processing .wav files containing bird sounds and other nature sounds into datasets for classification training (we would like to know what strategy you will use to process the datasets for training and classification, e.g. the frequency range used when cleaning, the slice length used for training [0.5s, 1s, 1.5s, 5s, 10s, etc. interval files], and how the process is updated when a new data source is added)
2. Training deep learning models on the datasets (the models should update when new bird sound data sources are introduced, achieve an efficiency of at least 75%, and be easy for us to understand, maintain and tweak)
3. The objective of this project is simply to help us train an AI model: when we have a bird sound data source, we pass it through the model, which identifies with sufficient accuracy the type of bird making the sound and can improve itself as more bird sound data sources are added.
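As a rough illustration of the slicing question in task 1, the sketch below cuts a waveform into fixed-length clips. It assumes the audio has already been loaded into a 1-D NumPy array; the function name `slice_fixed_length` and its parameters are hypothetical, and the clip length is left as a setting because the best interval (0.5s, 1s, 5s, etc.) depends on the data.

```python
import numpy as np

def slice_fixed_length(samples, sample_rate, clip_seconds=1.0, hop_seconds=None):
    """Split a 1-D waveform into equal-length clips (hypothetical helper).

    hop_seconds controls the step between clip starts; when it is None the
    clips do not overlap. An incomplete clip at the end is dropped.
    """
    samples = np.asarray(samples)
    clip_len = int(round(clip_seconds * sample_rate))
    hop_len = clip_len if hop_seconds is None else int(round(hop_seconds * sample_rate))
    if clip_len <= 0 or hop_len <= 0:
        raise ValueError("clip and hop lengths must be positive")

    starts = range(0, len(samples) - clip_len + 1, hop_len)
    if len(starts) == 0:
        # Recording is shorter than one clip: return an empty batch.
        return np.empty((0, clip_len), dtype=samples.dtype)
    return np.stack([samples[s:s + clip_len] for s in starts])


# Example: 3.5 seconds of audio at an assumed 1000 Hz sample rate.
audio = np.arange(3500)
clips = slice_fixed_length(audio, sample_rate=1000, clip_seconds=1.0)
overlapping = slice_fixed_length(audio, sample_rate=1000, clip_seconds=1.0, hop_seconds=0.5)
```

Using a hop shorter than the clip length produces overlapping clips, which yields more training examples from the same recording at the cost of correlated samples.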
The attached files include:
1. Our current model code (we have independent source code, but due to intellectual property issues we can't share it at the moment)
2. A test data source file (because of the 25MB limit, I include the download link [login to view URL])
3. The label file for the test file
P.S. If you have any questions about the project, please let us know through chat and we will get back to you ASAP.
Hello sir, I am a machine learning engineer with 5+ years of experience. I have done many image classification projects in the agriculture and medical sectors.
I have used many frameworks for object detection, such as PyTorch, MXNet, TensorFlow, Keras and Caffe.
Some of my projects are:
identification of different types of notes
object detection using YOLO or RCNN
custom object detection
brain hemorrhage subtypes detection
pneumothorax detection
COVID-19 detection
Stock market prediction using deep learning
and many more
I am pretty sure that you will get the best results.
Hi,
I'm a PhD student working on audiovisual modality at a university in Barcelona, Spain. I'm used to working with audio.
The standard approach would be to work in the frequency domain.
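As a minimal sketch of what working in the frequency domain could look like, the code below turns a waveform into a log-power spectrogram with a short-time Fourier transform. The function name `log_spectrogram` and the frame and hop sizes are assumptions chosen for illustration, not settings taken from the project.

```python
import numpy as np

def log_spectrogram(samples, frame_len=512, hop_len=256, eps=1e-10):
    """Return a log-power spectrogram of shape (frame_len // 2 + 1, n_frames)."""
    samples = np.asarray(samples, dtype=np.float64)
    if len(samples) < frame_len:
        raise ValueError("signal is shorter than one frame")

    n_frames = 1 + (len(samples) - frame_len) // hop_len
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    frames = np.stack([
        samples[i * hop_len:i * hop_len + frame_len] * window
        for i in range(n_frames)
    ])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    # eps avoids log(0) on silent frames; transpose to (frequency, time).
    return 10.0 * np.log10(power + eps).T


# Example: a 1 kHz test tone sampled at an assumed 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
spec = log_spectrogram(np.sin(2 * np.pi * 1000 * t))
peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 512
```

The resulting 2-D array can be treated like an image, which is why image classification architectures are often reused for audio.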
The dataset partitioning strategy depends on how you have labeled the ground truth. Ideally (to improve generalization), random crops should be taken from each wav file.
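The random-crop idea can be sketched as follows, assuming each wav file is already loaded as a 1-D NumPy array and carries a single label for the whole file. The name `random_crop` is hypothetical; if the labels are time-stamped instead, the crop would also have to be checked against the labeled intervals.

```python
import numpy as np

def random_crop(samples, crop_len, rng):
    """Return a contiguous crop of crop_len samples at a random offset."""
    samples = np.asarray(samples)
    if len(samples) < crop_len:
        raise ValueError("recording is shorter than the requested crop")
    # The upper bound of integers() is exclusive, so the last valid start is included.
    start = rng.integers(0, len(samples) - crop_len + 1)
    return samples[start:start + crop_len]


# Example: draw a different crop from the same recording at each epoch.
rng = np.random.default_rng(0)
recording = np.arange(10_000)
crop_a = random_crop(recording, 2_000, rng)
crop_b = random_crop(recording, 2_000, rng)
```

Drawing a fresh crop every time a file is visited means the model rarely sees exactly the same segment twice.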
It is not clear to me whether you want to identify different species of birds or not.
Waterfalls (at least in the test example) sound like an offset noise, which can be suppressed.
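One possible way to suppress such noise, assuming "offset noise" means a background that stays roughly constant over time in each frequency band, is to subtract the per-band median from a log spectrogram. This is a sketch of that one technique, not necessarily the best choice for this data; the function name `subtract_stationary_noise` is hypothetical.

```python
import numpy as np

def subtract_stationary_noise(log_spec):
    """Subtract each frequency band's median over time and clip at zero.

    log_spec is a 2-D array of shape (n_frequency_bins, n_frames).
    """
    log_spec = np.asarray(log_spec, dtype=np.float64)
    # The median over time estimates the constant background level per band.
    noise_floor = np.median(log_spec, axis=1, keepdims=True)
    return np.maximum(log_spec - noise_floor, 0.0)


# Example: three bands with constant background and one short "call" in band 1.
spec = np.tile(np.array([[10.0], [20.0], [30.0]]), (1, 5))
spec[1, 2] += 15.0
cleaned = subtract_stationary_noise(spec)
```

Short events such as bird calls occupy few frames, so they barely move the median and survive the subtraction, while the steady background is flattened to zero.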
I almost carried out a project like this before, but the organization that was to provide the dataset withdrew at the last moment :(
You can also try to contact them, but they are afraid that people will use their data to track endangered species.
Most of the details would depend on the information you can provide me about the bird sounds, namely the pitch range, the mean duration of singing and some other information.
Let me know if you are interested.
Regards.
Juan M