Baby Crying Sound Classification using Convolutional Neural Network
DOI: https://doi.org/10.11113/humentech.v3n1.66

Keywords: Baby cry, Convolutional neural network, Machine learning, Mel-frequency cepstral coefficient, Sound classification

Abstract
Crying is a newborn's earliest and most crucial form of communication, yet accurately interpreting a baby's cry is difficult for anyone without the training or expertise of nurses, paediatricians, and childcare professionals. This study classifies baby crying sounds using a Convolutional Neural Network (CNN) on a dataset of 3,495 one-second audio clips spanning five categories: belly pain, burping, discomfort, hungry, and tired. The methodology involves preprocessing the audio data, extracting Mel-Frequency Cepstral Coefficients (MFCC) as features, and training the CNN model. To determine the optimal architecture, two CNN configurations are evaluated; their settings are identical except for the layer sizes. The first configuration uses 100, 200, and 100 neurons in its three layers, while the second uses 256, 512, and 256. The results show that the second configuration, with larger and more complex layers, achieves higher accuracy (86%) than the first (84%). The study demonstrates the effectiveness of CNNs in classifying baby cries and highlights the importance of model architecture in achieving accurate classification results. Future research could explore larger and more diverse datasets to improve generalizability.
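A minimal sketch of how such an MFCC-plus-CNN pipeline might look is given below. It assumes a librosa/TensorFlow-Keras toolchain, a 22,050 Hz sample rate, 40 MFCC coefficients, and that the neuron counts of the second configuration (256, 512, 256) refer to the widths of three convolutional layers; none of these details are specified in the abstract, so this is illustrative rather than a reproduction of the authors' model.

import numpy as np
import librosa
import tensorflow as tf

NUM_CLASSES = 5      # belly pain, burping, discomfort, hungry, tired
SAMPLE_RATE = 22050  # assumed sample rate (not stated in the abstract)
N_MFCC = 40          # assumed number of MFCC coefficients

def extract_mfcc(path):
    """Load a one-second clip and return its MFCC matrix with a channel axis."""
    signal, sr = librosa.load(path, sr=SAMPLE_RATE, duration=1.0)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=N_MFCC)
    return mfcc[..., np.newaxis]  # shape: (n_mfcc, frames, 1)

def build_model(input_shape):
    """Three convolutional blocks sized 256/512/256, echoing the second configuration."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=input_shape),
        tf.keras.layers.Conv2D(256, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(512, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(256, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

Training under these assumptions would then consist of stacking the MFCC matrices of the 3,495 clips into one array, encoding the five labels as integers, and calling model.fit on the resulting tensors.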