Optimization of a Convolutional Neural Network for Music Mood Classification Based on Audio Features
DOI:
https://doi.org/10.24843/
Keywords:
Music Mood Classification, Audio Features, Convolutional Neural Network, Model Optimization, Music Information Retrieval
Abstract
This study develops an audio-based music mood classification system using a Convolutional Neural Network (CNN) implemented from scratch, without relying on deep learning frameworks for the core architecture or the training process. This approach was chosen so that the learning process, from forward propagation to parameter updates, can be understood more thoroughly. The GTZAN dataset is used, comprising 999 audio files. Each audio file is processed at a sampling rate of 22,050 Hz, converted to mono, and standardized to a 30-second segment. Each clip is represented as a 71-dimensional feature vector consisting of MFCCs, spectral features, spectral contrast, chroma features, tonnetz, energy-related features, and tempo, which is then standardized using StandardScaler. Class labels are grouped into three mood categories: positive, negative, and neutral. The study evaluates three CNN architecture variants (5-layer, 7-layer, and 9-layer) and applies training settings including class weighting, label smoothing, Gaussian noise, and early stopping based on the validation Macro F1-score. Experimental results indicate that hyperparameter tuning improves validation performance over the baseline, while fine-tuning yields improvement under certain configurations. The best model is the 7-layer architecture after fine-tuning, which achieves an accuracy of 0.97 and a Macro F1-score of 0.97. These results indicate that the extracted audio features represent music mood characteristics effectively. Furthermore, architectural variation has a positive impact on classification performance, although increasing network depth does not consistently improve it. The system is integrated into a Flask-based web application for inference and result presentation.
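The training settings named in the abstract (class weighting, label smoothing, Gaussian noise, and early stopping on the validation Macro F1-score) can be illustrated with a minimal NumPy-only sketch, in keeping with the framework-free approach. This is not the authors' implementation: the function names, the inverse-frequency weighting formula, the smoothing factor `eps`, the noise level `std`, and the `patience` value are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def class_weights(labels, n_classes):
    """Inverse-frequency weights (assumed formula): n_samples / (n_classes * count)."""
    counts = np.bincount(labels, minlength=n_classes).astype(float)
    return len(labels) / (n_classes * np.maximum(counts, 1.0))

def smooth_labels(labels, n_classes, eps=0.1):
    """Soft targets: (1 - eps) * one_hot + eps / n_classes (eps is an assumed value)."""
    one_hot = np.eye(n_classes)[labels]
    return one_hot * (1.0 - eps) + eps / n_classes

def add_gaussian_noise(features, std, rng):
    """Add zero-mean Gaussian noise to a feature batch (std is an assumed value)."""
    return features + rng.normal(0.0, std, size=features.shape)

def weighted_cross_entropy(probs, targets, labels, weights):
    """Mean cross-entropy against soft targets, scaled per sample by its class weight."""
    per_sample = -np.sum(targets * np.log(np.clip(probs, 1e-12, 1.0)), axis=1)
    return float(np.mean(weights[labels] * per_sample))

def macro_f1(y_true, y_pred, n_classes):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in range(n_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom > 0 else 0.0)
    return float(np.mean(scores))

class EarlyStopping:
    """Signal a stop once the monitored score fails to improve for `patience` epochs."""

    def __init__(self, patience=10):
        self.patience = patience  # assumed value; not given in the abstract
        self.best = -np.inf
        self.bad_epochs = 0

    def step(self, score):
        if score > self.best:
            self.best = score
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop under these assumptions, `macro_f1` would be computed on the validation split after each epoch and passed to `EarlyStopping.step`, with training halted when it returns `True`. Monitoring Macro F1 rather than accuracy gives each of the three mood classes equal influence on the stopping decision, which matters if the classes are imbalanced.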