Analysis of Tweet Classification by Social Topic Using SVM
DOI: https://doi.org/10.24843/JNATIA.2025.v03.i04.p15
Keywords: Twitter, Topic Classification, TF-IDF, Support Vector Machine
Abstract
Social media platforms, including Twitter (now X), produce a constant flow of user-generated text that reflects public discourse in real time. However, the informal and unstructured nature of these short messages poses challenges for manual topic classification, especially when handling large volumes. This study aims to categorize Indonesian-language tweets into three topics: Politics, Entertainment, and Others, using a supervised machine learning approach. A total of 1,478 tweets were collected through keyword-based scraping and manually labeled according to predefined guidelines. The preprocessing stage included text cleaning, tokenization, stopword removal, stemming, and label encoding. TF-IDF was employed to convert the cleaned text into numerical features, while classification was performed using the Support Vector Machine (SVM) algorithm with a One-vs-Rest strategy for multi-class classification. The model reached an overall accuracy of 84 percent, with particularly high performance in the Politics and Entertainment categories. These results indicate that the combination of TF-IDF and SVM is effective for classifying short Indonesian-language tweets and can be applied to support the organization and filtering of topical content in social media analytics.
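The TF-IDF and One-vs-Rest SVM steps described in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the six toy documents are invented, the tokens are assumed to be already cleaned and stemmed, the smoothed IDF formula is one commonly used variant, and the linear SVM omits the bias term and is trained by simple subgradient descent on the hinge loss. All function names and hyperparameters (`epochs`, `lr`, `lam`) are hypothetical.

```python
import math
from collections import Counter

def tfidf_fit(docs):
    """Learn smoothed IDF weights from tokenized documents."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}

def tfidf_transform(doc, idf):
    """Map one tokenized document to an L2-normalized sparse TF-IDF vector."""
    tf = Counter(t for t in doc if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}

def score(w, x):
    """Dot product between a sparse weight vector and a sparse document vector."""
    return sum(w.get(t, 0.0) * v for t, v in x.items())

def train_binary_svm(vectors, targets, epochs=50, lr=0.1, lam=0.01):
    """Linear SVM without bias: subgradient descent on the L2-regularized hinge loss."""
    w = {}
    for _ in range(epochs):
        for x, y in zip(vectors, targets):
            margin = y * score(w, x)
            for t in w:
                w[t] *= 1.0 - lr * lam      # shrink step from the L2 penalty
            if margin < 1.0:                # hinge loss is active for this sample
                for t, v in x.items():
                    w[t] = w.get(t, 0.0) + lr * y * v
    return w

def train_one_vs_rest(vectors, labels):
    """Train one binary SVM per class: that class (+1) against all others (-1)."""
    return {
        c: train_binary_svm(vectors, [1 if label == c else -1 for label in labels])
        for c in sorted(set(labels))
    }

def predict(x, models):
    """Pick the class whose binary SVM gives the highest decision score."""
    return max(models, key=lambda c: score(models[c], x))

# Invented toy data (assumption): pre-tokenized, already cleaned and stemmed.
train_docs = [
    "pemilu presiden partai".split(),
    "presiden debat pemilu".split(),
    "film konser artis".split(),
    "artis film bioskop".split(),
    "cuaca hujan macet".split(),
    "macet jalan hujan".split(),
]
train_labels = ["Politics", "Politics", "Entertainment",
                "Entertainment", "Others", "Others"]

idf = tfidf_fit(train_docs)
X = [tfidf_transform(d, idf) for d in train_docs]
models = train_one_vs_rest(X, train_labels)

predicted = [predict(x, models) for x in X]
new_label = predict(tfidf_transform("debat presiden".split(), idf), models)
print(predicted)
print(new_label)
```

On real tweets the vocabularies of the three topics overlap, so a bias term, tuned regularization, and a held-out test split would be needed before any accuracy figure is meaningful; the toy data here only shows how the pieces fit together.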
License
Copyright (c) 2025 Abdurrazik, I Made Widhi Wirawan (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.