Context-Based Password Strength Analysis for the Indonesian Language Using Machine Learning (Analisis Kekuatan Kata Sandi Berbasis Konteks Bahasa Indonesia Menggunakan Machine Learning)
DOI:
https://doi.org/10.24843/JNATIA.2026.v04.i03.p17
Keywords:
Password Strength Analysis, Machine Learning, Decision Tree, XGBoost, Cybersecurity, Indonesian Language Context
Abstract
The widespread reliance on password authentication is persistently undermined by users who create contextually weak passwords, a vulnerability often overlooked by standard, English-centric password strength meters. This research addresses that gap by developing and evaluating a machine learning model tailored to password strength analysis in the Indonesian linguistic context. We trained a Decision Tree classifier and benchmarked it against an XGBoost model, using a dataset enriched with local passwords and contextual features, including a custom heuristic score and Levenshtein similarity to a comprehensive Indonesian dictionary. To counter severe class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) was applied to the training data. The XGBoost model achieved the better predictive performance, but the most significant finding came from the feature importance analysis, which showed that the custom heuristic score and password length were the two most dominant predictors. The study shows that a context-aware machine learning approach can effectively analyze password strength, underscoring the need to integrate local linguistic patterns into security mechanisms and providing a foundation for developing more secure authentication systems for Indonesian users.
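As an illustration of the contextual features named in the abstract, the following is a minimal sketch of how password length and a Levenshtein-based dictionary similarity might be computed. It is not the authors' implementation: the word list `SAMPLE_WORDS`, the function names, the normalization by the longer string's length, and the choice to compare only the alphabetic part of the password are all assumptions made for this example. The paper's custom heuristic score is not specified in the abstract and is not reproduced here.

```python
from typing import Iterable

# Hypothetical stand-in for the Indonesian dictionary mentioned in the abstract.
SAMPLE_WORDS = ("sayang", "cinta", "rahasia", "indonesia", "bismillah")


def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insert, delete, substitute each cost 1)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def extract_features(password: str, words: Iterable[str] = SAMPLE_WORDS) -> dict:
    """Return password length and the highest similarity to any listed word."""
    # Assumption: compare only the letters, so "Sayang123" still matches "sayang".
    letters = "".join(ch for ch in password.lower() if ch.isalpha())
    best = max((similarity(letters, w) for w in words), default=0.0)
    return {"length": len(password), "dict_similarity": best}
```

Calling `extract_features("Sayang123")` yields a length of 9 and a dictionary similarity of 1.0, since the alphabetic part matches a listed word exactly. In a pipeline like the one the abstract describes, such features would be passed to a classifier after the training data is rebalanced; that step is omitted here.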
License
Copyright (c) 2026 Putu Dena Satwika Sandi, I Wayan Supriana (Author)

This work is licensed under a Creative Commons Attribution 4.0 International License.