Leprosy is a neglected tropical disease (NTD). Brazil has long been recognized as a leprosy-endemic country and carries the second-highest leprosy burden in the world. In 2023, Brazil reported 22,773 new cases of leprosy, representing 92% of new cases in the Americas and 13% of the 174,087 new cases worldwide. Leprosy can lead to severe physical deformities, making it a highly stigmatizing disease.
This study evaluates four machine learning models, namely Decision Tree, Random Forest, Adaptive Boosting (AdaBoost), and Gradient Boosting (GB), to predict progression of the grade of physical disability. We used real-world data extracted from SINAN, the Brazilian national notifiable disease information system. The dataset contained 12 attributes (including the target class) and 923,920 records of leprosy cases from 2001 to 2023. In the dataset, 157,062 patients showed no change in the grade of impairment function (GIF) from diagnosis to cure, while 12,957 exhibited an increase in GIF. A further 29,905 cases showed a decrease in GIF; these records were excluded from the dataset for model training. After preprocessing, a total of 199,924 records and 12 clinical and sociodemographic variables were selected for training and testing the models. Recall, also referred to as sensitivity, was the primary evaluation metric. Recall is the proportion of actual positive cases that the model correctly identifies….
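To make the recall definition concrete, the following minimal sketch computes recall as TP / (TP + FN) in plain Python. It is an illustration only, not the study's evaluation code: the function name `recall`, the toy labels, and the choice of "GIF increased" as the positive class (coded 1) are assumptions made for this example.

```python
def recall(y_true, y_pred, positive=1):
    """Recall (sensitivity) = TP / (TP + FN) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    # True positives: actual positives the model also predicted as positive.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False negatives: actual positives the model missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive cases in y_true; recall is undefined")
    return tp / (tp + fn)


# Toy labels (hypothetical, not taken from the SINAN data):
# 1 = GIF increased (assumed positive class), 0 = otherwise.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 0, 0]

print(recall(y_true, y_pred))  # 3 of the 4 actual positives are found -> 0.75
```

Note that the false positive in the toy predictions does not affect the result: recall only penalizes missed positive cases, which is why it suits settings where overlooking a patient at risk of worsening disability is the costlier error.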