Background: The purpose of this study was to verify the accuracy and validity of using machine learning (ML) to select risk factors, to identify differences in ML-based feature selection between men and women, and to develop predictive models for patients with osteoporosis in a large database. Methods: Data on 968 observed features were collected from a total of 3,484 participants in the Korea National Health and Nutrition Examination Survey. To find preliminary features that were closely related to osteoporosis, logistic regression, random forest, gradient boosting, adaptive boosting, and support vector machine were used. Results: In osteoporosis feature selection by the 5 ML models in this study, the variables most often selected as risk factors in both men and women were body mass index, monthly alcohol consumption, and dietary survey items. However, the features whose selection by the ML models differed between men and women were age, smoking, and blood glucose level. The receiver operating characteristic (ROC) analysis revealed that the area under the ROC curve (AUC) of each ML model was not significantly different in either men or women. Conclusions: ML performed feature selection for osteoporosis while taking into account hidden differences between men and women. The present study considers the preprocessing of input data and the feature selection process, as well as the ML technique, to be important factors for the accuracy of the osteoporosis prediction model.
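As a small illustration of the ROC analysis mentioned in the Results, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `roc_auc`, the labels, and the two score lists are assumptions made up for this example; they are not the study's data or code, and the abstract does not state which statistical test was used to compare AUCs between models.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; tied pairs count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (1 = osteoporosis, 0 = no osteoporosis).
labels = [1, 1, 0, 1, 0, 0]
scores_a = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # hypothetical model A
scores_b = [0.8, 0.9, 0.5, 0.7, 0.6, 0.1]  # hypothetical model B

auc_a = roc_auc(labels, scores_a)
auc_b = roc_auc(labels, scores_b)
print(f"AUC A = {auc_a:.3f}, AUC B = {auc_b:.3f}")
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; for a dataset of several thousand participants a rank-based or library implementation would be the more practical choice.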