Measures a (by predictive model) | Decision tree | Random forest | Logistic regression | K-nearest neighbors | Linear support vector machine |
---|---|---|---|---|---|
Area under the curve score | 0.64 | 0.71 | 0.72 | 0.58 | 0.71 |
Accuracy score | 0.62 | 0.58 | 0.61 | 0.83 | 0.59 |
Among LBW babies | |||||
 Precision | 0.21 | 0.21 | 0.22 | 0.20 | 0.21 |
 Recall | 0.61 | 0.74 | 0.72 | 0.07 | 0.75 |
 f1-score | 0.31 | 0.33 | 0.34 | 0.10 | 0.33 |
Permutation feature importance b – top variables in order of importance | 1. maternal weight; 2. hypertensive disorders | 1. maternal weight; 2. clinical site; 3. hypertensive disorder; 4. antenatal care; 5. maternal height; 6. antepartum hemorrhage; 7. previous livebirth; 8. parity | 1. clinical site; 2. maternal weight; 3. antenatal care; 4. hypertensive disorder; 5. antepartum hemorrhage; 6. severe infection during delivery; 7. maternal height | 1. maternal weight; 2. maternal height; 3. socio-economic status; 4. antenatal care; 5. parity; 6. previous livebirth; 7. maternal education; 8. vitamin / calcium supplementation | 1. clinical site; 2. maternal weight; 3. antenatal care; 4. hypertensive disorder; 5. antepartum hemorrhage; 6. maternal education; 7. severe infection during delivery |
Hyperparameter values c | Balanced class weights, Gini impurity criterion, minimum samples in a leaf = 2, minimum samples required for a split = 4, all others set to default | Balanced class weights, Gini impurity criterion, number of trees = 500, maximum depth = 8, all others set to default | Balanced class weights, L2 (ridge) regularization penalty, maximum number of iterations for the solver to converge = 5000 | Weights set by distance, leaf size = 20, all others set to default | Linear kernel, balanced class weights, regularization parameter (C) = 100, all others set to default |
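
The f1-score row can be read as the harmonic mean of the precision and recall rows for the LBW class. The sketch below recomputes it from the rounded values in the table, as a consistency check rather than a reproduction of the original analysis. The function name `f1_score` and the dictionary `reported` are illustrative names introduced here. Because the inputs are already rounded to two decimals, agreement is only expected to within about 0.01.

```python
# Values copied from the table: (precision, recall, reported f1-score)
# for the LBW class. The dictionary name is illustrative only.
reported = {
    "Decision tree": (0.21, 0.61, 0.31),
    "Random forest": (0.21, 0.74, 0.33),
    "Logistic regression": (0.22, 0.72, 0.34),
    "K-nearest neighbors": (0.20, 0.07, 0.10),
    "Linear support vector machine": (0.21, 0.75, 0.33),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    for model, (p, r, f1_reported) in reported.items():
        f1 = f1_score(p, r)
        print(f"{model}: recomputed {f1:.2f}, reported {f1_reported:.2f}")
```

The K-nearest neighbors column shows why accuracy alone can mislead here: it has the highest accuracy (0.83) but a recall of only 0.07 among LBW babies, which pulls its f1-score down to 0.10.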