Prediction of newborn’s body mass index using nationwide multicenter ultrasound data: a machine-learning study
BMC Pregnancy and Childbirth volume 21, Article number: 172 (2021)
This study introduced machine learning approaches to predict newborn’s body mass index (BMI) based on ultrasound measures and maternal/delivery information.
Data came from 3159 obstetric patients and their newborns enrolled in a multi-center retrospective study. Variable importance, the effect of a variable on model performance, was used for identifying major predictors of newborn’s BMI among ultrasound measures and maternal/delivery information. The ultrasound measures included biparietal diameter (BPD), abdominal circumference (AC) and estimated fetal weight (EFW) taken three times during the week 21 - week 35 of gestational age and once in the week 36 or later.
Based on variable importance from the random forest, major predictors of newborn’s BMI were the first AC and EFW in the week 36 or later, gestational age at delivery, the first AC during the week 21 - the week 35, maternal BMI at delivery, maternal weight at delivery and the first BPD in the week 36 or later. For predicting newborn’s BMI, linear regression (2.0744) and the random forest (2.1610) were better than artificial neural networks with one, two and three hidden layers (150.7100, 154.7198 and 152.5843, respectively) in the mean squared error.
This is the first machine-learning study with 64 clinical and sonographic markers for the prediction of newborns’ BMI. The week 36 or later is the most effective period for taking the ultrasound measures and AC and EFW are the best predictors of newborn’s BMI alongside gestational age at delivery and maternal BMI at delivery.
Low birthweight and childhood obesity are leading causes of disease burden in the world. One in every seven babies was born with low birthweight (less than 2500 g) worldwide in 2015, and newborns with low birthweight are more likely to die in the first 28 days of life than their normal-weight counterparts. Likewise, 40 million children under the age of five were overweight or obese worldwide in 2016, and childhood overweight or obesity is expected to have short-term and long-term consequences including asthma, depression, diabetes, hypertension, dyslipidemia and cardiovascular disorders. Given this global challenge, member states of the World Health Organization endorsed “No Increase in Childhood Overweight by 2025” as one of six global nutrition targets.
In this context, several retrospective studies of obstetric patients and their newborns endeavored to analyze newborn’s weight and its major predictors [10,11,12,13]. These studies focused on ultrasound measures and maternal/delivery information, while coming from various regions including East Asia (Taiwan), Middle East (Lebanon), North America (United States) and North Europe (Denmark). Based on the linear-regression results of these studies, the following variables were good predictors of newborn’s weight: abdominal circumference or diameter, biparietal diameter, gestational age at delivery, maternal weight at delivery and maternal body mass index (BMI). However, these studies did not address (1) which predictors are more important for the prediction of newborn’s weight and (2) which periods are more effective for taking the ultrasound measures and managing the delivery outcome. Also, existing literature ignores newborn’s BMI and highlights newborn’s weight. However, newborn’s BMI, which has a strong association with newborn’s fat mass, would be a better indicator of newborn’s adiposity, given that newborn’s weight includes not only fat mass but also head size, lean mass and bone mass.
For this reason, this study introduces machine learning approaches to predict newborn’s BMI based on ultrasound measures and maternal/delivery information. Machine learning (or data mining) methods are statistical methods to extract knowledge from large amounts of data. Specifically, the random forest and the artificial neural network (ANN) do not require unrealistic assumptions of linear regression such as ceteris paribus, “all the other variables staying constant”. Also, the random forest can address (1) which predictors are more important for the prediction of newborn’s BMI and (2) which periods are more effective for taking the ultrasound measures and managing the delivery outcome. Indeed, the data in this study are larger than those in the previous studies: 3159 mother-baby pairs and 64 independent variables. This study attempts to demonstrate that machine learning approaches based on ultrasound measures would be a useful noninvasive tool for predicting newborn’s BMI.
Participants and variables
Data came from the medical records of 3159 obstetric patients and their newborns enrolled in a multi-center retrospective study. This study was conducted during September 2019–April 2020 and 48 general hospitals participated in this study. This study was approved by the institutional review boards of the general hospitals. This process was followed by data collection, analysis and interpretation. One hundred women with singleton pregnancies were selected from each of the general hospitals. These women were Korean citizens aged 20–44 years. They gave birth during June 2015–June 2019 and their gestational age at delivery varied from 24 weeks 0 days to 41 weeks 6 days. These women did not have any disease including pre-gestational or gestational diabetes or hypertension. Newborns who were large for gestational age or had fetal growth restriction were included, whereas those with congenital anomalies were excluded.
The dependent variable was newborn’s BMI. Newborn’s weight and height were measured at the time of birth and newborn’s BMI was calculated from these measures. The following 64 independent variables were included in this study. Maternal information covered age (years), term births, preterm births, abortions, children alive, height, pre-gestational weight and weight at delivery, and pre-gestational BMI and BMI at delivery. Gestational age (W/D: weeks/days) and two ultrasound measures were taken once during the week 11 - week 13 (GA11): crown-rump length (CRL) (mm) and nuchal translucency (NT) (mm). These indicators were denoted by GA11W1, GA11D1, GA11CRL1 and GA11NT1. Then, gestational age (W/D: weeks/days) and five ultrasound measures were taken once during the week 14 - week 19 (GA14), once in the week 20 (GA20), three times during the week 21 - week 35 (GA21) and once in the week 36 or later (GA36): biparietal diameter (BPD) (mm), head circumference (HC) (mm), abdominal circumference (AC) (mm), femur length (FL) (mm) and estimated fetal weight (EFW) (g). These indicators were denoted as follows: (1) GA14W1, GA14D1, GA20W1, GA20D1, GA21W1, GA21D1, GA21W2, GA21D2, GA21W3, GA21D3, GA36W1 and GA36D1 (gestational age); (2) GA14BPD1, GA20BPD1, GA21BPD1, GA21BPD2, GA21BPD3 and GA36BPD1 (biparietal diameter); (3) GA14HC1, GA20HC1, GA21HC1, GA21HC2, GA21HC3 and GA36HC1 (head circumference); (4) GA14AC1, GA20AC1, GA21AC1, GA21AC2, GA21AC3 and GA36AC1 (abdominal circumference); (5) GA14FL1, GA20FL1, GA21FL1, GA21FL2, GA21FL3 and GA36FL1 (femur length); and (6) GA14EFW1, GA20EFW1, GA21EFW1, GA21EFW2, GA21EFW3 and GA36EFW1 (estimated fetal weight). For example, GA21BPD1 means the first BPD taken during the week 21 - week 35, whereas GA36EFW1 means the first EFW taken in the week 36 or later. For the calculation of EFW, all general hospitals used Hadlock’s formula (except one general hospital that used a formula from Shinozuka et al.).
These formulas share the same parameters (BPD, AC, FL) and are reported to show similar performance for the prediction of newborn’s weight. Finally, delivery/newborn information covered gestational age at delivery (weeks and days), caesarean delivery (no vs. yes), newborn’s sex - female (no vs. yes), Apgar scores at 1 and 5 min after delivery, and neonatal intensive care unit hospitalization (no vs. yes). These variables had missing rates lower than 30% in general and their missing values were replaced by their median values.
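The two data-preparation steps described above, calculating BMI from weight and height and replacing missing values with the median, can be sketched as follows. This is a minimal Python illustration, not the authors' code (the study reports using R-Studio). The function names, the example newborn and the example GA36AC1 values are hypothetical, and the units (kg and m) are assumed from the kg/m2 unit used for BMI in this article.

```python
from statistics import median


def bmi(weight_kg, height_m):
    """Body mass index in kg/m^2 (weight divided by height squared)."""
    return weight_kg / height_m ** 2


def impute_median(values):
    """Replace missing entries (None) with the median of the observed entries."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


# Hypothetical newborn: 3.2 kg and 0.50 m at birth.
example_bmi = bmi(3.2, 0.50)

# Hypothetical GA36AC1 column (mm) with one missing measurement.
ga36ac1 = [310.0, None, 322.0, 330.0]
ga36ac1_filled = impute_median(ga36ac1)

print(example_bmi)
print(ga36ac1_filled)
```

In this sketch the median is taken over the observed values of the same variable only; whether the study computed medians before or after the training/validation split is not stated in the text.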
Five machine learning methods were applied for predicting newborn’s BMI, the dependent variable of this study: linear regression, random forest and ANNs with one, two and three hidden layers. Each hidden layer had three neurons in this study. Data on 3159 participants were divided into training and validation sets with a 75:25 ratio (2370 vs. 789 observations). The mean squared error (MSE), the average of the squares of errors among 789 observations, was introduced as a criterion for validating the models trained. Here, errors are gaps between actual and predicted values of the dependent variable, newborn’s BMI. Variable importance from the random forest, the effect of a variable on model performance, was used for identifying major predictors of newborn’s BMI among ultrasound measures and maternal/delivery information. R-Studio was employed for the analysis in April 2020.
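The 75:25 split and the MSE criterion described above can be illustrated with a short Python sketch. It is an assumption-laden illustration rather than the authors' R code: the helper names are hypothetical, the records are replaced by integer stand-ins, and rounding the training size up is assumed only because it reproduces the reported 2370 vs. 789 observations.

```python
import math
import random


def train_validation_split(rows, train_fraction=0.75, seed=0):
    """Shuffle the rows and split them into training and validation lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    # Assumption: the training size is rounded up (0.75 * 3159 = 2369.25).
    n_train = math.ceil(train_fraction * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]


def mean_squared_error(actual, predicted):
    """Average of the squared gaps between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


rows = list(range(3159))  # stand-ins for the 3159 mother-newborn records
train, validation = train_validation_split(rows)
print(len(train), len(validation))  # 2370 789

# Hypothetical actual vs. predicted newborn BMI values (kg/m^2).
toy_mse = mean_squared_error([12.0, 13.0, 14.0], [12.5, 13.0, 13.0])
print(toy_mse)
```

Repeating the split with different seeds and averaging the resulting MSEs would correspond to the repeated random splits reported in the results.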
Descriptive statistics for continuous and categorical variables in this study are summarized in Table 1. The median (Q2) values of newborn’s BMI, GA36AC1, GA36EFW1 and gestational age at delivery were 12.74 kg/m2, 322 mm, 2866 g and 38 weeks, respectively. Likewise, the median values of GA21AC1 and maternal BMI at delivery were 214.70 mm and 26.04 kg/m2, respectively. The MSEs of the five machine learning models are shown in Table 2. The random split and the statistical analysis were repeated three times and the average MSE was calculated for each of the five methods, i.e., linear regression, random forest and ANNs with one, two and three hidden layers. Linear regression and the random forest were much better models than the ANNs for predicting newborn’s BMI. The average MSEs of the five models over the three runs were 2.0744, 2.1610, 150.7100, 154.7198 and 152.5843, respectively.
Based on variable importance from the random forest, major predictors of newborn’s BMI were the first AC and EFW in the week 36 or later, gestational age at delivery, the first AC during the week 21 - the week 35, maternal BMI at delivery, maternal weight at delivery and the first BPD in the week 36 or later (Table 3, Table S1 (supplementary information) and Fig. 1). The findings of linear regression present useful information about the effect of a major determinant on newborn’s BMI. For example, newborn’s BMI will increase by 0.0142 if GA36AC1 increases by 1 mm. Likewise, newborn’s BMI will increase by 0.4142 if gestational age at delivery increases by 1 week. It is to be noted, however, that the results of linear regression are based on an unrealistic assumption of ceteris paribus, “all the other variables staying constant”. For this reason, the coefficients of some predictors were statistically significant in linear regression but their importance rankings from the random forest were not high, the random forest being a data-driven approach with no such assumption of “all the other variables staying constant”. In this context, the findings of linear regression are to be considered as just supplementary information to the variable importance from the random forest.
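One common way to quantify “the effect of a variable on model performance” is permutation importance: shuffle one predictor and record how much the MSE worsens. The Python sketch below illustrates that idea only. The article does not state which importance measure its random forest used, so the choice of permutation importance is an assumption, and `toy_predict` is a hypothetical stand-in for a fitted model, not the study's random forest.

```python
import random


def mse(actual, predicted):
    """Mean squared error between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, X, y, column, seed=0):
    """Increase in MSE after shuffling the values of one column of X."""
    baseline = mse(y, [predict(row) for row in X])
    shuffled = [row[column] for row in X]
    random.Random(seed).shuffle(shuffled)
    X_perm = [row[:column] + [v] + row[column + 1:]
              for row, v in zip(X, shuffled)]
    return mse(y, [predict(row) for row in X_perm]) - baseline


def toy_predict(row):
    """Hypothetical model that uses only column 0 (e.g. an AC value in mm)."""
    return 0.01 * row[0] + 9.0


# Hypothetical data: column 0 drives the outcome, column 1 is irrelevant.
X = [[300.0, 1.0], [310.0, 2.0], [320.0, 3.0], [330.0, 4.0], [340.0, 5.0]]
y = [toy_predict(row) for row in X]

importance_used = permutation_importance(toy_predict, X, y, column=0)
importance_unused = permutation_importance(toy_predict, X, y, column=1)
print(importance_used, importance_unused)
```

A predictor the model does not rely on gets an importance of zero, while a predictor the model relies on gets a positive importance whenever the shuffle actually changes its order.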
Findings of study
This study introduced machine learning approaches to predict newborn’s BMI based on ultrasound measures and maternal/delivery information. Based on variable importance from the random forest, the week 36 or later is the most effective period for taking the ultrasound measures and AC and EFW are the best predictors of newborn’s BMI alongside gestational age at delivery and maternal BMI at delivery. These results are consistent with existing literature on the topic [18, 19]. In terms of the MSE for predicting newborn’s BMI, linear regression (2.0744) and the random forest (2.1610) were much better models than ANNs with one, two and three hidden layers (150.7100, 154.7198 and 152.5843, respectively). Indeed, the MSEs of linear regression (2.0744) and the random forest (2.1610) were smaller than the variation of newborn’s BMI (2.4649). This suggests that machine learning approaches based on ultrasound measures would be a useful noninvasive tool for predicting newborn’s BMI.
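The comparison between the MSEs and the variation of newborn’s BMI can be made concrete with a small calculation. The Python sketch below assumes that the reported “variation” (2.4649) is the variance of newborn’s BMI, i.e., the MSE of a baseline that always predicts the mean; under that assumption, one minus the ratio of the two gives the share of variance removed by each model. The function name is hypothetical.

```python
# Values reported in this article.
mse_linear = 2.0744
mse_forest = 2.1610
variation_bmi = 2.4649  # assumed to be the variance of newborn's BMI


def relative_reduction(model_mse, baseline_mse):
    """Share of the baseline error removed by the model (1 - MSE ratio)."""
    return 1.0 - model_mse / baseline_mse


reduction_linear = relative_reduction(mse_linear, variation_bmi)
reduction_forest = relative_reduction(mse_forest, variation_bmi)

print(round(reduction_linear, 3))  # 0.158
print(round(reduction_forest, 3))  # 0.123
```

Under this reading, linear regression and the random forest reduce the error of a mean-only baseline by roughly 16% and 12%, respectively, whereas the ANN errors are far above that baseline.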
The findings of this study are consistent with those of previous retrospective studies on the prediction of newborn’s weight with clinical and sonographic markers. In a study of 238 obstetric patients in Denmark, AC and BPD during the third trimester were effective predictors of newborn’s weight, given that the MSE of linear regression was similar to the variation of newborn’s weight. In a study of 109 pregnant women in the United States, newborn’s weight had positive associations with fetal adiposity in the week 30 and gestational age at delivery. In a study of 1000 obstetric patients in Lebanon, newborns with excessive maternal gestational weight gain were more likely to have macrosomia than those with normal gestational weight gain (Odds Ratio 1.888). Likewise, another study of 110 pregnant women in Taiwan reported that AC and BPD during the week 20 - week 24 are significant predictors of newborn’s weight together with gestational age at delivery, maternal weight at delivery and maternal BMI at delivery. However, the previous studies did not address (1) which predictors are more important for the prediction of delivery outcome and (2) which periods are more effective for taking ultrasound measures and managing delivery outcome. This study provides plausible answers to these challenging questions.
Moreover, conventional studies focus on newborn’s weight as a measure of newborn’s adiposity but the findings of this study suggest that newborn’s BMI would be a good alternative. Firstly, the United States Centers for Disease Control and Prevention recommends the BMI-for-age chart as a screening tool for the overweight and underweight of boys and girls aged 2 to 20 years. Two major rationales behind this recommendation state that (1) the BMI is a more consistent indicator across different generations than weight and (2) the BMI contains the dimensions (and strengths) of weight and height measures at the same time. Secondly, it is reported that newborn’s BMI has stronger correlations with magnetic-resonance-imaging measures of newborn’s fat mass than do newborn’s other anthropometrics. Thirdly, infant’s BMI is expected to have a stronger correlation with early childhood obesity than infant’s weight-for-length. Based on the medical records of 73,949 full-term infants from a large pediatric network, 47% of infants with BMI ≥ 97.7th percentile at 2 months (vs. 29% of infants with weight-for-length ≥ 97.7th percentile at 2 months) were obese at 2 years. Fourthly, using newborn’s BMI (instead of newborn’s weight) would engender greater stability for statistical analysis. For example, the estimations of ANNs with two layers did not converge when newborn’s weight (instead of newborn’s BMI) was the dependent variable in this study.
Limitations of study
This study had some limitations. Firstly, for the calculation of EFW, one general hospital used a different formula. Using the same formula for EFW is expected to improve model performance in future studies. However, the results of this study did not change after removing the data based on the different formula. Secondly, this study did not consider possible mediating effects among variables. Thirdly, it would be a good topic for future research to develop a BMI guideline for newborn’s adiposity. According to an international guideline, the adult categories of underweight, normal, overweight and obesity are defined as BMIs smaller than 18.5 kg/m2, within 18.5–25.0 kg/m2, within 25.0–30.0 kg/m2 and equal to/greater than 30.0 kg/m2, respectively. An equivalent guideline for newborns needs to be developed based on comprehensive and systematic analysis. Fourthly, this study did not consider socioeconomic factors (education, income) and other possible obstetric variables such as periodontitis, upper gastrointestinal tract symptoms, gastroesophageal reflux disease, Helicobacter pylori, pelvic inflammatory disease history, diabetes mellitus (type I, type II, gestational), hypertension (chronic, gestational) and medication history (e.g., progesterone, calcium channel blocker, nitrate, tricyclic antidepressant, benzodiazepine and sleeping pills). Recent studies on preterm birth reported that these factors would affect the delivery outcome [24, 25] and it would be an important contribution to extend this study based on these new variables. Fifthly, further analysis of specific patients, e.g., symptomatic vs. asymptomatic, single vs. multiple gestation, would offer more insight on this line of research with more detailed clinical implications. Sixthly, this study did not consider various options of parameter tuning for the ANN. Its performance was worse than those of linear regression and the random forest in this study.
Finding optimal parameters for the ANN is reported to be a challenging task and it will be a good topic for future research. Seventhly, the focus of this study was to find important predictors of newborn’s BMI. Exploring possible mechanisms between each important predictor and newborn’s BMI is expected to make a good contribution to this line of research. Finally, the values of the following variables outside 1.5*(Interquartile Range), so-called “outliers”, were deleted in this study: maternal weight at delivery, GA11CRL1, GA20BPD1, GA20FL1, GA21BPD1, GA21FL1, GA21BPD2, GA21FL2, GA21BPD3 and GA21FL3. It was beyond the scope of this study to evaluate other optimal strategies to handle outliers in the data.
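The outlier rule mentioned above can be sketched in Python as follows. The sketch assumes the conventional reading of the 1.5*(Interquartile Range) rule, in which values below Q1 - 1.5*IQR or above Q3 + 1.5*IQR are treated as outliers; the article does not spell out the bounds or the quartile method, so both are assumptions, and the maternal weights shown are hypothetical.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Lower and upper fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(values, k=1.5):
    """Keep only the values that fall within the fences."""
    low, high = iqr_bounds(values, k)
    return [v for v in values if low <= v <= high]


# Hypothetical maternal weights at delivery (kg); 120.0 is an extreme value.
weights = [58.0, 61.0, 63.5, 65.0, 66.0, 68.0, 70.5, 72.0, 120.0]
bounds = iqr_bounds(weights)
kept = drop_outliers(weights)

print(bounds)
print(kept)
```

Different quartile conventions shift the fences slightly, which is one reason alternative outlier strategies could change which observations are retained.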
Conclusions of study
The week 36 or later is the most effective period for taking the ultrasound measures and AC and EFW are the best predictors of newborn’s BMI alongside gestational age at delivery and maternal BMI at delivery. Machine learning approaches based on ultrasound measures would be a useful noninvasive tool for predicting newborn’s BMI.
Availability of data and materials
The datasets used and/or analysed during the current study are available from the corresponding authors on reasonable request.
Abbreviations
- AC :
Abdominal circumference
- ANN :
Artificial neural network
- BMI :
Body mass index
- BPD :
Biparietal diameter
- CRL :
Crown-rump length
- EFW :
Estimated fetal weight
- FL :
Femur length
- HC :
Head circumference
- IRB :
Institutional review board
- MSE :
Mean squared error
- NT :
Nuchal translucency
- GA11 :
Gestational age, week 11 - week 13
- GA14 :
Gestational age, week 14 - week 19
- GA20 :
Gestational age, week 20
- GA21 :
Gestational age, week 21 - week 35
- GA36 :
Gestational age, week 36 or later
- W/D :
Gestational age - weeks/days
Blencowe H, Krasevec J, de Onis M, Black RE, An X, Stevens GA, et al. National, regional, and worldwide estimates of low birthweight in 2015, with trends from 2000: a systematic analysis. Lancet Glob Health. 2019;7(7):e849–60.
World Health Organization. Global database on child health and malnutrition. Geneva: WHO; 2019. http://www.who.int/nutgrowthdb/estimates/en/. Accessed 24 Apr 2020.
Lang JE, Bunnell HT, Hossain MJ, Wysocki T, Lima JJ, Finkel TH, et al. Being overweight or obese and the development of asthma. Pediatrics. 2018;142(6). pii: e20182119.
Quek YH, Tam WWS, Zhang MWB, Ho RCM. Exploring the association between childhood and adolescent obesity and depression: a meta-analysis. Obes Rev. 2017;18(7):742–54.
Pulgaron ER, Delamater AM. Obesity and type 2 diabetes in children: epidemiology and treatment. Curr Diab Rep. 2014;14(8):508.
Brady TM. Obesity-related hypertension in children. Front Pediatr. 2017;5:197.
Cook S, Kavey RE. Dyslipidemia and pediatric obesity. Pediatr Clin N Am. 2011;58(6):1363–73 ix.
Raj M. Obesity and cardiovascular risk in children and adolescents. Indian J Endocr Metab. 2012;16:13–9.
Di Cesare M, Sorić M, Bovet P, Miranda JJ, Bhutta Z, Stevens GA, et al. The epidemiological burden of obesity in childhood: a worldwide epidemic requiring urgent action. BMC Med. 2019;17(1):212.
Secher NJ, Djursing H, Hansen PK, Lenstrup C, Sindberg Eriksen P, Thomsen BL, et al. Estimation of fetal weight in the third trimester by ultrasound. Eur J Obstet Gynecol Reprod Biol. 1987;24(1):1–11.
Ikenoue S, Waffarn F, Sumiyoshi K, Ohashi M, Ikenoue C, Buss C, et al. Association of ultrasound-based measures of fetal body composition with newborn adiposity. Pediatr Obes. 2017;12(Suppl 1):86–93.
Papazian T, Abi Tayeh G, Sibai D, Hout H, Melki I, Rabbaa KL. Impact of maternal body mass index and gestational weight gain on neonatal outcomes among healthy middle-eastern females. PLoS One. 2017;12(7):e0181255.
Su CF, Tsai HJ, Lin CY, Ying TH, Wang PH, Chen GD. Prediction of newborn birth weight based on the estimation at 20-24 weeks of gestation. Taiwan J Obstet Gynecol. 2010;49(3):285–90.
Hadlock FP, Harrist RB, Sharman RS, Deter RL, Park SK. Estimation of fetal weight with the use of head, body, and femur measurements - a prospective study. Am J Obstet Gynecol. 1985;151(3):333–7.
Shinozuka N, Okai T, Kohzuma S, Mukubo M, Shih CT, Maeda T, et al. Formulas for fetal weight estimation by ultrasound measurements based on neonatal specific gravities and volumes. Am J Obstet Gynecol. 1987;157(5):1140–5.
Melamed N, Yogev Y, Meizner I, Mashiach R, Bardin R, Ben-Haroush A. Sonographic fetal weight estimation: which model should be used? J Ultrasound Med. 2009;28(5):617–29.
Han J, Micheline K. Data mining: concepts and techniques. Second ed. San Francisco: Elsevier; 2006.
Ciobanu A, Khan N, Syngelaki A, Akolekar R, Nicolaides KH. Routine ultrasound at 32 vs 36 weeks’ gestation: prediction of small-for-gestational-age neonates. Ultrasound Obstet Gynecol. 2019;53(6):761–8.
Khan N, Ciobanu A, Karampitsakos T, Akolekar R, Nicolaides KH. Prediction of large-for-gestational-age neonate by routine third-trimester ultrasound. Ultrasound Obstet Gynecol. 2019;54(3):326–33.
United States Center for Disease Control and Prevention. Using the CDC BMI-for-age growth charts to assess growth in the United States among children and teens aged 2 years to 20 years. https://www.cdc.gov/nccdphp/dnpao/growthcharts/training/bmiage/index.html. Accessed 24 Apr 2020.
Stokes TA, Kuehn D, Hood M, Biko DM, Pavey A, Olsen C, et al. The clinical utility of anthropometric measures to assess adiposity in a cohort of prematurely born infants: correlations with MRI fat quantification. J Neonatal Perinatal Med. 2017;10(2):133-8.
Roy SM, Spivack JG, Faith MS, Chesi A, Mitchell JA, Kelly A, et al. Infant BMI or weight-for-length and obesity risk in early childhood. Pediatrics. 2016;137(5):e20153492.
United States Center for Disease Control and Prevention. Defining adult overweight and obesity. https://www.cdc.gov/obesity/adult/defining.html/. Accessed 24 Apr 2020.
Lee KS, Ahn KH. Artificial neural network analysis of spontaneous preterm labor and birth and its major determinants. J Korean Med Sci. 2019;34(16):e128.
Lee KS, Song IS, Kim ES, Ahn KH. Determinants of spontaneous preterm labor and birth including gastroesophageal reflux disease and periodontitis. J Korean Med Sci. 2020;35(14):e105.
We thank the following researchers and hospitals that participated in this study: Korea University Anam Hospital (AKH, KHY, and LKS), Kangwon National University Hospital (NSH, LSJ, and KSO), Konkuk University Hospital (HHS), Ewha Womans University Hospital (PMH), Catholic University of Korea Seoul St. Mary’s Hospital (KHS), Catholic University of Korea Eunpyeong St. Mary’s Hospital (KJY), CHA Gangnam Medical Center (KMY), Kyung Hee University Hospital at Gangdong (SHJ), Hallym University Kangdong Sacred Heart Hospital (MJS), Gangneung Asan Hospital (JDH), Kangbuk Samsung Hospital (SJH), Konyang University Hospital (KTY), Kyungpook National University Hospital (SWJ), Gyeongsang National University Hospital (PJK), Keimyung University Dongsan Medical Center (BJG), Korea University Guro Hospital (CGJ), Hanyang University Guri Hospital (BHY), National Health Insurance Service Ilsan Hospital (KEH), Gachon University Gil Hospital (KSY), Dankook University Hospital (KYD), Daegu Catholic University Medical Center (HSY), Hallym University Dongtan Sacred Heart Hospital (KKS), Pusan National University Hospital (KSC), Inje University Busan Paik Hospital (KYN), Catholic University of Korea Bucheon St. Mary’s Hospital (SKJ and SJE), CHA Bundang Medical Center (LJY), Sungkyunkwan University Samsung Medical Center (OSY), Inje University Sanggye Paik Hospital (SYS), Seoul National University Hospital (LSM), Seoul National University Boramae Medical Center (KBJ), Soon Chun Hyang University Hospital (CGY), Ulsan University Asan Medical Center (WHS and LMY), Yonsei University Sinchon Severance Hospital (KYH), Pusan National University Yangsan Hospital (LDH), Ulsan University Hospital (LSJ), Inha University Hospital (CSR), Dongguk University Ilsan Hospital (PHS), Inje University Ilsan Paik Hospital (KHS), Chonnam National University Hospital (KYH and KJW), Jeonbuk National University Hospital (JYJ and LDH), Jeonju Presbyterian Medical Center (KKJ), Jeju National University Hospital (KHS), Chosun University Hospital (CSJ and CJH), Chung-Ang University Hospital (KGJ), Soon Chun Hyang University Cheonan Hospital (KYS), Chungnam National University Hospital (LMA), Hallym University Kangnam Sacred Heart Hospital (SJE), Hanyang University Hospital (HJK).
Ethics approval and consent to participate
This study was approved by institutional review boards of forty-eight hospitals such as Korea University Anam Hospital (2019AN0433) participating in the study. Informed consent was waived by the institutional review boards. No administrative permissions or licenses were acquired by the authors to access the data used in this study.
Consent for publication
Competing interests
The authors declare that they have no competing interests.
Cite this article
Lee, KS., Kim, H.Y., Lee, S.J. et al. Prediction of newborn’s body mass index using nationwide multicenter ultrasound data: a machine-learning study. BMC Pregnancy Childbirth 21, 172 (2021). https://doi.org/10.1186/s12884-021-03660-5
- Body mass index
- Estimated fetal weight
- Abdominal circumference