Apgar score and neonatal mortality in China: an observational study from a national surveillance system

Background To examine the association between the Apgar score and neonatal mortality across gestational ages in China and to explore whether this association changed when Apgar scores at 1 and 5 min were combined.
Methods Data for all singleton live births collected from 438 hospitals between 2012 and 2016 were used in this study. Poisson regression with a robust variance estimator, adjusted for a complete set of confounders, was used to describe the strength of the association between the Apgar score and neonatal mortality.
Results The relative risk of neonatal death associated with an intermediate Apgar score at 5 min peaked at 39–40 weeks of gestation and decreased as the gestational age increased to 42 weeks or above, in contrast to that associated with a low Apgar score. Among both preterm and term new-borns, within each Apgar score group at 5 min, new-borns that were not small for gestational age had a lower mortality rate than those that were small for gestational age. The association between the Apgar score and neonatal mortality was even stronger when scores at 1 and 5 min were combined.
Conclusions The Apgar score is not only meaningful for preterm new-borns but also useful for term new-borns, especially term new-borns that are not small for gestational age. Once the baby’s Apgar score worsens, timely intervention is needed. There is still a gap between China and high-income countries in terms of sustained treatment of new-borns with low Apgar scores.
Supplementary Information The online version contains supplementary material available at 10.1186/s12884-020-03533-3.


Background
China's national under-five mortality rate (U5MR) declined from 61 per 1000 live births in 1991 to 11 per 1000 live births in 2015 [1], well ahead of the target for 2015 (20.3 per 1000 live births) set by the Millennium Development Goals. Among all under-five deaths in 2016, 73.8% were concentrated in the infant group (age < 1 year), and 47.9% were neonatal deaths [2]. Preterm birth complications (mainly premature delivery or low birth weight), intrapartum-related complications (mainly birth asphyxia) and congenital abnormalities were the three major causes of neonatal death in China [3]. Neonatal resuscitation in immediate new-born care plays a very important and effective role in improving birth asphyxia-related outcomes [4]. At present, nearly all women in China choose to give birth in hospitals; thus, new-borns can receive good care from healthcare workers to minimize the risk of early neonatal mortality due to avoidable causes, such as birth asphyxia. After the Chinese government released its two-child policy in October 2015 [5], many Chinese couples were allowed and encouraged to have a second child from January 1, 2016. Because of the resulting rapid increase in the number of births, there is an urgent need to evaluate whether the existing neonatal status assessment methods are good enough to meet the needs of obstetricians and neonatologists.
The Apgar score has been used to quickly evaluate the physical condition of new-borns after delivery for more than 60 years [6] and is routinely used in obstetrics for every delivery in China. Nevertheless, Apgar scores are often used in the rapid assessment of asphyxia severity in clinical practice in China, although most experts have indicated that Apgar scores should not be used alone to diagnose birth asphyxia, as pointed out in Chinese textbooks of paediatrics [7]. Studies have shown that the Apgar score at 5 min after birth is related to neonatal mortality [8]. A cohort study in the UK from 1992 to 2010 suggested that the mortality rate increased with declining Apgar score at 5 min [9]. However, there are differences in neonatal resuscitation capacity, human resources, and ethics group between China and high-income countries. Whether the association between Apgar scores and neonatal mortality is different in China is uncertain, and the effect on neonatal mortality of the difference between Apgar scores at 1 min and 5 min has not been shown. Gestational age and birthweight are important indicators in predicting the health status of new-borns, but the association between neonatal mortality and Apgar score stratified by both gestational age and birthweight has also not been reported.
In this study, we first analysed data collected in over 400 hospitals from 2012 to 2016 in China to characterize maternal characteristics and delivery information in relation to Apgar scores at 5 min after birth. Second, we evaluated whether there is an association between Apgar score at 5 min and neonatal mortality in China and whether the association differed across gestational age groups after adjusting for a complete set of covariates. We stratified the analysis by small for gestational age (SGA) categories to further understand the association between neonatal mortality and Apgar score at 5 min in different gestational age groups. The Apgar score is generally calculated at 1 and 5 min after birth; thus, the effect of the Apgar score on neonatal mortality was also examined when scores at both 1 min and 5 min were combined.

Data sources
Data used in the study were from China's National Maternal Near Miss Surveillance System (NMNMSS), covering 441 sampled hospitals selected from 30 provinces in China. The sampling details have been reported elsewhere [10,11]. Within each of the sampled districts or counties, two health facilities with more than 1000 deliveries per year were selected (or one facility if only one was available). Because some districts or counties did not have hospitals with the necessary number of births, large hospitals in urban districts were oversampled. As a result, urban populations were overrepresented in the NMNMSS, particularly in the central and western regions.
For each pregnant woman or woman who was admitted to surveillance facilities up to 42 days postpartum, sociodemographic and obstetric information was collected prospectively from admission to discharge. Foetal information was also collected, including weight at birth, Apgar score and status of life. The NMNMSS was designed to enumerate all maternal deaths and near misses (women who nearly died from a severe complication of pregnancy or delivery) in health facilities based on an individual questionnaire, meaning that the life status of the new-born could only be tracked prospectively before the mother was discharged from the hospital. Data on the new-born's life status were collected in three ways: evaluation of maternal medical records after discharge of the mother if she gave birth in the monitoring hospital; verbal inquiry of the mother or her family if she gave birth in another place; or evaluation of the infant's medical records if the infant was transferred to the neonatology department. The data provided to us were de-identified.

Definitions
We obtained NMNMSS data for all new-borns delivered in 438 hospitals (3 hospitals excluded because data were not reported since 2012) between Jan 1, 2012, and Dec 31, 2016. Because a report from The New England Journal of Medicine showed that the survival rate reached 81% among those born at 26 weeks of gestation [12], our analysis was restricted to singleton births born alive with a gestational age at delivery equal to or greater than 26 weeks. The gestational age in China is generally ascertained on the basis of the last menstrual period or ultrasound when the date of the last menstrual period is not known or when the menstrual cycle is irregular. The current gestational age is recorded in the maternal health booklet at each antenatal visit of a pregnant woman. We classified Apgar scores into three groups: low (Apgar 0-3), intermediate (Apgar 4-6), and normal (Apgar 7-10). The outcome was neonatal death (from birth to the date when the mother was discharged from the hospital). We excluded babies whose mothers had remained in the hospital after delivery for more than 27 days to ensure that all death cases in the study were neonatal deaths. We excluded babies delivered from abortion, including spontaneous abortion, induced abortion and medical abortion. In China, the government recommends five or more antenatal visits in rural areas and eight or more in urban areas, so the number of antenatal care visits during pregnancy was categorized as 0, 1-4, 5-7, or ≥ 8. Preterm was defined as a live birth at less than 37 weeks of gestation but equal to or more than 26 completed weeks, term was defined as a live birth between 37 and 41 completed weeks, and post-term was defined as a live birth at 42 or more gestational weeks. Standard definitions were used for low birthweight (< 2500 g) [13], normal birthweight (2500-4000 g), and macrosomia (≥4000 g) [14].
SGA was defined as weighing less than the 10th percentile based on gestational age-specific birth weight percentiles for male and female infants in China [15,16]. We also used a second definition for SGA according to the global INTERGROWTH-21st standards [17] and compared the results between the Chinese standards and the global standards. We classified maternal complications into mutually exclusive categories of direct obstetric complications and medical diseases. Direct obstetric complications included uterine rupture, placenta praevia, abruptio placentae, unspecified antepartum haemorrhage, pre-eclampsia, eclampsia, HELLP syndrome or any foetal malpresentation (breech, shoulder or other). Medical diseases included heart disease, embolism/thrombophlebitis, hepatic disease, severe anaemia (haemoglobin < 70 g/L), renal disease (including urinary tract infection), lung disease (including upper respiratory tract infection), HIV/AIDS, connective tissue disorders, gestational diabetes mellitus and cancer.
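The categorical definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the SGA check takes the 10th-percentile cutoff as an argument because the sex- and gestational-age-specific reference tables [15,16] are not reproduced here.

```python
from typing import Optional


def apgar_group(score: int) -> str:
    """Classify an Apgar score as low (0-3), intermediate (4-6) or normal (7-10)."""
    if not 0 <= score <= 10:
        raise ValueError("Apgar score must be between 0 and 10")
    if score <= 3:
        return "low"
    if score <= 6:
        return "intermediate"
    return "normal"


def gestational_group(completed_weeks: int) -> Optional[str]:
    """Preterm (26-36), term (37-41) or post-term (>= 42 completed weeks).

    Returns None below 26 weeks, which the analysis excluded.
    """
    if completed_weeks < 26:
        return None
    if completed_weeks < 37:
        return "preterm"
    if completed_weeks <= 41:
        return "term"
    return "post-term"


def antenatal_visit_group(visits: int) -> str:
    """Categorize the number of antenatal visits as 0, 1-4, 5-7 or >=8."""
    if visits < 0:
        raise ValueError("Number of visits cannot be negative")
    if visits == 0:
        return "0"
    if visits <= 4:
        return "1-4"
    if visits <= 7:
        return "5-7"
    return ">=8"


def is_sga(birthweight_g: float, p10_cutoff_g: float) -> bool:
    """SGA: birthweight below the 10th percentile for gestational age and sex.

    The cutoff must be looked up by the caller in a reference table.
    """
    return birthweight_g < p10_cutoff_g
```

Keeping the percentile lookup outside `is_sga` means the same check could be run against either the Chinese or the INTERGROWTH-21st reference without changing the function.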

Statistical analysis
An alluvial diagram was used to show the association of new-borns with low and intermediate Apgar scores at 5 min and neonatal deaths under different types of maternal complications. Since the NMNMSS oversampled large urban hospitals, we weighted the neonatal mortality rates for the sampling distribution of the population according to the 2010 census of China, as detailed elsewhere [10,11,18]. Relative risks (RRs) and 95% confidence intervals (CIs) were used to describe the strength of the association between the Apgar score and neonatal mortality. RRs were calculated using a Poisson regression with a robust variance estimator adjusted for a complete set of confounders [19], taking into account the clustering of live births within monitoring facilities. The "ggalluvial" package of R (version 3.6.1) was used to produce the alluvial diagrams. All other analyses were performed with Stata (version 15.1).
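To make the two quantities described above concrete, the following Python sketch computes a weighted mortality rate and a crude (unadjusted) relative risk with a large-sample confidence interval. It is a simplified illustration under stated assumptions, not the study's Stata analysis: the function names, the stratum-weight scheme and all counts are hypothetical, and the interval is a plain Wald interval on the log scale rather than the adjusted, cluster-robust Poisson estimate. In the unadjusted two-group case, the Poisson point estimate reduces to this simple ratio of risks.

```python
import math


def weighted_mortality_rate(strata):
    """Weighted deaths divided by weighted births across sampling strata.

    Each stratum is a dict with 'births', 'deaths' and 'weight'
    (hypothetical structure; the weight rescales an over- or
    under-sampled stratum to its population share).
    """
    weighted_deaths = sum(s["weight"] * s["deaths"] for s in strata)
    weighted_births = sum(s["weight"] * s["births"] for s in strata)
    return weighted_deaths / weighted_births


def crude_relative_risk(deaths_exposed, n_exposed, deaths_ref, n_ref, z=1.96):
    """Crude RR with a Wald confidence interval computed on the log scale."""
    rr = (deaths_exposed / n_exposed) / (deaths_ref / n_ref)
    # Standard large-sample standard error of log(RR) for a 2x2 table.
    se_log_rr = math.sqrt(
        1 / deaths_exposed - 1 / n_exposed + 1 / deaths_ref - 1 / n_ref
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, for illustration only (not study data).
strata = [
    {"births": 1000, "deaths": 1, "weight": 0.5},  # oversampled stratum
    {"births": 500, "deaths": 2, "weight": 2.0},   # undersampled stratum
]
rate = weighted_mortality_rate(strata)
rr, lower, upper = crude_relative_risk(30, 100, 10, 1000)
print(f"weighted rate = {rate:.4f}")
print(f"crude RR = {rr:.1f} (95% CI {lower:.1f}-{upper:.1f})")
```

An adjusted analysis would instead fit a Poisson model with the covariates as regressors and a variance estimator clustered on hospital, which this sketch does not attempt.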

Ethics approval
This study was approved by the ethics committee of the West China Second University Hospital (protocol ID, 2012008).

Results
From Jan 1, 2012, to Dec 31, 2016, the NMNMSS recorded 6,620,684 singleton live births with at least 26 completed gestational weeks. A total of 7633 deaths occurred before the mother was discharged from the hospital, which gave a neonatal mortality rate of 0.11% after adjusting for the sampling distribution of the population. Table 1 shows the maternal characteristics and delivery details in relation to Apgar score at 5 min. A vast majority of neonates (6,531,945, 99.66%) had a normal Apgar score at 5 min after birth. Compared with women who gave birth to infants with a normal Apgar score at 5 min (7-10), mothers of infants with a low Apgar score (0-3) or intermediate Apgar score (4-6) at 5 min were more likely to be young (under 20 years old) or of advanced maternal age (over 35 years old), to have a lower educational level, to have delivered more than once, to have received few antenatal visits, and to have used general anaesthesia during childbirth. Compared with neonates whose Apgar scores at 5 min were 7-10, neonates with low Apgar scores or intermediate Apgar scores also had a much larger proportion of noncephalic presentation, premature birth, and low birthweight (< 2500 g).
Among new-borns with low Apgar scores at 5 min whose mothers had direct obstetric complications, nearly half died (Fig. 1). Among new-borns with low Apgar scores whose mothers had medical diseases, the proportion of neonatal deaths was close to 1/3. Among neonatal deaths with low Apgar scores, almost half of
The weighted neonatal mortality rate among births with a low Apgar score at 5 min was 28.72%, which was higher than that among births with an intermediate (8.28%) or normal Apgar score (0.06%). We examined the weighted neonatal mortality rates stratified by gestational age and Apgar score at 5 min (Fig. 2). The results showed that the neonatal mortality rate of births with a low Apgar score (0-3) was higher than that of births with a normal (7-10) or intermediate Apgar score (4-6) in each gestational age group. Among births with low Apgar scores (0-3), the neonatal mortality rate decreased progressively with gestational age (26-40 weeks) but increased if pregnancy was prolonged to over 41 weeks of gestation. This pattern differed from the trend among new-borns with normal or intermediate Apgar scores, whose mortality rates continued to decline with gestational age. The preterm birth group with a low Apgar score at 5 min had the highest rate of death (52.29%) compared with the rates of the term birth group (14.67%) and post-term group (30.08%).
Apgar score stratified by gestational age at birth was strongly associated with neonatal mortality. Compared with those of a normal Apgar score at 5 min, the RRs of neonatal death associated with a low Apgar score at 5 min increased greatly and progressively with advancing gestational age after adjustment for several related factors. The peak was apparent at gestational ages of 42 weeks or above (Fig. 3). In contrast, the RRs associated with intermediate Apgar scores at 5 min after birth peaked at 39-40 weeks of gestation and subsequently decreased as the gestational age increased to 42 weeks or above while remaining statistically significant. Because the distribution of birthweight groups varied across the Apgar score groups at 5 min (Table 1), we stratified the analysis by SGA categories to further examine the association between neonatal mortality and Apgar score at 5 min among the different gestational age groups. As shown in Table 2, in both the preterm and term new-born groups, babies who were not SGA had a lower mortality rate than those that were SGA, within each Apgar score group at 5 min. Regardless of whether the birth was preterm with or without SGA or term with or without SGA, the neonatal mortality rate decreased with increasing Apgar score. Compared with that for neonatal mortality among births with normal Apgar scores, the adjusted RR for neonatal mortality among births with low Apgar scores was 43.96 (95% CI 36.98-52.26) in the group of preterm births that were SGA, which was much lower than that in the group of term births that were not SGA (adjusted RR 392.76, 95% CI 318.69-484.03). The same pattern was found in the intermediate Apgar score group. When the global INTERGROWTH-21st standards were used to define the SGA group, similar results were found. The biggest difference between the results of these two standards was that the number of SGA cases calculated based on the global INTERGROWTH-21st standards was smaller than that calculated by Chinese standards.
A low Apgar score at 5 min was strongly associated with neonatal mortality (adjusted RR 126.50, 95% CI 107.35-149.06), and an intermediate Apgar score at 5 min was also associated with neonatal mortality (adjusted RR 30.27, 95% CI 26.11-35.10). However, the effect of the Apgar score was even stronger when scores at 1 and 5 min were combined (Table 3). Compared with that in the group with scores of 7-10 at both 1 and 5 min, the risk of neonatal death increased by over 200-fold when both scores were 3 or lower. Even new-borns who recovered from a low Apgar score at 1 min to a normal Apgar score at 5 min had a nearly 13-fold increased risk of neonatal death compared with those with normal scores at both 1 and 5 min.
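The combined exposure described above amounts to cross-classifying each birth by its Apgar group at 1 min and at 5 min, with normal scores at both time points as the reference. A minimal Python sketch of that grouping follows; the function and constant names are hypothetical and are not taken from the study's code.

```python
# Ordered so that a larger index means a better score group.
GROUP_ORDER = ["low", "intermediate", "normal"]

# Reference category: normal scores (7-10) at both 1 and 5 min.
REFERENCE = ("normal", "normal")


def apgar_group(score):
    """Classify an Apgar score as low (0-3), intermediate (4-6) or normal (7-10)."""
    if not 0 <= score <= 10:
        raise ValueError("Apgar score must be between 0 and 10")
    if score <= 3:
        return "low"
    if score <= 6:
        return "intermediate"
    return "normal"


def combined_apgar_group(score_1min, score_5min):
    """Return the (1-min group, 5-min group) pair, one of nine combinations."""
    return apgar_group(score_1min), apgar_group(score_5min)


def trajectory(score_1min, score_5min):
    """Describe whether the score group improved, worsened or stayed the same."""
    first, second = combined_apgar_group(score_1min, score_5min)
    change = GROUP_ORDER.index(second) - GROUP_ORDER.index(first)
    if change > 0:
        return "improved"
    if change < 0:
        return "worsened"
    return "unchanged"
```

For example, `combined_apgar_group(2, 8)` places a birth in the group that recovered from a low score at 1 min to a normal score at 5 min.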

Discussion
In our analysis, 7633 deaths at 26 or more completed weeks of gestation were reported in 438 health facilities between 2012 and 2016, giving a weighted neonatal mortality of 1.1 per 1000 live births, which was similar to the rate reported in the UK [9]. Among births between 26 and 36 weeks of gestation, the adjusted overall neonatal mortality was 1.18%, similar to the rate estimated in Dallas, TX, USA (1.02%) [20]. As a value that quantifies the effects of obstetric anaesthesia, the 10-point Apgar score, regardless of underlying cause, has been routinely used worldwide for more than 60 years, since 1953, to quickly and summarily assess the condition and prognosis of every new-born child [6]. The Apgar score at 5 min after birth has been used more widely as an index of the early neonatal condition than the 1-min Apgar score [21]. Despite the warning against overinterpretation of the score for predicting children's outcomes that has been in place since the Apgar score was proposed [6], associations between the Apgar score and short-term or long-term health outcomes have still been reported [9,20]. Several studies have argued that the Apgar score is antiquated because of the dramatic changes in the care of new-borns over the past 60 years, but other studies have still found the Apgar score useful for evaluating the risk of neonatal death clinically [22,23]. With advances in technology, there are indeed some more accurate assessment methods, such as blood pH, umbilical cord arterial lactate, base excess (BE) and other indicators that reflect metabolic acidosis. However, these advanced indicators are not available in all hospitals in China. Only a small number of high-level hospitals (Level 3 hospitals) can provide these advanced tests, whereas the large number of low-level hospitals (Level 1 and Level 2 hospitals) cannot. In addition, it takes a long time to obtain the results of these indicators.
The advantage of the Apgar score over these advanced indicators is that it is immediately available on site, and the results based on the score can also be used for timely intervention treatment. Therefore, the Apgar score has been used clinically to guide neonatal resuscitation. Given that the Apgar score does have some subjectivity, it should be assessed by both the obstetrician and the neonatologist to improve the accuracy. In China, the Apgar score has been used for many years and has become a routine assessment that obstetrics specialists need to perform immediately after childbirth. It is important to note that although the overall mortality rates among neonates born before term are close to those of high-income countries, the neonatal mortality rates in both the low- and intermediate-Apgar score groups are much higher in our study (524 per 1000 live births and 132 per 1000 live births, respectively) than those of high-income countries [20]. One possible reason is that there is still a gap between China and high-income countries in terms of the sustained treatment of preterm infants with low Apgar scores. Another possible and unavoidable reason is that, given the poor long-term outcomes of preterm infants with low Apgar scores and limited family finances, family members are more likely to give up treatment for these new-borns in China. With the growing number of births in China since the introduction of the universal two-child policy in October 2015 [24], the Apgar score may still be a useful indicator for rapidly predicting the risk of death in the neonatal period.
a Weighted for sampling distribution of the population.
b RRs were adjusted for hospital level, maternal age, maternal education, maternal marital status, history of caesarean section, number of antenatal visits, neonatal sex, use of anaesthesia, mode of delivery, foetal presentation, days that the mother stayed in the hospital after delivery, maternal complications and year of delivery (2012, 2013, 2014, 2015, 2016).
c RRs were adjusted for hospital level, maternal age, maternal education, maternal marital status, history of caesarean section, number of antenatal visits, neonatal sex, use of anaesthesia, mode of delivery, foetal presentation, birthweight by gestational age, days that the mother stayed in the hospital after delivery, maternal complications and year of delivery (2012, 2013, 2014, 2015, 2016).
It is generally accepted that neonatal mortality is associated with gestational age [25], and preterm birth accounts for 75% of perinatal deaths [26]. In our analysis, the proportion of neonates with low Apgar scores at 5 min decreased rapidly from 17.57% at 26 completed weeks of gestation to 0.12% at 37 weeks (Additional file 1). This result is consistent with a previous finding indicating that births before 37 weeks of gestational age usually have a higher frequency of low Apgar scores at 5 min [27]. It is necessary to note that neonatal mortality related to Apgar score is influenced by gestational age and that the effect of gestational age is different between the low and intermediate Apgar score groups. There is no doubt that a decreased Apgar score is related to an increased risk of neonatal mortality in all gestational age groups. However, the relative risk of neonatal mortality associated with an intermediate Apgar score decreased after 40 completed weeks of gestational age; conversely, the relative risk associated with a low Apgar score subsequently increased. This pattern is not consistent with the findings of the UK study [9], in which the relative risks in both the low and intermediate Apgar score groups decreased after 41 weeks of gestational age. This may suggest that the treatment of new-borns with low Apgar scores at 5 min and over 40 completed weeks of gestational age in China is less effective than that in high-income countries. In addition, according to clinical guidelines in China, pregnancies are commonly terminated once the gestational age exceeds 41 weeks [28]. This means that in medical institutions in China, most post-term births are due to a lack of regular antenatal care, which may lead to a higher proportion of new-borns with underlying diseases among post-term births.
Neonates with underlying diseases such as congenital abnormalities, meconium staining of the amniotic fluid or acidosis at birth often have low Apgar scores as well as higher mortality rates [29].
The distribution of Apgar scores is related to gestational age, and babies born before 37 weeks of gestation are at an increased risk of neonatal mortality. However, preterm births include groups with or without SGA, and SGA is an important cause of foetal and neonatal mortality [30,31]. Previous studies stratified the risk of neonatal death or short- and long-term adverse health problems in relation to low Apgar score at 5 min by either birthweight or weeks of gestation [9,22,27,32], or only evaluated neonatal mortality in the combined presence of preterm birth (PTB) and SGA [25,33,34], even though weeks of gestation and weight at birth are both principal health indicators of new-borns and the strong association between them has been widely reported. Few studies, however, have stratified the risk of neonatal mortality in relation to the Apgar score at 5 min by the combination of gestational age and birthweight by gestational age. Our study showed that when birth occurred at a higher gestational age (term) and under better birthweight conditions (without SGA), a neonate with a low Apgar score still had an increased risk of neonatal mortality compared with a neonate with a normal Apgar score. Sensitivity analysis using the global INTERGROWTH-21st standards further confirmed this association. This result suggested that for births with a good gestational age and birthweight, there might still be some other risk factors, such as birth defects. The Apgar score was still a meaningful predictor of adverse outcomes in these new-borns. We should never neglect the care of any new-born with a poor Apgar score, even when the neonate's gestational age and birthweight are not poor.
Changes in Apgar score values at different times have been used to assess the short-term and long-term risks of adverse outcomes [8,35,36]. A population-based study of term infants in Norway reported that the effect of Apgar score was even stronger when scores at 1 and 5 min were combined: if both scores were 3 or lower, the risks for neonatal death and infant death increased 642-fold and 123-fold, respectively, compared with scores of 7 to 10 [8]. However, in that study, no neonatal death was recorded when the Apgar score was 4-6 or 7-10 at 1 min but fell to 0-3 at 5 min, and the relative risks among all groups could not be compared. Our analysis confirmed that the neonatal mortality risk was higher for new-borns with low Apgar scores at both 1 min and 5 min (though no more than 0.1% of births fell into this group) than for those with low Apgar scores only at 5 min. We were also surprised to find that the neonatal death risk was higher among new-borns whose Apgar scores fell from 7-10 at 1 min to 4-6 at 5 min than among new-borns whose Apgar scores fell from 7-10 at 1 min to 0-3 at 5 min. We further stratified the analysis by hospital level and found that this pattern appeared only in Level 2 hospitals. This result suggested that Level 2 hospitals in China may not have paid enough attention to the treatment and care of new-borns whose Apgar scores worsened only slightly.
The limitations of our study are as follows. (1) In the NMNMSS, infants were followed up from birth until their mothers were discharged from the hospitals, and the longest time for monitoring was less than 42 days, the standard time for postpartum women. All infant deaths recorded in the surveillance system occurred before maternal discharge; however, data on the exact time of death were not collected in the NMNMSS. Therefore, the neonatal mortality rate in our study is actually the neonatal mortality rate before maternal discharge from hospitals. (2) As the NMNMSS did not collect information on neonatal diseases (such as birth defects), adjustments for these factors could not be made in the models used, so the relationship between Apgar score and neonatal mortality should be interpreted with caution.

Conclusions
This study is the first to analyse the relationship between Apgar scores and neonatal mortality in more than 400 hospitals and over 6 million live births in China. The results suggest that the Apgar score is not only meaningful for preterm infants but also useful for term infants, especially for term infants who are not SGA, even though only a few term births had Apgar scores of 3 or less. When the new-born's Apgar score worsens at 5 min compared with that at 1 min, regardless of whether the change is large or small, timely intervention is needed. There is still a gap between China and high-income countries in the sustained treatment of new-borns with low Apgar scores.
Additional file 1. Distribution of Apgar score groups at 5 min by gestational age.