Approximately 0.59% of live term births had a 1 min Apgar score <4, of which only 29.4% recovered to a 5 min Apgar score ≥7. The 1 min Apgar score indicates how well the newborn manages the immediate transition from intrauterine to extrauterine life [6]. Previous studies have examined factors associated with 1 min Apgar scores <7 and found significant associations with, among other factors, prematurity, postmaturity, low birthweight, and breech delivery [19–21]. This study differs in that it focused on recovery among term neonates with a 1 min Apgar score <4, and the sampling strategy excluded prematurity, postmaturity, and congenital anomalies. In considering our findings in relation to the earlier literature, it should be borne in mind that our focus on live term births may explain any differences.
Delivery type and personnel factors
Berglund and colleagues found that the 1 min Apgar score was associated with the manner in which the labour itself was managed by health care staff [22]. That is, in addition to any pre-existing risk factors such as birth weight or gestational age, the clinical decisions made during labour and the practice of the labour ward had an effect on that immediate transition to extrauterine life. A recent systematic review of deliveries in the US found no significant difference in the prevalence of low Apgar scores between deliveries by doctors and deliveries by nurse-midwives [23]. In our study, among the uncomplicated vaginal deliveries there was a clear, independent association between the qualification of the person conducting the delivery (doctor or nurse-midwife) and Apgar recovery. Neonates with 1 min Apgar scores <4 delivered by nurse-midwives had odds of recovery that were 40% of those of neonates delivered by doctors (95% CI: 0.26–0.63).
The earlier findings and these findings, however, should not be regarded as contradictory. The probability of a poor 1 min Apgar score could be identical for doctors and nurse-midwives, and the differences in recovery may have a number of explanations. First, given a low Apgar score, doctors may simply be better equipped to manage the recovery. This may in turn lead to speculation about whether nurse-midwives require additional training in resuscitative techniques [24], and/or whether structurally the health system needs to make changes to manage neonates with 1 min Apgar scores <4 [22]. The second explanation is that when nurse-midwives anticipate a poor outcome they are more likely to refer the delivery to a doctor; and because the poor outcome was anticipated, there is time to prepare an appropriate clinical response. In contrast, the unanticipated poor outcome will not be referred, and there will not be the same time to prepare an appropriate clinical response. This is a form of selection bias in which nurse-midwives have to manage a more complicated clinical situation than doctors.
The evidence on the association between the type of delivery and birth outcome is mixed, and highly dependent on the presenting clinical features at the time of delivery. Two systematic reviews (2006 and 2012) comparing planned caesarean delivery with planned vaginal birth, for instance, identified no studies of sufficient quality to inform a scientific view [25, 26]. Whether a CS is planned or unplanned also introduces complicating factors such as the timing and the medical reasons underpinning the decision [26].
Given a 1 min Apgar score <4, however, there was an independent association between the type of delivery and recovery. Specifically, neonates with 1 min Apgar scores <4 delivered by CS (emergency or elective) had significantly better odds of recovery than neonates with 1 min Apgar scores <4 delivered by uncomplicated vaginal delivery. Elective CS was associated with odds of recovery 2.7 times greater than uncomplicated vaginal delivery (95% CI: 1.39–5.23), and emergency CS was associated with odds of recovery 1.7 times greater (95% CI: 1.23–2.37). This may be an effect of the degree of trauma associated with different types of delivery. If CS births do result in less birth trauma in this cohort, then it would make sense that these neonates would recover faster. The data may also point to issues in the identification of births requiring CS. Around half of the deliveries performed by doctors that involved 1 min Apgar scores <4 did not receive an emergency CS. It is also worth noting that the hospitals’ protocol is for spinal anaesthesia in CS, so the change in Apgar score is unlikely to reflect post-anaesthesia recovery.
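To make the odds ratios above easier to interpret, the following minimal sketch converts an odds ratio into a recovery probability. It assumes, purely for illustration, that the reference group recovers at the overall rate of 29.4%; the recovery rate specific to uncomplicated vaginal deliveries is not given in this section, and the function name `apply_odds_ratio` is hypothetical.

```python
def apply_odds_ratio(baseline_prob, odds_ratio):
    """Return the probability implied by applying an odds ratio to a baseline probability."""
    if not 0 < baseline_prob < 1:
        raise ValueError("baseline_prob must lie strictly between 0 and 1")
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    baseline_odds = baseline_prob / (1 - baseline_prob)  # probability -> odds
    new_odds = baseline_odds * odds_ratio                # odds ratios multiply odds
    return new_odds / (1 + new_odds)                     # odds -> probability


# ASSUMPTION: the overall recovery rate (29.4%) stands in for the unknown
# recovery rate of the reference group (uncomplicated vaginal delivery).
ASSUMED_BASELINE = 0.294

for label, odds_ratio in [("elective CS", 2.7), ("emergency CS", 1.7)]:
    prob = apply_odds_ratio(ASSUMED_BASELINE, odds_ratio)
    print(f"{label}: OR {odds_ratio} -> recovery probability {prob:.3f}")
```

Under this assumed baseline, an odds ratio of 2.7 corresponds to a recovery probability of roughly 0.53, not 2.7 times 0.294, because odds ratios scale odds rather than probabilities.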
The findings on delivery personnel and type of delivery are useful for generating hypotheses for future research. While not definitive, the findings suggest that a planned investigation of labour ward practice in the relatively rare cases of 1 min Apgar scores <4 could help to identify strategies that would improve recovery rates.
Maternal clinical factors
In the unadjusted analyses, we found that the odds of recovery were better in neonates with 1 min Apgar scores <4 born to mothers with diabetes and mothers who were obese. We also found that foetal distress was associated with better odds of recovery, and that the odds of recovery were significantly worse when the mother had low Hb (<11). Most of these associations disappeared in adjusted analyses. BMI and diabetes were the exceptions.
A number of recent studies have reported a negative association between maternal BMI and birth outcomes, including Apgar score [27–30]. At least one recent study, however, found no significant association [31]. We found no association between BMI and the odds of recovery in the analysis of uncomplicated vaginal deliveries. In contrast to all the results showing a negative or neutral association between maternal BMI and birth outcomes, in the analysis of deliveries by doctors we found that neonates with 1 min Apgar scores <4 born to obese mothers had a small but significantly better chance of recovery than those born with 1 min Apgar scores <4 to normal weight mothers (OR = 1.34; 95% CI: 1.00–1.8).
The literature on maternal diabetes and low Apgar score is not clear cut, but tends towards worse outcomes [32, 33], although in one recent study that found worse Apgar scores associated with maternal diabetes, the association disappeared in an adjusted analysis [34]. Whether the outcome was worse also appears to be associated with the level of glycaemic control [35]. Again, and in contrast with the results on birth outcome, the odds of recovery given a low 1 min Apgar score were better in neonates born by uncomplicated vaginal delivery to mothers with diabetes.
The differences in the associations between the birth outcome data and the Apgar recovery data are noteworthy and raise rather than answer questions. It may be, for instance, that mothers with known diabetes or high BMI trigger a hyper-vigilant clinical care response. A neonate’s initial Apgar score may be <4, but because of the preparedness of the staff for a poorer outcome they may also be better prepared to respond.
Ethnicity
Numerous studies have reported ethnic variations in birth outcomes [36, 37]. Some of the variation appears to be attributable to biology [38, 39], but there is also substantial evidence for social and economic factors driving differences [36, 40], and not always in the direction of minority groups being worse off [41, 42]. The results necessarily raise questions about the differential and synergistic effects of genetics, culture, and environment [37, 43]. In the present study, in the adjusted analysis for uncomplicated vaginal deliveries, Indian neonates (a minority group) with 1 min Apgar scores <4 had odds of recovery 5 times greater than Malay neonates (the majority group) with 1 min Apgar scores <4 (OR = 5.13; 95% CI: 1.74–16.46). In contrast, Orang Asal and “Others” had substantially lower odds of recovery than Malay neonates with 1 min Apgar scores <4 (OR = 0.12; 95% CI: 0.04–0.29 and OR = 0.22; 95% CI: 0.06–0.65, respectively). In deliveries by doctors, only the Orang Asal neonates with 1 min Apgar scores <4 were significantly different from (worse off than) the Malay neonates with 1 min Apgar scores <4 (OR = 0.34; 95% CI: 0.17–0.64).
One possible explanation for the worse outcomes for the Orang Asal lies in their comparatively more geographically isolated living conditions, which may give rise to fewer antenatal visits and a reduced opportunity to provide on-going obstetric care and risk assessment. Orang Asal mothers may also be physically less healthy during their pregnancy [44]. Finally, there may be health systems issues, including accessibility, leading to poorer healthcare for indigenous populations, notwithstanding Malaysia’s historically strong performance in improving maternal and child health outcomes [45].
Data for 2012 from the Malaysian government’s Economic Planning Unit showed the “Other” ethnic group to have the lowest mean monthly income [46], and Orang Asal are over-represented in the “poverty” and “hardcore poverty” statistics [44]. This would also suggest socioeconomic drivers, but given the universal coverage of maternal and child health services in Malaysia, wealth/poverty may not be a complete explanation for the results.
Missing data
The problem of collecting high quality labour ward data is not new [47, 48]. Although missing data are usually treated as a problem to be overcome [49], they can also be treated as informative [50]. Rather than using data imputation to fill in the blanks [51], in this study we elected to model the missing category explicitly.
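One way to picture modelling the missing category explicitly is as dummy coding in which "Missing" is an extra level of each categorical predictor, compared against the same base category as the observed levels. The sketch below is an illustration under that assumption, not the study's actual analysis code; the function name `encode_with_missing` and the example values are hypothetical.

```python
def encode_with_missing(values, levels, base):
    """Dummy-code a categorical variable, treating None as its own 'Missing' level.

    The base category is the reference level and gets no indicator column.
    """
    if base not in levels:
        raise ValueError("base must be one of the levels")
    columns = [lv for lv in levels if lv != base] + ["Missing"]
    rows = []
    for value in values:
        label = "Missing" if value is None else value
        if label != "Missing" and label not in levels:
            raise ValueError(f"unexpected level: {label!r}")
        # Exactly one indicator is 1, or none when the value is the base level.
        rows.append({col: int(label == col) for col in columns})
    return rows


# Hypothetical records: ethnicity as recorded, with None where the field is blank.
ethnicity = ["Malay", None, "Indian"]
levels = ["Malay", "Indian", "Orang Asal", "Others"]

for row in encode_with_missing(ethnicity, levels, base="Malay"):
    print(row)
```

The resulting indicator columns could then enter a logistic regression alongside the other predictors, so that the "Missing" level receives its own odds ratio relative to the base category.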
In the unadjusted models, missing data on ethnicity, Hb, the neonate’s sex, birth weight, type of delivery, and the person conducting the delivery were all associated with lower odds of recovery than the base category. In the adjusted models, no category of “missing” was significantly different from the respective base categories, and in a number of cases the cell sizes were so small that estimating confidence intervals was impossible.
Nonetheless, the existence of the missing data does hint at something interesting about the relationship between the urgency of neonatal clinical need and data quality. It is conceivable that when a neonate is critically ill, recording the sex or birth weight is seen as less relevant; more concerning, the missing data may point to a deeper issue with the quality of care.
Limitations
There are important limitations associated with the use of registry data [48]. For instance, a small number of births (n = 21) with Apgar scores of 0 at both 1 and 5 min were excluded on the assumption that they were most likely stillbirths [21], although a 0/0 Apgar score followed by successful resuscitation is possible [52]. The manner of data collection in registries often relies on simplifying assumptions of this kind, and these need to be understood. In spite of the limitations, registry data can make important contributions to quality improvement, clinical research, and policy development [53].
In this section, three points are discussed: coverage, residual confounding, and the reliability of the outcome measures (Apgar score).
The completeness and coverage of the registry is important for the population to which the data speak [54]. In the case of the NOR, the data were drawn comprehensively from the 14 major hospitals, which account for around 27% of births nationally. One might be reasonably comfortable generalising the findings to those hospitals, with requisite additional caution in drawing wider conclusions. In hospitals not represented in the registry, which include Government district level hospitals and private hospitals, the number and qualification-mix (i.e., doctors and nurse-midwives) of staff, their training, and the equipment may vary. All of these could affect outcomes, and therefore generalisability.
The granularity of the data from general administrative registries is necessarily going to be lower than it would be in cohort studies looking at specific questions. Choices need to be made about the limited kinds of data that can be collected routinely within a functioning health care unit that is not dedicated to research. In choosing to record certain data and not other data, there is an obvious concern with residual confounding [54]; that is, failing to account for a relevant factor in the adjusted analyses. Apgar recovery is likely, for example, to be strongly associated with the resuscitative skills, technology, and protocols available on each of the labour wards across the 14 hospitals. This information is not recorded in the registry, but it is critical for drawing more definitive conclusions from the data. Notwithstanding the issues of residual confounding, as part of a more general study of possible associations, these kinds of data fulfil an important hypothesis-generating role, including hypotheses about other possible unmeasured factors.
Finally, there is some question about the capacity of healthcare professionals to make valid and reliable assessments of neonatal Apgar scores [55–57]. One of the studies that highlighted issues of reliability in Apgar assessment was based on the evaluation of neonates from 23 to 40 weeks’ gestation [58], and the other considered very low birth weight neonates with a range of gestational ages [56]. The design of the reliability studies ignored the very high base rate of Apgar scores ≥7 and selected a wider range of, and more critical, clinical presentations than would be expected on a normal delivery ward. In this study all the neonates with 1 min Apgar scores <4 had reached term, and one might anticipate that term singleton births can be assessed more easily and with greater reliability. Furthermore, even allowing for variation in staff clinical assessment, the vast majority of the neonates had Apgar scores ≥7: 97.25% of neonates had a 1 min Apgar score ≥7 and 99.3% had a 5 min score ≥7. Accepting the average variation in Apgar assessment across clinical staff of 2.4 points, the separation used in this study between a 1 min Apgar score <4 and a 1 min Apgar score ≥7 left little room for error in category.
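The separation argument can be checked with simple arithmetic. The sketch below treats the 2.4-point average variation as if it were the largest shift a single rater could introduce, which is a simplifying assumption made only for illustration; the helper name `could_cross` is hypothetical.

```python
LOW_MAX = 3            # highest score in the "<4" category
HIGH_MIN = 7           # lowest score in the ">=7" category
RATER_VARIATION = 2.4  # average variation across clinical staff cited in the text


def could_cross(score, shift, low_max=LOW_MAX, high_min=HIGH_MIN):
    """True if moving `score` by up to `shift` points could reach the opposite category."""
    if score <= low_max:
        return score + shift >= high_min
    if score >= high_min:
        return score - shift <= low_max
    return False  # intermediate scores (4 to 6) belong to neither category


# The boundary scores are the ones most at risk of being misclassified.
print(could_cross(LOW_MAX, RATER_VARIATION))   # 3 + 2.4 = 5.4, still below 7
print(could_cross(HIGH_MIN, RATER_VARIATION))  # 7 - 2.4 = 4.6, still above 3
```

Under this assumption a shift of 2.4 points is not enough to move a score of 3 into the ≥7 category or a score of 7 into the <4 category; the gap between the two categories is 4 points.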
The results of this study speak most directly to the 14 state tertiary hospitals contributing data to the NOR, which account for around 27% of national births. They may arguably extend to other government hospitals, which operate under similar policies and practice guidelines (a further 58% of national births); however, those hospitals will also have different levels of specialisation. It seems less likely that the results would generalise to private hospitals (15% of national births), which operate under their own policies and guidelines.