Survey of women’s report for 33 maternal and newborn indicators: EN-BIRTH multi-country validation study



Population-based household surveys, notably the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS), remain the main source of maternal and newborn health data for many low- and middle-income countries. As part of the Every Newborn Birth Indicators Research Tracking in Hospitals (EN-BIRTH) study, this paper focuses on testing validity of measurement of maternal and newborn indicators around the time of birth (intrapartum and postnatal) in survey-report.


EN-BIRTH was an observational study testing the validity of measurement for selected maternal and newborn indicators in five secondary/tertiary hospitals in Bangladesh, Nepal and Tanzania, conducted from July 2017 to July 2018. We compared women’s report at exit survey with the gold standard of direct observation or verification from clinical records for women with vaginal births. Population-level validity was assessed by validity ratios (survey-reported coverage: observer-assessed coverage). Individual-level accuracy was assessed by sensitivity, specificity and percent agreement. We tested indicators already in DHS/MICS as well as indicators with potential to be included in population-based surveys, notably the first validation for small and sick newborn care indicators.


33 maternal and newborn indicators were evaluated. Amongst nine indicators already present in DHS/MICS, validity ratios for baby dried or wiped, birthweight measured, low birthweight, and sex of baby (female) were between 0.90–1.10. Instrumental birth, skin-to-skin contact, and early initiation of breastfeeding were highly overestimated by survey-report (2.04–4.83) while umbilical cord care indicators were massively underestimated (0.14–0.22). Amongst 24 indicators not currently in DHS/MICS, two newborn contact indicators (kangaroo mother care 1.00, admission to neonatal unit 1.01) had high survey-reported coverage amongst admitted newborns and high sensitivity. The remaining indicators did not perform well and some had very high “don’t know” responses.


Our study revealed low validity for collecting many maternal and newborn indicators through an exit survey instrument, even with short recall periods among women with vaginal births. Household surveys are already at risk of overload, and some specific clinical care indicators do not perform well and may be under-powered. Given that approximately 80% of births worldwide occur in facilities, routine registers should also be explored to track coverage of key maternal and newborn health interventions, particularly for clinical care.

Key findings

What is known and what is new about this study?
• Population-based household surveys are the primary source of maternal and newborn health data for many low- and middle-income countries (LMICs). While surveys are important data sources, especially where coverage of robust routine health data systems remains low, they are also infrequent, expensive and have been shown to have limited validity for some aspects of perinatal care.
• EN-BIRTH is the largest validation study to date of maternal and newborn health indicators, across five hospitals in three countries, including > 14,000 exit surveys. This dataset enabled validity analyses to be made for 33 maternal and newborn indicators comparing gold-standard observation or case notes verification to exit survey-reported coverage and outcomes.
• This is the first validity testing for hospital-based clinical care of small and sick newborns (e.g. resuscitation, kangaroo mother care, and neonatal infection management).
What does this say about nine indicators already in MICS and/or DHS core or additional modules?
• 4 out of 9 indicators were accurately estimated by survey report at the pooled population level: baby dried or wiped immediately after birth (observed coverage: 90.5%, survey: 96.8%), birthweight measured (observed: 98.6%, survey: 93.8%), sex (observed female: 48.8%, survey-reported female: 49.1%) and low birthweight (based on observed weight: 15.2%, based on survey-reported weight: 14.1%).
• Early initiation of breastfeeding (observed: 14.4%, survey: 69.5%), and skin-to-skin contact (observed: 41.2%, survey: 84.2%) were highly overestimated in the exit survey.
• Application of a substance to the umbilical cord was massively (> 76%) underestimated in survey-report compared with observed coverage for anything applied to the cord or chlorhexidine applied to the cord, largely driven by “don’t know” responses (24.1–75.2%).
• Besides application of a substance/chlorhexidine to the umbilical cord, “don’t know” responses were < 10% for other indicators already in DHS/MICS.
Which questions are not appropriate for surveys?
• Validity of indicators not already in DHS/MICS was affected by high “don’t know” responses (> 20%), varying widely by hospital e.g. birth attendant listened to fetal heart sounds during labour, oxytocin given, antenatal corticosteroids given before birth, baby received injectable antibiotics, any diagnostic/blood test done.
• Clinical care indicators had low validity: any infection (percent agreement 47.1%) or sepsis (percent agreement 26.6%). Newborn resuscitation had high percent agreement and high specificity but low sensitivity. Moreover, indicators for the small and sick newborn target group may be underpowered even in large national household surveys.
What next and research gaps?
• Consistent with other studies, we found lower validity for clinical interventions and time-bound questions. Further research is needed on time-bound indicators to explore how the accuracy of crucial indicators such as early initiation of breastfeeding can be improved, e.g. if the time component were to be dropped.
• Women whose newborns were admitted to a neonatal or KMC ward reported this accurately; however, to be useful in population-based surveys, we would need to know how people not admitted would respond.
• Improved, respectful communication with families regarding clinical interventions for small and sick newborns is needed for both quality care and accuracy of survey-reported coverage. Families cannot report on clinical care if they were never informed.


Globally each year, 2.4 million newborns die in the first month of life, more than 2 million babies are stillborn and around 295,000 women die of maternal causes, the vast majority in low- and middle-income countries (LMICs) [1,2,3,4]. Most of these deaths can be prevented by high coverage and quality care during pregnancy and childbirth, and for small and sick newborns [5]. The Sustainable Development Goals include a target to reduce the national neonatal mortality rate to fewer than 12 per 1000 live births, and the global average maternal mortality ratio to fewer than 70 per 100,000 live births by 2030 [6]. To track the progress, and the linked stillbirth target of fewer than 12 stillbirths per 1000 total births, the Every Newborn Action Plan (ENAP) was launched in 2014. In close alignment with the World Health Organization Strategy for Ending Preventable Maternal Mortality, some indicators were prioritised for maternal and newborn care [7, 8]. Unfortunately, in the countries where most of the maternal and newborn deaths occur, data gaps for coverage and quality of care impede health systems improvement needed to drive progress towards universal health coverage [9].

Currently, most LMICs are reliant on retrospective data based on women’s self-report collected through household surveys, such as the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS) [10, 11]. However, these population-level surveys track a limited number of indicators that measure maternal and newborn care and have focused especially on “contact points” such as antenatal care, skilled birth attendance, facility birth, and postnatal care. The DHS core women’s questionnaire has over 400 questions and takes 30–60 min to complete for most women; there is understandable reluctance to add more questions on maternal and newborn health.

Given the shift to evidence-based measurement, there is more demand for validation studies on indicators already in surveys, and to inform selection of new questions. It is recommended to test the validity by comparing survey measures of an indicator to a gold standard data source [12]. Several studies have assessed validity of women’s self-report in high income or upper middle-income countries using clinical records as the gold standard [13,14,15,16,17,18,19,20,21,22,23,24,25,26]. Other studies have sought to validate women’s reports of events related to care around the time of birth in LMICs using direct observation as the gold standard; however, these studies often had a small sample size and/or were conducted in one or two facilities [27,28,29,30]. These studies have found variable validity for some indicators including uterotonic administration, early initiation of breastfeeding and skin-to-skin contact, indicating need for further research in additional contexts. Previous validity studies have not been conducted in Bangladesh or Tanzania and those in Nepal have been limited to birthweight and gestational age [31]. Furthermore, no published studies have reported on validity of women’s reports for new indicators related to the care of small and sick newborns such as kangaroo mother care (KMC) for low birthweight babies or injectable antibiotics for newborn sepsis.

The Every Newborn Action Plan, agreed by all United Nations member states and > 80 development partners, includes an ambitious measurement improvement roadmap to validate measurement of indicators for care and outcomes around the time of birth [7, 32]. As part of this roadmap, the Every Newborn – Birth Indicators Research Tracking in Hospitals (EN-BIRTH) study was an observational study of > 23,000 women to test the validity of measurement for selected indicators [33].


This paper is part of a supplement based on the EN-BIRTH multi-country validation study, ‘Informing measurement of coverage and quality of maternal and newborn care’ and focuses on women’s report surveys, with the following objectives:

  1. Assess the validity of measurement for nine current DHS/MICS indicators that measure care during the intrapartum and immediate postpartum period for women with vaginal births.

  2. Explore 24 potential maternal and newborn indicators, including indicators for care of small and sick newborns that could be included in population surveys (e.g. DHS/MICS), and assess their validity and measurement quality.


Study settings and design

EN-BIRTH was a mixed-methods, observational study comparing observer-assessed coverage (considered the gold standard) of selected maternal and newborn interventions to coverage measured by women’s report at exit survey and routine register records; this paper focuses on survey-report (Fig. 1). Data were collected from July 2017 to July 2018 in five public secondary/tertiary comprehensive emergency obstetric and newborn care (CEmONC) hospitals in three high burden countries: Maternal and Child Health Training Institute, Azimpur and Kushtia General Hospital in Bangladesh (BD); Pokhara Academy of Sciences in Nepal (NP); Temeke District Hospital and Muhimbili National Referral Hospital in Tanzania (TZ). Detailed information regarding the research protocol [33] and overall validity results, including for routine register data, has been published separately [34].

Fig. 1

Survey validation design, EN-BIRTH study. Exact wording for survey questions detailed in Additional file 1

Participants were pregnant women being admitted to labour/delivery wards (exclusion criteria at admission were imminent birth and no fetal heart beat heard; participants were not automatically excluded based on age), mother-baby pairs admitted in KMC corners/wards (all admissions were eligible) and newborns admitted to inpatient wards for treatment of presumed severe neonatal infection (neonates with clinically defined infection, namely sepsis, pneumonia or meningitis, were eligible). Trained clinical researchers observed participants 24 h per day and recorded data on care and outcomes in three clinical settings: labour and delivery ward, operating theatre, and KMC corners/wards. Verification of inpatient records was used as the gold standard for newborns who received antibiotics for presumed severe infection, and for women who received antenatal corticosteroids (ACS) for risk of preterm birth. Women were surveyed at the time of discharge, before leaving the facility, by a separate cadre of data collectors. Training for survey data collectors was based on DHS training materials. In the case of multiple births, women were asked only about the first birth. All data were collected using a custom-built Android tablet-based application. All data collectors and study staff received standardised training on the study procedures and data collection tools.

We compared observer-assessed coverage of care and outcomes for women with vaginal births to women’s reports at exit survey. Women who give birth vaginally on labour wards have a very different experience than those giving birth by caesarean in operating theatre; thus we have analysed these separately. Differences between vaginal birth and caesarean birth for indicator accuracy are reported elsewhere for five indicators [34] and for specific indicators throughout this supplement series [35,36,37,38]. Further work is ongoing to examine quality of care and measurement accuracy for women with caesarean sections.

Indicator selection

At the study design phase, we conducted a mapping review of the MICS women’s questionnaire as well as DHS-7 core women’s questionnaire and newborn care additional module to identify maternal and newborn indicators from the intrapartum and immediate postnatal period for which we could test validity in a hospital setting. To identify maternal and newborn indicators that have the potential to be included in population-based surveys, we referred to the Ending Preventable Maternal Mortality, Every Newborn Action Plan strategy documents and earlier studies testing validity of measurement for maternal and newborn indicators [7, 8]. As a result of updates to DHS questionnaires (DHS-8) and the new supplemental module on maternal health care, we conducted an additional mapping review prior to data analysis. We selected 33 maternal and newborn indicators for analysis (nine indicators already existing in DHS/MICS and 24 indicators having the potential to be included in population-based surveys). The indicator list is shown in Fig. 2 and the exact wording for questions in EN-BIRTH, DHS and MICS is shown in Additional file 1.

Fig. 2

List of indicators tested for validity, EN-BIRTH study


To calculate observer-assessed coverage and survey-reported coverage, we used a relevant denominator from the EN-BIRTH dataset (total deliveries/ total births/ livebirths/ admitted to KMC ward/ admitted to inpatient for suspected neonatal infection, etc.) (Additional file 2) and expressed results as a percentage. “Don’t know” survey responses were also reported separately as a percentage. We calculated validity ratios, similar to verification ratios in data quality review (DQR) methods, as survey-reported coverage divided by observed coverage, with “don’t know” responses treated as “no” (Additional file 3). A ratio > 1 shows that survey-reported coverage overestimates observed coverage, while a ratio < 1 shows an underestimate. We used standard DQR cut-offs (over/underestimate by 0–5% = Excellent, by 6–10% = Very good, by 11–15% = Good, by 16–20% = Moderate and by > 20% = Poor) for heat maps [39].
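The validity ratio and its DQR classification can be illustrated with a minimal Python sketch. The function names are hypothetical, and the treatment of values falling between the integer bands (for example a 5.5% deviation) is an assumption, since the cut-offs above are stated only as whole-number ranges. The coverage values are the pooled percentages reported in this paper.

```python
def validity_ratio(survey_coverage: float, observed_coverage: float) -> float:
    """Survey-reported coverage divided by observer-assessed coverage."""
    if observed_coverage <= 0:
        raise ValueError("observed coverage must be positive")
    return survey_coverage / observed_coverage


def dqr_category(ratio: float) -> str:
    """Map a validity ratio to the DQR bands quoted in the text."""
    # Percentage over- or underestimate; rounded to suppress float noise.
    deviation = round(abs(ratio - 1.0) * 100, 6)
    # Assumption: bands are applied as continuous upper bounds of 5, 10, 15, 20.
    bands = [(5, "Excellent"), (10, "Very good"), (15, "Good"), (20, "Moderate")]
    for upper, label in bands:
        if deviation <= upper:
            return label
    return "Poor"


# Pooled coverage (%) from this paper: (observed, survey-reported).
examples = {
    "early initiation of breastfeeding": (14.4, 69.5),
    "skin-to-skin contact": (41.2, 84.2),
    "birthweight measured": (98.6, 93.8),
    "baby dried or wiped": (90.5, 96.8),
}

for name, (observed, survey) in examples.items():
    ratio = validity_ratio(survey, observed)
    print(f"{name}: ratio = {ratio:.2f} ({dqr_category(ratio)})")
```

Because the ratio compares aggregate coverage only, a value near 1 can coexist with poor individual-level agreement when false positives and false negatives cancel out.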

For individual-level validity reporting, we constructed two-way tables comparing observer-assessed coverage to survey-reported coverage. In line with DHS and common survey reporting, we combined survey “don’t know” responses with “no”, except for the low birthweight indicator where “don’t know” was excluded from the numerator and denominator [40]. Additional analysis for selected indicators is presented in Additional file 4 with “don’t know” excluded from the analysis (numerator and denominator).
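The two ways of handling “don’t know” responses can be sketched as follows. This is an illustrative Python example rather than the study's Stata code; the function name, the `"dk"` coding and the toy records are all hypothetical.

```python
def two_way_table(pairs, dont_know="recode_no"):
    """Cross-tabulate (observed, reported) pairs into a two-way table.

    observed:  bool from the gold standard (observation or record verification).
    reported:  "yes", "no" or "dk" from the exit survey (hypothetical coding).
    dont_know: "recode_no" counts "dk" as "no"; "exclude" drops those pairs
               from both numerator and denominator.
    """
    if dont_know not in ("recode_no", "exclude"):
        raise ValueError("dont_know must be 'recode_no' or 'exclude'")
    table = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for observed, reported in pairs:
        if reported == "dk":
            if dont_know == "exclude":
                continue
            reported = "no"
        said_yes = reported == "yes"
        if observed and said_yes:
            table["tp"] += 1   # done and reported
        elif observed:
            table["fn"] += 1   # done but not reported
        elif said_yes:
            table["fp"] += 1   # reported but not done
        else:
            table["tn"] += 1   # not done and not reported
    return table


# Toy records, not study data.
toy = [(True, "yes"), (True, "dk"), (True, "no"),
       (False, "yes"), (False, "no"), (False, "dk")]

print(two_way_table(toy))                       # "dk" counted as "no"
print(two_way_table(toy, dont_know="exclude"))  # "dk" dropped
```

Recoding “don’t know” as “no” moves observed-yes cases into the false negative cell, which is why indicators with many “don’t know” responses can show low sensitivity under the default approach.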

As interventions/conditions with very high or very low coverage/prevalence may result in a small sample size for individual-level validity “diagnostic test” methods (low cell counts in two-way tables), we report percent agreement for all indicators. Where column totals were ≥10 in the two-way tables and “don’t know” responses were < 20%, we calculated sensitivity (true positive rate) and specificity (true negative rate) of survey-reported coverage to measure observed coverage (gold standard). Positive predictive value (PPV), negative predictive value (NPV), area under the curve (AUC), and inflation factor (IF) were also calculated. The percentage observed to have an intervention or outcome among women replying “don’t know” was calculated for indicators included in DHS/MICS. 95% confidence intervals were calculated assuming a binomial distribution. Pooled validity results were calculated using random effects meta-analysis and are presented with I², τ², and the heterogeneity statistic (Q). Missing values from the observation dataset were excluded from the relevant analysis [12].
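The individual-level statistics can be sketched from the four cells of a two-way table. The Python example below is illustrative only and rests on several labelled assumptions: “column totals” are taken to be the gold-standard totals (observed yes and observed no); AUC is computed as the mean of sensitivity and specificity, which is the area under a single-threshold ROC curve; and the inflation factor is taken as survey-estimated prevalence divided by observed prevalence among paired records. The cell counts are hypothetical.

```python
def validity_metrics(tp, fp, fn, tn, dont_know_share=0.0):
    """Individual-level validity statistics from a two-way table.

    Percent agreement is always returned. The remaining statistics are
    returned only when both gold-standard column totals are >= 10 and the
    share of "don't know" responses is below 20%.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("empty table")
    result = {"percent_agreement": 100 * (tp + tn) / total}

    observed_yes = tp + fn   # assumed column total: gold standard = yes
    observed_no = fp + tn    # assumed column total: gold standard = no
    if min(observed_yes, observed_no) < 10 or dont_know_share >= 0.20:
        return result

    sensitivity = tp / observed_yes
    specificity = tn / observed_no
    reported_yes = tp + fp
    reported_no = fn + tn
    result.update({
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / reported_yes if reported_yes else None,
        "npv": tn / reported_no if reported_no else None,
        # Area under a one-point ROC curve (assumption, see lead-in).
        "auc": (sensitivity + specificity) / 2,
        # Survey-estimated prevalence / observed prevalence (assumption).
        "inflation_factor": reported_yes / observed_yes,
    })
    return result


# Hypothetical counts, not study data.
for name, value in validity_metrics(tp=80, fp=30, fn=20, tn=70).items():
    print(f"{name}: {value:.3f}")
```

High sensitivity with low specificity, as seen for several indicators in this study, corresponds to a table in which most observed-no cases are reported as yes, so the inflation factor exceeds 1.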

To determine reliability of the observational data (gold standard), study supervisors simultaneously observed births with data collectors for a 5% subset of cases. We calculated percent agreement and Cohen’s kappa coefficients of agreement for core indicators. Percent agreement between the two observers ranged from 85.0% to 100% by indicator and site. Kappa scores had a wider range (0–1); however, some low kappa scores were affected by prevalence and an imbalance in marginal totals. We included all indicators in the analysis; low kappa coefficients are discussed elsewhere [34].
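The effect of prevalence on kappa can be seen in a short Python sketch using the standard Cohen's kappa formula. The function name and the cell counts are hypothetical; both toy tables have 90% agreement between two observers, but very different marginal totals.

```python
def agreement_and_kappa(both_yes, a_yes_b_no, a_no_b_yes, both_no):
    """Return (percent agreement, Cohen's kappa) for two observers."""
    n = both_yes + a_yes_b_no + a_no_b_yes + both_no
    if n == 0:
        raise ValueError("empty table")
    p_observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_yes_b_no) / n   # observer A's share of "yes"
    b_yes = (both_yes + a_no_b_yes) / n   # observer B's share of "yes"
    # Agreement expected by chance from the marginal totals.
    p_expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if p_expected == 1:
        return 100 * p_observed, float("nan")  # kappa undefined
    kappa = (p_observed - p_expected) / (1 - p_expected)
    return 100 * p_observed, kappa


# Hypothetical tables, not study data: both have 90% agreement.
balanced = agreement_and_kappa(45, 5, 5, 45)   # practice seen in ~50% of births
skewed = agreement_and_kappa(90, 5, 5, 0)      # practice seen in ~95% of births

print(f"balanced: agreement {balanced[0]:.0f}%, kappa {balanced[1]:.2f}")
print(f"skewed:   agreement {skewed[0]:.0f}%, kappa {skewed[1]:.2f}")
```

When a practice is nearly universal, chance agreement is already very high, so kappa can fall to around zero even though the observers rarely disagree.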

All statistical analyses were conducted using Stata (version 16) [41] and results are reported in accordance with the STROBE statement checklist for cross-sectional studies (Additional file 5) [42].


Study participants

Three types of participants were involved in this study (Fig. 3). Amongst 23,015 women observed in labour and delivery wards, 16,030 had vaginal births resulting in 16,298 newborns (including twins and stillbirths). Exit surveys were conducted with 14,543 of these women (90.7%). Out of 842 mother-baby pairs admitted in KMC wards/corners, 840 pairs were observed (99.8%). Exit surveys were conducted with 652 women (77.6%). A total of 1523 babies were identified in the inpatient wards, of whom 1015 met eligibility criteria for presumed severe infection (diagnosis of sepsis, pneumonia, or meningitis); consent was obtained for 100% and exit surveys were conducted with 910 women (89.7%). Reasons for non-participation in the exit survey included refusal and discharge prior to being approached for the survey.

Fig. 3

Flow Diagram: a Labour and Delivery b Kangaroo Mother Care c Neonatal Infection, EN-BIRTH study

The background characteristics of the participants are presented in Table 1, with details by site in Additional files 6, 7 and 8. More than one-third (37.3%) were aged 20–24 years, more than 40% had completed secondary education and half (51.1%) were pregnant for the first time. Among the babies who were observed in the KMC ward, 97.0% weighed 2000 g or less. Among the babies who were admitted to the inpatient ward for presumed severe infection and met the eligibility criteria, two-thirds were less than 7 days old at the time of admission.

Table 1 Characteristics of women and babies, EN-BIRTH study

Indicators already captured in MICS/DHS

Four out of nine indicators were accurately estimated by survey report at the pooled population level. Indicators with a time component (early breastfeeding and skin-to-skin contact) were over-reported while umbilical cord care indicators were under-reported. Instrumental birth had very low coverage and was overreported in survey. Figure 4 and Additional file 9 present the observer-assessed coverage and survey-reported coverage along with the percentage of “don’t know” responses. Sensitivity, specificity, and percent agreement are presented in Fig. 5 and Additional files 10 and 11.

Fig. 4

Coverage for selected indicators, EN-BIRTH study. 1These indicators are not interventions and prevalence is reported for these indicators. 2Not asked in Tanzania. 3"Don’t Know" is excluded from numerator and denominator. Validity ratio calculated as survey-coverage/observed coverage. Observed data: labour and delivery ward n = 16,030 women, 16,298 newborns; kangaroo mother care n = 840; neonatal infection n = 1015. Survey data: labour and delivery ward n = 14,543 women; kangaroo mother care n = 652; neonatal infection n = 910

Fig. 5

Individual-level validation in exit survey for selected indicators, EN-BIRTH study. 1Validation not done because “Don’t Know” response > 20%. 2Validation not done because ten or fewer observations per column of the two-way table. 3Not asked in Tanzania. Observed data: labour and delivery ward n = 16,030 women, 16,298 newborns; kangaroo mother care n = 840; neonatal infection n = 1015. Survey data: labour and delivery ward n = 14,543 women; kangaroo mother care n = 652; neonatal infection n = 910

Mode of birth (instrumental birth) had very low “don’t know” responses across all hospitals (< 1%, Fig. 4, Additional file 9). Observed coverage of instrumental birth was low (0.5%) while survey-reported coverage was 1.9%. While percent agreement was 98.2%, the validity ratio was “poor” (3.80) (Fig. 5). Because of low cell counts in two-way tables, individual-level validity statistics could only be calculated for Pokhara NP, where they showed low sensitivity (61.7%) and high specificity (99.5%, Additional file 10).

Immediate newborn care indicators varied in coverage and performance. Immediate drying and birthweight measured had high observed coverage (> 90%) and low levels of “don’t know” responses (< 5%). While sensitivity was high (> 94%), specificity was low (4.1% for immediate drying, 48.8% for birthweight measured) and validity ratios were classified as “very good” (immediate drying: 1.07) and “excellent” (birthweight measured: 0.95). Early initiation of breastfeeding and skin-to-skin contact were both largely overestimated by the survey despite low levels of “don’t know” responses (< 1%). Observed coverage of early initiation of breastfeeding was very low (14.4%), while survey-reported coverage was 69.5%. While sensitivity was high (82.5%), specificity was low (35.9%) and the validity ratio was “poor” (4.83). Observed coverage of skin-to-skin contact was 41.2%, while survey-reported coverage was 84.2%. Sensitivity was high (84.7%) while specificity was low (18.1%) and the validity ratio was “poor” (2.04). Cord cleansing with chlorhexidine was nearly universal (97.9%) in the three hospitals with a chlorhexidine policy (Bangladesh and Nepal). Survey-reported coverage, however, had a large range. While survey-reported coverage of any cord cleansing and chlorhexidine application was very low in Azimpur BD (1.9% and 0.5%, respectively), survey-reported coverage for these interventions was higher in Kushtia BD (70.1% and 46.1%, respectively). Overall, sensitivity was low (21.7% for anything applied to the cord; 13.9% chlorhexidine) and specificity was higher (79.1% anything applied; 92.6% chlorhexidine). Validity ratios were “poor” (0.22 and 0.14).

Newborn outcomes such as sex of the baby and low birthweight were well estimated by the survey. Sex of the baby had very low “don’t know” responses (< 1%) with high sensitivity, high specificity (> 97%) and “excellent” validity ratio (1.01). Women were asked the birthweight of the baby, which was then categorized into low (< 2500 g) or normal birthweight (≥2500 g). “Don’t know” responses to birthweight were moderate (4.8%). For low birthweight classification, sensitivity was 83.4%, specificity was 97.1% and validity ratio was “very good” (0.93).
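The low birthweight classification, with “don’t know” excluded from numerator and denominator, can be sketched as below. This Python example is illustrative: the function names, the use of `None` for “don’t know” and the toy weights are hypothetical, and only the 2500 g threshold comes from the text.

```python
LBW_THRESHOLD_G = 2500  # low birthweight is < 2500 g


def is_low_birthweight(weight_g: float) -> bool:
    return weight_g < LBW_THRESHOLD_G


def lbw_two_way(records):
    """Cross-tabulate observed vs survey-reported low birthweight.

    records: iterable of (observed_weight_g, reported_weight_g), where
    reported_weight_g is None when the woman answered "don't know".
    """
    table = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for observed_g, reported_g in records:
        if reported_g is None:
            continue  # excluded from numerator and denominator
        observed_lbw = is_low_birthweight(observed_g)
        reported_lbw = is_low_birthweight(reported_g)
        if observed_lbw and reported_lbw:
            table["tp"] += 1
        elif observed_lbw:
            table["fn"] += 1
        elif reported_lbw:
            table["fp"] += 1
        else:
            table["tn"] += 1
    return table


# Toy weights in grams, not study data.
toy = [
    (2300, 2400),  # low birthweight, reported as such
    (2450, 2500),  # reported weight rounded up to the threshold: missed
    (3100, 3000),
    (2600, 2400),  # normal weight reported as low
    (2000, None),  # "don't know": excluded
    (3200, 3200),
]
print(lbw_two_way(toy))
```

The second toy record shows how a reported weight of exactly 2500 g moves a baby out of the low birthweight category, since the threshold is strict.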

“Don’t know” responses analysis considerations

Among women who replied “don’t know” in the survey, the proportion observed to have the intervention or outcome is presented in Table 2. Of those who didn’t know if their baby was dried or wiped immediately after birth, most (79.2%) were observed to be dried/wiped. Similarly, for women who didn’t know about birthweight measurement or cord care practices, most were observed as completed (birthweight measured: 91%, anything applied to cord: 97.4%, Chlorhexidine applied to cord: 97.2%). However, for interventions involving women themselves, such as placing the newborn skin-to-skin or initiation of breastfeeding, observed coverage among women responding “don’t know” in survey was low.

Table 2 Percentage observed to have intervention/outcome despite reporting “don’t know” for indicators included in DHS/MICS, EN-BIRTH study
Table 3 Estimated sample size required to measure coverage of kangaroo mother care in a national household survey

Validity results with “don’t know” responses excluded are shown in Additional file 4. When “don’t know” responses were excluded for anything applied to the cord and chlorhexidine applied to the cord, individual-level validity improved. Other indicators had low “don’t know” responses and little change to validity when these were excluded.

Indicators not currently in DHS/MICS

Contact point coverage indicators with potential for surveys

Survey measurement was tested for two contact indicators for small and sick newborns; validity ratios were “excellent” among admitted newborns. Among women whose newborns were admitted to a neonatal unit for treatment of infection, 98.8% reported their newborns were admitted to a neonatal unit, and “don’t know” responses were < 1%. Women whose newborns were not admitted and did not have an infection diagnosis were not asked about admission to a neonatal unit. Similarly, both observed and survey-reported coverage of KMC among women whose newborns were admitted to KMC corners/wards was universal. Percent agreement was 100%; however, this must be interpreted with caution because coverage was 100%.

Content indicators with limited potential for surveys

Of the remaining 22 content indicators, seven had validity ratios of “good” or better when pooled across sites; however, none were consistently good across all sites. Questions related to clinical interventions during labour and childbirth, such as listening to fetal heart sounds or administration of uterotonics, did not perform consistently well in the survey. While listening to fetal heart sounds had high sensitivity (> 98%), specificity was very low (< 6%). Questions around uterotonics had variable percent agreement, sensitivity and specificity.

Observed coverage of any resuscitation (stimulation, suction, bag-mask-ventilation (BMV)) was 17.1%; coverage was highest for stimulation (16.6%), and BMV coverage was 5%. Survey-reported stimulation was 1.3%, underestimating observed stimulation by 15.3 percentage points (validity ratio: 0.08). While specificity was high (> 99%), sensitivity was less than 7%. Similarly, suction and BMV were underestimated by surveys and had high specificity (> 99%), low sensitivity (< 12%), and “poor” validity ratios (< 0.22).

Among women whose newborns were admitted to a neonatal unit for infection, very few were able to report that their baby had an infection, and survey-reported prevalence underestimated verified prevalence by 24.6–82.9 percentage points with a “poor” validity ratio (0.47). Receipt of injectable antibiotics was underestimated in surveys by 6.4–45.0 percentage points, and “don’t know” responses ranged from 9.7% to 35.2% (validity ratio: 0.77). Oxygen administration verified from patient notes ranged from 17.2% to 47.2%, and survey-report ranged from underestimating oxygen administration by 6.4 percentage points to overestimating it by 23.9 percentage points (validity ratio: 0.79). “Don’t know” responses for diagnostic testing ranged from 1.6% to 40.4%. While in Pokhara NP there were few “don’t know” responses (1.6%) and survey-reports were very close to coverage verified from patient notes (within 1%), in Muhimbili TZ “don’t know” responses were 25.6% and verified coverage was underestimated by 18.6 percentage points.

Amongst women whose newborn was admitted to KMC corners/wards, while nasogastric feeding was low (0–17.0%) with a “poor” validity ratio (1.32), intravenous feeding support ranged from 55.2–72.8% and was underestimated by survey-report by 28.7–66.8 percentage points. Coverage of phototherapy ranged from 6.6–43.8% and was close to survey-reported coverage (validity ratio: 0.84).

Prevalence of postpartum haemorrhage ranged from 1.7–3.9% and had high specificity (> 95%) but low sensitivity (< 27%).


Currently, population-based surveys capture limited data on maternal and newborn care and few validity studies have evaluated available or potential indicators. The EN-BIRTH study, across five hospitals in three countries, included > 14,000 women with vaginal births who were both observed and surveyed at exit, seven times more births than any previous maternal and newborn indicator validation study. Our dataset enabled validity analyses for measurement of 33 maternal and newborn indicators, comparing time-stamped gold-standard observation to exit survey-reported indicators of coverage and outcomes, with nine indicators currently included in DHS/MICS and new indicators with potential for inclusion.

Overall, we found 4 of 9 indicators already in DHS/MICS performed well in surveys. Of indicators not already in DHS/MICS, “contact” indicators for small and sick newborns (admission to a neonatal unit or KMC ward) may be useful in population-based surveys while indicators on content of clinical care had high levels of “don’t know” responses and limited validity. Where previous validation research has shown mixed results, for example uterotonics for prevention of postpartum haemorrhage [26,27,28,29,30], we found survey report under-estimated true coverage by 10% whereas survey report overestimated early initiation of breastfeeding by nearly 5 times.

This is the first validity testing for hospital-based clinical care of small and sick newborns (e.g. resuscitation, KMC, and neonatal infection management). The EN-BIRTH study allowed us to assess validity for the smaller numbers of vulnerable newborns who needed special care, such as neonatal resuscitation (5–10%) [43, 44], KMC for newborns weighing ≤2000 g (10–20%) [45, 46] and treatment of newborn presumed severe infection (7%) [47]. These indicators have not been validated before, partly because of sample size challenges, but also because policy attention is more recent [48]. Coverage of KMC was accurate by survey-report in our study, although exit survey questions on KMC were asked only of those women whose newborns were admitted to a KMC ward. Further research is required to validate this indicator for all women, including those not admitted to a KMC ward. Population-based surveys, however, even when conducted with large sample sizes, may be under-powered to measure KMC targeted to stable newborns ≤2000 g. Sample size calculations suggest that for current levels of coverage of KMC for neonates ≤2000 g (believed to be under 10%), a national household survey in Nepal would need to have a 10-fold higher sample size than the most recent DHS survey (Table 3). Usefulness of surveys for interventions in subset target groups is a function of the prevalence of the clinical need for the intervention (i.e. denominator) and coverage; thus, once KMC coverage reaches over 50%, the currently used national DHS sample size may suffice.
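The dependence of survey sample size on both target-group prevalence and coverage can be illustrated with a rough Python sketch. This is not the calculation behind Table 3: it assumes the standard formula for estimating a proportion to a given relative margin of error, and the function name, margin, design effect and input values are all hypothetical choices made for illustration.

```python
import math


def births_needed(coverage, target_prevalence,
                  relative_margin=0.25, design_effect=1.5, z=1.96):
    """Approximate number of surveyed births needed to estimate coverage
    in a target subgroup.

    coverage:          expected coverage in the target group (0-1).
    target_prevalence: share of all births in the target group (0-1).
    relative_margin:   half-width of the confidence interval as a fraction
                       of coverage (hypothetical default).
    design_effect:     inflation for cluster sampling (hypothetical default).
    """
    if not (0 < coverage < 1 and 0 < target_prevalence <= 1):
        raise ValueError("coverage and prevalence must be valid proportions")
    # Target-group sample for relative precision: z^2 (1 - p) / (r^2 p).
    n_target = design_effect * z ** 2 * (1 - coverage) / (
        relative_margin ** 2 * coverage)
    # Scale up because only a fraction of births fall in the target group.
    return math.ceil(n_target / target_prevalence)


# Hypothetical inputs: 10% of births in the target group.
for cov in (0.05, 0.10, 0.50):
    print(f"coverage {cov:.0%}: about {births_needed(cov, 0.10):,} births")
```

Under these assumptions the required sample falls steeply as coverage rises, which is consistent with the point that standard survey sample sizes become more adequate once coverage in the target group is high.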

Indicators related to treatment for presumed severe neonatal infection, particularly those related to antibiotic treatment, may be difficult to capture through surveys. Among newborns admitted for treatment of presumed severe infection, we found poor validity in questions about the baby’s diagnosis and treatment, even with short recall periods. Previous studies of survey-reported antibiotic use for childhood illness have shown that these questions perform poorly and were even worse with longer recall periods [49]. These studies also found that maternal reports of symptoms of acute respiratory infection do not provide a correct denominator for monitoring antibiotic treatment rates [50].

Admission to a neonatal unit for infection may be a useful contact point indicator as women were able to report this with high sensitivity. However, similar to KMC, this exit survey question was only asked to women with admitted newborns and further research is required to validate this indicator in a wider population. Additionally, neonatal infection questions will be subject to sample size issues similar to KMC as incidence risk of possible severe bacterial infection is estimated at 7.6% [47]. Hospital registers and records may be a better alternative for reporting coverage of interventions for small target groups such as small and sick newborns. Specific registers can be designed for documentation of treatment of infection in neonatal inpatient wards rather than only maintaining individual case record forms [51].

For indicators already present in DHS/MICS, we found sex of the baby and low birthweight were reported accurately, although birthweight is known to have issues with heaping (preferential reporting of weight with numbers ending in 00) [35, 46]. Immediate drying had very high sensitivity but very low specificity, possibly relating to the timing element. Drying was counted as “immediate” when it was observed as done within 5 min of birth while women were asked, “Was your baby dried or wiped immediately after birth (within a few minutes)?”. In qualitative interviews with women about their understanding of the word "immediate" in questions relating to immediate newborn care, McCarthy et al. found a wide range of responses including 1 or 2 min, up to 7 min, and less than 20 min [30]. Other studies have also shown immediate drying to have high sensitivity, and low or moderate specificity alone or as a composite indicator with other immediate newborn care [27, 28, 30]. Similar to other validation studies, we found early initiation of breastfeeding was largely over-estimated by survey-reported coverage. This over-estimate may be due to poor recall of the timing component if breastfeeding was initiated but not within 1 h [26, 28, 29]. Furthermore, definitions of breastfeeding may differ between clinical observers and breastfeeding women. A woman may have put her baby to the breast and considered this initiation of breastfeeding, but an observer may not have recorded breastfeeding initiation if they did not observe attachment and suckling, as breastfeeding is a complex and dynamic process [34, 37]. Survey questions on breastfeeding may be more accurate if the focus on timing is removed or shifted to something easier to recall such as place.
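A pattern of very high sensitivity with very low specificity, and the resulting over-estimate in survey-reported coverage, can be made concrete with a two-way table. The sketch below uses invented counts, not EN-BIRTH data; the function name `validation_metrics` is hypothetical, and the validity ratio is computed as survey-reported coverage divided by observer-assessed coverage, as defined in this paper.

```python
def validation_metrics(tp, fp, fn, tn):
    """Individual- and population-level validity from a two-way table.

    tp: survey 'yes', observed 'yes'    fp: survey 'yes', observed 'no'
    fn: survey 'no',  observed 'yes'    tn: survey 'no',  observed 'no'
    """
    total = tp + fp + fn + tn
    if total == 0 or (tp + fn) == 0 or (fp + tn) == 0:
        raise ValueError("each observed category needs at least one record")
    survey_coverage = (tp + fp) / total
    observed_coverage = (tp + fn) / total
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (fp + tn),
        "percent_agreement": 100 * (tp + tn) / total,
        "survey_coverage": survey_coverage,
        "observed_coverage": observed_coverage,
        "validity_ratio": survey_coverage / observed_coverage,
    }


# Invented counts: most women say 'yes', including most of those whose
# babies were not observed to receive the practice in time.
metrics = validation_metrics(tp=80, fp=15, fn=2, tn=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

In this invented example nearly all true cases are reported (high sensitivity), but most women whose babies did not receive the practice also answer "yes" (low specificity), so the validity ratio rises above 1.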

While interventions involving women themselves (e.g. skin-to-skin contact or initiation of breastfeeding) had low “don’t know” responses, questions regarding clinical interventions had high levels of “don’t know” responses. These indicators had lower accuracy in survey-reports, even though the recall period was very short (exit survey) compared with the 2- to 5-year recall periods expected in population-based surveys. Low accuracy may relate to women not seeing an intervention happen if newborns are separated from their mothers, or to poor communication about care from health care workers. While a study conducted in primary health care facilities in northern Nigeria found high validity for measurement of chlorhexidine application to the newborn’s cord [27], our study showed low validity in the EN-BIRTH hospitals, possibly because chlorhexidine was not applied in front of the mother or because of a lack of communication between health care workers and women. A detailed validation analysis for chlorhexidine application is published elsewhere [38].

We have considered “don’t know” replies for most yes/no survey questions as “no”, consistent with DHS reporting [40]. We found, however, that for clinical interventions, observed coverage was high among women who responded “don’t know”. While in our study observed coverage of these clinical interventions was high among all newborns in these facilities, true coverage among women responding “don’t know” to these questions for home births or births in smaller facilities may not be as high. Survey-reported coverage of maternal and newborn care may be more accurate if “don’t know” responses are excluded from both numerators and denominators.
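The effect of the two ways of handling “don’t know” replies can be sketched as follows. The responses are invented for illustration, and the function name `survey_coverage` and the response codes `"yes"`, `"no"` and `"dk"` are hypothetical rather than taken from the EN-BIRTH dataset.

```python
from collections import Counter


def survey_coverage(responses, dk_as_no=True):
    """Survey-reported coverage from a list of 'yes' / 'no' / 'dk' replies.

    dk_as_no=True  : 'dk' stays in the denominator and counts as 'no'
    dk_as_no=False : 'dk' is dropped from numerator and denominator
    """
    counts = Counter(responses)
    yes, no, dk = counts["yes"], counts["no"], counts["dk"]
    denominator = yes + no + (dk if dk_as_no else 0)
    if denominator == 0:
        raise ValueError("no usable responses")
    return yes / denominator


# Invented example: a clinical intervention with many "don't know" replies.
responses = ["yes"] * 30 + ["no"] * 10 + ["dk"] * 60
print(survey_coverage(responses, dk_as_no=True))
print(survey_coverage(responses, dk_as_no=False))
```

When “don’t know” replies are frequent, recoding them as “no” pulls survey-reported coverage down sharply, whereas excluding them leaves an estimate based only on women who gave a definite answer.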

Strengths and limitations

Strengths of this study include the large sample of more than 23,000 facility births (>14,000 exit surveys with women with vaginal births) across five high-burden facilities in three countries from sub-Saharan Africa and south Asia, and the use of direct observation by clinically trained researchers as the gold standard. Errors in data collection for observation were minimized by using a custom-built Android application with time-stamping designed to reduce delay in recording events [52]. Data quality was promoted by refresher training and subsets of dual observation by supervisors for comparison [34]. While we did not base our assessment of validity on AUC cut-offs as our indicators were all binary (yes/no), we provide these calculations in Additional files 4 and 10.

This study's limitations included conducting the survey at the time of discharge from the hospital, in contrast to several years after birth as is often done in population-based surveys. As such, recall bias will be minimized in our study, which represents the best-case scenario and not the level of validity captured by population-based surveys. However, as surveys were conducted at the time of discharge, the busy clinical setting may have been distracting and women may have been in a hurry to return home, which may differ from the context of population-based surveys occurring in a home setting. Some bias may have been introduced as >5% of women were discharged before they were approached for interview. We also note that the results may not be representative of lower-level facilities since EN-BIRTH was conducted in five high-volume facilities. Additionally, observed coverage of care may have been higher due to the presence of the observer, further limiting generalisability and possibly altering women’s perception and recollection of care received [12]. In this paper we excluded the 6698 women who had caesarean sections. Since caesarean section affects both the practice of care and survey report, results for many of the 33 indicators would need to be split by caesarean versus non-caesarean birth, adding even more complexity. These important analyses will be undertaken later.

The coverage of the indicators for treatment of presumed severe neonatal infection was reported from data extraction from individual case notes, as observation of admitted neonates for the whole hospital stay was not feasible. There is a possibility that a specific intervention was given but was not documented in the case notes. Despite having a large sample, there were still indicators with very high or low coverage that did not have enough observations in each column of the two-way table to report individual-level validity statistics. For those indicators, we did not report sensitivity, specificity, AUC and IF, and instead reported percent agreement [12]. The percentage agreement should be interpreted cautiously as there is the possibility of high percentage agreement for high sensitivity and low specificity of indicators that have high coverage. Additionally, high percentage agreement is also possible where an indicator has low sensitivity and high specificity with very low coverage.
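The caution about percent agreement follows from simple arithmetic: agreement is a coverage-weighted average of sensitivity and specificity. The sketch below uses invented values and a hypothetical function name, `percent_agreement`, to show both situations described above.

```python
def percent_agreement(sensitivity, specificity, coverage):
    """Percent agreement implied by sensitivity, specificity and true coverage.

    Agreement = P(both 'yes') + P(both 'no')
              = sensitivity * coverage + specificity * (1 - coverage)
    """
    for value in (sensitivity, specificity, coverage):
        if not 0 <= value <= 1:
            raise ValueError("all inputs must be proportions between 0 and 1")
    return 100 * (sensitivity * coverage + specificity * (1 - coverage))


# Invented values: high coverage masks poor specificity ...
print(percent_agreement(sensitivity=0.95, specificity=0.10, coverage=0.95))
# ... and very low coverage masks poor sensitivity.
print(percent_agreement(sensitivity=0.10, specificity=0.95, coverage=0.05))
```

Both invented scenarios give agreement above 90% even though one of the two individual-level accuracy measures is only 10%, which is why percent agreement alone should not be read as evidence of a valid indicator.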

Rates of caesarean sections are rising globally [53]. In our study, the caesarean section rate was 29% overall, and as high as 73% in one hospital, Azimpur BD. Women undergoing caesareans have different experiences from women with vaginal births and may experience more separation from their newborns. Caesarean birth negatively affected accuracy of survey-reported data [34,35,36,37,38]; thus this analysis has focused on vaginal births. Further research on care and measurement among women with caesareans in this study is ongoing. Women with stillbirths were included in our survey, and coverage and measurement gaps for stillbirths are shown for specific indicators throughout this supplement series [35, 54]. The majority of women with stillbirths approached for survey consented to participate and responded to questions on labour and birth [54], in line with other research involving women with stillbirths showing high survey completeness [55]. Women with stillbirths should be included in population-based surveys, particularly to inform action to end preventable stillbirths.

Further research is needed to understand whether improving wording for some survey questions, particularly those related to clinical interventions or those with a timing component (e.g. early initiation of breastfeeding), may improve accuracy. Research on communication surrounding clinical interventions for newborn care, including small and sick newborns, is needed to understand factors contributing to accuracy of survey-reported coverage. More qualitative research regarding women’s understanding of and recall for questions related to timing, such as early breastfeeding and immediate drying, will allow us to improve question wording or indicator definitions. More process evaluation is required to better understand and improve aspects of surveys and survey burden.


Population-based surveys remain an important source of generalisable maternal and newborn health information, especially where routine systems are not available. Among 33 indicators assessed, survey-reported birthweight measured and low birthweight classification performed well; however, other clinical intervention questions and early initiation of breastfeeding performed poorly in survey-report. Further research is needed to see if differently phrased questions could lead to higher accuracy. While specific clinical interventions are not appropriate for surveys, contact indicators such as admission to a neonatal unit or a KMC ward may be useful survey indicator options as markers of care for small and sick newborns. Given that ~80% of births worldwide are now in facilities, investment in routine health management information systems could improve the potential for tracking coverage of clinically focused maternal and newborn health interventions. Household surveys have numerous questions, and careful evidence-based measurement approaches should be applied to select which indicators are best measured in surveys and/or routine systems, based on impact and validity. Valid measurement is required to track scale-up of high-impact interventions and end preventable deaths of women and newborns.

Availability of data and materials

The datasets generated during and/or analysed during the current study are available on LSHTM Data Compass repository,



AUC: Area under the receiver operating curve

PPV: Positive predictive value

NPV: Negative predictive value

IF: Inflation factor

CIFF: Children’s Investment Fund Foundation

DHS: The Demographic and Health Survey Program

ENAP: Every Newborn Action Plan, now branded as Every Newborn

EN-BIRTH: Every Newborn-Birth Indicators Research Tracking in Hospitals study

icddr,b: International Centre for Diarrheal Disease Research, Bangladesh

IHI: Ifakara Health Institute, Tanzania

LMIC: Low-Middle Income Country

LSHTM: London School of Hygiene & Tropical Medicine

MUHAS: Muhimbili University of Health and Allied Sciences, Tanzania

MICS: Multiple Indicator Cluster Survey

UNICEF: United Nations Children's Fund


  1. UNICEF, World Health Organization, World Bank Group. Levels & trends in child mortality 2020. New York: United Nations Children’s Fund; 2020. Accessed 14 Sep 2020.
  2. World Health Organization. Trends in maternal mortality 2000 to 2017: estimates by WHO, UNICEF, UNFPA, World Bank Group and the United Nations population division. Geneva: World Health Organization; 2019.
  3. Blencowe H, Cousens S, Jassir FB, Say L, Chou D, Mathers C, et al. National, regional, and worldwide estimates of stillbirth rates in 2015, with trends from 2000: a systematic analysis. Lancet Glob Health. 2016;4:e98–108.
  4. United Nations Inter-agency Group for Child Mortality Estimation (UN IGME). A Neglected Tragedy: The global burden of stillbirths. New York: United Nations Children’s Fund; 2020. Accessed 12 Oct 2020.
  5. World Health Organization. WHO recommendations on newborn health. 2017. Accessed 7 Feb 2020.
  6. United Nations. Transforming our world: the 2030 Agenda for Sustainable Development 2015. Accessed 7 Feb 2020.
  7. World Health Organization. Every newborn: An action plan to end preventable deaths. Geneva: World Health Organization; 2014. Accessed 7 Feb 2020.
  8. World Health Organization. Strategies towards ending preventable maternal mortality (EPMM): executive summary. 2015. Accessed 19 Aug 2020.
  9. Lawn JE, Cousens S, Zupan J. 4 million neonatal deaths: when? Where? Why? Lancet. 2005;365:891–900.
  10. The DHS Program. The DHS Program Accessed 9 Feb 2018.
  11. Multiple Indicator Cluster Surveys (MICS). Accessed 20 Aug 2020.
  12. Munos MK, Blanc AK, Carter ED, Eisele TP, Gesuale S, Katz J, et al. Validation studies for population-based intervention coverage indicators: design, analysis, and interpretation. J Glob Health. 2018;8:020804.
  13. Buka SL, Goldstein JM, Spartos E, Tsuang MT. The retrospective measurement of prenatal and perinatal events: accuracy of maternal recall. Schizophr Res. 2004;71:417–26.
  14. Casey R, Rieckhoff M, Beebe SA, Pinto-Martin J. Obstetric and perinatal events: the accuracy of maternal report. Clin Pediatr (Phila). 2016.
  15. Elkadry E, Kenton K, White P, Creech S, Brubaker L. Do mothers remember key events during labor? Am J Obstet Gynecol. 2003;189:195–200.
  16. Elliott JP, Desch C, Istwan NB, Rhea D, Collins AM, Stanziano GJ. The reliability of patient-reported pregnancy outcome data. Popul Health Manag. 2010;13:27–32.
  17. Githens PB, Glass CA, Sloan FA, Entman SS. Maternal recall and medical records: An examination of events during pregnancy, childbirth, and early infancy. Birth. 1993;20:136–41.
  18. Quigley MA, Hockley C, Davidson LL. Agreement between hospital records and maternal recall of mode of delivery: evidence from 12 391 deliveries in the UK millennium cohort study. BJOG Int J Obstet Gynaecol. 2007;114:195–200.
  19. Yawn BP, Suman VJ, Jacobsen SJ. Maternal recall of distant pregnancy events. J Clin Epidemiol. 1998;51:399–405.
  20. Lederman SA, Paxton A. Maternal reporting of prepregnancy weight and birth outcome: consistency and completeness compared with the clinical record. Matern Child Health J. 1998;2:123–6.
  21. Tate AR, Dezateux C, Cole TJ, Davidson L. Factors affecting a mother’s recall of her baby’s birth weight. Int J Epidemiol. 2005;34:688–95.
  22. Hakim RB, Tielsch JM, See LC. Agreement between maternal interview- and medical record based gestational age. Am J Epidemiol. 1992;136:566–7.
  23. Troude P, L’Hélias LF, Raison-Boulley A-M, Castel C, Pichon C, Bouyer J, et al. Perinatal factors reported by mothers: do they agree with medical records? Eur J Epidemiol. 2008;23:557–64.
  24. Rice F, Lewis A, Harold G, van den Bree M, Boivin J, Hay DF, et al. Agreement between maternal report and antenatal records for a range of pre and peri-natal factors: the influence of maternal and child characteristics. Early Hum Dev. 2007;83:497–504.
  25. Liu L, Li M, Yang L, Ju L, Tan B, Walker N, et al. Measuring coverage in MNCH: A validation study linking population survey derived coverage to maternal, newborn, and child health care records in rural China. PLoS One. 2013;8.
  26. Blanc AK, Diaz C, McCarthy KJ, Berdichevsky K. Measuring progress in maternal and newborn health care in Mexico: validating indicators of health system contact and quality of care. BMC Pregnancy Childbirth. 2016;16:255.
  27. Bhattacharya AA, Allen E, Umar N, Usman AU, Felix H, Audu A, et al. Monitoring childbirth care in primary health facilities: a validity study in Gombe state, northeastern Nigeria. J Glob Health. 2019;9:020411.
  28. Blanc AK, Warren C, McCarthy KJ, Kimani J, Ndwiga C, RamaRao S. Assessing the validity of indicators of the quality of maternal and newborn health care in Kenya. J Glob Health. 2016;6.
  29. Stanton CK, Rawlins B, Drake M, dos Anjos M, Cantor D, Chongo L, et al. Measuring coverage in MNCH: testing the validity of women’s self-report of key maternal and newborn health interventions during the Peripartum period in Mozambique. PLoS One. 2013;8:e60694.
  30. McCarthy KJ, Blanc AK, Warren CE, Kimani J, Mdawida B, Ndwidga C. Can surveys of women accurately track indicators of maternal and newborn care? A validity and reliability study in Kenya. J Glob Health. 2016;6:020502.
  31. Chang KT, Mullany LC, Khatry SK, LeClerq SC, Munos MK, Katz J. Validation of maternal reports for low birthweight and preterm birth indicators in rural Nepal. J Glob Health. 2018;8:010604.
  32. Moxon SG, Ruysen H, Kerber KJ, Amouzou A, Fournier S, Grove J, et al. Count every newborn; a measurement improvement roadmap for coverage data. BMC Pregnancy Childbirth. 2015;15:S8.
  33. Day LT, Ruysen H, Gordeev VS, Gore-Langton GR, Boggs D, Cousens S, et al. “Every newborn-BIRTH” protocol: observational study validating indicators for coverage and quality of maternal and newborn health care in Bangladesh, Nepal and Tanzania. J Glob Health. 2019;9.
  34. Day LT, Rahman QS, Rahman AE, Salim N, KC A, Ruysen H, Tahsina T, Masanja H, Basnet O, Gore-Langton GR et al. Assessment of the validity of the measurement of newborn and maternal health-care coverage in hospitals (EN-BIRTH): an observational study. The Lancet Global Health. 2020.
  35. Kong S, Day LT, Bin Zaman S, Peven K, Salim N. Birthweight: EN-BIRTH multi-country validation study 2020. BMC Pregnancy Childbirth. 2021.
  36. Ruysen H, Shabani J, Hanson C, Day LT, Pembe AB, Peven K, et al. Uterotonics for prevention of postpartum haemorrhage: EN-BIRTH multi-country validation study. BMC Pregnancy Childbirth. 2021.
  37. Tahsina T, Hossain AT, Ruysen H, Rahman AE, Day LT, Peven K, et al. Immediate newborn care and breastfeeding: EN-BIRTH multi-country validation study. BMC Pregnancy Childbirth. 2021.
  38. Zaman SB, Siddique AB, Ruysen H, KC A, Peven K, Ameen S, et al. Chlorhexidine for facility-based umbilical cord care: EN-BIRTH multi-country validation study. BMC Pregnancy Childbirth. 2021.
  39. World Health Organization. Data quality review: module 2: desk review of data quality. Geneva: World Health Organization; 2017. Accessed 7 Jan 2020.
  40. Croft T, Marshall A, Allen C. Guide to DHS statistics. Rockville: ICF; 2018. Accessed 5 Nov 2018.
  41. StataCorp. Stata Statistical Software. College Station: StataCorp LLC; 2019.
  42. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP, et al. The Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement: guidelines for reporting observational studies. Int J Surg Lond Engl. 2014;12:1495–9.
  43. Lee AC, Cousens S, Wall SN, Niermeyer S, Darmstadt GL, Carlo WA, et al. Neonatal resuscitation and immediate newborn assessment and stimulation for the prevention of neonatal deaths: a systematic review, meta-analysis and Delphi estimation of mortality effect. BMC Public Health. 2011;11:S12.
  44. KC A, Lawn JE, Zhou H, Ewald U, Gurung R, Gurung A, et al. Not crying after birth as a predictor of not breathing. Pediatrics. 2020;145.
  45. Vesel L, Bergh A-MM, Kerber KJ, Valsangkar B, Mazia G, Moxon SG, et al. Kangaroo mother care: a multi-country analysis of health system bottlenecks and potential solutions. BMC Pregnancy Childbirth. 2015;15:S5.
  46. Blencowe H, Krasevec J, de Onis M, Black RE, An X, Stevens GA, et al. National, regional, and worldwide estimates of low birthweight in 2015, with trends from 2000: a systematic analysis. Lancet Glob Health. 2019;7:e849–60.
  47. Seale AC, Blencowe H, Manu AA, Nair H, Bahl R, Qazi SA, et al. Estimates of possible severe bacterial infection in neonates in sub-Saharan Africa, South Asia, and Latin America for 2012: a systematic review and meta-analysis. Lancet Infect Dis. 2014;14:731–41.
  48. World Health Organization. Survive and thrive: transforming care for every small and sick newborn: World Health Organization; 2019. Accessed 22 Jul 2020.
  49. Feikin DR, Audi A, Olack B, Bigogo GM, Polyak C, Burke H, et al. Evaluation of the optimal recall period for disease symptoms in home-based morbidity surveillance in rural and urban Kenya. Int J Epidemiol. 2010;39:450–8.
  50. Campbell H, El Arifeen S, Hazir T, O’Kelly J, Bryce J, Rudan I, et al. Measuring Coverage in MNCH: Challenges in Monitoring the Proportion of Young Children with Pneumonia Who Receive Antibiotic Treatment. PLOS Med. 2013;10:e1001421.
  51. Shamba D, Day LT, Zaman SB, Sunny AK, Tarimo MN, Peven K, et al. Barriers and enablers to routine register data collection for newborns and mothers: EN-BIRTH multi-country validation study. BMC Pregnancy Childbirth. 2021.
  52. Ruysen H, Rahman AE, Gordeev VS, Hossain T, Basnet O, Shirima K, et al. Electronic data collection for multi-country, hospital-based clinical observation of maternal and newborn care: experiences from the EN-BIRTH study. BMC Pregnancy Childbirth. 2021.
  53. Boerma T, Ronsmans C, Melesse DY, Barros AJD, Barros FC, Juan L, et al. Global epidemiology of use of and disparities in caesarean sections. Lancet. 2018;392:1341–8.
  54. Peven K, Day LT, Ruysen H, Tahsina T, KC A, Shabani J, et al. Stillbirths including intrapartum timing: EN-BIRTH multi-country validation study. BMC Pregnancy Childbirth. 2021.
  55. Di Stefano L, Bottecchia M, Haider MM, Galiwango E, Dzabeng F, Fisker AB. Stillbirth maternity care measurement and associated factors in population-based surveys: EN-INDEPTH study. BMC Popul Health Metr. 2021.


Firstly, and most importantly, we thank the women, their families, the health workers and data collectors. We credit the inspiration of the late Dr. Godfrey Mbaruku. We thank Claudia DaSilva, Veronica Ulaya, Mohammad Raisul Islam, Sudip Karki and Rabina Sarki for their administrative support and Sabrina Jabeen, Goutom Banik, Md. Shahidul Alam, Tamatun Islam Tanha and Md. Mohsiur Rahman for support during data collectors' training.

We acknowledge the following groups for support and inputs:

National Advisory Groups:

Bangladesh: Mohammod Shahidullah, Khaleda Islam, Md Jahurul Islam.

Nepal: Naresh P KC, Parashu Ram Shrestha.

Tanzania: Muhammad Bakari Kambi, Georgina Msemo, Asia Hussein, Talhiya Yahya, Claud Kumalija, Eliudi Eliakimu, Mary Azayo, Mary Drake, Honest Kimaro.

EN-BIRTH validation collaborative group:

Bangladesh: Md. Ayub Ali, Bilkish Biswas, Rajib Haider, Md. Abu Hasanuzzaman, Md. Amir Hossain, Ishrat Jahan, Rowshan Hosne Jahan, Jasmin Khan, M A Mannan, Tapas Mazumder, Md. Hafizur Rahman, Md. Ziaul Haque Shaikh, Aysha Siddika, Taslima Akter Sumi, Md. Taqbir Us Samad Talha.

Tanzania: Evelyne Assenga, Claudia Hanson, Edward Kija, Rodrick Kisenge, Karim Manji, Fatuma Manzi, Namala Mkopi, Mwifadhi Mrisho, Andrea Pembe.

Nepal: Jagat Jeevan Ghimire, Rejina Gurung, Elisha Joshi, Avinash K Sunny, Naresh P. KC, Nisha Rana, Shree Krishna Shrestha, Dela Singh, Parashu Ram Shrestha, Nishant Thakur.

LSHTM: Hannah Blencowe, Sarah G Moxon.

EN-BIRTH Expert Advisory Group: Agbessi Amouzou, Tariq Azim, Debra Jackson, Theopista John Kabuteni, Matthews Mathai, Jean-Pierre Monet, Allisyn C. Moran, Pavani K. Ram, Barbara Rawlins, Jennifer Requejo, Johan Ivar Sæbø, Florina Serbanescu, Lara Vaz.

We are also very grateful to fellow researchers who peer-reviewed this paper.

This paper is published with permission from the Directors of Ifakara Health Institute, Muhimbili University of Health and Allied Sciences, icddr,b and Golden Community.

About this supplement

This article has been published as part of BMC Pregnancy and Childbirth Volume 21 Supplement 1, 2021: Every Newborn BIRTH multi-country validation study: informing measurement of coverage and quality of maternal and newborn care. The full contents of the supplement are available online at


The Children’s Investment Fund Foundation (CIFF) are the main funder of The EN-BIRTH Study which is administered via The London School of Hygiene & Tropical Medicine. The Swedish Research Council specifically funded the Nepal site through Lifeline Nepal and Golden Community. We acknowledge the core funders for all the partner institutions. Publication of this manuscript has been funded by CIFF. CIFF attended the study design workshop but had no role in data collection, analysis, data interpretation, report writing or decision to submit for publication. The corresponding author had full access to study data and final responsibility for publication submission decision.

Author information





The EN-BIRTH study was conceived by JEL, who acquired the funding and led the overall design with support from HR. Each of the three country research teams contributed to the design of data collection tools and review processes, data collection and quality management, with technical coordination from HR, GRGL, and DB. The icddr,b team (notably AER, TT, TH, QSR, SA and SBZ) led the development of the software application, data dashboards and database with VG and the LSHTM team. IHI (notably DS) coordinated work on barriers and enablers for data collection and use, working closely with LTD. QSR was the main lead for data management, working closely with OB, KS and LTD. For this paper, SA led the analyses, working closely with ABS and with assistance from QSR. SA, ABS, KP, LTD drafted the manuscript with JEL. Authors made substantial contributions to the conception, design, data collection or analysis or interpretation of data for the work, including: icddr,b, Bangladesh: SA with ABS, TT, QSR, AER, SBZ, ATH, AA, SEA; Golden Community, Nepal: OB, HM, AKC; Ifakara Health Institute, Tanzania: JS with DS; LSHTM: KP with DB, HR, HB, LTD, JEL; Other: FA and JR. All authors revised the manuscript and gave final approval of the version to be published and agree to be accountable for the work. The EN-BIRTH study group authors made contributions to the conception, design, data collection or analysis or interpretation of data.

EN-BIRTH Study Group

Bangladesh: Qazi Sadeq-ur Rahman, Ahmed Ehsanur Rahman, Tazeen Tahsina, Sojib Bin Zaman, Shafiqul Ameen, Tanvir Hossain, Abu Bakkar Siddique, Aniqa Tasnim Hossain, Tapas Mazumder, Jasmin Khan, Taqbir Us Samad Talha, Rajib Haider, Md. Hafizur Rahman, Anisuddin Ahmed, Shams El Arifeen.

Nepal: Omkar Basnet, Avinash K Sunny, Nishant Thakur, Rejina Gurung, Anjani Kumar Jha, Bijay Jha, Ram Chandra Bastola, Rajendra Paudel, Asmita Paudel, Ashish KC.

Tanzania: Nahya Salim, Donat Shamba, Josephine Shabani, Kizito Shirima, Menna Narcis Tarimo, Godfrey Mbaruku (deceased), Honorati Masanja.

LSHTM: Louise T Day, Harriet Ruysen, Kimberly Peven, Vladimir Sergeevich Gordeev, Georgia R Gore-Langton, Dorothy Boggs, Stefanie Kong, Angela Baschieri, Simon Cousens, Joy E Lawn.

Corresponding author

Correspondence to Shafiqul Ameen.

Ethics declarations

Ethics approval and consent to participate

This study was granted ethical approval by institutional review boards in all operating countries in addition to the London School of Hygiene & Tropical Medicine (Additional file 12).

Voluntary informed written consent was obtained from all observed participants and their families for newborns. Participants were assured of anonymity and confidentiality. All women were provided with a description of the study procedures in their preferred language at admission, and offered the right to refuse, or withdraw consent at any time during the study. Facility staff were identified before data collection began and no health worker refused to be observed whilst providing care. EN-BIRTH is study number 4833, registered at

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Exit survey question wording of EN-BIRTH study compared to DHS/MICS question wording.

Additional file 2.

List of indicators with denominator used.

Additional file 3.

Definitions and formulas for validation metrics.

Additional file 4.

Full validation analysis of the selected indicators in the EN-BIRTH study.

Additional file 5.

STROBE checklist for cross-sectional studies.

Additional file 6.

Characteristics of mother & baby- Labour & Delivery of EN-BIRTH study.

Additional file 7.

Characteristics of mother & baby- Kangaroo mother care (KMC) of EN-BIRTH study.

Additional file 8.

Characteristics of mother and baby- Neonatal Infection of EN-BIRTH study.

Additional file 9.

Observer-assessed and survey-reported coverage by site.

Additional file 10.

Full validation analysis of the selected indicators included in DHS/MICS (only yes vs no).

Additional file 11.

Pooled analysis (random effects).

Additional file 12.

Ethical approval of local institutional review boards for EN-BIRTH study.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Ameen, S., Siddique, A.B., Peven, K. et al. Survey of women’s report for 33 maternal and newborn indicators: EN-BIRTH multi-country validation study. BMC Pregnancy Childbirth 21, 238 (2021).



  • Birth
  • Maternal
  • Newborn
  • Coverage
  • Validity
  • Survey
  • Indicators
  • Accuracy