Reliability of maternal recall of delivery and immediate newborn care indicators in Sarlahi, Nepal

Background The intrapartum period is a time of high mortality risk for newborns and mothers. Numerous interventions exist to minimize risk during this period. Data on intervention coverage are needed for health system improvement. Maternal report of intrapartum interventions through surveys is the primary source of coverage data, but such reports may be invalid or unreliable.
Methods We assessed the reliability of maternal report of delivery and immediate newborn care for a sample of home and health facility births in Sarlahi, Nepal. Mothers were visited as soon as possible following delivery (< 72 h) and asked to report circumstances of labor and delivery. A subset was revisited 1–24 months after delivery and asked to recall interventions received using standard household survey questions. We assessed the reliability of each indicator by comparing what mothers reported immediately after delivery against what they reported at the follow-up survey. We assessed potential variation in reliability of maternal report by characteristics of the mother, birth event, or intervention prevalence.
Results One thousand five hundred two mother/child pairs were included in the reliability study, with approximately half of births occurring at home. A higher proportion of women who delivered in facilities reported “don’t know” when asked to recall specific interventions both initially and at follow-up. Most indicators had high observed percent agreement, but kappa values were below 0.4, indicating agreement was primarily due to chance. Only “received any injection during delivery” demonstrated high reliability among all births (kappa: 0.737). The reliability of maternal report was typically lower among women who delivered at a facility. There was no difference in reliability based on time since birth of the follow-up interview. We observed over-reporting of interventions at follow-up that were more common in the population and under-reporting of less common interventions.
Conclusions This study reinforces previous findings that mothers are unable to report reliably on many interventions within the peripartum period. Household surveys which rely on maternal report, therefore, may not be an appropriate method for collecting data on coverage of many interventions during the peripartum period. This is particularly true among facility births, where many interventions may occur without the mother’s full knowledge. Supplementary Information The online version contains supplementary material available at 10.1186/s12884-021-03547-5.


Background
The intrapartum and immediate postpartum periods are a time of high mortality risk for newborns and mothers. Neonatal deaths account for almost half of all under-five deaths, and mortality within this period has been difficult to reduce [1]. Numerous interventions exist to minimize risk to both newborns and mothers in the peripartum period. Data on population coverage of these interventions are needed for health programs to ensure that interventions are reaching those in need. However, such data are often scarce or unreliable. Information on the content or quality of care administered during labor and delivery at health facilities is frequently un- or under-documented, and rarely reported in national health information systems. Even less information is available for deliveries occurring outside of the government health system. Population-based surveys of women, with questions focused on care received during recent deliveries, are often used to capture data on intervention coverage for both facility and non-facility deliveries. However, an increasing number of studies have shown that women are often unable to accurately report on the content of care received, particularly for interventions occurring during the peripartum period [2][3][4][5][6]. A qualitative study by Yoder and colleagues in Bangladesh and Malawi found mothers had difficulty understanding some terminology related to peripartum care and comprehending questions about the timing of events following birth [7]. Work by McCarthy and colleagues suggests that pain, fatigue, and relief of a successful delivery may distract women from noting the care they received [5]. Further, women may not be told about the care they receive, such as the type of injection, or may be more or less likely to report an intervention due to social desirability biases, sometimes providing responses that are believed to be viewed more positively by others.
To date, validation studies of peripartum care have focused on women delivering in health facilities and have primarily been conducted in sub-Saharan Africa or Latin America.
We assessed the reliability of maternal report of delivery care and immediate newborn care for a sample of both home and health facility births in Nepal. Within the first 3 days after delivery, mothers were asked to report on interventions received during the peripartum period. These same women were visited between 1 and 24 months later and asked to recall the interventions they received during the peripartum period. Maternal report at both time points was compared to assess the reliability of maternal recall of peripartum health interventions. We also assessed potential variation in recall reliability by characteristics of the birth, the mother, and intervention prevalence.

Study setting
The study was conducted in the Sarlahi district of Nepal, which borders the Indian state of Bihar to the south.
Residents are primarily Hindu and agrarian. In the study area, approximately half of births occur at home and half in health facilities.

Parent trial
Data on interventions received during the peripartum period were collected through a parent trial conducted jointly by the Nepal Nutrition Intervention Project-Sarlahi (NNIPS) and our local partner organization, Nepal Netra Jyoti Sangh, under the auspices of the Social Welfare Council of the Government of Nepal. Women and their newborns were enrolled in a randomized community-based trial to investigate the impact of full-body newborn massage with sunflower seed oil on newborn deaths and infections. The trial was registered at ClinicalTrials.gov (NCT01177111). The study took place in 34 Village Development Committees in the rural district of Sarlahi, Nepal, between November 2010 and January 2017.
Pregnant women were identified in the community and followed through delivery. Pregnant women participating in the trial were given clean birth kits, chlorhexidine (CHX) for application to the cut umbilical stump, deworming tablets, and counseling on early breastfeeding, thermal care, umbilical cord care, delivery care, postnatal care, and danger signs during labor and the postnatal period. Mothers were visited as soon as possible following delivery, typically within 24 h, and asked to report on the date/time of delivery, circumstances of labor and delivery, the health status of the mother and newborn, and the baby's weight. The wording of relevant delivery and immediate postpartum intervention questions administered to the mother at the first visit is listed in Supplementary Questionnaire 1. Additional interviews conducted throughout the first month (days 3, 7, 10, 14, 21, and 28) focused on maternal report and directly observed aspects of newborn health.

Reliability substudy
We randomly selected a subset of mother/child pairs that participated in the parent trial and revisited these women between April and September 2016. Each selected mother was visited at home and asked to report on interventions and events in the peripartum period, including labor and delivery, immediate newborn care, postnatal care, and illness and care within the first 7 days of life using standard questions from the Demographic and Health Survey (DHS) or Multiple Indicator Cluster Survey (MICS) where applicable (see Supplementary Questionnaire 2). Mothers who had a singleton live birth and who were visited at home within 72 h after delivery were eligible. We interviewed approximately equal numbers of mothers at each of seven follow-up time periods: 1, 3, 6, 9, 12, 18, or 24 months after birth (Fig. 1).
Selected mothers were requested to participate in the validation substudy conducted by study staff through an oral consent process in either the Nepali or Maithili language, both of which are spoken in the area. Those who consented to participate were asked to recall care during delivery and the immediate postnatal period prior to discharge.

Ethical approval
The parent trial and validation substudy were approved by the Johns Hopkins Bloomberg School of Public Health Institutional Review Board in Baltimore, USA. In Nepal, approval was received from the Tribhuvan University Institute of Medicine, Kathmandu (parent trial) and the Nepal Health Research Council, Kathmandu (substudy).

Data analysis
The reliability of each indicator was assessed by comparing what mothers reported immediately after delivery against what was reported at the follow-up survey. We assessed the reliability of 14 indicators, including the use of items within the clean birth kit, injections given during labor / delivery, immediate newborn care, cord care, and early initiation of breastfeeding. Indicators related to immediate newborn care were defined as practices generally occurring between the delivery of the child and the delivery of the placenta. Questions about the application of CHX or other substances to the cord stump were limited to applications immediately following delivery. Early breastfeeding initiation was defined as putting the child to the breast within the first hour after delivery.
For both the initial assessment and follow-up survey, we assessed the proportion of mothers who responded "don't know" (DK) when asked whether the intervention or practice occurred. Each mother was asked to identify each injection given during labor and delivery. If the mother reported she could not identify a specific injection and did not state that oxytocin or ergometrine was given, her response on "injectable Oxytocin / Ergometrine given during delivery" was classified as "DK". The same logic was used for classifying the use of chlorhexidine. Excluding DK responses, we calculated the observed percent agreement, expected percent agreement, and kappa of maternal recall for each indicator. A sensitivity analysis maintaining DK responses as a separate response category was also conducted. The kappa statistic (κ) was used to measure the test-retest reliability of maternal report after excluding agreement due to chance. Chance expected percent agreement (p_e) was defined as classification at random assuming probability equal to the overall proportion of yes and no responses at the initial (time 1) and follow-up (time 2) interview. Observed percent agreement (p_o) was calculated as the proportion of mothers reporting either receiving or not receiving a specific intervention at both the initial and follow-up survey. The kappa statistic was calculated as the difference between the observed and expected agreement over one minus the expected chance agreement (Formula 1).
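Formula 1 is not reproduced in this text. Based on the definitions in the paragraph above, it presumably takes the standard form of Cohen's kappa for a binary indicator. The cell counts n and the superscripts (1) and (2) for the initial and follow-up interviews are notation introduced here for illustration, not taken from the original.

```latex
\kappa = \frac{p_o - p_e}{1 - p_e},
\qquad
p_o = \frac{n_{\text{yes,yes}} + n_{\text{no,no}}}{N},
\qquad
p_e = p^{(1)}_{\text{yes}}\, p^{(2)}_{\text{yes}} + p^{(1)}_{\text{no}}\, p^{(2)}_{\text{no}}
```

Here N is the number of mothers with a yes or no response at both interviews, and each p term is the proportion of yes or no responses at the interview indicated.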
A κ = 1 is considered perfect agreement and κ = 0 is considered no agreement beyond that expected by chance alone. We interpreted κ values of greater than 0.4 as indicating moderate reliability and values greater than 0.6 as indicating strong reliability. We also calculated the proportion of women who changed their responses: 1) those who reported not receiving an intervention at the initial assessment but reported receipt during the follow-up survey (over-report), and 2) those who reported receiving an intervention at the initial assessment but did not report the intervention during the follow-up survey (under-report). Analyses were stratified by site of delivery, dichotomized as facility deliveries and home deliveries. At the level of the individual respondent, we assessed potential variation in the reliability of maternal report by characteristics of the mother or birth event we hypothesized could potentially alter women's ability to recall events around delivery. We used multivariable logistic regression to assess differences in percent agreement, over-reporting, and under-reporting of each indicator by time between birth and follow-up interview (continuous variable), child sex, maternal education (none versus any), maternal age, parity, birth location, and presence of delivery complication. Delivery complications were defined as maternal report of complications during delivery such as excessive bleeding, prolonged labor, convulsions, fever, or obstructed labor at the initial interview. We also looked for unadjusted differences in percent agreement by time since birth using binned categories for the best performing indicators.
We also assessed associations at the indicator level between the reliability of maternal recall and underlying intervention coverage or prevalence, based on initial maternal report. We calculated the unadjusted associations between intervention prevalence and indicator estimates of percent agreement, κ, the proportion of women over-reporting, and the proportion of women under-reporting each indicator. Where significant associations were identified, we calculated the proportion of variability in indicator reliability explained by differences in underlying intervention prevalence. All analyses were conducted in Stata version 14.0 (StataCorp, College Station, TX, USA).
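The agreement measures described above can be sketched for a single binary indicator as follows. This is an illustrative Python sketch, not the authors' Stata code: the function name, the "yes"/"no"/"dk" coding, and the example counts are hypothetical, and the over- and under-report proportions assume a denominator of all mothers with a yes or no response at both interviews, which the text does not specify.

```python
import math
from collections import Counter


def reliability_summary(pairs):
    """Summarize test-retest agreement for one binary indicator.

    `pairs` holds (initial, follow_up) responses coded "yes", "no", or "dk"
    (hypothetical coding chosen for this sketch).
    """
    # Main analysis: keep only pairs with a yes/no answer at both interviews.
    valid = ("yes", "no")
    kept = [(a, b) for a, b in pairs if a in valid and b in valid]
    n = len(kept)
    if n == 0:
        raise ValueError("no usable response pairs")

    cells = Counter(kept)
    yes_yes, yes_no = cells[("yes", "yes")], cells[("yes", "no")]
    no_yes, no_no = cells[("no", "yes")], cells[("no", "no")]

    # Observed agreement: same answer at both interviews.
    p_o = (yes_yes + no_no) / n
    # Chance agreement from the marginal "yes" proportions at each interview.
    p_yes_1 = (yes_yes + yes_no) / n
    p_yes_2 = (yes_yes + no_yes) / n
    p_e = p_yes_1 * p_yes_2 + (1 - p_yes_1) * (1 - p_yes_2)
    # Kappa is undefined when chance agreement is already complete.
    kappa = (p_o - p_e) / (1 - p_e) if p_e < 1 else float("nan")

    # Thresholds from the text: > 0.4 moderate, > 0.6 strong.
    if math.isnan(kappa):
        label = "undefined"
    elif kappa > 0.6:
        label = "strong"
    elif kappa > 0.4:
        label = "moderate"
    else:
        label = "low"

    return {
        "n": n,
        "p_o": p_o,
        "p_e": p_e,
        "kappa": kappa,
        "reliability": label,
        "over_report": no_yes / n,   # "no" initially, "yes" at follow-up
        "under_report": yes_no / n,  # "yes" initially, "no" at follow-up
    }


# Hypothetical counts, for illustration only.
example = (
    [("yes", "yes")] * 45 + [("yes", "no")] * 5
    + [("no", "yes")] * 5 + [("no", "no")] * 45
    + [("dk", "yes")] * 3
)
summary = reliability_summary(example)
print(summary)
```

Keeping DK as a third response category, as in the sensitivity analysis, would require extending the table to three rows and columns; that variant is not shown here.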
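One common way to express "proportion of variability explained" is the R² of a simple linear regression across indicators. The sketch below assumes that interpretation, which the text does not confirm. The function name and the input values are hypothetical and are not study data.

```python
def r_squared(x, y):
    """R^2 of a simple linear regression of y on x.

    For one predictor this equals the squared Pearson correlation.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    return sxy ** 2 / (sxx * syy)


# Hypothetical indicator-level values: prevalence at the initial interview
# and the proportion of women over-reporting each indicator at follow-up.
prevalence = [0.10, 0.30, 0.55, 0.80, 0.95]
over_report = [0.02, 0.05, 0.09, 0.15, 0.16]
print(f"R^2 = {r_squared(prevalence, over_report):.2f}")
```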

Results
Of the 1892 women selected for the reliability study, 363 were not available, 5 had moved permanently, 3 had died, 4 refused to participate, and 1517 were consented and interviewed (see Supplementary Figure 1). There was no significant difference in the characteristics of those women who were and were not available to participate. After excluding 15 participants (birth assessment > 72 h after birth [n = 3], twin delivery [n = 1], repeat participation due to multiple eligible births [n = 11]), 1502 mother/child pairs were included in the substudy. Of these, 220 were enrolled in the one-month recall group, 207 in the three-month group, 205 in the six-month group, 196 in the nine-month group, 193 in the 12-month group, 284 in the 18-month group, and 197 in the 24-month group (Fig. 1).
More than half of newborns were male (55.5%), and a majority of births occurred in the home (53.8%) (Table 1). The mean recall period was 10.8 months. The mean age of mothers was 23.9 years at the time of delivery. Most mothers had no schooling (68.3%) and had prior children (71.4%). Participants were nearly universally of Madhesi (people of the plains) ethnic origin (96.2%), frequently lacked a household latrine (71.2%), but owned some type of land (97.4%). The substudy sample was comparable to the parent trial sample, but the parent trial sample was more balanced by child sex (male = 51.3%). Stratifying by the site of delivery, women who delivered in health facilities were younger and more educated than women who delivered at home (see Supplementary Table 1). They also were more often having their first child (41.6% vs 17.3%) and more often reported birth complications (28.9% vs 8.7%) compared to women who delivered at home.
As an initial assessment of intervention recall and question comprehension, we assessed the proportion of mothers that responded DK when asked about each intervention or practice at the initial post-birth assessment and at follow-up (Table 2). Only a handful of indicators (4 for facility births, 1 for home births) had a greater than 5% DK response rate during the initial assessment, but this increased to 9 and 5 indicators, respectively, at follow-up. Recall of the type of injection received the highest proportion of DK responses. During the initial interview, most mothers reported receiving multiple injections during delivery, and mothers could not identify approximately 90% of the injections they reportedly received. Recall of type of injection was also poor at the follow-up survey; however, it was partially masked by a reduction in the number of injections that mothers reported receiving during delivery. During the initial assessment, women reported receiving 1.83 injections on average, which fell to 1.27 during the follow-up survey.
The proportion of women who gave DK responses was higher overall among women that delivered at a facility, compared to women who delivered at home, at both the initial and follow-up interview. For example, at the initial assessment, 10.4% of women who delivered at a facility could not report whether a new blade had been used to cut the umbilical cord compared to just 0.2% of women who delivered at home. This was true for other cord care indicators, with 5-10% of facility-delivering mothers reporting DK at the initial assessment increasing to 15-29% reporting DK at the follow-up survey. Indicators involving the timing of wiping, wrapping, bathing, and cord-cutting had > 5% DK responses at follow-up across both home and facility births, but the proportion of DK responses was much higher for facility births.
The reported intervention coverage, percent agreement, and kappa values of each indicator are presented in Table 3, and for facility and home deliveries separately in Table 4. The majority of indicators had a high observed percent agreement, but most kappa values were below 0.4, indicating agreement was primarily due to chance. Among all observations, three indicators showed moderate reliability with kappa values greater than 0.4, including "any part of the clean birthing kit used for the delivery," "sheet from clean birthing kit used for the delivery," and "cord cut after placenta delivered." Only one indicator, "received any injection during delivery," demonstrated high reliability with a kappa of 0.737. Stratifying by place of delivery, the reliability of maternal report was much lower among women who delivered at a facility. Only 5 out of 14 indicators had greater than 70% agreement among facility deliveries, and none had kappa values above 0.4. Among home deliveries, 11 out of 14 indicators had > 70% agreement, and both use of a clean birthing kit and any injection during delivery had kappa values above 0.4 and 0.6 respectively. Inclusion of DK as a separate response category did not significantly alter the reliability of any indicator, with the exception of "received oxytocin / ergometrine during delivery," where reliability improved due to the high number of DK responses (see Supplementary Tables 2 & 3).
We assessed associations at the indicator level between underlying intervention coverage or prevalence, based on the initial maternal report (after excluding DKs), and measures of maternal recall reliability. There was a U-shaped association between intervention prevalence and observed percent agreement (Fig. 2). The proportion of women who changed their initial report of each indicator and coverage of each intervention or prevalence of each practice in the population is presented in Supplementary Table 4. We observed an association between underlying intervention coverage and the proportion of women over- or under-reporting the intervention at the follow-up interview (Fig. 3). Among both home and facility births, we observed a higher proportion of women changed their initial report of no intervention to received intervention (over-reporting) for interventions that were more common in the population. Conversely, we observed a higher proportion of women changed their initial report of received intervention to did not receive the intervention (under-reporting) among interventions that were less common in the population. These associations were stronger among home deliveries than facility deliveries. Underlying intervention prevalence accounted for 83% of the variation in over-reporting and 85% of under-reporting among home births, but only 46 and 43% of over- and under-reporting, respectively, among facility births. There was no association between underlying intervention prevalence and indicator kappa statistics (data not shown).
We observed no significant differences in percent agreement by binned recall time for any of the four best performing indicators (Fig. 4). Similarly, in the adjusted logistic regression model, we observed a negligible but statistically significant reduction in agreement of report by increasing recall period for most indicators, controlling for child sex, place of delivery, birth complications, parity, maternal age, education, and ethnicity (Table 5). Mothers had statistically lower odds of reliably reporting 10 of 14 interventions if they delivered in a facility and statistically greater odds of reliably reporting if the baby was wrapped before the placenta delivery if they delivered at a facility. After disaggregating by location of delivery, there was little statistical difference in the reliability of report by recall length among home deliveries (Table 6). There were no clear cross-cutting associations between respondent and birth characteristics and over- or under-reporting overall (see Supplementary Tables 5 & 7) or by site of delivery (see Supplementary Tables 6 & 8).
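A short worked example may help explain why percent agreement tends to be U-shaped in prevalence. If two reports were statistically independent with the same "yes" proportion p, the agreement expected by chance alone would be p² + (1 − p)², which is lowest at p = 0.5 and rises toward 1 at either extreme. The sketch below is illustrative only; the function name and the prevalence values are hypothetical.

```python
def chance_agreement(p_yes_1, p_yes_2=None):
    """Agreement expected by chance for two independent binary reports."""
    if p_yes_2 is None:
        p_yes_2 = p_yes_1  # assume the same prevalence at both interviews
    return p_yes_1 * p_yes_2 + (1 - p_yes_1) * (1 - p_yes_2)


for p in (0.05, 0.25, 0.50, 0.75, 0.95):
    print(f"prevalence {p:.2f} -> chance agreement {chance_agreement(p):.3f}")
# prevalence 0.05 -> chance agreement 0.905
# prevalence 0.25 -> chance agreement 0.625
# prevalence 0.50 -> chance agreement 0.500
# prevalence 0.75 -> chance agreement 0.625
# prevalence 0.95 -> chance agreement 0.905
```

Under these assumptions, an indicator with 95% prevalence would show about 90% agreement even if the follow-up answers carried no information about the initial ones, which corresponds to a kappa of zero.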

Discussion
This study found poor reliability of maternal report of immediate newborn care indicators as collected through a household survey. A high percent agreement between initial and follow-up reports was observed for most interventions. However, the kappa values for most interventions were low, suggesting observed agreement was mostly due to chance from very high or very low intervention coverage. This was further evidenced by the U-shaped association between intervention prevalence and observed agreement. Only receipt of an injection during delivery could be recalled with high reliability, but not information on the type of injection. For most indicators, women who delivered at home had greater odds of reliably reporting on the intervention compared to women who delivered at a facility. We also observed a high proportion of women who delivered in health facilities failing to recall interventions during the initial interview (< 72 h) after delivery. A negligible, although sometimes statistically significant, decline in recall was observed with increasing time since delivery.
This study reinforces previous research that suggests mothers are unable to effectively recall interventions or practices which occur during the peripartum period. Studies by Blanc, McCarthy, and Stanton have demonstrated poor recall accuracy within the peripartum period among women in multiple low- and middle-income settings [2][3][4][5][6]. Multiple factors could contribute to poor recall. Our study suggests that mothers may never have known about some interventions they received, as evidenced by the high proportion of mothers that reported they didn't know the type of injection they received among both facility and home births at the initial assessment. This agrees with findings from Mexico and Kenya, which showed poor recall of peripartum interventions among women at initial discharge from their labor and delivery facility [2,3].
Similar to previous studies, we found recall close to the time of care was generally poor; however, there was little evidence of reliability-altering recall degradation with increasing time since delivery up to a two-year recall period [5,6]. Currently, the DHS asks women to report on delivery and newborn care for their most recent birth in the previous 3 years, with surveys prior to Phase 8 asking about births in the previous 5 years. However, this study and others only assessed recall for up to 2 years postpartum.
Previous studies have primarily assessed recall among women who delivered in health facilities. In general, this study found the proportion of women who were unable to report whether they received an intervention within 72 h of birth was higher among facility deliveries relative to home deliveries. A possible explanation is that women delivering in a facility may not have been informed about the details of interventions received, such as the substance applied to the child's umbilical cord or whether the instrument used to cut the cord was new or had been sterilized. We also observed significantly weaker recall reliability for most indicators among women who delivered at a facility compared to women who delivered at home. This suggests receipt of these interventions was a less salient event for women delivering in facilities, potentially because they were not informed or counseled on various interventions, because events were obscured by an unfamiliar or chaotic environment, or because they paid less attention to events as they trusted the skilled attendants providing care. Alternatively, mothers may have had a better rapport with home birth attendants, often family or other community members, who may have more effectively communicated events occurring throughout the delivery process. No other characteristics had a consistent effect on mothers' recall reliability.
We observed an association between underlying intervention coverage and maternal report. Interventions that were more common in the population were more likely to be over-reported during the follow-up interview. Likewise, interventions that were uncommon in the population were more likely to be under-reported. This association was stronger among home births than facility births. A forthcoming assessment of maternal report of antenatal and postnatal care interventions in Kenya, Cambodia, and Bangladesh found a similar association between intervention prevalence and under- and over-reporting of intervention receipt among women receiving facility-based care [8]. This association could potentially reflect a social desirability bias, such as wanting to report a practice the mother thought the interviewer would perceive favorably. Alternatively, if the mother could not clearly recall the intervention, she may assume that an intervention did or did not occur if it was or was not a standard practice in the setting.
This study had a number of limitations. This analysis is limited to assessing the test-retest reliability of maternal report and changes in response over time. We were unable to observe interventions and practices during delivery, so we are unable to assess the validity of maternal report. Our assessment did effectively assess changes in recall over time and demonstrated inconsistencies in women's reports of interventions received. Another limitation is that women were classified as having delivered at home or in a facility; however, 2.3% of women delivered on the way to a health facility. These women were classified as delivering at home because of the lack of services they would have received in transit. Fewer than half (n = 11) of those women reported continuing on to a health facility, so we would expect that categorizing them as home births would have minimal effect on the study findings.
Additionally, this study was conducted in a study population that has received a number of interventions related to clean delivery and newborn care over a number of years. This population may have been uniquely primed to recall interventions or may have felt additional pressure to report use of specific interventions. Use of clean delivery kits and clean cord practices was high in this population among both home and facility births due to the trial protocol of providing clean delivery kits and the existence of a successful national chlorhexidine cord care program in Nepal. The reliability of reporting may be different within a population with lower coverage and perhaps less awareness of these interventions. Additional work is needed to assess coverage of interventions among home births in populations without high access to clean birth kits and programs for safer home delivery practices.

Conclusions
This study reinforces previous findings that mothers are unable to effectively report on many interventions or practices within the peripartum period. Household surveys which rely on maternal report therefore may not be an appropriate method for collecting data on coverage of many interventions during the peripartum period. This is particularly true among facility births, where many interventions may occur without the mother's full knowledge. New methods are needed for generating more robust estimates of peripartum intervention coverage. Data suggest that mothers are able to accurately report on the location of delivery. Linking valid data on where mothers deliver with robust data on the content and quality of delivery care at these facilities may be used to generate estimates of intervention coverage.

Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1186/s12884-021-03547-5. Supplementary Table 2. Reliability of maternal report of immediate newborn care indicators treating "don't know" as response category. Supplementary Table 3. Reliability of maternal report of immediate newborn care indicators treating "don't know" as response category, by site of delivery. Supplementary Table 4. Coverage of intervention and measures of maternal report, by site of delivery. Supplementary Table 5. Characteristics associated with maternal over-reporting. Supplementary Table 6. Characteristics associated with maternal over-reporting, by site of delivery. Supplementary Table 7. Characteristics associated with maternal under-reporting. Supplementary Table 8. Characteristics associated with maternal under-reporting, by site of delivery.