Nomogram predicting cesarean delivery undergoing induction of labor among high-risk nulliparous women at term: a retrospective study

Background Our aim was to create and validate a nomogram predicting cesarean delivery after induction of labor among nulliparous women at term. Methods Data were obtained from medical records at Nanjing Drum Tower Hospital. Nulliparous women with singleton pregnancies undergoing induction of labor at term were included. A total of 2950 patients from Jan. 2014 to Dec. 2015 served as the derivation cohort. A nomogram was constructed by multivariate logistic regression using maternal, fetal and pregnancy characteristics. The predictive accuracy and discriminative ability of the nomogram were internally validated by 1000-bootstrap resampling, followed by external validation on a new dataset from Jan. 2016 to Dec. 2016. Results Logistic regression revealed nine predictors of cesarean delivery: maternal height, age, uterine height, abdominal circumference, estimated fetal weight, indications for induction of labor, initial cervical consistency, cervical effacement and station. The nomogram was well calibrated and had an AUC of 0.73 (95% confidence interval [CI], 0.70-0.75) after bootstrap resampling for internal validation. The AUC in external validation reached 0.67, which was significantly higher than that of three previously published models (P<0.05). Conclusions This validated nomogram, constructed from variables obtained from medical records, can help estimate the risk of cesarean delivery before induction of labor. Supplementary Information The online version contains supplementary material available at 10.1186/s12884-022-04386-8.


Background
Induction of labor is one of the most frequently used methods to initiate labor. In the United States, more than 22% of pregnancies undergo induction of labor, and almost one third of inductions end in cesarean delivery [1].
Cervical status has been recognized as one of the most important factors affecting the mode of delivery, and an unfavorable Bishop score (≤ 5) is the predominant risk factor for failed induction [2,3]. Other studies have proposed that several maternal and fetal characteristics, such as maternal age, parity, maternal height, body mass index, gestational age and fetal position, also affect the success rate of labor induction [4][5][6][7][8].
It has been widely assumed that induction of labor increases the risk of cesarean delivery. Previous studies have reported cesarean delivery rates ranging from 22.6 to 50% among nulliparous women after induction of labor at different institutions, most of which included patients with both medical and non-medical indications [9][10][11][12][13]. However, one randomized trial demonstrated that elective induction can decrease the risk of cesarean delivery among low-risk nulliparous women [14]. For high-risk nulliparous women, indications for labor induction, such as diabetes mellitus and hypertension, have been reported to be associated with cesarean delivery [10].
However, little is known so far about how to evaluate this risk, since these factors are seldom used in a comprehensive fashion. Therefore, it is reasonable to develop and validate a nomogram for cesarean delivery among nulliparous women undergoing induction of labor by combining maternal, fetal and pregnancy characteristics.

Materials and methods
The protocol of this retrospective study was approved by the Ethic Review Committee of Nanjing Drum Tower Hospital (reference number 2020-027-01, date of approval 2020-02-25). Written informed consent was obtained from all women enrolled in the program. All nulliparous women with singleton, term, cephalic pregnancies who underwent induction of labor from Jan. 2014 to Dec. 2016 at Nanjing Drum Tower Hospital were enrolled.
Data on maternal and neonatal characteristics were abstracted from medical records and double-checked by two obstetricians. Women with cervical dilation ≥3 cm were excluded since they might have been in spontaneous labor and misclassified as undergoing induction of labor. Women with missing data were also excluded from further analysis. Factors enrolled for analysis were 1) maternal demographic characteristics, including maternal age, maternal height, maternal weight at delivery, uterine height and abdominal circumference; 2) medical indications for induction of labor, including premature rupture of membranes, late term, diabetes mellitus (gestational diabetes and pregestational diabetes), hypertensive disorder of pregnancy, liver dysfunction, fetal growth restriction and oligohydramnios; 3) obstetric conditions, including gestational age, fetal position, cervical examination at admission (cervical dilation, effacement, position, consistency and station) and Bishop score; and 4) neonatal characteristics, including estimated fetal weight, neonatal weight and neonatal sex. Results of the PROBAAT trial and a further meta-analysis have demonstrated that the cesarean delivery rate was similar between the Foley catheter group and the prostaglandin group [15]. In addition, the original purpose of this retrospective study was to provide a user-friendly tool for both doctors and patients, and the procedures of induction might be too technical and confuse patients. Therefore, methods of induction were not included in the study.
Gestational age was determined by the last menstrual period and confirmed by ultrasound examination. The timing of induction was mainly based on the American College of Obstetricians and Gynecologists (ACOG) committee opinion "Medically indicated late-preterm and early-term deliveries" published in April 2013 [16].
The admission examination was the full five-component Bishop score (cervical dilation, effacement, position, consistency and station). A Bishop score of <6 was considered unfavorable. All women with intact membranes whose Bishop score was <6 received at least one method of cervical ripening: Foley catheter, vaginal misoprostol 25 μg every 4 h or vaginal dinoprostone (Propess). The standard application of the Foley catheter was established by our group previously, and it was the first choice for cervical ripening in women without vaginal infection [16]. The choice between vaginal misoprostol and dinoprostone (Propess) was mainly based on provider preference. Artificial rupture of membranes was considered once the Bishop score was ≥ 6.
Women who delivered vaginally after induction of labor were classified as successful induction of labor; those who ended with cesarean delivery for any reason after induction of labor were classified as failed induction of labor. The model was constructed using the derivation cohort from Jan. 2014 to Dec. 2015, which had previously been used to analyze cesarean delivery rates with the 10-Group Classification System [17]. The dataset of Group 2a (nulliparous, singleton cephalic, ≥ 37 weeks, induced labor) had detailed variables for evaluation and matched the target population of our research. Women meeting the same inclusion criteria from Jan. 2016 to Dec. 2016 were enrolled as the validation cohort.
Continuous variables were presented as mean and standard deviation or median and interquartile range, where appropriate. Categorical variables were presented as frequencies and percentages. Student t test and Chi square test were used for continuous and categorical data in univariate analysis, respectively. All variables with a P<0.05 in univariate analysis were then included in a logistic regression model. Covariates were removed in a stepwise fashion until all covariates in the final model had a P<0.05. A nomogram was created in R based on the coefficients of the logistic regression model. The nomogram was internally and externally validated in terms of discrimination and calibration. Discrimination was assessed by receiver operating characteristic (ROC) analysis using 1000 bootstrap resamples, and calibration was assessed graphically by plotting the observed rates against the nomogram-predicted probabilities.
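The bootstrap assessment of discrimination described above can be illustrated with a minimal Python sketch. It is not the authors' R code: it assumes a plain percentile bootstrap of the AUC, which may differ from the resampling procedure actually used, and the function names `auc` and `bootstrap_auc_ci` as well as the toy data are hypothetical.

```python
import random

def auc(labels, scores):
    """AUC as the share of (case, non-case) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one case and one non-case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Point estimate plus percentile bootstrap interval (assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain only one class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return auc(labels, scores), lower, upper

# Toy data (hypothetical): 1 = cesarean delivery, scores = predicted probabilities.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.05, 0.10, 0.12, 0.20, 0.35, 0.50, 0.15, 0.40]
print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=1000))
```

The pairwise AUC is quadratic in sample size, which is acceptable for a sketch but would be replaced by a rank-based computation for cohorts of several thousand women.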
To compare the nomogram with existing prediction models, we searched Medline from 1987 to 2020 for prediction models for induction of labor. The search strategy consisted of the keywords "induction of labor" and "prediction model". Studies were selected in a two-stage process. First, we screened the titles and abstracts of all citations. Second, we obtained full reports published in English that established mathematical models to predict cesarean delivery after induction of labor, with sufficient detail to calculate the probability of cesarean delivery using our own data. Studies covering both nulliparous and multiparous women were also included. The areas under the curve (AUC) from ROC analysis were compared between the nomogram and the models finally included.

Results
From 2014 to 2015, a total of 11,006 deliveries occurred at our hospital, of which 2961 met the inclusion criteria. Eleven pregnancies were excluded because of incomplete data, so 2950 pregnancies were finally enrolled as the derivation cohort. A total of 1935 pregnancies admitted in 2016 were enrolled as the validation cohort. Table 1 shows the antepartum characteristics of pregnancies in the derivation and validation cohorts. The cesarean delivery rate increased slightly from 13.2% in the derivation cohort to 16.4% in the validation cohort. Premature rupture of membranes (PROM), which accounted for over 40% of the study population, remained the leading indication for induction of labor, followed by late term, which accounted for 25% of the population. The derivation and validation cohorts shared similar maternal and fetal characteristics.
Maternal, neonatal and obstetric characteristics were compared between women who delivered vaginally and women who delivered by cesarean by univariate analysis (Table 2). Detailed information about indications for induction of labor among women in the derivation cohort is presented in Table 3.
In univariate analysis, the following variables had a P<0.05 and were considered in logistic regression modeling: maternal age, weight, height, uterine height, abdominal circumference, gestational age, estimated fetal weight, initial cervical dilation, initial station, initial cervical effacement, initial cervical consistency and indications for induction of labor. The independent risk factors associated with cesarean delivery after induction of labor identified by stepwise logistic regression are shown in Table 4. The following 9 variables remained significantly associated with cesarean delivery: maternal age, height, uterine height, abdominal circumference, estimated fetal weight, initial station, initial cervical effacement, initial cervical consistency and indications for induction of labor. For every 1-year increase in maternal age, there was a 9% increase in the odds of cesarean delivery (odds ratio [OR] 1.09, 95% confidence interval [CI] 1.05-1.13). Decreased maternal height was associated with an increased probability of cesarean delivery (OR 0.91, 95%CI 0.88-0.93). A 1 cm increase in uterine height, a 1 cm increase in abdominal circumference and a 100 g increase in estimated fetal weight were associated with a 7% (OR 1.07, 95%CI 1.00-1.13), 5% (OR 1.05, 95%CI 1.03-1.08) and 6% (OR 1.06, 95%CI 1.01-1.12) increase in the odds of cesarean delivery, respectively. Three components of the Bishop score, namely initial station, initial cervical effacement and initial cervical consistency, were associated with the risk of cesarean delivery. Initial cervical effacement levels of 60-70%, 40-50% and 0-30% were associated with an increased risk of cesarean delivery (OR 1.79, 95%CI 0.97-3.31; OR 2.19, 95%CI 1.14-4.21; OR 9.23, 95%CI 1.26-67.56, respectively).
Hypertensive disorder of pregnancy was one of the most important risk factors, since it carried more than a 2-fold increase in the odds of cesarean delivery (OR 2.38, 95%CI 1.38-4.13), followed by late term, which was associated with a 58% increase in the odds of cesarean delivery (OR 1.58, 95%CI 1.20-2.08).
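As a reading aid for the odds ratios reported above, the following Python sketch shows how an OR translates into a percentage change in odds, how it compounds over several units of a predictor, and how it shifts a probability. The helper names are hypothetical, and using the 13.2% cesarean rate of the derivation cohort as a baseline probability is an illustrative assumption, not the model's actual intercept.

```python
def percent_change_in_odds(odds_ratio, units=1.0):
    """Percentage change in odds for `units` steps of a predictor.

    In a logistic model the odds are multiplied by the OR for each unit,
    so `units` steps multiply the odds by OR ** units.
    """
    return (odds_ratio ** units - 1.0) * 100.0

def apply_odds_ratio(baseline_prob, odds_ratio, units=1.0):
    """Convert a probability to odds, scale by OR ** units, convert back."""
    odds = baseline_prob / (1.0 - baseline_prob)
    new_odds = odds * odds_ratio ** units
    return new_odds / (1.0 + new_odds)

# Maternal age: OR 1.09 per year, for 1 year and for 5 years.
print(round(percent_change_in_odds(1.09), 1))
print(round(percent_change_in_odds(1.09, units=5), 1))

# Hypertensive disorder (OR 2.38) applied to an assumed 13.2% baseline risk.
print(round(apply_odds_ratio(0.132, 2.38), 3))
```

Because odds ratios act multiplicatively on odds rather than on probabilities, a 2.38-fold increase in odds corresponds to a less than 2.38-fold increase in probability at any non-trivial baseline risk.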
The nomogram was created based on the coefficients of the parameters retained in the final logistic regression model (Fig. 1). The receiver operating characteristic curve for the nomogram achieved an area under the curve (AUC) of 0.73 (95%CI, 0.70-0.75) for internal validation and 0.67 (95%CI, 0.64-0.70) for external validation after 1000 bootstrap resamples. The sensitivity and specificity of the nomogram reached 0.63 and 0.71, respectively. The calibration curves for both the derivation and validation cohorts revealed good agreement between the risk of cesarean delivery after induction of labor predicted by the nomogram and the actual observations, as shown in Fig. 2A and B.
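The two performance summaries reported here, sensitivity and specificity at a cut-off and agreement between predicted and observed risk, can be sketched in Python as follows. The cut-off of 0.5, the equal-size risk groups and the function names are assumptions for illustration; the source does not state which threshold or grouping produced the reported values.

```python
def sensitivity_specificity(labels, probs, threshold):
    """Classify as 'cesarean' when the predicted probability >= threshold."""
    tp = fn = tn = fp = 0
    for y, p in zip(labels, probs):
        predicted = p >= threshold
        if y == 1 and predicted:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def calibration_table(labels, probs, n_bins=5):
    """Mean predicted risk, observed event rate and size per risk group.

    Patients are sorted by predicted risk and split into n_bins groups of
    (nearly) equal size; a well calibrated model has the first two columns
    close to each other in every group.
    """
    pairs = sorted(zip(probs, labels))
    n = len(pairs)
    rows = []
    for i in range(n_bins):
        chunk = pairs[i * n // n_bins:(i + 1) * n // n_bins]
        if chunk:
            mean_pred = sum(p for p, _ in chunk) / len(chunk)
            observed = sum(y for _, y in chunk) / len(chunk)
            rows.append((mean_pred, observed, len(chunk)))
    return rows

# Toy data (hypothetical): 1 = cesarean delivery.
toy_labels = [0, 0, 1, 1, 0, 1]
toy_probs = [0.1, 0.2, 0.3, 0.8, 0.6, 0.9]
print(sensitivity_specificity(toy_labels, toy_probs, threshold=0.5))
print(calibration_table(toy_labels, toy_probs, n_bins=2))
```

Plotting the observed rate against the mean predicted risk for each group gives a simple binned version of the calibration curve described in the text.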
The initial literature search returned 271 hits. After screening titles and abstracts, 18 articles on cesarean delivery after induction of labor were identified. Fifteen articles were excluded for various reasons. Details of the literature selection are presented in Fig. S1.
The remaining 3 models were finally included in our study [18][19][20]. The probability of cesarean delivery after induction of labor for each patient was calculated according to the mathematical equation published in the corresponding article. The areas under the curve were 0.68, 0.66 and 0.64 for the Robert model, Antonio model and Gordon model, respectively, which were significantly lower than the AUC of our nomogram in the derivation dataset (Fig. 3A). Similar results were found when these models were applied to the validation dataset (Fig. 3B).

Discussion
In this single-institution retrospective cohort of high-risk nulliparous women who underwent induction of labor, we found that maternal age, height, uterine height, abdominal circumference, estimated fetal weight, initial station, initial cervical effacement and initial cervical consistency were independent risk factors for cesarean delivery. In addition, indications for induction of labor independently affected the rate of cesarean delivery. Using these factors, a nomogram was developed and validated to calculate the likelihood of cesarean delivery, which achieved an acceptable AUC of 0.73 for internal validation and 0.67 for external validation. Further analysis revealed that the nomogram showed better discriminative ability than three previously published models. The risk factors included in the nomogram have been reported previously in the literature. Consistent with previous studies, maternal age and height were associated with cesarean delivery [10,20-23]. Medical indications for induction of labor, such as PROM, late term, diabetes mellitus and hypertensive disorder of pregnancy, were also shown to independently affect the rate of cesarean delivery [10,23]. In our research, uterine height, abdominal circumference and estimated fetal weight were modifiable risk factors influencing the probability of cesarean delivery. Traditionally, the Bishop score has been used as the standard evaluation for induction planning. However, not all of its components were related to cesarean delivery [10,23]. Cervical dilation, a favorable factor for vaginal delivery, was not recognized as a risk factor in our study, probably because only a few enrolled patients had a dilated cervix. Cervical position was also not associated with cesarean delivery after adjusting for confounders in this large cohort.
Efforts have been made to assess the risk of cesarean delivery among different populations, and the AUC of these models ranged from 0.68 to 0.79. Gordon et al. established a prediction model for cesarean delivery after labor induction in nulliparous women using four risk factors: maternal age, height, gestational age and fetal sex [18]. The other two models were developed for both nulliparous and multiparous women after induction of labor [19,20]. The differences might result from population composition, since multiparous women are more likely to achieve successful vaginal delivery. Therefore, we chose to focus on nulliparous women, given the contribution of primary cesarean delivery to the overall cesarean delivery rate.
To further compare the discriminative ability of our nomogram with that of the three models mentioned above, we applied all of them to both the derivation dataset and the validation dataset. The AUC of our nomogram reached 0.73 for internal validation and 0.67 for external validation, indicating better discrimination than the existing models. Therefore, our nomogram might be better suited to nulliparous women in the Chinese population.
Our study aimed to establish a nomogram predicting the probability of cesarean delivery after induction of labor. The nomogram achieved clinically useful prediction of cesarean delivery from basic characteristics. However, we should be aware that the nomogram is intended to support patient counseling rather than to drive clinical decisions directly. For example, a nulliparous woman of 30 years (26 points) with hypertensive disorder (22
Because induction of labor will continue to be an important option for pregnancies at risk of maternal and neonatal morbidity, future studies should focus on patients at high risk of cesarean delivery in order to choose the safest option.
Our results were robust for several reasons. First, we used a large, well-described dataset from a retrospective study on changes in the cesarean delivery rate by the 10-Group Classification System to establish the nomogram, which provided high-quality data. Second, the variables included in the nomogram can be easily measured before delivery. Hence, the nomogram will be easy and inexpensive to apply in daily clinical practice. Importantly, both internal and external validation of the nomogram were carried out to assess its reproducibility in a more general population. Finally, comparison with three existing models revealed the better discriminative ability of the nomogram.
The main limitation of the study was its retrospective design. Potential predictors, such as pre-pregnancy weight, were not taken into account because they were not documented in our medical records. Second, application of the nomogram should be limited to patients who meet the inclusion criteria of this study. For example, patients undergoing elective induction of labor should not be counseled with the nomogram, since it was constructed from nulliparous women with medical indications.
Fig. 3 Receiver operating characteristic (ROC) curves of the nomogram and existing models for the derivation cohort (A) and validation cohort (B). The AUC of the nomogram reaches 0.73 in the derivation cohort, which is significantly higher than that of the three existing models (P<0.05). The AUC of the nomogram reaches 0.67 in the validation cohort, which is also significantly higher than that of the three existing models (P<0.05).