Establishment and validation of a predictive model for spontaneous preterm birth in singleton pregnant women

Abstract

Introduction

In the current study, we screened for highly sensitive and specific predictors of premature birth, with the aim to establish an sPTB prediction model that is suitable for women in China and easy to operate and popularize, as well as to establish a sPTB prediction scoring system for early, intuitive, and effective assessment of premature birth risk.

Methods

A total of 685 women with singleton pregnancies in the second trimester (16–26 weeks) were divided into premature and non-premature delivery groups based on their delivery outcomes. Clinical and ultrasound information was collected for both groups, and risk factors that could lead to sPTB were screened and analyzed using optimal cut-off values. A nomogram was developed to establish a prediction model and scoring system for sPTB. In addition, 119 pregnant women who met the same inclusion criteria as the modeling cohort were included in the external validation of the model. The accuracy and consistency of the model were evaluated using the area under the receiver operating characteristic (ROC) curve and calibration curves.

Results

Multivariate logistic regression analysis showed a significant correlation (P < 0.05) between the occurrence of sPTB and the number of miscarriages, history of miscarriage at ≥ 12 weeks of gestation, history of preterm birth, CL, cervical opening, and persistent cervical opening. We drew a nomogram based on these six risk factors, obtained a predictive model for sPTB, and established a scoring system to divide premature birth risk into three groups: low, medium, and high. After validating the model, the Hosmer–Lemeshow test indicated a good fit (p = 0.997). The calibration curve of the modeling cohort was close to the diagonal (C-index = 0.856), as was that of the validation cohort (C-index = 0.854). The AUCs of the modeling and validation cohorts were 0.850 and 0.881, respectively.

Conclusion

Our predictive model is consistent with China’s national conditions, as well as being intuitive and easy to operate, with wide applicability, thus representing a helpful tool to assist with early detection of sPTB in clinical practice, as well as for clinical management in assessing low, medium, and high risks of sPTB.

Spontaneous preterm birth (sPTB) is defined as the spontaneous appearance of threatened preterm birth, premature rupture of membranes, and subsequent premature delivery before 37 weeks [1]. sPTB can seriously affect the survival rate and prognosis of newborns and impose a heavy burden on families [2]. Thus far, scholars at home and abroad have conducted extensive research on the influencing factors of sPTB and selected different influencing factors to establish different prediction models for different populations [3,4,5]. However, the modeling theory targets different pregnant women, and the included predictive factors may be limited or difficult to obtain, which affects its clinical applicability. Moreover, most predictive models used in previous studies were applicable to non-Asian women, and there remains a lack of effective predictive models for clinical applications in China. Therefore, in this study, we aimed to select more sensitive and specific predictors, establish a prediction model that is clinically easy to operate, easy to popularize, and more suitable for Chinese women, and conduct risk classification to detect and prevent sPTB early and reduce the incidence rate of preterm birth, occurrence of complications, and damage to mothers and infants.

Materials and methods

Study participants

This study was approved by the Ethics Committee of the First Affiliated Hospital of Shandong First Medical University (Qianfoshan Hospital, Shandong Province). All research subjects were recruited from this hospital.

The inclusion criteria were as follows: (1) singleton pregnancy (irrespective of the presence of premature birth symptoms); (2) age at pregnancy of 20–50 years old; (3) mid-term pregnancy (16–26 w); and (4) patients that had been undergoing prenatal checkups in our hospital since the mid-pregnancy period (16–26 w) and were continuously examined until delivery.

The exclusion criteria were as follows: (1) Pregnant women with premature delivery for iatrogenic reasons (e.g., placental abruption, infection, pregnancy diabetes, pregnancy hypertension, and immune system diseases); (2) after emergency cerclage surgery; and (3) interrupted pregnancy due to abnormalities in fetal structure or other factors.

Model establishment: In the early stages of this study, the basic information and medical histories of pregnant women with premature birth symptoms (e.g., vaginal bleeding and short cervical canal) were statistically analyzed to preliminarily identify factors that can predict premature birth. After reading the relevant literature, 13 factors that could predict the risk of premature birth were identified. Statistical analysis of the data collected in the early stages revealed that the premature birth rate of pregnant women with premature birth symptoms (e.g., vaginal bleeding and short cervical canals) was 22.7%. Allowing for a dropout rate of approximately 10–20%, the required sample size was approximately 685. Therefore, a retrospective study was conducted on 685 mid-pregnancy (16–26 w) pregnant women who underwent prenatal examinations and gave birth in our hospital between January 1, 2021 and January 1, 2023. Model validation: Based on a modeling cohort to validation cohort ratio of 8:2, 119 mid-pregnancy (16–26 w) pregnant women who underwent prenatal examinations and gave birth in our hospital from May 10, 2023 to January 31, 2024 were prospectively enrolled. All of the pregnant women provided written informed consent.

Data collection methods and instruments

Data collection methods

The hospital information system (HIS) and a questionnaire were used to collect the basic information and medical history of pregnant women, including age, body mass index (BMI), fertility history, history of miscarriage, history of miscarriage at ≥ 12 weeks of gestation, history of premature birth, history of cervical conization, whether cervical cerclage had been performed, previous history of cervical cerclage, symptoms, and medication during pregnancy.

We used ultrasound to examine the cervical condition of pregnant women and recorded the results using an ultrasound workstation (Medicon system), including: (1) cervical length (CL) (CL ≤ 2.5 cm indicates a short cervical canal); (2) cervical opening; and (3) persistent cervical opening.

Instruments

Examinations were performed with a GE Voluson E10 color Doppler ultrasound diagnostic instrument using a convex array probe (C1-5-D) and an intracavity probe (IC5-9-D), with frequencies of 2.0–8.0 MHz and 5.0–8.0 MHz, respectively. Convex array probes were preferred for transabdominal and perineal ultrasounds due to the difficulty of transvaginal ultrasound examination in some pregnant women. If the cervix was not clearly visible due to intestinal obstruction or other reasons, this was explained to the pregnant woman, and a transvaginal ultrasound was performed. If the pregnant women experienced premature birth symptoms or other conditions, they provided informed consent and underwent a transvaginal ultrasound examination. Close monitoring was conducted within 16–26 weeks, and clinical intervention was performed if necessary.

Prediction model establishment and verification process

Model establishment: We conducted a statistical analysis of the personal information and collected data of the pregnant women. Subsequently, we screened out significant risk factors for sPTB, entered them into RStudio, drew a nomogram, established a prediction model for sPTB, and assigned scores to the relevant parameters to obtain a scoring system for sPTB risk. We classified the risk of premature birth based on the quartile of the total score for each pregnant woman.

Model validation: After the model was established, a further group of pregnant women in mid-pregnancy (16–26 w) who met the inclusion criteria was selected for external validation of the model. The accuracy of the model was assessed by calculating the area under the curve (AUC) for the modeling and validation cohorts. Finally, the consistency and clinical benefits of the model were verified using calibration curves, decision curve analysis (DCA), and other methods.

Statistical methods

SPSS 26.0 statistical software was used for analysis. Continuous variables following a normal distribution are represented by the mean ± standard deviation and were compared between groups using the independent-samples t-test; continuous variables that did not follow a normal distribution are represented by M (P25, P75) and were compared between groups using the Mann–Whitney U non-parametric rank sum test; categorical variables are represented by frequency and percentage, and the chi-square test or Fisher’s exact test was used to compare the differences between groups. Logistic regression analysis was used to analyze the risk factors associated with premature birth. ROC curves were drawn from the logistic regression results, and the predictive performance of the model was evaluated using indicators such as AUC, sensitivity, and specificity. A nomogram was drawn using RStudio based on the results of the logistic regression model. Calibration curves and Hosmer–Lemeshow tests were used to evaluate the validity of the nomogram. P-values < 0.05 (two-sided) were considered to indicate a statistically significant difference.

Results

Establishment of predictive models

Determination of predictive factors

The predictive factors related to sPTB were screened using univariate analysis (Table 1). Subsequently, multivariate logistic regression analysis of the significant factors in Table 1 determined the final predictive factors included in the model: number of abortions, history of miscarriage at ≥ 12 weeks of gestation, history of preterm birth, maternal CL, cervical opening, and persistent cervical opening (all P < 0.05), as shown in Table 2.

Table 1 Preliminary screening of risk factors for spontaneous premature birth
Table 2 Multivariate analysis of risk factors for sPTB in pregnant women

Multivariate logistic regression analysis to calculate the optimal cut-off value

The six significant risk factors identified above were included as independent variables, and ROC curves were plotted with sPTB occurrence as the dependent variable. The optimal cut-off value for the number of abortions was 2.0, with a sensitivity of 79.3% and specificity of 85.7%. The Youden index reached its maximum value of 0.160, corresponding to an AUC value of 0.606. The optimal cut-off value for the number of miscarriages at ≥ 12 weeks of gestation was 1.0, with a sensitivity of 32.1% and a specificity of 97.7%. The Youden index reached its maximum value of 0.299, corresponding to an AUC value of 0.650. The optimal cut-off value for CL was 2.50 cm, with a sensitivity of 50.0% and specificity of 89.2%. The Youden index reached a maximum value of 0.428, corresponding to an AUC value of 0.747. We also plotted ROC curves for research subjects with a CL ≤ 2.5 cm, and obtained a cut-off value of 2.0 cm for their CL. See Fig. 1.

Fig. 1 Binary logistic regression analysis forest map

When the optimal cut-off value for the joint prediction was reached, the sensitivity was 70.7%, the specificity was 88.1%, the Youden index was 0.588, and the corresponding AUC value was 0.856, as shown in Fig. 2.

Fig. 2 ROC curves for various risk factors and joint predictions
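The cut-off selection described above relies on the Youden index, J = sensitivity + specificity − 1, maximized over candidate cut-offs. The following is a minimal Python sketch of that arithmetic, not the authors' SPSS or RStudio workflow. The joint-prediction sensitivity and specificity are taken from the text; the function names and the list of candidate cut-offs are hypothetical and serve only to illustrate the selection rule.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Youden index J = sensitivity + specificity - 1 (inputs as fractions)."""
    return sensitivity + specificity - 1.0


def best_cutoff(candidates):
    """Return the (cutoff, sensitivity, specificity) tuple with the largest J."""
    return max(candidates, key=lambda c: youden_index(c[1], c[2]))


# Joint prediction at its optimal cut-off, as reported in the text
# (sensitivity 70.7%, specificity 88.1%).
joint_j = youden_index(0.707, 0.881)
print(f"Joint prediction Youden index: {joint_j:.3f}")

# Hypothetical candidate cut-offs (illustration only, not study data):
# each tuple is (cut-off, sensitivity, specificity).
hypothetical_candidates = [
    (2.0, 0.30, 0.95),
    (2.5, 0.50, 0.89),
    (3.0, 0.70, 0.60),
]
chosen = best_cutoff(hypothetical_candidates)
print(f"Cut-off with the largest Youden index: {chosen[0]}")
```

In practice the candidate list would come from every threshold on the ROC curve rather than a handful of hand-picked values.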

Establishment of prediction model

The above six risk factors were entered into RStudio as predictive factors, and a nomogram was drawn to obtain a predictive model for sPTB, as shown in Fig. 3.

Fig. 3 Nomogram prediction model for spontaneous premature birth in pregnant women

Using the third percentile of the total score of the research participants in the modeling cohort as the cut-off value, the research participants were divided into three risk groups: low, medium, and high. The scoring system used for the sPTB prediction model is shown in Table 3.

Table 3 Evaluation of the effect of risk level standards for predicting spontaneous premature birth in modeling and validation cohorts

As shown in Table 3, when the total score is 0–59, it is considered low risk, with a sPTB risk < 20%; when the total score is 60–259, it is medium risk, with a sPTB risk of 20–80%; and when the total score is ≥ 260, it is high risk, and the sPTB risk is > 80%.
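The score bands above translate directly into a lookup. The sketch below is a minimal Python illustration using only the thresholds stated in the text (0–59, 60–259, ≥ 260); the function name and the returned labels are assumptions made for this example, and the code is not part of the authors' published tooling.

```python
def classify_sptb_risk(total_score: float) -> str:
    """Map a nomogram total score to the risk band described in the text.

    0-59   -> low    (sPTB risk < 20%)
    60-259 -> medium (sPTB risk 20-80%)
    >= 260 -> high   (sPTB risk > 80%)
    """
    if total_score < 0:
        raise ValueError("total score cannot be negative")
    if total_score < 60:
        return "low"
    if total_score < 260:
        return "medium"
    return "high"


for score in (0, 59, 60, 259, 260):
    print(score, classify_sptb_risk(score))
```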

External validation of predictive models

The Hosmer–Lemeshow test indicated a good fit of the model (p = 0.997). The calibration curve of the modeling cohort was close to the diagonal (C-index = 0.856), as was that of the validation cohort (C-index = 0.854), indicating good consistency, as shown in Fig. 4. The AUCs of the modeling and validation cohorts were 0.850 and 0.881, respectively, as shown in Fig. 5, indicating good accuracy.

Fig. 4 Calibration curves of the modeling and validation cohorts. Note: Panel (a) shows the calibration curve of the modeling cohort, and panel (b) shows the calibration curve of the validation cohort

Fig. 5 ROC curves of the modeling and validation cohorts. Note: Panel (a) shows the area under the curve (AUC) for the modeling cohort, while panel (b) shows the AUC for the validation cohort
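As a reminder of what the AUC values measure, the AUC equals the probability that a randomly chosen woman with sPTB receives a higher score than a randomly chosen woman without sPTB, with ties counted as one half. The Python sketch below computes this by direct pairwise comparison. It is an illustration only: the scores and outcome labels are hypothetical, and the study itself used SPSS and RStudio rather than this code.

```python
def auc_from_scores(scores, labels):
    """AUC as the fraction of (case, non-case) pairs ranked correctly.

    labels: 1 for sPTB, 0 for no sPTB. Ties count as half a correct pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    correct = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                correct += 1.0
            elif case_score == control_score:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical total scores and outcomes (illustration only, not study data).
example_scores = [308, 62, 64, 0, 182, 100]
example_labels = [1, 1, 0, 0, 1, 0]
example_auc = auc_from_scores(example_scores, example_labels)
print(f"AUC on the hypothetical data: {example_auc:.3f}")
```

The pairwise form is quadratic in sample size, which is fine for cohorts of a few hundred women; rank-based formulas give the same value more efficiently.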

The clinical effectiveness of the model, assessed using decision curve analysis (DCA), is shown in Fig. 6. The net benefit was positive in both the modeling and validation cohorts.

Fig. 6 Decision curve analysis (DCA) for the modeling and validation cohorts. Note: Panel (a) shows the DCA for the modeling cohort, while panel (b) shows the DCA for the validation cohort. The green line represents the net benefit when no patient is considered to have sPTB; the red diagonal represents the net benefit when all patients are considered to have sPTB. The farther the model curve is from the green and red lines, the greater the benefit the model brings to patients when used to predict sPTB
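For readers unfamiliar with decision curves, each point on a DCA curve is a net benefit at a given threshold probability. The sketch below uses the standard definition, net benefit = TP/n − (FP/n) × pt/(1 − pt), which is not stated in the source and is supplied here as background. All counts are hypothetical, and the function name is an assumption made for this example.

```python
def net_benefit(true_pos: int, false_pos: int, n: int, threshold: float) -> float:
    """Standard DCA net benefit at threshold probability `threshold`.

    net benefit = TP/n - (FP/n) * threshold / (1 - threshold)
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    odds = threshold / (1.0 - threshold)
    return true_pos / n - (false_pos / n) * odds


# Hypothetical cohort (illustration only): 100 women, 20 with sPTB.
n_women, n_events, pt = 100, 20, 0.2

# Hypothetical model: flags 30 women at this threshold, 15 of them correctly.
model_nb = net_benefit(true_pos=15, false_pos=15, n=n_women, threshold=pt)

# "Treat all" reference: every woman is flagged.
treat_all_nb = net_benefit(n_events, n_women - n_events, n_women, pt)

# "Treat none" reference: nobody is flagged, so the net benefit is zero.
treat_none_nb = 0.0

print(model_nb, treat_all_nb, treat_none_nb)
```

A model is clinically useful at a given threshold when its net benefit exceeds both reference strategies, which is what the distance from the green and red lines in the figure expresses.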

Discussion

According to the 2022 ISUOG guidelines [6], the CL in pregnant women is generally stable between 14 and 28 weeks of pregnancy and gradually shortens thereafter. The guidelines clearly state that it is best to screen asymptomatic women by measuring CL between 18 and 26 weeks of pregnancy. If the CL is measured before this gestational age, its value is often overestimated because of the difficulty in identifying the cervical opening. In addition, as 24–26 weeks is a common deadline for preventive measures such as progesterone administration and cerclage, and the starting point for intervention treatment, 26 w is considered the upper limit for screening for sPTB. Therefore, we established an sPTB prediction model for women in mid-pregnancy.

The predictive value of sPTB risk factors varies according to race, lifestyle, and adherence to medical advice. The increasing education level and emphasis on offspring among Chinese women has led to higher demands for a good personal quality of life before and during pregnancy. Therefore, risk factors such as smoking and excessive BMI studied by foreign scholars cannot be used as predictive factors for sPTB in Chinese women. In this study, we selected six predictive factors with high sensitivity and specificity for Chinese women through multivariate logistic regression analysis, namely the number of abortions, history of miscarriage at ≥ 12 weeks of gestation, history of premature birth, maternal CL, cervical opening, and persistent cervical opening. These factors are easily obtainable and widely applicable in clinical practice, which should make the predictive model easier to adopt.

A nomogram is a graphical calculation tool that presents the results of the logistic regression analysis graphically, assigns specific scores to each predictive factor, and calculates the corresponding risk probability based on the total score [7]. When reviewing the literature, we found that nomograms are mostly used to predict the prognosis of tumors and are rarely used to predict the risk of spontaneous preterm birth in pregnant women [8]. We drew a nomogram based on the results of the multivariate logistic regression analysis, which resulted in a more intuitive and personalized prediction model. At the same time, as shown in the model diagram established in this study, each predictive factor corresponds to different cut-off values. For example, if a pregnant woman has risk factors including a history of miscarriage at ≥ 12 weeks of gestation, a CL ≤ 2.0 cm in this pregnancy, and persistent cervical opening, and the corresponding scores are 64 points, 100 points, 82 points, and 62 points, the total score is 308 points. For each pregnant woman, the higher the total score, the higher the risk of sPTB; this will help doctors to prioritize pregnant women with a high risk of premature birth and intervene in a timely manner. In addition, this prediction model divided pregnant women into low-, medium-, and high-risk groups based on their scores, corresponding to different risks of premature birth. This will help to provide more accurate and targeted interventions in clinical practice. For high-risk pregnant women, intervention measures, such as inhibiting uterine contractions, bed rest, reducing abdominal pressure, and cervical cerclage, can be taken to reduce the occurrence of sPTB and avoid overtreatment.
Because a higher proportion of pregnant women included in this study had a history of miscarriage, premature birth, and premature birth symptoms in early mid-pregnancy, low-risk pregnant women still had a predicted sPTB risk of up to 20%, which likely overestimates the risk of sPTB in pregnant women. However, external validation confirmed the accuracy and effectiveness of the model, suggesting that it can be applied in clinical practice.
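The worked example in the preceding paragraph can be checked with a few lines of Python. The four point values and the high-risk threshold of 260 come from the text; the variable names are assumptions, and the individual points are left unlabeled because the text does not state which score belongs to which factor.

```python
# Point values quoted in the worked example above.
factor_points = [64, 100, 82, 62]

# Nomogram total score is the sum of the per-factor points.
total_score = sum(factor_points)

# Lower bound of the high-risk band stated in the Results (total score >= 260).
HIGH_RISK_MIN_SCORE = 260
is_high_risk = total_score >= HIGH_RISK_MIN_SCORE

print(f"Total score: {total_score}, high risk: {is_high_risk}")
```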

Many researchers have established predictive models for sPTB to evaluate the risk of premature birth in pregnant women. Odibo et al. [9] established a predictive model for preterm birth in pregnant women after cervical cerclage; this incorporated three risk factors, namely cervical length, history of cervical conization, and history of cervical cerclage, and had an AUC of 0.907. Kuhrt et al. [10] established a predictive model for sPTB in high-risk pregnant women; this incorporated fFN, CL, and premature birth history as risk factors, and had an AUC of 0.77–0.99. Some predictive models also incorporate the clinical information and medical history of pregnant women as predictive factors, with an area under the ROC curve of 0.724 [11]. The predictive performance of each model was good; however, research by foreign scholars has rarely included Asian populations. Owing to the differences in the factors influencing sPTB among pregnant women of different races, the application of foreign sPTB prediction models in China is limited. Recently, scholars in China have conducted research on the prediction of premature births. Indeed, Yang et al. [12] used age, marital status, educational level, abnormal fetal position, multiple pregnancies, conception methods, cervical canal length, reproductive tract infections, and hypertension to establish a predictive model. The AUC was 0.875 and the predictive performance was good; however, its clinical uptake was limited, largely because the predictive factors selected for inclusion were highly targeted, the sample size was small, and the model required complex operations or calculations. The AUC value of the prediction model constructed in this study was 0.856 (95% CI: 0.807–0.893), and the AUC value of the validation cohort was 0.881 (95% CI: 0.848–0.915), indicating good prediction accuracy. The calibration capability of the model was evaluated using a calibration curve.
The calibration curves of the modeling and validation cohorts fitted the diagonal well, with C-indices of 0.856 and 0.854, respectively. This indicates that the calibration ability of the nomogram is good and that the model is effective in predicting sPTB. This model is aimed at Chinese women, and lays the foundation for future research on Asian populations. We also applied the DCA curve to evaluate the clinical benefits of the model after clinical application. The results showed that, for suspected cases, the net benefits to patients were positive. This model has good clinical practicality for predicting sPTB.

Limitations of this study: This was a single-center study with samples sourced from the same hospital, resulting in a low positivity rate. Further research with multiple centers and larger sample sizes is required. In addition, because the modeling cohort included a large proportion of pregnant women with a history of premature birth, a history of miscarriage at ≥ 12 weeks of gestation, or a risk of premature birth in the current pregnancy, the premature birth rate (20.6%) was higher than that at the national level. Although external validation has shown that the model is effective and accurate, future multicenter studies are required to improve the accuracy of its predictions.

In summary, the predictive model established in this study is consistent with China’s national conditions, as well as being intuitive and easy to operate, with wide applicability. Thus, this predictive model may be helpful for early detection of sPTB, as well as in clinical management to assess the low, medium, and high risks of sPTB.

Data availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Abbreviations

sPTB: Spontaneous Premature Birth
CL: Cervical Length
BMI: Body Mass Index
AUC: Area Under Curve
DCA: Decision Curve Analysis

References

  1. Xuming B. Dong Yue. Recommended guidelines for clinical diagnosis and treatment of premature birth (draft) [J]. Chin J Obstet Gynecol. 2007;42(07):498–500.

  2. Prediction and Prevention of Spontaneous Preterm Birth. ACOG Practice Bulletin, Number 234. Obstet Gynecol. 2021;138(2):e65–90.

  3. Lizhou S. Screening and evaluation of high-risk factors for spontaneous premature birth [J]. J Practical Obstet Gynecol. 2012;28(10):803–5.

  4. Souza RT, Cecatti JG. A Comprehensive Integrative Review of the Factors Associated with spontaneous Preterm Birth, its Prevention and Prediction, including metabolomic markers. Rev Bras Ginecol Obstet. 2020;42(1):51–60.

  5. Cobo T, Kacerovsky M, Jacobsson B. Risk factors for spontaneous preterm delivery. Int J Gynaecol Obstet. 2020;150(1):17–23.

  6. Coutinho CM, Sotiriadis A, Odibo A, et al. ISUOG Practice guidelines: role of ultrasound in the prediction of spontaneous preterm birth. Ultrasound Obstet Gynecol. 2022;60(3):435–56.

  7. Liu H, Li J, Guo J, et al. A prediction nomogram for neonatal acute respiratory distress syndrome in late-preterm infants and full-term infants: a retrospective study. EClinicalMedicine. 2022;50:101523.

  8. Hoshino N, Hida K, Sakai Y, et al. Nomogram for predicting anastomotic leakage after low anterior resection for rectal cancer. Int J Colorectal Dis. 2018;33(4):411–8.

  9. Odibo AO, Farrell C, Macones GA, et al. Development of a scoring system for predicting the risk of preterm birth in women receiving cervical cerclage. J Perinatol. 2003;23(8):664–7.

  10. Kuhrt K, Smout E, Hezelgrave N, et al. Development and validation of a tool incorporating cervical length and quantitative fetal fibronectin to predict spontaneous preterm birth in asymptomatic high-risk women. Ultrasound Obstet Gynecol. 2016;47(1):104–9.

  11. De Silva DA, Lisonkova S, von Dadelszen P, et al. Timing of delivery in a high-risk obstetric population: a clinical prediction model. BMC Pregnancy Childbirth. 2017;17(1):202.

  12. Bai Y. Analysis of risk factors for spontaneous premature birth and construction of a column chart [D]. Shihezi University; 2022.

Acknowledgements

We would like to thank Editage (www.editage.cn) for English language editing.

Funding

Not applicable.

Author information

Contributions

LZM and LW designed the experimental research method and wrote this article. LZM and HJY collected data, ZHW and LH organized and managed the data, and LZM conducted statistical analysis on the data. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Liu Wei.

Ethics declarations

Ethics approval and consent to participate

This study was approved by the Ethics Committee of the First Affiliated Hospital of Shandong First Medical University (Qianfoshan Hospital in Shandong Province). Ethical approval number: YXLL-KY-2023 (058).

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

About this article

Cite this article

Zimeng, L., Jingyuan, H., Naiwen, Z. et al. Establishment and validation of a predictive model for spontaneous preterm birth in singleton pregnant women. BMC Pregnancy Childbirth 24, 595 (2024). https://doi.org/10.1186/s12884-024-06772-w
