Meta-analysis of studies on biochemical marker tests for the diagnosis of premature rupture of membranes: comparison of performance indexes

Background Premature rupture of the membranes (PROM) is most commonly diagnosed using physical examination; however, accurate decision making in ambiguous cases is a major challenge in current obstetric practice. As this may influence a woman’s subsequent management, a number of tests designed to assist with confirming a diagnosis of PROM are commercially available. This study sought to evaluate the published data for the accuracy of two amniotic fluid-specific biomarker tests for PROM: insulin-like growth factor binding protein-1 (IGFBP-1 – Actim® PROM) and placental alpha microglobulin-1 (PAMG-1 – AmniSure®).
Methods Main analysis included all PubMed referenced studies related to Actim® PROM and AmniSure® with available data to extract performance rates. To compare accuracy, a comparison of pooled indexes of both rapid tests was performed. Studies in which both tests were used in the same clinical population were also analysed. Membrane status, whether it was known or a suspected rupture, and inclusion or not of women with bleeding, were considered.
Results All the available studies published in PubMed up to April 2013 were reviewed. Data were retrieved from 17 studies; 10 for Actim® PROM (n = 1066), four for AmniSure® (n = 1081) and three studies in which both biomarker tests were compared directly. The pooled analysis found that the specificity and positive predictive value were significantly higher for AmniSure® compared with Actim® PROM. However, when 762 and 1385 women with known or suspected rupture of membranes, respectively, were evaluated, AmniSure® only remained significantly superior in the latter group. Furthermore, when the two tests were compared directly in the same study no statistically significant differences were observed. Remarkably, women with a history or evidence of bleeding were excluded in all four studies for AmniSure®, in two Actim® PROM studies and in two of the three studies reporting on both tests.
Conclusions No differences were observed in the performance of the two tests in studies where they were used under the same clinical conditions or in women with known membrane status. Although AmniSure® performed better in suspected cases of PROM, this may need further analysis as exclusion of bleeding may not be representative of the real clinical presentation of women with suspected PROM.


Background
Disruption of foetal membranes prior to the onset of labour, commonly known as premature rupture of membranes (PROM), is a frequent complication of pregnancy [1,2]. PROM occurs in 8-10% of all pregnancies [3] and pre-term PROM (PROM <37 weeks' gestation) is associated with approximately a third of all premature births [1,2].
Often considered as an inert gestational sac, foetal membranes have a stratified structure with special biochemical characteristics that provide them with the ability to adapt to the expansion that occurs during pregnancy, resulting from increasing foetal size and amniotic fluid. Foetal membranes are composed of two layers, the amnion which faces the amniotic cavity and the chorion which faces the decidua [4]. Membrane integrity is essential to ensure normal term pregnancy. Evidence suggests that the mechanisms involved in the rupture of membranes include biochemical, immunologic and bacteriologic events. Currently, it is widely accepted that term or preterm rupture is associated with structural changes, caused by inflammatory processes induced by endocrine or infectious triggers [5,6].
The main complications and consequences of PROM are related to the gestational age at which it occurs, the latency until birth and concomitant infection of the gestational tissues, which may impact both foetal and maternal outcomes, in addition to conditions specific to the foetus, such as oligohydramnios, cord compression, abruptio or cord prolapse [2]. Accurate diagnosis of PROM, coupled with appropriate obstetric interventions according to gestational age, is of key importance to limit the potential risk posed by these adverse maternal and foetal outcomes.
Without clear evidence of amniotic fluid loss observed by speculum examination, the diagnosis of PROM can be uncertain and complementary diagnostic tests are frequently needed. Diagnostic confirmation in ambiguous cases is a major challenge in current obstetric practice, because a correct diagnosis is necessary to decide upon the most appropriate management and ultimately to reduce both maternal and foetal complications. The optimal test should be specific for amniotic fluid and should not be affected by contamination from other bodily fluids or vaginal medications. Multiple tests with varying performance are available to assess the integrity of foetal membranes [7,8], including cytological, biochemical, colorimetric and ultrasound techniques. Limitations in the accuracy of tests, e.g. poor specificity (i.e. a high proportion of false positives), may lead to unnecessary interventions such as hospitalisation, antibiotic therapy, application of corticosteroids [9,10] and even induction of labour [3,9,10]. In contrast, poor sensitivity (i.e. a high proportion of false-negative results) may be falsely reassuring and delay or deprive women of appropriate treatments [2], increasing the risk of potential maternal and foetal morbidity and mortality. Traditional bedside and non-invasive tests, such as the fern and nitrazine tests, have a high rate of false-negative and false-positive results in cases where women have vaginal infections or where semen, blood or topical antiseptics are present [1,3].
New non-invasive tests have been developed in the last 15-20 years, with a simple dipstick test format, based on the detection of specific proteins found in amniotic fluid and which combine high sensitivity rates with low false-positive results. There are a number of rapid immunoassay tests commercially available, of which the most commonly used are Actim® PROM (Medix Biochemica, Kauniainen, Finland), designed to detect insulin-like growth factor-binding protein-1 (IGFBP-1), and AmniSure® (Qiagen, Hilden, Germany), which detects the presence of placental alpha microglobulin-1 (PAMG-1).
IGFBP-1 is an excreted protein synthesised in the decidual cells and foetal liver and detected in amniotic fluid throughout pregnancy [11][12][13][14]. Although serum concentration of IGFBP-1 increases with gestational age [12], it is found at considerably lower concentrations in maternal serum compared to amniotic fluid. This concentration difference is also described for PAMG-1 [14], although reported concentration data vary between publications [15]. Biomarker and rapid test characteristics are shown in Table 1. Samples for both tests are collected with a sterile polyester swab before vaginal examination and/or vaginal ultrasound. The sample is collected from vaginal fluid and extracted by placing the swab in a buffer containing a solvent, with the lower end of the strip submerged.
The aim of this study was to compare the available information on two of the most commonly used commercially available rapid tests for the diagnosis of PROM. This study sought not only to critically evaluate the published evidence on the use of IGFBP-1 (Actim® PROM) and PAMG-1 (AmniSure®) tests and make a comparison of their performance indices (sensitivity, specificity, positive predictive value [PPV] and negative predictive value [NPV]) for the diagnosis of PROM, but also to identify any variants that may influence the reported performance of both tests. These variants included the diagnosis status groups (known membrane status and suspected membrane rupture) at the time of inclusion of patients in the study and the inclusion/exclusion of women with evidence of bleeding. For this meta-analysis, pooled sensitivity and specificity rates were calculated based on the results of those studies which directly compared both tests in the same clinical setting. The results of this meta-analysis are of potential value to physicians to help them in their choice of rapid test to aid in the diagnosis of PROM.

Methods
This analysis was conducted in accordance with PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines (see Additional file 1: completed PRISMA 2009 checklist).
A search of the PubMed database was conducted to identify all published studies, up to April 2013, relating to the rapid tests Actim® PROM and AmniSure®, without language restrictions and using a combination of the predefined search terms: PAMG-1 test, IGFBP-1 test, PAMG-1 PROM test, IGFBP-1 PROM test, placental alpha microglobulin-1 PROM test, insulin-like factor binding protein-1 PROM test, AmniSure® and Actim® PROM.
All abstracts, full texts and citations were reviewed to select the papers in which: a) the rapid tests were used as a tool to diagnose or complement diagnosis of a rupture of membranes in a clinical setting, b) confirmation of the final membrane status through a reference method was available in the paper and c) the results of the test performance, expressed as sensitivity and specificity or as the raw number of positive and negative test results, were available. All articles which were not consistent with these criteria were excluded from this analysis.
Data extracted from each study included: year of publication, inclusion and exclusion criteria of the study (e.g. active bleeding), gestational age at test performance (range), number of women excluded and rationale for exclusion, reference method used to confirm PROM, condition of women at the beginning of the study (total women with suspected and confirmed or non-confirmed PROM), rapid test results and diagnosis (ruptured membranes or intact membranes) at final evaluation. When values of true positive (TP), false negative (FN), true negative (TN) and false positive (FP), were not explicitly reported, these were estimated based on sensitivity and specificity values and confidence intervals reported in the original publications.
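As an illustration of the extraction step described above, the following Python sketch shows how the four performance indices follow from a 2 × 2 table, and how counts might be back-calculated when a publication reports only sensitivity and specificity. The names `Counts`, `indices` and `estimate_counts`, the rounding rule and the example figures are assumptions made for illustration; they are not taken from any of the included studies.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # test positive, membranes ruptured per reference method
    fn: int  # test negative, membranes ruptured
    tn: int  # test negative, membranes intact
    fp: int  # test positive, membranes intact


def indices(c: Counts) -> dict:
    """Sensitivity, specificity, PPV and NPV from a 2 x 2 table."""
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


def estimate_counts(sens: float, spec: float,
                    n_ruptured: int, n_intact: int) -> Counts:
    """Back-calculate counts from reported rates.

    Assumed rule: round to the nearest whole woman; the source does not
    state how its estimates were rounded.
    """
    tp = round(sens * n_ruptured)
    tn = round(spec * n_intact)
    return Counts(tp=tp, fn=n_ruptured - tp, tn=tn, fp=n_intact - tn)


# Hypothetical study: 100 women with ruptured and 100 with intact membranes.
example = estimate_counts(sens=0.95, spec=0.90, n_ruptured=100, n_intact=100)
print(example)
print(indices(example))
```

Back-calculated counts of this kind are only as precise as the rounding of the published rates, which is one reason the reported confidence intervals are useful as a cross-check.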
To provide an estimation of the predictive performance of the tests, the sensitivity, specificity, PPV and NPV results for each study were calculated according to the Newcombe efficient-score method (corrected for continuity) [16], taking into account only the number of women with confirmed diagnosis of rupture according to the reference method in each study (per protocol, cases of suspected PROM without later confirmation of the diagnosis were not included in the final analysis). To further explore the results of this pooled analysis, a post-hoc comparison along with 95% Confidence Intervals (CIs) was also performed using the chi-square test, between each test result for subgroups whose membrane status was known and those who had a suspected membrane rupture, in order to explore reasons for potential differences. In this comparison, known membrane status refers to those women for whom membrane integrity status was clearly defined, i.e. women without any symptoms or suspicion of PROM and women who had an artificial rupture. Suspected membrane rupture refers to women whose membrane status was not known upon study entry and who were being evaluated for a suspected rupture. All probability values were 2-tailed and were corrected for multiple testing, and p ≤ 0.05 was considered statistically significant. All the statistical analyses were performed using Excel 2007 and SPSS 19.0 for Windows.
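For readers who wish to reproduce interval estimates of this kind, the sketch below implements the continuity-corrected score interval for a single proportion in the form commonly attributed to Newcombe. It is a minimal illustration using only the Python standard library, not the Excel or SPSS procedure used in this analysis; the function name `wilson_cc` and the example counts are assumptions.

```python
from math import sqrt
from statistics import NormalDist


def wilson_cc(successes: int, n: int, level: float = 0.95) -> tuple:
    """Continuity-corrected score interval for the proportion successes / n.

    Intended for conventional confidence levels such as 0.95.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # two-sided critical value
    p = successes / n
    q = 1 - p
    denom = 2 * (n + z * z)
    if successes == 0:
        lower = 0.0  # by convention the lower limit is 0 when p = 0
    else:
        lower = (2 * n * p + z * z - 1
                 - z * sqrt(z * z - 2 - 1 / n + 4 * p * (n * q + 1))) / denom
    if successes == n:
        upper = 1.0  # by convention the upper limit is 1 when p = 1
    else:
        upper = (2 * n * p + z * z + 1
                 + z * sqrt(z * z + 2 - 1 / n + 4 * p * (n * q - 1))) / denom
    return max(0.0, lower), min(1.0, upper)


# Hypothetical sensitivity: 95 true positives among 100 confirmed ruptures.
low, high = wilson_cc(95, 100)
print(f"sensitivity 0.95, 95% CI {low:.3f} to {high:.3f}")
```

The same function applies to specificity, PPV and NPV by passing the appropriate numerator and denominator from the 2 × 2 table.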
The results are presented for each test considering the pooled data and then stratified according to whether the membrane status was known (intact or ruptured) or PROM was clinically suspected.
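The stratified comparison can be sketched as a chi-square test on a 2 × 2 table of pooled counts, run once per stratum. The code below is a hedged illustration only: the counts are hypothetical, the test is the uncorrected Pearson chi-square with one degree of freedom, and the Bonferroni adjustment is an assumption, since the specific multiple-testing correction is not named here.

```python
from math import erfc, sqrt


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple:
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1, col2 = a + b, c + d, a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
    return stat, p_value


def bonferroni(p_values: list) -> list:
    """Assumed correction: multiply each p-value by the number of tests."""
    return [min(1.0, p * len(p_values)) for p in p_values]


# Hypothetical specificity counts (TN, FP) for test A and test B per stratum.
strata = {
    "known status": ((190, 10), (188, 12)),
    "suspected rupture": ((280, 20), (250, 50)),
}
raw = {name: chi_square_2x2(*t1, *t2)[1] for name, (t1, t2) in strata.items()}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
for name, p in adjusted.items():
    print(f"{name}: adjusted p = {p:.4f}, significant = {p <= 0.05}")
```

Running the test separately in each stratum makes it possible for a difference seen in the pooled data to persist in one subgroup and disappear in the other.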

Results
From an initial 125 identified manuscripts, all the retrieved titles and abstracts were screened to discard repeated articles, leading to a total of 52 evaluable papers: 31 papers relating to Actim® PROM, 11 papers relating to AmniSure®, and 10 referring to both biomarkers. After a detailed process of selection (Figure 1), 35 papers were excluded because they did not evaluate the specific biomarkers as a rapid test for PROM diagnosis [14,17-19], they studied the concentration of the biomarkers through pregnancy [11,20-23] or after an amniocentesis [24], they were solely studies of biochemical processes [25,26], they were adjunct to a genetic study [27], they comprised guidelines [9], they were review articles [8,13,28,29], they were a meta-analysis [30] or letters/comments on other articles [15,31-33], or they related to the application of the biomarker in obstetrics [18,23]. In two cases, the full text versions of the studies were not available for consultation [34,35]. The remaining excluded publications did not evaluate the commercially available test in a daily clinical setting: they were in vitro studies [36-38], presented test results mixed with other test modalities [39], evaluated physicians' confidence in their suspicion of PROM after the test [40], presented incomplete data on sensitivity and specificity [41] or used an inadequate reference method to confirm PROM diagnosis [42,44]. One study on AmniSure® [45] had been retracted from publication due to inaccurate results, and thus it was also excluded from the analysis. Reasons for exclusion and a detailed flow chart of the selection process are presented in Figure 1.
Following the differences observed between the two tests in the pooled analysis, a post-hoc analysis of subgroups was undertaken to explore the potential reasons for these differences. Women included in the identified published studies were a mixed population and most studies included two types of patients. 1) Women with a confirmed membrane rupture or intact membranes; in these studies the women were used to evaluate the tests as true positives or true negatives, to show the efficacy of the tests in women with known membrane status. 2) Women who were suspected of having a membrane rupture; these represent the women who are relevant to the clinical utility of these tests, and studies on these women evaluated the efficacy of the tests in the clinical setting. The overall population was stratified into women with known (Table 3) or suspected rupture of membranes (Table 4), where 762 and 1385 women, respectively, were evaluated. In this case, specificity and PPV only remained significantly higher for AmniSure® in the population where rupture of membranes was suspected. There were no differences between the two tests when they were compared in the group of women with known membrane status. A comparison of the performance indices in both populations is shown in Figure 2. Furthermore, in three studies, the two tests were compared directly in the same population. In these studies there was no statistically significant difference in any of the performance metrics of Actim® PROM compared with AmniSure® (Table 5).

Discussion
Based on clinical evaluation, PROM can be equivocal in 10 to 20% of women consulting due to suspected loss of vaginal fluid [2,33]. Improved diagnostic methods, using biochemical markers specific for amniotic fluid, have been developed and extensively studied in the last few decades. These biomarkers are found at higher concentrations in amniotic fluid compared with vaginal fluid and thus provide a strong predictive value for the diagnosis of PROM. Multiple studies have shown the superiority of the new generation of tests, which have improved ease of sample processing and accuracy, compared with 'classic' tests [7,47,53].
The main finding of this analysis was that the two tests evaluated (Actim® PROM and AmniSure®) performed equally when they were compared directly under the same clinical conditions and where women with known membrane status were tested. Considering the estimated pooled data, AmniSure® showed a higher specificity and PPV than Actim® PROM. As a result of these differences, the post-hoc analysis of subgroups was performed to evaluate women with known membrane status separately from those with suspected rupture of membranes, finding that a higher specificity and PPV of AmniSure® was only observed in samples from cases of suspected rupture of membranes (Figure 2). These observed differences between the two tests could possibly be linked to the consideration of active bleeding or a past history of bleeding in test evaluations. Six of the seven AmniSure® studies [55-59,61] explicitly excluded women when there was evidence of active bleeding, or even a history of bleeding. Considering the importance of this exclusion, we found that the eight studies in which women with bleeding were excluded (four for AmniSure®, two for Actim® PROM and two for both tests) comprise more than 90% of the available data relating to women tested using AmniSure®, but only approximately 20% of data relating to women tested with Actim® PROM. This exclusion is most likely due to the reported interference of blood with the test performance of AmniSure® (according to the manufacturer's recommendations), leading to false-positive results. This is unlike Actim® PROM, which is understood to be efficient in almost all cases, including women with some bleeding.
This is because a) the cut-off detection limit for IGFBP-1 in Actim® PROM is >25 μg/L in the extracted sample, which corresponds to a concentration of >400 μg/L in the sample taken from the woman and is well above the level found in maternal blood (29-300 μg/L) [12] (Table 1), and b) the antibody used in Actim® PROM has a low affinity for the highly phosphorylated form of IGFBP-1, which is predominant in blood [25]. Thus blood contamination is highly unlikely to affect the test result of Actim® PROM. Altogether, these data provide supporting evidence that blood contamination may have limited impact on Actim® PROM's performance [12,46-48]. The presence of blood, in varying degrees, is observed in up to 20% of PROM cases; it is particularly common during the pre-labour period due to cervical ripening [12,46,47] or in cases of placental implantation abnormalities (e.g. placenta praevia). The exclusion of women with bleeding can consequently provide unrepresentative performance values of a test for PROM and may impact upon test accuracy. Indeed, as the threshold of the AmniSure® test is very close to the lower limit described as a normal range in the maternal serum (Table 1), it could be hypothesised that traces of blood would have resulted in more false-positive tests, thus limiting the specificity, while this threshold is well above the levels found in maternal blood for the Actim® PROM test. Therefore, the presence of traces of blood should not impact on the test results using Actim® PROM. In a recently published meta-analysis that concluded a superiority of the AmniSure® test compared with the Actim® PROM test (Ramsauer et al. [62]), this exclusion of women with contaminating blood in their samples was not considered. Therefore, the results of that meta-analysis should be interpreted with caution.
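The threshold argument for IGFBP-1 can be checked with simple arithmetic. The sketch below uses only the three figures quoted above (25 μg/L, 400 μg/L and the 29-300 μg/L maternal blood range); the assumption that extraction behaves as a simple linear dilution, and all variable and function names, are ours for illustration.

```python
CUTOFF_EXTRACT_UG_L = 25.0           # detection limit in the extracted sample
CUTOFF_SAMPLE_UG_L = 400.0           # equivalent level in the collected sample
MATERNAL_BLOOD_UG_L = (29.0, 300.0)  # range quoted for maternal blood

# Assumption: extraction acts as a simple linear dilution, so the ratio of
# the two cut-offs gives the implied dilution factor.
DILUTION_FACTOR = CUTOFF_SAMPLE_UG_L / CUTOFF_EXTRACT_UG_L


def in_extract(concentration_ug_l: float) -> float:
    """Concentration expected in the extract for a given sample concentration."""
    return concentration_ug_l / DILUTION_FACTOR


def exceeds_cutoff(concentration_ug_l: float) -> bool:
    """True if the diluted concentration is above the detection limit."""
    return in_extract(concentration_ug_l) > CUTOFF_EXTRACT_UG_L


for level in MATERNAL_BLOOD_UG_L:
    print(level, in_extract(level), exceeds_cutoff(level))
```

Under this assumption the implied dilution is 16-fold, so even a sample consisting entirely of maternal blood at 300 μg/L would yield 18.75 μg/L in the extract, which is below the 25 μg/L cut-off.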
Another strength of the meta-analysis reported here is that it only included studies which met well-defined criteria. In the meta-analysis by Ramsauer et al. [62] comparing the two tests, the criteria for selection of the studies were in some cases conflicting with the described methodology: some of the published data available at that time were not included [58,60] and some evidence on AmniSure® results could not be verified because it was published only in abstract form and not available as full text [41].
It should also be noted that the study for AmniSure® with the largest sample size [55] was performed with a version of the test that is no longer commercially available. Although instructions for use only vary slightly from the currently available test (the diluent with the sample was applied to a slide instead of a test strip dipped directly into the diluent vial), it is not known whether this new test strip format has any influence on the efficacy of AmniSure® for the diagnosis of PROM. Of interest is the fact that the prevalence of the final diagnosis of PROM in the pooled data is approximately 50%, which depicts the true nature of the conflicting diagnosis of PROM being evaluated. This meta-analysis thus reflects the clinical situation experienced by physicians, in which women presenting with suspected PROM have a final confirmed diagnosis in approximately 50% of cases.
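Because PPV and NPV, unlike sensitivity and specificity, depend on the prevalence of the condition in the tested population, predictive values pooled at a prevalence of approximately 50% apply most directly to settings with a similar prevalence. The following minimal sketch applies Bayes' theorem to show this dependence; the sensitivity and specificity used are hypothetical and are not the pooled estimates of either test.

```python
def predictive_values(sens: float, spec: float, prevalence: float) -> tuple:
    """PPV and NPV from sensitivity, specificity and prevalence (Bayes' theorem)."""
    tp = sens * prevalence              # expected true-positive fraction
    fp = (1 - spec) * (1 - prevalence)  # expected false-positive fraction
    tn = spec * (1 - prevalence)        # expected true-negative fraction
    fn = (1 - sens) * prevalence        # expected false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)


# Hypothetical test with 95% sensitivity and 90% specificity.
for prevalence in (0.5, 0.1):
    ppv, npv = predictive_values(0.95, 0.90, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

With the same sensitivity and specificity, the PPV falls as prevalence falls, which is why the population in which a test is evaluated matters when predictive values are compared.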
Our results, however, are not exempt from limitations, mainly related to the high complexity involved in the evaluation of the performance of diagnostic tests and the possibility of missing published studies which are not available through Medline searches, in addition to the heterogeneity of design across studies. These factors were considered and led us to perform subgroup analysis, which included those papers in which the final outcome was an interpretation of the performance index presented by the authors of each publication.
Particularly when tests are evaluated in the clinical setting, when PROM is suspected, the specific characteristics of each test, the selection of the women and the reference method used to confirm the diagnosis may contribute to inconsistencies. This is due to the fact that in most of the studies available for consultation, the reference method was not clearly stated or was heterogeneous (included a composite reference method, which combined the results of several available tests [12]).
A number of statistical methods have been proposed to estimate the performance of tests in the absence of a single accepted reference standard [13,14]. The importance of the diagnostic criteria for assessment of test performance is particularly highlighted in the group of suspected cases, where the sensitivity and specificity rates vary strongly throughout the studies. These findings suggest that the women had heterogeneous clinical characteristics and were managed according to different protocols during the studies, i.e., regarding reference methods to confirm PROM. In contrast, prevalence rates as well as accuracy characteristics such as sensitivity, specificity, PPV and NPV from the analysed data of Actim® PROM and AmniSure® studies are reasonably homogeneous. Despite a higher number of published studies for Actim® PROM, the total number of women included in both rapid test studies is comparable.
Overall, this analysis shows that the accuracy of Actim® PROM and AmniSure® for the detection of PROM is comparable if they are used in the same clinical population [59-61]. Although there are significant differences in test performance in women with suspected membrane rupture, one should be cautious about concluding from this meta-analysis that either test is superior in diagnosing PROM under clinical conditions, as women with bleeding were mostly excluded when testing one of the biomarkers.

Conclusions
In this analysis, both tests appear equally useful for clinical use to aid in the diagnosis of PROM, as no differences were observed between the tests when compared side by side in the same study. The exclusion of women with bleeding from all but one of the AmniSure® studies may limit direct comparison of the studies evaluating these two biomarkers. As some degree of bleeding may be present in a significant number of women presenting with suspected PROM in the real clinical setting, further studies are necessary to consider the performance of AmniSure® in such conditions.

Competing interests
Montse Palacio has previously received honoraria from Alere for an oral presentation. Richard Berger has previously received honoraria from Alere for oral presentations and advisory board attendance. Maritta Kühnert has previously received honoraria from Alere for advisory board attendance. LM has no competing interest to declare. Cindy L. Larios is part of the Medical Department of Clever Instruments, Barcelona, Spain, which is an independent CRO.
Authors' contributions
MP participated in the interpretation of the meta-analysis data and the preparation of the manuscript. RB, MK and LM participated in the interpretation of the data and in the critical review and revision of the manuscript draft. CLL performed statistical analysis for the study and participated in the preparation of the manuscript. All authors read and approved the final manuscript.