Identifying women with gestational diabetes based on maternal characteristics: an analysis of four Norwegian prospective studies

Background: There is still no worldwide agreement on the best diagnostic thresholds to define gestational diabetes (GDM) or on the optimal approach for identifying women with GDM. Should all pregnant women undergo an oral glucose tolerance test (OGTT), or can easily available maternal characteristics, such as age, BMI and ethnicity, indicate which women to test? The aim of this study was to assess the prevalence of GDM by three diagnostic criteria and the predictive accuracy of commonly used risk factors.
Methods: We merged data from four Norwegian cohorts (2002–2013), encompassing 2981 women with complete results from a universally offered OGTT. Prevalences were estimated based on the following diagnostic criteria: 1999 WHO (fasting plasma glucose (FPG) ≥7.0 or 2-h glucose ≥7.8 mmol/L), 2013 WHO (FPG ≥5.1 or 2-h glucose ≥8.5 mmol/L), and 2017 Norwegian (FPG ≥5.3 or 2-h glucose ≥9 mmol/L). Multiple logistic regression models examined associations between GDM and maternal factors. We applied the 2013 WHO and 2017 Norwegian criteria to evaluate the performance of different thresholds of age and BMI.
Results: The prevalence of GDM was 10.7, 16.9 and 10.3%, applying the 1999 WHO, 2013 WHO, and 2017 Norwegian criteria, respectively, but was higher for women with non-European background than for European women (14.5 vs 10.2%, 37.7 vs 13.8% and 27.0 vs 7.8%). While advancing age and elevated BMI increased the risk of GDM, no risk factors, isolated or in combination, could identify more than 80% of women with GDM by the latter two diagnostic criteria, unless at least 70–80% of women were offered an OGTT. Using the 2017 Norwegian criteria, the combination "age ≥25 years or BMI ≥25 kg/m²" achieved the highest sensitivity (96.5%), with an OGTT required for 93% of European women. The predictive accuracy of risk factors for identifying GDM was even lower for non-European women.
Conclusions: The prevalence of GDM was similar using the 1999 WHO and 2017 Norwegian criteria, but substantially higher with the 2013 WHO criteria, in particular for ethnic non-European women. Using clinical risk factors such as age and BMI is a poor pre-diagnostic screening method, as this approach failed to identify a substantial proportion of women with GDM unless at least 70–80% were tested.
Supplementary Information: The online version contains supplementary material available at 10.1186/s12884-021-04086-9.

Keywords: Gestational diabetes mellitus, Pre-pregnancy BMI, Pregnancy, Screening, Diagnostic criteria

Background
Gestational diabetes mellitus (GDM) is glucose intolerance with onset or first diagnosis during pregnancy that is clearly not overt diabetes [1]. GDM is associated with higher maternal and neonatal morbidity in the short and long term and predisposes both women and their offspring to later development of type 2 diabetes [2]. Screening followed by treatment of GDM reduces the risk of several pregnancy complications [3]. However, there is no worldwide agreement on the best diagnostic thresholds to define GDM, and a wide variety of clinical guidelines have been employed [4].
In 2013, the World Health Organization (WHO) recommended glycaemic thresholds for the diagnosis of GDM based on findings from the multinational Hyperglycaemia and Adverse Pregnancy Outcome (HAPO) study, which demonstrated a linear dose-response relationship between maternal glycaemia and adverse neonatal outcomes. These criteria were set to identify women with an adjusted odds ratio (OR) of 1.75 for adverse events in their offspring relative to the mean [5]. Glucose values set to identify women with a higher risk, corresponding to an adjusted OR of 2.0, were also considered, but this proposal was rejected. Nonetheless, several countries, among them Canada and Norway, adopted the latter, noting the substantial rise in GDM prevalence with the 2013 WHO criteria without clear evidence of clinically important benefits [6]. The prior WHO criteria, established in 1999 and used in Norway until 2017, were identical to those for diagnosis of glucose intolerance in a non-pregnant population.
Controversy surrounds not only the threshold values of glycaemia, but also the optimal approach for identifying women with GDM. A high-risk approach has traditionally been recommended, based on easily available maternal characteristics such as advanced age and BMI, which are known to be associated with an increased risk of GDM [7]. However, although this approach reduces unnecessary testing in those least likely to test positive, a key issue is how well these risk factors perform as indicators for diagnostic testing and how useful they are in a clinical setting today [8]. The alternative option, universal screening, has a high detection rate but poses a large immediate burden on healthcare services as well as on pregnant women.
In this study, which merged data from four existing Norwegian pregnancy and birth cohorts, we aimed to address some of the clinical controversies related to GDM diagnosis and screening. The objectives were: 1) to establish the prevalence of GDM with three diagnostic criteria (1999 WHO, 2013 WHO, and the 2017 Norwegian criteria), 2) to identify cut-off levels for age and BMI that identify at least 80% of women with GDM, and 3) to assess the predictive accuracy of commonly used risk factors.

Methods
All population-based birth cohort studies in Norway with a special focus on gestational diabetes were eligible. For the present study, the following inclusion criteria were defined: (i) prospective studies comprising women with singleton live-born children recruited early in pregnancy (between weeks 15 and 20); (ii) data on maternal pre-pregnancy BMI; (iii) glucose measurements obtained from at least one universally offered 75 g 2-h oral glucose tolerance test (OGTT) performed at ≥20 weeks' gestation; (iv) at least one offspring measurement (birthweight). Exclusion criteria were studies without the core data and studies that only included specific subgroups (such as obese women only).
Four Norwegian studies (two cohort studies [9,10] and two randomized controlled trials (RCTs) [11,12]) were identified, and primary investigators were invited to become part of the "Norwegian Hyperglycemia in pregnancy" consortium in 2017. Principal investigators from all four studies agreed to participate, providing data from 3315 pregnant women and 3293 live births (Fig. 1).
The original studies collected data between 2002 and 2013. If GDM was diagnosed, women received diabetes care according to local guidelines. Details of the methods and characteristics of participants in each study, including eligibility criteria, methods of recruitment and measurements obtained, have been previously published [9][10][11][12]. Authors were requested to provide anonymous raw data to be stored and analyzed in The University of Oslo's Service for Sensitive data (TSD) storage platform with access for all the project partners. Data were further harmonized and assessed for internal consistency and missing items. Investigators were asked for clarification on issues regarding the coding of variables, and a final summary of relevant variables was sent for verification. After resolution, all datasets were merged. We excluded from analyses participants for whom no OGTT data were available, as well as multi-fetal pregnancies (Fig. 1).
The primary outcome was GDM prevalence. All women underwent a 75 g OGTT after an overnight fast. In two of the studies [9,10], venous blood samples were collected in tubes containing ethylenediaminetetraacetic acid (EDTA), and glucose was analyzed on site in fresh, whole EDTA blood, using a HemoCue 201+ glucose analyser (Angelholm, Sweden) [9] or an Accu-Chek Sensor glucometer (Roche Diagnostics, Mannheim, Germany) [10] according to protocols. In Sagedal et al. [11] and Stafne et al. [12], fasting and 2-h glucose levels were measured in plasma or serum, respectively, by the routine methods used at the participating hospital laboratory.
The diagnosis was originally made according to the 1999 WHO criteria, which were in use during data collection. In addition, we applied the 2013 WHO criteria and the 2017 Norwegian criteria (Table 1) for the purposes of this specific study. The 2013 WHO criteria also include a 1-h glucose value, which was not measured in the respective studies.
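As an illustration of how the three sets of thresholds can classify the same OGTT result differently, the following Python sketch encodes the fasting and 2-h cut-offs stated for each set of criteria. The names `CRITERIA` and `has_gdm` and the example glucose values are hypothetical, and the 1-h value of the 2013 WHO criteria is ignored because it was not measured in these studies.

```python
# Thresholds in mmol/L as stated for each set of criteria in this paper.
# The dictionary layout and all names are illustrative assumptions.
CRITERIA = {
    "WHO1999": {"fasting": 7.0, "two_hour": 7.8},
    "WHO2013": {"fasting": 5.1, "two_hour": 8.5},
    "NOR2017": {"fasting": 5.3, "two_hour": 9.0},
}


def has_gdm(fasting, two_hour, criteria):
    """Return True if either glucose value meets or exceeds its threshold."""
    thresholds = CRITERIA[criteria]
    return fasting >= thresholds["fasting"] or two_hour >= thresholds["two_hour"]


# Hypothetical woman (not study data): fasting 5.2 mmol/L, 2-h value 7.9 mmol/L.
for name in CRITERIA:
    print(name, has_gdm(5.2, 7.9, name))
```

In this made-up example the woman meets the 1999 WHO criteria through the 2-h value and the 2013 WHO criteria through the fasting value, but she does not meet the 2017 Norwegian criteria.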
In each individual study, women were either interviewed or asked to complete a questionnaire including information on current smoking status and their highest educational qualification. Women were further assessed at the study sites with respect to biological and anthropometric data. Height was measured directly, while weight prior to becoming pregnant was self-reported in all studies. Categories for age and pre-pregnancy body mass index (BMI) were determined prior to analysis and based on clinical relevance. Furthermore, women were classified as primiparous or parous for the purpose of this study. STORK Groruddalen [9] was the only study that actively included a multiethnic population (59% ethnic minority women, primarily born outside Europe). Ethnic origin was defined as European (predominantly Scandinavian as well as East and West European origin) or non-European (mainly Asian, North African, Middle Eastern or Sub-Saharan African). Family history of diabetes was not measured in the Fit for Delivery study.

Statistical analysis
Distributions of all potential predictors were checked for normality. The characteristics of the women were categorized by GDM status, and the two groups were compared using the χ² test for categorical data and Student's t-test for continuous variables. Data are reported as frequencies and percentages for categorical variables and as mean and standard deviation for continuous variables.
Information was available for 95% of the selected covariates. To assign values for the missing data on pre-pregnancy weight (5%), height (0.4%), educational attainment (0.3%) and parity (0.3%), we used stochastic regression imputation with predictive mean matching as the imputation model to substitute missing items in the observed population [13].
To examine associations between GDM and maternal factors, we modelled GDM as a binary outcome (GDM vs no GDM). Variables related to GDM in univariable logistic regression models with a p-value < 0.2 were considered in separate multivariable analyses [14]. The final model resulted from a backward selection procedure (exclusion if p > 0.15). All models were adjusted for cohort to handle unmeasured confounders. Results from logistic regression are presented as ORs with accompanying 95% confidence intervals (CI), and with Nagelkerke R² for model fit.
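For readers unfamiliar with the Nagelkerke R², the sketch below shows the standard definition: the Cox and Snell R² divided by its maximum attainable value, computed from the log-likelihoods of the intercept-only and fitted models. The function name and arguments are illustrative assumptions, and this is not the SPSS routine used in the study.

```python
import math


def nagelkerke_r2(loglik_null, loglik_model, n):
    """Nagelkerke R-squared from two log-likelihoods and the sample size n.

    loglik_null  -- log-likelihood of the intercept-only model
    loglik_model -- log-likelihood of the fitted model
    """
    # Cox and Snell R-squared: 1 - (L_null / L_model) ** (2 / n)
    cox_snell = 1.0 - math.exp(2.0 * (loglik_null - loglik_model) / n)
    # Its upper bound for a binary outcome: 1 - L_null ** (2 / n)
    max_cox_snell = 1.0 - math.exp(2.0 * loglik_null / n)
    return cox_snell / max_cox_snell


# Made-up log-likelihoods for illustration only (not study results).
print(nagelkerke_r2(loglik_null=-100.0, loglik_model=-90.0, n=200))
```

A model that does not improve on the intercept-only model gives a value of 0, and a model that fits perfectly (log-likelihood of 0) gives 1, which is why the rescaled statistic is easier to read than the Cox and Snell version.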
In the analyses, the two RCTs were treated as cohort studies, as the primary outcome (GDM) did not differ between the control and intervention groups in the original studies [11,12]. The interventions in the two trials consisted of either an exercise program (supervised exercise sessions) or a combination of a physical activity component and dietary counselling. The regression analysis was repeated after excluding participants who received the intervention to examine the potential role of the intervention in these RCTs.
Finally, we assessed the diagnostic accuracy across different pre-specified cut-offs for maternal age and BMI, with and without the addition of parity, based on previous and current screening guidelines. We calculated sensitivity (proportion of GDM cases correctly identified by the risk factor), specificity (proportion of women without GDM who did not have the risk factor), and the proportion of women with the risk factor (i.e. who would be offered an OGTT). Analyses were performed and presented separately for European and non-European women due to the strong effect of ethnicity. For each risk factor, single or in combination, the sensitivity estimates were plotted in Receiver Operating Characteristic (ROC) space against the proportion of women subjected to OGTT. An optimal risk factor combination will have high sensitivity with small numbers needing to be tested (results near the top left of the space). We opted for a sensitivity level of 80% for the risk factors. Statistical analyses were performed using SPSS software, Version 26 (USA).
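The three accuracy measures defined above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the SPSS procedure used in the study: the function name `screening_performance` and the toy data are hypothetical, and the function assumes at least one woman with and one without GDM.

```python
def screening_performance(has_risk_factor, has_gdm):
    """Sensitivity, specificity and proportion offered an OGTT for one rule.

    Both arguments are equal-length sequences of booleans, one entry per woman.
    """
    pairs = list(zip(has_risk_factor, has_gdm))
    tp = sum(1 for risk, gdm in pairs if risk and gdm)          # GDM, flagged
    fn = sum(1 for risk, gdm in pairs if not risk and gdm)      # GDM, missed
    tn = sum(1 for risk, gdm in pairs if not risk and not gdm)  # no GDM, not flagged
    fp = sum(1 for risk, gdm in pairs if risk and not gdm)      # no GDM, flagged
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "proportion_tested": (tp + fp) / len(pairs),
    }


# Made-up toy data for ten women (not study data).
risk = [True, True, True, True, True, True, False, False, False, False]
gdm = [True, True, False, False, False, False, True, False, False, False]
print(screening_performance(risk, gdm))
```

In the toy data, six of ten women carry the risk factor, so 60% would be offered an OGTT, and two of the three GDM cases would be detected. Plotting sensitivity against the proportion tested for each rule gives the kind of display described above.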

Results
We excluded more participants from the TRIP study than from the other studies due to missing GDM data (Fig. 1). Apart from this, no significant differences were noted between the women who were included in the study and those excluded (not shown). After exclusions, the pooled dataset comprised 2981 women with a mean (SD) age of 30.2 (4.4) years and a pre-pregnancy BMI of 23.7 kg/m² (Table 2). The majority were of European origin (87.0%), had higher education (73.4%) and were in their first pregnancy (61.0%). GDM was diagnosed in 320 (10.7%), 504 (16.9%) and 308 (10.3%) pregnancies with the 1999 WHO, 2013 WHO and 2017 Norwegian criteria, respectively.
Compared with the non-GDM group, women diagnosed with GDM by any of the criteria were more likely to be older, heavier, shorter and of non-European origin (Supporting information Table S1, additional file 1). Moreover, using the 2017 Norwegian criteria, 25.5% of women without GDM had overweight or obesity (BMI > 25 kg/m²), compared with 51.3% of women with GDM (P < 0.001). There were more primiparas in the non-GDM group (P < 0.001), except when applying the 1999 WHO criteria.
In logistic regression analyses, all selected variables except smoking were significantly associated with GDM with the 2017 Norwegian criteria prior to adjustments (Table 3). Nevertheless, the associations observed for parity, education and height were strongly attenuated and lost their significance in the multivariable adjusted model 1. Age, pre-pregnancy BMI and ethnicity remained the only significant predictors in the final multivariable model (model 2). However, compared with women ≤25 years, an increased OR for developing GDM was only found for those above 35 years of age (aOR 1.73; 95% CI: 1.07-2.80; P < 0.026).
Applying the 2013 WHO criteria led to similar findings (Table 4). For the 1999 WHO criteria, however, non-European ethnicity was not significantly associated with GDM, while parity and height remained significant in the final adjusted model (Supporting information Table S2, additional file 1). The predictive power of all models was low, with Nagelkerke R² values ranging from 0.9 to 16.4%, depending on the criteria applied. Sensitivity analysis restricted to individuals without lifestyle intervention in two of the cohorts led to similar findings, although age was no longer significant (not shown).
Table 5 displays estimates of sensitivity and the proportion needed to be screened for selected risk factor combinations, stratified by ethnic origin. In European women, the combination "age ≥25 years or BMI ≥25 kg/m²" achieved the highest sensitivity of 96.5% (i.e. detected 96.5% of GDM cases), but because these risk factors occurred in 93%, an OGTT would be required in almost all women. By adding parity to the age thresholds (25 years for primipara and 35 years for parous), the number of OGTTs needed was reduced to 75%, although a reduction in sensitivity to 85% was observed. Similar trends were observed for women with non-European background, except that family history of diabetes achieved a higher sensitivity (42.6%) than in their European counterparts (11%). Overall, the sensitivity of the risk factors was slightly higher when applying the 2017 Norwegian criteria than the 2013 WHO criteria.
Figure 3 shows the proportion of correctly identified GDM cases for European women, and the proportion that would be offered an OGTT for each risk factor or combination of factors, by the 2017 Norwegian and 2013 WHO criteria. Irrespective of the risk factor used, the sensitivity increased with the number of women needing a test for both diagnostic criteria, displaying three clusters of four to five factors with poor, moderate and good performance.
To identify at least 80% of women with GDM (good performance), at least 75% of all women would need to undergo an OGTT. The risk factor displaying both high sensitivity and the smallest proportion of OGTTs was the combination "BMI ≥25 kg/m² or (primipara + age ≥25) or (parous + age ≥35)". With 75% requiring a test, this factor combination failed to identify 15% of women with GDM by the 2017 Norwegian criteria. The proportion of OGTTs required could be reduced to 54% by increasing the threshold for age to ≥30 years for primipara (moderate performance); however, this approach implies that 27% of women with GDM will remain undiagnosed (Table 5).
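The combined rule quoted above can be written as a simple predicate. The sketch below is illustrative only; the function name `offer_ogtt` and its parameter names are hypothetical, with age in years and pre-pregnancy BMI in kg/m².

```python
def offer_ogtt(age, bmi, primiparous):
    """Combined rule: BMI >= 25, or primipara aged >= 25, or parous aged >= 35."""
    # The age threshold depends on parity, as in the combination described above.
    age_threshold = 25 if primiparous else 35
    return bmi >= 25 or age >= age_threshold


# Hypothetical women (not study data).
print(offer_ogtt(age=24, bmi=22.0, primiparous=True))   # below both thresholds
print(offer_ogtt(age=30, bmi=22.0, primiparous=False))  # parous, under 35, normal BMI
print(offer_ogtt(age=30, bmi=26.0, primiparous=False))  # flagged by BMI alone
```

Raising the primipara age threshold from 25 to 30 years in such a rule corresponds to the moderate-performance variant described above, which tests fewer women at the cost of more missed cases.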

Discussion
In this study of women universally offered an OGTT during the second half of pregnancy, we found a similar overall prevalence of GDM (10.7% vs 10.3%) with the previously used criteria (1999 WHO) and the 2017 Norwegian criteria, whereas the lower glucose thresholds of the 2013 WHO criteria identified considerably more women with GDM (16.9%). The prevalence more than doubled for non-European women when applying the 2013 WHO and 2017 Norwegian criteria, even after adjusting for covariates. Our study further shows that while advancing age and elevated BMI increased the risk of GDM, using these risk factors in pre-diagnostic screening is a poor method for accurately identifying women with GDM, resulting in many missed cases unless 70-80% of European women are tested. The sensitivity of the risk factors was lower for non-European women, indicating an even stronger rationale for universal screening in these women. Although shifting from the older 1999 WHO criteria to the new 2017 Norwegian criteria resulted in a similar frequency of GDM, the groups identified differ in terms of their metabolic profile. The latter criteria identified more women with a higher pre-pregnancy BMI and non-European ethnicity, presumably attributable to the lower fasting glucose threshold.
Our prevalence rates applying the 2013 WHO criteria are comparable with estimates reported in other studies in the past decade, although differences in screening procedures, demographic characteristics of the subjects as well as the ethnic make-up of the population make direct comparisons complex. Guariguata et al. [15] estimated that the global prevalence of hyperglycaemia using the 2013 WHO criteria was 16.9%. A more recent meta-analysis of high-income countries in Europe found an overall GDM prevalence of 5.4%, regardless of diagnostic criteria used [16]. In contrast, a study using 2013 WHO thresholds and only fasting glucose in a Danish pregnancy cohort, found that 40% were classified as having GDM [17]. The authors raised important questions about uniform application of diagnostic thresholds across the world, and suggested population-based local recommendations.
Multiple studies have evaluated selective risk factor-based strategies aiming to identify the best diagnostic approach for GDM [18,19]. We demonstrate that the most sensitive and specific cut-offs for maternal age and BMI in European women were age ≥25 years and BMI ≥25 kg/m² when parity was added. However, used as a screening strategy, this would mean inviting the majority of women for an OGTT, as at least one of these risk factors applies to most women today. This confirms recent findings from a systematic review and meta-analysis by Farrar et al. [8], which concluded that sensitivity increases with the number of women needing a test. This strategy does not vary much from universal screening, and supports the contention that identification of GDM requires testing of almost all pregnant women [20], especially considering the rise in maternal age and overweight/obesity among women of childbearing age over recent years [21].
Selective screening has the potential to spare many pregnant women diagnostic testing, thereby reducing time and resource use. However, consistent with others [22,23], we found that screening on the basis of risk factors would result in a larger number of missed diagnoses and hence limit the opportunity for immediate and long-term follow-up and treatment. This is of concern, as a substantial proportion of women with GDM have no defined risk factors [24,25]. The importance of GDM management is now widely accepted, and evidence supports that treatment of even milder degrees of hyperglycaemia could improve pregnancy outcomes [26,27].
Additionally, universal screening has the unique potential to identify this subset of women who would not otherwise be identified as having GDM, and, therefore, provide clinicians, as well as the women themselves, an opportunity to plan postpartum lifestyle interventions that could prevent or delay the onset of future type 2 diabetes [28][29][30].
Our study has several strengths. We merged data from four contemporary birth cohorts, allowing more powerful and flexible analyses. Additionally, although the level of missing data was generally low, missing data were adequately handled by multiple imputation to prevent biased results. By including different geographical populations in Norway, we believe that the results may be broadly generalizable in Norway as well as to different antenatal It is of note, however, that almost all non-European women came from one study and more than half were of Asian (mainly South Asian) origin. Nevertheless, the proportion included and the composition of this group are representative of the pregnant population with non-European ethnicity living in Norway [31]. The majority of the European women in our study had a normal BMI and high educational level, which may indicate that our prevalence rates of GDM are less generalizable to more high-risk populations. The rates of overweight and obesity in our cohort were somewhat lower than in our background population (8% obesity in our study vs 12% nationally in 2018) [32]. A selection bias towards inclusion of individuals with a higher health awareness, as is often seen in clinical studies, may have led to underestimation of the reported prevalence rates and the numbers needed to be screened. A higher proportion of overweight/obesity would require an OGTT of a larger number of women. In addition, had a 1-h value been measured in our study, the prevalence of GDM by the 2013 WHO criteria would presumably have increased somewhat. Second, two of the included studies were RCTs with a lifestyle intervention for half of the women. However, no effect of the intervention on GDM status was reported in these studies and, reassuringly, our findings remained unchanged in sensitivity analyses.
Lastly, we present data from four cohorts pooled into one data set where each study differs somewhat in terms of inclusion period, time of OGTT and geography, although by including Norwegian studies only and adjusting for study cohort this source of heterogeneity was limited.

Conclusion
The use of stricter diagnostic criteria than the 2013 WHO criteria (OR of 2.0 vs. 1.75) limited the prevalence of GDM to approximately the same level as with the older 1999 WHO criteria. We found that maternal characteristics are of limited use in identifying women with GDM, requiring testing of almost all women to avoid overlooking a substantial number of cases. The costs and benefits of universal screening, and the use of alternative testing algorithms or biomarkers, require further evaluation.