Symphysis-fundus height measurement to predict small-for-gestational-age status at birth: a systematic review
BMC Pregnancy and Childbirth volume 15, Article number: 22 (2015)
Fetal growth restriction is among the most common and complex problems in modern obstetrics. Symphysis-fundus (SF) height measurement is a non-invasive test that may help determine which women are at risk. This study is a systematic review of the literature on the accuracy of SF height measurement for the prediction of small-for-gestational-age (SGA) status at birth in unselected and low-risk pregnancies.
The Medline, Embase, CINAHL, SweMed, and Cochrane Library databases were searched with no limitation on publication date (through September 2014), which returned 722 citations. Two reviewers then developed a short list of 51 publications of possible relevance and assessed them using the following inclusion criteria: cohort study of test accuracy performed in a routine prenatal care setting; SF height measurement for all participants; classification of SGA, defined as birth weight (BW) < 10th, 5th, or 3rd percentile or ≥ one or two standard deviations below the mean; study conducted in Northern, Western, or Central Europe, the USA, Canada, Australia, or New Zealand; and sufficient data for 2 × 2 table construction. Quality of the included studies was assessed in duplicate using criteria suggested by the Cochrane Collaboration. Review Manager 5.3 software was used to analyze the data, including plotting of summary receiver operating characteristic curves.
Eight studies were included in the final dataset and seven were included in summary analyses. The sensitivity of SF height measurement for SGA (BW < 10th percentile) prediction ranged from 0.27 to 0.76 and specificity ranged from 0.79 to 0.92. Positive and negative likelihood ratios ranged from 1.91 to 9.09 and from 0.29 to 0.83, respectively.
SF height can serve as a clinical indicator along with other clinical findings, information about medical conditions, and previous obstetric history. However, SF height has high false-negative rates for SGA. Clinicians must understand the limitations of this test.
The protocol has been registered in the international prospective register of systematic reviews, PROSPERO (Registration No. CRD42014008928, http://www.crd.york.ac.uk/prospero/display_record.asp?ID=CRD42014008928).
Screening for fetal growth restriction (FGR) is one of the main purposes of antenatal care.
FGR is used to describe a fetus that did not reach its genetic growth potential and is associated with increased risks of morbidity and mortality, as well as adverse effects in childhood and later life [1-4]. Because no unanimously agreed-upon definition of FGR currently exists, small-for-gestational-age (SGA) is often used as a proxy. SGA is defined as weight below a specific percentile for gestational age, usually the 10th percentile. Although not all SGA neonates are pathologically growth restricted, detection of this group aims to facilitate the identification of at-risk pregnancies requiring further investigation due to potential FGR. Early identification and appropriate management of FGR can reduce perinatal morbidity and mortality.
In Scandinavia, screening relies on routine measurement of SF height, complemented by ultrasound measurement of fetal size in women with pregnancy complications or with a relevant history or clinical evidence of FGR [6-8]. SF height measurement is a technique in which the distance from the symphysis pubis to the uterine fundus is measured over the maternal abdomen with a tape measure. The measurement is plotted on a curve and compared with the distribution of the reference population [9,10]. If the recorded measurement is below acceptable limits according to the reference curves, further investigations of fetal growth and well-being are performed, including ultrasound estimations, uteroplacental and fetoplacental flow evaluations by Doppler, and cardiotocography.
Despite the routine use of SF height to predict SGA at birth, evidence for this method remains unclear. To date there is insufficient evidence from high quality trials to fully evaluate the effect of routine use of SF height during prenatal care on pregnancy outcomes. Several studies have examined the accuracy of SF height in predicting SGA status at birth, but inconsistency in the results has been observed. Most SF height research has been conducted in hospital-based settings and has investigated the relationship between SF height and SGA status in high risk populations [13-15]. Because of a different prevalence (pre-test probability) of SGA, results from hospital-based studies cannot be extrapolated to primary care settings.
In this systematic review we aim to assess the sensitivity and specificity of SF height for the prediction of SGA status at birth in unselected and low-risk pregnancies.
Criteria for considering studies for this review
Studies were selected for inclusion in the review according to the population, index test, target condition, reference standard, outcome measure, and study design.
Population: Studies examining singleton pregnancies in unselected or low-risk populations, conducted in health care systems comparable to those of Scandinavia (Northern, Western and Central Europe, USA, Canada, Australia, and New Zealand).
Index test: SF measurement compared to the SF distribution of the population.
Target condition: SGA or FGR.
Reference standard: Diagnosis of FGR or SGA, defined as birth weight (BW) < 10th, 5th, or 3rd percentile, or ≥ one or two standard deviations (SDs) below the mean (performed postnatally).
Outcome measure: Data required to populate 2 × 2 contingency tables.
Study design: Diagnostic cohort studies.
Search methods for identification of studies
Electronic databases (PubMed, Medline, Embase, CINAHL, Cochrane Library, and SweMed) were searched to identify eligible diagnostic studies from the earliest year possible through September 2014. The search strategy was developed for PubMed and modified for use in other databases (see Additional file 1). The reference lists of all included publications and relevant systematic reviews were checked and forward citation searches were performed.
The search strategy involved combinations of SF-related terms appearing in subject headings and as keywords. Our Medline search query was (fund* adj height*) OR (symph* adj fund*) OR (uter* adj height*) OR (symph* adj height*) OR (gravidogram*) OR (uterus fundus height*) OR (uter* fund* height*). We conducted our search and reported our findings according to the Meta-Analysis of Observational Studies in Epidemiology and Preferred Reporting Items for Systematic Reviews and Meta-Analyses statements [16-18].
Data collection and analysis
A list of articles meeting the inclusion criteria based on abstracts was compiled. The full texts of these studies and those of uncertain relevance were retrieved. Two reviewers (ASDP and JW) independently evaluated the studies’ fulfillment of the inclusion criteria, with any discrepancy discussed with a third reviewer until a final set of relevant studies was agreed upon.
Data extraction and management
The following data were extracted from all selected studies: general information (first author, publication year, country of investigation), population (health care setting, number of participants, level of risk), study design (design, data collection), characteristics of SF height test (SF height curve, cut-off points), reference standard (SGA definition) and results (data required for the construction of 2 × 2 contingency tables). Data were entered into a database using Review Manager 5.3 software.
Assessment of methodological quality
The quality of each included study was assessed by two review authors (ASDP, JW) using the QUality Assessment of Diagnostic Accuracy Studies (QUADAS-2) checklist [19,20]. The QUADAS-2 checklist asks signaling questions in four domains: patient selection, index test, reference standard, and flow and timing. Each domain is assessed in terms of risk of bias, and the first three domains are also assessed in terms of applicability. The review authors classified each item as “yes” (adequately addressed), “no” (inadequately addressed), or “unclear” (inadequate detail presented to allow a judgment to be made). The QUADAS-2 tool is shown in Additional file 2.
Statistical analysis and data synthesis
Data on sensitivity, specificity, and true-positive, false-positive, true-negative, and false-negative results were taken directly from the source papers or, if necessary, calculated from the data provided. Positive likelihood ratios (PLRs), negative likelihood ratios (NLRs), diagnostic odds ratios (DORs), and 95% confidence intervals (CIs) were calculated.
An LR describes how many times more likely it is that a person with the target condition will receive a particular test result than will a person without it. Categorization of LRs was adopted from Deeks et al., where PLRs > 10 or NLRs < 0.1 are considered to provide convincing diagnostic evidence. The DOR is commonly used as an overall indicator of diagnostic performance and is calculated as the odds of a positive test result among those with the target condition, divided by the odds of a positive test result among those without the condition. As a general rule, a DOR > 100 indicates high accuracy, values of 25–100 indicate moderate accuracy, and values < 25 indicate that the test is not useful.
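The measures defined above can all be derived from a single 2 × 2 table. The following is a minimal sketch, not the procedure used in Review Manager; the function name and the counts are hypothetical and chosen only for illustration. The 95% CI for the DOR uses the standard error of the log odds ratio, which is one common approach and is assumed here rather than taken from the review.

```python
import math


def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy measures from one 2 x 2 table of counts.

    tp/fn: SGA neonates with abnormal/normal SF height.
    fp/tn: non-SGA neonates with abnormal/normal SF height.
    No continuity correction is applied, so every cell must be > 0.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    plr = sensitivity / (1 - specificity)   # positive likelihood ratio
    nlr = (1 - sensitivity) / specificity   # negative likelihood ratio
    dor = (tp * tn) / (fp * fn)             # diagnostic odds ratio

    # 95% CI for the DOR on the log scale (assumed method).
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    dor_ci = (
        math.exp(math.log(dor) - 1.96 * se_log_dor),
        math.exp(math.log(dor) + 1.96 * se_log_dor),
    )
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "plr": plr,
        "nlr": nlr,
        "dor": dor,
        "dor_ci": dor_ci,
    }


# Hypothetical counts, not taken from any included study.
result = diagnostic_accuracy(tp=30, fp=90, fn=70, tn=810)
for name, value in result.items():
    print(name, value)
```

With these made-up counts the PLR is about 3 and the NLR about 0.78, so neither reaches the PLR > 10 or NLR < 0.1 thresholds quoted above.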
The data were displayed graphically on forest and summary receiver operating characteristic (SROC) plots. The SROC curve was fitted using the hierarchical bivariate random-effects method. For studies that used more than one SF threshold, the analysis was based on the cut-off point of “one value < 10th percentile”.
Investigation of heterogeneity
Both clinical and statistical heterogeneity were evaluated. Assessment of clinical heterogeneity involved comparison of SF reference curves, cut-off criteria used to identify abnormal results, and SGA definitions. Assessment of statistical heterogeneity involved visual inspection of forest plots and calculation of the inconsistency index (I²), which describes the percentage of total variation across studies that is due to heterogeneity, rather than chance.
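As an illustration of the inconsistency index, the sketch below computes Cochran's Q and I² from per-study estimates and their within-study variances, using the usual formula I² = 100% × (Q − df)/Q, truncated at zero. The review does not state which effect measure I² was computed on, so the inputs here are hypothetical values on an arbitrary scale (for example log DORs), and the function name is an assumption.

```python
def i_squared(estimates, variances):
    """Return (Q, I2 in percent) for inverse-variance weighted estimates."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, estimates))
    df = len(estimates) - 1
    # I2 is the share of Q in excess of its expected value under homogeneity.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, i2


# Hypothetical study-level estimates and variances, for illustration only.
q_stat, i2_percent = i_squared([0.5, 1.5, 2.5], [0.04, 0.04, 0.04])
print(q_stat, i2_percent)
```

Widely scattered estimates with small variances, as in this made-up example, give a large Q relative to its degrees of freedom and therefore a high I².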
Initial database searches retrieved 722 citations of which 525 citations remained after duplicates were removed (Figure 1). Screening of the titles and abstracts identified 51 potentially relevant articles that were retrieved in full text format. Forward and backward citation tracking did not result in the identification of additional relevant articles. Eight articles were included in final analyses. Additional file 3 lists the reasons for excluding 43 articles on the basis of study population, design or outcome measures.
Characteristics of included studies [25-32] are presented in Table 1. All studies were published between 1982 and 1991. Most studies used locally derived SF curves. Different cut-off criteria were used to identify abnormal results, including one value < 10th percentile; two consecutive or three isolated values < 10th percentile; one value > 2 cm below the mean; one value > 2 cm below the mean or three static or falling values; and one value > two SDs below the mean. Definitions of SGA included BW < 10th percentile, < 5th percentile, and ≥ two SDs below the mean, according to local standards.
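To make the difference between two of these cut-off criteria concrete, the sketch below applies them to one pregnancy's series of SF measurements. It assumes each visit has already been classified as below or not below the 10th percentile of the relevant reference curve; the function names and the visit data are hypothetical and do not reproduce any included study's exact rule.

```python
def one_value_below(flags):
    """Abnormal if any single measurement is below the 10th percentile."""
    return any(flags)


def two_consecutive_or_three_isolated(flags):
    """Abnormal if two consecutive measurements, or at least three
    measurements overall, are below the 10th percentile."""
    consecutive = any(a and b for a, b in zip(flags, flags[1:]))
    return consecutive or sum(flags) >= 3


# Hypothetical visits: True means SF height below the 10th percentile.
visits = [False, True, False, True, False]

print(one_value_below(visits))                    # True
print(two_consecutive_or_three_isolated(visits))  # False
```

The same series is flagged by the looser rule but not by the stricter one, which is why the choice of cut-off affects the trade-off between sensitivity and specificity.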
Methodological quality of included studies
The QUADAS-2 ratings of risk of bias and study applicability are shown in Table 2. Based on the inclusion criteria, no included study had a case–control design. All studies avoided inappropriate exclusions. Six of the eight studies used consecutive or random recruitment of participants. The two remaining studies [30,32] did not report such information and were considered to be at unclear risk of patient selection bias. Most studies had a low risk of bias due to patient flow and timing; seven of eight studies involved the analysis of all recruited participants and one analysis included 78% of recruited participants. Studies included in this review had a low risk of bias for the conduct of the reference standard. All studies used pre-specified index test thresholds. No study reported blinding to test results, but BW is objective and should not result in bias. Regarding the applicability of studies to the review questions, no study raised concern about the index test, reference standard or patient selection.
Accuracy of SF height for the prediction of SGA defined as BW < 10th percentile
Seven studies assessed the accuracy of SF height for the prediction of SGA defined as BW < 10th percentile. Sensitivities ranged from 0.27 to 0.76 and specificities ranged from 0.79 to 0.92. All studies produced DORs exceeding 1 and CIs that did not include 1, implying that the positive association of SF height with SGA was not due to chance alone. PLRs exceeded 1 in all studies, indicating that abnormal SF height values were associated with SGA status at birth. However, all PLRs were < 10, the threshold generally accepted for a useful test. The same seven studies reported NLRs < 1, indicating that normal SF height values were associated with the absence of SGA. However, no study met the accepted criterion of NLR < 0.1 in this group of women. The SROC curve (Figure 2) constructed using data from these studies lies to the left of the diagonal, signifying that the SF height test has value. The I² value was high (98%). Given the small number of included studies (and thus low statistical power), subgroup analyses and covariate hierarchical modeling to investigate heterogeneity were not performed.
Accuracy of SF height for the prediction of SGA defined as BW <5th percentile
One study assessed the accuracy of SF height for the prediction of SGA defined as BW < 5th percentile. This study used several cut-off points, with stricter criteria yielding lower sensitivity and higher specificity values. NLRs and PLRs did not meet the accepted criteria for classification of SF height measurement as a useful test.
Accuracy of SF height for the prediction of SGA defined as BW ≥ 2 SDs below the mean
One study assessed the outcome of SGA defined as BW ≥ 2 SDs below the mean. For a less strict SF cut-off point (one value > 2 cm below mean or falling or static values), the authors reported low sensitivity (59%) and high specificity (97%). The PLR exceeded 10, but the NLR did not meet the required criterion of <0.1.
SF height measurement seems to have some significance for the prediction of SGA defined as BW < 10th percentile. All studies reported DORs > 1. The SROC curve (Figure 2) lies to the left of the diagonal, signifying that the SF height test has value. Adequate levels of sensitivity appear to be achieved at the expense of lower specificity, with higher numbers of false-positive SF results. The study of Rogers et al., positioned at the upper left of the SROC curve, produced the most significant results supporting the use of SF height. Its false negative rate of only seven is likely to be due to the small size of the study. In contrast, the study of Persson et al. is the largest study and has the narrowest CI. Its sensitivity and specificity lie along the SROC line, adding weight to our findings.
For the prediction of SGA defined as BW < 5th percentile and BW ≥ 2 SDs below the mean, no summary measure could be calculated because too few studies assessed these outcomes. Further assessment of the predictive value of SF height for these SGA definitions is required.
The diagnostic accuracy of SF height in other populations of pregnant women has recently been reviewed. Goto assessed the diagnostic value of SF height, mainly in developing countries. However, this review included studies across a wide range of ethnic groups, clinical settings and disease spectrums. Despite such a diverse case mix, the study did not assess its effect on the pooled estimates, thus making it difficult to interpret its findings in a low-risk setting. In view of these limitations, we applied stricter inclusion criteria in our study, focusing mainly on a more homogeneous and relevant population.
Strengths and weaknesses of the review
The majority of studies available in this systematic review were conducted in the 1980s. Given the limited amount of data available for the accuracy of SF height measurement, we did not discard studies based solely on year of publication. All included studies had low concern regarding applicability, implying that evidence is relevant to current practice. The focus on nations with comparable health systems means that the findings may not be relevant to different and less well-resourced national health systems.
Many parameters involving the performance of SF height measurement, such as technique, frequency of measurement, and performer’s experience, potentially affect test accuracy. Unfortunately, we did not have detailed information about the test conditions, limiting our ability to explore the effects of potential differences in methods. As no universal SGA definition has been established, the studies included in this review may also have been biased by the choice of reference test. Our inclusion criteria required postnatal confirmation of SGA classification. All studies fulfilled this requirement, but most did not provide information about how gestational age was determined or which BW references were used to classify SGA status postnatally.
This review focused on the role of SF height in detecting SGA as a proxy for FGR. However, FGR can exist without SGA. The role of SF height in this setting remains undefined because all SF height studies in this review used SGA as an outcome. Customized SF charts (adjusted for ethnicity, parity, and body mass index) are said to be better predictors of FGR. Furthermore, this review did not address the issue of effect, for which additional studies would be needed to assess the role of SF height.
Ultimately, the lack of large cohort studies conducted in a routine prenatal care setting that were suitable for our analysis was the main limitation of this review.
Applicability of findings to clinical practice and policy
SF height can be the first parameter raising suspicion of FGR. We have previously discussed the limitations of the study populations. However, our results can be applied to low-risk and unselected pregnancies in routine prenatal care settings, which is useful for general practitioners and midwives seeking to identify pregnancies at risk of SGA.
We found that the SF height test had a sensitivity ranging from 0.27 to 0.76, which means that, at the lower end of this range, it potentially fails to identify over 70% of pregnancies affected by SGA. This is important to consider when counselling pregnant women. However, in clinical practice the SF height test is not carried out in isolation; other clinical findings, medical conditions and previous obstetric history together contribute to estimating the likelihood of being at risk for SGA.
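The arithmetic behind the false-negative figure can be sketched as follows. The cohort size of 1000 pregnancies and the 10% SGA prevalence are assumptions made purely for illustration (they are not results of this review); only the two sensitivity values come from the reported range.

```python
def expected_missed(n_pregnancies, prevalence, sensitivity):
    """Expected number of SGA pregnancies not flagged by the test."""
    affected = n_pregnancies * prevalence
    # The false-negative proportion is the complement of sensitivity.
    return affected * (1 - sensitivity)


# Assumed cohort: 1000 pregnancies, 10% SGA prevalence (hypothetical).
missed_low_sens = expected_missed(1000, 0.10, 0.27)
missed_high_sens = expected_missed(1000, 0.10, 0.76)

print(round(missed_low_sens), round(missed_high_sens))
```

Under these assumptions, about 73 of 100 SGA pregnancies would be missed at a sensitivity of 0.27 and about 24 of 100 at a sensitivity of 0.76.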
Our results show that the SF height test has a high degree of specificity (0.79 to 0.92 across studies), indicating that few pregnancies not characterized by SGA are referred for ultrasound examination in practice. However, in this case over-referral or the misidentification of pregnancies as at risk is of less concern than the failure to identify pregnancies at risk.
Primary screening should emphasize the importance of sensitivity over specificity to identify almost all at-risk participants. No test is perfect and there will always be problems with incorrect results, e.g., anxiety and unnecessary intervention due to a false-positive result or a false sense of security caused by a false-negative result. A positive SF screening result can usually be confirmed or refuted with further evaluation of fetal growth and well-being by a specialist.
Implications for practice
SF height can play a role in clinical practice. It is a non-invasive, simple, and inexpensive method. However, it has low sensitivity. Other techniques that could improve upon this limitation (e.g., routine ultrasound in the third trimester) have not been implemented in the routine prenatal care setting. We recommend the continued use of SF height measurement in clinical practice as one of several indicators for referral to an obstetric care unit. However, clinicians must understand the limitations of the test.
Implications for research
Further studies including larger numbers of patients and better standardized reporting criteria are desirable. The accuracy of adjusted versus unadjusted SF curves needs to be evaluated.
DOR: Diagnostic odds ratio
FGR: Fetal growth restriction
NLR: Negative likelihood ratio
PLR: Positive likelihood ratio
PRISMA: Preferred reporting items for systematic reviews and meta-analyses
QUADAS: Quality assessment of diagnostic accuracy studies
SROC: Summary receiver operating characteristic
Frøen JF, Gardosi JO, Thurmann A, Francis A, Stray-Pedersen B. Restricted fetal growth in sudden intrauterine unexplained death. Acta Obstet Gynecol Scand. 2004;83(9):801–7.
Goldenberg RL, Hoffman HJ, Cliver SP. Neurodevelopmental outcome of small-for-gestational-age infants. Eur J Clin Nutr. 1998;52 Suppl 1:S54–8.
Selling KE, Carstensen J, Finnstrom O, Sydsjo G. Intergenerational effects of preterm birth and reduced intrauterine growth: a population-based study of Swedish mother-offspring pairs. BJOG. 2006;113(4):430–40.
Simchen MJ, Beiner ME, Strauss-Liviathan N, Dulitzky M, Kuint J, Mashiach S, et al. Neonatal outcome in growth-restricted versus appropriately grown preterm infants. Am J Perinatol. 2000;17(4):187–92.
Lindqvist PG, Molin J. Does antenatal identification of small-for-gestational age fetuses significantly improve their outcome? Ultrasound Obstet Gynecol. 2005;25(3):258–64.
Sosial- og helsedirektoratet. Retningslinjer for svangerskapsomsorgen [Guidelines for antenatal care] (in Norwegian). Oslo: Sosial- og helsedirektoratet; 2005.
Sundhedsstyrelsen. Anbefalinger for svangreomsorgen [Recommendations for antenatal care] (in Danish). København: Sundhedsstyrelsen; 2013.
Svensk förening för obstetrik och gynekologi. Mödrahälsovård, sexuell och reproduktiv hälsa [Antenatal care, sexual and reproductive health] (in Swedish). Stockholm: SFOG; 2008.
Steingrimsdottir T, Cnattingius S, Lindmark G. Symphysis-fundus height: construction of a new Swedish reference curve, based on ultrasonically dated pregnancies. Acta Obstet Gynecol Scand. 1995;74(5):346–51.
Westin B. Gravidogram and fetal growth. Comparison with biochemical supervision. Acta Obstet Gynecol Scand. 1977;56(4):273–82.
Robert Peter J, Ho JJ, Valliapan J, Sivasangari S. Symphysial fundal height (SFH) measurement in pregnancy for detecting abnormal fetal growth. Cochrane Database Syst Rev. 2012;7:CD008136.
Morse K, Williams A, Gardosi J. Fetal growth screening by fundal height measurement. Best Pract Res Clin Obstet Gynaecol. 2009;23(6):809–18.
Belizan JM, Villar J, Nardin JC. Diagnosis of intrauterine growth retardation by a simple clinical method: measurement of uterine height. Am J Obstet Gynecol. 1978;131(6):643–6.
Quaranta P, Currell R, Redman CWG, Robinson JS. Prediction of small-for-dates infants by measurement of symphysial-fundal-height. Br J Obstet Gynaecol. 1981;88(2):115–9.
Wallin A, Gyllensward A, Westin B. Symphysis-fundus measurement in prediction of fetal growth disturbances. Acta Obstet Gynecol Scand. 1981;60(3):317–23.
Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JP, et al. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. J Clin Epidemiol. 2009;62(10):e1–34.
Moher D, Liberati A, Tetzlaff J, Altman DG, Group P. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA Statement. Open Med. 2009;3(3):e123–30.
Stroup DF, Berlin JA, Morton SC, Olkin I, Williamson GD, Rennie D, et al. Meta-analysis of observational studies in epidemiology: a proposal for reporting. Meta-analysis Of Observational Studies in Epidemiology (MOOSE) group. JAMA. 2000;283(15):2008–12.
Whiting PF, Rutjes AW, Westwood ME, Mallett S, Deeks JJ, Reitsma JB, et al. QUADAS-2: a revised tool for the quality assessment of diagnostic accuracy studies. Ann Intern Med. 2011;155(8):529–36.
Whiting PF, Weswood ME, Rutjes AW, Reitsma JB, Bossuyt PN, Kleijnen J. Evaluation of QUADAS, a tool for the quality assessment of diagnostic accuracy studies. BMC Med Res Methodol. 2006;6:9.
Deeks JJ. Systematic reviews in health care: systematic reviews of evaluations of diagnostic and screening tests. BMJ. 2001;323(7305):157–62.
Macaskill P, Gatsonis C, Deeks JJ, Harbord RM, Takwoingi Y. Chapter 10: analysing and presenting results. In: Deeks JJ, Bossuyt PM, Gatsonis C, editors. Cochrane Handbook for Systematic Reviews of Diagnostic Test Accuracy Version 1.0. The Cochrane Collaboration. 2010. Available from: http://srdta.cochrane.org/.
Rutter CM, Gatsonis CA. A hierarchical regression approach to meta-analysis of diagnostic test accuracy evaluations. Stat Med. 2001;20(19):2865–84.
Higgins JP, Thompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analyses. BMJ. 2003;327(7414):557–60.
Calvert JP, Crean EE, Newcombe RG, Pearson JF. Antenatal screening by measurement of symphysis-fundus height. Br Med J. 1982;285(6345):846–9.
Cnattingius S. Antenatal screening for small-for-gestational-age, using risk factors and measurements of the symphysis-fundus distance - 6 Years of experience. Early Hum Dev. 1988;18(2–3):191–7.
Jensen OH, Larsen S. Evaluation of symphysis-fundus measurements and weighing during pregnancy. Acta Obstet Gynecol Scand. 1991;70(1):13–6.
Pearce JM, Campbell S. A comparison of symphysis-fundal height and ultrasound as screening tests for light-for-gestational age infants. Br J Obstet Gynaecol. 1987;94(2):100–4.
Persson B, Stangenberg M, Lunell NO. Prediction of size of infants at birth by measurement of symphysis fundus height. Br J Obstet Gynaecol. 1986;93(3):206–11.
Rogers MS, Needham PG. Evaluation of fundal height measurement in antenatal care. Aust N Z J Obstet Gynaecol. 1985;25(2):87–90.
Rosenberg K, Grant JM, Tweedie I. Measurement of fundal height as a screening test for fetal growth retardation. Br J Obstet Gynaecol. 1982;89(6):447–50.
Stuart JM, Healy TJ, Sutton M, Swingler GR. Symphysis-fundus measurements in screening for small-for-dates infants: a community based study in Gloucestershire. J R Coll Gen Pract. 1989;39(319):45–8.
Goto E. Prediction of low birthweight and small for gestational age from symphysis-fundal height mainly in developing countries: a meta-analysis. J Epidemiol Community Health. 2013;67(12):999–1005.
Gardosi J, Francis A. Controlled trial of fundal height measurement plotted on customised antenatal growth charts. Br J Obstet Gynaecol. 1999;106(4):309–17.
Skrastad RB, Eik-Nes SH, Sviggum O, Johansen OJ, Salvesen KA, Romundstad PR, et al. A randomized controlled trial of third-trimester routine ultrasound in a non-selected population. Acta Obstet Gynecol Scand. 2013;92(12):1353–60.
The study was supported by the Norwegian Extra Foundation for Health and Rehabilitation and the Norwegian SIDS and Stillbirth Society (grant no. 2010/10230).
The authors declare that they have no competing interests.
ASDP, JW, BJ, AS, BB, and AK contributed to the conception and design of the study, interpretation of results, and writing of the manuscript. ASDP performed the statistical analyses and drafted the manuscript. All authors participated in the evaluation of the data and approved the final manuscript.