Population
Three data sources were used in this retrospective study. A retrospective design was chosen over a prospective one because the data were readily available, the approach is less expensive, and the study could be completed within a shorter period. The data sources were: (1) de-identified linked birth/death vital health data records for WV, 2010–2011 [24]; (2) the United States Census Bureau website [25]; and (3) the County Health Rankings and Roadmaps website [26].
The West Virginia University Institutional Review Board (IRB) approved this study.
Measures
The overall goal of this study was to investigate the association between county perinatal resources and GWG in WV, a predominantly rural state in central Appalachia. The West Virginia Bureau for Public Health (Charleston, WV) provided de-identified 2010–2011 linked birth/death data. There were 41,176 births in 55 counties over the two-year period. After univariate outliers for GWG were removed and observations with missing maternal age, gestational age, or birth weight were excluded, 41,106 observations remained.
The percentage of the county population in poverty in 2010–2011 was extracted from the U.S. Census Bureau [25]. The percentage of county zip codes with healthy food options and the number of primary care providers in 2010–2011 were extracted from the County Health Rankings and Roadmaps website [26]. The three datasets were linked by county of residence.
Study variables included individual-level and global-level characteristics. Individual-level characteristics were birth identification number, birth year, gestational hypertension, infant’s sex, insurance type/payer, maternal education level, smoking during pregnancy, gestational diabetes, trimester prenatal care began, GWG, maternal age, and race/ethnicity. Global-level variables included county of residence, county perinatal resources, county median household income, percentage of county population in poverty, county population estimate, number of county primary care providers, and percentage of county zip codes with access to healthy food options.
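The county-level linkage described above can be sketched as follows. This is a minimal illustration in Python rather than the SAS code used in the study; the field names (`county`, `pct_poverty`, `pct_zip_healthy_food`, `n_primary_care`) and all values are hypothetical, and the choice to keep unmatched births with missing county values is an assumption.

```python
def link_by_county(births, poverty_by_county, resources_by_county):
    """Attach county-level values to each birth record using the 'county' key.

    Births whose county is absent from a county-level source are kept,
    with None for the missing values (an assumption of this sketch).
    """
    linked = []
    for record in births:
        county = record["county"]
        resources = resources_by_county.get(county, {})
        linked.append({
            **record,
            "pct_poverty": poverty_by_county.get(county),
            "pct_zip_healthy_food": resources.get("pct_zip_healthy_food"),
            "n_primary_care": resources.get("n_primary_care"),
        })
    return linked


# Hypothetical example data (values are made up for illustration only).
births = [
    {"birth_id": 1, "county": "Kanawha", "gwg_lbs": 31.0},
    {"birth_id": 2, "county": "Monongalia", "gwg_lbs": 27.0},
]
poverty = {"Kanawha": 15.0, "Monongalia": 20.0}
resources = {
    "Kanawha": {"pct_zip_healthy_food": 60.0, "n_primary_care": 150},
}

linked = link_by_county(births, poverty, resources)
```

Each linked record then carries both the individual-level fields and the county-level fields, which is the structure a multilevel model needs.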
Outcome
The outcome of interest was GWG (in lbs.), a continuous variable reported in the linked birth vital health data records. GWG is maternal weight gain during pregnancy, calculated as the difference between pregnancy weight (i.e., maternal weight before birth) and pre-pregnancy weight. Healthcare providers collect maternal pregnancy weight during prenatal visits.
Predictor
The primary predictor of interest was county perinatal resources, a composite categorical variable (0 = Above Average, 1 = Average, 2 = Below Average). The number of primary care providers in a county and the percentage of county zip codes with healthy food options were used to derive the county perinatal resource levels. The supply of primary care providers was classified as appropriate if it met or exceeded the Solucient recommendation of approximately 23 primary care providers per 100,000 population, and not appropriate otherwise [27]. Zip codes have been used in the past to study disparities in access to healthy food options [28, 29]. A county was considered to have healthy food options if at least 50% of its zip codes had access to healthy food options (e.g., supermarkets). Access to healthy food options was defined as the presence of grocery stores with more than 4 employees and with fresh fruit and vegetable stands, as provided by County Health Rankings and Roadmaps [26]. When census tracts are used to identify food deserts, typically at least 33% must have limited access to healthy food options [30]. A food desert can also be defined as a county in which at least 50% of the population has limited access to healthy food options [31]; a 50% threshold was used in this study to describe access to healthy food options in order to minimize overestimation of the findings.
Thus, county perinatal resources were derived by combining the percentage of county zip codes with access to healthy food options and the number of primary care providers in a county, as follows. Counties with 50% or fewer of zip codes with access to healthy food options and with fewer than the recommended number of primary care providers were coded as having below average perinatal resources. Counties with 50% or fewer of zip codes with access to healthy food options and with more than the recommended number of primary care providers, or vice versa, were coded as having average perinatal resources. Finally, counties with more than 50% of zip codes with access to healthy food options and with more than the recommended number of primary care providers were coded as having above average perinatal resources.
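The coding rule above can be sketched in Python as below. This is an illustration, not the study's SAS code. Function and constant names are hypothetical. Two boundary choices are assumptions of the sketch: food access counts as adequate only when more than 50% of zip codes have access, and provider supply counts as adequate when the rate per 100,000 population meets or exceeds 23.

```python
# Threshold values taken from the text; names are hypothetical.
RECOMMENDED_PROVIDERS_PER_100K = 23.0  # approximate Solucient recommendation
FOOD_ACCESS_THRESHOLD_PCT = 50.0       # percentage of county zip codes

# Category codes as given in the text.
LEVEL_LABELS = {0: "Above Average", 1: "Average", 2: "Below Average"}


def providers_per_100k(n_providers, population):
    """Primary care providers per 100,000 county residents."""
    if population <= 0:
        raise ValueError("population must be positive")
    return n_providers * 100_000 / population


def perinatal_resource_level(pct_zip_healthy_food, n_providers, population):
    """Return 0, 1, or 2 for above average, average, or below average resources.

    Assumed boundaries: food access is adequate when strictly above 50%;
    provider supply is adequate when the rate is at or above 23 per 100,000.
    """
    food_ok = pct_zip_healthy_food > FOOD_ACCESS_THRESHOLD_PCT
    providers_ok = (
        providers_per_100k(n_providers, population)
        >= RECOMMENDED_PROVIDERS_PER_100K
    )
    # Both criteria met -> 0, exactly one met -> 1, neither met -> 2.
    return 2 - (int(food_ok) + int(providers_ok))


# Hypothetical county: 60% of zip codes with access, 30 providers, 100,000 residents.
example_level = perinatal_resource_level(60.0, 30, 100_000)
example_label = LEVEL_LABELS[example_level]
```

Counting how many of the two criteria are met keeps the "or vice versa" case in a single branch, so both mixed combinations map to the average category.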
Covariates
Covariates included categorical and continuous variables. Categorical covariates included gestational hypertension, neonate’s sex, smoking during pregnancy, race/ethnicity, trimester prenatal care began, and insurance type. Continuous covariates included percentage of county population in poverty and maternal age (years).
Statistical methods
Data management and analysis were performed using SAS/STAT software, version 9.4 of the SAS System for Windows [32]. Descriptive statistics are presented as measures of central tendency and spread for continuous variables and as proportions and frequency distributions for categorical variables. Additionally, inferential statistics were used to estimate population parameters for hypothesis testing, using hierarchical linear models (HLM) fitted with a mixed modeling procedure with random effects as described by Suzuki and Sheu, and Bell and colleagues [33, 34]. HLM is recommended for nested data, here births nested within counties.
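In generic notation, a two-level model for births nested within counties can be written as below. This is an illustrative sketch, not the exact specification fitted in this study: the symbols are assumptions, as is the restriction to a single random intercept per county.

```latex
\begin{aligned}
Y_{ij} &= \gamma_{00}
        + \boldsymbol{\beta}^{\top}\mathbf{x}_{ij}
        + \boldsymbol{\gamma}^{\top}\mathbf{z}_{j}
        + u_{0j} + e_{ij}, \\
u_{0j} &\sim \mathcal{N}(0, \tau_{00}), \qquad
e_{ij} \sim \mathcal{N}(0, \sigma^{2})
\end{aligned}
```

Here $Y_{ij}$ is GWG for birth $i$ in county $j$, $\mathbf{x}_{ij}$ collects individual-level covariates, $\mathbf{z}_{j}$ collects county-level variables (including indicator variables for the county perinatal resource level), $u_{0j}$ is the county random intercept, and $e_{ij}$ is the individual-level residual.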
Regression assumptions were tested during the statistical modeling process. Before a parsimonious model was derived, continuous variables were tested for bivariate associations using Pearson’s correlation, and categorical variables were assessed using Spearman’s correlation. Most covariates were not correlated. The exceptions were weak correlations between insurance type and maternal education level and between smoking during pregnancy and maternal education level (both r = −0.3), and a strong correlation between county median household income and percentage of county population in poverty (r = −0.8). Thus, education and median household income were dropped from the model due to concerns about multicollinearity. All other regression model assumptions held.
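The two correlation measures named above can be sketched in plain Python as follows. This illustrates the standard formulas and is not the SAS procedure used in the study. The function names and the example values are hypothetical. Spearman's coefficient is computed here as Pearson's coefficient on average ranks, which handles ties.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical county values: median household income (USD) and percent in poverty.
income = [32000, 38000, 41000, 47000, 52000]
poverty_pct = [24.0, 21.5, 19.0, 16.0, 14.5]
r_income_poverty = pearson_r(income, poverty_pct)
```

A strongly negative coefficient between two county-level predictors, as reported for income and poverty, is the kind of result that motivates keeping only one of them in the model.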
A final parsimonious model was obtained in three steps. First, all interactions with a p-value greater than 0.1 were removed. Second, all interactions with a p-value greater than 0.05 were removed at once; this step was repeated until no interaction had a p-value greater than 0.05. Finally, main effects were removed if they had a p-value greater than 0.05. The best-fitting model was selected via the Akaike information criterion (AIC) goodness-of-fit measure, taking into account variability due to nesting as assessed by the intraclass correlation coefficient (ICC). The fully parameterized HLM was a better fit than the intercept-only HLM (AIC = 251,985, ICC = 0.05 and AIC = 203,919, ICC = 0.07, respectively). HLM was used to capture differences between levels of county perinatal resources. Final model results, after assumption testing, are presented. These results include the omnibus test, parameter estimates, standard errors, confidence intervals, t-statistics, and p-values.
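For reference, the two quantities used in model selection follow standard definitions, sketched below in Python. The ICC is the share of total variance attributable to the county level, and the AIC combines the maximized log-likelihood with the number of estimated parameters. The function names and the variance components in the example are hypothetical; the study does not report its variance components in this section.

```python
def intraclass_correlation(between_county_var, residual_var):
    """ICC = between-county variance / (between-county + residual variance)."""
    if between_county_var < 0 or residual_var < 0:
        raise ValueError("variance components must be non-negative")
    total = between_county_var + residual_var
    if total == 0:
        raise ValueError("total variance must be positive")
    return between_county_var / total


def akaike_information_criterion(log_likelihood, n_parameters):
    """AIC = 2k - 2 ln(L), where k is the number of estimated parameters."""
    return 2 * n_parameters - 2 * log_likelihood


# Hypothetical variance components chosen only to illustrate the calculation.
example_icc = intraclass_correlation(between_county_var=1.0, residual_var=19.0)
```

With these made-up components, 5% of the variance in the outcome lies between counties and the remainder lies between births within counties.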