Data source
PRAMS is an ongoing state-based surveillance system of self-reported maternal behaviors, attitudes, and experiences before, during, and shortly after pregnancy [5]. It is administered by the Division of Reproductive Health (DRH) at CDC in collaboration with state health departments. The PRAMS protocol is approved by the Institutional Review Board of the CDC and by each participating PRAMS site. From each participating site, a stratified, random sample of women with a recent live birth is selected monthly from birth certificate files. Women are surveyed 2–6 months postpartum (average = 4 months) using a standardized protocol and questionnaire. In the current study, we examined 2016–2018 PRAMS data (n = 65,803) from 23 states (a total of 1,471 counties) that agreed to provide their data for this research. The dataset included a geographic identifier of county (the number of respondents in 1,405 counties with non-zero respondents ranged from 1 to 2,335 with a median of 10, and 66 counties did not contain any respondents) for each respondent and individual level demographic and health variables. Annually, PRAMS data for each site are weighted for sampling design, nonresponse, and noncoverage to produce data representative of the site’s birth population for the year.
Two health-related behaviors, infant sleeping position and maternal breastfeeding duration, were assessed using 2 questions from PRAMS. Respondents were asked, “In which one position do you most often lay your baby down to sleep now?” and responses were classified into 2 groups, supine sleep position (on back) and non-supine (on side or on stomach). Respondents were also asked several questions related to maternal breastfeeding. If they were currently (at the time of survey completion) breastfeeding (or feeding pumped milk) or, for those not breastfeeding at the time of survey completion, if they ever breastfed (or pumped breast milk) for ≥ 8 weeks or ≥ 2 months, then they were coded as any breastfeeding at 8 weeks; otherwise, they were coded as not. Missing values or responses with “do not know” were excluded for each indicator (n = 2,680 for infant sleeping position and n = 2,637 for maternal breastfeeding duration). We calculated the state-level weighted direct estimates for both outcomes using SUDAAN (Research Triangle Institute, North Carolina).
We obtained county-level direct estimates from PhillyPRAMS 2018, which is not a part of PRAMS but an independent survey sponsored by the Philadelphia Department of Public Health (https://www.phila.gov/departments/department-of-public-health). It provides estimates for Philadelphia County (which includes the city of Philadelphia) for indicators similar to those in PRAMS with a sample size of 1,489. The data were acquired by DRH through a data use agreement. Given identical questions on breastfeeding and infant sleep position on the 2 surveys, we defined infant non-supine sleeping position and breastfeeding for 8 weeks for PhillyPRAMS in the same way as for PRAMS data and calculated their weighted direct estimates for Philadelphia County using SUDAAN (Research Triangle Institute, North Carolina).
Model construction
We constructed a multilevel logistic regression model for each of the binary outcomes, Y, respectively.
$$P(Y_{ijk} = 1) = \log {\text{it}}^{ - 1} \left( {X_{ijk} \beta + {\text{re}}_{j \left( i \right)} + {\text{re}}_{ i} } \right)$$
(1)
where
Yijk: Survey response of infant non-supine sleeping position (yes or no) or breastfeeding at 8 weeks (yes or no) from respondent k in county j and state i
\({\mathrm{X}}_{\mathrm{ijk}}\): A matrix of respondent k’s demographic and socioeconomic variables.
\(\beta\): A fixed but unknown parameter vector for \({X}_{\mathrm{ijk}}\)
\({\mathrm{re}}_{j (i)}\): Random effect of county j nested in state i.
\({\mathrm{re}}_{i}\): Random effect of state i.
\({X}_{ijk}\) was a set of respondent k’s covariates. It initially included variables that were associated with Yijk based on literature review, such as maternal age, maternal race, maternal marital status, maternal education level, paternal education level, and infant sex and that were evaluated based on Pickering et al.’s empirical approach [16, 17]. Only those variables that showed significant associations (the square of the parameter estimate divided by the square of its standard error greater than 4) entered the model. The final variables included in the model were maternal education (less than high school, high school, some college, college and above), maternal Hispanic ethnicity (yes or no), and maternal race (white, black, Asian, and other; other included American Indian, Hawaiian, non-white other, Alaska Natives, or mixed race) for both indicators. As the county-level random effect, \({\mathrm{re}}_{j(i)}\), was not significant (p > 0.05) in the model for infant non-supine sleeping position, it was excluded from the final model for this outcome. If the between-state variance (variance in null model minus variance in final model [model includes random effects only] and then divided by variance in the null model, %) was greater than 40%, then we moved on to the next step of prediction and produced the estimates. We performed modeling using the SAS 9.4 GLIMMIX procedure (SAS Institute, Cary, North Carolina). The residual subject-specific pseudo-likelihood method was used to produce the parameter estimates \(\widehat{\beta }\) (with variance \({\widehat{\upsigma }}_{\widehat{\beta }}\)) and empirical best linear unbiased predictors, \({\widehat{\mathrm{re}}}_{j(i)}\)(with variance \({\widehat{\upsigma }}_{{\widehat{\mathrm{re}}}_{j(i)}}\)) and \({\widehat{\mathrm{re}}}_{i} (\mathrm{with variance }{\widehat{\upsigma }}_{{\widehat{\mathrm{re}}}_{i}})\). The covariance structure was specified as variance components. For the 66 counties in the 23 states included in this study without any respondents in PRAMS, we also generated their \({\widehat{\mathrm{re}}}_{j(i)}\) by averaging their neighboring \({\widehat{\mathrm{re}}}_{j(i)}\) so that we could provide estimated prevalence for all the counties.
Model parameters were then applied to live birth counts from National Center for Health Statistics birth certificate files, 2016–2018. To post-stratify, we categorized the live birth counts by maternal education level, maternal Hispanic status, and maternal race for each county; thus, each county had a total of 32 (4 × 2 × 4) categories. The predicted probability (\({\widehat{p}}_{\mathrm{ijm}}\)) of breastfeeding at 8 weeks for the mth category in county j, state i was calculated based on the following formula:
$$\hat{p}_{ijm} = \exp \left( {X_{ijm} \hat{\beta } + \widehat{{{\text{re}}}}_{j\left( i \right)} + \widehat{{{\text{re}}}}_{i} } \right)/\left(1 + \exp \left( {X_{ijm} \hat{\beta } + \widehat{{{\text{re}}}}_{j\left( i \right)} + \widehat{{{\text{re}}}}_{i} } \right)\right)$$
(2)
where X is a matrix of demographic variables, and \({X}_{\mathrm{ijm}}\) is the row of category m, and \({X}_{ijm}\widehat{\beta }={\widehat{\beta }}_{\mathrm{maternal\,education}}+{\widehat{\beta }}_{\mathrm{maternal}\,\mathrm{hispanic}}+{\widehat{\beta }}_{\mathrm{maternal}\,\mathrm{race}}\). The formula for the probability of infant non-supine sleeping position was similar to (2) but omitting \({\widehat{\mathrm{re}}}_{j(i)}\). With \({\widehat{p}}_{ijm}\), we could calculate the estimated prevalence of the indicator through post-stratification for state i and for county j as below:
$$\begin{gathered} \hat{P}_{i} = \sum \left( {\hat{p}_{ijm} *N_{ijm} } \right)/\sum N_{ijm} \hfill \\ \hat{P}_{j} = \sum \left( {\hat{p}_{ijm} *N_{jm} } \right)/\sum N_{jm} \hfill \\ \end{gathered}$$
(3)
where \(\sum {N}_{ijm}\) is the live birth counts of state i, and \(\sum {N}_{jm}\) is the live birth counts of county j. To take into account the uncertainty arising from the models, we adopted Monte Carlo simulation, a tool to generate sample statistics by using point estimates of model parameters and their asymptotic covariance matrix of these estimates [18] to simulate the distribution for the estimate. Thus formula (2) became the following:
$$\hat{p}_{ijm} = {\raise0.7ex\hbox{${\exp \left( {X_{ijm} \hat{\beta }^{*} + \widehat{{{\text{re}}}}_{j\left( i \right)}^{*} + \widehat{{{\text{re}}}}_{i}^{*} )} \right)}$} \!\mathord{\left/ {\vphantom {{\exp \left( {X_{ijm} \hat{\beta }^{*} + \widehat{{{\text{re}}}}_{j\left( i \right)}^{*} + \widehat{{{\text{re}}}}_{i}^{*} )} \right)} {\left( {1 + \exp \left( {X_{ijm} \hat{\beta }^{*} + \widehat{{{\text{re}}}}_{j\left( i \right)}^{*} + \widehat{{{\text{re}}}}_{i}^{*} } \right)} \right)}}}\right.\kern-\nulldelimiterspace} \!\lower0.7ex\hbox{${\left( {1 + \exp \left( {X_{ijm} \hat{\beta }^{*} + \widehat{{{\text{re}}}}_{j\left( i \right)}^{*} + \widehat{{{\text{re}}}}_{i}^{*} } \right)} \right)}$}}$$
(4)
where \({\widehat{\beta }}^{*}\) is a normal variate with mean \(\widehat{\beta }\) and variance \({\widehat{\upsigma }}_{\widehat{\beta }}\), \({\widehat{\mathrm{re}}}_{j(i)}^{*}\) is a normal variate with mean \({\widehat{\mathrm{re}}}_{j(i)}\) and variance \({\widehat{\upsigma }}_{{\widehat{\mathrm{re}}}_{j(i)}}\), and \({\widehat{\mathrm{re}}}_{i}^{*}\) is a normal variate with mean \({\widehat{\mathrm{re}}}_{i}\) and variance \({\widehat{\upsigma }}_{{\widehat{\mathrm{re}}}_{i}}\). Model (4) was repeated 1000 times, and then we applied each \({\widehat{p}}_{ijm}\) to formula (3), which generated 1,000 \({\widehat{P}}_{j}\). The mean and 95% confidence interval (CI, a range between the 2.5th and 97.5th values) were determined for breastfeeding at 8 weeks and similarly for infant non-supine sleeping position but without \({\widehat{\mathrm{re}}}_{j(i)}^{*}\) in formula (4). In a sensitivity analysis, we fit the same model but incorporated the sampling weights from the PRAMS data. We checked the performance of the models by comparing the model estimates with weighted estimates observed from the PRAMS survey file (state-level only) and PhillyPRAMS survey (county-level). All the above analysis was conducted in SAS 9.4 (SAS Institute, Cary, North Carolina).