Evaluation of four gamma-based methods for calculating confidence intervals for age-adjusted mortality rates when data are sparse

Abstract

Background

Equal-tailed confidence intervals that maintain nominal coverage (0.95 or greater probability that a 95% confidence interval covers the true value) are useful in interval-based statistical reliability standards, because they remain conservative. For age-adjusted death rates, while the Fay–Feuer gamma method remains the gold standard, modifications have been proposed to streamline implementation and/or obtain more efficient intervals (shorter intervals that retain nominal coverage).

Methods

This paper evaluates three such modifications for use in interval-based statistical reliability standards, the Anderson–Rosenberg, Tiwari, and Fay–Kim intervals, when data are sparse and sample size-based standards alone are overly coarse. Initial simulations were anchored around small populations (P = 2400 or 1200), the median crude all-cause US mortality rate in 2010–2019 (833.8 per 100,000), and the corresponding age-specific probabilities of death. To allow for greater variation in the age-adjustment weights and age-specific probabilities, a second set of simulations draws those at random, while holding the mean number of deaths at 20 or 10. Finally, county-level mortality data by race/ethnicity from four causes are selected to capture even greater variation: all causes, external causes, congenital malformations, and Alzheimer disease.

Results

The three modifications had comparable performance when the number of deaths was large relative to the denominator and the age distribution was as in the standard population. However, for sparse county-level data by race/ethnicity for rarer causes of death, and for which the age distribution differed sharply from the standard population, coverage probability in all but the Fay–Feuer method sometimes fell below 0.95. More efficient intervals than the Fay–Feuer interval were identified under specific circumstances. When the coefficient of variation of the age-adjustment weights was below 0.5, the Anderson–Rosenberg and Tiwari intervals appeared to be more efficient, whereas when it was above 0.5, the Fay–Kim interval appeared to be more efficient.

Conclusions

As national and international agencies reassess prevailing data presentation standards to release age-adjusted estimates for smaller areas or population subgroups than previously presented, the Fay–Feuer interval can be used to develop interval-based statistical reliability standards with appropriate thresholds that are generally applicable. For data that meet certain statistical conditions, more efficient intervals could be considered.

Background

The number of deaths reported for any given age group and time period can be assumed to follow a Poisson distribution, which leads to exact confidence intervals (CIs) for age-specific mortality rates [1, 2]. Further, because the sum of independent Poisson random variables is Poisson-distributed, the crude mortality rate also has an exact CI. However, no exact CI is known for age-adjusted mortality rates (i.e., directly standardized rates), because those are based on a weighted sum of Poisson random variables [3].

Various methods have been proposed to calculate approximate CIs for directly standardized rates, and recent simulation studies have continued to compare those methods based on metrics such as coverage probability and expected width; see [4,5,6] for three such simulation studies. To date, only the gamma-based method of Fay and Feuer [7] has been shown empirically to guarantee nominal coverage (e.g., 0.95 or higher probability that a 95% CI covers the true rate) in all simulation and real-world settings considered, though it often results in overly wide CIs. Tiwari et al. [8] developed a modification of the Fay–Feuer method to address the need for more efficient intervals (i.e., shorter intervals that retain nominal coverage) to accompany the estimates of age-adjusted rates that are published by the National Cancer Institute (NCI) at the US National Institutes of Health [9]. However, in some cases, CIs based on the Tiwari method can fail to retain nominal coverage [4]. Fay and Kim [10] proposed a mid-p modification to the Fay–Feuer CI, which does not guarantee nominal coverage but achieves it in many situations while remaining narrower than the Fay–Feuer or Tiwari CIs.

The Division of Vital Statistics at the US National Center for Health Statistics (NCHS), Centers for Disease Control and Prevention (CDC), had developed a gamma-based approximation to the Fay–Feuer method for the age-adjusted mortality rates that it published; see technical notes in Anderson and Rosenberg [11] as well as the methods section, below, for a description. Whereas the US Cancer Statistics (which include cancer registry data from NCI’s Surveillance, Epidemiology, and End Results Program as well as from CDC’s National Program of Cancer Registries) currently use the Fay–Feuer and Tiwari CIs [12], NCHS publications (e.g., National Vital Statistics Reports) and CDC WONDER use the Anderson–Rosenberg method when the number of events is less than 100; for 100 events or more, the normal CI is used [11, 13]. By design, the Anderson–Rosenberg method was simpler than the Fay–Feuer method to implement at NCHS as well as in the 57 state and local vital registration jurisdictions [14] because it could use pre-tabulated standard values for the upper and lower CI limits and allowed the user to more easily replicate calculations from published data.

With the wider availability of computing resources, the simplicity of the Anderson–Rosenberg method can no longer be the standalone rationale for its continued adoption in federal, state, or local agencies. Additionally, over the past 9 years, NCHS has been in the process of critically evaluating the hitherto prevailing statistical standards for the presentation of estimates in NCHS publications with an eye toward releasing statistically reliable estimates for sparse data (e.g., smaller geographical areas or population subgroups) that would have previously been suppressed due to sample size alone or other statistical considerations. Current statistical reliability standards for proportions at NCHS include sample-size based requirements (minimal sample or effective sample size) and interval-based criteria (thresholds for maximal length and relative width of “exact” confidence intervals) [15]. Similar criteria are under discussion for rates [16].

As of the writing of this manuscript, the prevailing NCHS standard for vital rates was sample size-based: estimates would be suppressed or flagged as statistically unreliable if they were based on fewer than 20 events [17]. The interval-based thresholds discussed in [16] had not been adopted. For age-adjusted rates that are based on 20 or more events, and when the underlying at-risk population is large, the aforementioned gamma-based methods result in comparable CIs, all with at least nominal coverage. As will be seen below, however, the Anderson–Rosenberg CIs tend to be narrower (i.e., more efficient) than those from the original Fay–Feuer method and the Tiwari modification, and can sometimes also be narrower than the Fay–Kim mid-p CIs. Lowering the sample size threshold for presentation of rates from 20 to just 10 events would be consistent with the minimum threshold required for disclosure protection of sub-national vitals data at NCHS [13] and elsewhere [18], and with recent findings in [6] about the sufficient stability of estimates that are based on 10 events or more. If the threshold were lowered in this way, additional data presentation criteria could be required. If those criteria were interval-based, then continued use of the Anderson–Rosenberg method would require a formal evaluation against the other three gamma-based methods, specifically in terms of coverage probability and expected width, because, like the Tiwari and the Fay–Kim methods, it may also result in CIs that fail to maintain nominal coverage.

This paper conducts such a comparative evaluation, which, to our knowledge, had not previously been conducted. Our aim is to better understand the conditions under which, when data are sparse, the coverage probability of those a priori conservative CIs falls below the desired level (for example, 0.95) or the CIs become too wide to be useful for assessing precision. Our ultimate goal is to inform a CI-based statistical reliability threshold to use in conjunction with a sample size-based threshold of 10, say, as the basis for the decision to suppress or present official estimates. Many other CI methods appear in the literature, and we do not aim to study them all here. We focus instead on the relative performance of the four aforementioned gamma-based methods because they are most relevant when a conservative approach to the assessment of statistical reliability of age-adjusted rates is desired.

Methods

With n age groups, let Di denote the number of deaths for group i. The Di are assumed to be independent Poisson random variables, and the age-specific rates Ri are defined as the ratios Di/Pi, with means \({\mathbb{E}}\)(Ri) = λi and variances \({\mathbb{V}}\)(Ri) = λi/pi.

Let πi denote the size of group i in the reference population; see "Technical Appendix". Let the wi denote the relative proportions for group i in the reference population: wi = πi/∑πj. The age-adjusted death rate R′ is defined as

$$R^{\prime} = \sum w_{i} R_{i} = \sum \left( {w_{i} /P_{i} } \right)D_{i}$$

Given the parameters λi and denominators Pi = pi, the age-adjusted rate R′ has mean \({\mathbb{E}}\)(R′) = λ′ = ∑wi λi and variance \({\mathbb{V}}\)(R′) = ∑wi² λi/pi.
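The definitions above can be illustrated with a minimal Python sketch. The function name `age_adjusted_rate` and the example numbers are hypothetical. The variance returned is the plug-in estimate ∑(wi/pi)² xi obtained by replacing each λi with the observed rate xi/pi, which is an assumption of this sketch rather than a formula stated in this paragraph.

```python
def age_adjusted_rate(deaths, pops, std_pop):
    """Directly standardized rate and a plug-in variance estimate.

    deaths  : observed deaths x_i per age group
    pops    : population denominators p_i per age group
    std_pop : reference population sizes pi_i (rescaled to weights w_i)
    """
    total = sum(std_pop)
    weights = [s / total for s in std_pop]            # w_i = pi_i / sum_j pi_j
    u = [w / p for w, p in zip(weights, pops)]        # u_i = w_i / p_i
    rate = sum(ui * x for ui, x in zip(u, deaths))    # sum (w_i / p_i) x_i
    # Plug-in variance: lambda_i replaced by x_i / p_i (assumption of this sketch).
    variance = sum(ui * ui * x for ui, x in zip(u, deaths))
    return rate, variance


if __name__ == "__main__":
    # Hypothetical three-group example, not data from the paper.
    rate, variance = age_adjusted_rate([4, 6, 10], [900, 1000, 500], [30, 50, 20])
    print(f"rate per 100,000: {rate * 100_000:.1f}")
    print(f"standard error per 100,000: {variance ** 0.5 * 100_000:.1f}")
```

The two returned quantities play the roles of y and v in the gamma constructions described in the following subsections.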

Fay–Feuer interval

As explained in "Technical Appendix", Fay and Feuer [7] conjecture that tail probabilities for the age-adjusted rate R′ can be approximated by those of a gamma-distributed random variable Z with \({\mathbb{E}}\)(Z) = y and \({\mathbb{V}}\)(Z) = v, i.e., with shape parameter α = y²/v and scale parameter β = v/y, where y = ∑(wi/pi) xi and v = ∑(wi/pi)² xi, and xi is the observed number of deaths in group i:

$${\text{Pr}} (R^{\prime} \ge y|\lambda^{\prime}) \approx \Pr (Z \le \lambda^{\prime}|y,v)$$

As a result, the lower limit L(y) of an equal-tailed 100(1 − a) percent CI for the parameter λ′ can be resolved approximately from the lower tail probability of a gamma distribution with parameters α = y²/v and β = v/y, with the convention that L(0) = 0.

For the upper bound, the observed number of deaths xj within group j is incremented by 1, resulting in the addition of the quantity wj/pj to the age-adjusted rate y = ∑(wi/pi) xi. Because such a unit increment could be realized in any of the n groups,

$$\Pr [R^{\prime} > y \mid \lambda^{\prime} = U(y)] \ge \Pr [R^{\prime} \ge y + \kappa_{0} \mid \lambda^{\prime} = U(y)]$$

where κ0 = max{wj/pj}. Thus, an upper CI limit U(y) can be resolved from the upper tail probability of a gamma distribution with shape parameter α = y′²/v′ and scale parameter β = v′/y′, where y′ = y + κ0 and v′ = v + κ0².

Fay and Feuer [7] conjecture that the approximate gamma CI thus constructed remains conservative. Although this conjecture remains unproven, findings from the many simulation studies to date continue to support it, e.g., [4,5,6].

Tiwari modification

Tiwari et al. [8] developed a modification to the Fay–Feuer method described above by distributing an average increment 1/n uniformly across the n age groups instead of a unit increment in a single age group. Thus, with κ1 = n⁻¹ ∑(wi/pi) and κ2 = n⁻¹ ∑(wi/pi)², the gamma random variable used for the upper limit now has mean y′ = y + κ1 and variance v′ = v + κ2. The Tiwari modification reduces the CI width relative to the Fay–Feuer method; see "Technical Appendix". However, the resulting CI sometimes fails to retain the nominal coverage level; see [4].

Fay–Kim modification

Fay and Kim [10] more recently developed a mid-p version of the Fay–Feuer CI, as detailed in "Technical Appendix". Drawing B = b from a Bernoulli distribution with Pr(B = 1) = 1/2, the mid-p version uses the following gamma distribution:

$${\text{gamma}}_{{\text{mid-p}}} = b \times {\text{gamma}}\left( {y^{2} /v,\, v/y} \right) + \left( {1 - b} \right) \times {\text{gamma}}\left( {y^{\prime 2} /v^{\prime},\, v^{\prime}/y^{\prime}} \right)$$

where y′ = y + κ0 and v′ = v + κ0² are as in the Fay–Feuer construction. Thus, the lower and upper limits are defined as the (a/2)th and (1 − a/2)th quantiles of the gamma mid-p distribution. Fay and Kim [10] provide R syntax to find the numerical solutions L(y) and U(y).

Anderson–Rosenberg approximation

Anderson and Rosenberg [11] introduced an approximation to the Fay–Feuer upper CI limit that removes the need to calculate κ0 = max{wj/pj}; see "Technical Appendix". A “standardized” gamma random variable Gadj is defined as Z/(v/y), where the gamma-distributed Z has mean y and variance v. As a result, Gadj has mean and variance both equal to y²/v. Define xadj = y²/v and 1/padj = v/y. If xadj were an integer, then there would exist a Poisson random variable Dadj with mean and variance equal to λ′ padj such that

$$\Pr \left( {D_{{{\text{adj}}}} \ge x_{{{\text{adj}}}} |\lambda^{\prime}} \right) = \Pr \left( {G_{{{\text{adj}}}} \le \lambda^{\prime}p_{{{\text{adj}}}} |x_{{{\text{adj}}}} } \right)$$

Because y²/v will generally not be an integer, xadj is defined as the nearest integer instead (although this is not strictly necessary), and the equality in this last equation is assumed to hold approximately. Either way, the CI limits L(y) and U(y) for λ′ are derived as the (a/2)-quantile of the gamma(xadj, 1/padj) distribution and the (1 − a/2)-quantile of the gamma(xadj + 1, 1/padj) distribution, respectively.
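As a check on the moments quoted for Gadj, and assuming only the definitions in this section (Z gamma-distributed with mean y and variance v, and xadj = y²/v left unrounded), the rescaling by the scale 1/padj = v/y works out as follows:

$${\mathbb{E}}(G_{{\text{adj}}}) = \frac{{\mathbb{E}}(Z)}{v/y} = \frac{y^{2}}{v}, \qquad {\mathbb{V}}(G_{{\text{adj}}}) = \frac{{\mathbb{V}}(Z)}{(v/y)^{2}} = \frac{y^{2}}{v}$$

The next line is a reader's derivation rather than a statement taken from [11]. It gives the mean and variance of the gamma(xadj + 1, 1/padj) distribution used for the upper limit:

$$\left( x_{{\text{adj}}} + 1 \right)\frac{v}{y} = y + \frac{v}{y}, \qquad \left( x_{{\text{adj}}} + 1 \right)\left( \frac{v}{y} \right)^{2} = v + \left( \frac{v}{y} \right)^{2}$$

Under these assumptions, the unrounded Anderson–Rosenberg upper limit has the same form as the Fay–Feuer upper limit with κ0 replaced by v/y. Because v/y = ∑(wi/pi)² xi / ∑(wi/pi) xi is a weighted average of the wi/pi, it cannot exceed κ0 = max{wj/pj}.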

Comparisons among the four gamma-based CI methods

The Anderson–Rosenberg method can be seen not just as an approximation to, but as a modification of the Fay–Feuer CI that, like the Tiwari modification, reduces CI width. Further, a sufficient (but not necessary) condition exists that, when it holds, ensures the Anderson–Rosenberg CI is narrower than the Tiwari CI; see "Technical Appendix". Of course, as it is theoretically possible for both the Anderson–Rosenberg and the Tiwari CIs to be so narrow as to fail to retain nominal coverage, the empirical simulations, below, investigate situations where this may occur. In addition, these two CI methods are compared to the more recent Fay–Kim mid-p modification.
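To make the four constructions concrete, the following is a minimal standard-library Python sketch, not the authors' R code. The helper names (`gamma_cdf`, `quantile`, `gamma_cis`) are hypothetical. The sketch assumes at least one observed death, leaves xadj unrounded, takes κ1 and κ2 as the averages of the wi/pi and of their squares, and reads the Fay–Kim mid-p distribution as an equal-weight mixture of the two Fay–Feuer gamma distributions. The gamma CDF is evaluated with a textbook series and continued-fraction expansion, and quantiles are found by bisection.

```python
import math


def gamma_cdf(x, shape, scale):
    """Gamma CDF, i.e. the regularized lower incomplete gamma at x / scale."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    front = math.exp(shape * math.log(z) - z - math.lgamma(shape))
    if z < shape + 1.0:
        # Power series; converges quickly in this region.
        term = total = 1.0 / shape
        k = shape
        for _ in range(10_000):
            k += 1.0
            term *= z / k
            total += term
            if term < total * 1e-15:
                break
        return min(1.0, front * total)
    # Continued fraction for the upper tail (modified Lentz algorithm).
    tiny = 1e-300
    b = z + 1.0 - shape
    c, d = 1.0 / tiny, 1.0 / b
    h = d
    for i in range(1, 10_000):
        an = -i * (i - shape)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        h *= d * c
        if abs(d * c - 1.0) < 1e-15:
            break
    return max(0.0, 1.0 - front * h)


def quantile(cdf, prob, start):
    """Bisection for the point where a continuous CDF on x > 0 reaches prob."""
    lo, hi = 0.0, start
    while cdf(hi) < prob:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if cdf(mid) < prob else (lo, mid)
    return 0.5 * (lo + hi)


def gamma_cis(deaths, pops, weights, level=0.95):
    """Equal-tailed limits for the age-adjusted rate under four gamma methods."""
    a = 1.0 - level
    u = [w / p for w, p in zip(weights, pops)]            # u_i = w_i / p_i
    n = len(u)
    y = sum(ui * x for ui, x in zip(u, deaths))
    v = sum(ui * ui * x for ui, x in zip(u, deaths))
    if y <= 0:
        raise ValueError("this sketch assumes at least one death")
    k0 = max(u)

    def cdf(mean, var):
        # Gamma with given mean and variance: shape mean^2/var, scale var/mean.
        return lambda t: gamma_cdf(t, mean * mean / var, var / mean)

    base = cdf(y, v)
    ff_upper = cdf(y + k0, v + k0 * k0)
    tiwari_upper = cdf(y + sum(u) / n, v + sum(ui * ui for ui in u) / n)

    def ar_upper(t):     # gamma(x_adj + 1, 1/p_adj), with x_adj left unrounded
        return gamma_cdf(t, y * y / v + 1.0, v / y)

    def mid_p(t):        # equal-weight mixture of the two Fay-Feuer gammas
        return 0.5 * base(t) + 0.5 * ff_upper(t)

    lower = quantile(base, a / 2, y)
    return {
        "fay_feuer": (lower, quantile(ff_upper, 1 - a / 2, y)),
        "tiwari": (lower, quantile(tiwari_upper, 1 - a / 2, y)),
        "anderson_rosenberg": (lower, quantile(ar_upper, 1 - a / 2, y)),
        "fay_kim": (quantile(mid_p, a / 2, y), quantile(mid_p, 1 - a / 2, y)),
    }
```

With a single age group, the Fay–Feuer, Tiwari, and unrounded Anderson–Rosenberg limits in this sketch all reduce to the exact Poisson interval for the crude rate, which offers a convenient sanity check before applying it to several age groups.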

Several simulation scenarios were considered, each consisting of 500 simulations with 10,000 replicates. For each replication, the 95 percent CI limits were calculated according to the Fay–Feuer, Tiwari, Fay–Kim, and Anderson–Rosenberg methods. For each simulation, the coverage probability and expected CI width were tracked and plotted against the coefficient of variation (CV) of the weights ui = wi/pi, as variability of the latter is known to contribute to under-coverage [7]. To account for simulation error, nominal 95% coverage was considered to have been achieved if the simulated coverage probability was ≥ 0.9449, which is the one-sided 99% confidence limit for a binomial with size 10,000 and success probability 0.95.
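The two diagnostics in this paragraph can be sketched in a few lines of Python. The function names are hypothetical, and two choices are assumptions of the sketch rather than details given in the text: the one-sided binomial limit is computed with a normal approximation, and the CV uses the sample standard deviation.

```python
import math
from statistics import NormalDist, mean, stdev


def coverage_threshold(nominal=0.95, replicates=10_000, level=0.99):
    """One-sided lower confidence limit for a simulated coverage probability.

    Normal approximation to the binomial (an assumption of this sketch).
    """
    z = NormalDist().inv_cdf(level)
    return nominal - z * math.sqrt(nominal * (1.0 - nominal) / replicates)


def weight_cv(std_weights, pops):
    """Coefficient of variation of the weights u_i = w_i / p_i.

    Uses the sample standard deviation (also an assumption of this sketch).
    """
    u = [w / p for w, p in zip(std_weights, pops)]
    return stdev(u) / mean(u)


if __name__ == "__main__":
    print(round(coverage_threshold(), 4))
    print(round(weight_cv([0.2, 0.3, 0.5], [800, 1000, 600]), 3))
```

Under the normal approximation, the threshold agrees with the quoted value of 0.9449 to four decimal places.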

NCHS conventionally rounds the age-specific mortality rates, expressed as rates per 100,000 population, to one decimal place prior to calculating the age-adjusted rate for dissemination. In the simulations, unrounded values, including for xadj and padj, were retained for comparability with the other gamma methods.

All simulations and data analyses were conducted in R version 4.1.2 [19].

Scenario 1

In the first set of simulations, counts were anchored to the median annual crude all-cause mortality rate in the USA from 2010 to 2019, estimated at 833.8 per 100,000 population, and the corresponding median annual probabilities of death in each age group, namely 0.009, 0.001, 0.002, 0.011, 0.018, 0.028, 0.066, 0.132, 0.181, 0.239, and 0.313 for < 1 year, 1–4, 5–14, 15–24, 25–34, 35–44, 45–54, 55–64, 65–74, 75–84, and 85 years and over, respectively. An overall population size of P = 2400 was selected to target a small overall mean number of events \({\mathbb{E}}\)(D) = 20. The total number of deaths D was drawn from a Poisson distribution with mean \({\mathbb{E}}\)(D). The age-specific event counts Di were generated according to a multinomial distribution with ∑Di = D and cell probabilities drawn from a Dirichlet distribution with concentration parameters equal to 833.8 times the above probabilities for each group. Finally, group sizes were generated according to a multinomial with ∑Pi = P and cell probabilities anchored at the median annual US values for the period 2010–2019, namely (0.012, 0.050, 0.129, 0.137, 0.137, 0.127, 0.135, 0.126, 0.084, 0.043, and 0.019) for the 11 age groups listed above.

Another simulation was conducted using the same scenario 1, but with a smaller target mean \({\mathbb{E}}\)(D) = 10. Here, because counts under 10 may be suppressed for disclosure protection (e.g., state- or county-level estimates in NCHS vital statistics releases), the statistical properties of CIs that accompany presented (non-suppressed) estimates will be impacted. Thus, a truncated Poisson distribution was used in the simulation to maintain the overall number of deaths ∑Di = D ≥ 10, with the resulting true values of the crude, age-specific, and age-adjusted rates having been recalculated accordingly.
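The data-generating steps for scenario 1 can be sketched as follows, using only the Python standard library. This is an illustration of one replicate under stated assumptions, not the authors' R code: the function names are hypothetical, the truncated Poisson is obtained by rejection sampling, the Dirichlet draw is built from normalized gamma variates, and the recalculation of the true rates under truncation is not shown.

```python
import math
import random

# Age-specific values quoted in the text for the 11 age groups.
DEATH_SHARES = [0.009, 0.001, 0.002, 0.011, 0.018, 0.028,
                0.066, 0.132, 0.181, 0.239, 0.313]
POP_SHARES = [0.012, 0.050, 0.129, 0.137, 0.137, 0.127,
              0.135, 0.126, 0.084, 0.043, 0.019]
CRUDE_RATE = 833.8 / 100_000


def poisson(rng, mean):
    """Knuth's multiplication method; adequate for small means such as 10 or 20."""
    limit, k, prod = math.exp(-mean), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def multinomial(rng, total, probs):
    """Counts per cell from `total` independent categorical draws."""
    counts = [0] * len(probs)
    for idx in rng.choices(range(len(probs)), weights=probs, k=total):
        counts[idx] += 1
    return counts


def simulate_once(rng, population=2400, min_deaths=0):
    """One replicate: age-specific deaths D_i and population sizes p_i."""
    mean_deaths = population * CRUDE_RATE          # about 20 when P = 2400
    total = poisson(rng, mean_deaths)
    while total < min_deaths:                      # rejection = truncated Poisson
        total = poisson(rng, mean_deaths)
    # Dirichlet cell probabilities via normalized gamma variates.
    draws = [rng.gammavariate(833.8 * s, 1.0) for s in DEATH_SHARES]
    norm = sum(draws)
    deaths = multinomial(rng, total, [g / norm for g in draws])
    pops = multinomial(rng, population, POP_SHARES)
    return deaths, pops


if __name__ == "__main__":
    rng = random.Random(2022)                      # arbitrary seed
    deaths, pops = simulate_once(rng, population=1200, min_deaths=10)
    print(sum(deaths), deaths)
    print(sum(pops), pops)
```

Setting `min_deaths=10` with `population=1200` mimics the sparser setting described above, where estimates based on fewer than 10 deaths would not be presented.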

Because the year 2000 US standard population weights wi were held constant and the age-specific population sizes pi were generated in proportion to the overall US national age distribution, the CV for the weights ui in scenario 1 remained in a relatively narrow range and was typically no larger than 0.30. To evaluate the performance of the four gamma CIs in situations where the weights ui varied more widely, the settings in Fay and Feuer [7] were implemented, as described next.

Scenario 2

The second set of simulations mimicked the settings in [7] and [8], with the weights ui = wi/pi drawn at random from the uniform distribution on the unit interval. The total number of deaths D in the population was generated from a Poisson distribution with mean \({\mathbb{E}}\)(D) = 20. The age-specific probabilities of death were drawn independently from the uniform distribution on the unit interval, and the counts Di were drawn jointly from a multinomial distribution with ∑Di = D. Again, to study the effect of a smaller overall mean number of events and assess the impact of disclosure protection on the statistical properties of CIs for estimates that are not suppressed, we also experimented with \({\mathbb{E}}\)(D) = 10 using a truncated Poisson distribution to maintain ∑Di = D ≥ 10.

Scenario 3

Finally, the gamma CI methods were evaluated in county-level mortality data from four causes of death, selected to capture varying age distributions: all causes; external causes of morbidity and mortality (ICD-10 codes: V01–Y89); congenital malformations, deformations, and chromosomal anomalies (Q00–Q99); and Alzheimer disease and other degenerative diseases of the nervous system, not elsewhere classified (G30–G31).

County-level data were queried using CDC WONDER as 20-year aggregate counts over the 1999–2019 period for 3147 US counties (boundary changes notwithstanding). Data were tabulated by age group (< 1 year, 1–4, 5–14, 15–24, 25–34, 35–44, 45–54, 55–64, 65–74, 75–84, and 85 years and over), race (American Indian or Alaska Native; Asian or Pacific Islander; Black or African American; and White), and Hispanic origin (Hispanic or Latino and not Hispanic or Latino).

Some counties had numerator case or population denominator counts under 10 for selected combinations of age and race and Hispanic origin, which were suppressed in CDC WONDER due to the NCHS confidentiality protection rules. Those missing cell case and/or population counts were imputed for this analysis, holding fixed the marginal counts by age and race and Hispanic origin, to obtain a complete, semi-synthetic dataset to use in simulations.

To investigate the impact of high CV on CI coverage for those sparse county-level data, each county's observed overall numerator count and age-adjusted death rate were taken as the “truth” and 10,000 replicates were generated according to a Poisson distribution for that county with the mean equal to the observed numerator count. The county’s overall population denominator was kept fixed. Age-specific numerator counts were assigned according to a multinomial distribution conditional on the crude total, with assignment probabilities for the various age groups taken proportional to the observed counts for that county. As in scenario 1, age-adjusted rates were calculated relative to the year 2000 US standard population.

The analyses for scenario 3 were restricted to data by race and Hispanic origin instead of sex or other demographic characteristics because disparities in health and mortality outcomes by race and Hispanic origin remain an important public health concern in the US [20], and because county-level estimates by race and Hispanic origin can be based on sparse data (less than 20 deaths) and suppressed or flagged as statistically unreliable in official publications when sample size is the only criterion used to define statistical reliability.

Results

Scenario 1

The top row in Fig. 1 shows the result of the first set of simulations, with the Fay–Feuer, Tiwari, and Anderson–Rosenberg CIs retaining nominal coverage over the limited range of variability of the weights ui = wi/pi, whereas the coverage of the Fay–Kim CIs dipped below the 0.95 threshold in some cases, although those were within the simulation error bound of 0.9449. Additionally, the strength of the Anderson–Rosenberg approach is demonstrated in CIs that were consistently narrower (i.e., more efficient) than the Tiwari and Fay–Feuer CIs. The Fay–Kim method resulted in even narrower CIs for smaller CV values, albeit at the occasional cost of coverage probability falling below 0.9449.

Fig. 1

Average width and coverage probability of selected gamma CIs for age-adjusted mortality rates: simulation scenario 1. Average CI width and coverage probability of the Anderson–Rosenberg, Tiwari, Fay–Kim, and Fay–Feuer gamma CIs is based on 500 simulations with 10,000 replications each for age-adjusted mortality rates R = ∑(wi/Pi) Di and is presented as a function of the coefficient of variation (CV) of the weights ui = wi/pi. Age-adjusted rates are anchored around an overall crude all-cause mortality rate of 833.8 per 100,000 population and the US national age distribution in 2010–2019. A multinomial distribution was used to generate the Di conditional on the total D = ∑Di. Results in the top row are for an overall population size of P = 2400, corresponding to \({\mathbb{E}}\)(D) = 20, whereas those in the bottom row are for P = 1200, corresponding to \({\mathbb{E}}\)(D) = 10. For the latter, a truncated Poisson distribution was used in simulations to ensure the overall numerator count remained ≥ 10

The bottom row in Fig. 1 shows the results of a second set of simulations conducted using the same scenario 1, but with a smaller target mean \({\mathbb{E}}\)(D) = 10 and a truncated Poisson distribution. The results here were similar to the ones in the top row of Fig. 1 and highlight the relative efficiency of the Anderson–Rosenberg CI, even in this sparser setting, compared with the Fay–Feuer and Tiwari methods for all values of CV(ui) shown, and with the Fay–Kim method for larger values of CV(ui).

Scenario 2

The top row in Fig. 2 shows the result of the second set of simulations, which allow for an increased variability in the weights ui = wi/pi, with both the Tiwari and Anderson–Rosenberg methods in close agreement and resulting in narrower intervals than the Fay–Feuer method while retaining 0.95 coverage, except for a handful of instances where the weights ui = wi/pi had CV close to 1.00. The Fay–Kim method performed relatively well when CV ≈ 1.00 compared to the Tiwari and Anderson–Rosenberg methods, increasing coverage probability (albeit with slightly wider CIs).

Fig. 2

Average width and coverage probability of selected gamma CIs for age-adjusted mortality rates: simulation scenario 2. Average CI width and coverage probability of the Anderson–Rosenberg, Tiwari, Fay–Kim, and Fay–Feuer gamma CIs is based on 500 simulations with 10,000 replications each for age-adjusted mortality rates R = ∑(wi/Pi) Di and is presented as a function of the coefficient of variation (CV) of the weights ui = wi/pi. The weights ui and age-specific probabilities of death were drawn at random from the uniform distribution on the unit interval, and a multinomial distribution was used to generate the Di conditional on the total D = ∑Di. In the top row, D was generated from a Poisson distribution with mean 20, whereas a mean of 10 was used in the bottom row. For the latter, a truncated Poisson distribution was used in simulations to ensure D remained ≥ 10

The bottom row in Fig. 2 shows the impact of a smaller target mean \({\mathbb{E}}\)(D) = 10 and a truncated Poisson distribution. The results were again similar to the ones in the top row and showed adequate coverage for all three modifications to the original Fay–Feuer method, although coverage declined as the variability of the weights increased.

Scenario 3

Because high variability of the weights wi/pi is a known contributor to under-coverage [7], as shown in Fig. 2, the distribution of these weights was examined using the county-level data, where the pi are the age- and race- and Hispanic origin-specific population denominators for each county. Boxplots are shown in Fig. 3, with a CV as high as 3.0 for some counties and race and Hispanic origin groups.

Fig. 3

Boxplots of county-level coefficients of variation for the weights ui, by race and Hispanic origin. Boxplots of the coefficient of variation of the weights ui = wi/pi in the age-adjusted mortality rate R = ∑(wi/Pi) Di are presented by race and Hispanic origin, for 3147 US counties, using 1999–2019 aggregate data

Age-adjusted mortality rates for counties where the overall count D was less than 10 would be suppressed in accordance with NCHS confidentiality protection. Thus, comparisons among the four CI methods were most informative in counties with 10 or more deaths, as shown in Table 1. For those, when the CV of the ui = wi/pi was below 0.5, the Anderson–Rosenberg and Tiwari CIs almost always achieved nominal coverage, just like the Fay–Feuer CI, even for counties with 10–19 deaths. On the other hand, the Fay–Kim CI failed to achieve nominal coverage in cases where the other CIs did, notably for counties with 100 or more deaths. When the CV was larger than 0.5, there was a marked under-coverage for the Anderson–Rosenberg CIs, and, to a lesser extent, the Tiwari CI, in counties with 10–19 deaths but also in those with 20–99 deaths; in comparison, the Fay–Kim CI performed better in those cases, almost on par with the Fay–Feuer CI. Under-coverage of the Anderson–Rosenberg CI was more pronounced for rarer causes of death, e.g., ICD-10 codes Q00–Q99, in smaller and more clustered population subgroups than the non-Hispanic white population, such as the Hispanic or Latino or the non-Hispanic American Indian or Alaska Native populations, where nominal coverage was achieved for only about three in four counties with D = 10–99 and CV > 0.5.

Table 1 Coverage of selected gamma CIs for county-level age-adjusted mortality rates, by race and Hispanic origin. Coverage of the Anderson–Rosenberg, Tiwari, Fay–Kim, and Fay–Feuer gamma CIs is based on 10,000 replications for 3147 county-level age-adjusted mortality rates R = ∑(wi/Pi) Di by overall numerator size (D = ∑Di < 10 vs. D = 10–19, D = 20–99, or D ≥ 100), coefficient of variation (CV) of the weights ui = wi/pi (CV > 0.5 vs. CV ≤ 0.5), and race and Hispanic origin for four causes of death, using 1999–2019 aggregate data. Causes of death include: all causes; external causes of morbidity and mortality (ICD-10 codes: V01–Y89); congenital malformations, deformations, and chromosomal anomalies (Q00–Q99); and Alzheimer disease and other degenerative diseases of the nervous system, not elsewhere classified (G30–G31)

Discussion

This paper conducted a comparative evaluation of four gamma-based methods for calculating CIs for age-adjusted mortality rates to inform their possible use in setting CI-based statistical reliability standards. The Anderson–Rosenberg CI is easier to implement, because it can use pre-tabulated standard values for the upper and lower CI limits and allows the user to more easily replicate calculations from published data. In simulations, it also appeared to be more efficient (i.e., shorter, while retaining nominal coverage) than either the Tiwari or Fay–Feuer CI in “large scale” estimates where the numerator count was large relative to the denominator population size and the age distribution followed that of the standard population. The Fay–Kim method could result in even narrower CIs in those “large scale” scenarios, but sometimes at the expense of the coverage probability falling below 0.95. However, for “small scale” estimates like county-level data by race and Hispanic origin for less common causes of death (scenario 3), for which the age distribution differed sharply from that of the standard population, nominal coverage of both the Anderson–Rosenberg and Tiwari CIs was compromised when the adjustment weights ui = wi/pi were highly variable. The Fay–Kim method performed better in those situations, on par with the Fay–Feuer method. Nonetheless, where the CV of the weights ui = wi/pi can be assessed in advance and is low (e.g., below 0.5), the user may still decide to use either the Anderson–Rosenberg or the Tiwari CI instead of the Fay–Feuer CI if a shorter yet conservative interval is desired. If the user is willing to accept sub-nominal coverage (e.g., coverage probability below 0.95 for 95% CIs) in some instances with low CV (e.g., below 0.5) in exchange for CIs that attain nominal coverage “on average” and are generally shorter, the Fay–Kim mid-p CI can be a good alternative.

Conclusions

The Fay–Feuer CI can be used universally as the basis for formulating a CI-based threshold for statistical reliability of age-adjusted rates, because it maintains the nominal (e.g., 0.95 or higher) coverage probability in a large variety of studied situations. However, alternatives exist that are more efficient and perhaps more desirable under some specific circumstances. When the CV of the age-adjustment weights is below 0.5, the Anderson–Rosenberg and Tiwari CIs appear in simulations to be most efficient, whereas when the CV is above 0.5, the Fay–Kim CI appears to be most efficient among the four gamma-based CI methods. In situations where the CV or the underlying distribution of the age-adjustment weights is unknown, all four gamma-based methods studied in this paper appear to perform reasonably well, but the Fay–Feuer method is recommended. For setting CI-based thresholds for statistical reliability, the properties of the interval can be considered, and thresholds for less efficient (wider) conservative intervals might be set higher than thresholds for more efficient (shorter) conservative intervals. However, such conservative CIs may have limited use in comparisons between two rates (e.g., by looking at whether there is overlap) because, as seen in simulations, they can be overly wide and will have low power to detect differences. Instead, differences in rates should be assessed using statistical significance testing or other suitable methods [21].

Availability of data and materials

The dataset analyzed during the current study is available from CDC WONDER, https://wonder.cdc.gov. The R syntax used is available from the authors on request.

Abbreviations

CI: Confidence interval

NCHS: National Center for Health Statistics

NCI: National Cancer Institute

CDC: Centers for Disease Control and Prevention

ICD-10: International Classification of Diseases, Tenth Revision

References

  1. Brillinger DR. The natural variability of vital rates and associated statistics. Biometrics. 1986;42:693–734.

  2. Garwood F. Fiducial limits for the Poisson distribution. Biometrika. 1936;28:437–42.

  3. Dobson AJ, Kuulasmaa K, Eberle E, Scherer J. Confidence intervals for weighted sums of Poisson parameters. Stat Med. 1991;10:457–62.

  4. Ng HKT, Filardo G, Zheng G. Confidence interval estimating procedures for standardized incidence rates. Comput Stat Data Anal. 2008;52:3501–16.

  5. Swift MB. A simulation study comparing methods for calculating confidence intervals for directly standardized rates. Comput Stat Data Anal. 2010;54:1103–8.

  6. Morris JK, Tan J, Fryers P, Bestwick J. Evaluation of stability of directly standardized rates for sparse data using simulation methods. Popul Health Metr. 2018;16:19.

  7. Fay MP, Feuer EJ. Confidence intervals for directly standardized rates: a method based on the gamma distribution. Stat Med. 1997;16:791–801.

  8. Tiwari RC, Clegg LX, Zou Z. Efficient interval estimation for age-adjusted cancer rates. Stat Methods Med Res. 2006;15:547–69.

  9. National Cancer Institute Surveillance Research Program. SEER*Stat software, version 8.3.9.2. 2021. https://seer.cancer.gov/seerstat. Accessed 31 Jan 2022.

  10. Fay MP, Kim S. Confidence intervals for directly standardized rates using mid-p gamma intervals. Biom J. 2017;59:377–87.

  11. Anderson RN, Rosenberg HM. Age standardization of death rates: implementation of the year 2000 standard. Natl Vital Stat Rep. 1998;47:3.

  12. Centers for Disease Control and Prevention. U.S. Cancer Statistics Data Visualizations Tool. Technical notes 2020; submission diagnosis years 1999–2018. Atlanta, GA: U.S. Dept of Health and Human Services. https://www.cdc.gov/cancer/uscs/pdf/uscs-data-visualizations-tool-technical-notes-508.pdf. Accessed 31 Jan 2022.

  13. Centers for Disease Control and Prevention. Underlying cause of death 1999–2019. CDC WONDER Technical Reference Material. Atlanta, GA: U.S. Dept of Health and Human Services. https://wonder.cdc.gov/wonder/help/ucd.html. Accessed 31 Jan 2022.

  14. National Research Council Committee on National Statistics. The U.S. Vital Statistics System: The Role of State and Local Health Departments. In: Vital Statistics: Summary of a Workshop. Washington, DC: National Academies Press. 2009. https://www.ncbi.nlm.nih.gov/books/NBK219870/. Accessed 31 Jan 2022.

  15. Parker JD, Talih M, Malec DJ, Beresovsky V, Carroll M, Gonzalez JF, Hamilton BE, Ingram DD, Kochanek K, McCarty F, Moriarty C, Shimizu I, Strashny A, Ward BW. National Center for Health Statistics data presentation standards for proportions. Vital Health Stat. 2017;2:175.

  16. National Center for Health Statistics. Data presentation standards for rates. Presented at the meetings of the Board of Scientific Counselors. https://www.cdc.gov/nchs/about/bsc/bsc_meetings.htm. 10 Feb 2022 and 9–10 Jan 2020.

  17. Murphy SL, Xu JQ, Kochanek KD, Arias E, Tejada-Vera B. Deaths: Final data for 2018. Natl Vital Stat Rep. 2020;69:13.

  18. Centers for Medicare and Medicaid Services. CMS cell suppression policy, guidance portal. Washington, DC: U.S. Dept of Health and Human Services. 2020. https://www.hhs.gov/guidance/document/cms-cell-suppression-policy. Accessed 31 Jan 2022.

  19. R Core Team. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. 2021. https://www.R-project.org/. Accessed 31 Jan 2022.

  20. Office of Disease Prevention and Health Promotion. Healthy People 2030 framework. Washington, DC: U.S. Dept of Health and Human Services. 2021. https://health.gov/healthypeople/about/healthy-people-2030-framework. Accessed 31 Jan 2022.

  21. Schenker N, Gentleman JF. On judging the significance of differences by examining the overlap between confidence intervals. Am Stat. 2001;55(3):182–6.

  22. Blaker H. Confidence curves and improved exact confidence intervals for discrete distributions. Can J Stat. 2000;28:783–98.

  23. Casella G, Berger RL. Statistical Inference. Belmont: Wadsworth; 1990.

  24. Curtin LR, Klein RJ. Direct standardization (age-adjusted death rates). Stat Notes 6. Hyattsville, MD: National Center for Health Statistics; 1995.

Acknowledgements

Members of the NCHS workgroup on data presentation standards for rates and counts provided input at various stages of the development of this manuscript. The findings and conclusions in this paper are those of the authors and do not necessarily represent the official position of NCHS or CDC. The first author completed this work under a contract with Strategic Innovative Solutions, LLC, a CDC/NCHS contract holder.

Funding

None to declare.

Author information

Contributions

M.T. conceptualized the study, implemented mathematical derivations and simulations, compiled the figures and table, and led the manuscript writing and revisions. R.A. developed the Anderson–Rosenberg method and helped interpret the study findings. J.P. led the investigation of CI-based data presentation standards for rates at NCHS, which motivated this study, and proposed the use of county-level mortality data with the four causes of death selected. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Jennifer D. Parker.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Technical appendix

The number of deaths D reported for a given area and time period is assumed to be Poisson-distributed, with mean \({\mathbb{E}}\)(D) and variance \({\mathbb{V}}\)(D) satisfying the equality \({\mathbb{E}}\)(D) = λP = \({\mathbb{V}}\)(D), where P denotes the population denominator [1]. The age-specific or crude death rate R, defined as the ratio D/P, is usually multiplied by 100,000 and reported as a rate per 100,000 population.

Poisson-gamma relationship

For a positive integer x ≤ P, it can be shown [2] that there exists a gamma random variable G such that \({\mathbb{E}}\)(G) = x = \({\mathbb{V}}\)(G) and

$$\Pr \left( {D \ge x|\lambda } \right) = \Pr \left( {G \le \lambda P|x} \right)$$
(A1)

Recall that if G is gamma-distributed with shape parameter α > 0 and scale parameter β > 0, then its mean and variance are \({\mathbb{E}}\)(G) = αβ and \({\mathbb{V}}\)(G) = αβ². Conversely, the parameters are given by α = \({\mathbb{E}}\)(G)²/\({\mathbb{V}}\)(G) and β = \({\mathbb{V}}\)(G)/\({\mathbb{E}}\)(G). Thus, with \({\mathbb{E}}\)(G) = x = \({\mathbb{V}}\)(G) in Eq. A1, the corresponding gamma distribution has α = x and β = 1.

For the rate R = D/P, with P = p, y = x/p, and v = x/p², Eq. A1 becomes

$$\Pr (R \ge y|\lambda ) = \Pr (Z \le \lambda |y,v)$$
(A2)

where Z = G/P is gamma-distributed with mean y and variance v.
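The identity in Eq. A1 can be checked numerically. The sketch below is illustrative only: `gamma_cdf` and `poisson_upper_tail` are hypothetical helper names, the gamma distribution function is evaluated with a simple power series rather than a library routine, and the rate and population size are borrowed from the simulation setting (833.8 per 100,000 and P = 2400).

```python
import math

def gamma_cdf(t, shape, scale=1.0):
    """Pr(G <= t) for G ~ gamma(shape, scale), from the power series of the
    regularized lower incomplete gamma function (fine for moderate t / scale)."""
    z = t / scale
    if z <= 0.0:
        return 0.0
    term = total = 1.0 / shape
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (shape + k)
        total += term
    return min(1.0, total * math.exp(shape * math.log(z) - z - math.lgamma(shape)))

def poisson_upper_tail(x, mean):
    """Pr(D >= x) for D ~ Poisson(mean), as one minus the first x probabilities."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k) for k in range(x))
    return 1.0 - below

lam, P = 833.8 / 100_000, 2400   # illustrative rate per person and population size
gaps = []
for x in (1, 5, 20, 40):
    lhs = poisson_upper_tail(x, lam * P)   # Pr(D >= x | lambda)
    rhs = gamma_cdf(lam * P, shape=x)      # Pr(G <= lambda * P | x), G ~ gamma(x, 1)
    gaps.append(abs(lhs - rhs))
    print(f"x={x:2d}  Poisson tail={lhs:.6f}  gamma CDF={rhs:.6f}")
```

The two columns should agree up to floating-point error for every positive integer x.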

Gamma CI for age-specific and crude rates

When D = x is observed, the ratio y = x/p is an estimate of \({\mathbb{E}}\)(R) = λ. An equal-tailed 100(1 − a) percent CI [L(y), U(y)] for the parameter λ, e.g., with a = 0.05, is obtained as a solution to the following two equations:

$$\Pr \left[ {R \ge y|\lambda = L\left( y \right)} \right] = a/2$$
(A3a)
$$\Pr \left[ {R \le y|\lambda = U\left( y \right)} \right] = a/2$$
(A3b)

Eqs. A3a and A3b follow from looking upon L(y) as the largest λ for which Pr(R ≥ y|λ) ≤ a/2 and U(y) as the smallest λ for which Pr(R ≤ y|λ) ≤ a/2; see [22] and theorem 9.2.3.a in [23].

From Eqs. A2 and A3a,

$$a/2 = \Pr [R \ge y|\lambda = L(y)] = \Pr [Z \le L\left( y \right)|y,v]$$

where Z is gamma-distributed with mean y and variance v, i.e., with parameters α = y²/v = x and β = v/y = 1/p. Thus, the lower CI limit L(y) is obtained as the (a/2)-quantile of the gamma(x, 1/p) distribution. For y = 0 = x, L(0) = 0 by convention.

Similarly, from Eqs. A2 and A3b,

$$\begin{aligned}1 - \left( {a/2} \right) &= \Pr [R > y|\lambda = U(y)] \\ &= \Pr [R \ge y + 1/p|\lambda = U(y)] \\ &= \Pr [Z^{\prime} \le U(y)|y^{\prime},v^{\prime}]\end{aligned}$$

where the second equality is due to x being a nonnegative integer, so that D/p > x/p if and only if D/p ≥ (x + 1)/p, and Z′ is a gamma random variable with mean y′ = y + 1/p and variance v′ = v + 1/p². Because y′²/v′ = x + 1 and v′/y′ = 1/p, the upper CI limit U(y) is obtained as the (1 − a/2)-quantile of the gamma(x + 1, 1/p) distribution.
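As an illustration of the two quantile formulas above, the following sketch computes the gamma CI for a crude rate using only the Python standard library. The helper names (`gamma_cdf`, `gamma_ppf`, `crude_rate_ci`) are hypothetical, and the quantile is found by plain bisection; a statistical library's gamma quantile function could be substituted.

```python
import math

def gamma_cdf(t, shape, scale=1.0):
    """Pr(G <= t) for G ~ gamma(shape, scale), via a power series
    (adequate for the small shapes typical of sparse data)."""
    z = t / scale
    if z <= 0.0:
        return 0.0
    term = total = 1.0 / shape
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (shape + k)
        total += term
    return min(1.0, total * math.exp(shape * math.log(z) - z - math.lgamma(shape)))

def gamma_ppf(q, shape, scale=1.0):
    """q-quantile of gamma(shape, scale), by bisection on the unit-scale CDF."""
    lo, hi = 0.0, shape + 10.0
    while gamma_cdf(hi, shape) < q:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape) < q:
            lo = mid
        else:
            hi = mid
    return scale * 0.5 * (lo + hi)

def crude_rate_ci(x, p, a=0.05):
    """Equal-tailed 100(1 - a)% gamma CI for the rate, given x deaths among p people."""
    lower = 0.0 if x == 0 else gamma_ppf(a / 2, x, 1.0 / p)   # L(0) = 0 by convention
    upper = gamma_ppf(1 - a / 2, x + 1, 1.0 / p)
    return lower, upper

# Illustrative input: 20 deaths in a population of 2400, reported per 100,000.
low, high = crude_rate_ci(20, 2400)
print(f"rate = {20 / 2400 * 1e5:.1f}, CI = ({low * 1e5:.1f}, {high * 1e5:.1f})")
```

With x = 0 the lower limit is 0 and the upper limit reduces to −ln(a/2)/p, the (1 − a/2)-quantile of an exponential distribution with scale 1/p.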

Approximate gamma CIs for age-adjusted rates

With n age groups, let Di denote the number of deaths for group i. The Di are assumed to be independent Poisson random variables, and the age-specific rates Ri are defined as the ratios Di/Pi, with means \({\mathbb{E}}\)(Ri) = λi and variances \({\mathbb{V}}\)(Ri) = λi/pi.

Let πi denote the size of group i in the reference population, e.g., the projected year 2000 US population [24]. Let wi denote the relative proportions for group i in the reference population: wi = πi/∑πj. The age-adjusted death rate R′ is defined as

$$R^{\prime} = \sum w_{i} R_{i} = \sum \left( {w_{i} /P_{i} } \right)D_{i}$$

Given the parameters λi and denominators Pi = pi, the age-adjusted rate R′ has mean \({\mathbb{E}}\)(R′) = λ′ = ∑wi λi and variance \({\mathbb{V}}\)(R′) = ∑wi² λi/pi.
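The variance expression follows from the independence of the age-specific counts and the Poisson equality of mean and variance. Spelled out, with each \(D_i\) having mean and variance \(\lambda_i p_i\):

$$\mathbb{V}(R^{\prime}) = \sum w_{i}^{2}\, \mathbb{V}\left( D_{i}/p_{i} \right) = \sum \frac{w_{i}^{2}}{p_{i}^{2}}\, \lambda_{i} p_{i} = \sum \frac{w_{i}^{2} \lambda_{i}}{p_{i}}$$

Replacing each unknown \(\lambda_i p_i\) by the observed count \(x_i\) gives the plug-in estimate \(v = \sum (w_i/p_i)^2 x_i\) that appears in the intervals below.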

Fay–Feuer interval. Fay and Feuer [7] assume that Eq. A2 holds approximately for the age-adjusted rate R′, so that, for y = ∑(wi/pi) xi and v = ∑(wi/pi)² xi,

$$\Pr (R^{\prime} \ge y|\lambda^{\prime}) \approx \Pr (Z \le \lambda^{\prime}|y,v)$$
(A4)

where Z is gamma-distributed with \({\mathbb{E}}\)(Z) = y and \({\mathbb{V}}\)(Z) = v, i.e., with α = y²/v and β = v/y. As for the crude rate R, an equal-tailed 100(1 − a) percent CI for λ′ solves the equations:

$$\Pr \left[ {R^{\prime} \ge y|\lambda^{\prime} = L\left( y \right)} \right] = a/2$$
(A5a)
$$\Pr \left[ {R^{\prime} \le y|\lambda^{\prime} = U\left( y \right)} \right] = a/2$$
(A5b)

From Eqs. A4 and A5a, the lower limit L(y) can be resolved approximately from the lower tail probability of a gamma distribution with parameters α = y²/v and β = v/y, again with the convention that L(0) = 0.

For the upper bound, note that a unit increment in the observed number of deaths xj within group j results in the addition of the quantity wj/pj to the age-adjusted rate y = ∑(wi/pi) xi. Because such a unit increment could be realized in any of the n groups,

$$\Pr [R^{\prime} > y|\lambda^{\prime} = U(y)] \ge \Pr [R^{\prime} \ge y + \kappa_{0} |\lambda^{\prime} = U(y)]$$

where κ0 = max{wj/pj}. From Eq. A4, the right-hand side in this last inequality is approximately equal to Pr[Z′ ≤ U(y)|y′, v′], where Z′ is gamma-distributed with mean y′ = y + κ0 and variance v′ = v + κ0². Thus, an upper CI limit U(y) can be resolved from the upper tail probability of a gamma distribution with shape parameter α = y′²/v′ and scale parameter β = v′/y′. Fay and Feuer [7] conjecture that the approximate gamma CI thus constructed remains conservative. Although this conjecture remains unproven, findings from the many simulation studies to date continue to support it, e.g., [4,5,6].
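The Fay–Feuer construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' R code: the function names are hypothetical, the gamma quantile is obtained by bisection on a series-based distribution function, and the example inputs are made up.

```python
import math

def gamma_cdf(t, shape, scale=1.0):
    """Pr(G <= t) for G ~ gamma(shape, scale), via a power series
    (adequate for the small shapes typical of sparse data)."""
    z = t / scale
    if z <= 0.0:
        return 0.0
    term = total = 1.0 / shape
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (shape + k)
        total += term
    return min(1.0, total * math.exp(shape * math.log(z) - z - math.lgamma(shape)))

def gamma_ppf(q, shape, scale=1.0):
    """q-quantile of gamma(shape, scale), by bisection on the unit-scale CDF."""
    lo, hi = 0.0, shape + 10.0
    while gamma_cdf(hi, shape) < q:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape) < q:
            lo = mid
        else:
            hi = mid
    return scale * 0.5 * (lo + hi)

def fay_feuer_ci(x, p, w, a=0.05):
    """Age-adjusted rate y and its Fay-Feuer gamma CI.
    x: deaths, p: populations, w: standard weights (summing to 1), by age group."""
    u = [wi / pi for wi, pi in zip(w, p)]          # adjustment weights w_i / p_i
    y = sum(ui * xi for ui, xi in zip(u, x))       # age-adjusted rate
    v = sum(ui**2 * xi for ui, xi in zip(u, x))    # plug-in variance estimate
    k0 = max(u)                                    # largest possible unit increment
    lower = 0.0 if y == 0 else gamma_ppf(a / 2, y * y / v, v / y)
    y1, v1 = y + k0, v + k0**2
    upper = gamma_ppf(1 - a / 2, y1 * y1 / v1, v1 / y1)
    return y, lower, upper

# Made-up example: three age groups whose age distribution differs from the standard.
y, low, high = fay_feuer_ci(x=[1, 4, 15], p=[900, 1000, 500], w=[0.4, 0.4, 0.2])
print(f"rate = {y * 1e5:.1f}, CI = ({low * 1e5:.1f}, {high * 1e5:.1f}) per 100,000")
```

When the populations are proportional to the standard weights, the same function returns the exact gamma CI for the crude rate, which gives a convenient sanity check.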

Tiwari modification. Tiwari et al. [8] developed a modification to the Fay–Feuer method described above by distributing an average increment 1/n uniformly across all age groups instead of a unit increment in a single age group:

$$y^{\prime} = \sum \left( {w_{i} /p_{i} } \right)\left( {x_{i} + \frac{1}{n}} \right) = y + \frac{1}{n}\sum w_{i} /p_{i}$$

Thus, with κ1 = n⁻¹ ∑wi/pi and κ2 = n⁻¹ ∑(wi/pi)², the gamma random variable Z′ above now has mean y′ = y + κ1 and variance v′ = v + κ2. The Tiwari modification reduces the CI width relative to the Fay–Feuer method, because

$$\Pr [R^{\prime} \ge y + \kappa_{1} |\lambda^{\prime} = U(y)] \ge \Pr [R^{\prime} \ge y + \kappa_{0} |\lambda^{\prime} = U(y)]$$

However, the resulting CI sometimes fails to retain the nominal coverage level, e.g., [4].

Fay–Kim modification. Fay and Kim [10] more recently developed a mid-p version of the Fay–Feuer CI. A modification of exact CIs from discrete data, mid-p CIs trade off guaranteed nominal coverage over the whole parameter space (which tends to result in overly wide CIs) for proximity to nominal coverage (and narrower CIs) for most parameter values.

For the mid-p interval, a solution to the following equations is sought:

$$\Pr \left[ {R^{\prime} > y|\lambda^{\prime} = L\left( y \right)} \right] + \left( {1/2} \right) \times \Pr \left[ {R^{\prime} = y|\lambda^{\prime} = L\left( y \right)} \right] = a/2$$
(A7a)
$$\Pr \left[ {R^{\prime} < y|\lambda^{\prime} = U\left( y \right)} \right] + \left( {1/2} \right) \times \Pr \left[ {R^{\prime} = y|\lambda^{\prime} = U\left( y \right)} \right] = a/2$$
(A7b)

Drawing B = b from a Bernoulli distribution with Pr(B = 1) = 1/2, Fay and Kim [10] define the mid-p version of the Fay–Feuer CI using the following gamma distribution:

$${\text{gamma}}_{{\text{mid-p}}} = b \times {\text{gamma}}\left( {y^{2} /v,v/y} \right) + \left( {1 - b} \right) \times {\text{gamma}}\left( {y^{{\prime}{2}} /v^{\prime},v^{\prime}/y^{\prime}} \right)$$

where y′ = y + κ0 and v′ = v + κ0² are as in the Fay–Feuer construction above. Thus, the lower and upper limits are defined as the (a/2)th and (1 − a/2)th quantiles of the gamma mid-p distribution. The special case y = 0 is addressed using L(0) = 0 and U(0) defined as the (1 − a)th quantile of the gamma(y′²/v′, v′/y′) distribution, because the gamma(y²/v, v/y) component is degenerate at zero when y = 0. R syntax is provided to solve for L(y) and U(y) numerically [10].
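A minimal sketch of the mid-p construction is given below, treating the gamma mid-p distribution as an equal-weight mixture of the two gamma components and solving for its quantiles by bisection. This is not the R code of [10]; the function names are hypothetical, and for y = 0 the upper limit is taken as the (1 − a)-quantile of the upper gamma component, the only non-degenerate one in that case.

```python
import math

def gamma_cdf(t, shape, scale=1.0):
    """Pr(G <= t) for G ~ gamma(shape, scale), via a power series
    (adequate for the small shapes typical of sparse data)."""
    z = t / scale
    if z <= 0.0:
        return 0.0
    term = total = 1.0 / shape
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (shape + k)
        total += term
    return min(1.0, total * math.exp(shape * math.log(z) - z - math.lgamma(shape)))

def quantile(cdf, q, start):
    """Smallest t with cdf(t) >= q: doubling from a positive start, then bisection."""
    lo, hi = 0.0, start
    while cdf(hi) < q:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cdf(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def fay_kim_midp_ci(x, p, w, a=0.05):
    """Mid-p gamma CI for the age-adjusted rate (deaths x, populations p, weights w)."""
    u = [wi / pi for wi, pi in zip(w, p)]
    y = sum(ui * xi for ui, xi in zip(u, x))
    v = sum(ui**2 * xi for ui, xi in zip(u, x))
    k0 = max(u)
    y1, v1 = y + k0, v + k0**2

    def upper_cdf(t):
        return gamma_cdf(t, y1 * y1 / v1, v1 / y1)

    if y == 0:                       # lower component is a point mass at zero
        return 0.0, quantile(upper_cdf, 1 - a, y1)

    def mixture_cdf(t):              # B = 1 or 0 with probability 1/2 each
        return 0.5 * gamma_cdf(t, y * y / v, v / y) + 0.5 * upper_cdf(t)

    return quantile(mixture_cdf, a / 2, y1), quantile(mixture_cdf, 1 - a / 2, y1)

# Made-up example with populations proportional to the standard weights.
low, high = fay_kim_midp_ci(x=[2, 3, 5], p=[200, 300, 500], w=[0.2, 0.3, 0.5])
print(f"mid-p CI = ({low * 1e5:.1f}, {high * 1e5:.1f}) per 100,000")
```

In this example the mid-p limits fall strictly inside the corresponding Fay–Feuer limits, which illustrates the trade-off between interval width and guaranteed coverage.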

Anderson–Rosenberg approximation. Anderson and Rosenberg [11] introduced an approximation to the Fay–Feuer upper CI limit that alleviates the need to calculate κ0 = max{wj/pj}. Instead, the Poisson-gamma relationship in Eq. A1 is assumed to hold for an appropriately defined Poisson random variable Dadj corresponding to a crude rate that would have been equal to the age-adjusted rate R′, i.e., such that R′ = Dadj/Padj. Therefore, a “standardized” gamma random variable Gadj is defined as Z/(v/y), where the gamma-distributed Z has mean y and variance v. As a result, Gadj has mean and variance equal to y²/v. Define xadj = y²/v and 1/padj = v/y. If xadj were an integer, then there would exist a Poisson random variable Dadj with mean and variance equal to λ′ padj such that

$$\Pr \left( {D_{{{\text{adj}}}} \ge x_{{{\text{adj}}}} |\lambda^{\prime}} \right) = \Pr \left( {G_{{{\text{adj}}}} \le \lambda^{\prime}p_{{{\text{adj}}}} |x_{{{\text{adj}}}} } \right)$$
(A6)

Because y²/v will generally not be an integer, xadj is defined as the nearest integer instead (although this is not strictly necessary), and the equality in Eq. A6 is assumed to hold approximately. Either way, one proceeds as for the crude rate to derive CI limits L(y) and U(y) for λ′ as the (a/2)-quantile of the gamma(xadj, 1/padj) distribution and the (1 − a/2)-quantile of the gamma(xadj + 1, 1/padj) distribution, respectively.
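The Anderson–Rosenberg limits can be sketched in the same way. The code below is illustrative: function names are hypothetical, rounding of xadj is optional to mirror the remark that it is not strictly necessary, and the zero-count case is left to the separate convention for y = 0 by raising an error.

```python
import math

def gamma_cdf(t, shape, scale=1.0):
    """Pr(G <= t) for G ~ gamma(shape, scale), via a power series
    (adequate for the small shapes typical of sparse data)."""
    z = t / scale
    if z <= 0.0:
        return 0.0
    term = total = 1.0 / shape
    k = 0
    while term > 1e-16 * total:
        k += 1
        term *= z / (shape + k)
        total += term
    return min(1.0, total * math.exp(shape * math.log(z) - z - math.lgamma(shape)))

def gamma_ppf(q, shape, scale=1.0):
    """q-quantile of gamma(shape, scale), by bisection on the unit-scale CDF."""
    lo, hi = 0.0, shape + 10.0
    while gamma_cdf(hi, shape) < q:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape) < q:
            lo = mid
        else:
            hi = mid
    return scale * 0.5 * (lo + hi)

def anderson_rosenberg_ci(x, p, w, a=0.05, round_xadj=False):
    """Age-adjusted rate y and its Anderson-Rosenberg gamma CI."""
    u = [wi / pi for wi, pi in zip(w, p)]
    y = sum(ui * xi for ui, xi in zip(u, x))
    v = sum(ui**2 * xi for ui, xi in zip(u, x))
    if y == 0:
        raise ValueError("x_adj = y^2 / v is undefined when no deaths are observed")
    x_adj, inv_p_adj = y * y / v, v / y            # x_adj and 1 / p_adj
    if round_xadj:
        x_adj = round(x_adj)   # nearest integer; Python rounds exact ties to even
    lower = 0.0 if x_adj == 0 else gamma_ppf(a / 2, x_adj, inv_p_adj)
    upper = gamma_ppf(1 - a / 2, x_adj + 1, inv_p_adj)
    return y, lower, upper

# Made-up example: three age groups.
y, low, high = anderson_rosenberg_ci(x=[1, 4, 15], p=[900, 1000, 500], w=[0.4, 0.4, 0.2])
print(f"rate = {y * 1e5:.1f}, CI = ({low * 1e5:.1f}, {high * 1e5:.1f}) per 100,000")
```

Only y and v are needed as inputs, which is what allows the limits to be replicated from a published rate and its standard error.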

Exact intervals. When there is a constant scalar c > 0 such that pi = c πi for all i, the age-adjusted rate equals the overall crude rate, and the above CIs reduce to the exact gamma CI for λ = p⁻¹ ∑λi pi, where p = ∑pi and the total number of deaths D = ∑Di follows a Poisson distribution with mean λp. In particular, when y = 0, v = 0 and xadj is undefined. However, because the age-adjusted rate equals the crude rate in this case, the limits of all three approximate gamma CIs for the age-adjusted rate are defined to be those of the exact gamma CI for the crude rate, with p = ∑pi and x = ∑xi = 0. Thus, in this extreme case, L(0) = 0 and U(0) is the (1 − a/2)-quantile of the gamma(1, 1/p) distribution.
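To see why the intervals coincide with the exact gamma CI in this case, assume the age-specific populations are proportional to the reference population, \(p_i = c\,\pi_i\), so that \(p = \sum p_i = c \sum \pi_j\). Writing \(x = \sum x_i\), the adjustment weights are then constant:

$$\frac{w_{i}}{p_{i}} = \frac{\pi_{i} /\sum \pi_{j}}{c\,\pi_{i}} = \frac{1}{p},\quad y = \frac{x}{p},\quad v = \frac{x}{p^{2}},\quad \frac{y^{2}}{v} = x,\quad \frac{v}{y} = \frac{1}{p}$$

and the increments agree as well, since \(\kappa_0 = \kappa_1 = \kappa_3 = 1/p\) and \(\kappa_2 = 1/p^2\). Each construction therefore uses the gamma(x, 1/p) distribution for the lower limit and the gamma(x + 1, 1/p) distribution for the upper limit.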

Anderson–Rosenberg CI as a modification of the Fay–Feuer CI. The Anderson–Rosenberg construction can be seen to follow that of the Fay–Feuer CI, with a gamma-distributed Z′′ that has mean y′′ = y + κ and variance v′′ = v + κ², where κ = κ3 = 1/padj instead of κ = κ0 = max{wj/pj}. Indeed, with 1/padj = v/y and xadj = y²/v,

$$\frac{{v^{\prime\prime} }}{{y^{\prime\prime} }} = \frac{{v(y^{2} + v)/y^{2} }}{{(y^{2} + v)/y}} = \frac{v}{y} = \frac{1}{{p_{{{\text{adj}}}} }}$$
$${\text{and}}\quad \frac{{y^{{\prime\prime}{2}} }}{{v^{\prime\prime}}} = \frac{{(y^{2} + v)^{2} /y^{2} }}{{v(y^{2} + v)/y^{2} }} = \frac{{y^{2} + v}}{v} = x_{{{\text{adj}}}} + 1$$

Furthermore, 1/padj can be expressed as follows:

$$\begin{aligned}\frac{1}{{p_{{{\text{adj}}}} }} &= \frac{v}{y} = \sum \left( {w_{i} /p_{i} } \right)\xi_{i} \quad{\text{with}}\\\xi_{i} &= \frac{{(w_{i} /p_{i} )x_{i} }}{{\sum (w_{j} /p_{j} )x_{j} }}\;{\text{and}}\;\sum \xi_{i} = 1.\end{aligned}$$

As a result,

$$y^{\prime\prime} = y + \frac{1}{{p_{{{\text{adj}}}} }} = \sum (w_{i} /p_{i} )(x_{i} + \xi_{i} )$$

and the Anderson–Rosenberg method is seen to result in incrementing the age-specific death counts from xi to xi + ξi, whereas in the Fay–Feuer method only the count xi* for the age group i* for which wi*/pi* = max{wj/pj} is incremented—and in the Tiwari modification, the age-specific counts are incremented from xi to xi + ζi, where ζi = 1/n. Additionally,

$$\kappa_{3} = \frac{1}{{p_{{{\text{adj}}}} }} = \sum (w_{i} /p_{i} )\xi_{i} \le \max \{ w_{i} /p_{i} \} = \kappa_{0}$$

since ∑ξi = 1. Thus, like the Tiwari modification, the Anderson–Rosenberg construction reduces the CI width relative to the Fay–Feuer method:

$$\Pr [R^{\prime} \ge y + \kappa_{3} |\lambda^{\prime} = U(y)] \ge \Pr [R^{\prime} \ge y + \kappa_{0} |\lambda^{\prime} = U(y)]$$
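The increments used by the three constructions, and the algebraic identities derived above, can be verified with a few lines of arithmetic. The sketch below uses made-up inputs and a hypothetical helper name (`increments`); it requires at least one observed death so that y > 0.

```python
import math

def increments(x, p, w):
    """Increments used by the Fay-Feuer (k0), Tiwari (k1, k2) and
    Anderson-Rosenberg (k3) upper limits; requires y > 0."""
    u = [wi / pi for wi, pi in zip(w, p)]            # adjustment weights w_i / p_i
    n = len(u)
    y = sum(ui * xi for ui, xi in zip(u, x))
    v = sum(ui**2 * xi for ui, xi in zip(u, x))
    share = [ui * xi / y for ui, xi in zip(u, x)]    # the xi_i of the text, summing to 1
    return {
        "u": u, "y": y, "v": v, "share": share,
        "k0": max(u),
        "k1": sum(u) / n,
        "k2": sum(ui**2 for ui in u) / n,
        "k3": v / y,
    }

# Made-up three-group example.
r = increments(x=[1, 4, 15], p=[900, 1000, 500], w=[0.4, 0.4, 0.2])
y2, v2 = r["y"] + r["k3"], r["v"] + r["k3"] ** 2     # y'' and v''

print("k3 as weighted mean of u:", sum(ui * s for ui, s in zip(r["u"], r["share"])), r["k3"])
print("k3 <= k0:", r["k3"] <= r["k0"])
print("scale unchanged:", math.isclose(v2 / y2, r["v"] / r["y"]))
print("shape up by one:", math.isclose(y2 * y2 / v2, r["y"] ** 2 / r["v"] + 1))
```

Because κ3 is a weighted average of the wi/pi with weights ξi, it can never exceed their maximum κ0, whatever the observed counts.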

Two questions emerge from the above derivations:

  1. Under what circumstances does the Anderson–Rosenberg method result in a shorter CI than the Fay–Feuer method that retains nominal coverage?

  2. Since both the Anderson–Rosenberg and Tiwari methods result in narrower CIs than the Fay–Feuer method, when is one preferable to the other?

To partially answer question 2, note that the Anderson–Rosenberg CI would be narrower than the Tiwari CI if (but not only if) κ3 ≤ κ1, as that ensures

$$\Pr [R^{\prime} \ge y + \kappa_{3} |\lambda^{\prime} = U(y)] \ge \Pr [R^{\prime} \ge y + \kappa_{1} |\lambda^{\prime} = U(y)]$$

By definition, the condition κ3 ≤ κ1 is realized when

$$\sum (w_{i} /p_{i} )\xi_{i} \le \frac{1}{n}\sum (w_{i} /p_{i} )$$

which is equivalent to

$$\frac{1}{n}\sum (w_{i} /p_{i} )^{2} x_{i} \le \left\{ {\frac{1}{n}\sum (w_{i} /p_{i} )x_{i} } \right\} \left\{ {\frac{1}{n}\sum (w_{i} /p_{i} )} \right\}$$

This last condition indicates that the slope of the line from the simple regression of the weight-adjusted age-specific death rates wi (xi/pi) = (wi/pi) xi onto the weights wi/pi is negative or zero. This could be verified upfront for any set of age-adjustment weights (w1, …, wn) and population distribution (p1, …, pn), and it would be sufficient to ensure that the Anderson–Rosenberg CI will be narrower than the Tiwari CI. Of course, this leaves the issue of efficiency unresolved, as it would theoretically be possible for either the Anderson–Rosenberg or the Tiwari CIs to be so narrow as to fail to retain nominal coverage. The empirical simulations investigate situations where this may occur. In addition, these two CI methods are compared to the more recent Fay–Kim mid-p modification.
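The equivalence between the condition κ3 ≤ κ1 and a non-positive regression slope can be illustrated directly. In the sketch below the function name and the two example data sets are hypothetical; the covariance between the weights wi/pi and the weight-adjusted rates (wi/pi) xi has the same sign as the regression slope.

```python
def compare_increments(x, p, w):
    """Return (covariance, k3, k1) for deaths x, populations p and weights w."""
    u = [wi / pi for wi, pi in zip(w, p)]        # weights w_i / p_i
    z = [ui * xi for ui, xi in zip(u, x)]        # weight-adjusted death rates
    n = len(u)
    mean_u, mean_z = sum(u) / n, sum(z) / n
    cov = sum(ui * zi for ui, zi in zip(u, z)) / n - mean_u * mean_z
    k3 = sum(ui * zi for ui, zi in zip(u, z)) / sum(z)   # v / y
    k1 = mean_u                                          # Tiwari increment
    return cov, k3, k1

w, p = [0.5, 0.3, 0.2], [1000, 1000, 1000]

# Deaths concentrated where w_i / p_i is largest: positive slope, k3 > k1.
cov_a, k3_a, k1_a = compare_increments([10, 1, 1], p, w)
# Deaths concentrated where w_i / p_i is smallest: negative slope, k3 <= k1.
cov_b, k3_b, k1_b = compare_increments([0, 1, 10], p, w)

print(cov_a > 0, k3_a > k1_a)
print(cov_b <= 0, k3_b <= k1_b)
```

Note that the check involves the observed counts xi as well as the weights and populations, so in practice it would be evaluated for the data at hand.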

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Talih, M., Anderson, R.N. & Parker, J.D. Evaluation of four gamma-based methods for calculating confidence intervals for age-adjusted mortality rates when data are sparse. Popul Health Metrics 20, 13 (2022). https://doi.org/10.1186/s12963-022-00288-1

  • DOI: https://doi.org/10.1186/s12963-022-00288-1

Keywords

  • Direct standardization
  • Confidence interval width
  • Coverage probability
  • Statistical reliability