## Abstract

### Background

Equal-tailed confidence intervals that maintain nominal coverage (a probability of at least 0.95 that a 95% confidence interval covers the true value) are useful in interval-based statistical reliability standards because they remain conservative. Although the Fay–Feuer gamma method remains the gold standard for age-adjusted death rates, modifications have been proposed to streamline implementation and/or obtain more efficient intervals (shorter intervals that retain nominal coverage).
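
For orientation, the Fay–Feuer gamma interval is commonly written as below. The notation is an assumption introduced here and may differ from the paper's: $x_i$ is the death count and $n_i$ the population in age group $i$, $s_i$ is the standard-population proportion (summing to 1), and the lower limit is taken as 0 when $y = 0$.

```latex
\begin{align*}
y &= \sum_i w_i x_i, \qquad v = \sum_i w_i^2 x_i, \qquad w_i = \frac{s_i}{n_i}, \qquad w_M = \max_i w_i \\
L &= \frac{v}{2y}\,\left(\chi^2\right)^{-1}_{2y^2/v}\!\left(\frac{\alpha}{2}\right) \\
U &= \frac{v + w_M^2}{2\,(y + w_M)}\,\left(\chi^2\right)^{-1}_{2(y + w_M)^2/(v + w_M^2)}\!\left(1 - \frac{\alpha}{2}\right)
\end{align*}
```

Here $\left(\chi^2\right)^{-1}_{k}(p)$ denotes the $p$-th quantile of a chi-squared distribution with $k$ degrees of freedom, and $\alpha = 0.05$ gives a 95% interval.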

### Methods

This paper evaluates three such modifications, the Anderson–Rosenberg, Tiwari, and Fay–Kim intervals, for use in interval-based statistical reliability standards when data are sparse and sample size-based standards alone are overly coarse. Initial simulations were anchored around small populations (*P* = 2400 or 1200), the median crude all-cause US mortality rate in 2010–2019 (833.8 per 100,000), and the corresponding age-specific probabilities of death. To allow for greater variation in the age-adjustment weights and age-specific probabilities, a second set of simulations drew those at random while holding the mean number of deaths at 20 or 10. Finally, county-level mortality data by race/ethnicity for four causes of death were selected to capture even greater variation: all causes, external causes, congenital malformations, and Alzheimer disease.
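
The two simulation designs appear to line up arithmetically. At 833.8 deaths per 100,000, populations of 2400 and 1200 imply roughly 20 and 10 expected deaths, the same means held fixed in the second set of simulations. A minimal check, in which `expected_deaths` is a hypothetical helper name and not part of the paper:

```python
# Crude rate and population sizes quoted in the Methods text.
CRUDE_RATE_PER_100K = 833.8
POPULATIONS = (2400, 1200)


def expected_deaths(population, rate_per_100k):
    """Expected number of deaths in a population at a given crude rate."""
    return population * rate_per_100k / 100_000


for p in POPULATIONS:
    deaths = expected_deaths(p, CRUDE_RATE_PER_100K)
    print(f"P = {p}: expected deaths = {deaths:.2f}")
# prints:
# P = 2400: expected deaths = 20.01
# P = 1200: expected deaths = 10.01
```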

### Results

The three modifications had comparable performance when the number of deaths was large relative to the denominator and the age distribution was as in the standard population. However, for sparse county-level data by race/ethnicity on rarer causes of death, with age distributions that differed sharply from the standard population, coverage probability for every interval except the Fay–Feuer interval sometimes fell below 0.95. More efficient intervals than the Fay–Feuer interval were identified under specific circumstances. When the coefficient of variation of the age-adjustment weights was below 0.5, the Anderson–Rosenberg and Tiwari intervals appeared to be more efficient, whereas when it was above 0.5, the Fay–Kim interval appeared to be more efficient.
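
The coefficient-of-variation screen could be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the weights are taken to be standard-population proportions divided by stratum populations, the population form of the standard deviation is used (the text does not say which form applies), and the function names are hypothetical. The result lists candidate intervals only. Coverage would still need to be checked, since intervals other than Fay–Feuer sometimes fell below 0.95.

```python
from statistics import mean, pstdev

CV_THRESHOLD = 0.5  # threshold reported in the Results text


def weight_cv(weights):
    """Coefficient of variation (SD / mean) of the age-adjustment weights.

    Assumption: population SD (pstdev); the source does not specify the form.
    """
    m = mean(weights)
    if m <= 0:
        raise ValueError("weights must have a positive mean")
    return pstdev(weights) / m


def candidate_intervals(weights):
    """Intervals that appeared more efficient on this side of the threshold."""
    cv = weight_cv(weights)
    if cv < CV_THRESHOLD:
        return ["Anderson-Rosenberg", "Tiwari"]
    if cv > CV_THRESHOLD:
        return ["Fay-Kim"]
    # Exactly at the threshold the text gives no guidance; fall back to Fay-Feuer.
    return ["Fay-Feuer"]


# Hypothetical weights, for illustration only.
print(candidate_intervals([1.0, 1.0, 1.0, 1.0]))
print(candidate_intervals([0.1, 0.2, 0.5, 3.0]))
```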

### Conclusions

As national and international agencies reassess prevailing data presentation standards to release age-adjusted estimates for smaller areas or population subgroups than previously presented, the Fay–Feuer interval can be used to develop interval-based statistical reliability standards with appropriate thresholds that are generally applicable. For data that meet certain statistical conditions, more efficient intervals could be considered.