Quantifying temporal trends of age-standardized rates with odds
Population Health Metrics volume 16, Article number: 18 (2018)
Abstract
Background
To quantify temporal trends in age-standardized rates of disease, the convention is to fit a linear regression model to log-transformed rates because the slope term provides the estimated annual percentage change. However, such log-transformation is not always appropriate.
Methods
We propose an alternative method using the rank-ordered logit (ROL) model that is indifferent to log-transformation. This method quantifies the temporal trend using odds, a quantity commonly used in epidemiology, and the log-odds corresponds to the scaled slope parameter estimate from linear regression. The ROL method can be implemented by using the commands for proportional hazards regression in any standard statistical package. We apply the ROL method to estimate temporal trends in age-standardized cancer rates worldwide using the cancer incidence data from the Cancer Incidence in Five Continents plus (CI5plus) database for the period 1953 to 2007 and compare the estimates to their scaled counterparts obtained from linear regression with and without log-transformation.
Results
We found a strong concordance in the direction and significance of the temporal trends in cancer incidence estimated by all three approaches, and illustrated how the estimate from the ROL model provides a measure that is comparable to a scaled slope parameter estimated from linear regression.
Conclusions
Our method offers an alternative approach for quantifying temporal trends in incidence or mortality rates in a population that is invariant to transformation, and whose estimate of trend agrees with the scaled slope from a linear regression model.
Background
Monitoring incidence and mortality rates in a population allows stakeholders in health care to track the burden of the disease. Changes in the population rates over time can help to assess the effectiveness of interventions in public health or health care and also inform projections for future health services. Recent years have seen extensive work in analyses of the global burden of disease, with published estimates of global, regional, and national incidence and prevalence rates of several hundred diseases for a majority of countries, both sex-standardized [1] and for specific sex and age groups [2]. Established methods of assessing trends include age-period-cohort models [3, 4] and the estimated annual percentage change [5]. The estimated annual percentage change (EAPC) has been in use for many years by cancer registries to quantify changes in cancer rates over time and to project future rates [3, 5, 6, 7]. Conceptually, EAPC represents the average change in the age-standardized rate (ASR) per year. It is usually computed by estimating the slope of a linear regression (LR) model fitted to the log-transformed ASR. Under this framework, for every one-year increase in calendar time, the ASR is assumed to change by a constant factor when expressed as a percentage of the previous year’s rate. However, the LR model can also be used to model the ASR without log-transformation and the slope term will then correspond to the change in ASR for each calendar year [8]. There is no simple relationship between the slopes from these two models and data analysts need to assess whether the increase is linear or exponential when deciding whether the untransformed or log-transformed ASR is the most appropriate.
In epidemiology, the odds ratio is a commonly used measure of association between a binary outcome and an exposure. In this paper, we propose to use odds to quantify time trends in annual ASRs to eliminate the need to consider whether transformation of ASR is necessary when testing for a temporal trend. This approach involves modeling the ranked ASR values across calendar years using the rank-ordered logit (ROL) regression model to obtain the relevant estimates [9]. We illustrate the method by applying it to data from the Cancer Incidence in Five Continents plus (CI5plus) database, and comparing the estimates we obtain to the scaled estimates from the usual LR models, where the scale parameter is estimated from the standard deviation of the error terms.
Material and methods
The usual approach to computing the EAPC in incidence rates assumes that the log-transformed ASR is linearly related to time, and a LR model is fitted to the log-transformed ASR with calendar year as the (continuous) independent variable:
\[ y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \tag{1} \]
where y_{i} is the (log-transformed) ASR and x_{i} the calendar year, the subscript i represents the ith year (i = 1, 2, …, n) and the error terms, ε_{i}s, are assumed to be independent and normally distributed with mean 0 and variance \( {\sigma}_i^2 \) [5, 6]. If the error terms have equal variance (i.e., \( {\sigma}_i^2={\sigma}^2 \)), then simple unweighted least squares provides an estimate of the slope term, β_{1}. As incidence is represented as a count, the assumption of equal variances may not be reasonable, especially for rare diseases, and weighted least squares may be more appropriate, where the weight for y_{i} is \( {w}_i=\frac{1}{{\sigma_i}^2} \) (see Supplementary materials and methods for details). In practice, when fitting such models to sparse data, there is a need to account for age strata with no events as the log of zero is undefined.
When a LR of the ASR (i.e., no log-transformation) is used to estimate trend [8], the parameter β_{1} in Eq. (1) provides an estimate of the annual increment in the incidence rate. On fitting a LR model to log-transformed rates, the EAPC is given by the following transformation of the coefficient (β_{1}):
\[ \mathrm{EAPC} = 100 \times \left( e^{\beta_1} - 1 \right) \tag{2} \]
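The EAPC calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it uses unweighted least squares (the paper's main analysis is weighted), the function name `eapc_from_asr` is hypothetical, and the input series is synthetic.

```python
import numpy as np

def eapc_from_asr(years, asr):
    """Fit ln(ASR) = b0 + b1 * year by unweighted least squares.

    Returns the slope b1 and the EAPC in percent, 100 * (exp(b1) - 1).
    """
    years = np.asarray(years, dtype=float)
    log_asr = np.log(np.asarray(asr, dtype=float))  # needs strictly positive ASRs
    centred = years - years.mean()                  # centring leaves the slope unchanged
    b1, b0 = np.polyfit(centred, log_asr, deg=1)    # coefficients, highest degree first
    return b1, 100.0 * (np.exp(b1) - 1.0)

# Synthetic (hypothetical) series: an ASR of 50 per 100,000 growing by 2% per year.
years = np.arange(2000, 2010)
asr = 50.0 * 1.02 ** (years - 2000)

slope, eapc = eapc_from_asr(years, asr)
print(round(eapc, 6))  # approximately 2.0, the growth rate built into the series
```

Because the synthetic series grows by a constant factor, the log-linear fit is exact and the EAPC recovers the 2% used to generate it; real registry data would of course show residual scatter.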
The rank-ordered logit model
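The ROL model was originally developed in marketing research for modelling an individual’s preferences for n products [9]. The model is linear as in Eq. (1), but the error terms are assumed to be extreme value type 1 (EVT1) distributed with location μ = 0 and scale λ = 1 (i.e., standard EVT1 distributed). Under these assumptions, β_{1} can be estimated from the ranked observations based on:
\[ L(\beta_1) = \prod_{j=1}^{n} \frac{\exp\{\beta_1 x_{(j)}\}}{\sum_{k=j}^{n} \exp\{\beta_1 x_{(k)}\}} \tag{3} \]
where x_{(j)} denotes the value of the explanatory variable for the observation with the jth highest outcome.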
In marketing research applications, the β_{1} parameter indicates the association between a feature of the products and the individual’s preference: for example, if a decrease in the price of a product is associated with an increase in its preference, then exp{β_{1}} represents the odds of a higher rank (or preference) when the price decreases by one unit. When the error term assumption is fulfilled, the estimate of β_{1} also has the usual linear interpretation as in Eq. (1). Note that Eq. (3) is the familiar partial likelihood of a Cox regression model [10,11,12]. Hence, the ROL model can be implemented using standard statistical software by using the commands provided for Cox regression analysis.
In applying ROL models to time trend analysis of incidence rates, the ASR (i.e., y) is used to rank the calendar years. Thus the calendar year is the explanatory variable (i.e., x) and the ROL model provides an estimate of the association between calendar year and the magnitude (or rank) of the ASR. Since the ROL is indifferent to any transformation of the outcome that preserves the ordering, the odds of the subsequent calendar year having a higher value (or rank) than the current year is exp{β_{1}}, regardless of whether or not the ASR is log-transformed.
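To make the invariance property concrete, the following sketch maximizes the rank-ordered (Cox-type) partial likelihood for a single covariate with a plain Newton-Raphson iteration. It is an assumption-laden illustration, not the authors' code: the function `rol_slope` is hypothetical, ties in the outcome are assumed absent, and the ASR values are invented. In practice the same estimate would come from a package's Cox regression command, as the text notes.

```python
import numpy as np

def rol_slope(x, y, n_iter=100, tol=1e-10):
    """Estimate beta_1 of a rank-ordered logit model with one covariate.

    Observations are ordered from the highest to the lowest outcome y,
    and the Cox-type partial likelihood is maximized by Newton-Raphson.
    Assumes no ties in y and an ordering that is not perfectly monotone
    in x (otherwise the estimate is not finite).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(-y)            # highest outcome first
    xs = x[order] - x.mean()          # centring x does not change the slope
    beta = 0.0
    for _ in range(n_iter):
        w = np.exp(beta * xs)
        # Reverse cumulative sums: totals over the "risk set" {j, ..., n}.
        s0 = np.cumsum(w[::-1])[::-1]
        s1 = np.cumsum((w * xs)[::-1])[::-1]
        s2 = np.cumsum((w * xs ** 2)[::-1])[::-1]
        score = np.sum(xs - s1 / s0)               # first derivative of log-likelihood
        info = np.sum(s2 / s0 - (s1 / s0) ** 2)    # minus the second derivative
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    return beta

# Hypothetical yearly ASRs: an upward trend with alternating perturbations.
years = np.arange(12)
asr = 40.0 + 1.0 * years + 3.0 * (-1.0) ** years

beta_raw = rol_slope(years, asr)
beta_log = rol_slope(years, np.log(asr))   # same ranking, hence same estimate
print(beta_raw, beta_log, np.exp(beta_raw))
```

Only the ordering of the outcome enters the calculation, so the two calls return the same value; the last printed number is the estimated odds that a later year outranks the year before it.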
The scale parameter, λ, for the slope term, β_{1}, from the linear regression model
The ROL model specifically assumes standard EVT1 distributed error terms, thus the variance equals π^{2}/6. In contrast, the variance of the error terms in the LR model is not specified a priori but estimated from the data. Because of this, the β_{1} estimates from the two regression models are not comparable. We can overcome this by scaling the outcome in the LR model (and thus scaling β_{1}).
For a linear model such as that in Eq. (1), if the error terms are independently and identically distributed with an EVT1 distribution with μ = 0 and λ > 0, the variance of the error terms (and consequently the variance of the outcome) is given by,
\[ \operatorname{Var}(\varepsilon_i) = \frac{\pi^2 \lambda^2}{6} \tag{4} \]
Hence, we can estimate a scale-like parameter, λ, from the error terms obtained from the usual LR (assuming these are independent and normally distributed with mean 0 and σ > 0) by equating the variance expression in Eq. (4) with the estimate of σ^{2} from the LR model and solving for λ, i.e., \( \uplambda =\sqrt{6}\sigma /\pi \).
Scaling the outcome variable y from Eq. (1) by λ gives \( {y}_i^{\ast }={y}_i/\uplambda ={\beta}_0^{\ast }+{\beta}_1^{\ast }{x}_i+{\varepsilon}_i^{\ast } \) where \( {\varepsilon}_i^{\ast }={\varepsilon}_i/\uplambda \) mimics the standard EVT1 distribution assumption of the error terms in the ROL in Eq. (3). Hence, the scaled slope parameter \( {\beta}_1^{\ast }={\beta}_1/\uplambda \) from Eq. (1) represents the slope parameter in the ROL model in Eq. (3), and the proposed scaled slope from LR has a similar interpretation to the log-odds in Eq. (3). This is a heuristic argument for scaling the slope from a simple (unweighted) LR where the error variance is represented by a single parameter, σ. Extending this to weighted LR would require a single value to represent the variation of the error terms. For simplicity, we propose using the mean of the standard deviations in the different calendar years (i.e., \( \sigma =\sum \limits_{i=1}^n{\sigma}_i/n \)) to represent the overall underlying variation over the time period of study.
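The scaling step can be sketched as follows for the simple unweighted case. This is an illustration under stated assumptions: the function `scaled_slope` is hypothetical, σ is taken to be the residual standard deviation with n − 2 degrees of freedom (the text does not specify the estimator), and the ASR values are invented.

```python
import numpy as np

def scaled_slope(x, y):
    """Unweighted LR slope b1, scale lam = sqrt(6) * sigma / pi, and b1 / lam."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    b1 = np.sum(xc * (y - y.mean())) / np.sum(xc ** 2)   # least-squares slope
    b0 = y.mean() - b1 * x.mean()                        # least-squares intercept
    resid = y - (b0 + b1 * x)
    # Assumption: residual standard deviation with n - 2 degrees of freedom.
    sigma = np.sqrt(np.sum(resid ** 2) / (len(y) - 2))
    lam = np.sqrt(6.0) * sigma / np.pi                   # solves pi^2 lam^2 / 6 = sigma^2
    return b1, lam, b1 / lam

# Hypothetical yearly ASRs (per 100,000).
years = np.arange(12)
asr = 40.0 + 1.0 * years + 3.0 * (-1.0) ** years

b1, lam, b1_star = scaled_slope(years, asr)
# Changing the units of the rate (e.g. per 1,000,000) rescales b1 and lam
# by the same factor, so the scaled slope is unchanged.
_, _, b1_star_rescaled = scaled_slope(years, 10.0 * asr)
print(b1, lam, b1_star, b1_star_rescaled)
```

The scaled slope is unit-free, which is what allows it to be set beside the ROL log-odds; unlike the ROL estimate, however, it still depends on whether the rates were log-transformed before fitting.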
Application to cancer data
The CI5plus database has annual incidence rates for 27 cancer sites in 118 populations from 1953 to 2007 with calendar periods of coverage varying for different populations. With the exception of cancers of the breast, cervix uteri, corpus uteri, and ovary and other uterine adnexa in females, and cancer of the prostate and testis in males, cancers at all sites are reported separately for males and females. Yearly incident cancer cases, c_{ij}s, and population denominators, n_{ij}s, aggregated by five-year age groups provide incidence rates suitable for performing time trend analysis where i and j denote the ith calendar year and jth age group. We harmonized all incidence rates and denominators using 16 age groups (0–4, 5–9, …, 70–74, 75+), and used the Segi world standard population, s_{j}s, to compute the ASRs [13]. We replaced any ASR of zero with half the value of the smallest non-zero ASR in the database for the cancer site(s) being analyzed. From the 27 cancer sites (four of them gender-specific) in 118 populations, we had a total of 5900 trends for analysis. In addition to site- and sex-specific cancers, we also considered all sites excluding non-melanoma skin cancer.
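The two data-preparation steps described above, direct age standardization and replacement of zero ASRs, can be sketched as below. The function names are hypothetical, and the three age groups and their weights are invented for brevity; they are not the 16 age groups or the Segi weights used in the analysis.

```python
import numpy as np

def age_standardized_rate(cases, population, std_weights, per=100_000):
    """Direct standardization: weighted mean of age-specific rates c_ij / n_ij."""
    cases = np.asarray(cases, dtype=float)
    population = np.asarray(population, dtype=float)
    weights = np.asarray(std_weights, dtype=float)
    rates = cases / population                      # age-specific rates
    return per * np.sum(weights * rates) / np.sum(weights)

def replace_zero_asr(asr_values):
    """Replace zeros by half the smallest non-zero ASR among the values supplied."""
    asr_values = np.asarray(asr_values, dtype=float).copy()
    smallest = asr_values[asr_values > 0].min()
    asr_values[asr_values == 0] = smallest / 2.0
    return asr_values

# Hypothetical example with three age groups and illustrative weights.
cases = [2, 10, 30]
population = [100_000, 100_000, 100_000]
weights = [50, 30, 20]

asr = age_standardized_rate(cases, population, weights)
filled = replace_zero_asr([0.0, 4.0, 2.0])
print(round(asr, 6))    # 10.0 per 100,000
print(filled.tolist())  # [1.0, 4.0, 2.0]
```

Replacing zeros keeps the log-transformed regressions defined; the ROL estimate depends only on the ranking of the ASRs, which is unaffected as long as the replacement value stays below every non-zero ASR in the series.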
We applied the three approaches outlined in the previous section to these worldwide cancer rates. The first approach was the LR of the log-transformed rates (LRln), the second approach was the LR of the untransformed rates (LRun), and the third approach was our proposed ROL regression model on the ranked rates. The LR models were fit using weighted least squares. The estimates of trend obtained from the three approaches and the corresponding scaled estimates for LRln and LRun were compared. We inspected the concordance in sign with respect to p-values for the scaled and unscaled estimates. Additionally, we reported the results from analysis of the trends stratified by sex to demonstrate the consistency with published work and to highlight important trends. To corroborate the contrasting trends that have been reported for breast cancer in Singapore and Sweden [14, 15], we conducted a specific analysis that compared the incidence rates to illustrate the ROL model’s indifference to transformation and to demonstrate the comparability of the estimates obtained.
All analyses were performed with the statistical package R, version 3.1.2 [16] and the commands are provided in the Supplementary material, together with the commands for implementation in other widely used statistical software packages (SAS, Stata, SPSS).
Results
Figure 1 presents the results from the application of the three approaches to the CI5plus database. The scatterplots in the left column of Fig. 1 provide a pairwise comparison of the estimates of the slope, β_{1}, from (a) the LR of log-transformed and untransformed ASRs, (b) the log-transformed ASRs and the ROL, and (c) the untransformed ASRs and the ROL. As expected, these plots did not exhibit a clear relationship between the estimates, although there was a high concordance in the signs of the estimates across the three approaches, with 5325 out of 5900 combinations (90.3%) having the same sign across all three approaches. With regard to inference concerning the direction of temporal trends, the p-values corresponding to these concordant scenarios were lower than those from scenarios where the signs were discordant (see Fig. 2). Examining the scaled estimates, \( {\beta}_1^{\ast } \), from the linear regression of untransformed and log-transformed rates and comparing them to each other (Fig. 1 (d)) and comparing each of these estimates to the β_{1} estimate from the ROL analysis (Fig. 1 (e) and (f)), we see that the scatterplots exhibit a pronounced linear relationship along the line of identity (i.e., the grey diagonal line corresponding to y = x).
On inspection of the divergent points in Fig. 1 (d), (e), and (f), we found several of these were for prostate cancer where the introduction of screening resulted in the familiar “screening effect” feature in the incidence profile so that it is not reasonable to consider a linear fit. One unexpected disagreement was for thyroid cancer in New York, whose incidence curve had an apparent screening effect in 2000–2005, and we found that indeed thyroid cancer screening had been offered in New York after the events of 9/11 [17, 18]. For disagreements not due to screening, we found that where the estimates from LR models of untransformed and logtransformed rates disagree, the ROL estimate tends to agree well with the most appropriate LR estimate. These and other divergent points from Fig. 1 are presented in detail in Additional file 1: Figures S1 and S2.
The numerical results from the ROL and LR of the log-transformed rates of sex-specific rates are presented in Table 1, where we do not report results for any cancers where 25% or more of the yearly ASRs were less than 3 per 100,000: cancers of the eye, bone, testis, gallbladder, Hodgkin’s lymphoma and multiple myeloma. The remaining cancers were sorted by the concordance in the significance between the two approaches across the 118 populations. For cancers that affect both genders, the average concordance was used. For “All sites but non-melanoma skin,” the overall concordance between LRln and ROL was 84.7% among the 118 populations for both males and females, with an increasing trend in at least 75% of the 118 populations as indicated by the interquartile range excluding an odds of 1 in the ROL and excluding an EAPC value of 0 in the LRln analyses respectively. Among the nine cancer sites with more than 70% concordance in significant findings among the 118 populations, two of the three sex-specific cancers (prostate and breast) had an increasing trend in the majority of the populations (≥ 75%) while cancer of the cervix had a decreasing trend in the majority of the populations. For the six cancers affecting both sexes, there was evidence in a majority of populations of an increasing trend for both males and females in cancer of the thyroid and kidney, non-Hodgkin’s lymphoma and non-melanoma skin cancer and a decreasing trend in both sexes for stomach cancer. For lung cancer, there was evidence of an increasing trend in females and decreasing trend in males.
For cancer sites with lower concordance in significant findings between LRln and ROL, the evidence of an increasing or decreasing trend among the 118 populations is weaker. Only liver cancer in men and uterine cancer in women had an increasing trend of reasonable magnitude (median odds 1.14 and 1.10 respectively). For most of the rarer cancers, the odds estimates from the different populations were close to 1 and the EAPC close to 0.
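As a reading aid, an odds can be converted to a probability with the usual identity p = odds / (1 + odds). Assuming the ROL model holds, the median odds quoted above for liver cancer in men and uterine cancer in women correspond to the following probabilities that a given calendar year has a higher ASR than the preceding year:

\[ p_{\text{liver}} = \frac{1.14}{1 + 1.14} \approx 0.533, \qquad p_{\text{uterine}} = \frac{1.10}{1 + 1.10} \approx 0.524 \]

An odds of 1 corresponds to p = 0.5, i.e., no tendency for later years to rank higher or lower.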
Figure 3 displays the untransformed and log-transformed ASR of female breast cancer incidence in Singapore and Sweden, suggesting that a linear trend was reasonable for both the untransformed (a) or log-transformed (b) data in both populations. In Table 2, we report the estimates from the LR analysis of both the untransformed and log-transformed rates. The scaled-slope estimates from both analyses were close to the estimates from the ROL analysis in both populations, with slightly better agreement for untransformed rates in the Swedish data. All analyses indicated an increasing trend in breast cancer incidence in both Singapore and Sweden, with a steeper trend in Singapore than in Sweden, consistent with Fig. 3 and with previously published work [14, 15].
Discussion
We have described an alternative approach to quantifying temporal trends that is comparable to current practice but with some important advantages. In contrast to many published disease trends, which are estimated with specialized models and software [1, 2], our approach uses simple commands available in any standard statistical package and implements a familiar model (Cox proportional hazards regression) to yield an estimate of trend using a measure (the odds) that is familiar in epidemiology. We have provided detailed instructions in the Supplementary material for implementation in several commonly used statistical software packages. The method uses the ROL model, which is commonly used in marketing research but is not a mainstream analytical tool in traditional epidemiology. The usefulness of the model in assessing trends is that it is indifferent to transformations of the age-standardized rates, so there is no need to assess whether the untransformed or log-transformed rates are the most appropriate before proceeding with estimation. This can simplify comparisons across populations where the decision to transform differs.
We applied the method to investigate evidence of temporal trends in site-specific cancer incidence rates in the 118 populations represented in the CI5plus database and compared our results to those from the usual regression models. We found strong concordance in the signs of the estimates and the significance of temporal trends across the three approaches: linear regression (LR) analysis of the untransformed (LRun) or transformed rates (LRln), and ROL. In particular, we found the scaled slopes from the weighted LR analyses to be highly correlated with, and similar to, the β_{1} estimates from the ROL model. Unlike the weighted LR whose weights require age-specific population counts and incident cases, our method can be implemented with only annual ASR data. To compare our estimates to those that could be obtained from LR of such data, we conducted a sensitivity analysis using unweighted least squares and obtained very similar results (see Additional file 1: Table S1 and Figure S3) and a high concordance (93.7%: 5526 out of 5900 combinations) in the signs of the estimates across all three approaches (see Additional file 1: Figure S4).
Our analysis demonstrated an increasing trend in many cancers for both men and women, consistent with what has been reported previously [14]. Exceptions, which have also been noted previously, were stomach cancer, which had a decreasing trend in both sexes [19], and lung cancer, which had an increasing trend in women but a decreasing trend in men in a majority of the populations [20]. This lung cancer pattern has been recently observed in many countries and has been attributed to increased smoking among women [21]. The decrease in stomach cancer is harder to explain, but may be due in part to increased exposure to antibiotics [22]. We also found evidence of a decreasing trend in cervical cancer, which has been observed in many populations and been attributed to population-based screening programs [14, 23].
Our comparative analysis of trends can offer additional insights into the health situation within or between specific populations. Our analysis of worldwide cancer incidence rates highlighted a number of interesting features, including the effects of population screening programs (e.g., for prostate cancer), unexpected screening as in New York after the events of 9/11, and the lung cancer profile in Russia (Additional file 1: Figure S2(h)) due to the lack of progress in tobacco control [24].
Conclusions
The consistency of our estimates from ROL with those from least squares provides empirical evidence that temporal trends in cancer incidence can be represented by odds. The method, which can be seamlessly implemented in standard software, provides a transformation-free alternative that facilitates comparison of trends across different populations in the incidence or mortality rates for any disease or the prevalence rates of known risk factors [25]. For trends that are routinely assessed and reported using regression models, using transformed or untransformed rates, simply including an estimate of the error variance with the reported slope would allow population estimates to be compared with estimates from ROL and all estimates to be combined in meta-analyses, simplifying communication and comparison across populations.
Abbreviations
ASR: Age-standardized rate
EAPC: Estimated annual percentage change
LR: Linear regression
LRln: Linear regression of log-transformed rates
LRun: Linear regression of untransformed rates
ROL: Rank-ordered logit
References
1. GBD 2016 Disease and Injury Incidence and Prevalence Collaborators. Global, regional, and national incidence, prevalence, and years lived with disability for 328 diseases and injuries for 195 countries, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2017;390:1211–59.
2. GBD 2016 Causes of Death Collaborators. Global, regional, and national age-sex specific mortality for 264 causes of death, 1980–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2017;390:1151–210.
3. Rosenberg PS, Anderson WF. Age-period-cohort models in cancer surveillance research: ready for prime time? Cancer Epidemiol Biomark Prev. 2011;20:1263–8.
4. Masters RK, Tilstra AM, Simon DH. Explaining recent mortality trends among younger and middle-aged white Americans. Int J Epidemiol. 2018;47:81–8.
5. Fay MP, Tiwari RC, Feuer EJ, Zou ZH. Estimating average annual percent change for disease rates without assuming constant change. Biometrics. 2006;62:847–54.
6. NORDCAN: Glossary of statistical terms. http://wwwdep.iarc.fr/nordcan/English/glossary.htm. Accessed April 14 2018.
7. Rahib L, Smith BD, Aizenberg R, Rosenzweig AB, Fleshman JM, Matrisian LM. Projecting cancer incidence and deaths to 2030: the unexpected burden of thyroid, liver, and pancreas cancers in the United States. Cancer Res. 2014;74:2913–21.
8. Jensen OM, Parkin DM, MacLennan R, Muir CS, Skeet RG. Cancer registration: principles and methods. IARC Sci Publ. 1991;95:1288.
9. Beggs S, Cardell S, Hausman J. Assessing the potential demand for electric cars. J Econ. 1981;17:1–19.
10. Therneau TM, Grambsch PM. Modeling survival data: extending the Cox model. New York: Springer; 2000.
11. Allison PD, Christakis NA. Logit models for sets of ranked items. Sociol Methodol. 1994;24:199–228.
12. Tan CS, Støer NC, Chen Y, Andersson M, Ning Y, Wee HL, Khoo EYH, Tai ES, Kao SL, Reilly M. A stratification approach using logit-based models for confounder adjustment in the study of continuous outcomes. Stat Methods Med Res. 2017; Accepted.
13. Segi M. Cancer mortality for selected sites in 24 countries (1950–1957). Sendai: Tohoku University School of Medicine; 1960.
14. Jemal A, Center MM, DeSantis C, Ward EM. Global patterns of cancer incidence and mortality rates and trends. Cancer Epidemiol Biomark Prev. 2010;19:1893–907.
15. Chia KS, Reilly M, Tan CS, Lee J, Pawitan Y, Adami HO, Hall P, Mow B. Profound changes in breast cancer incidence may reflect changes into a westernized lifestyle: a comparative population-based study in Singapore and Sweden. Int J Cancer. 2005;113:302–6.
16. R Core Team. R: A language and environment for statistical computing. 2013.
17. Boffetta P, Zeig-Owens R, Wallenstein S, Li J, Brackbill R, Cone J, Farfel M, Holden W, Lucchini R, Webber MP, et al. Cancer in World Trade Center responders: findings from multiple cohorts and options for future study. Am J Ind Med. 2016;59:96–105.
18. Li J, Brackbill RM, Liao TS, Qiao B, Cone JE, Farfel MR, Hadler JL, Kahn AR, Konty KJ, Stayner LT, Stellman SD. Ten-year cancer incidence in rescue/recovery workers and civilians exposed to the September 11, 2001 terrorist attacks on the World Trade Center. Am J Ind Med. 2016;59:709–21.
19. Bertuccio P, Chatenoud L, Levi F, Praud D, Ferlay J, Negri E, Malvezzi M, La Vecchia C. Recent patterns in gastric cancer: a global overview. Int J Cancer. 2009;125:666–73.
20. Lortet-Tieulent J, Soerjomataram I, Ferlay J, Rutherford M, Weiderpass E, Bray F. International trends in lung cancer incidence by histological subtype: adenocarcinoma stabilizing in men but still increasing in women. Lung Cancer. 2014;84:13–22.
21. Jemal A, Thun MJ, LAG R, Howe HL, Weir HK, Center MM, Ward E, Wu XC, Eheman C, Anderson R, et al. Annual report to the nation on the status of cancer, 1975–2005, featuring trends in lung cancer, tobacco use, and tobacco control. J Natl Cancer I. 2008;100:1672–94.
22. Parkin DM. The global health burden of infection-associated cancers in the year 2002. Int J Cancer. 2006;118:3030–44.
23. Mathew A, George PS. Trends in incidence and mortality rates of squamous cell carcinoma and adenocarcinoma of cervix-worldwide. Asian Pac J Cancer Prev. 2009;10:645–50.
24. Holmes D. Smoking in Russia: will old habits die hard? Lancet. 2011;378:973–4.
25. GBD 2016 Risk Factors Collaborators. Global, regional, and national comparative risk assessment of 84 behavioural, environmental and occupational, and metabolic risks or clusters of risks, 1990–2016: a systematic analysis for the Global Burden of Disease Study 2016. Lancet. 2017;390:1345–422.
Acknowledgements
Not applicable.
Funding
This work was supported by the Centre for Health Services and Policy Research SBRO14/NS01G from the National University Health Systems Pte Ltd., National University of Singapore Startup Grant (WBS: R608000059133), and the grant (contract 16 0497) from the Swedish Cancer Society (Cancerfonden).
Availability of data and materials
The dataset supporting the conclusions of this article is available at: http://ci5.iarc.fr/CI5plus/Default.aspx
Author information
Contributions
CST conceptualized the project, performed the data analysis, and drafted the manuscript. NS and MR contributed to development of the project, interpreted the findings and revised the manuscript. YN and YC managed and processed the data, participated in the data analysis and revised the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Chuen Seng Tan.
Ethics declarations
Ethics approval and consent to participate
The study was exempted from full Institutional Review Board review by the National University of Singapore Institutional Review Board because it involved use of existing data that is publicly available.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Additional files
Additional file 1:
Supplementary materials. (DOC 5245 kb)
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Keywords
 Burden of disease
 Population surveillance
 Incidence
 Mortality
 Epidemiology
 Calendar time trends
 Rank order method