Multiple primary tumours: incidence estimation in the presence of competing risks
 Stefano Rosso^{1},
 Lea Terracini^{2},
 Fulvio Ricceri^{3, 4} and
 Roberto Zanetti^{1}
DOI: 10.1186/1478795475
© Rosso et al; licensee BioMed Central Ltd. 2009
Received: 29 February 2008
Accepted: 01 April 2009
Published: 01 April 2009
Abstract
Background
Estimating the risk of developing subsequent primary tumours in a population is difficult because the probability of occurrence is conditional on the survival probability.
Methods
We propose applying Markov models to study the transition intensities from first to second tumour with the Aalen-Johansen (AJ) estimators, as is usually done in competing risk models. In a simulation study we applied the proposed method in different settings, with constant or varying underlying intensities and with age standardisation. In addition, we illustrated the method with data on breast cancer from the Piedmont Cancer Registry.
Results
The simulation study showed that the person-years approach led to a considerably larger bias than the AJ estimators. The largest bias was observed when incidence rates were assumed to increase constantly; however, this situation is rather uncommon for the incidence of subsequent tumours. Among 9233 cases of breast cancer occurring in women resident in Turin, Italy, between 1985 and 1998, we observed a significantly increased risk of 1.91 for subsequent cancer of the corpus uteri, estimated with the age-standardised Aalen-Johansen incidence ratio (AJIR^{stand}), and a significantly increased risk of 1.29 for cancers possibly related to radiotherapy of the breast cancer. The peak of occurrence of those cancers was observed after 8 years of follow-up.
Conclusion
The increased risk of cancer of the corpus uteri, also observed in other studies, is usually attributed to shared risk factors such as low parity, early menarche and late onset of menopause. We also grouped together those cancers possibly associated with previous local radiotherapy: the cumulative risk at 14 years is still not significant; however, the AJ estimators showed a significant risk peak between the eighth and the ninth year. Finally, the proposed approach proved reliable and informative in several respects: it allowed a correct estimation of the risk and an investigation of the time trend of subsequent cancer occurrence.
Introduction
During the last decades, improvements in medical and surgical treatments have substantially increased the chances of surviving cancer. Cancer survivors now amount to more than 3.5% of the population in the US [1], and about 3% in Western Europe [2]. More cancer survivors therefore face the problem of subsequent cancers, possibly related to the late effects of treatments or to a common aetiology of first and subsequent cancers. As with other epidemics in the past, the challenge is to direct research towards identifying, and possibly preventing, the factors that increase the chance of developing a second tumour. And, as in the past, the starting point is to estimate correctly the incidence of multiple primary tumours on a population basis.
First of all, there is a problem of differential diagnosis when it comes to distinguishing between local and distant metastases, recurrences and the onset of a truly new lesion. Classifications may also vary, leading to substantial differences in rates. For example, SEER rules [3] differ substantially from those adopted by IARC [4]. Timing of multiple primaries is also important, as they can occur at the same time (synchronous) or after a time lag (metachronous). Synchronous tumours are usually excluded from analyses, in the belief that they represent prevalent silent tumours that came to light during diagnostic procedures.
Secondly, it should be taken into account that the incidence rate of multiple primaries is conditional on the probability of surviving the first tumour long enough to accumulate sufficient time for developing another one. Usually, the statistic studied is the ratio between observed and expected multiple metachronous primary tumours. The expected number is obtained by calculating the person-years observed in the cohort of patients with a first tumour and applying general population incidence rates. In this way, a group of 100 patients with a short survival of 1 year is equivalent to 10 patients surviving for 10 years. However, two groups with such different survival experiences are rarely comparable in other respects, even when their age distributions do not differ. By contrast, when conditioning on survival probabilities, we get the same number of expected cases only when the overall survival is equal to that of the general population. This assumption holds true only for those tumour pairs in which the first tumour has a rather benign course with no substantial influence on overall survival.
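To make the person-years calculation concrete, the following minimal Python sketch computes expected cases as person-years multiplied by a general-population rate. The function name, the single age class and the rate of 0.002 per person-year are illustrative assumptions, not values from this study.

```python
def expected_cases_person_years(person_years_by_age, rate_by_age):
    """Expected cases = sum over age classes of person-years x population rate."""
    return sum(person_years_by_age[age] * rate_by_age[age]
               for age in person_years_by_age)

# Hypothetical general-population incidence rate (per person-year) for one age class.
population_rate = {"55-64": 0.002}

# 100 patients followed for 1 year each versus 10 patients followed for 10 years each.
short_survivors = expected_cases_person_years({"55-64": 100 * 1.0}, population_rate)
long_survivors = expected_cases_person_years({"55-64": 10 * 10.0}, population_rate)
```

Both groups contribute 100 person-years and therefore receive the same expected number of cases, which is exactly the equivalence that the person-years approach imposes regardless of how different the two survival experiences are.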
Some of these aspects are not new in the literature: they were discussed when deriving the expected number of deaths (or events) for the SMR. Keiding offered a historical perspective on this [5], also showing how one of the oldest statistical techniques was connected to conditional survival probabilities and parametric models. The estimation of the expected number of subsequent cancers adds some complications to the traditional model and should be approached in the framework of competing risks, as many subjects are withdrawn from the population at risk over time by death or censoring. Previous work had already shown that the traditional Kaplan-Meier estimator is inappropriate in the presence of competing risks [6], but until now a correct approach taking competing risks into consideration has not been applied to the estimation of the expected number of multiple tumours. In addition, it is also important to consider the time dimension, as subsequent tumours are often more frequent in the first years and then decrease, with a later rise after five to eight years depending on the tumour type [7, 8].
These aspects of the problem of estimating the probability of subsequent tumour occurrence in the presence of time-varying rates led us to consider a non-parametric approach based on multi-state models, which can appropriately describe situations where there are several competing outcomes in a time process. Various types of multi-state models have been proposed for analysing multiple endpoints in different situations, from transplants to clinical trials and from pregnancy-birth models to infectious disease epidemics (for a review see [9]). Indeed, we made use of a stochastic process for estimating the risk of developing a subsequent cancer, following the enlightening suggestions offered by Aalen and Gjessing in their work [10].
Methods
Statistical methods
We applied Markov theory to the process followed by the cohort of patients with a first tumour, with two irreversible and mutually exclusive outcomes: death and occurrence of a second tumour. To calculate transition intensities in this model it was also necessary to account for censored observations. We estimated transition intensities with Nelson-Aalen estimators; we then calculated the occurrence probabilities of the two events in each time interval (second cancer, P_{12}(0, t); death, P_{13}(0, t)) with the Aalen-Johansen (AJ) method [11] in the framework of a Markov process (for details see Appendix). The proposed model is a simple version of the competing risks model, on which a vast literature already exists (see, for example, Satten and Datta for marginal estimation of multistate models with right-censored data [12]).
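The estimation step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the event coding (0 = censored, 1 = second tumour, 2 = death) and the toy data are hypothetical, and subjects censored at an event time are counted as still at risk at that time.

```python
def aalen_johansen(times, events):
    """Aalen-Johansen estimates for a first tumour with two competing outcomes.

    times  : follow-up time of each subject
    events : 0 = censored, 1 = second tumour, 2 = death (assumed coding)
    Returns a list of (time, P12, P13) at each observed event time.
    """
    event_times = sorted({t for t, e in zip(times, events) if e != 0})
    still_in_state_1 = 1.0   # probability of no event so far
    p12 = 0.0                # cumulative probability of a second tumour
    p13 = 0.0                # cumulative probability of death
    curve = []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        d12 = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d13 = sum(1 for u, e in zip(times, events) if u == t and e == 2)
        # Nelson-Aalen increments d/at_risk, weighted by the probability
        # of still being event-free just before t.
        p12 += still_in_state_1 * d12 / at_risk
        p13 += still_in_state_1 * d13 / at_risk
        still_in_state_1 *= 1.0 - (d12 + d13) / at_risk
        curve.append((t, p12, p13))
    return curve

# Toy cohort of six subjects (hypothetical data).
curve = aalen_johansen([1, 2, 2, 3, 4, 5], [1, 2, 0, 1, 0, 2])
```

In this toy cohort the last subject under observation dies, so the two cumulative probabilities add up to one at the final time point. Unlike one minus the Kaplan-Meier estimate, the second-tumour probability here never counts subjects who have already died as if they could still develop a tumour.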
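The Aalen-Johansen incidence ratio (AJIR) is the ratio between the AJ estimate of the probability of a second cancer over the interval (0, 14) and the cancer occurrence probability in the general population over the same interval.
It must be noted that imposing the same censoring mechanism when calculating expected cases resulted in a less biased estimator, since the same bias originating from the censoring mechanism was at work both in the numerator and in the denominator of the AJIR.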
We calculated 95 percent confidence limits using the AJ variance in formula 7 presented in the Appendix.
Age standardisation
Since rates of first and second tumours strongly depend on age, analysis must be done in age strata or a standardisation procedure must be defined. We pursued both strategies grouping age at breast cancer occurrence in five classes: 0–44, 45–54, 55–64, 65–74, 75+. A standardised AJ estimator for the whole population can be obtained as follows:

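For each age class k we calculated the AJ estimator AJ_{k}(0, t): let N_{k} be the number of subjects in class k at time 0 and set a weight w_{k} = N_{k} / N, where N equals the sum of the N_{k}'s;

define AJ^{stand}(0, t) = \sum_{k} w_{k} AJ_{k}(0, t);

under the assumption that weights are deterministic variables, Var(w_{k} AJ_{k}(0, t)) = w_{k}^{2} Var(AJ_{k}(0, t)),

so, the age classes being independent: Var(AJ^{stand}(0, t)) = \sum_{k} w_{k}^{2} Var(AJ_{k}(0, t)).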
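A minimal Python sketch of this standardisation follows. It assumes the usual direct-standardisation form (weights N_{k} / N, a weighted sum of the class-specific estimates, and a variance equal to the sum of squared weights times the class variances, with independent age classes). The function name and all numbers are hypothetical.

```python
import math

def standardise(class_sizes, class_estimates, class_variances):
    """Weighted average of class-specific estimates with deterministic weights."""
    total = sum(class_sizes)
    weights = [n / total for n in class_sizes]
    estimate = sum(w * p for w, p in zip(weights, class_estimates))
    # Independent classes and fixed weights: variances add with squared weights.
    variance = sum(w * w * v for w, v in zip(weights, class_variances))
    return estimate, variance

# Five age classes (0-44, 45-54, 55-64, 65-74, 75+) with made-up inputs.
sizes = [100, 200, 300, 250, 150]
estimates = [0.01, 0.02, 0.03, 0.04, 0.05]
variances = [1e-4, 1e-4, 1e-4, 1e-4, 1e-4]

p_stand, var_stand = standardise(sizes, estimates, variances)
# Normal-approximation 95% limits (an assumption of this sketch).
lower = p_stand - 1.96 * math.sqrt(var_stand)
upper = p_stand + 1.96 * math.sqrt(var_stand)
```

Because the weights are taken from the cohort's own age distribution at time 0, the standardised value summarises the whole cohort while still letting each age class follow its own time pattern.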
Simulation study
We first carried out a simulation study to validate the proposed model. Our aim was to compare the number of second tumours estimated with the AJ estimator with that obtained with the person-year approach. Simulating the process of second tumour occurrence makes this comparison easier, because biasing factors such as censoring can be kept under control. We simulated different dynamics of second tumour occurrence. We considered a simulated cohort of 10000 patients with a first primary, all with the same age and period of incidence, followed up for 10 years. We imposed an exponential survival law with a constant mortality rate of 0.2. Firstly, the occurrence rate of a second primary was kept constant for the whole follow-up period. We compared the simulated number of second tumours with the number estimated by both the person-year and the AJ approach. We let the second tumour incidence rate take the values 0.00025, 0.0005, 0.001, 0.002 and 0.004, corresponding to rate ratios of 0.25, 0.5, 1, 2 and 4. Then, the effect of standardisation by age was investigated by repeating the simulation for the five age classes (each with the same number of subjects), varying the occurrence rates between classes but always keeping them constant for the whole period. Secondly, the simulation was extended to situations where occurrence rates also varied in time: with a constantly decreasing or increasing trend, or in a bimodal way. Age standardisation was then applied to the simulation with bimodal rates.
The simulation engine was based on random chains of multinomial probabilities M(P_{α}, P_{β}, P_{γ}) at each time step t, with the constraint that P_{α} + P_{β} + P_{γ} = 1. P_{α} and P_{β} are normally distributed hyperparameters in a three-class model, representing respectively the probability of transition from the steady state to second tumour and from the steady state to death; P_{γ} = 1 - P_{α} - P_{β} is then the probability of remaining in the steady state.
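The simulation engine described above might be sketched as follows. This is a simplified illustration, not the authors' code: it uses yearly time steps, fixed rather than normally distributed transition probabilities, and derives the per-step probabilities from constant hazards. The function names and the seed are hypothetical; the cohort size, follow-up length, mortality rate of 0.2 and second tumour rate of 0.001 follow the constant-rate setting in the text.

```python
import math
import random

def step_probabilities(tumour_rate, death_rate, step=1.0):
    """Per-step probabilities of second tumour and death for constant hazards."""
    total = tumour_rate + death_rate
    p_leave = 1.0 - math.exp(-total * step)  # probability of leaving the steady state
    return tumour_rate / total * p_leave, death_rate / total * p_leave

def simulate_cohort(n_subjects, n_steps, p_tumour, p_death, seed=0):
    """One multinomial draw per subject still in the steady state at each step."""
    rng = random.Random(seed)
    at_risk = n_subjects
    tumours = 0
    deaths = 0
    for _ in range(n_steps):
        new_tumours = 0
        new_deaths = 0
        for _ in range(at_risk):
            u = rng.random()
            if u < p_tumour:
                new_tumours += 1
            elif u < p_tumour + p_death:
                new_deaths += 1
        tumours += new_tumours
        deaths += new_deaths
        at_risk -= new_tumours + new_deaths
    return tumours, deaths, at_risk

p_tumour, p_death = step_probabilities(tumour_rate=0.001, death_rate=0.2)
tumours, deaths, survivors = simulate_cohort(10000, 10, p_tumour, p_death)
```

Averaging the simulated tumour count over many seeds gives the reference value against which an estimator's count can be compared. Time-varying settings would replace the fixed `p_tumour` with one value per step.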
Subjects
After the simulation study, we applied the Aalen-Johansen model to the incidence data from the Piedmont Cancer Registry (RTP). RTP has collected all incident tumours in the resident population of Turin, Italy (about one million inhabitants) since 1985. We selected all first occurrences of breast cancer (following IARC rules, cases occurring in the contralateral breast were excluded). We included all cases diagnosed up to 1998 and then prolonged the observation period for detecting subsequent tumours up to the end of 2000, allowing a reasonable amount of follow-up time for the last incident cases as well. We excluded all cases diagnosed by Death Certificate Only (DCO), skin cancers other than melanoma and all synchronous tumours.
Results
Simulation study
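Table 1. Simulation Study, I (columns give the constant second tumour incidence rate)

| Averages over 1000 simulation runs | 0.00025 | 0.0005 | 0.001 | 0.002 | 0.004 | Age-stand. |
|---|---|---|---|---|---|---|
| Number of simulated cases | 13.8 | 27.6 | 55.1 | 109.34 | 218.5 | 82.5 |
| Number of estimated cases (Aalen-Johansen) | 13.0 | 26.0 | 53.0 | 104.0 | 208.0 | 78.6 |
| Bias % | 5.80 | 5.80 | 3.81 | 4.88 | 4.81 | 4.75 |
| Number of estimated cases (person-year) | 11.1 | 22.3 | 44.4 | 88.5 | 175.4 | 66.5 |
| Bias % | 19.25 | 19.34 | 19.34 | 19.06 | 19.73 | 19.42 |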
Table 2. Simulation Study, II

| Averages over 1000 simulation runs | Constantly decreasing^{1} | Constantly increasing^{2} | Bimodal^{3} | Age-stand.^{4} |
|---|---|---|---|---|
| Number of simulated cases | 83.95 | 52.81 | 59.41 | 88.56 |
| Number of estimated cases (Aalen-Johansen) | 85.00 | 45.00 | 57.00 | 84.42 |
| Bias % | 1.25 | 14.79 | 4.06 | 4.67 |
| Number of estimated cases (person-year) | 66.46 | 66.72 | 53.31 | 84.19 |
| Bias % | 20.83 | 26.34 | 10.26 | 4.93 |

^{1} rates = 0.002; 0.0018; 0.00165; 0.0015; 0.00135; 0.0012; 0.001; 0.0008; 0.00065; 0.0005
^{2} rates = 0.0005; 0.00065; 0.0008; 0.001; 0.0012; 0.00135; 0.0015; 0.00165; 0.0018; 0.002
^{3} rates = 0.001; 0.002; 0.001; 0.0005; 0.0005; 0.0005; 0.001; 0.002; 0.001; 0.0005
^{4} rates for age groups = annual rates by increasing Relative Risks {0.25; 0.5; 1; 2; 4}
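The bias percentages reported for the simulations appear to be the absolute difference between simulated and estimated cases, relative to the simulated cases. This reading is an inference from the tabulated values rather than a stated definition, and the function name below is hypothetical.

```python
def bias_percent(simulated_cases, estimated_cases):
    """Absolute relative difference between estimate and simulated truth, in %."""
    return abs(simulated_cases - estimated_cases) / simulated_cases * 100.0

# Aalen-Johansen columns: rate 0.00025, constantly decreasing, constantly increasing.
aj_constant = round(bias_percent(13.8, 13.0), 2)
aj_decreasing = round(bias_percent(83.95, 85.00), 2)
aj_increasing = round(bias_percent(52.81, 45.00), 2)
```

Because the measure is unsigned, it does not show the direction of the error: with constantly increasing rates the AJ estimate falls below the simulated count, while the person-year estimate exceeds it.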
Risk of a second tumour following breast cancer
Table 3. Number of observed second tumours in a cohort of women with breast cancer in Turin (Italy), and AJIR^{stand}

| Cancer Site | Observed cases | AJIR^{stand} (95% C.L.) |
|---|---|---|
| Mouth Pharynx | 7 | 0.80 (0.39–1.47) |
| Oesophagus | 5 | 2.38 (0.92–5.01) |
| Stomach | 29 | 1.38 (0.97–1.93) |
| Colon-Rectum | 66 | 0.87 (0.69–1.10) |
| Liver | 7 | 0.61 (0.23–1.30) |
| Gallbladder | 8 | 0.78 (0.38–1.40) |
| Pancreas | 20 | 1.39 (0.89–2.07) |
| Lung | 24 | 0.80 (0.54–1.15) |
| Melanoma | 14 | 1.15 (0.65–1.88) |
| Cervix uteri | 9 | 0.68 (0.34–1.22) |
| Corpus uteri | 54 | 1.91 (1.47–2.44) |
| Ovary | 24 | 1.12 (0.74–1.64) |
| Bladder | 14 | 0.74 (0.42–1.21) |
| Brain & NS | 4 | 0.56 (0.18–1.31) |
| Thyroid | 11 | 1.00 (0.46–1.89) |
| NHL | 21 | 1.32 (0.85–1.94) |
| Leukaemias | 9 | 0.81 (0.39–1.49) |
| Other & unspecified | 27 | 0.48 (0.33–0.68) |
| Total (breast and skin excluded) | 353 | 0.99 (0.91–1.10) |
A significant risk increase was observed only for corpus uteri cancer, while the only cancer site with a statistically significant reduced risk was "other and unspecified". In addition, AJIR^{stand} showed a suggestive risk increase for cancers of the oesophagus, stomach and pancreas, and for non-Hodgkin lymphoma, although confidence limits still included unity. In particular, the increased risk of a subsequent cancer located in the anatomical sites of oesophagus, stomach, lung or thyroid was suggestive of a late effect of local radiotherapy of the breast tumour. Grouping these cancers together, we observed a total of 69 patients, and the estimated AJIR^{stand} was 1.15 (95% C.L.: 1.04–1.28).
Table 4. Mean time (years) to occurrence of a subsequent primary cancer in a cohort of women with breast cancer, by age group

| Subsequent Cancer Site | Age 0–44 | Age 45–54 | Age 55–64 | Age 65–74 | Age 75+ | Age-Standardised |
|---|---|---|---|---|---|---|
| All Sites | 5.10 | 5.14 | 5.36 | 5.49 | 6.17 | 5.43 |
| Corpus uteri | 2.03 | 4.99 | 4.85 | 5.52 | 7.69 | 5.27 |
| Cancers related to radiotherapy |  | 7.33 | 5.21 | 5.81 | 6.96 | 6.07 |
Discussion
The occurrence of subsequent primary tumours can be due to several factors. Subsequent malignancies can initially result from intense clinical surveillance after the first tumour; they can occur later on, as therapies for the first primary can induce carcinogenesis. Finally, they can also be due to shared risk factors, including environment, lifestyle and inherited genes predisposing to higher susceptibility. However, the high fatality of several cancers hinders the possibility of observing subsequent events, even if their probability is considerably increased. Following the suggestions of Hougaard [9], we applied a simple Markov model for competing risks and studied how the transition probabilities from first to second tumour vary in time. The observed time trend of second primary occurrence is often not constant, with two or more waves of increased risk during the observed period. For this reason we resorted to a non-parametric approach, directly calculating AJ estimators. Other possible parametric or non-parametric approaches could be based on a piecewise constant hazard function [13], on stratification by age or other covariates when a proportional hazards model cannot be used [14], or on multivariate state-space models [15]. However, the lack of clinical information and the shortness of time series in population-based series of disease occurrence, as in the case of cancer registry data, hinder the possibility of fully exploiting the power of more complex models.
The simulation showed that AJ estimators led to less biased estimates than the person-year method. This is essentially because the AJ estimators are built by taking into consideration, in both numerator and denominator, the exact number of transitions and the person-time at risk in each time interval. By contrast, the person-year method calculates denominators only at the end of the observation period. Moreover, AJ estimators not only give a more precise result at the end of the period, but also describe the full probability trend over time. When the incidence rate was kept constant over the period, the person-year approach led to a larger bias, underestimating the number of events. The AJ estimator also had some limitations; however, its bias was within the 5% error probability, as shown by the simulations in Tables 1 and 2.
The situation is even more complicated with varying rates. The largest bias was seen with constantly increasing rates. In this case, the person-year approach gave a larger than simulated number of events, while the AJ estimator underestimated the overall number of events, although to a lesser extent. This limitation is due to the unavoidable introduction of discrete time intervals in the analysis of an intrinsically continuous dimension. In real situations a constantly increasing occurrence rate is quite uncommon; usually an early increase in risk followed by a decrease, or a late peak of incidence, is observed. We therefore concluded that we could apply the method of AJ estimators for analysing the subsequent occurrence of cancers after a primary breast cancer.
Applying the method to the Turin data, we observed an increased risk of cancer of the corpus uteri, as also observed in other studies [16–20], although with lower values; this is usually attributed to shared risk factors such as low parity, early menarche and late onset of menopause. The most remarkable finding of our study was an increased risk of oesophagus, stomach, lung and thyroid cancer.
Indeed, we observed a significantly increased risk when grouping together those cancers possibly associated with previous local radiotherapy: the AJ_{12}(s, t) estimators showed a significant risk peak between the eighth and the ninth year. This was suggestive of a late effect of local radiotherapy of the breast tumour. Other studies [18, 20], based on larger populations and longer follow-up, observed the association of breast cancer with oesophagus, stomach, lung and thyroid cancers. On the other hand, treatment for early-stage invasive breast cancer shifted in the 1990s from radical mastectomy, substantially without regional radiotherapy, to increasing use of breast-conserving surgery followed by breast radiation (post-lumpectomy radiation [21]). In conclusion, the proposed approach has been shown to be valid and informative in several respects. It allowed a reliable estimate of the number of events, conditional on observed survival. It also allowed description of the changing pattern of risk over time. Future developments of the method should be directed to the parametric modelling of transition probabilities, also in relation to clinical or epidemiological explanatory variables.
Appendix
Markov models and Aalen-Johansen estimators
Markov models deal with situations where individuals can belong to a finite set of states and move from one state to another with a probability that may depend on time. The main hypothesis (the Markov assumption) is that the probability of moving from state i to state j at time t depends only on i, j and t and not on the previous states.
For a complete reference see the book on statistical models based on counting processes by Andersen et al [11].
Application to multiple primary tumours
We assumed that the starting time, 0, is the time of diagnosis of the first tumour for each individual.
The states of the model are:
 1: first tumour
 2: second tumour
 3: death after a first (but not a second) tumour
where 2 and 3 are absorbing states and the possible moves are 1 → 2 and 1 → 3.
In order to fit our situation into a Markov model, we need to make sure that every individual goes through at most one move in every time unit. This is the mathematical reason for eliminating all synchronous situations.
where is the matrix obtained by formula 2 with the estimated values.
The AJ estimators are consistent and valid also with right censoring and when the underlying process is non-Markovian [22].
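For the three-state model above, the standard textbook forms of the Nelson-Aalen and Aalen-Johansen estimators can be written as follows. This is a sketch in assumed notation, not a reproduction of the numbered formulas of this Appendix: Y_1(t_i) denotes the number of subjects still in state 1 just before the event time t_i, and d_{1j}(t_i) the number of observed moves from state 1 to state j at t_i.

```latex
% Assumed notation: Y_1(t_i) = subjects in state 1 just before t_i;
% d_{1j}(t_i) = observed 1 -> j transitions at t_i, with j = 2, 3.
\begin{align}
\hat{A}_{1j}(t) &= \sum_{t_i \le t} \frac{d_{1j}(t_i)}{Y_1(t_i)},
  \qquad j = 2, 3 \\
\hat{P}_{11}(s, t) &= \prod_{s < t_i \le t}
  \left( 1 - \frac{d_{12}(t_i) + d_{13}(t_i)}{Y_1(t_i)} \right) \\
\hat{P}_{1j}(s, t) &= \sum_{s < t_i \le t}
  \hat{P}_{11}(s, t_i^{-}) \, \frac{d_{1j}(t_i)}{Y_1(t_i)},
  \qquad j = 2, 3
\end{align}
```

The first line is the cumulative transition intensity, the second the probability of remaining in state 1, and the third the probability of having moved to the second tumour (j = 2) or to death (j = 3) by time t. Since states 2 and 3 are absorbing, the three probabilities out of state 1 sum to one.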
Declarations
Acknowledgements
This research was partly funded by Regione Piemonte – Ricerca Sanitaria Finalizzata (years 2004 and 2006).
We express thanks to the researchers and professors of the Me.Ri.Ma. group of the University of Turin (Department of Mathematics) who shared their ideas with us and gave us their time and comments. We also thank Federica and Simona Gallo for their editorial assistance.
Authors’ Affiliations
References
 1. Ganz P: A teachable moment for oncologists: cancer survivors, 10 million strong and growing. J Clin Oncol 2005, 23(24): 5468-5460. 10.1200/JCO.2005.04.916
 2. Ferlay J, Bray F, Pisani P, Parkin D: GLOBOCAN 2002: Cancer Incidence, Mortality and Prevalence Worldwide. IARC CancerBase No. 5, version 2.0. Lyon: IARCPress, LippincottRaven; 2004.
 3. Johnson C: The SEER coding and staging manual 2004. Fourth edition. NIH Pub No 045581, Bethesda, MD: National Cancer Institute; 2004.
 4. IARC/IACR: International rules for multiple primary cancer (ICD-O Third edition). Volume 2004/02 of Internal Report. Lyon: IARC; 2004.
 5. Keiding N: The method of expected number of deaths 1786-1886-1986. Int Stat Rev 1987, 55(1): 1-20.
 6. Gooley T, Leisenring W, Crowley J, Storer B: Estimation of failure probabilities in the presence of competing risks: new representations of old estimators. Statistics in Medicine 1999, 18: 695-706. 10.1002/(SICI)10970258(19990330)18:6<695::AIDSIM60>3.0.CO;2O
 7. Crocetti E, Buiatti E, Falini P, the Italian Multiple Primary Cancer Working Group: Multiple primary cancer incidence in Italy. Eur J Cancer 2001, 37: 2449-2456. 10.1016/S09598049(01)003148
 8. Filali K, Hédèlin G, Shaffer P, Estève J, Arveux P, Bouchardy C, Exbrayat C, Faivre J, Levi F, Macé-Lesech J, Pottier D, Torhorst J: Multiple primary cancers and estimation of the incidence rates and trends. Eur J Cancer 1996, 32A: 683-690. 10.1016/09598049(95)006214
 9. Hougaard P: Multi-state models: a review. Lifetime Data Anal 1999, 5: 239-264. 10.1023/A:1009672031531
 10. Aalen O, Gjessing H: Understanding the shape of the hazard rate: a process point of view. Stat Sci 2001, 16: 1-14.
 11. Andersen P, Borgan O, Gill R, Keiding N: Statistical models based on counting processes. Springer Series in Statistics, Springer-Verlag; 1993.
 12. Satten G, Datta S: Marginal estimation for multistage models: waiting time distributions and competing risks analyses. Statistics in Medicine 2002, 21: 3-19. 10.1002/sim.967
 13. Gijbels I, Gurler U: Estimation of a change point in a hazard function based on censored data. Lifetime Data Anal 1996, 9: 395-411. 10.1023/B:LIDA.0000012424.71723.9d
 14. Prentice R, Williams B, Peterson A: On the regression analysis of multivariate failure time data. Biometrika 1981, 68: 373-379. 10.1093/biomet/68.2.373
 15. Durbin J, Koopman SJ: Time series analysis by state space methods. Volume 24 of Oxford Statistical Science Series. New York: Oxford University Press; 2001.
 16. Teppo L, Pukkala E, Saxen E: Multiple cancers – an epidemiological exercise in Finland. J Natl Cancer Inst 1985, 75: 207-217.
 17. Volk N, Pompe-Kirn V: Secondary primary cancers in breast cancer patients in Slovenia. Cancer Causes Control 1997, 8: 764-770. 10.1023/A:1018487506546
 18. Evans H, Lewis C, Robinson D, Bell C, Møller H, Hodgson S: Incidence of multiple primary cancers in a cohort of women diagnosed with breast cancer in southeast England. Br J Cancer 2001, 84: 435-440. 10.1054/bjoc.2000.1603
 19. Soerjomataram I, Louwman W, de Vries E, Lemmens V, Klokman W, Coebergh J: Primary malignancy after primary female breast cancer in the South of the Netherlands 1972 – 2001: a population-based longitudinal study. Breast Cancer Treat 2005, 93: 91-95. 10.1007/s1054900540162
 20. Curtis R, Ron E, Hankey B, Hoover R: New malignancies following breast cancer. In New malignancies in cancer survivors: SEER registries, 1973 – 2000. Edited by: Curtis R, Freedman D, Ron E, Ries L, Hacker D, Edwards B, Tucker M, Fraumeni J. National Cancer Institute, Bethesda, MD; 2006: 181-205.
 21. Veronesi U, Boyle P, Goldhirsch A, Orecchia R, Viale G: Breast cancer. Lancet 2005, 365: 1727-1741. 10.1016/S01406736(05)665464
 22. Datta S, Satten G: Validity of the Aalen-Johansen estimators of stage occupation probabilities and Nelson-Aalen estimators of integrated transition hazards for non-Markov models. Statist Probab Lett 2001, 55: 403-411. 10.1016/S01677152(01)001559
Copyright
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.