The originality of the methodology reported herein resides mainly in its ability to detect jumps automatically using the Polydect method, without a priori knowledge or visual inspection of jump positions. In addition, application of the method to a large dataset is less time-consuming and less human-dependent than any other known method.
Some methodological choices were made, such as the choice of a linear kernel smoother and the choice of the probability α of detecting spurious jumps. Using a constant kernel smoother or different values of α only slightly affected the final set of detected jumps, and only for time series in which the jump amplitudes were of an order comparable to that of the overall noise of the time series. Choosing a low value of α ensured better accuracy in the estimation of jump amplitude, which is more statistically stable when the jump is of much larger amplitude than the overall noise of the time series. According to the visual inspection of time series graphs and comparable bridge coding results, the jump amplitude estimates were reliable enough to be used in subsequent analyses.
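The idea of comparing one-sided linear fits on either side of a candidate position can be illustrated with a minimal sketch. This is not the Polydect procedure itself: it uses unweighted least-squares lines instead of a kernel smoother, and the function names, the window half-width `h`, and the plain numeric `threshold` are all assumptions introduced for illustration. In practice the threshold would presumably be derived from α and the noise level of the series.

```python
def fit_line(xs, ys):
    """Ordinary least-squares line through (xs, ys); returns (intercept, slope)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def one_sided_gap(y, k, h):
    """Gap between right-hand and left-hand linear fits, evaluated midway
    between observations k-1 and k (hypothetical helper; requires h >= 2)."""
    xs_left = list(range(k - h, k))
    xs_right = list(range(k, k + h))
    a_left, b_left = fit_line(xs_left, [y[i] for i in xs_left])
    a_right, b_right = fit_line(xs_right, [y[i] for i in xs_right])
    x0 = k - 0.5
    return (a_right + b_right * x0) - (a_left + b_left * x0)


def largest_jump(y, h, threshold):
    """Return (position, estimated amplitude) of the largest gap, or None
    if no gap reaches the threshold (an illustrative stand-in for alpha)."""
    gaps = {k: one_sided_gap(y, k, h) for k in range(h, len(y) - h + 1)}
    k_best = max(gaps, key=lambda k: abs(gaps[k]))
    if abs(gaps[k_best]) >= threshold:
        return k_best, gaps[k_best]
    return None


# Synthetic, noise-free series: linear trend with a sustained jump of 10 at index 10.
series = [2 * t + (10 if t >= 10 else 0) for t in range(20)]
print(largest_jump(series, h=4, threshold=5.0))
```

On this synthetic series the largest gap is located at index 10 with an amplitude of approximately 10. With noisy data, the estimate becomes less stable as the jump amplitude approaches the noise level, which is consistent with the sensitivity to methodological choices described above.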
The codes used in this study to characterize the conditions were not chosen to be used in all contexts. Indeed, they were allocated with the constraint of being comparable between three versions of the ICD and based on three-digit codes. Taking each ICD individually would certainly have led us to select other codes.
The method is designed to detect sustained jumps. Therefore, it is not sensitive to the occurrence of one-year outliers in time series data and does not require handling them separately, unlike other methods.
However, the proposed method is not able to detect and correct for nonabrupt data production changes. For example, if a new death certificate form, impacting certification practice and final coding, spread slowly through the population (as was the case in France between 1997 and 1999), the impact on yearly death counts would occur over several years. To the authors' knowledge, however, no general method is able to correct data production changes that are spread over time.
When comparable, the multiplicative factors obtained from bridge coding studies and time series methods were similar [11, 15–17, 37, 38].
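As a worked illustration of how the two kinds of multiplicative factor can be put side by side, the sketch below computes a bridge coding ratio and a time series jump factor, then rescales a series. All numbers are made up, the function names are hypothetical, and the choice to rescale the years before the change (rather than after) is an assumption made for the example, not a statement of the procedure used in this study.

```python
def bridge_ratio(deaths_new_coding, deaths_old_coding):
    """Ratio of deaths assigned to a cause when the same certificates are
    coded under the new process versus the old one."""
    return deaths_new_coding / deaths_old_coding


def jump_factor(level_before, level_after):
    """Multiplicative factor implied by the estimated levels of the time
    series just before and just after a detected jump."""
    return level_after / level_before


def rescale_before_change(counts, change_index, factor):
    """Bring the years before the change onto the post-change level
    (assumed direction of correction, for illustration only)."""
    return [c * factor if i < change_index else c for i, c in enumerate(counts)]


# Illustrative values only.
ratio_from_bridge = bridge_ratio(1100, 1000)
factor_from_series = jump_factor(200.0, 220.0)
print(ratio_from_bridge, factor_from_series)

counts = [100, 100, 120, 120]
print(rescale_before_change(counts, change_index=2, factor=1.2))
```

When both factors are available for the same cause and the same production change, their agreement gives a direct check of the time series estimate against the bridge coding result.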
The purpose of this article is not to challenge bridge coding studies. However, bridge coding studies are not implemented in all countries, and it would be very difficult and costly to do so retrospectively for every data production change. The time series analysis methods proposed herein provide a reliable way of correcting data production changes affecting death count time trends.
Given the indirect manner in which data production changes are identified, the method requires feedback from data producers in order to confirm and explain the plausibility of the changes. Without that additional information, the automatic method would blindly correct any detected jumps, some of which may be related to real abrupt and sustained variations in the mortality risk. However, it is not always straightforward for a data producer to obtain a broad overview of past coding processes in the producer's country. The reasons for the occurrence of some of the oldest jumps may have been lost. Therefore, the decision whether to correct for a detected jump that is not confirmed by the data producer will depend on how confident one is that the jump reflects a production change rather than a real change in mortality risk.
Some jumps are of great amplitude (e.g., rheumatic heart disease). This may be observed when the cause considered is highly likely to be the result of other causes [10, 23]. In that case, the death count time trend is very sensitive to changes in coding rules (e.g., ICD-10 rule 3). However, the absence of high-amplitude jumps is not sufficient to ensure the interpretability of time trends. Time trends for some conditions, such as hypertension, heart failure, and renal failure, have to be interpreted cautiously. Indeed, the approach chosen was to consider only the underlying cause of death, and these specific causes may be selected as underlying owing to a lack of additional information about the real underlying cause on the death certificate. In these cases, mortality time trends could be influenced by other conditions or by slowly diffused certification changes. A multiple cause approach considering each cause mentioned on the death certificate could yield very different results.
Large jumps may also be observed when a country uses very specific codes. In this study, for practical reasons, it was decided to use the same codes for all the countries. However, the same general method could have been applied to specific codes for each country.
In any event, time trends for causes with large amplitude jumps, even after correction, are to be interpreted with caution.
For some causes, jump amplitude was markedly heterogeneous by age. This result has already been observed in bridge coding studies [11, 16]. It could be attributed to three factors: first, for some causes, the subcause structure differs by age, and each subcause is differentially impacted by the production change; second, mortality at older ages is more frequently associated with multiple pathologies, and the selection of one of these as the underlying cause may change with the coding rules; third, in certain cases, the same death certificate may be interpreted differently depending on the age of the deceased, and this difference may also depend on the coding rules used.