Systematic review of general burden of disease studies using disability-adjusted life years

Objective To systematically review the methodology of general burden of disease studies. Three key questions were addressed: 1) what was the quality of the data, 2) which methodological choices were made to calculate disability adjusted life years (DALYs), and 3) were uncertainty and risk factor analyses performed? Furthermore, DALY outcomes of the included studies were compared. Methods Burden of disease studies (1990 to 2011) in international peer-reviewed journals and in grey literature were identified with main inclusion criteria being multiple-cause studies that quantified the burden of disease as the sum of the burden of all distinct diseases expressed in DALYs. Electronic database searches included Medline (PubMed), EMBASE, and Web of Science. Studies were collated by study population, design, methods used to measure mortality and morbidity, risk factor analyses, and evaluation of results. Results Thirty-one studies met the inclusion criteria of our review. Overall, studies followed the Global Burden of Disease (GBD) approach. However, considerable variation existed in disability weights, discounting, age-weighting, and adjustments for uncertainty. Few studies reported whether mortality data were corrected for missing data or underreporting. Comparison with the GBD DALY outcomes by country revealed that for some studies DALY estimates were of similar magnitude; others reported DALY estimates that were two times higher or lower. Conclusions Overcoming “error” variation due to the use of different methodologies and low-quality data is a critical priority for advancing burden of disease studies. This can enlarge the detection of true variation in DALY outcomes between populations or over time.


Introduction
The burden of disease concept provides a conceptual and methodological framework to quantify and compare the health of populations using a summary measure of both mortality and disability: the disability-adjusted life year (DALY) [1,2]. Since the launch of the Global Burden of Disease (GBD) study in 1993, the burden of disease concept has been widely adopted by countries and health development agencies alike to identify the relative magnitude of different health problems. This information serves as crucial input for debates about priorities in the health sector.
Criticism of the GBD study focused on the construction of DALYs [3,4], particularly the social choices around age weights and severity scores of disabilities. The GBD 2010 Study that is currently being conducted responded to the critiques and recent improvements in the field and includes significantly improved methods for burden assessment, particularly for ranking risk factors and disabilities [5,6]. It is expected that the imminent publication of the GBD 2010 Study will result in a new impulse to perform burden of disease studies.
A major strength of the burden of disease concept is that it allows comparison between different health problems, between different years, and between countries. In principle, the DALY approach should be used consistently to provide comparable DALY estimates. However, the technical approach of the GBD is complex, both in concept and in application, and there are many methodological alternatives, e.g., using alternative morbidity estimates, life expectancies, or severity weights, which have enormous influence on DALY outcomes [7]. Hence, the interpretation of results of burden of disease studies requires detailed methodological knowledge.
General burden of disease studies are multiple-cause studies that quantify the burden of disease as the sum of the burden of all distinct diseases expressed in DALYs. Until now, a systematic review of general burden of disease studies and the underlying methodological choices has not been conducted. This review was a first step in the development of a protocol specifically for burden of foodborne disease studies. This protocol complements the GBD manual, as it addresses problems that arise particularly when undertaking foodborne burden of disease studies. The protocol was developed for researchers that aim to undertake burden of foodborne disease studies in the framework of the Foodborne Disease Burden Epidemiology Reference Group (FERG). The FERG was established in 2007 by the World Health Organization (WHO). The purpose of the group is to advise WHO in their estimates of the global burden of diseases commonly transmitted through food.
This systematic review aims to provide an overview of the methodology of general burden of disease studies using the DALY approach. In the review, the following key questions were addressed: 1) what was the quality of the data and were there any data gaps, 2) which methodological choices were made in order to calculate years of life lost due to mortality (YLL) and years lost due to disability (YLD), and 3) which methods were used to handle uncertainty and risk factor analysis. Furthermore, DALY outcomes for specific disease and injury groups resulting from the general burden of disease studies were compared.

Selection criteria
In this review, burden of disease studies based on general multiple-cause studies (including all diseases and injuries) were included. Empirical studies in international peer-reviewed journals and grey literature published in English in the period 1990 to 2011 were included. Studies from both established market economies and low- and middle-income countries were included. The review is restricted to studies using the DALY as a burden of disease measure, both country-specific and worldwide.

Disability-adjusted life year
The DALY is calculated by adding YLL to morbidity and disability, expressed in YLD. The DALY methodology is represented in a conceptual framework in Figure 1.
YLL is calculated by summation of the number of fatal cases (d) due to health outcome (x) in a certain period multiplied by the residual life expectancy (e) at the age of death:
$$\mathrm{YLL} = \sum d_x \times e_x$$
For the calculation of YLD, an incidence- or prevalence-based approach can be used; the choice is highly dependent on the availability of data. The incidence-based approach quantifies both the burden of disease occurring during the reference period and the burden accrued into the future. A prevalence-based approach ascribes burden to the age at which disability is lived. YLD inc is calculated by multiplying the number of incident cases (I) at a certain age with health outcome (x) by the duration of the health outcome (t) and the disability weight (dw) assigned to health outcome x:
$$\mathrm{YLD}^{inc}_x = I_x \times t_x \times dw_x$$
YLD prev is calculated by multiplying the number of prevalent cases (P) in age group (x) at a point in the reference period by the disability weight (dw) assigned to health outcome x:
$$\mathrm{YLD}^{prev}_x = P_x \times dw_x$$
These basic formulas can be extended according to methodological choices (e.g., by adding a discount factor and age-weighting).
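The basic formulas above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure of any reviewed study: the `HealthOutcome` class, the function names, and all input numbers are hypothetical, and the sketch assumes a single average residual life expectancy per health outcome, with no discounting or age-weighting.

```python
from dataclasses import dataclass


@dataclass
class HealthOutcome:
    """Inputs for one health outcome x (all example values are made up)."""
    deaths: float             # d_x: fatal cases in the reference period
    life_expectancy: float    # e_x: residual life expectancy at age of death (years)
    incident_cases: float     # I_x: incident cases in the reference period
    duration: float           # t_x: duration of the health outcome (years)
    disability_weight: float  # dw_x: 0 = full health, 1 = equal to death


def yll(outcome: HealthOutcome) -> float:
    """Years of life lost: d_x * e_x."""
    return outcome.deaths * outcome.life_expectancy


def yld_incidence(outcome: HealthOutcome) -> float:
    """Incidence-based years lost due to disability: I_x * t_x * dw_x."""
    return outcome.incident_cases * outcome.duration * outcome.disability_weight


def yld_prevalence(prevalent_cases: float, disability_weight: float) -> float:
    """Prevalence-based years lost due to disability: P_x * dw_x."""
    return prevalent_cases * disability_weight


def total_daly(outcomes: list) -> float:
    """DALY = YLL + YLD, summed over all health outcomes (incidence-based)."""
    return sum(yll(o) + yld_incidence(o) for o in outcomes)


# Hypothetical inputs for two health outcomes.
outcomes = [
    HealthOutcome(deaths=10, life_expectancy=30.0,
                  incident_cases=200, duration=0.5, disability_weight=0.2),
    HealthOutcome(deaths=0, life_expectancy=0.0,
                  incident_cases=1000, duration=0.1, disability_weight=0.05),
]

# 10 * 30 = 300 YLL, plus 200 * 0.5 * 0.2 + 1000 * 0.1 * 0.05 = 25 YLD.
print(total_daly(outcomes))
```

In a full analysis, deaths and incident cases would be stratified by age and sex, and the life expectancy would be taken from a life table for each age at death.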

Data sources and search strategy
Searches for eligible studies were conducted in Medline (PubMed), EMBASE, and Web of Science. Searches for eligible grey literature were conducted in Google Scholar and SIGLE (System for Information on Grey Literature in Europe). All international peer-reviewed articles and grey literature published in English in the period January 1990 to 2011 were included in the searches. Search terms used for general burden of disease studies were: "burden of disease," "disability adjusted life year," "disability-adjusted life year," "DALY." Keywords were matched to database-specific indexing terms. In addition to database searches, reference lists of review studies and articles included in the review were screened for titles that included key terms.

Data extraction
Relevant papers were selected by screening the titles (first step), abstracts (second step), and entire articles (third step) retrieved through the database searches. During each step respectively, the title, abstract, or entire article was screened to ensure that it met the selection criteria listed above. This screening was conducted independently by two researchers (Suzanne Polinder and Juanita Haagsma). Disagreement about eligibility between the reviewers was solved through discussion.
Full articles were critically appraised by two reviewers (Suzanne Polinder and Juanita Haagsma), using data extraction forms, which included information on the study population, details regarding the methods used to calculate YLL and YLD, risk factor analysis, main conclusions, and evaluation of results. Their reports were compared and disagreements were resolved by discussion.

Literature search
Figure 2 shows the flow diagram of the search for existing burden of disease studies and the main reasons for exclusion. Eventually, 31 studies were included in the review (Table 1). Figure 3 shows the number of general burden of disease studies per WHO region. Four studies were worldwide burden of disease studies [8-11].

What was the quality of the data and were there any data gaps?
The availability and quality of mortality and morbidity data differ strongly by country.
Most countries register the number of fatal cases as well as the age and cause of death in national vital registrations (see Table 1). Vital registration often had full coverage, which means that the data were representative of the population of these countries.
Where vital registration had less than 100% coverage, it is important to know whether the data had been extrapolated to 100%. However, the majority of the studies did not report whether the death statistics used had 100% coverage. Only 10 studies reported that they corrected for underreporting of death statistics [8,10-18], for instance with demographic projection models [14], or by using the average of several years of death statistics to minimize stochastic variation [10,18].
In addition to extrapolation in case of missing data, procedures such as reallocation of ill-defined deaths from so-called "garbage codes" may be necessary. Problems can arise from the routine use of specific codes in the International Classification of Diseases (ICD) list where information is incomplete. This can occur when medical records are not fully considered or when the medical practitioners concerned with the specific cases are not consulted while the forms are completed. Certain codes then become overused and bias the relative importance attached to particular causes of death. These codes are called "garbage codes." The majority of the studies (n=24; Table 2) reported corrections for ill-defined deaths, partly by reallocation from garbage codes. For instance, the Victorian Burden of Disease study [19] redistributed the cardiovascular garbage codes to ischemic heart disease, inflammatory heart disease, and hypertensive heart disease in proportions that varied by age. Notably, many studies did not report whether mortality data were corrected for missing data, underreporting, or misclassification.
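The reallocation of garbage-coded deaths described above can be illustrated with a proportional redistribution. The sketch below is an assumption about how such a step might look, not the algorithm of the Victorian study or any other reviewed study: the function name, the cause labels, the death counts, and the proportions are all hypothetical, and a real analysis would apply a separate set of proportions to each age group.

```python
def redistribute_garbage_deaths(deaths_by_cause, garbage_cause, proportions):
    """Reallocate deaths coded to `garbage_cause` across target causes.

    `proportions` maps each target cause to its share of the garbage-coded
    deaths; the shares must sum to 1 so that total deaths are preserved.
    """
    if abs(sum(proportions.values()) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")

    result = dict(deaths_by_cause)          # leave the input untouched
    pool = result.pop(garbage_cause, 0.0)   # deaths to be reallocated
    for cause, share in proportions.items():
        result[cause] = result.get(cause, 0.0) + pool * share
    return result


# Hypothetical death counts for one age group.
deaths = {
    "ischemic heart disease": 500.0,
    "hypertensive heart disease": 100.0,
    "inflammatory heart disease": 50.0,
    "ill-defined cardiovascular": 200.0,
}

# Hypothetical redistribution proportions for this age group.
shares = {
    "ischemic heart disease": 0.7,
    "hypertensive heart disease": 0.2,
    "inflammatory heart disease": 0.1,
}

corrected = redistribute_garbage_deaths(deaths, "ill-defined cardiovascular", shares)
print(corrected)
```

Because the shares sum to 1, the total number of deaths is unchanged; only the distribution across causes shifts, which is why this step can alter cause rankings without altering the overall YLL.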
Which methodological choices were made in order to calculate YLL and YLD?

General burden of disease methods
An incidence- or prevalence-based method can be used to quantify the burden of disease. In practice, it is often difficult to rigidly apply the incidence- or prevalence-based approach, and sometimes compromises must be made. Most studies followed an incidence-based approach (n=25, Table 2). For some of these countries, not all incidence-based data could be gathered, and the prevalence-based method was used in part.
Figure 1 Conceptual model.
The GBD developed a list of disease and injury causes based on the ICD. Including all causes avoids the problems of overinclusiveness of single-cause studies and incompatible mortality claims for different causes. Most studies used the GBD disease and injury causes; sometimes causes were removed because of little relevance and other causes were added to the list (Table 2). Three studies developed their own disease and injury groups [20-22].
The original GBD study applied age-weighting and discounting [23]. With age-weighting, the changing levels of dependency with age are taken into account, meaning that years lived at the youngest and oldest ages are given less weight. Discounting means that future life years are assigned less value than those lived today. This is based on the economic concept that immediate profits are generally preferred over benefits later in time [1]. Both age-weighting and discounting have been disputed, as described further in the discussion section. This debate is reflected in the varying use of both methods in the included studies. Almost all studies (n=26) assigned less value to future life years by using a discount factor. However, only half of the studies (n=17) applied age-weighting. It was not always stated whether age-weighting and discounting were used [18,24].

Methods to calculate YLL
[...] [25]. Other studies used life tables from their own country (n=8) [18,21,26-31].

Methods to calculate years lost due to disability (YLD)
A crucial aspect of calculating YLD is the disability weight, a value ranging from 0, indicating full health, to 1, indicating the worst imaginable health state, equal to death. Its value is based on the preferences stated by a panel of judges towards a set of hypothetical health states, expressing the relative undesirability of the health state [32,33]. Several sets of disability weights exist, such as the GBD disability weights [1] and the Dutch Disability Weights (DDW) [34]. Most studies (n=27; Table 2) used the GBD disability weights. Sixteen of these studies combined the GBD weights with the DDW for disease and injury causes that were not in the GBD study. Four studies [14,29,35,36] derived their YLDs by applying the ratio of YLD to YLL from another study, which is common in burden of disease analyses for countries with limited data on disease occurrence [25]. For instance, the Western Australian Burden of Disease study [29] used this method to derive the YLDs for residual conditions that were not specifically analysed but were grouped to complete a broad disease grouping (e.g., other cardiovascular conditions).
Eight studies adjusted for comorbidity [12,13,16,19,20,28,30,31]. The Australian burden of disease studies [13,19,28,30] have developed methods to address the issue of comorbidity for the common coexisting nonfatal conditions (e.g., deafness, osteoarthritis, mental retardation, diabetes). With this method, the difference between a composite weight for two coexisting conditions and the weight for the more severe of the conditions is calculated and used, rather than the weight of the milder condition in its independent state. The disability weight for the more severe condition remains unchanged.
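The comorbidity adjustment described above can be sketched as follows. The text does not state how the composite weight for two coexisting conditions is obtained, so the multiplicative combination used here is an assumption, as are the function names and the example weights.

```python
def composite_weight(dw_a: float, dw_b: float) -> float:
    """Composite disability weight for two coexisting conditions.

    ASSUMPTION: a multiplicative model, 1 - (1 - dw_a) * (1 - dw_b).
    The source does not specify how the composite weight is derived.
    """
    return 1.0 - (1.0 - dw_a) * (1.0 - dw_b)


def comorbidity_adjusted_weights(dw_a: float, dw_b: float):
    """Return (weight of the more severe condition, adjusted weight of the milder one).

    The more severe condition keeps its own weight. The milder condition
    receives the difference between the composite weight and the weight of
    the more severe condition, instead of its independent weight.
    """
    severe = max(dw_a, dw_b)
    adjusted_mild = composite_weight(dw_a, dw_b) - severe
    return severe, adjusted_mild


# Hypothetical weights for two coexisting nonfatal conditions.
severe_dw, mild_dw = comorbidity_adjusted_weights(0.4, 0.2)

# Composite = 1 - 0.6 * 0.8 = 0.52, so the milder condition counts for
# 0.52 - 0.4 = 0.12 rather than its independent weight of 0.2.
print(severe_dw, mild_dw)
```

Under this assumption the two adjusted weights always sum to the composite weight, which stays below 1 and below the simple sum of the independent weights.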
Which methods were used to handle uncertainty and risk factor analysis?

Uncertainty analysis
Each burden of disease study contains uncertainty as a result of possible imprecision in epidemiological data (e.g., deaths, incidence, prevalence, severity), in the parameter values used, or due to methodological controversy. None of the studies quantified uncertainty in epidemiological data. Uncertainty in the parameter values was described by sensitivity analysis in 21 studies (Table 2). These studies tested whether plausible changes in the values of the main variables affected the results of the analysis [1]. Most studies showed how their results varied when the discount rate was changed, and some studies also examined the influence of age-weighting, the effect of uncertainty in the disability weights, and/or uncertainty in the incidence data.
Table footnote: [...] [53], AUdw = Australian NBD disability weights [25], Zimb = disability weights based on preferences of Zimbabwe [41], KORdw = Korean disability weights [24], Estdw = Estonian disability weights [27], EQ5Ddw = EQ5D disability weights [20].
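A one-way sensitivity analysis on the discount rate, as described in the paragraph above, might look like the following sketch. It assumes continuous discounting, under which a stream of L life years lost is valued at (1 - exp(-rL)) / r for discount rate r, and it ignores age-weighting; the function names, the death count, the life expectancy, and the set of rates are hypothetical.

```python
import math


def discounted_yll_per_death(life_expectancy: float, discount_rate: float) -> float:
    """Present value of the life years lost by one death.

    ASSUMPTION: continuous discounting without age-weighting, i.e. the
    integral of exp(-r * t) over t from 0 to L, which equals
    (1 - exp(-r * L)) / r and reduces to L when r is 0.
    """
    if discount_rate == 0:
        return life_expectancy
    return (1.0 - math.exp(-discount_rate * life_expectancy)) / discount_rate


def one_way_sensitivity(deaths: float, life_expectancy: float, discount_rates):
    """Recompute total YLL for each discount rate, holding other inputs fixed."""
    return {
        rate: deaths * discounted_yll_per_death(life_expectancy, rate)
        for rate in discount_rates
    }


# Hypothetical scenario: 100 deaths, each with 30 years of residual life expectancy.
results = one_way_sensitivity(deaths=100, life_expectancy=30.0,
                              discount_rates=(0.0, 0.03, 0.05))

for rate, total_yll in results.items():
    print(f"discount rate {rate:.0%}: {total_yll:.1f} YLL")
```

With these inputs the undiscounted total is 3,000 YLL, and it falls by roughly a third at a 3% rate, which shows why studies that differ only in their discounting choice can report markedly different DALY totals.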

Risk factor analysis
A risk factor is an attribute or exposure that is causally associated with an increased probability of a disease or injury [1]. Regarding the causal attribution of the burden of disease, one can attribute it either to a single cause (categorical attribution) or to a group of causes (counterfactual attribution). The latter can be analyzed using counterfactual analysis. With counterfactual analysis, the current or future disease burden is compared with the burden of disease that would be expected under an alternative hypothetical scenario, the counterfactual scenario, to estimate the effects of disease(s) or risk factor(s) [37]. Twelve studies performed risk factor analyses [8-13,15,19,20,26,28,29]. The risk factors analyzed related to the effect of lifestyle factors (such as tobacco smoking, physical inactivity, alcohol consumption, diet, unsafe sex, and intimate partner violence), physiological states (such as obesity, high blood pressure, and high cholesterol), and societal conditions (such as occupational exposures and air pollution) on the burden of disease.
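As a simple illustration of counterfactual analysis, the sketch below compares the observed burden with the burden expected if nobody were exposed to a risk factor. It assumes a single dichotomous exposure and uses the standard population attributable fraction, p(RR - 1) / (p(RR - 1) + 1); the reviewed studies may have used other counterfactual scenarios or methods, and the function names and input values are hypothetical.

```python
def population_attributable_fraction(prevalence: float, relative_risk: float) -> float:
    """Fraction of the burden that would disappear under zero exposure.

    ASSUMPTION: one dichotomous exposure with exposure prevalence p and
    relative risk RR, and a counterfactual scenario of no exposure.
    """
    excess = prevalence * (relative_risk - 1.0)
    return excess / (excess + 1.0)


def attributable_burden(total_dalys: float, prevalence: float, relative_risk: float) -> float:
    """Observed burden minus the burden expected under the counterfactual scenario."""
    paf = population_attributable_fraction(prevalence, relative_risk)
    return total_dalys * paf


# Hypothetical inputs: 25% of the population exposed, relative risk of 3,
# and 9,000 DALYs observed for the disease of interest.
paf = population_attributable_fraction(prevalence=0.25, relative_risk=3.0)
attributable = attributable_burden(total_dalys=9000.0, prevalence=0.25, relative_risk=3.0)

# Excess = 0.25 * 2 = 0.5, so PAF = 0.5 / 1.5, i.e. one third of the burden.
print(paf, attributable)
```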

Comparison of disability-adjusted life year outcomes for specific disease and injury groups
The total burden due to diseases and injuries varies enormously between the included studies. The highest burden of disease was found in Pakistan (45,600 DALYs per 100,000) and Zimbabwe (41,900 DALYs per 100,000). The lowest burden of disease was estimated in Queensland (10,700 DALYs per 100,000) and Singapore (10,400 DALYs per 100,000) (Table 3). The differences in total DALYs between countries can partly be explained by differences in exposure to risk factors. For example, almost half of the total burden of disease in Zimbabwe is due to HIV and diarrhea. These diseases are rare in developed countries. In most developed countries the highest burden is caused by cardiovascular diseases, followed by road traffic injuries and depression.
Comparison with the GBD DALY outcomes by country revealed that DALY estimates were of similar magnitude in some studies (e.g., for Syria [38], USA [35], Singapore [39], and Turkey [15]). Other studies reported DALY estimates that were two times higher [27,40] or two times lower [41] than the GBD study (Table 3).
Notably, four of the Australian burden of disease studies found comparable total DALY outcomes (between 13,220 and 13,700) [12,13,19,28], where the GBD reported 9,894 as total DALYs for Australia.

Discussion
We systematically reviewed 31 general burden of disease studies using the DALY approach and performed a quality assessment of the methodology used. We found that studies generally followed the GBD approach, but that large differences exist in methodology. Most studies used the incidence-based approach (80%), and almost all studies classified disease and injury groups as defined by the GBD. Half of the studies used age-weighting, whereas 80% of the studies used discounting.
As with all systematic reviews, our study has some limitations. Reviewing the literature in the field of "burden of disease" studies was complicated by the wide variety of terminology for burden of disease. Consequently, some relevant publications may have been missed. To enhance the identification of relevant burden of disease studies, we used a variety of literature databases and matched keywords to database-specific indexing terms. Furthermore, this review is limited to the English language. Therefore, relevant studies in other languages (e.g., Spanish [42,43]) were excluded. In the databases that were reviewed, we found a limited number of studies that included all diseases and injuries. An explanation for this finding may be that the use of the DALY is controversial and accompanied by theoretical concerns [44]. Practical concerns, such as a lack of resources, available data sources, and/or expertise, may also explain researchers' apprehension about performing multiple-cause burden of disease studies.

Quality of the data and data gaps
The main issue in burden of disease studies is access to complete, consistent, and comparable epidemiological data. Summary measures of population health, such as the DALY, are only as good as the weakest link in the chain, which is the epidemiological evidence [45]. Most studies derived numbers of incident cases directly from disease registers, routine databases, or epidemiological studies. Furthermore, some studies used a combination of incidence and prevalence-based data because, for most conditions, only prevalence data were available. The calculation of mortality burden is straightforward, and the precision of the estimates of YLL depends almost entirely on the quality of data on underlying causes of death. Although great improvements in reporting, coding, and classification of mortality have been made, significant challenges remain. The infrastructure for mortality and health databases varies considerably around the world [1,46]. Developed regions have electronic databases that provide summary statistics through World Wide Web-based queries. Other countries maintain tabulated mortality statistics that are not integrated into a utilizable database, and many developing countries have paper-based systems with rates based on projections and estimates rather than actual counts [46]. The data challenges that result from disparities in the level of health infrastructure yield rates that can be difficult to compare. Furthermore, differences in death certification systems, methods of data collection, and definitions of variables severely challenge international comparisons.
Table 3 DALY outcomes for specific disease and injury groups (per 100,000 persons)

Methodological choices
The calculation of the morbidity component of the burden of disease, expressed in YLD, requires extensive epidemiological modelling and is often based on a diverse range of data sources, literature research, and/or expert opinion. The resulting YLD estimates depend highly on the specific model being applied and the type of data underlying this model.
Most studies used the GBD 1996 disability weights, in many cases supplemented by DDW. The GBD disability weights cover a wider range of conditions than covered by the DDWs, but are generally less specific in terms of the disease and sequelae categories to which they refer. The set of DDW covers a more restricted range of conditions compared to the GBD 1996 disability weights, but it differentiates more finely between condition stages and severities, thus allowing more detailed disease models in estimating the YLD than is possible with the GBD weights [47]. For example, DDWs are often used for HIV/AIDS and organ disorders. For the GBD 2010 Study, new disability weights are being derived [6].
In the original GBD study, discounting and age-weighting were applied. Both have been disputed. Age-weighting is criticized on the grounds that lost years of healthy life should be of equal value regardless of the age at loss, that the age weights lack empirical foundation and validation, and that they do not convey actual social values [3,48]. Discounting has been disputed because its application lowers the apparent efficiency of prevention programs, whereas not discounting, or using a discount rate lower than the rate used for costs, favors preventive measures because their benefits lie in the far future [32]. This discussion is reflected in the different choices regarding discounting and age-weighting between studies.
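To make the two disputed choices concrete, the sketch below evaluates an age-weighting function and a discount factor. It assumes the functional form C · x · exp(-βx) with the commonly cited GBD constants C = 0.1658 and β = 0.04, and a 3% continuous discount rate; individual studies in this review may have used other values, and the function names are hypothetical.

```python
import math

# ASSUMPTION: commonly cited constants of the original GBD age-weighting
# function; individual studies may have used different values.
C = 0.1658
BETA = 0.04


def age_weight(age: float) -> float:
    """Relative value of a year lived at a given age: C * x * exp(-BETA * x)."""
    return C * age * math.exp(-BETA * age)


def discount_factor(years_ahead: float, rate: float = 0.03) -> float:
    """Present value of one healthy life year lived `years_ahead` years from now.

    ASSUMPTION: continuous discounting at a 3% annual rate by default.
    """
    return math.exp(-rate * years_ahead)


# Years lived in young adulthood receive more weight than years lived
# in early childhood or at old age.
for age in (5, 25, 80):
    print(f"age {age}: weight {age_weight(age):.2f}")

# A health gain 30 years in the future counts for well under half of the
# same gain today, which is why discounting disadvantages prevention.
print(f"discount factor at 30 years: {discount_factor(30):.2f}")
```

With these constants the weight peaks at age 25 (1/β), so the same death or disability contributes a different number of DALYs depending only on whether a study applied age-weighting.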
To measure the gap between actual population health and an ideal, most studies used the global standard life expectancy (West Level 26 and 25 life tables) as used by the GBD study. These life tables contain the relevant expectancies for each age and sex grouping. However, they have the disadvantage that they are abridged period life tables, chiefly using five-year age groupings and an upper-age category of 85+ [28]. The use of cohort life expectancies with more complete underlying population data and more complex methods, as done for two Australian studies [12,28], resulted in more accurate and slightly different life expectancy measures. We recommend using the global standard life expectancy figures to enhance comparability with GBD study outcomes, supplemented by a sensitivity analysis using more detailed country-specific life expectancies.

Comparison of DALY outcomes
The sensitivity of DALYs, defined by the relative contributions of "true" and "error" variation, is assumed to be low. Potential sources of true variation include differences in the size and structure of populations, real differences in disease epidemiology between populations or over time, and differences in disability weights. Error variation may originate from the use of different methodologies (e.g., for discounting and age-weighting, disability weights) and from low-quality mortality and morbidity data. The detection of true variation is the focus of interest when estimating the burden of disease in DALYs. However, error may limit the power to detect true differences between populations [49]. Therefore, at the moment burden of disease studies are not comparable, nor are disease rankings as these are affected by methodological variation as well.

Conclusions and recommendations
Burden of disease analyses provide a unique perspective on health, one that integrates fatal and nonfatal outcomes, yet also allows the two classes of outcomes to be examined separately. Furthermore, burden of disease studies may provide a valuable insight into the scope for further health gains on the global or country level. This information will assist in taking up the future challenges posed by an aging population, by changes in disease and risk factor patterns, and by the increasing costs of health services [28]. Linking burden of disease analyses to cost effectiveness studies of interventions for major health problems will allow these interventions to be judged both in terms of cost effectiveness, and their relative impacts in reducing the burden of disease and ill health at the population level. Furthermore, burden of disease studies may shed light on crucial data gaps and facilitate priority setting in research.
However, large differences in the methodology used exist between general burden of disease studies. Because of this methodological variation, it is difficult to assess whether differences in DALY estimates between studies are due to actual differences in population health or are the result of methodological choices. Overcoming this limitation in methodological rigor between burden of disease studies using the DALY approach is a critical priority for advancing burden of disease studies. Harmonization of the methodology used and high-quality data can improve the detection of true variation in DALY outcomes between populations or over time.
Furthermore, overcoming this limitation in methodological rigor is particularly important in view of the imminent launch of the GBD 2010 Study, which is expected to give a new impulse to the performance of burden of disease studies. It is a challenge for the GBD to develop more detailed harmonization procedures and clear guidelines to promote methodological improvements and enhance the comparability of general burden of disease studies.