How do Zimbabweans value health states?

Background: Quality of life weights based on valuations of health states are often used in cost-utility analysis and population health measures. This paper reports on an attempt to develop quality of life weights within the Zimbabwe context.
Methods: A total of 2,384 residents in randomly selected small residential plots of land in a high-density suburb of Harare valued descriptors of 38 health states based on different combinations of the five domains of the EQ-5D (mobility, self-care, usual activities, pain or discomfort and anxiety or depression). The English version of the EQ-5D was used. The time trade-off method was used to determine the values, and 19,020 individual preferences for health states were analysed. A residual maximum likelihood linear mixed model was used to estimate a function for predicting the values of all possible combinations of levels on the five domains. The model was fit to a random subset of two-thirds of the observations, with the remaining observations reserved for analysis of predictive validity. The results were compared to a similar study undertaken in the United Kingdom.
Results: A credible model was developed to predict the values of states that were not valued directly. In the subset of observations reserved for validation, the mean absolute difference between predicted and observed values was 0.045. All domains of the EQ-5D were found to contribute significantly to the model, both at the moderate and severe levels. Severe pain was found to have the largest negative coefficient, followed by the inability to wash and dress oneself.
Conclusion: Despite a generally lower education level than their European counterparts, urban Zimbabweans appear to value health states in a consistent manner, and the determination of a global method of establishing quality of life weights may be feasible and valid. However, as the relative weightings of the different domains, although correlated, differed from the standard set of weights recommended by the EuroQol Group, the locally determined coefficients should be used within the Zimbabwean context.


Background
The resources available to health care are obviously finite, and prioritisation or rationing of public health provision is on government agendas across the world [1]. Cost-utility analysis is one method of investigating the relationship between the costs and benefits of health care that allows for comparison of different interventions across different health states. The quality-adjusted life year (QALY) forms the basic unit of measure in such evaluation and is the most widely used method for measuring health outcomes [2]. The QALY is the arithmetic product of data on quantity of life and quality of life. Whilst the former is typically measured in life years, the latter is measured in terms of utility weights. There is little consensus as to how these weights should be developed, but the measure should have at least interval properties and should represent the preferences of society [3].
There is a plethora of instruments for describing health-related quality of life, most of which demonstrate acceptable psychometric properties [4]. Some of these measures, such as the SF-36 [5], are primarily profile measures that provide descriptors of health states. Others, such as the Health Utilities Index (HUI) [6] and the EQ-5D [7], are linked directly to utility estimates, derived from population studies using some method of eliciting population preferences, such as the standard gamble.
The EQ-5D describes health-related quality of life in terms of five dimensions: mobility (MO), self-care (SC), usual activities (UA) (work, study, housework, family or leisure), pain/discomfort (PD) and anxiety/depression (AD). Each dimension is subdivided into three levels indicating no problem, a moderate problem or an extreme problem [7]. Different health states can be described by a five-digit code number relating to the relevant level of each dimension, with the dimensions always listed in the order given above. Thus a health state of 11223 means:
Dimension MO: No problem with mobility (= 1)
Dimension SC: No problem with self-care (= 1)
Dimension UA: Moderate problem with usual activities (= 2)
Dimension PD: Moderate pain or discomfort (= 2)
Dimension AD: Extremely anxious or depressed (= 3) [8]
The validity and reliability of the EQ-5D have been found acceptable in Europe among different populations and patient groups [9-11]. Despite the limited number of dimensions and levels, the instrument has been found to be sensitive to improvements in health-related quality of life [12].
A test-retest study was undertaken in Zimbabwe to determine the reliability of the English language version of the EQ-5D. Forty-four randomly selected subjects who had a minimum of seven years of education and whose health status had remained static over the previous seven days completed the instrument twice, one week apart. In all domains except SC, approximately half of the respondents reported some or severe problems. The kappa statistics were 0.695 (fair to good agreement) for SC, 0.878 for MO, 0.884 for UA, 0.892 for PD and 0.893 for AD (all excellent agreement beyond chance [13]). A similar reliability study on the version of the EQ-5D in Shona, the local Zimbabwean language, reported that the kappa statistics between the two sets of scores were high and ranged from 0.78 to 1.00 for different domains [14]. Although the Shona version was not used in the current exercise, multiple translators examined the cross-cultural equivalence of meaning of the EQ-5D during the process of forward and back translation.
One of the conclusions of the translators of the instrument was that "although it is likely that the Shona respondents will identify it as a foreign instrument, Shona is able to capture the EQ-5D concepts. The respondents will be able to recognise the concepts and respond appropriately..." [15]. It was concluded that, despite the different cultural understanding of determinants of ill health, the English version of the EQ-5D could be used with confidence in an educated urban Zimbabwe population.
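The five-digit state coding described above can be illustrated with a short sketch. The dimension order and the three level labels are taken from the text; the names `DIMENSIONS`, `LEVELS` and `decode_state` are illustrative assumptions and are not part of the EQ-5D instrument itself.

```python
# Dimension order and level wording follow the description in the text.
# All identifiers here are hypothetical names chosen for illustration.
DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")
LEVELS = {1: "no problem", 2: "moderate problem", 3: "extreme problem"}


def decode_state(code: str) -> dict:
    """Map a five-digit EQ-5D state code to a level label per dimension."""
    if len(code) != len(DIMENSIONS) or any(ch not in "123" for ch in code):
        raise ValueError(f"not a valid EQ-5D state code: {code!r}")
    return {dim: LEVELS[int(ch)] for dim, ch in zip(DIMENSIONS, code)}


# The example state from the text: levels 1, 1, 2, 2, 3 in dimension order.
print(decode_state("11223"))
```

Keeping the dimension order in one tuple means the same ordering is used wherever a state code is read.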
Several methods of valuation of health states have been developed, including rating scales or visual analogue scales, magnitude estimation, standard gamble, time trade-off and person trade-off methods [3]. The standard gamble has been extensively used to develop utility weights, and is regarded by some as being the most theoretically sound method of determining utility weights [16]. However, it is conceptually difficult and requires an ability to discriminate between probabilities close to one [3]. Nord [17] proposes that time trade-off techniques are likely to be the most valid technique for establishing preference weights for life years both in the clinical situation and in program evaluation.
The Measurement and Valuation of Health Group (MVH), headed by Williams, used time trade-off exercises to elicit preferences from 3,235 respondents in the United Kingdom for a range of different EQ-5D descriptor states [8]. Regression analysis was used to develop a set of values for each individual component of the five dimensions that can be used to calculate the value of health states not observed directly [8]. The test-retest reliability of health state valuations collected with the EQ-5D questionnaire is reported to be stable over time [18].
It is unlikely that preferences for health states are universal, although some health states might be given similar valuations across cultures [19]. The greater the divergences of the local culture from the Western worldview, the less likely that health state valuation will be the same [20]. Barker and Green [21] state that health state values should be developed locally based on the judgments and priorities of local communities, in the service of these communities.
Much work has been done in developed countries on the valuation of health states [7,8,22-24], but there is a need to develop locally applicable measures of health that may be used to monitor the impact of interventions in developing countries. The WHOQOL is one of the few attempts to develop a genuinely international quality of life assessment [25], but so far it has no direct link with a utility index. The primary objective of this study was the generation of a set of weights for the different health states as described by the EQ-5D that would represent the values of urban high-density dwellers in Zimbabwe. Urban dwellers were chosen, as they were more likely to have the numeracy and literacy skills necessary to participate in the exercise. Where appropriate the results were compared with the results of the MVH study [8].

Subjects
In March 2000, 2,488 residents of randomly selected small residential plots of land in Glenview, a high-density suburb of Harare, were interviewed in their homes. The entrance criteria included completion of primary school education and a minimum age of 15 years. The oldest person in each household who met the criteria was interviewed.

Instruments
English descriptors of 38 different health states based on the different combinations of the five EQ-5D domains used in the original MVH study [8] were compiled on flash cards (see Appendices I and II). (Thirty-eight, rather than the original 42, health states were used, as unconscious and death were not valued, and two other states were excluded due to an administrative error.) Each respondent was asked to complete a self-assessment using the EQ-5D and to value his or her own health condition on the EQ-5D visual analogue scale (VAS), which ranges from the "Worst possible health state" at 0 to the "Best possible health state" at 100. Respondents then each valued a different set of seven randomly selected health states (which included one or two very mild, mild, moderate and severe states). All respondents also valued an eighth state, the 33333 state. Valuation of states was undertaken using the time trade-off (TTO) approach. For states better than death, subjects chose a length of time in perfect health (11111), x, that they regarded as equivalent to spending ten years in the target state. In this case, a larger x indicates a better health state. For states worse than death, the choice was between dying immediately and spending a length of time (10 - x₁) in the target state followed by x₁ years in full health. A visual aid was used to clarify this choice. The greater the number of years in full health perceived to compensate for the time spent in the target state, the worse the health state. States worse than death were thus given negative values in analysis.
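The scoring implied by the two TTO tasks can be written out as a short sketch. For a state better than death, the value is the number of years in full health judged equivalent to ten years in the state, divided by ten. For a state worse than death, setting the value of immediate death to 0 and of full health to 1 and solving the indifference condition gives a value of minus the years in full health divided by the years spent in the target state. This is the conventional untransformed formula and is used here as an assumption: the text does not say which formula or rescaling the study applied to negative values. The function name `tto_value` and its parameters are illustrative.

```python
def tto_value(years_full_health: float, worse_than_death: bool = False,
              horizon: float = 10.0) -> float:
    """Convert a TTO response into a health state value.

    Assumption: untransformed conventional scoring; the study's own
    treatment of states worse than death may have differed.
    """
    x = years_full_health
    if not 0 <= x <= horizon:
        raise ValueError("years in full health must lie between 0 and the horizon")
    if not worse_than_death:
        # x years in full health are equivalent to `horizon` years in the state.
        return x / horizon
    if x == horizon:
        raise ValueError("value is undefined when no time is spent in the target state")
    # Indifference: (horizon - x) * value + x * 1 = 0 (immediate death valued at 0).
    return -x / (horizon - x)


print(tto_value(7.5))                         # state better than death
print(tto_value(2.0, worse_than_death=True))  # state worse than death
```

Under this scoring, more compensating years in full health give a more negative value, which matches the direction described in the text.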

Procedure
The full procedure is described in Appendices II and III (see additional file 1). Nine interviewers, all of whom had higher degrees or diplomas of some kind, participated in a training workshop over three days, which included a pilot study. All residential plot numbers in Glenview were identified from a municipal map of the area, and a random sample of 2,500 was chosen. In the event of no one being present at the identified residential plot, the interviewers returned once more. The research assistants were instructed to conduct interviews in the evenings and weekends to the extent possible, but this was difficult because of the political unrest and weekend rallies at the time, and many interviews took place during work hours. Before each interview, the research assistant shuffled the 38 health states (excluding the 33333 health state) and randomly chose seven states for the respondent to value. Check visits were conducted by a supervisor to ensure that the randomly chosen residential plot had been visited.

Data analysis
Statistical analyses were undertaken using GenStat version 4.2 [26] and SPSS for Windows, Release 10 [13]. Descriptive statistics, χ² tests and 95% confidence intervals (CIs) were used to delineate the demographic characteristics of the subjects and to compare them with population demographics of high-density dwellers in Harare derived from census findings. The health characteristics of the respondents in terms of the five EQ-5D domains were described. The sample of respondents was randomly divided into three, and analysis was performed on two-thirds of the sample, the internal sample. The results were then used to estimate the values of the remaining one-third, the external sample. The dependent variable was the TTO score divided by ten. A residual maximum likelihood (REML) linear mixed model was fitted. Residual maximum likelihood estimation is a method of estimating variance components in the context of unbalanced incomplete block designs. It takes account of the loss of degrees of freedom in estimating the mean and produces unbiased estimating equations for the variance parameters [27]. Interviewer effect and subject nested within interviewer were fitted as the random effects. The three levels of the five domains were entered as the fixed effects and a weighted least squares model was fitted. The fixed effects were entered in a forward and backward sequence and their effects assessed using Wald statistics. An ANOVA full factorial model with Type III least squares (N = 15,671) was used to establish the source of variance. The dependent variable was the TTO score, and random factors entered included research assistant, health state, occupational category and gender. Interactions between the main effects were also investigated.
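The division into internal and external samples can be sketched as follows. This is a minimal illustration that assumes the split was made at the respondent level, so that all valuations from one respondent fall in the same subset. The function name `split_respondents`, the fixed seed and the integer identifiers are hypothetical and do not describe the procedure actually run in GenStat or SPSS.

```python
import random


def split_respondents(respondent_ids, seed=0):
    """Randomly assign two-thirds of respondents to the internal sample
    and the remaining third to the external (validation) sample."""
    ids = sorted(set(respondent_ids))
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    rng.shuffle(ids)
    cut = (2 * len(ids)) // 3
    return set(ids[:cut]), set(ids[cut:])


# Hypothetical identifiers 0..2383 standing in for the 2,384 respondents.
internal, external = split_respondents(range(2384))
print(len(internal), len(external))
```

Splitting by respondent rather than by individual valuation keeps the external sample independent of the subjects used to fit the model.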

Subjects
Forty-eight respondents refused to answer the questionnaire. The data from 56 respondents were incomplete, and the replies from 201 respondents demonstrated inconsistency but were included in the analysis (see discussion below). Inconsistent data included responses in which all states were given the same value, fewer than three states were valued, or there were more than three logical inconsistencies (e.g. if a 11112 state were valued as being more severe than a 11113 state). Ultimately, the responses of 2,384 subjects were analysed. The demographic details of the respondents were compared with the results for Highfield in the 1992 census Harare Profile [28] or, if not available, with results for Harare Province (Table 1). Males were underrepresented at 38.3%, a proportion which fell outside the 95% CI for the population (52.4-53.1%). There were more young adults in the sample than in the census population.
With regard to the self-reported scores on the EQ-5D dimensions, nearly one-third of the respondents reported either some or severe problems in the dimensions of pain/discomfort and anxiety/depression. The mean score on the VAS was 79.8 (CI = 79.1-80.5).

Valuations
In total, 19,020 values of 38 EQ-5D health states were analysed, 12,663 in the internal sample and 6,357 in the external sample. Fifty-two percent of the variance was due to the domain scores, 7% to the interaction between interviewer and health state, 6% to interviewer effects and 35% to error (Table 3).
The Wald statistic was highly significant for all main fixed effects, both when the effects were fitted in a forward and in a backward sequence (p < 0.001) (Table 4). Previous models included a variable (N3) which reflected that at least one domain was valued at the severe level. However, this led to only a small increase in R² and resulted in a model in which moderate problems in several domains were counter-intuitively valued as being worse than extreme problems. This model was subsequently discarded. The effects were largest for a level three on the dimensions of PD and SC, followed by level three MO and AD (Table 5).
There was some evidence of significant interaction effects. However, because not all combinations of health states are plausible and were not valued, numerical problems were experienced in the fitting of the model, and the estimates appeared unreliable.
Coefficients from the model were used to generate predicted values for each of the states included in the study. For example, for the health state 22331 (i.e., some problems in walking about, some problems with self-care, unable to perform usual activities, severe pain and no anxiety/depression), the predicted value would equal 0.90 - 0.056 - 0.092 - 0.135 - 0.302 - 0.0 = 0.315. The actual and predicted value and residuals for each health state for the internal and external samples and the observed results of the United Kingdom MVH study are reported in Table 6. The Pearson's correlation between mean values of health states was 0.914 (p < 0.001). The 33333 state was the only state that was valued as being worse than death in the current study (mean value = -0.24). There were three health states in the external sample for which the mean difference between the observed and predicted values was more than 0.1. In the internal sample, the mean absolute difference for all health states was 0.045. Figure 1 depicts the predicted UK scores compared to the Zimbabwe internal sample scores for the 39 health states (the UK sample did not include the 11111 health state).
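The worked example for state 22331 can be reproduced with a short sketch of the main effects prediction. Only the constant and the four decrements quoted in the example are filled in, on the assumption that the terms are listed in dimension order (MO, SC, UA, PD, AD). The remaining coefficients are reported in Table 5 and are deliberately not guessed here. The names `predict_value` and `DECREMENTS` are illustrative.

```python
CONSTANT = 0.90
DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")

# Decrements quoted in the worked example only; level 1 contributes nothing.
# Coefficients for the other dimension/level pairs must be supplied from Table 5.
DECREMENTS = {
    ("MO", 2): 0.056,
    ("SC", 2): 0.092,
    ("UA", 3): 0.135,
    ("PD", 3): 0.302,
}


def predict_value(code, decrements=DECREMENTS, constant=CONSTANT):
    """Predict a state value as the constant minus one decrement per dimension."""
    if len(code) != len(DIMENSIONS) or any(ch not in "123" for ch in code):
        raise ValueError(f"not a valid EQ-5D state code: {code!r}")
    value = constant
    for dim, ch in zip(DIMENSIONS, code):
        level = int(ch)
        if level == 1:
            continue  # no problem on this dimension, no decrement
        if (dim, level) not in decrements:
            raise KeyError(f"no coefficient supplied for {dim} level {level}")
        value -= decrements[(dim, level)]
    return value


print(round(predict_value("22331"), 3))  # 0.315, matching the worked example
```

Raising an error for a missing coefficient, rather than treating it as zero, avoids silently overestimating the value of states whose decrements have not been entered.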
Whereas there is initial convergence in scores between the Zimbabwe and MVH sample at high health levels, values diverge as the health states become more severe and domains at level three are included. Spearman's rank correlation between the values for the different states was 0.95 (p < 0.001).

Discussion
To the knowledge of the authors, this is the first paper to present the preferences for health states by urban Zimbabweans. The self-reported health-related quality of life of the Zimbabwe subjects was similar to that of UK counterparts. Kind et al. [9] found that 30% of a large UK sample reported some or severe pain/discomfort. However, the number reporting some or severe anxiety/depression was smaller in the UK sample (21%) than in the present study. The two samples were similar in finding very few people reporting problems in the area of self-care, or extreme problems with mobility or usual activities. The mean score on the Visual Analogue Scale (VAS) in the Zimbabwe sample was 79.8 (CI = 79.1-80.5), which was similar to the British sample (mean 82.5).
As the questionnaire was administered in English and numeracy was required, the methodology precluded gathering valuations from a truly representative sample. The educational inclusion criteria and the limitations imposed on the times for data gathering by female interviewers resulted in a sample in which females, younger people and those with a higher level of literacy were overrepresented. In addition, the interviewer effect was considerable. As the interviewer and subject were entered in the computation as random effects, the REML linear mixed model allowed for the demographic deviations from census findings, the non-independence of the measures and the interviewer effect.
The interviewer effect appeared in spite of training sessions, piloting and standardisation of the format of the interview. It is possible that the approach and amount of interpretation given by each interviewer differed. The effect of the gender of the interviewers was evident, and female interviewers apparently did not conduct interviews during the evenings or weekends to the same extent as their male counterparts. This imbalance might have compounded the interviewer effect, which suggests that the gender of interviewers should receive careful attention in community surveys, particularly in socially unstable conditions.
However, a credible model was ultimately developed in this study. The mean absolute difference between the actual and estimated means for the external sample (0.045) is comparable to that of the UK study (0.039) and a similar study in Japan (0.01) [24], although in each case different models were used. The inclusion of inconsistent responses is controversial, with some researchers excluding these data from analysis [29]. These responses were included in this analysis on the assumptions that inconsistencies do not necessarily indicate a lack of understanding of the task, that all those who participate have the right to have their data included, and that human beings are not always rational in their judgments regarding health states.
Significant interaction effects were found but, as noted above, appeared unreliable due to an incomplete data set. The inclusion of interaction effects resulted in a model that was difficult to interpret, which would likely limit the use of the model in practice. Similarly, the inclusion of the N3 term, which indicates severe problems on at least one domain, resulted in a more complicated and less intuitive model. It was therefore decided to adopt the simple main effects model.
The UK and Zimbabwe samples produced similar descriptions of their own health states and similar rank orderings of the hypothetical health states. (As a different model was used, the coefficients of the valuation function could not be compared directly between the UK and Zimbabwe samples.) The mean self-reported VAS was 3% lower in the Zimbabwe sample compared to the UK sample. The Pearson's correlation for the predicted health state values (0.95) was high. Although previous studies based on the EQ-5D have reported similarities in valuations, with low sensitivity for socio-demographic variables across European countries [22], the results of this study were unexpected. A previous study on the rank ordering of health states had found no correlation between the international and locally determined Zimbabwean ranking [20]. It would appear that a deconstructed approach to valuation in which impairments or activity limitations (e.g. pain or problems in moving around) are valued [30,31] rather than disease conditions (e.g. rheumatoid arthritis) is more likely to tap into commonly understood constructs and yield universal preferences.
However, there were important differences between the samples that should be noted. Respondents in the UK study valued 16 health states as being worse than death, whereas in the Zimbabwe sample only the 33333 state was awarded a negative value. The inclusion of 16 negative values in the UK model resulted in generally lower values being assigned to health states in which an "extreme" problem was included. Consequently the predictions from the UK model for about two-thirds of the health states are lower than those from the Zimbabwe model. The reluctance to value states as worse than death in the Zimbabwe sample might reflect a fundamentally different attitude towards the sacrifice of years of life. There is, for example, no national debate on either euthanasia or abortion in Zimbabwe, and both are illegal and likely to remain so for the near future. The general state of health of the population might also contribute. The life expectancy is now dropping drastically because of the HIV/AIDS pandemic. The expected number of equivalent healthy years (Disability Adjusted Life Expectancy, DALE) is now estimated to be 32.9 (cf. UK 71.7), and Zimbabwe ranks 184 out of 191 nations [32]. (To calculate DALE, the years of ill-health are weighted according to severity and subtracted from the expected overall life expectancy to give the equivalent years of healthy life [32]). There may be a greater reluctance to sacrifice life years in a society in which each individual is likely to have had direct contact with death or illness. This conclusion is supported by the results of a Spanish study of preferences of 103 patients who were severely ill. The patients tended to rate the worst health states higher than proxies and rated no states as worse than death. The authors of that study concluded that within the EQ-5D descriptive system, there are no health states worse than death for seriously ill patients [33].
For Zimbabweans, the inability to wash and dress oneself is a major contributor to poor quality of life, and SC level 3 was ranked second. In contrast, SC level 3 was ranked fourth in the UK study. This difference may possibly be due to the importance that Zimbabweans attach to self-presentation. It is regarded as insulting to ask whether people are able to wash or dress themselves, if in any way it is implied that they have not done so [15,34]. In a poorer country, self-presentation may also be regarded as indicative of socio-economic standing and hence valued more highly. The important differences between the results of the two studies illustrate the dangers of applying measures developed in one culture without adequate testing of items for cultural meaning and appropriateness.
Severe AD was ranked similarly in the UK (Rank 3) and Zimbabwe (Rank 4) samples. Of all the EQ-5D concepts, the idea of depression and anxiety is most difficult to capture in the sensibility of the Shona-speaking Zimbabwean. There is no specific word for depression; it is usually implied from symptoms rather than self-report. Anxiety and depression are not regarded simply as health states in Shona custom. They are understood as occasional psychological (social/alienation) or spiritual (religious) states. In addition, severe anxiety is seen to border on a psychiatric state known as "mhopu" [15]. It is therefore not surprising that extreme anxiety or depression should be regarded as being very serious.
The choice of the EQ-5D as the instrument to define the different domains of health-related quality of life needs justification. The measure is limited in that there are only five domains with three possible levels on each domain. The content validity may be questioned, as it may be that important areas that contribute to quality of life, such as cognitive function or energy, are excluded. However, even with this relatively crude measure, 243 hypothetical health states can be described. Researchers have to be cautious about transposing any measure across very different cultural contexts. The current study required a robust, relatively simple measure and, despite the shortcomings of the instrument, the EQ-5D appeared to be reliable and relatively insensitive to cultural context.
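The figure of 243 hypothetical health states follows from five domains with three levels each (3 to the power 5). A brief sketch of the enumeration, using the same five-digit coding as the rest of the paper, is given below; the variable name `all_states` is illustrative.

```python
from itertools import product

# Every combination of levels 1-3 across the five EQ-5D domains.
all_states = ["".join(levels) for levels in product("123", repeat=5)]

print(len(all_states))                # 243
print(all_states[0], all_states[-1])  # 11111 33333
```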

Conclusions
This study attempted to elicit cardinal values of health states from urban Zimbabweans. The limitation imposed by the educational criteria resulted in a sample that was more educated than the general population of high-density areas, and the results should be generalized with care to other urban populations in the country. Despite this limitation, the values derived from the study are more likely to represent the values of urban Zimbabwe than values derived from valuation exercises performed in other countries. The parameter estimates for each level of the five domains generated by the TTO exercise are credible and are comparable to those generated by other studies. The rankings of observed preferences for health states by Zimbabweans and UK residents are remarkably similar, and if consensus could be reached on the valuation of states worse than death, it is possible that QALY weights based on EQ-5D descriptors might be developed which are valid globally.
However, the observed cardinal values for health states are much lower overall in the UK sample. It is therefore recommended that the parameter estimates developed in this study be used both to describe health-related quality of life and as an outcome measure of health interventions in the Zimbabwe urban population.
Figure 1. Observed TTO scores (divided by 10) of the full Zimbabwe sample (N = 19,020 values from 2,183 respondents), compared to observed scores from the Measurement and Valuation of Health study in the United Kingdom [8].