Propensity score weighting for addressing under-reporting in mortality surveillance: a proof-of-concept study using the nationally representative mortality data in China

Background National mortality data are obtained routinely by the Disease Surveillance Points system (DSPs) in China and under-reporting is a big challenge in mortality surveillance.
Methods We carried out an under-reporting field survey in all 161 DSP sites to collect death cases during 2009–2011, using multi-stage stratified sampling. To identify under-reporting, death data were matched between the field survey system and the routine online surveillance system by automatic computer checking followed by thorough manual verification. We used a propensity score (PS) weighting method based on a logistic regression to calculate the under-reporting rate in different groups classified by age, gender, urban/rural residency, geographic location and other mortality-related variables. For comparison purposes, we also calculated the under-reporting rate using the capture-mark-recapture (CMR) method.
Results There were no significant differences between the field survey system and routine online surveillance system in terms of age group, causes of death, highest level of diagnosis and diagnostic basis. The overall under-reporting rate in the DSPs was 12.9 % (95%CI 11.2 %, 14.6 %) based on PS. The under-reporting rate was higher in the west (18.8 %, 95%CI 16.5 %, 21.0 %) than the east (10.1 %, 95%CI 8.6 %, 11.3 %) and central regions (11.2 %, 95%CI 9.6 %, 12.7 %). Among all age groups, the under-reporting rate was highest in the 0–5 year group (23.7 %, 95%CI 16.1 %, 35.5 %) and lowest in the 65 years and above group (12.4 %, 95%CI 10.9 %, 13.6 %). The under-reporting rates in each group by PS were similar to the results calculated by the CMR method.
Conclusions The mortality data from the DSP system in China need to be adjusted. Compared with the CMR method commonly used to estimate the under-reporting rate, the propensity score weighting method gives similar results but is more flexible when calculating the under-reporting rates in different groups. Propensity score weighting is suitable to adjust DSP data and can be used to address under-reporting in mortality surveillance in China.


Introduction
Cause of death data are fundamental to developing effective public health policies [1]. Achieving complete vital registration remains difficult for a middle-income country like China with 1.3 billion population and limited resources. As an interim approach, China developed the Disease Surveillance Points System (DSPs) to obtain national mortality data based on multi-stage stratified clustering sampling method [2].
The DSP method is not without limitations, and one key challenge is under-reporting of mortality counts. To ensure data integrity, it is necessary to measure the degree of under-reporting. The capture-mark-recapture (CMR) method was used in previous under-reporting surveys in China to correct for under-reporting, using a household survey as the gold standard [3][4][5]. Using CMR to estimate the under-reporting rate is relatively straightforward and practical, but the assumptions that CMR requires cannot always be met in under-reporting surveys, and the estimates may be biased if the distribution of covariates between groups is uneven. Therefore, potential alternatives to CMR need to be identified and tested to derive more reliable under-reporting rates for correction of mortality rates.
The purpose of this paper was to introduce a propensity score (PS) weighting method based on a logistic regression as an alternative correction for under-reporting. This paper used data from an under-reporting survey covering the period 2009-2011 to assess the degree of under-reporting in death cause surveillance in the DSP system. We compared and cross-validated the CMR and propensity score weighting methods as options to correct for under-reporting.

The China Disease Surveillance Points System
The DSP was initiated in 1978 and adjusted three times in 1990, 2005 and 2010 on the basis of economic development, geographic location, Gross Domestic Product (GDP), proportion of non-agricultural population and the total population of the country to ensure representativeness. After adjustment in 2010, the DSP system included 64 urban and 97 rural surveillance sites in all 31 provinces (autonomous regions and municipalities) covering seven percent of the total population in China. The information provided by the system can be used to estimate causes of death among the national population and the detailed description of DSPs has been published elsewhere [2,6]. In brief, all deaths were reported in the monitoring stations in the hospitals, community health centers and village clinics in each DSP based on death certificates. Data on demographics, date of death, place of death, cause of death, and main symptoms and signs (for verbal autopsy), etc., were collected. The 161 DSP-level and 31 provincial-level Centers for Disease Control and Prevention (CDC) were responsible for data quality through regular checking, supervision, feedback and verification. Starting in 2008, all the deaths in DSPs were reported through an online death causes monitoring system.

Survey of the under-reporting death cases in China
To address the under-reporting, periodic evaluations for completeness of registration were conducted once every three years in DSPs. Two under-reporting field surveys have been carried out, covering the periods 2006-2008 and 2009-2011 respectively. The survey for 2006-2008 showed that the national total crude rate of under-reporting was 16.7 % and the weighted rate was 17.4 %; the under-reporting rate for children aged 5 years and below (35.0 %) was much higher than that for people above age 5 (16.9 %) [7].

Field survey design
An under-reporting survey was conducted in all 161 DSPs from July to October in 2012. Within each DSP, three townships (in rural areas) or streets (in urban areas) whose crude death rate (CDR) was close to that DSP's average CDR were first selected as candidate fields for the under-reporting survey. One township/street was finally chosen as the field site if its economic level was similar to the DSP's average and the population size was in the middle level among all the townships/streets in the DSP. All the residents in the selected township/street were included as the survey population. Deaths occurring from January 1st, 2009 to December 31st, 2011 in the families were investigated using interviews with the surviving household residents. The information of death population collected in the field survey included demographics, death-related information such as causes of death, highest level of hospital where illness was diagnosed, and diagnostic basis.

Data collection
A list of decedents from the focal time period was created for each resident group (the smallest administrative unit) within all villages and communities in the selected townships or streets by recall of the resident group leaders. The initial list was checked and complemented by data from public security departments, civil affairs departments, family planning departments, and maternal and child health departments. Using the final list of deaths, the interviewers in each village or community surveyed each family which experienced a death to verify and revise relevant information on the death records.

Identification of missed deaths
Death records between the field survey system and the routine online death cause surveillance system in each DSP were first matched by an automatic computer checking algorithm. Persons included in both systems were identified as a match when national ID matched. If the national ID was missing, persons with the same name, gender and age (within three years) were used to identify a match. After an initial computer matching process, all mismatched cases were checked and verified by a further manual checking in the DSP level. The local staff checked each mismatched case with the records from the surveillance system. Missed death cases were identified after this thorough manual verification.
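The automatic matching rule described above can be sketched as follows. This is a minimal illustration, not the study's actual software: the `DeathRecord` type, the function names and the sample records are hypothetical, and the sketch assumes that the name/gender/age fallback applies whenever either record lacks a national ID.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DeathRecord:
    # Hypothetical record layout; the real systems hold many more fields.
    name: str
    gender: str
    age: int
    national_id: Optional[str] = None


def is_match(a: DeathRecord, b: DeathRecord, age_tolerance: int = 3) -> bool:
    """Return True if two records are taken to describe the same death."""
    # Rule 1: when both records carry a national ID, the ID decides.
    if a.national_id and b.national_id:
        return a.national_id == b.national_id
    # Rule 2 (assumed fallback when an ID is missing): same name, same gender,
    # and ages within the tolerance (three years in the text).
    return (
        a.name == b.name
        and a.gender == b.gender
        and abs(a.age - b.age) <= age_tolerance
    )


def find_unmatched(survey: List[DeathRecord],
                   surveillance: List[DeathRecord]) -> List[DeathRecord]:
    """Survey deaths with no counterpart in routine surveillance.

    These are only candidates for missed deaths; in the study they were
    then verified manually.
    """
    return [s for s in survey if not any(is_match(s, r) for r in surveillance)]


# Invented example records.
survey = [
    DeathRecord("Li Wei", "M", 70, "A1"),
    DeathRecord("Wang Fang", "F", 3),
    DeathRecord("Zhang San", "M", 50),
]
surveillance = [
    DeathRecord("Li W.", "M", 71, "A1"),   # matched by national ID
    DeathRecord("Wang Fang", "F", 5),      # matched by name, gender, age
]
candidates = find_unmatched(survey, surveillance)
```

The pairwise scan is quadratic in the number of records, which is acceptable for a sketch; an index on national ID would be the obvious refinement for real data volumes.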

Statistical methods
To test the conformity between the under-reporting field survey data and the dataset of the DSP system, we used a goodness-of-fit test to compare the frequency distributions of the main variables (age, cause of death, highest level of hospital where illness was diagnosed, and diagnostic basis) in the two datasets. The highest level of hospital where disease was diagnosed and the diagnostic basis were important indicators for accuracy of the underlying cause of death. Hospitals at the township level and above were generally regarded as qualified to make a correct diagnosis, and the diagnoses made at village hospitals were checked and verified by senior DSP staff. The diagnosis was considered reliable if it was made based on symptoms/signs, physiobiochemistry, pathology, autopsy or surgery. Inference-based diagnoses were verified with the original investigation documents.
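As an illustration of the goodness-of-fit comparison, the sketch below computes a Pearson chi-square statistic for survey counts against reference proportions from the surveillance data. The text does not name the exact test used, so the Pearson form is an assumption, and the counts, proportions and function name are invented for the example.

```python
def chi_square_gof(observed_counts, reference_proportions):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over categories."""
    total = sum(observed_counts)
    statistic = 0.0
    for observed, proportion in zip(observed_counts, reference_proportions):
        expected = total * proportion  # expected count under the reference distribution
        statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts of survey deaths in four categories, and the
# corresponding (hypothetical) proportions in the surveillance dataset.
survey_counts = [28, 75, 310, 587]
dsp_proportions = [0.03, 0.07, 0.30, 0.60]

stat = chi_square_gof(survey_counts, dsp_proportions)

# Critical value of the chi-square distribution with 3 degrees of freedom
# (four categories minus one) at the 0.05 level.
CRITICAL_DF3_ALPHA_05 = 7.815
distributions_differ = stat > CRITICAL_DF3_ALPHA_05
```

A statistic below the critical value, as in this made-up example, means the test finds no evidence that the two frequency distributions differ.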
The detailed steps of the PS and CMR methods are described below.

Propensity score weighting method
We used a propensity score weighting method based on a logistic regression of under-reporting, where the variables were selected stepwise. The inclusion and exclusion criteria for the stepwise selection were 0.1 and 0.12 respectively. The variables used for analysis included age, gender, rural/urban residency, geographic location, educational attainment, occupation, marital status, cause of death, place of death and diagnostic unit. Geographic locations were classified as east, central and west according to the criteria of the National Bureau of Statistics. The cause of death was identified according to the International Statistical Classification of Diseases and Related Health Problems 10th Revision (ICD-10).
We set up two separate models for two groups: those aged 5 years and below and those above 5 years. For children aged 5 years and below, the model included age, geographic location and urban/rural residency. For those over 5 years old, the model included age, gender, geographic location, occupation, rural/urban residency, marital status, place of death, diagnostic unit, cause of death and year of death. Propensity score weighting integrated the information of several major covariates into one propensity score variable. The estimated propensity score weighting may lead to a substantial reduction in bias, especially for small groups. The analytical procedure is as follows:

Step 1: Model estimation
The sampled under-reporting survey may not be perfectly representative of the whole DSP in terms of socioeconomic variables that are related to the probability that a death is included in DSP. We applied logistic regression to the sociodemographic variables to predict the probability that a respondent was included in the routine surveillance in the sampled under-reporting survey site, using all individual records in the under-reporting field survey of 2009-2011 as the gold standard. We used age, sex, place of death and other predictor variables in the model. The coefficient and standard error for each variable of the models are shown in Table 1 (for under 5 years) and Table 2 (for above 5 years).

In the model for those aged 5 years and below, x1 refers to urbanity, x2 refers to age group, x3 refers to year and x4 refers to the highest level of hospital where disease was diagnosed, as listed in Table 1.

The model for those above 5 years ended with the cause-of-death terms

$$\cdots + 0.098x_{9,3} + 0.120x_{9,4} - 0.137x_{9,5} - 0.118x_{9,6} + 0.036x_{9,7} - 0.083x_{9,8} - 0.056x_{9,9} - 0.153x_{9,10} + 0.0057x_{9,11}$$

where x1 refers to region, x2 refers to age group, x3 refers to year, x4 refers to the highest level of hospital where disease was diagnosed, x5 refers to marital status, x6 refers to education, x7 refers to occupation, x8 refers to place of death and x9 refers to cause of death, as listed in Table 2.
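The model estimation step can be illustrated with a small, dependency-free logistic regression. This is only a sketch under stated assumptions: it uses plain gradient ascent rather than whatever statistical package the study used, it omits stepwise selection, and the single binary covariate and the toy data are invented. The outcome is 1 when a death found in the field survey was also present in routine surveillance.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, learning_rate=0.5, n_iter=5000):
    """Fit intercept + coefficients by gradient ascent on the mean log-likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # beta[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, target in zip(X, y):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))
            err = target - p
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta


def predict_reported(beta, row):
    """Estimated probability that a death with covariates `row` is reported."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], row)))


# Toy data: one covariate (0 = urban, 1 = rural), 10 surveyed deaths each.
# 8 of 10 urban and 6 of 10 rural deaths were found in routine surveillance.
X = [[0]] * 10 + [[1]] * 10
y = [1] * 8 + [0] * 2 + [1] * 6 + [0] * 4

beta = fit_logistic(X, y)
p_urban = predict_reported(beta, [0])
p_rural = predict_reported(beta, [1])
```

With a single binary covariate the fitted probabilities simply reproduce the observed reporting proportions in each group; the regression becomes useful once several covariates are combined into one score.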
Step 2: Weighted estimates for death cases
The probability of being reported for each observation ($p_i$) was based on the logistic regression model of the field survey data. Weights for each case were calculated as $w_i = 1/p_i$. The weighted number of deaths from 2009 to 2011 ($T_s$) was:

$$T_s = \sum_{i=1}^{N_s} w_i = \sum_{i=1}^{N_s} \frac{1}{p_i}$$

where $N_s$ is the total number of death cases from the DSP 2009-2011 surveillance.
Theoretically, the sum of $w_i$ over the cases represented the actual number of deaths, that is, the total number of deaths that occurred during 2009-2011.
Step 3: The under-reporting rate of DSP from 2009-2011 (P) based on propensity score weighting was:

$$P = \frac{T_s - N_s}{T_s} \times 100\,\%$$
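Steps 2 and 3 can be sketched in a few lines. The sketch assumes that the under-reporting rate is the share of the weighted total that is absent from routine surveillance, i.e. 1 - N_s/T_s with T_s the sum of the weights 1/p_i; the function name, group labels and probabilities are illustrative only. Because each death carries its own weight, the same calculation works for any subgroup.

```python
from collections import defaultdict


def ps_under_reporting(cases):
    """cases: iterable of (group, p_reported) pairs, one per death in routine surveillance.

    Returns (overall_rate, {group: rate}), each rate computed as 1 - N_s / T_s,
    where N_s counts the reported deaths and T_s sums their weights 1 / p.
    """
    n_reported = defaultdict(int)
    weighted_total = defaultdict(float)
    for group, p in cases:
        if not 0.0 < p <= 1.0:
            raise ValueError("p_reported must lie in (0, 1]")
        n_reported[group] += 1
        weighted_total[group] += 1.0 / p  # w_i = 1 / p_i
    by_group = {g: 1.0 - n_reported[g] / weighted_total[g] for g in n_reported}
    overall = 1.0 - sum(n_reported.values()) / sum(weighted_total.values())
    return overall, by_group


# Invented example: 8 reported urban deaths with p = 0.8 and
# 6 reported rural deaths with p = 0.6.
cases = [("urban", 0.8)] * 8 + [("rural", 0.6)] * 6
overall_rate, rate_by_group = ps_under_reporting(cases)
```

In this toy input each group's weighted total is 10 deaths, so the sketch yields under-reporting rates of 20 % (urban), 40 % (rural) and 30 % overall.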

CMR method
For comparison with the results of the propensity score weighting method, we also used the CMR method to calculate the under-reporting rate. CMR has been widely used in wildlife science to estimate the size of free-living animal populations and it has been advocated for use in estimating completeness of registration [8]. In the two-sample capture-mark-recapture approach, an estimate of the true population size is derived by evaluating the degree of overlap between existing data sources, assuming independence of ascertainment.
To perform CMR analysis, the estimated overall death toll (N) was

$$N = \frac{M \times n}{m}$$

where M is defined as the total number of cases in the routine DSP surveillance, n is defined as the total number of cases in the under-reporting field survey, and m is defined as the number of cases reported in both systems.
The under-reporting rate of DSP from 2009-2011 (p) based on CMR was:

$$p = \frac{N - M}{N} \times 100\,\%$$
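A minimal sketch of the two-sample calculation follows. It assumes the standard Lincoln-Petersen estimator N = M × n / m, which is the usual two-sample form but is an assumption here, and the counts in the example are invented.

```python
def cmr_estimate(M, n, m):
    """Two-sample capture-mark-recapture (Lincoln-Petersen, assumed form).

    M: deaths in routine surveillance
    n: deaths found by the field survey
    m: deaths found in both systems
    Returns (estimated total deaths N, under-reporting rate of surveillance).
    """
    if m <= 0:
        raise ValueError("m must be positive: the sources need some overlap")
    N = M * n / m
    under_reporting_rate = (N - M) / N  # algebraically equal to 1 - m / n
    return N, under_reporting_rate


# Invented counts: 900 deaths in surveillance, 1000 in the survey, 870 in both.
N_hat, cmr_rate = cmr_estimate(M=900, n=1000, m=870)
```

Under this estimator the rate reduces to the share of survey deaths that surveillance missed, here 130 of 1000, which is why the method is simple to apply but leans heavily on the two sources being independent.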

Results
Baseline characteristics of database
Table 3 shows the comparison of the sample dataset and the DSP dataset. Less than 10 % of the death cases were diagnosed below township-level hospitals and more than 90 % were diagnosed on a solid basis, suggesting that the cause of death reported by the DSP system was accurate and of good quality. The comparison showed that there were no significant differences between the two sources in terms of the major variables. As shown in ...

Under-reporting rate based on propensity score weighting and CMR
As shown in ... 11.0 %) in urban and 14.1 % (13.8 %, 14.3 %) in rural areas respectively. Consistent with the propensity score weighting method, the under-reporting rate in the west was higher than in the east and central regions (18.4 %, 9.9 % and 11.0 % respectively). The under-reporting rate for children aged 5 and below (19.6 %, 95%CI 17.3 %, 21.7 %) was the highest among all age groups. Table 6 summarizes the outputs of unadjusted and adjusted life tables for males and females in the DSP. The death probability for the 0-5 year age group was 0.0118 and 0.0082 for males and females respectively. Life expectancy at birth is a comprehensive reflection of mortality among all age groups and this study showed that life expectancy for Chinese males and females was 77.3 and 86.4 before adjustment. The under-reporting-adjusted life expectancy was 75.7 and 81.9 for males and females respectively.
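The adjustment behind the corrected life tables is not spelled out in the text. One common way to correct an observed death rate for incomplete registration, shown here purely as an assumed sketch, is to divide it by the completeness (1 minus the under-reporting rate). The observed rate below is hypothetical; 12.9 % is the overall PS-based under-reporting rate reported in this study.

```python
def adjust_death_rate(observed_rate, under_reporting_rate):
    """Scale an observed death rate up by the estimated completeness of reporting."""
    if not 0.0 <= under_reporting_rate < 1.0:
        raise ValueError("under_reporting_rate must lie in [0, 1)")
    completeness = 1.0 - under_reporting_rate
    return observed_rate / completeness


# Hypothetical observed crude death rate of 6.0 per 1000, corrected with the
# overall PS-based under-reporting rate of 12.9 %.
adjusted_rate = adjust_death_rate(6.0, 0.129)
```

Applying age- and sex-specific under-reporting rates in the same way would give adjusted age-specific death rates, which could then feed a life table.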

Discussion
The adjusted under-reporting rate for mortality during 2009-2011, estimated with both methods in our study, was lower than in the period 2006-2008. Consistent with previous studies, we found a significantly higher under-reporting rate in rural areas than in urban areas [7]. This could largely be explained by the lack of experienced doctors in charge of completing the death report and by difficulties in transferring information. Additionally, the unwillingness of many bereaved families in rural areas to report deaths worsened the under-reporting situation [9,10]. Similarly, the higher under-reporting rate in the west compared with the east and central regions was mainly caused by a lack of personnel and technical resources in less developed areas. Moreover, the special customs of some ethnic minorities in western regions made them less likely to report deaths, especially those of infants. Not surprisingly, the under-reporting rate for the population aged 5 years and below was the highest among all age groups. This may be associated with the poor quality of the death report card for infants and young children. Stigma and shame may lead some parents to conceal a child's death, particularly in rural areas and western regions. Furthermore, in the floating population (a group of people who do not live in the area permanently and are not considered official residents) of urban migrants, health services for mothers and children under 5 years of age are more difficult to access [11].
In under-reporting surveys, populations often display dependence and heterogeneity. A model of a stable population can always be imposed if using CMR. It is difficult, however, to obtain independent samples, and dependence between samples leads to inaccurate and sometimes misleading results [12]. It is not possible to evaluate the possible under-reporting rate when there are only two ascertainment sources, such as the under-reporting survey and DSPs. Quality criteria for survey performance were defined for all populations in DSPs. The advantage of the CMR method for calculating the under-reporting rate is its simplicity and ease of practical use. However, when the CMR method is used to calculate under-reporting rates for subgroups, the results could deviate substantially if the distribution of covariates between groups is uneven. The propensity score weighting method is used to make observational data resemble a random distribution, and the results show that propensity score weighting estimates are more internally consistent than the cell-based approach.
In a sampled field survey like the under-reporting survey, it is not easy to meet all the conditions. Dependencies between individual cases make some deaths more likely to be captured in some groups than in others, and such deaths are also more likely to be captured by the other source. When calculating the under-reporting rates for different groups, this selection bias would lead to biased results. When the distribution of covariates is consistent, as in the current study, the results of the two methods are similar. However, the propensity score weighting method is more flexible and better suited to calculating the under-reporting rates for different population groups because it takes into account each individual death.
The propensity score weighting method represents the influence of multiple covariates on under-reporting. It reduces the dimension of the covariates and calculates the under-reporting rate of each group based on the scores. In a large sample of cases, individuals in different groups can be adjusted using the propensity score, making the distribution of covariates between the groups equivalent to achieve a post-randomization [13]. Furthermore, propensity score weighting estimates are internally consistent, especially for groups with fewer death cases. For example, within the >5 years age group, the small number of deaths in the 6-14 year age group led to a large selection bias, and the under-reporting rate in this group based on the CMR method was much higher than in other groups. Propensity score weighting eliminated this bias, so the under-reporting rate in the 6-14 year age group based on PS was closer to the average rate in the >5 years age group. The results of propensity score weighting were therefore closer to the true level of under-reporting.
The reasons for under-reporting are multifaceted, including the local government's emphasis on the work, the competence and sense of responsibility of the staff, the influence of the local death registration system and collaboration among government departments [7,11,14,15]. Local population migration and traditional folk culture are also possible reasons for under-reporting. The fundamental way to improve the quality of data is not through under-reporting rate adjustments, but by improving and strengthening the quality management system. All levels of government should increase investment in mortality surveillance, especially in rural areas and western regions. Communication and coordination with the local public security departments and other relevant departments and sectors need to be strengthened so that data from multiple channels can complement each other. Doctors in rural health centers and community health services should be motivated to complete the death cards more carefully and accurately, and they should play a key role in the data collection process. The calculation of life expectancy in a population relies on an accurate estimate of the age-specific death rate. The adjusted life tables for the DSP population, generated using under-reporting rates based on propensity score weighting, shed light on the implications of under-reporting for the assessment of mortality patterns in China. The results for people aged above 5 years in the current study were similar to the estimates of the Global Burden of Disease China Study [16]. The higher life expectancy at birth in our study is due to the relatively lower under-reporting rate for the under 5 year age group in the DSP system. There was separate death surveillance for the under 5 year population in the Maternal and Child Health Surveillance Center of China (MCHSCN). Combining the data from DSP and MCHSCN would produce the most accurate estimate of mortality in all age groups in China.
With the rapid economic development and urbanization in China, the floating population has grown steadily and become an important part of the Chinese population. The current death cause surveillance system focuses on residents who have lived in the DSP site for more than six months (considered as locally registered residents). Therefore we were not able to obtain death information for the population who had lived in the DSP site for less than six months. Mortality in this group is hard to track and has a potential impact on the overall mortality of the Chinese population. The Chinese government has realized the importance of evaluating the health status of the floating population and initiated a national chronic disease risk factor survey based on the DSP system [17]. More investment is expected for this group and the floating population will be included in the death cause surveillance in future exercises.
The PS method has some limitations. Firstly, under-reporting was influenced by many sociodemographic variables of the deceased individuals. Since such information came from the death cards, incomplete and inaccurate records entered by the local staff would affect the accuracy of the logistic regression model. Secondly, although the PS method can eliminate some errors caused by sampling selection, it is not possible to obtain a perfectly random distribution. In addition, the PS method is more complicated and less easy to use in practice than the CMR method. Furthermore, death information for the floating population is incomplete in the current death cause surveillance system and we were not able to estimate the true mortality of this group, which is an important component of the population in China.
The Chinese government has planned to expand the current death cause surveillance system to include more counties and districts with provincial representativeness. The propensity score weighting approach could be applied to estimate the under-reporting rates nationally and provincially to assess the quality of mortality data from the DSP system. The mortality data from DSP system need adjustment for under-reporting. Although both CMR and PS methods can do the adjustment, the latter utilizes much more information and should be more suitable to adjust DSP data. Overall, the results of propensity score weighting are more accurate and can be used to address under-reporting in mortality surveillance in China.