Effects of a health information system data quality intervention on concordance in Mozambique: time-series analyses from 2009–2012

Background: We assessed the effects of a three-year national-level, ministry-led health information system (HIS) data quality intervention and identified associated health facility factors.
Methods: Monthly summary HIS data concordance between a gold standard data quality audit and routine HIS data was assessed in 26 health facilities in Sofala Province, Mozambique across four indicators (outpatient consults, institutional births, first antenatal care visits, and third dose of diphtheria, pertussis, and tetanus vaccination) and five levels of health system data aggregation (daily facility paper registers, monthly paper facility reports, monthly paper district reports, monthly electronic district reports, and monthly electronic provincial reports) through retrospective yearly audits conducted July-August 2010–2013. We used mixed-effects linear models to quantify changes in data quality over time and associated health system determinants.
Results: Median concordance increased from 56.3% during the baseline period (2009–2010) to 87.5% during 2012–2013. Concordance improved by 1.0% (confidence interval [CI]: 0.60, 1.5) per month during the intervention period of 2010–2011 and 1.6% (CI: 0.89, 2.2) per month from 2011–2012. No significant improvements were observed from 2009–2010 (during baseline period) or 2012–2013. Facilities with more technical staff (aβ: 0.71; CI: 0.14, 1.3), more first antenatal care visits (aβ: 3.3; CI: 0.43, 6.2), and fewer clinic beds (aβ: −0.94; CI: −1.7, −0.20) showed more improvements. Compared to facilities with no stock-outs, facilities with five essential drugs stocked out had 51.7% (CI: −64.8, −38.6) lower data concordance.
Conclusions: A data quality intervention was associated with significant improvements in health information system data concordance across public-sector health facilities in rural and urban Mozambique. Concordance was higher at those facilities with more human resources for health and was associated with fewer clinic-level stock-outs of essential medicines. Increased investments should be made in data audit and feedback activities alongside targeted efforts to improve HIS data in low- and middle-income countries.


Background
National-level, ministry-led health information systems (HIS) are widely touted as a "foundation of public health," [1] with available, reliable, timely, and valid data accepted as a prerequisite for decision-making and the provision of high-quality health services at all levels of the health care system. Published literature, however, is replete with studies detailing low quality of routine HIS data among many low- and middle-income countries (LMICs) [2][3][4][5][6]. In addition, failed attempts to use HIS data to monitor or evaluate the effects of health interventions or to conduct operational research are common [7][8][9][10].
Groups working in multiple LMICs have recently shown that rapid and effective methods for improving HIS data exist and have been tested [11]. In KwaZulu-Natal, South Africa, a seven-month data quality intervention consisting of three-day trainings, monthly data meetings, and data quality audits (DQAs) at health facilities increased data completeness from 26% to 64% and data accuracy from a correlation of 0.54 to 0.92 [12]. Interventions as simple as implementing quarterly data review workshops and fostering the use of HIS data for decision-making have resulted in improved data quality and coverage in diverse LMIC settings [13,14].
While case studies of short-term data quality interventions have been previously illustrated, no studies have quantitatively evaluated the relationship between health system factors and facility-level intervention effect heterogeneity over longer time periods. The objective of the present study is to measure the impact of a data quality intervention over three years and to identify factors associated with changes in HIS data concordance over time in Mozambique. Identifying these factors could improve the development and targeting of future interventions to improve HIS data in LMICs.

Study setting and data quality intervention
Funded through the Doris Duke Charitable Foundation's African Health Initiative, the Mozambique Population Health Implementation and Training Partnership (PHIT) is a comprehensive public health system intervention focused in Sofala Province [15]. One key element of this intervention is to improve routine HIS data through continual assessment of the availability, consistency, and accuracy of HIS data. Beginning in 2010, annual DQAs have been conducted from a sample of 26 health facilities from all districts in Sofala Province. The study setting and profile of the 26 health facilities have been previously described [16]. In terms of the intervention, health facilities are publicly ranked by summary data concordance measures, and facilities with poor data quality receive additional supportive supervision and data training. Additional intervention components include: (1) district-level meetings bringing together front-line health workers and district/provincial managers for data feedback, performance gap identification, solution planning, and action plan monitoring; (2) the development and use of simple data dashboards for easy visualization of secular trends in key health indicators; (3) the development of simple human resource allocation optimization models; and (4) equipment purchase and maintenance. A full description of intervention components and an introduction to the Mozambican HIS have been previously published [15,17].

Variable definitions and statistical analyses

Outcome of interest
Our outcome summarizes the availability and reliability (concordance) between a gold standard data quality audit and routine HIS data across four key indicators (outpatient consults, institutional births, first antenatal care visits [ANC1], and third dose of diphtheria, pertussis, and tetanus vaccination [DPT3]) and five levels of health system data aggregation (daily facility paper registers, monthly paper facility reports, monthly paper district reports, monthly electronic district reports, and monthly electronic provincial reports). As has been used in similar studies [12,18], data were deemed concordant if they had less than a 10% error margin comparing the gold standard DQA and routine HIS numbers. Each month's value was compared for all five levels of data aggregation and across the four key indicators listed above and then averaged. That is, perfect facility concordance would be 16/16, representing four indicators multiplied by four comparisons across the five levels all achieving <10% error. If data were unavailable, concordance was zero for that indicator/level combination. DQA data teams consisted of trained data collectors external to the Ministry health system supervised by a data expert. Data were double-entered and managed in an Excel database. If there were discrepancies in abstracted DQA data, data collectors would validate their measurements by recounting registry entries with the help of the expert supervisor.
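The scoring rule described above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that each of the 16 comparisons for a facility-month is supplied as a (reference count, reported count) pair, that relative error is computed against the reference count, and that a zero reference is concordant only with a zero report. The function names and the handling of zero counts are hypothetical; the text does not specify which pairs of aggregation levels form the four comparisons, so the sketch leaves the pairing to the caller.

```python
from typing import Optional, Sequence, Tuple

# One comparison: (reference count, reported count); None marks unavailable data.
Pair = Tuple[Optional[float], Optional[float]]


def is_concordant(reference: Optional[float],
                  reported: Optional[float],
                  tolerance: float = 0.10) -> bool:
    """Return True when the relative error is below the tolerance (<10%)."""
    if reference is None or reported is None:
        # Unavailable data scores zero for that indicator/level combination.
        return False
    if reference == 0:
        # Assumption: a zero reference is matched only by a zero report.
        return reported == 0
    return abs(reported - reference) / reference < tolerance


def facility_month_concordance(pairs: Sequence[Pair],
                               tolerance: float = 0.10) -> float:
    """Percentage of comparisons (e.g. 4 indicators x 4 comparisons) that agree."""
    if not pairs:
        raise ValueError("at least one comparison is required")
    hits = sum(is_concordant(ref, rep, tolerance) for ref, rep in pairs)
    return 100.0 * hits / len(pairs)
```

Under these assumptions, a facility-month with 14 concordant comparisons out of 16 scores 87.5%, and one missing report lowers the score by 6.25 percentage points.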

Predictors of interest
Predictors were selected based on previous research regarding facility-level predictors of stock-outs of essential health products [16] and the realities of data availability. These included: type of health facility; health facility burden measured in number of outpatient consults or ANC1 visits; number of inpatient beds; number of technical staff (doctors, nurses, assistants); number of administrative staff; distance from central drug and equipment distribution center; rural/urban location; and number of health facility drug stock-outs where the drug was available at the district-level drug depository. The relationship between stock-outs and data quality was evaluated for 2011 and 2012 only due to limited stock-out data availability. Detailed methods regarding data collection for drug stock-outs and other key predictors have been previously published [16].

Analysis methods
Mixed-effects linear models were built in Stata 13 with data concordance (0-100%) as our outcome of interest and α = 0.05 representing statistical significance using two-tailed tests. Our analysis plan included: (1) local regression across time and clinics to determine functional forms for variable parameterization; (2) crude analyses of data trends; (3) analyses of each explanatory variable and its effect on data quality after accounting for the confounding effect of time using linear splines with yearly knots and random intercepts and slopes for clinics; and (4) fully adjusted analyses controlling for time and simultaneously adjusting for all predictors. For all models, significance of group variables (health facility type, number of drug stock-outs) was determined by a chunk test prior to interpreting within-group associations. Analyses of residual plots indicated no significant lack of model fit at any step.
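To make the time parameterization concrete, the following Python sketch builds a piecewise-linear spline basis with yearly knots. It is an illustration only, not the Stata code used in the analysis. It assumes time is indexed in months from the start of the baseline period, with knots at 12, 24, and 36 months; the function name is hypothetical. With this parameterization, the coefficient on each basis term is the change in concordance per month within that year. The mixed-effects fit itself (random intercepts and slopes for clinics) is not shown.

```python
from typing import List, Sequence


def linear_spline_basis(month: float,
                        knots: Sequence[float] = (12, 24, 36)) -> List[float]:
    """Split elapsed months into per-segment durations between yearly knots.

    Each returned term counts the months elapsed inside one segment, so a
    regression coefficient on that term is the slope within that segment.
    Assumption: month 0 is the start of the baseline period.
    """
    edges = [0.0, *knots]
    basis = []
    for i, lower in enumerate(edges):
        # The final segment is open-ended.
        upper = edges[i + 1] if i + 1 < len(edges) else float("inf")
        # Months spent in this segment, clipped to the segment length.
        basis.append(min(max(month - lower, 0.0), upper - lower))
    return basis
```

For example, month 30 falls halfway through the third year, so the basis is 12 months in the first segment, 12 in the second, 6 in the third, and 0 in the fourth. The terms always sum to the elapsed months, which keeps the fitted trend continuous at each knot.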

Ethics statement
This study was approved by the Mozambican National Institutional Review Board. The University of Washington deemed this study exempt as it focused on program evaluation purposes and was not considered human subjects research under United States federal regulations.

Results
Descriptive statistics, the basic profile of health facilities surveyed, and information about the study setting have been previously published [14,15]. The intraclass correlation coefficient was 0.26 (confidence interval [CI]: 0.14, 0.37). Baseline median concordance in 2009 was 56.3% and concordance increased to 87.5% by 2012 (Table 1). Each 100-unit increase in first antenatal visits was associated with 3.3% higher (CI: 0.43, 6.2) data concordance, while each additional inpatient bed was associated with 0.94% (CI: −1.7, −0.20) lower data concordance (Table 2). Further, each additional technical staff member at the health facility was associated with 0.71% higher (CI: 0.14, 1.3) data concordance.
The factor most strongly associated with concordance was the number of essential drugs stocked out at health facilities while the drug was available at the district headquarters. Compared to those clinics with no drug stock-outs, those with five drugs stocked out had 51.7% (CI: −64.8, −38.6) lower data concordance.

Discussion
Similar to previous studies in sub-Saharan Africa [11][12][13][14], the present study found that an intervention consisting of data audits, equipment/supply purchase and maintenance, supportive supervision to low-performing clinics and feedback from district/provincial levels, data trainings, and district performance enhancement meetings focused on improving data use for decision-making can result in rapid improvements in data concordance in public-sector health facilities. Novel findings from our study in Mozambique are that: (1) improvements in data quality occur most significantly during the first two years and may hit a plateau of approximately 85-90% mean concordance; (2) improvements in data reliability can be sustained over multiple years given continued intervention activities; (3) higher numbers of human resources for health are associated with larger gains in data concordance; (4) facilities attending more antenatal care visits and those with fewer inpatient beds also show greater increases in concordance; and (5) stock-outs of essential medicines for primary health care provision are strongly associated with poor HIS data quality.
Our findings that data improvements were not related to determinants such as facility location (rural/urban, distance from district headquarters) and facility type are promising given that these more "static" infrastructure-related factors are difficult to modify in the short term. Given this, rapid and equitable data improvements appear possible even at rural peripheral health facilities that traditionally have the fewest health resources. These results support past evidence suggesting that management issues centered around motivation and value placed on the quality of routine data collection [14,19], as well as health worker numeracy and training [20], may be significant determinants of poor HIS data quality in LMICs. Our study builds on these previous findings by showing that, controlling for health facility location and type, interventions to improve data quality may be less effective at facilities with few human resources for health or large amounts of high-burden inpatient services. Further research should clarify how facility burden characteristics (number of ANC1 visits, outpatient visits) are related to data improvements, given our counterintuitive finding of a positive relationship between ANC1 visits and data concordance but no corresponding association with outpatient visits. Given that HIS data quality gains can be sustained over multiple years (allowing reliable data-driven decision-making), and that relatively simple data improvement interventions have been tested and shown effective in multiple LMIC settings, donors and governments should consider investments in DQAs and other interventions to improve routine data systems.
These investments are especially important given recent analyses indicating potentially increasing subnational disparities in health statistics in LMICs [21] and the difficulty of traditional survey designs (Demographic and Health Surveys/Multiple Indicator Cluster Surveys) to provide health statistics below the provincial level [10,22]. Moreover, our findings further support the idea that quality HIS data are necessary for high-quality service provision, such as supply management of essential medicines and the forecasting of future supply needs to guard against stock-outs.
Our study has a number of limitations. First, without an adequate control group we cannot eliminate the possibility that all clinics in Mozambique are experiencing similar data improvements. Second, significant increases in data concordance do not necessarily mean that data validity has improved; validity is a more difficult metric to evaluate. Third, the key indicators evaluated may not be representative of all HIS indicators essential for program planning and service provision. Last, the present study was conducted in one province of Mozambique and in a subset of clinics and therefore may not be representative of all public health clinics nationally.

Conclusion
We found that an intervention consisting of facility-based data audits, targeted training and supervision, equipment purchase/maintenance, and data audit and feedback meetings was associated with significant increases in public-sector HIS data concordance. Improvements were greater at health facilities with more human resources for health, more antenatal care visits, and fewer inpatient beds. Given the importance of available, reliable, timely, and valid data for decision-making and health care provision (such as effective management of essential medicines), donors and Ministries of Health should consider increased investments in improving HIS data quality. Future studies should aim to identify which data quality intervention components are most effective and to determine the sustainability of data quality interventions over the longer term.