Using funnel plots in public health surveillance
© Dover and Schopflocher; licensee BioMed Central Ltd. 2011
Received: 19 April 2011
Accepted: 10 November 2011
Published: 10 November 2011
Public health surveillance is often concerned with the analysis of health outcomes over small areas. Funnel plots have been proposed as a useful tool for assessing and visualizing surveillance data, but their full utility has not been appreciated (for example, in the incorporation and interpretation of risk factors).
We investigate a way to simultaneously focus funnel plot analyses on direct policy implications while visually incorporating model fit and the effects of risk factors. Health survey data representing modifiable and nonmodifiable risk factors are used in an analysis of 2007 small area motor vehicle mortality rates in Alberta, Canada.
Small area variations in motor vehicle mortality in Alberta were well explained by the suite of modifiable and nonmodifiable risk factors. Funnel plots of raw rates and of risk adjusted rates lead to different conclusions; the analysis process highlights opportunities for intervention as risk factors are incorporated into the model. Maps based on funnel plot methods identify areas worthy of further investigation.
Funnel plots provide a useful tool to explore small area data and to routinely incorporate covariate relationships in surveillance analyses. The exploratory process has at each step a direct and useful policy-related result. Dealing thoughtfully with statistical overdispersion is a cornerstone to fully understanding funnel plots.
According to a widely cited definition proposed by the CDC, "Public Health Surveillance is the on-going, systematic collection, analysis, and interpretation of health data essential to the planning, implementation, and evaluation of public health practice, closely integrated with the timely dissemination to those who need to know". The results of analyses conducted on data collected within a surveillance system can be used to inform public health policy and planning, to monitor the health status of a population, and to stimulate research. A functional surveillance system will provide information about the number of health events of specified types that occur within specified populations on an ongoing basis and can therefore be used to derive disease and health event rates over time in different areas (or subpopulations of other types).
One routine surveillance activity may be to monitor rates of disease occurrence in small areas in order to identify anomalies that might have a geographic basis and to enable the reporting of such anomalies to authorities in these areas. Substantial variability in population sizes in small areas introduces some challenges in the comparisons of rates, however, because the precision of estimation of these rates depends on the size of the population over which they are measured.
Several graphical procedures have been proposed for displaying small area rates to support the location of anomalous patterns. League plots and choropleth maps are two common approaches. League plots display observed rates (with confidence intervals) ordered by those rates. These plots are difficult to interpret because they encourage interpretation as a rank ordering, and rank orderings are known to have extremely poor statistical properties (see, for example, [5, 6]). Choropleth maps of rates apply differential color schemes to chosen categorizations (often quintiles) of observed rates and color each area on a map according to the category of its observed rate. These are also easy to misinterpret, both because the map reflects geographic area rather than population density and because the choice of categories is arbitrary, so the same data may produce maps with very different appearances. Cartogram versions attempt to redraw areas in proportion to populations but are often difficult to reconcile to geographies and still suffer from the arbitrary category problem.
Funnel plots are an alternative to both league plots and choropleth maps. Funnel plots are a form of scatter plot in which observed area rates are plotted against area populations. Control limits are then overlaid on the scatter plot. The control limits represent the expected variation in rates assuming that the only source of variation is stochastic. The control limits are computed in a fashion very similar to confidence limits and exhibit the distinctive funnel shape as a result of smaller expected variability in larger populations.
Funnel plots were first introduced in meta-analyses, where they are often used to determine whether a lack of a particular type of published findings demonstrates the presence of a publication bias [8, 9]. This would be indicated by the absence of points in a particular region of the funnel (especially an absence of studies with a small sample size and a negative result).
The funnel plot can also be considered a form of control chart. Control charts monitor whether a manufacturing or business process is under control. If analysis indicates that the process is currently stable, with only stochastic variation, then data from the process will vary within known limits and can be used to predict the future performance of the process. If the chart indicates that the data from the process being monitored are too variable, analysis of the chart can help determine the sources of variation, which might then be eliminated to bring the process back into control. In a funnel plot, if rate variation is purely stochastic, then an appropriate proportion of the points representing area rates will tend to fall within the funnel, and importing control chart terminology, we might consider the (rate generation) process to be "under control." We can also revert to statistical terminology and note that the model fit is adequate (where, in this simple case, the model is of a single stable rate). When many rates fall outside the funnel, the plot can be described as "overdispersed," and it can be said that the process is not in control or the model does not fit the data well. Control chart terminology has been adapted to health system performance in various jurisdictions where it is assumed that managers within a health system can exercise control over a health event-related process. Many of the issues in institutional performance monitoring are shared by health surveillance in support of public health. Both activities deal with small domains, highly variable rates, large differences in population sizes, multiple testing issues, ongoing monitoring activities, and dissemination of results to interested parties invested with the authority or responsibility to effect change.
It should also be noted that funnel plots are not limited to representing the model of a single stable rate; more complex models can underlie the estimation of the rate or quantity of interest. For example, plotted rates can represent the residuals that remain after a rate, predicted from the values of relevant covariates using a regression model, has been subtracted from the observed rate. In health services research this process is typically called risk adjustment [2, 11, 12].
An ideal model for routine monitoring in health surveillance would begin with a model that fit the data, that is, where the rate generation process could be considered to be under control. Subsequent monitoring over time could focus on whether the rate generation process could be considered to be remaining under control. As well, funnel plots provide a natural, graphical method of assumption checking and model diagnostics during the model development process itself. At any stage, funnel plots may also locate areas with unusually high or low rates (outliers) and this might justify further field epidemiologic or research investigations.
In this paper we demonstrate the use of funnel plots for model development using motor vehicle mortality data in Alberta, Canada. We begin by constructing a funnel plot under the simple model of a single provincial rate and observe that it shows overdispersion. Then we demonstrate a risk adjustment process that largely eliminates this overdispersion. Finally, we discuss steps that emerge from the model that might be taken by public health decision-makers and discuss its use for routine monitoring.
We will speak in terms of small geographies, counts, and rates, and comparisons to an overall rate as these terms are commonly used in health surveillance. However, it should be noted that funnel plots are quite general and can be used for any domain where multiple estimates have been made using varying sample sizes.
Data are from the province of Alberta, Canada. Alberta is located in Western Canada and has a population of 3,600,000. The province maintains a publicly-funded universally-available health care system. All residents of the province (except the military, the Royal Canadian Mounted Police, and federal inmates) are registered with the Alberta Health Care Insurance Plan (AHCIP). This Stakeholder Registry contains demographic information including addresses and therefore provides a source of population estimates by temporal and spatial boundaries.
Maps are based on the Alberta Regional Health Authorities (RHA), reflecting boundary changes introduced in December 2003 and in force until 2009. The small areas analyzed are 70 subregional boundaries created specifically for the analysis of health data. RHA officials were engaged in the process to ensure that the subregions would have operational relevance. A population of 20,000 was chosen as a minimum target within each subregion in order to ensure that rates would be relatively stable, and this target was met in almost all cases.
The Alberta Vital Statistics Death Registry provides demographic information about each death in Alberta as well as the cause of death according to International Classification of Diseases, 10th revision (ICD-10) codes. The current analysis reports motor vehicle traffic death rates during 2007. Motor vehicle traffic deaths were identified as ICD-10 codes V30-V89 with .5, V39-V79.4, V86.00, and V86.08.
Covariates for risk adjustment (seat belt use; drinking and driving; road type and utilization) are derived from the 2007 cycle of the Canadian Community Health Survey (CCHS), a self-report survey administered annually to approximately 65,000 Canadians (5,000 Albertans) by Statistics Canada. Provincial health ministries are granted special access to location information for respondents in the CCHS sharefile, making it possible to estimate rates at the subregion level by linking CCHS postal codes to subregion boundary files and utilizing the CCHS survey weights.
Drinking and driving is the self-reported proportion of respondent drivers having driven a vehicle after two or more drinks; seat belt use is the self-reported proportion of drivers "always" wearing seat belts or passengers "always" wearing seat belts while in the front seat. A proxy for road type and utilization was based on the Statistics Canada Metropolitan Influence Zone (MIZ), a measure of the influence a major urban center has upon outlying areas substantially based upon the percent of the population that commutes daily to an urban center. Subregions were assigned the modal MIZ score.
The population, mortality, and survey data are all aggregated and analyzed at the 70 subregional boundaries.
The funnel plots use binomial control limits given by

$$\hat{\theta} \pm \Phi^{-1}(1 - \alpha/2)\,\sqrt{\frac{\hat{\theta}(1 - \hat{\theta})}{n}}$$

where $\Phi^{-1}(\cdot)$ is the inverse cumulative standard normal distribution, evaluated for $100(1-\alpha)\%$ control limits, $\hat{\theta}$ is the provincial rate, and $n$ is the population size. Other methods for control limit generation could be used, see  for a comprehensive review. To emphasize, $\hat{\theta}$ is fixed at the overall provincial rate as estimated from the data and $n$ varies freely. The rate for each subregion is then overlaid on the plot at its actual population size and rate.
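The limit calculation described above can be sketched in a few lines of Python. This is only an illustration of the normal-approximation limits, not the authors' SAS macro: the function name `funnel_limits`, the example provincial rate, and the clipping of limits to the interval [0, 1] are all assumptions made here.

```python
from math import sqrt
from statistics import NormalDist


def funnel_limits(theta, n, level=0.95):
    """Two-sided normal-approximation binomial control limits.

    theta : fixed overall (provincial) rate, as a proportion
    n     : population size at which the limits are evaluated
    level : coverage of the limits, e.g. 0.95 or 0.998
    """
    alpha = 1.0 - level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sqrt(theta * (1.0 - theta) / n)
    # Clipping to [0, 1] is an assumption of this sketch, since a rate
    # cannot be negative or exceed 1.
    return max(0.0, theta - half_width), min(1.0, theta + half_width)


# Hypothetical provincial rate of 12 deaths per 100,000 population.
theta = 12 / 100_000
for n in (20_000, 50_000, 200_000):
    lo95, hi95 = funnel_limits(theta, n, 0.95)
    lo998, hi998 = funnel_limits(theta, n, 0.998)
    print(n, round(hi95 * 1e5, 1), round(hi998 * 1e5, 1))
```

Evaluating the function over a grid of population sizes traces out the funnel; each subregion is then plotted at its own population and observed rate.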
The funnel plot control limits are set at 95% and 99.8%. These correspond conceptually to the 95% confidence level often used in health services research and to the 3-sigma limits commonly used in process control.
Funnel plots for survey-based measures require a slight modification to account for the complex survey design. The population values are scaled by the particular survey question design effect to account for the additional variability due to the complex survey design.
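One way to read the design-effect adjustment is as an effective sample size: the sample size is divided by the design effect before the limits are computed, which widens the funnel. The sketch below takes that reading as an assumption; the text does not spell out the exact scaling, and the function name `survey_funnel_limits` and the example values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def survey_funnel_limits(theta, n, deff, level=0.95):
    """Control limits for a survey-based proportion.

    Assumes the design effect enters as an effective sample size,
    n_eff = n / deff, so a design effect above 1 widens the limits.
    """
    n_eff = n / deff
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    half_width = z * sqrt(theta * (1.0 - theta) / n_eff)
    return max(0.0, theta - half_width), min(1.0, theta + half_width)


# Hypothetical values: overall proportion 0.90, 380 respondents,
# design effect 3.8 (so an effective sample size of about 100).
print(survey_funnel_limits(0.90, 380, 3.8))
print(survey_funnel_limits(0.90, 380, 1.0))  # no design-effect adjustment
```

Under this reading, a design effect of 3.8 makes the funnel roughly 1.95 times wider (the square root of 3.8) than it would be under simple random sampling.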
Funnel plot principles for mapping
Funnel plots are adapted to mapping through the use of z-scores [3, 16]. The funnel plot based z-scores are computed as

$$z_i = \frac{p_i - \hat{\theta}}{\sqrt{\hat{\theta}(1 - \hat{\theta})/n_i}}$$

where $\hat{\theta}$ is the provincial rate, $p_i$ is the $i$th subregion rate, and $n_i$ is the $i$th subregion population. Values greater than 3 are color-coded red, values between 2 and 3 are orange, values less than -3 are dark green, and values between -3 and -2 are green. All other values are color-coded yellow. These z-score cut-offs correspond approximately to the 95% and 99.8% control limits in the funnel plot (z-values of 1.96 and 3.09, respectively).
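The mapping rule can be written down directly. The following is a minimal sketch of the z-score and the color assignment described above; the function names and the example subregion are hypothetical, and values exactly equal to a cut-off are treated as falling in the less extreme category, which is an assumption.

```python
from math import sqrt


def funnel_z(p_i, n_i, theta):
    """z-score of a subregion rate p_i (population n_i) against the fixed rate theta."""
    return (p_i - theta) / sqrt(theta * (1.0 - theta) / n_i)


def map_color(z):
    """Map a funnel z-score to the five-category color scheme in the text."""
    if z > 3:
        return "red"
    if z > 2:
        return "orange"
    if z < -3:
        return "dark green"
    if z < -2:
        return "green"
    return "yellow"


# Hypothetical subregion: 9 deaths in a population of 25,000,
# compared with a provincial rate of 12 per 100,000.
z = funnel_z(9 / 25_000, 25_000, 12 / 100_000)
print(round(z, 2), map_color(z))
```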
Risk adjustment was carried out using a judgment-based modeling procedure. Covariates that may explain between-region variability in rates were selected a priori. Poisson regression on mortality counts with a log(population) offset, a standard method for regression on rates, was carried out sequentially including demographic factors (age, sex), behavioral risk factors (seat belt use, drinking and driving) and environmental factors (proxy for road type and utilization). The adjusted rate is the product of the provincial crude rate and the ratio of observed to expected values from the relevant regression model. Poisson regression methods are not discussed in any further detail as the focus of this paper is on the use of funnel plots; other sources offer complete discussions of risk adjustment and regression methods [11, 12]. Pearson goodness-of-fit statistics, in addition to the number of small areas outside the control limits, are reported at each stage in the modeling process.
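Given expected counts from a fitted regression model, the adjusted rate and the Pearson statistic reduce to simple arithmetic. The sketch below assumes the expected counts are already available (the Poisson regression itself is not shown) and uses the usual Poisson form of the Pearson statistic, the sum of (O - E)^2 / E; all numbers and function names are hypothetical.

```python
def adjusted_rate(observed, expected, crude_rate):
    """Risk-adjusted rate: provincial crude rate times the observed/expected ratio."""
    return crude_rate * observed / expected


def pearson_chi_square(observed, expected):
    """Pearson goodness-of-fit statistic for Poisson counts: sum of (O - E)^2 / E."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical observed deaths and model-expected deaths in four subregions.
observed = [6, 2, 9, 3]
expected = [4.0, 4.0, 8.0, 3.5]
crude = 12 / 100_000  # hypothetical provincial crude rate

for o, e in zip(observed, expected):
    print(round(adjusted_rate(o, e, crude) * 1e5, 1))  # adjusted rate per 100,000
print(round(pearson_chi_square(observed, expected), 3))
```

A common rule of thumb is to compare the Pearson statistic with its residual degrees of freedom, with a ratio well above 1 suggesting overdispersion; whether the reported statistic is scaled in this way is not stated in the text.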
All analyses were carried out in SAS 9.2. The macro code used to create the funnel plots is freely available from the authors.
In searching for a model with a better fit to the data, we begin by adjusting for demographic factors, age distribution, and sex. Then, we adjust for two well-known behavioral risk factors of motor vehicle mortality for which health surveillance data are regularly available: seat belt use and drinking and driving. Finally, the model is adjusted for the proxy for road type and utilization.
Table: Modeling between-subregion variation in motor vehicle traffic mortality rates
Columns: Model; Pearson goodness of fit; Outside 95% limits (#); Outside 99.8% limits (#)
Models listed: Age, sex, drinking and driving; Age, sex, seat belt use; Age, sex, seat belt use, road type and utilization
Since the CCHS implements a complex survey design, the funnel plots have been adjusted for the design effect of 3.8 for seat belt use in 2007. All survey-related points are randomly jittered in the figures and the axis has been suppressed to protect confidentiality as required by Statistics Canada, the statistical agency that owns the data.
One interesting aspect of the funnel plot in Figure 1 is the substantial number of rates for small areas falling outside the funnel plot limits. This overdispersion is not an unusual phenomenon in health data. The ability of the funnel plot to clearly show overdispersion is, we feel, one of the most useful aspects of the funnel plot. We can immediately and visually see that we don't fully understand the disease process. This judgment should be considerably easier than judgments of the presence or absence of publication bias when considering funnel plots of effect sizes from a meta-analysis, which depend upon the distribution of points within the funnel limits and are therefore quite error prone.
Funnel plots are therefore extremely useful in focusing analysts' attention on model misspecification. When overdispersion is observed, the key question becomes what to do about it. Some have advocated the use of statistical correction [21, 22] to adjust for overdispersion, either through random effects models, via an overdispersion parameter, or both. We feel this should be an approach of last resort only. If there is large variability in a health variable being monitored, adjustment via the inclusion of missing covariates should be the first line of attack. We note that this adjustment need not be directly causally based. For example, if seat belt use data were not available, but a similar risk taking behavior variable was, that proxy variable could still have served to substantially explain the variability in motor vehicle mortality rates. With the plethora of survey and administrative data available today, there is no reason not to attempt to understand and model the factors affecting between-region variability before resorting to random effects-type models. Also, these blind approaches to overdispersion carry substantial risk in the surveillance arena. In the case of misspecification due to a missing covariate, random effects models make the strong assumption that the missing covariate value is essentially proportional to the observed rate. It is very easy for this not to be the case in practice, as illustrated in our example. In fact, had further attempts at adjustment not been made, the interpretation of the funnel plot would have pointed public health epidemiologists to the wrong area. The purported statistical approach to fixing overdispersion must be used with great caution. Echoing Berk et al., "one risks an arbitrary correction leading to arbitrary results."
In the motor vehicle mortality example, one small area would still have been outside the 99.8% control limits if an overdispersion factor had been included, even though our analysis shows that this area was not any sort of outlier but simply has a poor combination of age, sex, seat belt use, and road type. In the analysis presented, the final model shows good fit. Had this not been the case, it would have been possible to include random effects or an overdispersion factor in the final model. Future surveillance and monitoring efforts could continue, keeping the random effects and/or overdispersion value fixed. Attempts to dynamically alter the overdispersion parameter or re-predict random effects might only mask any real changes over time.
The funnel plot methodology encourages the use of data from multiple sources. Funnel plots in the surveillance domain can rely on aggregate data, making the linking process between data sources much easier to facilitate. In our example, we were able to seamlessly integrate survey and administrative data sources because they are only required to be available at the aggregate subregion level. This also has implications for ongoing monitoring: with systems in place to create small area estimates from a variety of data sources, ongoing monitoring and creation of future funnel plots should be possible.
Underlying the outlined funnel plot methodology is the choice of method for creating the limits. The asymptotic normal approximation was used. This choice may appear unusual in light of the fact that, for binary confidence intervals, the use of the asymptotic normal approximation is generally not recommended as it can have very poor performance characteristics [see  for a recent review]. Ongoing research by the authors suggests that it is the opposite case for funnel plots and the asymptotic normal approximation outperforms other methods for creating limits.
The funnel plot methodology was also successfully adapted to a mapping framework. The ability to display surveillance data in a geographic context can aid in the understanding of that data. The maps, combined with expert knowledge of the areas, can generate suggestions as to what factors may explain any residual overdispersion.
Following the institutional performance literature, funnel plots of disease rates, risk factors, or changes in these could also be used as performance measurement tools. Using target rates as the funnel plot center line and placing the funnel around them gives an indication of how many small areas are likely achieving a public health target. The funnel plot of seat belt use rates around a target of 95% in Figure 10 gives a visual representation of both the range of seat belt use rates and the number of areas where seat belt use is below the target.
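A target-centered funnel can be sketched by placing the center line at the target instead of the overall rate and listing the areas whose rate falls below the lower limit. The area names, rates, and sample sizes below are hypothetical, and `n` is assumed to be a sample size that has already been adjusted for any survey design effect.

```python
from math import sqrt
from statistics import NormalDist


def lower_limit(target, n, level=0.95):
    """Lower normal-approximation control limit around a target proportion."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    return target - z * sqrt(target * (1.0 - target) / n)


def areas_below_target(areas, target=0.95, level=0.95):
    """Names of areas whose rate lies below the lower limit around the target."""
    return [name for name, rate, n in areas if rate < lower_limit(target, n, level)]


# Hypothetical (name, seat belt use rate, effective sample size) triples.
areas = [("A", 0.90, 400), ("B", 0.94, 400), ("C", 0.93, 5000)]
print(areas_below_target(areas))  # prints ['A', 'C']
```

Area B is below the 95% target but within the funnel at its sample size, whereas area C, with a similar rate and a much larger sample, falls outside it.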
The recommended analysis process employing sequential funnel plots and multiple covariates lends itself to identifying opportunities for policy recommendations. For a covariate that does enter the model, there is evidence that the covariate varies across the province, naturally suggesting that further analysis of this covariate might identify local area level intervention and policy opportunities. If a known risk factor does not enter the model, a global policy level recommendation may be in order. In our example, drinking and driving did not enter the final model, suggesting that policy recommendations could be made at the provincial level; while seat belt use, which did enter the final model, lends itself to local level interventions. Further consideration of factors as modifiable or nonmodifiable facilitates the interpretation of individual small area rates. Adjusting for nonmodifiable risk factors allows a clear comparison to crude rates and highlights the potential for improvement through modifiable factors. Assessing the modifiable factors through their own funnel plots can help target local area level interventions and policy initiatives.
The use of funnel plots and modeling to assess the relationships between potential risk factors and outcomes must always be carried out with care. The process described employs an ecological model and carries with it the potential limitations and cautions of this type of design [see for example, ]. Particular care should be taken in interpreting the meaning of any coefficients in the model to avoid the ecological fallacy. We have framed the process as a surveillance activity where there is usually an evidential basis for inclusion of risk factors or proxies for risk factors. Clearly any single ecological correlation would be insufficient evidence to justify public health action, but when noted in the context of established risk factors, public health activities may be reasonable.
We envision three key areas for the evolution of the funnel plot in public health surveillance. The first area is the integration of the funnel plot into ongoing monitoring activities over time. We have touched on issues regarding the use of random effects and overdispersion parameters as they relate to repeated applications of a funnel plot over time. The questions related to incorporating modeling into a funnel plot-based surveillance process (Re-run the model each year with additional data? Hold coefficients constant over time? How best to display multiple years of data?) are an area of active inquiry. A related area for evolution of the funnel plot is how to appropriately incorporate funnel plots into a multilevel model framework. As multiple levels of data are becoming available for analysis in surveillance, multilevel models will become more common. Finally, funnel plots have a close link to spatial data as they are currently used in public health surveillance. The ties, theoretical and applied, to spatial methods provide a large area for future contributions.
Funnel plots and their cartographic equivalents provide visually attractive means of displaying small area data in health surveillance and other disciplines for the purposes of anomaly detection and ongoing monitoring, while accounting for variation in small samples. Overdispersion, readily apparent when present in funnel plots, needs to be dealt with thoughtfully in the analysis and modeling stages of surveillance to ensure that the interpretation of the surveillance data is appropriate. The use of funnel plots in health surveillance modeling activities naturally focuses attention to the level that policy recommendations should be made.
This work was made possible by a grant from Alberta Health and Wellness to DS. The findings and conclusions in this report are those of the authors and do not necessarily represent the views of Alberta Health and Wellness or the University of Alberta.
1. CDC: Guidelines for evaluating surveillance systems. MMWR 1988, 37(S-5).
2. Woodall WH: The Use of Control Charts in Health-Care and Public-Health Surveillance. J Qual Technol 2006, 38(2):89-104.
3. Rogerson P, Yamada I: Statistical detection and surveillance of geographic clusters. Hoboken, NJ, Taylor & Francis; 2008.
4. Marshall T, Mohammed MA, Rouse A: A randomized controlled trial of league tables and control charts as aids to health service decision-making. Int J Qual Health Care 2004, 16(4):309-315. 10.1093/intqhc/mzh054
5. Marshall CE, Spiegelhalter DJ: Reliability of league tables of in vitro fertilisation clinics: retrospective analysis of live birth rates. BMJ 1998, 316:1701-1705. 10.1136/bmj.316.7146.1701
6. Shen W, Louis TA: Triple-goal estimates for disease mapping. Statistics in Medicine 2000, 19:2295-2308. 10.1002/1097-0258(20000915/30)19:17/18<2295::AID-SIM570>3.0.CO;2-Q
7. Sui DZ, Holt JB: Visualizing and Analysing Public-Health Data Using Value-by-Area Cartograms: Toward a New Synthetic Framework. Cartographica 2008, 43(1):3-20. 10.3138/carto.43.1.3
8. Light RJ, Pillemer DB: Summing up: the science of reviewing research. Cambridge, Mass., Harvard University Press; 1984.
9. Sterne JAC, Egger M, Smith GD: Investigating and dealing with publication and other biases in meta-analysis. BMJ 2001, 323:101-105. 10.1136/bmj.323.7304.101
10. Benneyan JC, Lloyd RC, Plsek PE: Statistical process control as a tool for research and healthcare improvement. Qual Saf Health Care 2003, 12:458-464. 10.1136/qhc.12.6.458
11. Spiegelhalter DJ: Funnel plots for comparing institutional performance. Statistics in Medicine 2005, 24:1185-1202. 10.1002/sim.1970
12. Iezzoni LI (ed): Risk Adjustment for Measuring Health Care Outcomes. 3rd edition. Chicago, IL, Health Administration Press; 2003.
13. Alberta Health and Wellness: Calculating Small Area Analysis: Definition of Sub-regional Geographic Units in Alberta. Alberta. 2003. [http://www.health.alberta.ca/documents/Geo-Calculating-Small-Area-2003.pdf]
14. Statistics Canada: Canadian Community Health Survey (CCHS). [http://www.statcan.gc.ca/cgi-bin/imdb/p2SV.pl?Function=getSurvey&SurvId=3226&SurvVer=0&InstaId=15282&InstaVer=4&SDDS=3226&lang=en&db=IMDB&dbg=f&adm=8&dis=2]
15. Korn EL, Graubard BI: Analysis of Health Surveys. New York, NY, Wiley; 1999.
16. Rogerson P, Yamada I: Statistical Detection and Surveillance of Geographic Clusters. Boca Raton, FL, Chapman and Hall; 2008.
17. Barss P, Smith GS, Baker SP, Mohan D: Injury prevention: an international perspective epidemiology, surveillance, and policy. New York, Oxford University Press; 1998.
18. Ministry of Transportation, Government of Canada: Vision 2010 - Making Canada's Roads the Safest in the World. 2002. [http://www.ccmta.ca/english/pdf/rsv_report_02_e.pdf]
19. Birkmeyer JD: Primer on Geographic Variation in Health Care. Effective Clinical Practice 2001, 4(5):232-233.
20. Terrin N, Schmid CH, Lau J: In an empirical evaluation of the funnel plot, researchers could not visually identify publication bias. Journal of Clinical Epidemiology 2005, 58:894-901. 10.1016/j.jclinepi.2005.01.006
21. Spiegelhalter DJ: Handling over-dispersion of performance indicators. Qual Saf Health Care 2005, 14:347-351. 10.1136/qshc.2005.013755
22. Kim H, Kriebel D: Regression models for public health surveillance data: a simulation study. Occup Environ Med 2009, 66:733-739. 10.1136/oem.2008.042887
23. Ohlssen DI, Sharples LD, Spiegelhalter DJ: A hierarchical modelling framework for identifying unusual performance in health care providers. J R Statist Soc A 2007, 170:865-890. 10.1111/j.1467-985X.2007.00487.x
24. Berk R, MacDonald J: Overdispersion and Poisson Regression. Journal of Quantitative Criminology 2008, 24:269-284. 10.1007/s10940-008-9048-4
25. Pires AN, Amado C: Interval estimators for a binomial proportion: comparison of twenty methods. Revstat 2008, 6(2):165-197.
26. Rothman KJ, Greenland S, Lash TL: Modern Epidemiology. 3rd edition. Philadelphia, PA, Lippincott Williams & Wilkins; 2008.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.