Using multiyear national survey cohorts for period estimates: an application of weighted discrete Poisson regression for assessing annual national mortality in US adults with and without diabetes, 2000–2006
 Yiling J. Cheng^{1},
 Edward W. Gregg^{1},
 Deborah B. Rolka^{1} and
 Theodore J. Thompson^{1}
DOI: 10.1186/s12963-016-0117-x
© The Author(s). 2017
Received: 20 April 2016
Accepted: 9 December 2016
Published: 15 December 2016
Abstract
Background
Monitoring national mortality among persons with a disease is important to guide and evaluate progress in disease control and prevention. However, a method to estimate nationally representative annual mortality among persons with and without diabetes in the United States does not currently exist. The aim of this study is to demonstrate the use of weighted discrete Poisson regression on national survey mortality follow-up data to estimate annual mortality rates among adults with diabetes.
Methods
To estimate mortality among US adults with diabetes, we applied a weighted discrete time-to-event Poisson regression approach with post-stratification adjustment to national survey data. Adult participants aged 18 or older with and without diabetes in the National Health Interview Survey 1997–2004 were followed up through 2006 for mortality status. We estimated mortality among all US adults, and by self-reported diabetes status at baseline. The time-varying covariates used were age and calendar year. Mortality among all US adults was validated using direct estimates from the National Vital Statistics System (NVSS).
Results
Using our approach, annual all-cause mortality among all US adults ranged from 8.8 deaths per 1,000 person-years (95% confidence interval [CI]: 8.0, 9.6) in year 2000 to 7.9 (95% CI: 7.6, 8.3) in year 2006. By comparison, the NVSS estimates ranged from 8.6 to 7.9 (correlation = 0.94). All-cause mortality among persons with diabetes decreased from 35.7 (95% CI: 28.4, 42.9) in 2000 to 31.8 (95% CI: 28.5, 35.1) in 2006. After adjusting for age, sex, and race/ethnicity, persons with diabetes had 2.1 (95% CI: 2.01, 2.26) times the risk of death of those without diabetes.
Conclusion
Period-specific national mortality can be estimated for people with and without a chronic condition using national surveys with mortality follow-up and a discrete time-to-event Poisson regression approach with post-stratification adjustment.
Keywords
Complex survey, Surveillance, Mortality, Survival analysis, Discrete survival time, Poisson regression
Background
National surveillance of incidence, prevalence, and mortality is key to guiding and evaluating progress in chronic disease control and prevention. The prevalence of a chronic disease like diabetes can be affected by increasing incidence among persons without the disease as well as decreasing mortality among persons with the disease. Good estimates of mortality among persons with a chronic disease improve understanding of secular changes in prevalence, incidence, and mortality and their relationships. Since diabetes is often not recorded on death certificates as a direct, underlying, or contributing cause of death, the impact of diabetes on deaths in the United States population could be underestimated [1, 2]. The linkage of nationally representative surveys that include baseline disease status with mortality follow-up provides the opportunity to examine all-cause and cause-specific mortality among persons with diabetes or other chronic conditions.
From the policymaking and resource allocation perspectives, a cross-sectional estimate of mortality by calendar period (e.g., year) is highly desirable. Analyses of mortality follow-up data typically use survival approaches to examine the association between risk factors and death. In these analyses, the data are analyzed as a cohort covering the entire follow-up period, and the hazard of death is estimated for the cohort. However, this approach does not permit estimation of the hazard of death across time periods, nor does it provide valid annual or other calendar period estimates.
By following the conceptual framework of age-period-cohort (APC) analysis as represented by the Lexis diagram, multiyear cohort data can be decomposed into discrete time-to-event data and aggregated by calendar period [3, 4]. Calendar period all-cause mortality rates can be calculated by simply dividing the total number of deaths by the total person-years in each calendar period. Poisson regression, a generalized linear model, is appropriate for modeling unadjusted and adjusted mortality rates of multiple periods [4]. Discrete Poisson regression yields estimates identical to those of the piecewise exponential model, which is another alternative to the Cox proportional hazards model [5]. Nevertheless, most discrete time-to-event studies use aggregated group data and categorized independent variables [6]; we are not aware of previous publications using discrete Poisson regression applied to multiyear mortality follow-up data from national sample surveys.
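The equivalence between discrete Poisson regression and the piecewise exponential model can be sketched as follows. The notation is an assumption chosen to match Table 1: D_{ij} is the event indicator and T_{ij} the follow-up time of person i in period j, and \lambda_{ij} is a hazard assumed constant within the period. Because D_{ij} is 0 or 1, the two log-likelihoods differ only by a term that does not involve \lambda_{ij}, so they are maximized by the same parameter values.

```latex
% Piecewise exponential log-likelihood (constant hazard within each period)
\ell_{PE} = \sum_{i,j} \left[ D_{ij} \log \lambda_{ij} - \lambda_{ij} T_{ij} \right]

% Poisson log-likelihood with mean \mu_{ij} = \lambda_{ij} T_{ij}
% (\log D_{ij}! = 0 because D_{ij} \in \{0, 1\})
\ell_{P} = \sum_{i,j} \left[ D_{ij} \log(\lambda_{ij} T_{ij}) - \lambda_{ij} T_{ij} \right]
         = \ell_{PE} + \sum_{i,j} D_{ij} \log T_{ij}
```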
In this study, to increase the awareness of estimating cross-sectional period mortality using multiyear national survey mortality follow-up data, we describe the construction of discrete survival time data in detail and demonstrate our approach from data preparation to data analysis. With diabetes as an example, we use population-weighted Poisson regression to model discrete survival time and estimate annual all-cause mortality by diabetes status. The US National Health Interview Survey (NHIS) 1997–2004 with mortality follow-up through 2006 was used to illustrate this approach; US mortality estimates from the National Vital Statistics System (NVSS) were compared for validation purposes.
Methods
Continuous time-to-event data
Table 1. Demonstration of continuous and discrete time-to-event survival formats

Part I: Continuous time-to-event format
| Person (i) | Date of birth | Date of interview (t0i) | Date of death or censoring (t1i) | Discretized time period | Follow-up (years) (Ti) | Age at interview (years) (age at t0i) | Calendar year at interview (year at t0i) | Event status (1 = died, 0 = lived) (Di) | Sex (1 = male, 0 = female) |
|---|---|---|---|---|---|---|---|---|---|
| A | 10/01/1929 | 02/06/2000 | 08/08/2003 | | 3.5 | 70.3 | 2000 | 1 | 1 |
| B | 11/07/1932 | 07/02/2003 | 12/31/2006 | | 3.5 | 70.7 | 2003 | 0 | 0 |

Part II: Discrete time-to-event format
| Person (i) & time period (j) | Date of birth | Date of interview or period start (t0ij) | Date of death, censoring, or period end (t1ij) | Discretized time period (ij) | Years of follow-up (Tij) | Age at the beginning of year of follow-up | Calendar year during follow-up (year at ij) | Event status (1 = died) (Dij) | Sex (1 = male, 0 = female) |
|---|---|---|---|---|---|---|---|---|---|
| A1 | 10/01/1929 | 02/06/2000 | 12/31/2000 | 1 | 0.9 | 70.2 | 2000 | 0 | 1 |
| A2 | | 01/01/2001 | 12/31/2001 | 2 | 1.0 | 71.2 | 2001 | 0 | 1 |
| A3 | | 01/01/2002 | 12/31/2002 | 3 | 1.0 | 72.2 | 2002 | 0 | 1 |
| A4 | | 01/01/2003 | 08/08/2003 | 4 | 0.6 | 73.2 | 2003 | 1 | 1 |
| B1 | 11/07/1932 | 07/02/2003 | 12/31/2003 | 1 | 0.5 | 70.1 | 2003 | 0 | 0 |
| B2 | | 01/01/2004 | 12/31/2004 | 2 | 1.0 | 71.1 | 2004 | 0 | 0 |
| B3 | | 01/01/2005 | 12/31/2005 | 3 | 1.0 | 72.1 | 2005 | 0 | 0 |
| B4 | | 01/01/2006 | 12/31/2006 | 4 | 1.0 | 73.1 | 2006 | 0 | 0 |

Discrete time-to-event survival data
To calculate a period-specific mortality rate, we divided the continuous survival times into discrete calendar years. Since interviews did not all take place on the first day of the survey year, to make sure the survival time was allocated correctly we added an individual-specific partial time period (t_ext_{i}), calculated as the difference between the interview date and the first day of the year; we then divided the extended survival time [(t_{1} − t_{0}) + t_ext_{i}] into years. That is, each person’s total continuous survival time was discretized into multiple records, one for each calendar year. An individual’s survival time for a given calendar year was between 0 and 1 year, and the survival time in the first year was [1 − t_ext_{i}]. In the analysis, age and calendar year were treated as time-varying (i.e., time-dependent) covariates. The age during each discrete period was assigned as the age on the first day of that calendar year.
With each participant contributing multiple discrete person-years during the follow-up, the sum of a person’s discretized annual person-years is equal to the total continuous survival time of that person (Fig. 1 – 1b). For example, in Table 1 – Part II, person A was interviewed on 02/06/2000 at age 70.3 years and contributed 3.5 person-years of follow-up. That is, person A contributed 0.9 person-years in 2000, with an age of 70.2 years at the beginning of year 2000, and 1 person-year in each of years 2001 and 2002, with ages of 71.2 and 72.2 years at the beginning of those years, respectively. In 2003, person A, aged 73.2 years at the beginning of the year, contributed 0.6 person-years before dying on 08/08/2003. In this way, the continuous time-to-event survival records of participants have been decomposed into discrete survival time with time-varying age.
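The discretization described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `discretize`, the use of 365.25 days per year, and the inclusive counting of days are all assumptions made for the example. It does not apply the exclusion of early follow-up years described later in the Methods.

```python
from datetime import date

DAYS_PER_YEAR = 365.25  # assumed conversion from days to years


def discretize(birth, start, end, died):
    """Split one follow-up interval into one record per calendar year."""
    records = []
    for year in range(start.year, end.year + 1):
        seg_start = max(start, date(year, 1, 1))
        seg_end = min(end, date(year, 12, 31))
        # Person-years contributed within this calendar year (days counted inclusively).
        person_years = ((seg_end - seg_start).days + 1) / DAYS_PER_YEAR
        # Time-varying age: age on the first day of the calendar year.
        age_jan1 = (date(year, 1, 1) - birth).days / DAYS_PER_YEAR
        records.append({
            "year": year,
            "person_years": person_years,
            "age_jan1": age_jan1,
            # The death, if any, is attributed to the final record only.
            "died": int(died and year == end.year),
        })
    return records


# Person A from Table 1: interviewed 02/06/2000, died 08/08/2003.
person_a = discretize(date(1929, 10, 1), date(2000, 2, 6), date(2003, 8, 8), True)
for rec in person_a:
    print(rec["year"], round(rec["person_years"], 1), rec["died"])
```

Rounded to one decimal place, the person-years for person A reproduce the 0.9, 1.0, 1.0, and 0.6 shown in Table 1 – Part II; ages computed this way may differ from the table in the last digit depending on the rounding convention.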
US National Health Interview Survey and mortality follow-up
We used the NHIS mortality follow-up data to demonstrate our approach. The NHIS, conducted by the Centers for Disease Control and Prevention’s National Center for Health Statistics (NCHS), is an annual, ongoing, nationally representative cross-sectional household interview survey of US noninstitutionalized civilians of all ages [10]. The sampling plan covers the 50 states and the District of Columbia, and follows a multistage area probability design that permits the representative sampling of households and noninstitutional group quarters. The annual response rate of NHIS is approximately 80% of the eligible households in the sample [10]. All information about sex, race/ethnicity (non-Hispanic white, non-Hispanic black, Hispanic, and others), and diabetes status was self-reported. Participants were classified as having diabetes if they answered “yes” to the question “Other than during pregnancy, have you EVER been told by a doctor or other health professional that you have diabetes or sugar diabetes?”
For our analysis, we selected the NHIS 1997 to 2004 surveys as the baseline, with mortality follow-up through 2006; the NHIS 2005 to 2006 surveys were used to obtain the demographic distribution for the post-stratification reweighting of those two years. We included 307,280 adults aged 18 to 84 years from the NHIS 1997 to 2004 (range: 30,141 to 35,437 per year) and followed them through 2006. We excluded 15,882 (range: 1,723 to 2,642 per year) respondents because of insufficient identifying data to create a death status record, which yielded a final mortality follow-up sample of 242,397 (range: 29,076 to 29,193 per year) adults. The mortality follow-up sampling weights provided by NCHS accounted for excluded respondents.
Diabetic death was defined as a death with an associated International Classification of Diseases, 10th Revision (ICD-10) code of E10–E14. All-cause with diabetes death was defined as the death from any cause of a person with diabetes. The total weighted person-time was used as the denominator for mortality calculation. We also estimated mortality by self-reported diagnosed diabetes at baseline. To validate our findings empirically, we compared the all-cause and diabetic mortality rates from NHIS with mortality rates from the NVSS, the fundamental source of US cause-of-death information. Mortality rates from the NVSS were calculated directly as the number of deaths (all-cause, or diabetic deaths coded as E10 to E14) divided by the total population, using structured query language from CDC WONDER and following the step-by-step instructions on the WONDER website (http://wonder.cdc.gov/mortSQL.html).
To reduce potential selection bias due to respondents being healthier than nonrespondents, we excluded each individual’s first two years of follow-up. The final analytical discrete time-to-event data set included adults aged 20 years or older during the years 2000 to 2006.
Poisson regression
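The Poisson regression model can be written as

$$\log(d) = \log(pt) + \beta_{0} + \sum_{i} \beta_{i} X_{i}$$

Here, the natural logarithm of the expected number of events, log(d), with an offset of the natural logarithm of follow-up time, log(pt), is a linear combination of independent covariates, X_{i}, with regression parameters β_{i}; β_{0} is the intercept.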
Poisson regression provided the estimate of mortality for each calendar year/period. We used robust error variance estimation to account for overdispersion [12] and a polynomial function of calendar time to smooth year-to-year variation in mortality rates [6, 13]. To smooth the variation in mortality due to low mortality rates in some age subgroups, the age at the beginning of a calendar year was defined as a continuous variable with polynomial terms (quadratic polynomial). The mortality rates in our study were estimated as predictive margins from the fitted Poisson model.
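To make the role of the offset and the weights concrete, the following is a minimal sketch of a weighted Poisson fit by Newton-Raphson, using only the Python standard library. It is not the Stata procedure used in the study: the function name `fit_weighted_poisson`, the single linear calendar-year term, and all data values are hypothetical, and the sketch omits robust variance estimation, the polynomial terms, and the survey design.

```python
import math


def fit_weighted_poisson(x, deaths, person_time, weights, iterations=25):
    """Fit log(mu) = log(pt) + b0 + b1 * x by Newton-Raphson with weights."""
    # Start from the weighted crude rate and no trend.
    b0 = math.log(sum(w * d for w, d in zip(weights, deaths))
                  / sum(w * t for w, t in zip(weights, person_time)))
    b1 = 0.0
    for _ in range(iterations):
        mu = [t * math.exp(b0 + b1 * xi) for xi, t in zip(x, person_time)]
        # Weighted score vector.
        g0 = sum(w * (d - m) for w, d, m in zip(weights, deaths, mu))
        g1 = sum(w * xi * (d - m) for w, xi, d, m in zip(weights, x, deaths, mu))
        # Weighted information matrix (2 x 2).
        h00 = sum(w * m for w, m in zip(weights, mu))
        h01 = sum(w * xi * m for w, xi, m in zip(weights, x, mu))
        h11 = sum(w * xi * xi * m for w, xi, m in zip(weights, x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^{-1} g.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical aggregated data: one row per calendar year.
years = [2000, 2001, 2002, 2003, 2004, 2005, 2006]
x = [y - 2003 for y in years]            # centered calendar year
deaths = [36.0, 39.0, 36.0, 35.0, 32.0, 33.0, 32.0]
person_time = [1000.0] * 7               # person-years at risk
weights = [1.2, 0.8, 1.0, 1.1, 0.9, 1.0, 1.0]

b0, b1 = fit_weighted_poisson(x, deaths, person_time, weights)
rates_per_1000 = [1000 * math.exp(b0 + b1 * xi) for xi in x]
annual_change_pct = 100 * (math.exp(b1) - 1)
print([round(r, 1) for r in rates_per_1000], round(annual_change_pct, 2))
```

Because log(pt) enters as an offset, exp(b0 + b1 * x) is a rate per person-year, and exp(b1) − 1 is the relative change in the rate per calendar year.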
Adjusted sampling weights for the discrete time-to-event data
The age of sampled participants in each survey cohort increased with the year of follow-up, and those multiyear survey cohorts also overlapped time periods. Without accounting for the demographic discrepancy between the participants from different cohorts and the US population at each specific year, the demographic distribution of a discrete period after the baseline year would not represent the demographic distribution of the US population at that specific year or time period, and the total crude mortality of the US population would be biased toward the older population. To correct for these issues, we adjusted the sample weights using a post-stratification procedure in which sampled units were divided into subgroups based on age, sex, and race/ethnicity; we used the nationally representative weighted size of each subgroup of NHIS 2000 to 2006 at interview to estimate the US population size. The analysis weights for the discrete time-to-event data were reweighted proportionally. The adjusted analysis weights thus sum to the US population size within each subgroup. The sum of the analysis weights equaled the total noninstitutionalized US population for each calendar year.
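The proportional reweighting can be illustrated with a short sketch. The function name `poststratify`, the stratum labels, and all weights and population totals below are hypothetical; the sketch only shows the rescaling step, in which each record's weight is multiplied by the ratio of the stratum's population size to the stratum's current weight total.

```python
from collections import defaultdict


def poststratify(records, population):
    """Rescale weights so that each stratum sums to its population total."""
    weight_totals = defaultdict(float)
    for rec in records:
        weight_totals[rec["stratum"]] += rec["weight"]
    adjusted = []
    for rec in records:
        factor = population[rec["stratum"]] / weight_totals[rec["stratum"]]
        adjusted.append({**rec, "adj_weight": rec["weight"] * factor})
    return adjusted


# Hypothetical discrete person-year records; a stratum is
# (calendar year, age group, sex, race/ethnicity).
records = [
    {"stratum": (2004, "65-84", "F", "NH white"), "weight": 5200.0},
    {"stratum": (2004, "65-84", "F", "NH white"), "weight": 4800.0},
    {"stratum": (2004, "20-44", "M", "Hispanic"), "weight": 7000.0},
]
# Hypothetical population sizes for the same strata.
population = {
    (2004, "65-84", "F", "NH white"): 8000.0,
    (2004, "20-44", "M", "Hispanic"): 9100.0,
}

adjusted = poststratify(records, population)
for rec in adjusted:
    print(rec["stratum"], rec["adj_weight"])
```

In this toy example the older stratum is over-represented relative to its population total and is scaled down, which mirrors the correction for cohort aging described above.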
Analysis
We used Stata 13.1 (StataCorp LP, College Station, Texas) to account for the complex multistage sampling design and to produce weighted estimates and 95% confidence intervals (CI).
For all comparisons we used a two-sided statistical test with significance defined as p value (p) < 0.05 or a 95% CI that did not include the null value. The ggplot2 package of R was used to produce graphics [14].
Results
Table 2. All-cause mortality rates of US adults aged 20 to 84 years (per 1,000 person-years and 95% CI) by calendar year using discretized survival time, NVSS and NHIS follow-up

| | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 |
|---|---|---|---|---|---|---|---|
| National Vital Statistics System | | | | | | | |
| All-cause deaths, n | 1,690,834 | 1,697,147 | 1,708,100 | 1,706,555 | 1,672,235 | 1,691,092 | 1,671,006 |
| Population, n | 196,709,054 | 199,749,920 | 202,082,985 | 204,215,941 | 206,505,061 | 208,818,040 | 211,189,565 |
| All-cause mortality (M0)^{a} | 8.60 | 8.50 | 8.45 | 8.36 | 8.10 | 8.10 | 7.91 |
| NHIS mortality follow-up with the post-stratification sampling weights | | | | | | | |
| All-cause deaths, n | 614 | 874 | 1,124 | 1,417 | 1,638 | 1,869 | 2,046 |
| Sample adults, n | 60,395 | 86,722 | 114,050 | 141,678 | 166,593 | 190,342 | 214,530 |
| Person-years | 60,070 | 86,274 | 113,501 | 140,958 | 165,769 | 189,394 | 212,803 |
| All-cause mortality (M1)^{b} | 8.81 (8.00, 9.62) | 8.45 (7.75, 9.14) | 8.35 (7.77, 8.93) | 8.39 (7.89, 8.89) | 7.98 (7.54, 8.42) | 8.11 (7.65, 8.58) | 7.94 (7.56, 8.33) |
| NHIS mortality follow-up with original sampling weights | | | | | | | |
| All-cause mortality (M2)^{b} | 8.99 (8.17, 9.82) | 8.80 (8.08, 9.52) | 8.67 (8.07, 9.28) | 8.83 (8.31, 9.36) | 8.41 (7.95, 8.87) | 8.58 (8.10, 9.06) | 8.36 (7.95, 8.77) |
| (M2 – M1) | 0.18 (−0.97, 1.33) | 0.35 (−0.65, 1.35) | 0.32 (−0.53, 1.17) | 0.44 (−0.29, 1.17) | 0.43 (−0.19, 1.05) | 0.47 (−0.18, 1.12) | 0.42 (−0.15, 0.99) |

To show the importance of post-stratification reweighting, we compared the NHIS follow-up estimates with the results from NVSS and the NHIS estimates that used the original weights (Table 2). Mortality estimates using the original sampling weights without post-stratification adjustment were higher than the mortality estimates using adjusted sampling weights, because of the aging of cohorts during the follow-up. Mortality in each year from the NVSS was within the 95% CIs of mortality rates from the NHIS using the adjusted sampling weights. The average annual decrease in crude mortality (per 1,000 person-years) was 0.12 for both the NHIS and the NVSS. The correlation of NHIS and the NVSS mortality was 0.94. Age-, sex-, and race/ethnicity-adjusted mortality decreased 2.6% per year (p < 0.001).
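The direct NVSS rates in Table 2 are simple ratios and can be reproduced from the death and population counts reported there. The short sketch below does this; the variable names are illustrative only.

```python
# All-cause deaths and population, ages 20 to 84, as reported in Table 2 (NVSS).
nvss_deaths = {
    2000: 1_690_834, 2001: 1_697_147, 2002: 1_708_100, 2003: 1_706_555,
    2004: 1_672_235, 2005: 1_691_092, 2006: 1_671_006,
}
nvss_population = {
    2000: 196_709_054, 2001: 199_749_920, 2002: 202_082_985, 2003: 204_215_941,
    2004: 206_505_061, 2005: 208_818_040, 2006: 211_189_565,
}

# Crude mortality per 1,000 population: deaths divided by population.
nvss_rates = {
    year: 1000 * nvss_deaths[year] / nvss_population[year]
    for year in nvss_deaths
}

for year, rate in nvss_rates.items():
    print(year, f"{rate:.2f}")
```

Rounded to two decimals, these values match the M0 row of Table 2 (8.60 in 2000 through 7.91 in 2006).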
Table 3. Diabetic^{a} and all-cause with diabetes^{b} mortality rates (per 1,000 person-years and 95% CI) of US adults aged 20 to 84 years by calendar year using discretized survival time data, NVSS and NHIS follow-up

| | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 |
|---|---|---|---|---|---|---|---|
| National Vital Statistics System | | | | | | | |
| Diabetic deaths, n | 55,661 | 57,105 | 58,431 | 59,164 | 58,123 | 59,108 | 57,260 |
| Population, n | 196,709,054 | 199,749,920 | 202,082,985 | 204,215,941 | 206,505,061 | 208,818,040 | 211,189,565 |
| Diabetic mortality (M0)^{c} | 0.28 | 0.29 | 0.29 | 0.29 | 0.28 | 0.28 | 0.27 |
| NHIS mortality follow-up: total adults with and without diabetes | | | | | | | |
| Diabetic deaths, n | 24 | 32 | 55 | 61 | 55 | 79 | 69 |
| Diabetic mortality (M1)^{d} | 0.26 (0.14, 0.37) | 0.29 (0.18, 0.41) | 0.42 (0.28, 0.57) | 0.34 (0.23, 0.44) | 0.26 (0.18, 0.34) | 0.30 (0.21, 0.38) | 0.25 (0.18, 0.33) |
| All-cause with diabetes deaths, n | 123 | 196 | 249 | 310 | 344 | 408 | 440 |
| All-cause with diabetes mortality (M2)^{e} | 1.78 (1.43, 2.13) | 1.94 (1.59, 2.30) | 1.80 (1.51, 2.09) | 1.80 (1.55, 2.05) | 1.63 (1.42, 1.83) | 1.70 (1.51, 1.88) | 1.66 (1.48, 1.83) |
| NHIS mortality follow-up: adults with diagnosed diabetes | | | | | | | |
| Diabetic deaths, n | 15 | 22 | 45 | 51 | 44 | 56 | 47 |
| Diabetic mortality (M3)^{f} | 3.50 (1.48, 5.51) | 4.04 (2.08, 6.01) | 7.03 (4.34, 9.72) | 5.76 (3.85, 7.67) | 4.04 (2.57, 5.51) | 4.41 (2.94, 5.88) | 3.06 (2.09, 4.03) |
| All-cause deaths, n | 123 | 196 | 249 | 310 | 344 | 408 | 440 |
| All-cause mortality (M4)^{g} | 35.7 (28.4, 42.9) | 39.5 (32.4, 46.5) | 36.1 (30.4, 41.8) | 35.6 (30.7, 40.5) | 31.7 (27.8, 35.6) | 33.0 (29.4, 36.5) | 31.8 (28.5, 35.1) |
| M3/M4, % | 9.80 (9.05, 10.56) | 10.13 (9.39, 10.86) | 19.39 (18.76, 20.02) | 16.29 (15.76, 16.82) | 12.62 (12.20, 13.03) | 13.37 (13.00, 13.75) | 9.75 (9.40, 10.10) |
| NHIS mortality follow-up: adults without diagnosed diabetes | | | | | | | |
| Diabetic deaths, n | 9 | 10 | 10 | 10 | 11 | 23 | 22 |
| Diabetic mortality (M5)^{h} | 0.09 (0.03, 0.15) | 0.10 (0.03, 0.17) | 0.08 (0.03, 0.13) | 0.05 (0.02, 0.08) | 0.05 (0.02, 0.09) | 0.07 (0.04, 0.11) | 0.10 (0.05, 0.16) |

Discussion
Period mortality among persons with chronic conditions such as diabetes is an important surveillance indicator of disease prevention and control. However, since chronic disease status is not reported in many vital statistics registries, it is often not possible to use vital statistics data to estimate mortality of persons with and without the condition. This presents a particular limitation for diabetes-related death statistics because diabetic death is often not recorded on US death certificates as a direct, underlying, or contributing cause of death, and diabetic deaths in the US population could be underestimated by solely using death certificate information [1, 2]. Assembly of national cohorts by linking national survey data with vital statistics provides a potential remedy to the data gap, but requires specific methods to permit estimation of period effects. In this study, we described the use of weighted discrete Poisson regression to estimate national mortality rates by diabetes status using a complex sample survey, and we validated this approach using mortality registry estimates. Our study showed that all-cause mortality of US adults estimated by the NHIS mortality follow-up decreased 2.6% per year from 2000 to 2006, which was similar to mortality rates estimated using the NVSS mortality registry data. Meanwhile, all-cause mortality of US adults with diabetes decreased 3.7% per year during the same period [1, 2].
The method that is most often used to analyze mortality cohort data is the Cox proportional hazards regression model, which is useful for analyzing the data from the association or cause-effect relationship perspective. However, it is cumbersome to use this method to calculate hazard rates for a large number of combinations of predictors. Alternatively, parametric survival models can be more convenient for prediction, but cannot deal easily with time-varying covariates [15].
Age-period-cohort (APC) analysis provides a third option. If a vital statistics registry includes complete information on disease status, the APC method can be used to estimate the annual/period mortality among persons with and without diabetes [4, 6, 11, 16–18]. In the US, diabetes status is not recorded in the national vital statistics registry system, so we cannot apply this method directly. However, US nationally representative survey mortality follow-up data provide information on both diabetes status and death status. The APC model and life table framework can be applied to these data.
APC analysis has long been applied in demography, social science, and disease surveillance research using cross-sectional registration or survey data [19]. The data are usually cross-sectional and grouped for data analyses. One of the major purposes of these studies was to separate the age, period, or cohort effects using cross-sectional data [20]. In our study, we applied the concept and analytic framework of this widely used APC model. Our study differed from traditional APC models in several ways. First, our study used longitudinal national complex survey mortality follow-up data. Second, the purpose was to estimate period mortality, which is a sum of the age and cohort effects. Finally, to account for the aging of the cohort during follow-up, we post-stratified the aggregated multiple segments from different survey cohorts using the US population structure at each period.
Both Poisson and logistic regression can be used for discretized time-to-event data analysis. Efron combined logistic regression with discrete time-to-event survival time in 1-month intervals and obtained direct estimates of the hazard rates [21]. A polynomial or spline model can be used to smooth out the random variation/noise. This partial logistic regression gives good estimates when the discrete time interval is small. Nevertheless, a Poisson regression that accounts for person-time of follow-up gives more accurate hazard rate estimates for longer discrete time intervals than a logistic regression. Poisson regression has been used frequently to compare mortality rates among different categories of cohorts in epidemiological studies and is a convenient alternative to Cox proportional hazards regression, especially when the proportional hazards assumptions are not met [5]. Early studies on the analysis of cohort survival data showed that Poisson regression is a straightforward and intuitive approach for directly estimating the hazard rates while incorporating time scale as a covariate in the model [16, 22]. We were interested in annual (or longer) time periods rather than monthly or daily periods, and thus discrete Poisson regression was chosen for our analysis.
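One way to see why the logistic approach works best for short intervals is the following sketch, which assumes a constant hazard \lambda within an interval of length T (an assumption made here for illustration, not a statement from the cited studies). The odds of an event within the interval approximate \lambda T only when \lambda T is small, whereas the Poisson model with a log person-time offset targets \lambda directly.

```latex
% Probability of an event within an interval of length T under constant hazard \lambda
p = 1 - e^{-\lambda T}

% Odds modeled by logistic regression
\frac{p}{1 - p} = e^{\lambda T} - 1 = \lambda T + \frac{(\lambda T)^{2}}{2} + \cdots

% Hence \frac{p}{1-p} \approx \lambda T only when \lambda T is small.
```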
To obtain valid national estimates from a complex sample survey, it is critical to use proper statistical methods to account for the sample design and sampling weights. Our study shows that in later years, the distribution of age in the follow-up cohort shifted to the right; thus, without post-stratification reweighting, the overall mortality rates combining all ages would have been overestimated. Using the US population as the standard population for post-stratification reweighting yielded all-cause and diabetic mortality estimates that were similar to the national registry estimates. Our study demonstrated that discrete Poisson regression with post-stratification is a feasible approach for estimating annual mortality for the US population with and without diabetes.
The major limitation of our approach is the amount of time needed to discretize and analyze a large sample with long follow-up time. Poisson regression using complex sample data is computationally time-consuming with large discretized person-time datasets because data cannot be collapsed over covariates to account for the design-based analysis of complex sample data. Estimation based on a small number of events can create problems with model convergence. Without careful programming and reweighting, the results can be biased. In addition, the NHIS mortality data represented deaths among the civilian noninstitutionalized population with person-years as the denominator, whereas mortality data from the NVSS represented deaths among the entire US population with the whole population at risk as the denominator. Thus, mortality rates from the two systems might have subtle differences.
To demonstrate our approach, we used self-reported diabetes. While any self-reported condition is subject to recall error, the self-report of diabetes is considered a valid measure of diagnosed diabetes [23]. Although it is recognized as being nonsensitive, it has been shown to be highly specific [24]. Another source of bias may arise from the lack of information about diabetes status between the baseline interview and death or censoring. Even though the rate of remission from diabetes to non-diabetes is likely small [25], the lack of information on incident cases would likely lead to an overestimation of diabetes duration. Furthermore, if incident cases have a higher mortality rate than non-cases and a lower mortality rate than prevalent cases, then lacking this information on incidence could lead to an overestimation of mortality rates for the populations both with and without diabetes. Future analyses with information from multiple follow-up visits could quantify the impact of this bias.
We demonstrated that weighted discrete Poisson regression is an efficient and applicable approach to estimating period mortality from national mortality follow-up data. To our knowledge, there has been no similar report, though all the steps of this approach are well established. Several reasons could explain the scant usage of the discrete Poisson regression approach, including lack of data availability, its absence from biostatistics educational curricula, the computing time required to analyze discrete time-to-event data, and the complex sampling design of national surveys, which further complicates using this approach. However, the increasing availability of more powerful statistical software and computing capabilities permits revisiting this method for the analysis of national survey mortality follow-up data.
Conclusions
We conclude that combining national follow-up cohorts from multiple survey years and analyzing them using population-weighted discrete Poisson regression can yield annual national mortality rates by disease status.
Abbreviations
CDC: Centers for Disease Control and Prevention
CDC WONDER: Wide-ranging Online Data for Epidemiologic Research
NCHS: National Center for Health Statistics
NHIS: National Health Interview Survey
NVSS: National Vital Statistics System
Declarations
Acknowledgments
A special thanks to the women and men who participated in the study. We would also like to thank all of the staff involved in the US National Health Interview Survey for the study design, data collection and data dissemination. We thank Dr. Giuseppina Imperatore and Dr. Elizabeth Luman for their contributions to this study.
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.
This paper was not presented anywhere outside of the CDC.
Funding
Not applicable.
Availability of data and materials
All the data are available online: 1) CDC WONDER (http://wonder.cdc.gov/mortSQL.html); 2) NHIS Public-Use file: http://www.cdc.gov/nchs/nhis/nhis_questionnaires.htm; and 3) NHIS Public-Use Linked Mortality Files: http://www.cdc.gov/nchs/datalinkage/mortalitypublic.htm.
Authors’ contributions
YJC conceived the study, acquired and analyzed the data, and drafted the manuscript. YJC and TJT designed the study. YJC, TJT, EWG, and DBR interpreted the results and critically revised the manuscript for important intellectual content. YJC, TJT, EWG, and DBR read and approved the final manuscript.
Competing interests
The authors declare that they have no competing interests.
Consent for publication
The NHIS mortality follow-up and CDC WONDER data used in this study were approved public data sets. We did not merge any of the data sets in such a way that individuals might be identified, and did not enhance the public data sets with any identifiable or potentially identifiable data.
Ethics approval and consent to participate
This is a secondary analysis of public-use national survey data. The content and conduct of the US NHIS were subject to an institutional review board. All individuals participating in the original survey were required to sign an informed consent document.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
References
1. US Centers for Disease Control and Prevention. 2014 US National Diabetes Statistics Report Atlanta (GA): Centers for Disease Control and Prevention; [updated May 15, 2015; cited 2015 February 19, 2016.]. Available from: http://www.cdc.gov/diabetes/data/statistics/2014StatisticsReport.html.
2. Miniño AM, Murphy SL, Xu J, Kochanek KD. Deaths: final data for 2008. Natl Vital Stat Rep. 2011;59(10):1–126.
3. Frost WH. The age selection of mortality from tuberculosis in successive decades. 1939. Am J Epidemiol. 1995;141(1):4–9; discussion 3.
4. Keiding N. Statistical Inference in the Lexis Diagram. Philos T Roy Soc A. 1990;332(1627):487–509.
5. Laird N, Olivier D. Covariance analysis of censored survival data using log-linear analysis techniques. J Am Stat Assoc. 1981;76(374):231–40.
6. Hansen MB, Jensen ML, Carstensen B. Causes of death among diabetic patients in Denmark. Diabetologia. 2012;55(2):294–302.
7. Gail MH, Graubard B, Williamson DF, Flegal KM. Comments on ‘Choice of time scale and its effect on significance of predictors in longitudinal studies’ by Michael J. Pencina, Martin G. Larson and Ralph B. D’Agostino, Statistics in Medicine 2007; 26:1343–1359. Stat Med. 2009;28(8):1315–7.
8. Korn EL, Graubard BI, Midthune D. Time-to-event analysis of longitudinal follow-up of a survey: choice of the time-scale. Am J Epidemiol. 1997;145(1):72–80.
9. Pencina MJ, Larson MG, D’Agostino RB. Choice of time scale and its effect on significance of predictors in longitudinal studies. Stat Med. 2007;26(6):1343–59.
10. US Centers for Disease Control and Prevention. About the National Health Interview Survey Hyattsville (MD): US Centers for Disease Control and Prevention; 2016 [updated October 8, 2016; cited 2016 February 19, 2016.]. Available from: http://www.cdc.gov/nchs/nhis/about_nhis.htm.
11. Frome EL. The analysis of rates using Poisson regression models. Biometrics. 1983;39(3):665–74.
12. Zou G. A modified Poisson regression approach to prospective studies with binary data. Am J Epidemiol. 2004;159(7):702–6.
13. Carter D, Signorino C. Back to the future: Modeling time dependence in binary data. Polit Anal. 2010;18(3):271–92.
14. Wickham H. ggplot2: elegant graphics for data analysis. New York: Springer; 2009.
15. Reid N. A Conversation with Sir David Cox. Stat Sci. 1994;9(3):17.
16. Carstensen B, Kristensen JK, Ottosen P, Borch-Johnsen K; Steering Group of the National Diabetes Register. The Danish National Diabetes Register: trends in incidence, prevalence and mortality. Diabetologia. 2008;51(12):2187–96.
17. O’Brien RM. Age period cohort characteristic models. Soc Sci Res. 2000;29(1):123–39.
18. Bell FC, Miller ML. Life tables for the United States social security area 1900–2100. 2005 [cited 2016 11/07]. Available from: https://www.ssa.gov/oact/NOTES/as120/LifeTables_Body.html.
19. Vandeschrick C. The Lexis diagram, a misnomer. Demogr Res. 2001;4(3):97–124.
20. Keyes KM, Utz RL, Robinson W, Li G. What is a cohort effect? Comparison of three statistical methods for modeling cohort effects in obesity prevalence in the United States, 1971–2006. Soc Sci Med. 2010;70(7):1100–8.
21. Efron B. Logistic Regression, Survival Analysis, and the Kaplan-Meier Curve. J Am Stat Assoc. 1988;83(402):414–25.
22. Breslow NE, Lubin JH, Marek P, Langholz B. Multiplicative models and cohort analysis. J Am Stat Assoc. 1983;78(381):1–12.
23. Jackson JM, DeFor TA, Crain AL, Kerby TJ, Strayer LS, Lewis CE, et al. Validity of diabetes self-reports in the Women’s Health Initiative. Menopause. 2014;21(8):861–8.
24. Centers for Disease Control and Prevention. National diabetes statistics report: estimates of diabetes and its burden in the United States, 2014: HHS CDC; 2014 [cited 2016 11/08]. Available from: http://www.cdc.gov/diabetes/pubs/statsreport14/nationaldiabetesreportweb.pdf.
25. Gregg EW, Chen H, Wagenknecht LE, Clark JM, Delahanty LM, Bantle J, et al. Association of an intensive lifestyle intervention with remission of type 2 diabetes. JAMA. 2012;308(23):2489–96.