Quality of vital event data for infant mortality estimation in prospective, population-based studies: an analysis of secondary data from Asia, Africa, and Latin America



Infant and neonatal mortality estimates are typically derived from retrospective birth histories collected through surveys in countries with unreliable civil registration and vital statistics systems. Yet such data are subject to biases, including under-reporting of deaths and age misreporting, which affect mortality estimates. Prospective population-based cohort studies are an underutilized data source for mortality estimation that may offer strengths that avoid these biases.


We conducted a secondary analysis of data from the Child Health Epidemiology Reference Group, including 11 population-based pregnancy or birth cohort studies, to evaluate the appropriateness of vital event data for mortality estimation. Analyses were descriptive, summarizing study designs, populations, protocols, and internal checks to assess their impact on data quality. We calculated infant and neonatal mortality rates and compared patterns with Demographic and Health Survey (DHS) data.


Studies yielded 71,760 pregnant women and 85,095 live births. Specific field protocols, especially pregnancy enrollment, limited exclusion criteria, and frequent follow-up visits after delivery, led to higher birth outcome ascertainment and fewer missing deaths. Most studies had low follow-up loss in pregnancy and the first month with little evidence of date heaping. Among studies in Asia and Latin America, neonatal mortality rates (NMR) were similar to DHS, while several studies in Sub-Saharan Africa had lower NMRs than DHS. Infant mortality varied by study and region between sources.


Prospective, population-based cohort studies following rigorous protocols can yield high-quality vital event data to improve characterization of detailed mortality patterns of infants in low- and middle-income countries, especially in the early neonatal period where mortality risk is highest and changes rapidly.


Infant and neonatal mortality rates are important indicators of trends in child health that serve to inform global health policy and programs [1]. Complete civil registration and vital statistics systems (CRVS systems) that collect continuous data on birth and death events are common sources of mortality data as they provide comprehensive, high-quality, and timely estimates of mortality. Yet there are gaps in our knowledge of mortality among infants in many low- and middle-income countries (LMICs) due to incomplete or inaccurate CRVS systems [2]. Only two-thirds (68%) of countries globally have CRVS systems that record data on at least 90% of all deaths; this threshold is met by only 25% of countries in South Asia and 8% in Sub-Saharan Africa, compared to 83% in Latin America and the Caribbean [3]. Further, detailed, high-quality data on patterns of mortality by age are not widely available beyond the traditional cut-offs of infant (birth to 1 year) and neonatal (birth to 28 days) periods, especially in the first week of life [4, 5].

In countries without strong CRVS systems, mortality estimates are typically derived from retrospective full birth histories (FBH) collected through sample surveys such as the United States Agency for International Development (USAID)-supported Demographic and Health Surveys (DHS) and UNICEF-supported Multiple Indicator Cluster Surveys (MICS) [6, 7]. However, although commonly used for many settings without reliable CRVS systems [8, 9], birth histories are subject to biases, including under-reporting of deaths and age misreporting that can impact mortality estimates, with the strongest effects occurring early in the neonatal period [10, 11]. New approaches are needed to estimate how infant and neonatal mortality are distributed by detailed age strata in settings where mortality is high and CRVS systems are inadequate.

Many maternal and child health population-based studies prospectively enroll and follow a pregnancy or birth cohort that could be used for mortality estimation. Although these studies, which are typically randomized controlled trials (RCTs) and observational prospective cohort studies, aim to evaluate the effect of a specific intervention or measure associations between suspected risk factors and child health outcomes, they have strengths that could avoid shortcomings and biases associated with DHS data [12]. Cohort studies often include prospective, systematic follow-up protocols that might reduce biases common in the DHS’s retrospective FBHs, such as non-response bias, recall biases, and date heaping. Health and Demographic Surveillance Systems (HDSS), which collect longitudinal data through regular surveys in a defined geographic area and population, are another important source of health data in countries without strong CRVS systems. Compared to HDSSs, which utilize prospective annual or semi-annual visits, cohort studies often have frequent household visits to collect detailed information on vital events during critical periods, such as during pregnancy and the perinatal period.

Cohort studies also have potential weaknesses and data quality issues. Cohorts are conducted in limited geographic areas, similar to HDSS and different from the DHS, and so may not necessarily represent the national population. Some cohort studies span a short period, in contrast to HDSS sites, which operate continuously, exposing cohorts to biases associated with seasonality and unusual external events (e.g., famine). Despite the shortcomings of HDSS and DHS, their continuity (although not always in the same season) offers the benefit of evaluating public health trends over time compared to cohorts, which are high cost and transient. Communities where cohort studies are based are often selected because they have higher mortality, thereby reducing sample size requirements and allowing for an understanding of how interventions function in settings where they are most needed. Further, these studies are designed to address specific research questions and therefore sometimes use inclusion/exclusion criteria that may not lead to the enrollment of a representative sample in the geographic study area. Study visit protocols can determine whether a very high proportion of deaths are identified, including whether and when pregnancies or births are enrolled, facility delivery rates in the study area, how quickly the study team makes home visits after the birth, and how live births and stillbirths are classified. Understanding these factors is required to determine the accuracy of mortality estimates obtained from a specific cohort.

Unlike DHS or HDSS, cohort studies are not a commonly used data source for mortality estimates in LMICs. The goal of this study was to evaluate the potential of this underutilized source of information for the purpose of mortality estimation and understanding detailed patterns of mortality by age. We assessed the effect of common data quality issues on infant and neonatal mortality measurement in several population-based pregnancy or birth cohort studies from Asia, Sub-Saharan Africa, and Latin America. We suggest approaches to prevent, measure, and control for these issues in understanding patterns of mortality in these populations.


We conducted a secondary analysis of data from the Child Health Epidemiology Reference Group (CHERG), including population-based cohort studies from Asia, Sub-Saharan Africa, and Latin America [13]. The goal of CHERG was to generate evidence to support estimation of child mortality burden and causes of death. We selected 11 CHERG studies that included individual-level participant vital event data for this descriptive analysis. We evaluated the appropriateness of vital event data for mortality estimation and understanding detailed mortality patterns in three dimensions. The first dimension comprises the study design, population, and field protocols, which is subdivided into study design and population (i.e., randomized controlled trial or cohort study; enrollment of pregnancies or live births), inclusion/exclusion criteria, surveillance and enrollment protocols, and visit frequency. The second and third dimensions comprised internal data quality checks (loss to follow-up (LTF) in both pregnancy and the infant period, date heaping, and shape of the mortality curve) and external checks (comparison with DHS), respectively.

Of 11 studies in our analysis, seven enrolled women in pregnancy, and four enrolled live births. For each of the seven studies that enrolled pregnancies, we summarized the number of pregnancies identified, the number of pregnancies enrolled, and the number of pregnancies followed to a birth outcome. We defined an unknown birth outcome as a confirmed pregnancy (i.e., typically a positive urine test, excluding false positive results) for which no information on the outcome of the birth was available to the study investigators. Reasons for LTF in pregnancy were classified as the following: 1) withdrawal of consent, 2) out-migration, censoring, or could not be contacted, 3) maternal death during pregnancy with unknown birth outcome, or 4) data entry or management error leading to loss of information on the birth outcome. Birth outcomes, including live births, stillbirths, and miscarriages/abortions, were defined as classified by the original study investigators.

For all 11 studies, we summarized the number of infant deaths, number of surviving infants, and number of infants LTF for both the neonatal (0–< 28 days) and infant periods (0–< 365 days). LTF in the neonatal and infant periods was defined as an infant for which the study investigators did not know the vital status of the infant at the end of the time period.
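The classification described above can be illustrated with a minimal sketch. It assumes each infant record carries the age in days at last contact (or at death) and a death flag; the function name `vital_status` and this record layout are hypothetical and are not taken from the study datasets.

```python
def vital_status(last_age_days, died, period_end_days):
    """Classify one infant for a follow-up period of 0 to < period_end_days.

    last_age_days: age at death if died, otherwise age at last contact alive.
    Returns "died", "survived", or "ltf" (vital status unknown at period end).
    """
    if died and last_age_days < period_end_days:
        return "died"
    if last_age_days >= period_end_days:
        # Known to be alive at the end of the period (a later death still counts here).
        return "survived"
    return "ltf"


NEONATAL_END = 28   # neonatal period: 0 to < 28 days
INFANT_END = 365    # infant period: 0 to < 365 days

# Hypothetical infant last seen alive at 100 days of age.
print(vital_status(100, False, NEONATAL_END), vital_status(100, False, INFANT_END))
```

Under this convention the same infant can count as a survivor of the neonatal period and as LTF for the infant period, which is why the two periods are tabulated separately.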

For each of the 11 studies, we identified DHS data for comparison by selecting the DHS survey with the closest time period relative to the study follow-up and the DHS region with the closest geographical location relative to the study site (see the footnote to Table 3 for the specific DHS surveys and regions that we selected). Within each DHS survey dataset, we restricted the analysis population to birth outcomes that occurred in the selected region within the years matching the data collection period for the respective cohort study. For example, if a cohort study began enrollment sometime in 2004 and followed the last participant until sometime in 2009, we included DHS participants with births occurring between January 1, 2004, and December 31, 2009. For each participant, we assigned an exit date as the date of death or date of interview if this event occurred before December 31, 2009, or we administratively censored participant follow-up at December 31, 2009, if the event occurred after this date.
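The window restriction and exit-date rule in the example above could be sketched as follows. This is an illustration only: the function names and arguments are hypothetical, and treating an event that falls exactly on the last day of the window as inside the window is an assumption.

```python
from datetime import date


def in_window(birth, window_start, window_end):
    """True if the birth falls inside the cohort's data collection window."""
    return window_start <= birth <= window_end


def exit_date(death, interview, window_end):
    """Return (exit_date, died_in_window) for one DHS birth record.

    death is None for children alive at interview. An event after the window
    end is administratively censored at window_end.
    """
    event = death if death is not None else interview
    if event <= window_end:  # assumption: the last day counts as inside the window
        return event, death is not None
    return window_end, False


# Window from the example in the text.
start, end = date(2004, 1, 1), date(2009, 12, 31)
print(in_window(date(2006, 7, 15), start, end))
print(exit_date(date(2010, 1, 5), date(2010, 3, 1), end))
```

A death that occurs after the window closes is not counted as a death; the child simply contributes exposure up to the censoring date.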

Cohort studies and DHS allowed us to use the same method for computing mortality estimates. In both cases, we used individual data to compute age-specific death rates by week for the neonatal period (0–6 days; 7–13 days; 14–20 days; 21–27 days) and by month for the post-neonatal period (months 2–12) using the event/exposure approach presented by Hill (2013) [14]. We calculated age-specific death rates by dividing the number of deaths by the person-years computed in each age group and for the corresponding period. We then cumulated these death rates, under the assumption of a constant force of mortality within each age interval, to obtain cumulative probabilities of dying from birth to age 28 days and to one year, namely the neonatal (NMR) and infant (IMR) mortality rates. We presented these cumulative probabilities of dying as the number of deaths per 1000 live births.
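A minimal sketch of an event/exposure calculation of this kind is given below. It is not the authors' code. It assumes each record is a pair (age at exit in completed days, death flag), that exits occur at mid-day on the day of exit, and it measures exposure in person-days rather than person-years, which leaves the product of rate and interval width unchanged. Under a constant force of mortality within each interval, the cumulative probability of dying is 1 − exp(−Σ mᵢ nᵢ), where mᵢ is the death rate and nᵢ the width of interval i.

```python
import math


def tabulate(records, edges):
    """Count deaths and person-days of exposure in each age interval.

    records: iterable of (exit_age_days, died); edges: interval boundaries in days.
    """
    n = len(edges) - 1
    deaths = [0] * n
    exposure = [0.0] * n
    for age, died in records:
        t = age + 0.5  # assumption: exit happens mid-day on the day of exit
        for i in range(n):
            lo, hi = edges[i], edges[i + 1]
            if t <= lo:
                break
            exposure[i] += min(t, hi) - lo
            if died and lo <= age < hi:
                deaths[i] += 1
    return deaths, exposure


def cumulative_probability(deaths, exposure, edges):
    """Cumulate piecewise-constant death rates into a probability of dying."""
    hazard = 0.0
    for i, (d, e) in enumerate(zip(deaths, exposure)):
        if e > 0:
            hazard += (d / e) * (edges[i + 1] - edges[i])
    return 1.0 - math.exp(-hazard)


# Hypothetical records; weekly intervals covering the neonatal period.
neonatal_edges = [0, 7, 14, 21, 28]
records = [(0, True), (10, False), (30, False), (45, False)]
deaths, exposure = tabulate(records, neonatal_edges)
nmr_per_1000 = 1000 * cumulative_probability(deaths, exposure, neonatal_edges)
```

Extending `edges` with monthly boundaries up to 365 days would give the corresponding infant mortality figure from the same two functions.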

We graphed the distribution of LTF over time for the neonatal and infant periods against distribution of infants who died and log age-specific mortality rates, respectively, to visually assess the potential extent and timing of missed mortality outcomes due to LTF. We displayed the frequency of each day of the month for dates of birth and death in histograms to visually explore evidence of date heaping.
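The day-of-month tabulation behind such histograms can be sketched as below. The analysis described here was visual; the `heaping_ratio` summary is an added illustration, not a method used in the study, and both function names are hypothetical. The ratio compares the count on one day against the average count across days 1 to 28, which occur in every month.

```python
from collections import Counter
from datetime import date


def day_of_month_counts(dates):
    """Return a list of 31 counts, one per day of the month (index 0 is day 1)."""
    counts = Counter(d.day for d in dates)
    return [counts.get(day, 0) for day in range(1, 32)]


def heaping_ratio(counts, day):
    """Count on `day` divided by the mean count over days 1 to 28."""
    baseline = sum(counts[:28]) / 28
    return counts[day - 1] / baseline if baseline else float("nan")


# Hypothetical dates of death with extra reports on the 15th.
dates = [date(2005, 1, d) for d in range(1, 29)] + [date(2005, 1, 15)] * 3
counts = day_of_month_counts(dates)
print(heaping_ratio(counts, 15))
```

A ratio well above 1 on days such as the 1st or 15th would be consistent with heaping, although in small studies random variation alone can produce such values.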

Datasets shared with our research team by the original study investigators contained no identifying information and, therefore, this analysis was considered exempt by the Institutional Review Board at the Johns Hopkins Bloomberg School of Public Health.


Study design, population, and field protocols

Study design and population

We utilized 11 studies in this analysis, including seven RCTs and four observational longitudinal cohort studies, from Asia (n = 4), Sub-Saharan Africa (n = 4), and Latin America (n = 3) conducted between 1983 and 2015. Seven studies enrolled and followed pregnant women and their infants, and four enrolled only live births, yielding a total of 71,760 pregnant women and 85,095 live births for analysis. Studies were population-based, recruiting pregnancies or live births either through census and systematic surveillance of households in a community or through regular surveillance of health facilities in a geographical area with a facility delivery rate > 90% (Table 1).

Table 1 Study characteristics

Inclusion/exclusion criteria

Most studies had broad inclusion criteria and few exclusion criteria, which were related primarily to residency in the study area and posed minimal potential to bias vital event data (India 2000 [15], Nepal 1999 [16], Nepal 2011 [17], Burkina Faso 2004 [18] and 2006 [19], Kenya 1992 [20, 21], and Brazil 1993 [22], 2004 [23], 2015 [24]). Zimbabwe 1997 [25], however, excluded mothers or infants with acutely life-threatening conditions, infants with birth weight < 1500 g, and multiple births. The Philippines 1983 [26] excluded multiple births.

Surveillance and enrollment protocols

Most community-based studies conducted a single census survey to either identify and follow women of reproductive age or immediately enroll currently pregnant women (Nepal 1999, Nepal 2011, Philippines 1983, Burkina Faso 2004 and 2006). India 2000 identified pregnancies from various sources, including community-based health workers, antenatal care clinics, and development workers in the study area. Kenya 1992 utilized monthly censuses by trained village monitors and/or traditional birth attendants to identify and enroll pregnancies. Brazil 2015, a facility-based study, identified pregnancies through weekly contact with 123 health facilities conducted by study staff. For studies enrolling pregnancies, outcomes were typically reported by locally-resident study staff (Nepal 1999, India 2000, Nepal 2011, Burkina Faso 2004 and 2006). The Philippines 1983 and Kenya 1992 utilized non-study traditional birth attendants to report birth outcomes and study staff to conduct enrollment, birth, and other follow-up visits. Brazil 1993 and 2004 and Zimbabwe 1997 enrolled only live births (i.e., not pregnant women) through visits by study staff to health facilities in the study area (notably, Zimbabwe 1997 only enrolled women during the day, not at night).

Visit frequency

Follow-up visits in pregnancy to identify birth outcomes ranged from very frequent (daily visits in Burkina Faso 2004 and 2006) to infrequent (one baseline visit in pregnancy before delivery in the Philippines 1983). First visits for ascertainment of the birth outcome ranged from < 96 h (Zimbabwe 1997) to the day of birth (Brazil 1993, 2004, and 2015), although most (n = 10) studies conducted this visit at < 72 h and about half of the studies (n = 6) at < 24 h after delivery. The frequency of follow-up visits in the early days and weeks of life ranged from daily visits in the first ten days of life (Nepal 1999) to a visit at three months after the initial birth visit (Brazil 2004 and 2015).

Internal data quality checks

Lost to follow-up during pregnancy

Studies enrolled a high proportion of the pregnancies identified through surveillance (97.6% to 100%) (Table 2). After pregnancy enrollment, LTF before the birth outcome was low for most studies (0% to 13.0%). Reasons for LTF in pregnancy included refusal/withdrawal of consent (0% to 2.5%); out-migration, participant unreachable, or participant missed by birth outcome surveillance (< 0.1% to 13.1%); maternal death (0% to 0.2%); and data errors (0% to 0.5%) (Additional file 1: Appendix 1).

Table 2 Identifying and recording pregnancies, loss to follow-up in pregnancy, and birth outcomes

Loss to follow-up in infant period

LTF for newborns after delivery between day 0 and day 27 ranged from 0.1% to 4.8%, while LTF between day 28 and one year ranged from 0.7% to 43.9% (Table 3, Fig. 1). In most studies, the reason for LTF was unspecified and potentially due to out-migration. For three studies, reasons for LTF in the infant period were specified, including Nepal 1999 (LTF: n = 88, 98.9%; refusal: n = 1, 1.1%), Nepal 2011 (LTF: n = 1140, 94.8%; refusal: n = 62, 5.2%; maternal death: n = 1, 0.1%), and the Philippines 1983 (LTF: n = 246, 71.3%; refusal: n = 44, 12.8%; multiple births (not followed according to study protocol): n = 55, 15.9%).

Table 3 Following and recording infant vital status and loss to follow-up in the neonatal and infant periods
Fig. 1

Distribution of age at death and loss to follow-up in neonatal or infant period by study. A Graphs include live births with complete vital registration data: India 2000: n = 14,147; Nepal 1999 n = 4130; Nepal 2011 n = 32,010; Philippines 1983: n = 3070 observations with complete data (n = 79 live births excluded for missing vital event data). All four of these studies were pregnancy cohorts. B Graphs include live births with complete vital registration data: Burkina 2004: n = 1321; Burkina 2006: n = 1102; Kenya 1992: n = 2332; Zimbabwe 1997: n = 14,108. Burkina Faso 2004 and 2006 and Kenya 1992 were pregnancy cohorts; Zimbabwe 1997 was a birth cohort. C Graphs include live births with complete vital registration data: Brazil 1993: n = 5248; Brazil 2004: n = 4219; Brazil 2015: n = 4270. Brazil 2015 was a pregnancy cohort; Brazil 1993 and 2004 were birth cohorts

Date heaping

Heaping for the date of death was observed in Burkina Faso 2004 and 2006 (due to reliance on maternal recall), Zimbabwe 1997 (15th), Kenya 1992 (15th), and potentially also in Brazil 2015 and Nepal 1999 (1st and 15th) (Additional file 1: Appendix 2). There was no evidence of heaping for dates of the birth outcome in the 11 studies.

Shape of mortality curve

Figure 1 presents histogram distributions of the number of infants who died, the number of infants lost to follow-up, and log mortality rates for the first four weeks of life and months 2 to 12 for each study and the best matching DHS survey and region.

External checks

Comparison with DHS

NMR among the cohort studies was relatively similar to the comparison group in Asia (DHS) and Brazil (national data from DHS for 1993 and United Nations Inter-agency Group for Child Mortality Estimation for 2004 and 2015). However, among the Africa studies, NMR was substantially lower in the study data compared to DHS, except for Kenya 1992, which was similar. In Asia, IMR was lower for Nepal 1999 and higher for the Philippines, relative to DHS. Among studies in Africa, IMR in Burkina Faso 2004 and 2006 was lower than DHS, whereas IMR in Kenya 1992 and Zimbabwe 1997 was much higher than DHS. In Brazil, IMR was lower than DHS, with this difference decreasing from 1993 to 2004 to 2015 (comparison was Brazil nationally).


Our analysis of 11 cohort studies identified field protocols that determine the appropriateness of vital event data for the purpose of mortality estimation. We found that missing birth and death outcomes—a source of bias if selection is associated with mortality risk—were influenced by several aspects of cohort study design and implementation. Several studies achieved low LTF in pregnancy and the neonatal period with no evidence of date heaping, likely due to frequent follow-up visits. Neonatal mortality rates between the external sources and the cohorts were similar in Asia and Latin America and substantially lower in most cohorts in Sub-Saharan Africa. Patterns of infant mortality varied by study and region between cohort studies and DHS comparison data. Potential reasons for these differences and their implications are discussed, while recognizing the absence of a single “gold standard” for mortality estimation.

Review of study design, population, and field protocols, as well as rates of LTF in pregnancy, suggests that studies enrolling pregnancies, rather than live births, are more likely to ascertain a high proportion of birth outcomes and less likely to miss very early neonatal deaths. The Nepal 1999 and the Burkina Faso studies achieved high follow-up of pregnancies with very few missing birth outcomes. Notably, Brazil 2015, a facility-based study, was able to attain a similar result. Nepal 2011 enrolled pregnancies or recorded birth outcomes for women not initially captured by pregnancy surveillance. This open cohort approach allowed for in-migration (and offset out-migration for the same reason) due to women returning to their maternal home for pregnancy and delivery, a common cultural practice in South Asia, especially among younger, nulliparous women. Nepal 2011’s high LTF in pregnancy and the neonatal period is also due in part to administrative censoring after study completion, a cause of missing data less likely to be associated with selection bias for mortality outcomes.

The specific protocols for pregnancy enrollment and follow-up influence whether a high proportion of birth outcomes are captured. Zimbabwe 1997 relied on a wide enrollment window (< 96 h of delivery) and enrolled women/infants only during daytime (potentially excluding women with obstetric complications); the substantially lower early neonatal mortality rate observed in this study is likely due in part to missed early deaths (mortality risk among HIV-infected infants was lower in the early weeks of life, suggesting missing deaths in this group) [27]. The Philippines 1983 conducted a follow-up survey after completing the primary study and found that many pregnancies, some of which later resulted in an infant death, had been missed by pregnancy surveillance.

DHS FBHs are susceptible to missing and inaccurate vital event data for births and deaths, particularly at early ages, resulting from under-reporting of deaths and age misreporting [11]. Although thought to be less common due to use of prospective follow-up, there is also evidence that omissions of births and deaths are an important source of bias at early ages in HDSS sites, a result of time between surveys, recall bias, and the reliability of proxy respondents (i.e., person other than the mother) [28, 29]. Here the strengths of cohort studies—early and complete identification of pregnancies and frequent follow-up visits in pregnancy and the early neonatal period (e.g., as observed in Nepal 1999)—may offer a less biased source of data for estimation of fine strata mortality risks on the first days of life.

Generally, we found that cohort studies applied few inclusion/exclusion criteria; however, when utilized to address specific primary research questions, they can introduce selection bias into vital event data if associated with mortality risk. An example is Zimbabwe 1997, which excluded very low birth weight infants (< 1500 g), most likely leading to underestimation of early neonatal mortality.

Cohort studies had frequent field visits, especially those based in the community, often beginning with a census followed by prospective, house-to-house visits at varying intervals (e.g., Nepal 1999, 2011, and India 2000). In facility-based studies, visits occurred daily to multiple health facilities or antenatal care centers (e.g., Brazil 2015). These approaches increase the likelihood that enrolled pregnancies will be followed to the birth outcome. By contrast, in the Burkina Faso 2004 and 2006 studies, mothers and infants were seen every month at well-baby clinics at the health facility, leading to more missed follow-up visits, and longer maternal recall of date of death, than if visits had been conducted at the household. Visits immediately and frequently after a birth outcome should be prioritized highly to avoid missed very early neonatal deaths, regardless of the study design; this is a major strength of cohort studies compared to DHS FBHs and HDSS.

Our data did not allow for the investigation of potential misclassification of stillbirths and neonatal deaths. This could be reliably done only in studies enrolling pregnancies. Evidence suggests misclassification of these outcomes can cause underestimation or overestimation of mortality rates, depending on the clinical and socio-cultural context. Further, if women can be followed early in pregnancy, then miscarriage rates will be more accurate. More accurate measures of gestational age, such as estimated by ultrasound examination, which is now more feasible and available in low-resource settings than in the past, rather than the less accurate dates of last menstrual period or postnatal assessment methods, will make estimation of stillbirth and miscarriage rates more accurate [30].

Cohort studies were largely unaffected by date heaping bias, except Burkina Faso 2004 and 2006, Kenya 1992, and Zimbabwe 1997. Date heaping in DHS FBHs causes transference of deaths from the early to the late neonatal period due to heaping on day 7 of life [31]. Heaping has been associated with underestimation of neonatal and overestimation of postnatal mortality in HDSS [32], suggesting that cohort studies offer an advantage in their ability to reduce the impact of this bias on early mortality estimates. High-quality training and supervision of the locally resident data collectors used in these studies are reasons for this strength of cohort studies.

LTF varied across studies and the infant period but was generally low. Theoretically, LTF will only bias mortality rates if there is differential risk of death between those LTF and not LTF. Several cohort studies had < 1% LTF in the neonatal period (Nepal 1999, India 2000, Burkina Faso 2004 and 2006, Kenya 1992, and Brazil 1993), while others had around 3% or higher (e.g., Philippines 1983, Zimbabwe 1997, Nepal 2011), indicating increased potential for bias associated with more missed early deaths among LTF infants. Given small sample sizes of these studies, even a few missing deaths could significantly impact mortality rates.
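The point that LTF biases mortality rates only when risk differs between infants who were and were not followed can be made concrete with a simple sensitivity calculation. This sketch is an added illustration, not an analysis from the study: the function name, the counts, and the assumed risk ratios are all hypothetical, and the calculation is only meaningful while the implied risk among LTF infants stays at or below 1.

```python
def adjusted_rate_per_1000(deaths, followed, lost, risk_ratio):
    """Mortality per 1000 if LTF infants had risk_ratio times the observed risk.

    deaths: deaths among infants with known vital status
    followed: infants with known vital status (including deaths)
    lost: infants lost to follow-up
    """
    observed_risk = deaths / followed
    imputed_deaths = lost * observed_risk * risk_ratio
    return 1000 * (deaths + imputed_deaths) / (followed + lost)


# Hypothetical cohort: 30 deaths among 1000 followed infants, 50 lost to follow-up.
for rr in (1.0, 2.0, 3.0):
    print(rr, round(adjusted_rate_per_1000(30, 1000, 50, rr), 1))
```

With a risk ratio of 1 the adjusted rate equals the observed rate, so LTF alone introduces no bias; the gap widens as the assumed risk among LTF infants rises and as the share of infants lost grows.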

Only rough comparisons between study and DHS mortality rates are possible given known biases with DHS and differences in geographical coverage areas. Cohort studies in Asia had similar NMR and IMR compared to DHS; this was observed even for the Philippines 1983, which experienced missing birth outcomes, LTF, and other biases. Data from the Philippines 1983 demonstrate a noticeable reduction in mortality risk at the first month of life followed by an increase at three months of life, peaking at six months. A follow-up survey identified 38 deaths among out-migrants, multiple births, and others that could not be included in the mortality analysis due to missing vital event data. Investigators reported finding misreported (later) dates of death (to avoid violation of government law related to late reporting of mortality outcomes), potentially leading to underestimated early, and overestimated late, mortality rates [33].

Studies in Africa, including Burkina Faso 2004 and 2006, had lower NMR than DHS, although there was better agreement in the postnatal period. One potential reason for this lower mortality rate, compared to DHS, could be the intense follow-up in the trial, including frequent visits by study community health workers, high levels of micronutrient intake, and multiple antenatal care visits. Even in the absence of selection biases at enrollment, differences in the level of care delivered in the trial, compared to the general population, could affect representativeness of mortality estimates. The Kenya 1992 study had a relatively similar NMR to DHS over the study period. The Zimbabwe 1997 NMR is much lower than the DHS rate, potentially as a result of the study inclusion/exclusion criteria, and this relationship is inverted in the postnatal period. The increase in the mortality rate between two and five months in Kenya could result from waning passively transferred maternal antibodies against malaria, which contributed to a large burden of mortality and morbidity around this time [34, 35]. In Zimbabwe 1997, mortality rapidly increased from one to three months; reasons for this pattern could be missed early deaths; exclusion of very low birth weight infants (< 1500 g); or HIV infection, given that 32% of enrolled women were HIV positive and this study took place before the availability of prevention of vertical transmission.

Our study had limitations. Studies included in this analysis were not identified through a systematic review, posing the possibility of selection bias associated with the design, protocols, and other methodological characteristics. Of note was the variation in study designs, field protocols, locations, and time periods across studies, which presented challenges for comparison between included studies and generalizability to other studies outside this analysis. These factors are likely not critical for internal validity in randomized trials or observational studies but can impact mortality patterns by age and sex. We did not evaluate the impact of trial interventions on mortality rates, nor could we evaluate the effects of seasonality and other external factors on mortality estimates or any effect of progressive intervention trials over many years in a single geographic site on mortality estimates.

We have described potential sources of bias in prospective cohort studies in Table 4. These include issues with pregnancy and birth outcomes and mortality estimation, such as missingness, loss to follow-up/out-migration, in-migration, misclassification, and date heaping and recall biases. The table also indicates the possible impact of these biases on mortality rates and proposes approaches to reduce these biases. The direction and magnitude of biases are often specific to the study design, site, and cultural context. Investigators should aim to understand local and cultural factors associated with potential biases and design customized strategies to reduce their impact. Investigators should be careful to note how the exclusion of certain participants could introduce selection bias (if associated with mortality risk) and how this differs from study of a special population, wherein mortality estimates may be unaffected by selection bias, but still non-representative of the underlying population. Quantitative validation studies comparing vital event data, FBHs, HDSS, and cohort studies, and the effects of various field protocols, should be the focus of future research to understand the potential for this underutilized resource for mortality estimation.

Table 4 Potential sources of bias in vital event data in population-based cohort studies


Prospective, population-based cohort studies that followed certain protocols can yield high-quality vital event data to contribute meaningfully to our understanding of mortality patterns of infants in LMIC settings (Table 5). These included enrolling pregnancies, limiting exclusion criteria potentially associated with mortality, capturing a high proportion of birth outcomes, immediate and frequent follow-up after delivery, and identifying and reducing other biases (e.g., related to the stigma of reporting a death) and data error issues (e.g., heaping). Cohort studies offer strengths not found in DHS FBHs or HDSSs, particularly immediate and frequent follow-up after the pregnancy outcome. Our results suggest that population-based cohort studies could provide high-quality vital event data for mortality estimation and understanding detailed patterns of mortality by age, particularly early in the neonatal period.

Table 5 Recommended protocols for collection of high-quality vital events data for mortality estimation in population-based birth cohort studies

Availability of data and materials

Not applicable.


  1. United Nations Inter-agency Group for Child Mortality Estimation (UN IGME). Levels & trends in child mortality: report 2020, estimates developed by the United Nations Inter-Agency Group for Child Mortality Estimation. United Nations Children’s Fund; 2020.

  2. Mikkelsen L, Phillips DE, AbouZahr C, et al. A global assessment of civil registration and vital statistics systems: monitoring data quality and progress. Lancet Lond Engl. 2015;386(10001):1395–406.

    Article  Google Scholar 

  3. United Nations Statistics Division. Quality of vital statistics obtained from civil registration. United Nations

  4. Guillot M, Gerland P, Pelletier F, Saabneh A. Child mortality estimation: a global overview of infant and child mortality age patterns in light of new empirical data. PLOS Med. 2012;9(8):e1001299.

    Article  PubMed  PubMed Central  Google Scholar 

  5. Guillot M, Prieto J, Verhulst A, Gerland P. Modeling age patterns of under-5 mortality: results from a log-quadratic model applied to high-quality vital registration data. Published online August 31, 2020.

  6. MEASURE DHS. Demographic and health surveys.

  7. Unicef. Statistics and monitoring: multiple indicator cluster survey.

  8. United Nations. Inter-agency group for child mortality estimation.

  9. Short Fabic M, Choi Y, Bird S. A systematic review of demographic and health surveys: data availability and utilization for research. Bull World Health Organ. 2012;90:604–12.

    Article  PubMed  PubMed Central  Google Scholar 

  10. Hill K. Approaches to the measurement of childhood mortality: a comparative review. Popul Index. 1991;57(3):368–82.

    Article  CAS  PubMed  Google Scholar 

  11. Pullum TW, Becker S. Evidence of omission and displacement in dhs birth histories. ICF International; 2014.

    Google Scholar 

  12. Sankoh O, Byass P. The INDEPTH network: filling vital gaps in global epidemiology. Int J Epidemiol. 2012;41(3):579–88.

    Article  PubMed  PubMed Central  Google Scholar 

  13. Katz J, Lee AC, Kozuki N, et al. Mortality risk in preterm and small-for-gestational-age infants in low-income and middle-income countries: a pooled country analysis. Lancet Lond Engl. 2013;382(9890):417–25.

    Article  Google Scholar 

  14. Hill K, et al. Direct estimation of child mortality from birth histories. In: Moultrie T, et al., editors. Tools for demographic estimation. International Union for the Scientific Study of Population; 2013.

    Google Scholar 

  15. Rahmathullah L, Tielsch JM, Thulasiraj RD, et al. Impact of supplementing newborn infants with vitamin A on early infant mortality: community based randomised trial in southern India. BMJ. 2003;327(7409):254.

    Article  PubMed  PubMed Central  Google Scholar 

  16. Christian P, West KP, Khatry SK, et al. Effects of maternal micronutrient supplementation on fetal loss and infant mortality: a cluster-randomized trial in Nepal. Am J Clin Nutr. 2003;78(6):1194–202.

    Article  CAS  PubMed  Google Scholar 

  17. Mullany LC. Impact of sunflower seed oil massage on neonatal mortality and morbidity in Nepal (NOMS).

  18. Roberfroid D, Huybregts L, Lanou H, et al. Effects of maternal multiple micronutrient supplementation on fetal growth: a double-blind randomized controlled trial in rural Burkina Faso. Am J Clin Nutr. 2008;88(5):1330–40.

    Article  CAS  PubMed  Google Scholar 

  19. Huybregts L, Roberfroid D, Lanou H, et al. Prenatal food supplementation fortified with multiple micronutrients increases birth length: a randomized controlled trial in rural Burkina Faso. Am J Clin Nutr. 2009;90(6):1593–600.

    Article  CAS  PubMed  Google Scholar 

  20. Marchant T, Willey B, Katz J, et al. Neonatal mortality risk associated with preterm birth in East Africa, adjusted by weight for gestational age: individual participant level meta-analysis. PLoS Med. 2012;9(8):e1001292.

    Article  PubMed  PubMed Central  Google Scholar 

  21. ter Kuile FO, Terlouw DJ, Kariuki SK, et al. Impact of permethrin-treated bed nets on malaria, anemia, and growth in infants in an area of intense perennial malaria transmission in western Kenya. Am J Trop Med Hyg. 2003;68(4 Suppl):68–77.

    Article  PubMed  Google Scholar 

  22. Victora CG, Hallal PC, Araújo CL, Menezes AM, Wells JC, Barros FC. Cohort profile: the 1993 Pelotas (Brazil) birth cohort study. Int J Epidemiol. 2008;37(4):704–9.

    Article  PubMed  Google Scholar 

  23. Santos IS, Barros AJD, Matijasevich A, Domingues MR, Barros FC, Victora CG. Cohort profile: the 2004 Pelotas (Brazil) birth cohort study. Int J Epidemiol. 2011;40(6):1461–8.

    Article  PubMed  Google Scholar 

  24. Hallal PC, Bertoldi AD, Domingues MR, et al. Cohort profile: The 2015 Pelotas (Brazil) birth cohort study. Int J Epidemiol. 2018;47(4):1048–1048h.

    Article  PubMed  Google Scholar 

  25. Humphrey JH, Iliff PJ, Marinda ET, et al. Effects of a single large dose of vitamin A, given during the postpartum period to HIV-positive women and their infants, on child HIV infection, HIV-free survival, and mortality. J Infect Dis. 2006;193(6):860–71.

    Article  CAS  PubMed  Google Scholar 

  26. Adair LS, Popkin BM, Akin JS, et al. Cohort profile: the Cebu longitudinal health and nutrition survey. Int J Epidemiol. 2011;40(3):619–25.

    Article  PubMed  Google Scholar 

  27. Marinda E, Humphrey JH, Iliff PJ, et al. Child mortality according to maternal and infant HIV status in Zimbabwe. Pediatr Infect Dis J. 2007;26(6):519–26.

    Article  PubMed  Google Scholar 

  28. Liu L, Kalter HD, Chu Y, et al. Understanding misclassification between neonatal deaths and stillbirths: empirical evidence from Malawi. PLoS ONE. 2016;11(12):e0168743.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  29. Helleringer S, Liu L, Chu Y, Rodrigues A, Fisker AB. Biases in survey estimates of neonatal mortality: results from a validation study in Urban Areas of Guinea-Bissau. Demography. 2020;57(5):1705–26.

    Article  PubMed  Google Scholar 

  30. Gernand AD, Paul RR, Ullah B, et al. A home calendar and recall method of last menstrual period for estimating gestational age in rural Bangladesh: a validation study. J Health Popul Nutr. 2016;35(1):34.

    Article  PubMed  PubMed Central  Google Scholar 

  31. Hill K, Choi Y. Neonatal mortality in the developing world. Demogr Res. 2006;14(18):429–52.

    Article  Google Scholar 

  32. Eilerts H, Romero Prieto J, Eaton JW, Reniers G. Age patterns of under-5 mortality in sub-Saharan Africa during 1990–2018: a comparison of estimates from demographic surveillance with full birth histories and the historic record. Demogr Res. 2021;44(18):415–42.

    Article  PubMed  PubMed Central  Google Scholar 

  33. Office of Population Studies. The Cebu longitudinal health and nutrition survey - survey procedures. University of San Carlos Carolina Population Center, The University of North Carolina at Chapel Hill Nutrition Center of the Philippines; 1989.

  34. Crawley J. Reducing the burden of anemia in infants and young children in malaria-endemic countries of Africa: from evidence to action. Am J Trop Med Hyg. 2004;71(2 Suppl):25–34.

    Article  PubMed  Google Scholar 

  35. Phillips-Howard PA, Nahlen BL, Kolczak MS, et al. Efficacy of permethrin-treated bed nets in the prevention of mortality in young children in an area of high perennial malaria transmission in western Kenya. Am J Trop Med Hyg. 2003;68(4 Suppl):23–9.

    Article  PubMed  Google Scholar 

  36. Lee ACC, Mullany LC, Tielsch JM, et al. Risk factors for neonatal mortality due to birth asphyxia in southern Nepal: a prospective, community-based cohort study. Pediatrics. 2008;121(5):e1381-1390.

    Article  PubMed  Google Scholar 

Download references


Thank you to all of the women, infants, and their families who participated in the studies included in this analysis.

Patients or public involvement

Patients or the public were not involved in the design, conduct, reporting, or dissemination plans of our research.


This work was supported by the National Institute of Child Health and Human Development (NICHD 1R01HD090082-01). The Nepal Oil Massage Trial (Nepal 2011) was supported by the National Institute of Child Health and Human Development (HD060712) and the Bill & Melinda Gates Foundation (OPP1084399).

Author information




DE, SS, AV, MG, and JK conceptualized and designed the study. DE conducted the analysis and wrote the manuscript. DE and SS cleaned and prepared the datasets for analysis; AV provided analysis and programming support. All authors reviewed results, discussed interpretations, and contributed to development and revision of the manuscript.

Corresponding author

Correspondence to Daniel J. Erchick.

Ethics declarations

Ethics approval and consent to participate

Datasets shared with our research team by the original study investigators did not contain any identifying information; therefore, this analysis was considered exempt by the Institutional Review Board at the Johns Hopkins Bloomberg School of Public Health.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Supplementary Tables and Figures.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Erchick, D.J., Subedi, S., Verhulst, A. et al. Quality of vital event data for infant mortality estimation in prospective, population-based studies: an analysis of secondary data from Asia, Africa, and Latin America. Popul Health Metrics 21, 10 (2023).
