The effect of participant nonresponse on HIV prevalence estimates in a population-based survey in two informal settlements in Nairobi city
© Ziraba et al; licensee BioMed Central Ltd. 2010
Received: 29 July 2009
Accepted: 22 July 2010
Published: 22 July 2010
Participant nonresponse in an HIV serosurvey can affect estimates of HIV prevalence. Nonresponse can arise from a participant's refusal to provide a blood sample or the failure to trace a sampled individual. In a serosurvey conducted by the African Population and Health Research Center and the Kenya Medical Research Institute in the slums of Nairobi, 43% of sampled individuals did not provide a blood sample. This paper describes selective participation in the serosurvey and estimates bias in HIV prevalence figures.
The paper uses data derived from an HIV serosurvey nested in an on-going demographic surveillance system. Determinants of nonresponse were assessed using logistic regression, and missing HIV status was imputed by multiple imputation using a set of common variables available for all sampled participants.
Age, residence, high mobility, wealth, and ethnicity were independent predictors of a sampled individual not being contacted. Individuals aged 30-34 years, females, individuals from the Kikuyu and Kamba ethnic groups, married participants, and residents of Viwandani were all less likely to accept HIV testing when contacted. Although men were less likely to be contacted, those found were more willing to be tested compared to females. The overall observed HIV prevalence was overestimated by 2%. The observed prevalence for male participants was underestimated by about 1% and that for females was overestimated by 3%. These differences were small and did not affect the overall estimate substantially, as the observed estimates fell within the confidence limits of the corrected prevalence estimate.
Nonresponse in the HIV serosurvey in the two informal settlements was high; however, the effect on the overall prevalence estimate was minimal.
Selective participation in a study can potentially skew estimates of the outcome of interest in a study population [1–5]. This is more likely to be the case if the circumstances that influence low participation are in some way related to the main outcome. Nonresponse in HIV serosurveys is mainly due to refusal to provide a blood sample for HIV testing or absenteeism of the sampled individual during the survey period. Several population-based HIV seroprevalence studies have reported varying nonresponse rates for HIV testing, ranging from as low as 5% among men in Rwanda to 56% in Lesotho [2, 3]. A moderate nonresponse rate (14.4%) for HIV testing in Kenya was reported in an earlier survey. Studies carried out to date on this topic have shown that the effect of participant nonresponse on HIV prevalence estimates varies by certain characteristics, such as gender and residence. Yet in general, the overall effect on national estimates is small, unless the level of nonresponse is very high, as was the case in Lesotho [1–3, 6].
HIV/AIDS remains a highly stigmatized disease, with many people preferring either not to know their status or to keep it a secret [7, 8]. The preference of an individual not to participate in a serosurvey may partly be influenced by the fear of knowing his or her own HIV serostatus. On the other hand, those who know their status as positive may participate in a serosurvey in the hope that they can be helped, or they may choose not to participate as they see no immediate benefit. Personal perceived risk may be correlated with actual risk of HIV infection. Perceptions about HIV risk are unlikely to be random among individuals in a population; they are likely to vary by defined individual characteristics, such as race, religion, ethnicity, and past behaviors, including experience with drug use or sex work. For that reason, if participants who perceive themselves to be at a higher risk of contracting HIV do not participate in a serosurvey, then prevalence estimates may be biased downward and might affect the overall estimate.
Interviewers may fail to make contact with a sampled person for a number of reasons, including temporary absence, work patterns, inability to locate the household/structure in which the sampled person lives, and out-migration. Highly mobile individuals, such as long-distance truck drivers, security personnel, and migrant workers, often have a different level of exposure to the risk of HIV [11–14]. In highly mobile populations, many sampled individuals may not be contacted, even if a good random sample is drawn. If a population has a substantial proportion of highly mobile individuals who miss out on a seroprevalence study and yet are likely to be at a higher risk, the estimates are likely to be biased downward as less mobile and low risk individuals are overrepresented in the effective sample interviewed [2, 3]. On the other hand, if a majority of a community's residents are migrant workers who live away from their families, they are likely to be exposed to higher risks of HIV infection. To the extent that such individuals are overrepresented in a seroprevalence survey, estimates are likely to be biased upward.
The slum context
Although informal settlements in Nairobi city are home to more than 60% of Nairobi's population, the informal nature of housing is likely to lead to underrepresentation of the slum population in national surveys, given the difficulty involved in listing temporary housing structures. Until the project on which this paper is based was conducted, HIV prevalence in the informal settlements was unknown. Kenya has had at least two large population-based HIV testing surveys [16, 17]. The Kenya Demographic and Health Survey of 2003 put the HIV prevalence estimate for Nairobi province at 10%. Nyanza province had the highest prevalence rate at 15%, and the national prevalence rate was 6.7%. There were differences in HIV prevalence rates by age, gender, ethnicity, rural-urban residence, educational attainment, and wealth status. These differences have been observed in several other surveys in sub-Saharan countries [2, 3, 16]. A more recent survey, the Kenya AIDS Indicator Survey 2007, estimated the national prevalence to be 7% and Nairobi province's prevalence rate to be 9%. However, the national surveys are unable to provide HIV prevalence estimates for slums. Earlier behavioral research indicates that high-risk sexual practices are prevalent in the informal settlements of Nairobi [18, 19]. Furthermore, recent work using verbal autopsies to establish causes of death, without HIV status, showed that HIV/AIDS and tuberculosis accounted for more than 50% of the adult mortality burden in the slums.
The African Population and Health Research Center (APHRC), in partnership with the Kenya Medical Research Institute (KEMRI), carried out a survey to estimate the prevalence and risk factors for HIV in two informal settlements in Nairobi city. The two communities where the project was carried out are informal settlements characterized by poor housing, lack of clean water, poor sanitation, unemployment, poverty, and overcrowding. Viwandani slum is located very close to the city's industrial area and is home to many low-income youths working in the industries close by. Korogocho is a more established slum settlement with a high proportion of men living with their spouses and children. Korogocho residents are predominantly either very low-income earners or unemployed. Additionally, residents of Viwandani are relatively more educated than those of Korogocho.
The survey, like many community-based surveys, faced a challenge of nonresponse, with a sizeable proportion of sampled individuals being nonresponders (43%). The desire to understand the effect of nonresponse on prevalence estimates was the basis for this paper. We hypothesised that the HIV prevalence estimate in the survey was underestimated due to low participation of highly mobile community members. Specifically, this paper aimed to describe selective participation in the serosurvey by sociodemographic characteristics and also to estimate the bias in the estimates of HIV prevalence.
Data used in this paper came from a cross-sectional serosurvey carried out from September 2006 to November 2007. The project was nested in the Nairobi Urban Health and Demographic Surveillance System (NUHDSS) covering about 60,000 individuals in two slums: Korogocho and Viwandani. The NUHDSS database provided the sampling frame from which a random sample of eligible participants was drawn. Eligible individuals had to be residents in the demographic surveillance area, registered with the NUHDSS, and aged 15 to 54 years for men and 15 to 49 years for women. A total of 5,004 individuals were sampled. However, after the study and with the benefit of extra DSS updates of the residency status of individuals under surveillance, 237 individuals were found not to have been legitimate residents at the time the sample was drawn. These individuals have thus been excluded from the overall sample, leaving a total of 4,767.
A list of all sampled participants was generated with enough information to enable field workers to positively identify participants in their households. The questionnaires and blood sample filter papers, however, did not contain any identifiers except a new identification number (ID) to allow linkage to the NUHDSS data. A minimum of three visits were made for individuals who were not found at home on the first visit, and security arrangements were made to interview individuals who were identified as only available at odd hours (very early in the morning or late in the evening).
Participants were given information about the objectives of the study and information about their rights. Potential risks and benefits were read aloud by the interviewer to those who could not read, and those who could read were allowed enough time to read before making a decision. Those who accepted to participate affirmed it by signing the pre-written consent form. Minors (15 to 17 years old) who agreed to participate assented by signing the minor's consent form, and their guardians also had to confirm their support by appending their signatures or thumb prints. Individuals who consented to participate had the option of either responding to the interview only, providing a blood sample only, or providing both.
The survey used a questionnaire to collect data on knowledge of HIV prevention, HIV testing history, marriage and sexual activity, and circumcision. HIV status was determined using HIV serology on dried blood spots obtained from participants through a finger prick using Determine® HIV-1/HIV-2 (Abbott) and Uni-Gold™ Test kits, according to the manufacturers' instructions. By design, participants were not given the HIV results from the blood sample provided for the study. Those who wanted to know their status were provided standard pre-test counselling, testing, and post-test counselling at a Voluntary Counselling and Testing Centre. Core variables from the NUHDSS database were linked anonymously to the survey and serodata results using a linking ID.
Descriptive and multivariate logistic regression analyses were carried out to describe participation by sociodemographic characteristics and to assess determinants of sampled individuals being contacted and determinants for agreeing to provide a blood sample for HIV testing among those contacted.
Table 1. Participant response categories considered during imputation
Category. Participant survey outcome: variables available
1. Interviewed and tested, or tested only: HIV status, socio-demographic, mobility index, HIV knowledge and attitudes, and sexual behaviour variables
2. Interviewed but not tested: socio-demographic, mobility index, HIV knowledge and attitudes, and sexual behaviour variables
3. Refused both interview and testing: socio-demographic and mobility index data only
4. Not contacted at all: socio-demographic and mobility index data only
As pointed out by Marston et al., mobility is an important risk factor for HIV and, whenever possible, should be factored into the adjustments. Mobility data were available for all individuals as they were derived from the demographic surveillance database. The mobility index was derived from a count of movement episodes of participants within or out of the surveillance area per unit time. An individual was considered to be highly mobile if she or he had at least one episode of change of residence per year or at least one out-migration and return episode to the surveillance area in two years.
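The classification rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the text does not say exactly how episodes were counted or which observation window was used for each individual.

```python
def is_highly_mobile(residence_changes_per_year: float,
                     out_and_return_episodes_2yr: int) -> bool:
    """Apply the two-part rule described in the text.

    residence_changes_per_year: average number of changes of residence
        per year (hypothetical input, assumed precomputed).
    out_and_return_episodes_2yr: number of out-migration-and-return
        episodes within a two-year window (hypothetical input).
    """
    if residence_changes_per_year < 0 or out_and_return_episodes_2yr < 0:
        raise ValueError("episode counts and rates cannot be negative")
    # Either condition alone is enough to classify as highly mobile.
    return (residence_changes_per_year >= 1.0
            or out_and_return_episodes_2yr >= 1)


# Illustrative inputs only, not survey data.
print(is_highly_mobile(0.5, 0))  # neither condition met
print(is_highly_mobile(0.0, 1))  # one out-migration and return episode
```

Keeping the rule as a single boolean function makes it easy to see that the two conditions are alternatives, not joint requirements.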
The missing HIV status for those in category 2 was imputed against category 1, which had HIV status data, sociodemographic variables, and survey data. Missing HIV status data in categories 3 and 4 were imputed separately against category 1 using sociodemographic variables and the mobility index. Multiple imputation was carried out in Stata version 10 using a user-written program called ice [22, 23]. Unlike other multivariate approaches to handling missing data, the ice program does not assume a joint multivariate distribution, which makes it flexible and more appealing to use. Imputations for HIV status data were carried out separately for each of the three participant categories that had no HIV status data and by gender. For each category, using the multiple imputation program, we created 10 imputed datasets (5-10 copies are recommended), with missing values filled in as predicted by the variables in the model. The ice command automatically creates and combines the multiple imputed data files to get a single data file for a given category. From the combined file, prevalence estimates and corresponding confidence intervals for proportions were derived. The overall corrected HIV prevalence estimate (observed and imputed) was taken to be a weighted average of the imputed and observed prevalences, and an overall confidence interval for the resultant prevalence was also derived.
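The two combining steps can be illustrated with a short Python sketch. It is not the authors' Stata code. The pooling function uses Rubin's rules, which is the standard way to combine estimates across imputed datasets but is an assumption here, since the text does not name the pooling method. The weighted average assumes the weights are the numbers of individuals in each response category, which the text also does not state. All numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m imputed-data estimates with Rubin's rules (assumed method).

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    pooled = mean(estimates)          # average of the m point estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


def weighted_prevalence(groups):
    """Weighted average prevalence over (group size, prevalence) pairs."""
    total_n = sum(n for n, _ in groups)
    if total_n <= 0:
        raise ValueError("total group size must be positive")
    return sum(n * p for n, p in groups) / total_n


# Hypothetical values for illustration only.
pooled, total_var = pool_rubin([0.10, 0.12, 0.14], [0.0004, 0.0004, 0.0004])
overall = weighted_prevalence([(600, 0.12), (400, 0.10)])
print(round(pooled, 4), round(total_var, 6), round(overall, 4))
```

In this sketch the observed group and each imputed group would each contribute one (size, prevalence) pair to `weighted_prevalence`.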
Table 2. Percentage distribution of participants who were successfully contacted, those who accepted to test, and HIV test result, by socio-demographic characteristics
[Table body not recovered. Surviving column headers: Accepted to test; HIV prevalence among tested. Surviving row labels: < 20 yrs; Not highly mobile. Surviving test statistics (p-value), with row and column assignment lost: 39.3 (< 0.001), 59.9 (< 0.001), 20.9 (< 0.001), 26.6 (< 0.001), 130.1 (< 0.001), 22.2 (< 0.001), 97.9 (< 0.001), 83.7 (< 0.001), 21.1 (< 0.001), 185.0 (< 0.001), 220.0 (< 0.001), 28.3 (< 0.001), 90.3 (< 0.001), 155.6 (< 0.001).]
With regard to agreeing to be tested, higher proportions of younger individuals, residents of Korogocho, and members of the Luhya and Luo ethnic backgrounds accepted the test than their counterparts. There were no significant differences between those who accepted to test and those who refused by gender, educational attainment, and wealth status. The distribution of HIV prevalence by age, ethnicity, slum of residency, educational attainment, and marital status showed significant variation across several variables. Individuals below 20 years of age had the lowest prevalence but one of the highest participation rates, while men had lower participation rates and lower HIV prevalence. On the other hand, residents of Korogocho had higher participation rates and higher HIV prevalence. The Luo and Luhya ethnic groups and the widowed/divorced had higher participation rates and corresponding higher HIV prevalence than their counterparts.
Table 3. Logistic regression model results showing odds ratios of being successfully contacted, by gender, controlling for socio-demographic characteristics
[Table body not recovered. Surviving column header: Odds Ratio [95% CI] (two columns). Surviving row labels: Less than 20 yrs; Not highly mobile.]
For both sexes, individuals from the wealthiest households were more than 2.5 times more likely to be contacted compared to their poorest counterparts. Women from Viwandani were less likely to be contacted compared to women from Korogocho, but there were no significant differences among men. Women and men classified as highly mobile were less likely to be contacted compared to those classified as less mobile. Women and men who had never been married were significantly less likely to be contacted compared to their married counterparts.
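For readers less familiar with the "Odds Ratio [95% CI]" format, the sketch below computes an unadjusted odds ratio and a Woolf (log-based) 95% confidence interval from a 2x2 table. This is a simpler technique than the multivariate logistic regression used in the paper, which adjusts for other characteristics, so the numbers are not comparable to the adjusted estimates. The counts are hypothetical and the function name is illustrative.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio with a Woolf confidence interval.

    a: exposed group, outcome present (e.g. wealthiest, contacted)
    b: exposed group, outcome absent  (e.g. wealthiest, not contacted)
    c: reference group, outcome present
    d: reference group, outcome absent
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this method")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not survey data.
or_value, ci_lower, ci_upper = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {or_value:.2f} [95% CI {ci_lower:.2f}-{ci_upper:.2f}]")
```

An interval that excludes 1 corresponds to a difference that is statistically significant at the 5% level under this method.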
Table 4. Logistic regression model results showing odds ratios of accepting to be tested, by gender, controlling for socio-demographic characteristics
[Table body not recovered. Surviving row labels: Less than 20 yrs; Not highly mobile.]
Table 5. Observed, imputed and overall adjusted prevalence of HIV
[Table values not recovered. Column headers: Interview and testing status; Prevalence [95% CI] (three columns). Rows:]
(a) Interviewed and tested: observed prevalence
(b) Interviewed only: imputed prevalence
(c) Refused both: imputed prevalence
(d) No contact made: imputed prevalence
(e) All nonresponse: imputed prevalence for b, c and d combined
(f) Overall corrected (adjusted) prevalence
This paper explored nonresponse to HIV testing in a survey and its impact on HIV prevalence estimates in informal settlements with a relatively mobile and young population. Nonresponse to HIV testing in this study (43%) was quite high compared to other community-based HIV testing surveys [1–3]. Absenteeism contributed 62%, while refusals accounted for 38% of nonresponse. At the time of designing the survey, an estimated nonresponse rate of 40% was factored into the sample size estimation, based on what has been reported elsewhere and the attrition rates in the NUHDSS.
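The shares reported above can be converted into proportions of the whole sample with simple arithmetic. The sketch below uses only the three percentages given in the text; the function name is illustrative.

```python
# Figures reported in the text.
NONRESPONSE_RATE = 0.43   # share of sampled individuals with no blood sample
SHARE_ABSENT = 0.62       # share of nonresponse due to absenteeism
SHARE_REFUSED = 0.38      # share of nonresponse due to refusal


def split_nonresponse(nonresponse_rate, share_absent, share_refused):
    """Express each nonresponse component as a fraction of the full sample."""
    if abs(share_absent + share_refused - 1.0) > 1e-9:
        raise ValueError("component shares must sum to 1")
    return nonresponse_rate * share_absent, nonresponse_rate * share_refused


absent, refused = split_nonresponse(NONRESPONSE_RATE, SHARE_ABSENT, SHARE_REFUSED)
print(f"Not contacted: {absent:.1%} of the sample")
print(f"Refused testing: {refused:.1%} of the sample")
```

On these figures, roughly 27% of the sample was never contacted and roughly 16% refused, which is why absenteeism, and therefore mobility, receives so much attention in the discussion.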
Bivariate and multivariate assessments of responders and nonresponders showed that there were statistically significant differences between the two groups, justifying the need to assess the extent to which the observed differences could have affected the overall estimate of HIV prevalence in this population. Age, socio-economic status, residence, and mobility index were found to be good predictors of whether an individual was likely to be successfully contacted or not. Older people were not only less likely to be contacted, but they were also less likely to accept HIV testing. This finding was somewhat surprising. One would have expected younger adults to be more mobile and less inclined to spare their time to participate in the survey. However, it should be noted that economic survival in the informal settlements relies on a cash economy dominated by informal employment. It might be the case that older people (up to 49 years for women and up to 54 years for men) have more demanding family responsibilities and as such are likely to be away from home fending for their families. Similarly, residents of Viwandani were less likely to be found at home, and if found, they were less inclined to participate. This finding is in line with our expectation. Viwandani slum is predominantly inhabited by young adults, with smaller families and more educated residents who are more likely to be working in the nearby industrial estate, hence the higher likelihood of not being found at home.
Members of two of the ethnic communities with the highest HIV prevalence in Kenya, the Luo and Luhya, were more likely to be contacted than their Kikuyu counterparts, and they were also more likely to accept testing. It is hard to find an explanation for this observation. As expected, the mobility index predicted the likelihood of being found at home but not necessarily that of accepting to participate. If all these dynamics were examined in isolation, it would be hard to predict the likely overall impact the differential participation would have on HIV estimates. The odds of participation in the survey were not consistently higher among subgroups that are characteristically known to have higher or lower HIV prevalence, whether defined by age, gender, ethnicity, marital status, or socio-economic status. Thus, from the descriptive results, it is difficult to guess the overall direction in which the results would be biased, if at all, given that both participation rates and observed HIV prevalence varied in various directions by the key sociodemographic variables.
In the final model of multiple imputations, the overall effect on the estimates was small, showing that contrary to our expectation, HIV prevalence appears to have been overestimated by about 2%. Imputed estimates among females were consistently lower than the observed prevalence. It is important to note that all observed estimates lie within the confidence limits of the adjusted estimates, indicating that differences are small in spite of the significant differences in participation rates by sociodemographic characteristics as noted in Tables 2, 3, and 4. The high nonresponse rate observed in this study notwithstanding, results show that sound estimates can be obtained in a community-based HIV seroprevalence survey in a similar setting. The observed bias in this study is minimal but has a gender component, with a tendency to overestimate prevalence among women and underestimate it among males.
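The comparison described above, checking whether an observed estimate falls within the confidence limits of the adjusted estimate, can be sketched as follows. The interval here is a simple Wald interval for a proportion, which is an assumption: the text does not say which interval method was used, and an interval for a multiply imputed estimate would normally also reflect between-imputation variability. All numbers are hypothetical.

```python
import math


def wald_ci(prevalence: float, n: int, z: float = 1.96):
    """Wald confidence interval for a proportion (assumed method)."""
    if not 0.0 <= prevalence <= 1.0 or n <= 0:
        raise ValueError("prevalence must be in [0, 1] and n positive")
    half_width = z * math.sqrt(prevalence * (1 - prevalence) / n)
    return max(0.0, prevalence - half_width), min(1.0, prevalence + half_width)


def within_limits(observed: float, limits) -> bool:
    """True if the observed estimate lies inside the given interval."""
    lower, upper = limits
    return lower <= observed <= upper


# Hypothetical adjusted prevalence of 10% based on 1,000 individuals.
adjusted_limits = wald_ci(0.10, 1000)
print(within_limits(0.115, adjusted_limits))  # observed estimate inside the interval
print(within_limits(0.130, adjusted_limits))  # observed estimate outside the interval
```

An observed estimate inside the adjusted interval is the sense in which the text describes the bias as small.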
Future work in similar settings should take into consideration a number of issues. The informal nature of the housing makes listing of households extremely difficult. In the absence of a dedicated registration and monitoring system such as the demographic surveillance system, having an updated sampling frame is nearly impossible. Ways around this challenge should be carefully considered from the start. Although this study has found that the impact of nonresponse on overall estimates was minimal, it is prudent to adequately sample the population, factoring in nonresponse rates based on attrition rates where available. Extra efforts to reach hard-to-contact individuals must be considered while planning the study, especially in terms of duration of the study and adequacy of field staff.
Although the NUHDSS provided background characteristics for all sampled individuals, including nonresponders, the set of variables was rather limited for predicting the risk of HIV infection. It is possible that nonresponders were significantly and systematically different from responders on characteristics other than those used in the adjustments. The multiple imputation method used also assumes that missing data are missing at random (MAR). In reality, this might not be the case, and the predictions may not be as good. Mobility, as pointed out in other studies, is a key predictor of HIV infection, yet the way the mobility index was measured falls short of capturing short-term movements, such as absences of days or weeks, as happens with long-distance truck drivers. Movements involving short durations of absence might actually be more important in exposing individuals to the risk of HIV than movements involving longer periods of absence. Non-return migration can also result in underestimation of HIV prevalence, especially if the reason for out-migration is associated with poor health, as is the case with terminally ill HIV/AIDS patients. One study noted markedly high HIV/AIDS-related death rates among rural returnees in South Africa, indicating that a significant proportion of rural return migrants were HIV positive. HIV prevalence in the origin population could be affected (lowered) as a result of selective out-migration of infected individuals.
The estimate of HIV prevalence in slums is higher than that reported for Nairobi province, with women being disproportionately affected. Nonresponse resulted in minimal overestimates of HIV prevalence overall. We also infer that it is possible to obtain reliable results even in a relatively mobile population under surveillance as long as proper considerations are made at the survey design and implementation stages.
We wish to extend our sincere thanks to the Viwandani and Korogocho communities for their continued support and participation in our research projects. Implementation of this project was made possible through generous financial support from the Rockefeller Foundation under grant 2006AR013. We also acknowledge the financial support to NUHDSS from The Wellcome Trust grant-GR078530MA and Hewlett Foundation grant-2006-8376. We appreciate the contribution of our field and data management staff. Their dedication to work in a difficult environment at the height of political tensions was immensely appreciated.
- Mishra V, Vaessen M, Boerma JT, Arnold F, Way A, Barrere B, Cross A, Hong R, Sangha J: HIV testing in national population-based surveys: experience from the Demographic and Health Surveys. Bull World Health Organ 2006, 84: 537-545. 10.2471/BLT.05.029520
- Garcia-Calleja JM, Gouws E, Ghys PD: National population based HIV prevalence surveys in sub-Saharan Africa: results and implications for HIV and AIDS estimates. Sex Transm Infect 2006, 82(Suppl 3): iii64-70. 10.1136/sti.2006.019901
- Marston M, Harriss K, Slaymaker E: Non-response bias in estimates of HIV prevalence due to the mobility of absentees in national population-based surveys: a study of nine national surveys. Sex Transm Infect 2008, 84(Suppl 1): i71-i77. 10.1136/sti.2008.030353
- Reniers G, Araya T, Berhane Y, Davey G, Sanders EJ: Implications of the HIV testing protocol for refusal bias in seroprevalence surveys. BMC Public Health 2009, 9: 163. 10.1186/1471-2458-9-163
- Reniers G, Eaton J: Refusal bias in HIV prevalence estimates from nationally representative seroprevalence surveys. Aids 2009, 23: 621-629. 10.1097/QAD.0b013e3283269e13
- National Statistical Office (NSO) [Malawi], ORC Macro: Malawi Demographic and Health Survey 2004. Calverton, Maryland: NSO and ORC Macro; 2005.
- Zou J, Yamanaka Y, Muze J, Watt M, Ostermann J, Thielman N: Religion and HIV in Tanzania: influence of religious beliefs on HIV stigma, disclosure, and treatment attitudes. BMC Public Health 2009, 9: 75. 10.1186/1471-2458-9-75
- Simbayi LC, Kalichman S, Strebel A, Cloete A, Henda N, Mqeketo A: Internalized stigma, discrimination, and depression among men and women living with HIV/AIDS in Cape Town, South Africa. Soc Sci Med 2007, 64: 1823-1831. 10.1016/j.socscimed.2007.01.006
- Akwara PA, Madise NJ, Hinde A: Perception of risk of HIV/AIDS and sexual behaviour in Kenya. J Biosoc Sci 2003, 35: 385-411. 10.1017/S0021932003003857
- Moutsiakis DL, Chin PN: Why blacks do not take part in HIV vaccine trials. J Natl Med Assoc 2007, 99: 254-257.
- Zuma K, Gouws E, Williams B, Lurie M: Risk factors for HIV infection among women in Carletonville, South Africa: migration, demography and sexually transmitted diseases. Int J STD AIDS 2003, 14: 814-817. 10.1258/095646203322556147
- Lydie N, Robinson NJ, Ferry B, Akam E, De Loenzien M, Abega S: Mobility, sexual behavior and HIV infection in an urban population in Cameroon. J Acquir Immune Defic Syndr 2004, 35: 67-74. 10.1097/00126334-200401010-00010
- Kishamawe C, Vissers DC, Urassa M, Isingo R, Mwaluko G, Borsboom GJ, Voeten HA, Zaba B, Habbema JD, de Vlas SJ: Mobility and HIV in Tanzanian couples: both mobile persons and their partners show increased risk. Aids 2006, 20: 601-608. 10.1097/01.aids.0000210615.83330.b2
- Nunn AJ, Wagner HU, Kamali A, Kengeya-Kayondo JF, Mulder DW: Migration and HIV-1 seroprevalence in a rural Ugandan population. Aids 1995, 9: 503-506.
- Matrix Development Consultants: Nairobi's informal settlements: An inventory. A Report Prepared for USAID/REDSO/ESA. Nairobi: USAID; 1993.
- Central Bureau of Statistics (Kenya), Ministry of Health (MOH-Kenya), ORC Macro: Kenya Demographic and Health Survey 2003. Calverton, Maryland: Central Bureau of Statistics [Kenya], Ministry of Health (MOH) [Kenya], ORC Macro; 2004.
- National AIDS and STI Control Programme, Ministry of Health, Kenya: Kenya AIDS Indicator Survey 2007. Preliminary Report. Nairobi, Kenya: NASCOP; 2008.
- Zulu EM, Dodoo FN, Chika-Ezee A: Sexual risk-taking in the slums of Nairobi, Kenya, 1993-8. Popul Stud (Camb) 2002, 56: 311-323. 10.1080/00324720215933
- APHRC: Population and health dynamics in Nairobi's informal settlements. Nairobi: African Population and Health Research Center; 2002.
- Kyobutungi C, Ziraba AK, Ezeh A, Ye Y: The burden of disease profile of residents of Nairobi's slums: Results from a Demographic Surveillance System. Popul Health Metr 2008, 6: 1. 10.1186/1478-7954-6-1
- van Buuren S, Boshuizen HC, Knook DL: Multiple imputation of missing blood pressure covariates in survival analysis. Statistics in Medicine 1999, 18: 681-694. 10.1002/(SICI)1097-0258(19990330)18:6<681::AID-SIM71>3.0.CO;2-R
- Royston P: Multiple imputation of missing values: Update of ice. The Stata Journal 2005, 5: 527-536.
- Royston P: Multiple imputation of missing values. The Stata Journal 2004, 4: 227-241.
- Clark SJ, Collinson MA, Kahn K, Drullinger K, Tollman SM: Returning home to die: circular labour migration and mortality in South Africa. Scand J Public Health Suppl 2007, 69: 35-44. 10.1080/14034950701355619
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.