Symposium: Challenges in Targeting Nutrition Programs
Discussion: Targeting is Making Trade-offs¹

Jean-Pierre Habicht and Edward A. Frongillo²

Division of Nutritional Sciences, Cornell University, Ithaca, NY 14853-6301

ABSTRACT The previous articles presented different aspects of targeting: the implicit political implications, the trade-offs in giving power to different stakeholders to decide and to implement targeting, perceptions of frontline workers in implementing a program, and a technical article about selecting a scale for targeting, which we review in greater detail. It is well recognized that targeting results in a trade-off between not serving those who should be served and including those who should not be served. Less well recognized are the trade-offs that are the consequences of deciding between using indicators of risk vs. using indicators that predict benefit. J. Nutr. 135: 894–897, 2005.

KEY WORDS: targeting nutrition programs risk benefit trade-off

The articles issuing from the symposium on "Challenges in Targeting Nutrition Programs" were much broader in scope than we originally envisaged. They open new vistas of thinking and research. One broad vision about targeting (1) reveals that this apparently technical enterprise is enmeshed in political considerations, which are often hidden, but are amenable to analysis. Another broad vision (2) considers the trade-offs in involving different stakeholders in targeting decisions, where the political process plays out in tension with technical considerations of effectiveness and efficiency. Both these articles discuss the different ways targeting can be accomplished but from different complementary points of view. Finally, another article (3) described the perspectives of the frontline providers of a targeted national nutrition program.

One article (4) addressed the more technical issues that we had expected. It is a report about the Institute of Medicine's (IOM)³ examination of scales to measure inadequate diets to target the Women, Infants, and Children's (WIC) Program.
This article is lucid and explicit about the technical issues facing the committee. The article concludes that the validity and the reliability of present measures of dietary risk are insufficient to use those measures as a basis for accepting or rejecting women and children from the WIC program. The 2002 report itself from the IOM (5) gives more background. Therefore, this IOM exercise provides a well-documented example that can be examined in light of some of the insights developed in the other articles. We then follow with a more detailed discussion about developing screening tools for targeting.

Pelletier (1) discusses how the premises underlying technical work are political. He uses a liberal and conservative dichotomy as an example. One can examine the motivations of the stakeholders involved in the WIC program, identify their interests, and then derive the approaches they would take to address the problem of dealing with the 2 major concern cells in Figure 4 of Pelletier's article (1). The genesis of the WIC program involved many stakeholders, including those who were more interested in selling food than in social benefits. One may presume that those interested in selling food would favor more inclusive targeting, because more food would be sold to the government. Expanding WIC also corresponds to the interests of those responsible for the program if they follow a natural bureaucratic instinct, which is rewarded for expansion. These interests are compatible with those concerned about ensuring adequate health care and nutrition to all Americans, who put a higher premium on being sure that all who need WIC receive WIC, even if many who do not need it also receive WIC services.

The interests of those who favor expansion and more inclusiveness are in competition with those whose interest is less government expenditure, who thus favor a WIC program that costs less. One way of reducing costs is to diminish the number of beneficiaries through restrictive targeting.
Similarly, technocrats who seek to improve efficiency would also impose a more restrictive screen, because that will increase the proportion of those served who actually need the program.

The above review of players includes those who initiated WIC and those who fund and regulate it. The frontline workers who implement the program are usually omitted from these considerations. The article by Lee et al. (3) shows how important these players' actions are in how a program is actually implemented. It appears from their work that, most often, frontline workers are concerned with ensuring that services are maximally accessible to cover those who are in need. The article by Lee et al. (3) about the Elderly Nutrition Program and the work by Dickin (6) on the Expanded Food and Nutrition Program, both administered by the U.S. Department of Agriculture, also illustrate well the wealth of understanding of the frontline workers and suggest that taking their knowledge about context, motivation, and practice into account in larger decision-making processes would improve the process and might well improve reaching the goals.

¹ Presented as part of the symposium "Challenges in Targeting Nutrition Programs" given at the 2004 Experimental Biology meeting on April 20, 2004, Washington, DC. The symposium was sponsored by the American Society for Nutritional Sciences. The proceedings are published as a supplement to The Journal of Nutrition. This supplement is the responsibility of the Guest Editors to whom the Editor of The Journal of Nutrition has delegated supervision of both technical conformity to the published regulations of The Journal of Nutrition and general oversight of the scientific merit of each article. The opinions expressed in this publication are those of the authors and are not attributable to the sponsors or the publisher, editor, or editorial board of The Journal of Nutrition. The Guest Editors for the symposium publication are Edward A. Frongillo and Jean-Pierre Habicht, Division of Nutritional Sciences, Cornell University.
² To whom correspondence should be addressed. E-mail: [email protected].
³ Abbreviations used: IOM, Institute of Medicine; ROC, receiver operating characteristics; WIC, Women, Infants, and Children.
0022-3166/05 $8.00 © 2005 American Society for Nutritional Sciences.
Downloaded from jn.nutrition.org by guest on April 3, 2011.

This discussion illustrates that integrating the perspectives and the motivations of stakeholders requires historical, political, anthropological, and organizational psychology expertise. This is worth doing to bring these disparate views into a common framework for a discussion that includes not just the technical but also other perspectives, such as economic, legal, bureaucratic, and political. Our expertise, however, lies with the more technical aspects of developing a screen for targeting.

The first technical issue in targeting is the development of a valid screen to decide whom to include or exclude from an intervention or program. The screen can be developed to decide about where and when to intervene at population levels, such as in famine prevention (7), or it can be developed to target interventions at the individual level, such as preventing some deleterious behavior, such as a poor eating pattern, or an outcome, such as death.

There are various approaches to building a reliable scale once the objectives and context are defined. One investigates the reliability of achieving some cutoff on a scale where both the cutoffs and the scales were constructed for other purposes. For instance, one might use dietary recommendations intended for counseling.
Such recommendations are hortatory rather than normative in that many in a population may not follow the recommendations, and are unlikely to do so, and yet many of them will not suffer because of that lapse. The IOM 2002 report (5) illustrates the difficulty of adapting this counseling information for use in targeting a program.

Another approach first builds the scale and then applies cutoff points. One such approach tries to measure exactly the underlying dichotomous construct (e.g., death or a long-term adequate diet) and then builds a feasible scale relative to that construct. Yet another approach is to develop an index through principal-component analyses, which use a number of measures that seem to be related to an underlying but unmeasured construct and then examine whether they cluster into patterns that are consistent with the construct (8). This was the approach used in developing a scale for hunger (9,10).

Whatever the scale, it must be tested by the receiver operating characteristics (ROC) method against some gold standard of reality. This is a recognized method for dichotomous scales (11,12) such as the presence or the absence of a symptom, or a diagnosis. It is also the only method to be used for continuous scales (13), such as the scales usually used in nutrition targeting. We give an example with anthropometric data from 1-y-old Bangladeshi children measured in 1974 relative to their subsequent 2-y survival (14). The sensitivities and the specificities for identifying children who would die were calculated across the whole range of screening cutoffs for height-for-age, for weight-for-height, and for arm circumference (15). Figure 1 presents the ROC of those measurements by plotting Z-scores of the percentiles of sensitivity against the Z-scores of the percentiles of specificity in predicting the 2-y mortality. The interval scales of the Z-scores are presented in the left and bottom axes.
The corresponding percentiles are presented in the upper and right axes to show that they are not plotted on an interval scale. The further the ROC line is from the indifference line, the better the scale is at predicting the underlying reality, in this case risk of death. The indifference line is the 45-degree diagonal going through the 50th percentile of both sensitivity and specificity. Figure 1 shows that stunting (height-for-age) and arm circumference were better measures for predicting death than was wasting (weight-for-height).

Examination of the ROC lines plotted as the interval Z-scores is essential before proceeding with further statistical analyses, which are based on the assumption that the lines are parallel to each other and to the indifference line. In this example, the height-for-age and the arm circumference ROC lines are indeed parallel to each other, but one cannot count on this usually being the case (13). The ROC lines for height-for-age and for arm circumference are not parallel to the indifference line, revealing that these 2 indicators are better screens at high sensitivity than at high specificity. The results from the ROC indicate that at higher sensitivity for height-for-age such as 80%, the specificity is 35%, and, at 80% specificity, the sensitivity is 48%, indicating a slightly better screen at lower than at higher specificity. This is a small deviation from parallelism and does not preclude using a single statistic (13) to describe the quality of the screen. In spite of being small, the deviation is still visible in Figure 1, which would not have been the case if the ROC had been plotted with an interval scale of percentiles. Such an inappropriate plot can hide ROC lines that are much less parallel, for which a single statistic makes no sense.

Using the above example, one can compare the proportions of deaths in those selected at cutoffs that select with 80% sensitivity and with 80% specificity, respectively.
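The ROC construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the analysis used for Figure 1: the data are a hypothetical toy sample, a child is treated as screen-positive when the anthropometric score is at or below the cutoff, and the function names `roc_points` and `probit` are ours. The Z-score of a proportion is obtained from the inverse of the standard normal distribution.

```python
from statistics import NormalDist


def roc_points(scores, died, cutoffs):
    """Return (cutoff, sensitivity, specificity) for each cutoff.

    Assumption: a child is screen-positive when score <= cutoff.
    """
    n_died = sum(died)
    n_survived = len(died) - n_died
    points = []
    for cutoff in cutoffs:
        # Deaths correctly flagged by the screen
        true_pos = sum(1 for s, d in zip(scores, died) if d and s <= cutoff)
        # Survivors correctly left out by the screen
        true_neg = sum(1 for s, d in zip(scores, died) if not d and s > cutoff)
        points.append((cutoff, true_pos / n_died, true_neg / n_survived))
    return points


def probit(proportion):
    """Z-score of a proportion; raises an error at exactly 0 or 1."""
    return NormalDist().inv_cdf(proportion)


# Hypothetical toy data: anthropometric Z-scores and outcome (True = died)
scores = [-3.5, -3.2, -3.0, -2.8, -2.5, -2.2, -2.0, -1.8, -1.5, -1.0, -0.5, 0.0]
died = [True, False, True, False, True, False,
        False, True, False, False, False, False]

points = roc_points(scores, died, cutoffs=[-3.1, -2.9, -2.4])
for cutoff, sens, spec in points:
    # Coordinates on the interval (Z-score) axes of an ROC plot
    print(cutoff, sens, spec, round(probit(sens), 2), round(probit(spec), 2))
```

Plotting the probit-transformed pairs rather than the raw percentages is what makes departures from parallelism visible, which is the point made in the text about plotting on interval Z-score axes.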
The proportions of deaths among those selected are called the positive predictive value by epidemiologists (16) and yield (17) by others. The yield depends not only on sensitivity and specificity but also on the incidence (or prevalence) of the outcome of interest in the population. In this case 112 of the 2019 children died for a population incidence of 55 deaths per 1000. At high specificity, the yield was 11.5 deaths per 1000 among those selected, whereas at high sensitivity, it was 55 per 1000, no different than for all the children. This example reveals how increasing specificity increases the yield. It also decreased the sensitivity to 40%, however, meaning that over two-thirds of the deaths were not predicted by this cutoff.

FIGURE 1 Sensitivity and specificity for 3 anthropometric indicators.
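The dependence of yield on sensitivity, specificity, and incidence can be made explicit with the standard positive-predictive-value formula. The sketch below is ours, not the authors' calculation: the function name `yield_per_1000` is hypothetical, and the inputs are the rounded height-for-age sensitivities and specificities quoted in the text, so the results are approximations that need not reproduce the yields reported above, which were derived from the actual data.

```python
def yield_per_1000(sensitivity, specificity, incidence):
    """Positive predictive value (yield), expressed per 1000 selected.

    yield = sens * inc / (sens * inc + (1 - spec) * (1 - inc))
    """
    selected_cases = sensitivity * incidence
    selected_noncases = (1.0 - specificity) * (1.0 - incidence)
    return 1000.0 * selected_cases / (selected_cases + selected_noncases)


# Population incidence from the text: 112 deaths among 2019 children
incidence = 112 / 2019

# Rounded operating points quoted in the text for height-for-age
high_specificity = yield_per_1000(sensitivity=0.48, specificity=0.80,
                                  incidence=incidence)
high_sensitivity = yield_per_1000(sensitivity=0.80, specificity=0.35,
                                  incidence=incidence)

print(round(high_specificity, 1), round(high_sensitivity, 1))
```

A screen that selects everyone (sensitivity 1, specificity 0) returns the population incidence, which is a convenient check that the formula behaves as the text describes.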
This example also reveals how poorly a screen may actually predict an outcome of interest, even when an indicator scale (e.g., height-for-age) is significantly correlated with the outcome, if the population incidence or prevalence is low.

The above example is also useful because it describes a high-specificity screen that identifies a group of children who are at higher than usual risk. It is true that some screens are designed to identify those who deserve help, such as the working poor, even if they cannot benefit from the program; however, in this case one would expect the screen to select those children who would benefit from the program by surviving. This requires a different reality than death; it requires that the reality be deaths prevented by the program. In other words, it requires identifying indicators that reveal a potential to benefit from the intervention (18). Thus, potential-to-benefit indicators are often different from indicators of risk, which reflect or predict future harm. For instance, mother's height and head circumference are good predictors for risk of low birth weight but are poor predictors of benefit from nutritional intervention, because targeting on mother's height will not improve the efficiency of nutritional interventions to prevent low birth weight (19). Similarly, height in infancy is a good predictor of height in adolescence in an undernourished population, but it predicted benefit in growth in height from food supplementation less well than did weight in infancy (20). These studies to identify predictors of benefit all required interventions.

In the absence of intervention studies to estimate potential-to-benefit indicators, one can estimate a potential (P) to benefit from the product of the risk (R) times the effectiveness of the intervention (E) to alleviate or to prevent the risk (i.e., R × E) (17). This requires that the indicator of risk be appropriate for the intervention planned.
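The product P = R × E can be illustrated with a short sketch showing that ranking by risk and ranking by potential to benefit need not agree. Everything in it is hypothetical: the groups, their risks, and their effectiveness values are invented for illustration, and the `Group` class is our own construct rather than anything from the cited reports.

```python
from dataclasses import dataclass


@dataclass
class Group:
    name: str
    risk: float           # R: probability of the adverse outcome without the program
    effectiveness: float  # E: fraction of that risk the intervention removes

    @property
    def potential_to_benefit(self) -> float:
        # P = R * E, the estimate described in the text
        return self.risk * self.effectiveness


# Hypothetical groups; the numbers are illustrative only
groups = [
    Group("A", risk=0.20, effectiveness=0.10),
    Group("B", risk=0.10, effectiveness=0.50),
    Group("C", risk=0.05, effectiveness=0.60),
]

by_risk = sorted(groups, key=lambda g: g.risk, reverse=True)
by_benefit = sorted(groups, key=lambda g: g.potential_to_benefit, reverse=True)

print([g.name for g in by_risk])
print([g.name for g in by_benefit])
```

In this toy case the group at highest risk has the lowest potential to benefit, because the intervention does little to reduce its risk. That is the situation in which a risk scale is a poor substitute for a potential-to-benefit scale.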
Often the risk scale is a measure of some biological or behavioral reality, which is related to the risk. It is instructive that occasionally such a risk scale may be a poor measure of the underlying biological or behavioral reality, yet be excellent for predicting the risk itself. For instance, in our work to prevent famines in Indonesia, we used expected harvest yield, because it was a good predictor of famines. The validity and the reliability of the scale as it related to harvest yield suffered when the responsibility for collecting the information on harvest yield passed to those who were rewarded for the yield that they reported. They overreported unless the harvest was failing, in which case they underreported, because they were rewarded for high yield and were punished for low yield unless the low yield was catastrophic. This invalid and unreliable indicator of harvest yield would be a better indicator of famine, because it included not only some knowledge of the real yield but also added knowledge about other factors that were more pertinent to predicting famine than to harvest yield itself. Thus, a scale need only be valid and reliable at the cutoff point. This permits the development of scales that are neither continuous nor measure the same variable across the range of the scale. For instance, a cutoff on a Guttman scale (21) that uses the presence or the absence of a single attribute, such as a single household possession, or knowledge about a single fact, may be a good screen because that cutoff encapsulates the information necessary to target. This case of a biased estimate by frontline workers of a determinant of famines, which is nevertheless accurate and reliable in predicting famines, illustrates how frontline workers can sometimes target better than can quantitative indicators collected by other means.
In summary, the quality of the screen for targeting depends on the degree to which the screen concentrates those who can benefit from the program (1 minus specificity), with the least loss of those who could also benefit but are excluded from the program (1 minus sensitivity). One recommendation is to first identify the best scale for screening by ROC graphing and analyses, and to then decide on the cutoff point. This is difficult if alternate scales cross. The best scale will depend on whether one chooses the cutoff point above or below the crossover. Fortunately, the following considerations resolve the problem.

The best cutoff point is one that results in including the number of people that the program can handle. A more inclusive and therefore sensitive screen will deliver more people than the program can handle at a particular time. A more sensitive screen will admit more people who are less likely to benefit from the program and who will displace those who are more likely to benefit if the number is exceeded. Thus, a screen that has too high a sensitivity for the current capacity of the program actually decreases the sensitivity of the program itself. This discussion reveals that using sensitivity and specificity as the basis for setting cutoffs for program screening is, in fact, incorrect if the number of actual beneficiaries is fixed. In that case, the basis should be the number to be delivered by the screen. Once the cutoffs are chosen, plotting them on the ROC lines, such as in Figure 1, will automatically identify the best technical scales. At this juncture, other characteristics of the scale, such as ease of measurement, can be taken into account in a final choice.

The above technical discussion holds when a program can only accommodate a specifiable fixed number of beneficiaries. In that case, there is actually no disagreement on where the cutoff point should be set between those concerned about including the most needy and those concerned with excluding the non-needy.
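The capacity-based rule described above can be sketched as follows. This is a minimal illustration under our own assumptions, not a procedure taken from the article: the two scales and the "would benefit" labels are hypothetical, lower scores are assumed to indicate greater need, and the helper names `select_by_capacity` and `captured` are invented. For each scale, the cutoff is set implicitly by admitting exactly the number the program can handle, and the scales are then compared by how many of those admitted could benefit.

```python
def select_by_capacity(scores, capacity):
    """Indices of the `capacity` individuals with the lowest scores.

    Assumption: lower scores indicate greater need.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    return order[:capacity]


def captured(selected, would_benefit):
    """Number of selected individuals who could benefit from the program."""
    return sum(1 for i in selected if would_benefit[i])


# Hypothetical data for 8 individuals measured on two candidate scales
would_benefit = [True, False, True, False, False, True, False, False]
scale_a = [-3.0, -2.5, -2.8, -1.0, -0.5, -1.5, -0.2, 0.1]
scale_b = [-1.0, -3.0, -2.9, -2.8, -0.5, -2.5, -0.2, 0.3]

capacity = 3  # number of people the program can handle (assumed fixed)

results = {
    name: captured(select_by_capacity(scale, capacity), would_benefit)
    for name, scale in {"scale_a": scale_a, "scale_b": scale_b}.items()
}
best = max(results, key=results.get)
print(results, best)
```

With the number admitted held fixed, the scale that captures more of those who can benefit also excludes more of those who cannot, so the comparison no longer involves a trade-off between sensitivity and specificity.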
This technical conclusion, that the cutoff is not in dispute once a program is established with a fixed number of participants, is very different from what one would infer from the liberal-conservative tension described by Pelletier (1). The number a program can handle, however, is not a purely technical issue, nor is it necessarily fixed. Sociopolitical processes determine the number. In that context, there is a real tension, because a more or less sensitive screen can be used to advocate or to oppose a program.

In this article, we have identified a well-known trade-off that is, in fact, not technically true if sociopolitical considerations are absent. Thus, sociopolitical considerations entail trade-offs that, in turn, have implications for technical trade-offs. Many other trade-offs remain. At the technical level, decisions need to be made about using highly quantifiable measurements and using more holistic selections made by knowledgeable frontline staff. These decisions also need to take into account how much autonomy that staff should have in other aspects of the program. At a higher level of consideration is whether risk scales can substitute adequately for the potential-to-benefit scales, which are the scales that should be used but about which we are mostly ignorant. Developing better screens depends on developing potential-to-benefit scales, which is urgent, but presents major research challenges both relative to feasibility and, above all, to generalizability.

ACKNOWLEDGMENTS

We thank Gretel H. Pelto for discussions and insights, and Jennifer Schaub for creating the figure.

LITERATURE CITED

1. Pelletier, D. L. (2005) The science and politics of targeting: who gets what, when, and how. J. Nutr. 135: 890–893.
2. Marchione, T. J. (2005) Interactions with the recipient community in targeted food and nutrition programs. J. Nutr. 135: 886–889.
3. Lee, J. S., Frongillo, E. A. & Olson, C. M. (2005) Meanings of targeting from program workers. J. Nutr. 135: 882–885.
4. Caulfield, L. E. (2005) Methodological challenges in performing targeting: assessing dietary risk for WIC participation and education. J. Nutr. 135: 879–881.
5. Institute of Medicine (2002) Dietary Risk Assessment in the WIC Program. National Academy Press, Washington, DC.
6. Dickin, K. L. (2003) The Work Context of Community Nutrition Educators: Relevance to Work Attitudes and Program Outcomes. Ph.D. dissertation, Cornell University, Ithaca, NY.
7. Brooks, R. M., Abunain, D., Karyadi, D., Sumarno, I., Williamson, D., Latham, M. C. & Habicht, J.-P. (1985) A timely warning and intervention system for preventing food crises in Indonesia: Applying guidelines for nutrition surveillance. Food Nutr. 11: 37–43.
8. DeVellis, R. F. (1991) Scale Development: Theory and Applications. Sage, Beverly Hills, CA.
9. Radimer, K. L., Olson, C. M., Greene, J. C., Campbell, C. C. & Habicht, J.-P. (1992) Understanding hunger and developing indicators to assess it in women and children. J. Nutr. Educ. 24: 36S–45S.
10. Frongillo, E. A. (1999) Validation of measures of food insecurity and hunger. J. Nutr. 129: 506S–509S.
11. Swets, J. A., Pickett, R. M., Whitehead, S. F., Getty, D. J., Schnur, J. A., Swets, J. B. & Freeman, B. A. (1979) Assessment of diagnostic technologies. Science 205: 753–759.
12. Swets, J. A. & Pickett, R. M. (1982) Evaluation of Diagnostic Systems: Methods from Signal Detection Theory. Academic Press, New York, NY.
13. Brownie, C., Habicht, J.-P. & Cogill, B. (1986) Comparing indicators of health and nutritional status. Am. J. Epidemiol. 124: 1031–1044 (Note erratum. In Figure 1 false negative and false positive should be transposed.).
14. Chen, L. C., Chowdury, A.K.M.A. & Huffman, S. L. (1980) Anthropometric assessment of energy-protein malnutrition and subsequent risk for mortality among preschool children. Am. J. Clin. Nutr. 33: 1836–1845.
15. Cogill, B. (1982) Ranking anthropometric indicators using mortality in rural Bangladeshi children. Ph.D. dissertation, Cornell University, Ithaca, NY.
16. Habicht, J.-P. (1980) Some characteristics of indicators of nutritional status for use in screening and surveillance. Am. J. Clin. Nutr. 33: 531–535.
17. Institute of Medicine (1996) WIC Nutrition Risk Criteria: A Scientific Assessment. National Academy Press, Washington, DC.
18. Habicht, J.-P. & Pelletier, D. L. (1990) The importance of context in choosing nutritional indicators. J. Nutr. 120: 1519–1524.
19. Habicht, J.-P. & Yarbrough, C. (1980) Efficiency in selecting pregnant women for food supplementation during pregnancy. In: Maternal Nutrition During Pregnancy and Lactation (Aebi, H. & Whitehead, R., eds.), pp. 314–336. Nestle Foundation Series, Huber, Bern, Switzerland.
20. Ruel, M. T., Habicht, J.-P., Rasmussen, K. M. & Martorell, R. (1996) Screening for nutrition interventions: the risk or the differential-benefit approach? Am. J. Clin. Nutr. 63: 671–677.
21. Nunnally, J. C. (1978) Psychometric Theory, 2nd ed. McGraw-Hill, New York, NY.