A need to measure the strength of national family planning programs emerged around 1970. Controversies had arisen regarding the effectiveness of these programs, in particular whether they made a contribution to fertility decline beyond that produced by economic development and social changes. Advocates argued that fertility decline would be hastened if modern contraceptive methods were introduced to populations in combination with intensive public education campaigns; together, supporters contended, these initiatives would create a new behavioral environment and legitimize the new practices. However, critics doubted that the programs did more than merely provide contraceptives to people who would have used them anyway.1
One way to examine these issues was to devise a measure of the strength, or inputs, of national family planning programs, independent of their effects, that could be used in statistical analyses to relate both program effects and socioeconomic effects to rates of contraceptive use and fertility. The first separate, independent measures of program effort, published in 1972, were devised by Lapham and Mauldin, who compiled and analyzed data on a variety of program measures.2,3 They conducted a new analysis in 1982,4,5 this time using data from a 125-item questionnaire that was sent to informants in each country; additional cycles of the survey were conducted in 1989,6 1994,7 1999,8 and, with modifications, 2004. In these studies, ratings were produced for 30 program features and summed to provide a single score for overall effort. Data from this 32-year span stand as the only source for tracing changes in the nature and strength of family planning programs. This article presents the first analysis of results from the 2004 cycle.
The original idea for the effort measures came from the sense that a vigorous family planning program should possess certain features. It should offer a full range of contraceptive methods and deliver them to the whole population through a variety of channels. It should have a corps of full-time fieldworkers and an arm for informing and educating the public about contraception. Prominent leaders should issue frequent statements favoring the use of contraceptives and legitimizing the reversal of pronatalist values. The program should have a full-time director, placed well up in the government structure, and various ministries and private agencies should provide technical, logistical and financial assistance.
In contrast, a weak program can be conceptualized as one that reflects minimal effort: It offers only one method, does so in only a few sites, neglects the rural sector, has no outreach workers, provides no public education, has a low-level government official as its part-time director, is not supported by multiple ministries and fails to stimulate the private sector. Recognizing either a very weak or a very strong program is straightforward; for programs in the middle, however, measurement becomes more delicate.
Nonetheless, reviews of program effects1,9–16 have consistently shown that program strength and social setting have remarkable effects on contraceptive use and fertility. It has been suggested that effort scores that depend on respondent judgments might be colored by respondents' knowledge of fertility levels, thus making conclusions about the relationship between program effort and fertility partly circular; however, a study in which researchers obtained, rather laboriously, an objective measure for each score found reasonable comparability with the standard survey-based scores.17 Another area of research has focused on the nature of program action. For example, studies that examined scores on the 30 program features have shown differences between the profiles of low-scoring countries and those of high-scoring countries, as well as changes in profiles as a country's scores increase over time.18
Although the original vision of what a strong national family planning program should look like remains relevant today, the international environment surrounding these programs has evolved during the past several decades. The major international conferences of the period marked the transitions. The optimism surrounding the 1965 International Conference on Family Planning Programs in Geneva19 gave way to the sobering impact of the 1974 World Population Conference in Bucharest,20 where general development was given priority over family planning. Next came the chilling effect of the 1984 International Conference on Population in Mexico City,21 where the United States dramatically reversed its formerly favorable policies on family planning, and the sea change caused by the 1994 International Conference on Population and Development in Cairo,22 where a new, broader view of reproductive health was embraced.
Additional changes in the family planning environment have occurred as well. In many countries, health reforms have transferred budgetary control and the choice of action priorities from capitals to the provinces and local areas, potentially weakening contraceptive services and education. Moreover, in some areas, the HIV/AIDS crisis has overwhelmed health services (including family planning), changed funding priorities and personnel assignments, and decimated health staffs. Finally, an element of donor fatigue has developed, particularly for such essential services as supplying modern contraceptives, which many developing countries cannot afford to do because of insufficient local manufacturing capability or a lack of hard currency.
For these reasons, we expanded the questionnaire for the 2004 cycle of this research to include simple scales on four topics: current justifications for national family planning programs, negative influences on programs, population subgroups of special interest and overall quality of services.
DATA AND METHODS
The 125-item questionnaire was used in all rounds of the study from 1982 to 1999. The items were coded and combined to produce 30 scores, each of which represents a feature of family planning programs (e.g., involvement of government agencies; use of mass media).* These 30 features, in turn, were organized into four broad groups, or components: policy and stage-setting activities, service and service-related activities, evaluation and recordkeeping, and method accessibility and availability.
Each cycle of the study was quite laborious, requiring the identification of likely respondents from different backgrounds and institutions in nearly 100 countries, as well as extensive follow-up by mail and fax (much of this work was done before e-mail was available). Moreover, the questionnaire was lengthy and rather demanding, both to complete and to analyze. Therefore, it was decided to develop a shorter version of the questionnaire, one that would be easier to use and could be applied more frequently and at lower cost. To test the validity of this approach, such a "short form" was added to the end of the 125-item "long form" in 1999; the short form summarized the meaning of each of the 30 program feature indices in a brief statement and asked the respondent to provide a rating from 1 to 10. The results of the short and long forms corresponded reasonably closely, although ratings were somewhat lower overall with the short form.23
For the 2004 cycle, two changes were made in the interest of lowering cost and simplifying administration. First, only the short form was used. Adjusting its scores by the difference observed in 1999 between the short- and the long-form results yields equivalent long-form scores.
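The text does not spell out the form of this adjustment. As a minimal sketch, assuming the correction is a simple additive offset applied feature by feature (a ratio adjustment would be an equally plausible reading), the conversion might look like the following. The function name, the feature names and all numbers are hypothetical illustrations, not study data.

```python
def estimate_long_form(short_2004, short_1999, long_1999):
    """Estimate 2004 long-form scores from 2004 short-form scores.

    Assumption (not stated in the article): for each feature, the 1999 gap
    (long-form minus short-form) is added to the 2004 short-form score.
    """
    estimates = {}
    for feature, score in short_2004.items():
        offset = long_1999[feature] - short_1999[feature]
        # Keep the estimate inside the 0-100 "percent of maximum" range.
        estimates[feature] = max(0.0, min(100.0, score + offset))
    return estimates


# Hypothetical values (percent of maximum) for two illustrative features.
short_1999 = {"mass_media": 40.0, "pill_access": 60.0}
long_1999 = {"mass_media": 50.0, "pill_access": 66.0}
short_2004 = {"mass_media": 45.0, "pill_access": 62.0}

long_2004_est = estimate_long_form(short_2004, short_1999, long_1999)
```

Under this assumption, a feature whose short-form rating ran 10 points below its long-form rating in 1999 has 10 points added back in 2004.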
Second, the administration of the survey was decentralized. Rather than identifying and contacting potential respondents in every country from a central location, we recruited and trained a study manager for each country. It was the manager's task to select respondents (including program staff, local staff of nongovernmental organizations, resident staff of international organizations, and local academicians and researchers), explain the questionnaire to them, ensure that the forms were completed and return the replies to the authors for analysis. (The managers received prompt feedback on results.) This system had already been used in a study to measure program effort for national maternal and neonatal programs in 55 countries,24 as well as to assess national HIV/AIDS programs.25
As noted earlier, several new scales (which were kept relatively simple to maintain the brevity of the short form) were added to the 2004 questionnaire. First, to examine government motivations for supporting family planning programs, respondents rated the importance of seven possible justifications for their country's program, using a scale of 1 (negligible importance) to 10 (great importance). Next, respondents rated the extent of the emphasis programs give to five special populations (e.g., unmarried youth, postpartum women), again using a scale of 1 (negligible emphasis) to 10 (great emphasis).
Respondents were also asked to rate the impact that six changes in the family planning environment (e.g., decentralization) had had on their program. These items were rated on a scale from –5 to +5, to indicate whether the effects had been negative or positive.
Finally, respondents rated the overall quality of their country's family planning services on a scale of 1–10. In particular, they were asked the following: "Please rate the general quality of family planning services. (Good quality includes a focus on client needs, with counseling, full information, wide method choice and safe clinical procedures.)" The concept of quality is somewhat general, but it is commonly used and it can be assumed that close observers can at least gauge whether the quality of a national program is very poor, very good or somewhere in the middle.
A further change to the survey pertained to the questions regarding the availability of specific contraceptive methods. To provide additional context, respondents were asked to rate the reliability of their program's supply lines for each method. For the pill, for example, respondents were asked: "How well does the pill supply system operate (it avoids stockouts or interrupted supplies and guarantees a reliable flow at local levels)?" These ratings were compared with the ratings for the method's availability to the population.
To facilitate comparisons, both within and among cycles, all scores have been rescaled as percentages of the maximum score. Thus, most scores range from 0 to 100; the only exceptions are the scores assessing the impact of changes in the family planning environment, for which the values range from –100 to 100.
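The rescaling can be illustrated with a short sketch. It assumes the conversion is a plain division by the scale maximum, which is one reading of "percentages of the maximum score"; the function name is hypothetical.

```python
def percent_of_max(rating, scale_max):
    """Express a raw rating as a percentage of the scale maximum."""
    if scale_max <= 0:
        raise ValueError("scale_max must be positive")
    return 100.0 * rating / scale_max


# A rating of 7 on a 1-10 feature scale.
feature_pct = percent_of_max(7, 10)
# A rating of -2 on the -5 to +5 influence scale.
influence_pct = percent_of_max(-2, 5)
```

With this rule, the environment-change items keep their sign, which is why their rescaled values can run from –100 to 100 while the other scales stay at or below 100.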
In the 2004 cycle of the study, a total of 1,037 respondents from 82 countries participated, yielding an average of about 13 respondents per country (see appendix for country list). Response rates are not available, because study managers in each country were directed to identify appropriate respondents and did not necessarily record refusals. As specified in the study design, respondents included program staff (30%), resident staff of international organizations (26%), local staff of nongovernmental organizations (24%) and staff of local academic or research organizations (20%).
The mean short-form score across the 30 program features among all countries included in the 2004 survey was 48 out of 100. The estimated mean long-form score was 56. The region with the highest mean score† was Asia (66), followed by the Central Asian republics (59), anglophone Africa (56), the Middle East and North Africa (55), Latin America and the Caribbean (53), and francophone Africa (53). Every region showed improvement from 1999 to 2004, including Latin America and the Caribbean, whose score had changed little between 1989 and 1999.
Among the 72 countries that participated in both the 1999 and the 2004 studies, the mean total score rose from 53 to 56. This increase, though relatively small (6%), continues the upward slope observed in previous cycles (Figure 1). This finding also holds for the individual regions (not shown).
The picture changes considerably, however, when countries are weighted by population size. Although the scores of smaller countries generally increased between 1999 and 2004, those of certain large countries fell, resulting in declines in the mean ratings for Asia, Latin America and the Caribbean, and the sample as a whole. The decline for Asia was largely due to a marked decrease in scores for China and Indonesia; similarly, in Latin America, Brazil and Mexico reported lower scores in 2004 than in 1999. However, mean scores rose in the other four regions.
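A small sketch shows how the unweighted and population-weighted means can move in opposite directions. The three countries, their populations and their scores are invented for illustration only.

```python
def unweighted_mean(scores):
    """Simple average: every country counts equally."""
    return sum(scores.values()) / len(scores)


def weighted_mean(scores, populations):
    """Average in which each country counts in proportion to its population."""
    total_pop = sum(populations[c] for c in scores)
    return sum(scores[c] * populations[c] for c in scores) / total_pop


# Hypothetical data: one very large country (A) and two small ones.
populations = {"A": 1000, "B": 50, "C": 50}   # millions
scores_1999 = {"A": 80, "B": 40, "C": 45}
scores_2004 = {"A": 70, "B": 50, "C": 54}

rise_unweighted = unweighted_mean(scores_2004) - unweighted_mean(scores_1999)
rise_weighted = (weighted_mean(scores_2004, populations)
                 - weighted_mean(scores_1999, populations))
```

Here the two small countries improve and pull the simple average up, while the decline in the large country dominates the weighted average and pulls it down.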
One dynamic that helps explain some of these trends is that greater improvement is possible in countries that started with low scores. Unlike the top performers, many of whose scores reached ceiling levels some years ago, the low-scoring countries have had ample room to move up in the ratings. If we divide all countries into quartiles according to their 1972 scores and follow each group over time, we find that the score for the top quartile was high at the start and rose little before plateauing (Figure 2). The score for the second highest quartile was initially very low, but it has risen rapidly and now essentially matches that of the highest quartile. The scores for the lowest two quartiles have risen to about 50% of maximum, but their improvement now shows signs of slowing.
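The quartile comparison can be sketched as follows: countries are ranked by their baseline score, split into four equal-sized groups, and each group's mean is then computed for a later cycle. The eight countries and their scores are hypothetical, and the sketch assumes at least four countries so that no group is empty.

```python
def quartile_groups(baseline):
    """Split countries into four groups by baseline score.

    Group 0 holds the lowest-scoring countries, group 3 the highest.
    """
    ranked = sorted(baseline, key=baseline.get)
    n = len(ranked)
    return [ranked[i * n // 4:(i + 1) * n // 4] for i in range(4)]


def group_means(groups, scores):
    """Mean score of each group in a given cycle."""
    return [sum(scores[c] for c in group) / len(group) for group in groups]


# Hypothetical baseline and later-cycle scores (percent of maximum).
baseline = {"c1": 0, "c2": 2, "c3": 5, "c4": 8,
            "c5": 20, "c6": 25, "c7": 60, "c8": 65}
later = {"c1": 45, "c2": 49, "c3": 50, "c4": 52,
         "c5": 58, "c6": 64, "c7": 62, "c8": 66}

groups = quartile_groups(baseline)
later_means = group_means(groups, later)
```

Because group membership is fixed by the baseline ranking, the later means trace the same set of countries over time rather than re-sorting them at each cycle.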
Even for the highest quartile, scores have leveled off at only about 60% of maximum. Among individual countries with the highest ratings, scores have stabilized at 80–85% of maximum (not shown). If that is taken as a ceiling level, the average score of 56 can be viewed as about 68% of what is realistically possible, which still leaves substantial room for improvement.
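The "about 68%" figure can be reproduced with simple arithmetic. The sketch below assumes the midpoint of the 80–85 range (82.5) is used as the ceiling; the article does not say which value was used, so the endpoints are shown as well. The function name is hypothetical.

```python
def share_of_ceiling(score, ceiling):
    """Score expressed as a percentage of a realistic ceiling level."""
    return 100.0 * score / ceiling


# Average score of 56 against three candidate ceilings.
at_midpoint = share_of_ceiling(56, 82.5)   # midpoint of 80-85 (assumption)
at_upper = share_of_ceiling(56, 85)
at_lower = share_of_ceiling(56, 80)
```

Depending on where in the 80–85 range the ceiling is placed, the average score works out to roughly 66–70% of what is realistically attainable.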
Between 1999 and 2004, as in earlier intervals, scores rose more among countries with low initial scores than among those with higher initial scores. Among countries whose 1999 scores were in the lowest quartile, scores rose, on average, by 13 points, from 30 to 43 (not shown). In contrast, scores did not rise at all among countries in the highest quartile; in fact, they fell by about four points, from 60 to 56. The two middle groups showed intermediate results: Scores in the second highest quartile fell by three points, from 50 to 47, and those in the third highest quartile rose by two points, from 40 to 42.
Sub-Saharan Africa and Latin America and the Caribbean were overrepresented in the lowest two quartiles, whereas Asia was overrepresented in the two highest groups. However, as in previous cycles, all regions were represented in every quartile, testifying to a persistent diversity among countries within each large region.
Scores for Program Features
The correspondence between the scores for the 30 individual program features in 1999 and those in 2004 is remarkably close, given that they were measured in independent study cycles conducted five years apart with largely different respondents (Figure 3). The same is true within each region as well (not shown). These findings provide reassurance regarding the reliability of the study methodology and greater confidence in the results.
The ratings for most of the program features cluster at about 40–60% of the maximum score, and there are plausible explanations for the exceptions. For example, the use of incentives and disincentives, which fell out of favor many years ago, was rated lowest of all. Access to male sterilization was rated very low, whereas access to both the pill and the condom was rated much higher.
The similarity of patterns in the two most recent cycles of the study also points to a general stability of the programs. Average scores changed very little, presumably reflecting continuity in the fundamental character of programs overall. The few scores that did change noticeably concern program outreach: Scores for community-based distribution, social marketing and postpartum programs all increased.
The four broad components of the scores showed systematic differences in effort that held true in every region (Figure 4). Scores for policies were consistently higher than those for services, which reflects the relative ease of issuing favorable policies compared with the difficulties of implementing them. However, policy strength differed considerably among regions; it was high in Asia, and low in francophone Sub-Saharan Africa. It was also low in Latin America and the Caribbean, where the women's health rationale has generally eclipsed a narrower family planning rationale.
An especially important component of family planning programs is the ability to provide access to a variety of contraceptive methods, and this, too, varied considerably by region (Table 1). Although nearly every region provided a high level of access to the pill and condom, and a moderate level of access to the injectable, access to the IUD and female sterilization varied sharply. The ratings for IUD access were moderate, whereas those for female sterilization were lower. Access to male sterilization was uniformly low, except in Asia. These findings are generally consistent with regional preferences: The IUD is favored in the Middle East and North Africa and in the Central Asian republics, but its use is negligible in Sub-Saharan countries. In Asia and in Latin America and the Caribbean, some countries have high levels of IUD use, whereas others have very low levels.26
Asia ranked first in average access scores across the six methods, with a mean score of 60; Latin America and the Caribbean, the Middle East and North Africa, and the Central Asian republics clustered at 52–54. The lowest ratings were for anglophone Sub-Saharan Africa (49) and francophone Africa (46). The overall mean was 51.
Supply ratings (not shown) were lower than access ratings for the condom (70 vs. 76), the pill (61 vs. 68) and the injectable (52 vs. 56); this was true not only for the sample as a whole, but for every region for the condom and nearly every region for the pill and injectable. In contrast, there was essentially no difference between access and supply for the IUD and for male and female sterilization, presumably because continuous monthly supply is unnecessary or less critical than for the pill, injectable and condom.
New Assessments of the Programs
As noted earlier, we added scales on four topics to the 2004 questionnaire to explore the changing environment within which the national programs operate.
•Justifications. In general, ratings for reduction of population growth were considerably lower than those for all other justifications, especially in the Central Asian republics, Latin America and the Caribbean, and francophone Sub-Saharan Africa (Table 2). Interestingly, enhancement of economic development rated rather well, yielding scores similar to those for the reduction of unmet need or the prevention of nonmarital childbearing by adolescents. Although reducing nonmarital adolescent childbearing was an important justification in some regions, it scored very low in the Middle East and North Africa. The highest ratings (more than 80) were given to improving women's health, improving child health and avoiding unwanted births, a finding consistent with the post-Cairo perspective.
Even so, those three justifications received consistently less emphasis in Asia, the Middle East and North Africa, and anglophone Sub-Saharan Africa than in the other three regions, where scores generally reached 85 or higher. Other justifications that scored particularly low in specific regions include reducing population growth (Central Asia and Latin America and the Caribbean) and enhancing economic development (Latin America and the Caribbean).
•Special populations. To what extent do national programs give particular emphasis to special populations? The overall averages for the five populations were similar, in each case yielding a score of about half of the maximum. However, there were sharp differences among the regions, even more so than for program justifications.
In anglophone and francophone Sub-Saharan Africa, most scores were below the global average. They were especially low for poor women, and not much higher for rural women, perhaps because nearly all people in these regions are in these categories and thus cannot be regarded as special. The low scores may reflect the inability of the rather weak national programs in Sub-Saharan Africa to differentiate services by subgroups. Unmarried youth received far less emphasis in the Middle East and North Africa than elsewhere, probably because the prevalence of premarital sex is low; hence, there is little perceived need for youth services. That region also placed the least emphasis on providing counseling and contraceptive services for women who have recently given birth or (especially) had an abortion. At the other extreme, the Central Asian republics had the highest ratings for four of the five populations, especially women who have recently given birth or had an abortion.
•Major influences. On average, changes in donor funding were judged to have had a negative impact on programs in four of the six regions, and the impact was only weakly positive in the other two regions. Changes in domestic government funding had had negative effects in anglophone Sub-Saharan Africa and weak positive effects elsewhere. Decentralization was judged to have been a positive influence in all regions, but the net effect was small.
The integration of family planning with other health services or into a broader reproductive health context yielded the highest average ratings (55 and 59, respectively), suggesting that these trends are seen, on balance, as helpful. The average rating for the effects of HIV/AIDS programs, though positive, was notably lower. The net effects of HIV/AIDS programs were seen as only slightly positive in anglophone Sub-Saharan Africa, but more positive in the francophone region; in fact, anglophone ratings were far below the francophone ratings for all six influences. The influence of HIV/AIDS programs was rated as especially positive in the Central Asian republics; this region also gave favorable ratings to the influence of efforts to incorporate and integrate family planning into a broader context.
The net ratings conceal the distribution of scores along the –5 to +5 scale for major influences. Overall, 53% of countries received negative ratings for changes in donor funding (not shown), and 26% recorded negative ratings for changes in domestic funding. (Negative ratings reflected lost funds, not damaging effects of the funds received.) In contrast, hardly any countries received negative ratings for the other influences, though scores ranged widely.
•Overall quality. The regional ratings for overall quality varied around the global average of 52. Average ratings were highest for countries in Central Asia (57) and Asia (56), and slightly lower for those in the Middle East and North Africa (53). Services in francophone Sub-Saharan Africa (48) were rated below those in anglophone Sub-Saharan Africa (53), a finding consistent with the higher effort score for the anglophone countries. Quality of services was also low in Latin America and the Caribbean (49); the reasons are unclear, as ratings for that region were not particularly low on most other criteria. Perhaps perceived standards are simply higher there.
Ratings for individual countries varied widely, even within regions (not shown). In Asia, for example, Thailand, Malaysia and Vietnam received high ratings (73–76), whereas Myanmar and India received very low scores (25 and 39, respectively). Within Latin America and the Caribbean, ratings were high for Chile (79) and for Costa Rica, Jamaica and Mexico (65–68), whereas Haiti, Puerto Rico, Uruguay and Venezuela received low scores (29–33).
The main results of this study are welcome but somewhat unexpected. The continued rise in total effort scores for national family planning programs in developing countries is surprising in view of the reports from the field stressing diminished attention to family planning by governments in much of Africa and elsewhere, as well as the declining emphasis placed on family planning by some donor organizations. Moreover, the HIV epidemic is believed to have diverted attention from contraceptive services, particularly in Africa, even though scaling back these services may actually worsen the epidemic. A more general but persistent theme is the changed ideological climate in the post-Cairo period, which has given new emphasis to reproductive health topics other than family planning. Although this shift has almost certainly modified donors' funding allocations, the extent of these changes within individual countries, in either policy intentions or program practices, is difficult to know.
The first five rounds of the study (1972–1999) showed program effort to be marching forward, in general but also in detail. The countries with scores in the highest two quartiles had above average scores on every measure, thus showing greater effort across the board, rather than on just a few items. The weaker countries raised their scores differentially over time, so that their scores increasingly resembled those of the stronger countries on measures such as adequacy of administration, use of mass media and the availability of female sterilization.18 Program effort was consistently related to higher levels of contraceptive use and fertility change, independent of the effects of better socioeconomic settings. Differences among the score components made conceptual sense; for example, African countries improved more on policy items than they did on measures of method availability, reflecting a time lag from policy formulation to implementation. In fact, the study's most discouraging outcome was that the actual availability of multiple contraceptive methods was the weakest of the four components.
The data reported here from the 2004 cycle of the study show that the increase in family planning effort has not abated: The slope in Figure 1 for the 1999–2004 period is about the same as the slope for earlier periods. Several patterns within the data further support the conclusion that family planning effort has continued to rise. For example, the nearly identical score profiles observed in 1999 and 2004 underscore the reliability of the study's measurements. Also, the upward trend in the average total score from cycle to cycle remains consistent and smooth, not erratic; groups of countries that scored high in 1972 and 1982 continued to do so in 2004, and the scores of countries in the lowest three quartiles have increased sharply. These results have occurred across cycles that were conducted independently, with long intervals between. An unreliable methodology would be unlikely to yield these patterns, and it seems reasonable to trust the 2004 results rather than to discount them simply because they do not match our expectations.
Moreover, many of the findings for individual regions or countries are consistent with policy changes and developments in those areas. For example, the decline in effort score in Indonesia can be attributed to changes introduced by the recent decentralization of health and family planning functions in that country. The rise in scores in the Central Asian republics may reflect that these nations, which inherited from the former USSR comprehensive health systems that cover maternal services for most women, have been trying to add provision of contraceptives to their services so as to reduce high abortion rates.27–29
Other results were more surprising. The average rating for the effects of HIV/AIDS programs on family planning programs was moderately positive, whereas we had expected negative scores that would have squared with observer reports, especially in Sub-Saharan Africa. The net effects of HIV/AIDS programs are seen as only slightly positive in anglophone Sub-Saharan Africa, perhaps because the HIV pandemic has hit the anglophone countries in eastern and southern Africa particularly hard,25 resulting in health services that are fragile and react sensitively to changed efforts. On the other hand, the influence of HIV/AIDS programs was rated as especially positive in the Central Asian republics. The ratings may depend upon the prominence, or even the existence, of formal HIV/AIDS programs.
A notable finding is the sharp difference between many ratings from anglophone countries and those from francophone countries. Anglophone ratings are well below francophone ones for five of seven program justifications, and are far below them for all six influences. As noted above, the heavy burden of the HIV pandemic has probably been more important in anglophone countries than in francophone countries.
The global rise in effort scores is consistent with the continuing increase in contraceptive use in the developing world;30,31 that, too, needs explaining, given the difficulties outlined above. Possible explanations for rising contraceptive use include not only increases in program effort and commercial activities, but also the currents of modernization (e.g., rising education levels, female employment and urbanization) that drive down the demand for children. Increases in the proportion of couples using contraceptives are even more remarkable considering the rise in population sizes, which has forced growth in the capacities of the public and private service sectors to handle larger numbers of clientele, most of whom are using resupply methods.
It remains unclear why increases in program effort and contraceptive practice have persisted despite fragile contraceptive supply lines and some losses in funding, as well as competition with HIV programs; the possibility of measurement error cannot be ruled out. Nonetheless, the data from the 2004 cycle offer useful guidance to researchers, donors and policymakers. Respondents clearly perceive funding losses to be a major problem, and decentralization measures are viewed negatively in some countries. Postpartum services and especially postabortion services receive much less emphasis than they should; unmarried youth receive too little attention as well. On a somewhat more positive note, the justifications for the current programs rest strongly and consistently on the combination of preventing unwanted births, enhancing maternal health and improving child health. The quality of services, however, is judged to be mediocre in every region, and overall program ratings, on average, are only about half of what they might be. Thus, the results of the study point to the need for greater funding, better implementation, improved overall quality and, especially, full access to multiple contraceptive methods.