MATHEMATICS AND COMPUTER SCIENCE
Viewpoint: Yes, statistical sampling offers a more accurate method—at much lower cost—for determining population than does physical enumeration.
Viewpoint: No, the Supreme Court ruled that statistical sampling to calculate the population for apportionment violates the Census Act of 1976.
Statistics—the mathematical science of analyzing numerical information—is vital to the practice of all the empirical sciences. No modern science tries to account for the complexity of nature without using statistical methods, which typically provide investigators with a numerical outcome along with an analysis of the margin of accuracy of that outcome. With the help of computers, statistical techniques for collecting and analyzing large, complicated data sets have become very sophisticated and have proved to be reliable and effective for scientific researchers, inventors, and engineers working on problems in such diverse fields as economics, physics, and pharmaceuticals.
The popular perception of statistics, however, starkly contrasts with its valued role in the sciences. Statistics has often been dismissed as an unreliable and sinister ("lies, damned lies, and statistics") strategy for manipulating data to support a pre-determined point of view. While statistical techniques are quietly and successfully being used in many areas of modern life, one that most people are familiar (and perhaps uncomfortable) with is polling. Because polls—which survey selected sample groups of people and then extrapolate the responses to a larger population—are often done on behalf of political causes or candidates, their interpretation can be controversial. Bitter arguments about the outcome of a poll can taint the understanding of the statistical methods that made the poll possible.
The census has become a particularly contentious area of debate over the use of statistics. The census project seems deceptively simple: The census aims to count the population of the United States. But the population is large, diverse, moving, partially hidden, and changing every moment. A physical "count" of the population could never be done, and even if it could it would only be accurate for a few seconds. Any effort to count the population will contain errors of identification and omission. The challenge that faces those who design and administer the census, then, is to proceed in a manner that will minimize those errors. But the census poses much more than a scientific problem. The census is a political, economic, and social project, and those who are most interested in its outcome often have little regard for technical issues surrounding errors and estimates.
As the population of the United States has grown and become more diverse, the census has become more difficult to administer. It is well known that the standard method for performing the census, which relies primarily upon citizens to report information about themselves and their households and secondarily upon visits by census-takers to the homes of those who fail to report, undercounts the population by a significant amount. The most obvious way to correct this problem seems to be to make use of statistical sampling methods, which could account for the variety within the population as the count is adjusted upward. The undercount seems to be distributed unevenly throughout the population, tending to come primarily from certain groups that are harder to contact and locate, such as renters, immigrants, the homeless, and children. These groups disproportionately tend to support Democrats rather than Republicans, thus leading to the primary political schism over the possibility of using statistical techniques to refine the census. Politicians view the undercounted groups either as potential supporters or potential opponents, and argue accordingly about how to count them.
Opponents of the use of statistical sampling to improve the census attack on several fronts and take advantage of public skepticism about the validity of statistical methods. They argue that the Constitution quite literally calls for a physical enumeration (a physical counting) of the population to be performed during each decade's census, and use this as a foundation to block any effort to incorporate statistical modifications. Various legal issues surrounding Constitutional interpretation have been argued all the way to the United States Supreme Court. Incorporated in these legal challenges are criticisms of the statistical methods that would be used to improve the accuracy of the census. While members of the National Academy of Sciences as well as other mathematical and scientific experts have generally endorsed the superior accuracy of statistical sampling over enumeration, laymen remain somewhat perplexed and skeptical. One reason for that concern may be a form of circularity in the argument of sampling's proponents; that is, in order to argue that sampling gives a more accurate count, they use evidence collected by sampling. What sampling's advocates call accuracy, its critics call bias. Courts are notoriously poor places to settle technical disputes, and the debate over census methods is no exception. Because the census affects the creation of political districts and the apportionment of financial resources, however, it is inevitable that any change in its methods will be assessed on political as well as scientific grounds.
—LOREN BUTLER FEFFER
The debate over the extent of the use of statistical sampling in compiling census data has essentially three key aspects. Although there are a number of interrelated concerns, all the issues reduce to Constitutional arguments, political arguments, or scientific questions regarding the ability of statistical methods to render a more accurate count. Although the legal and political arguments can be tortuous and partisan, the scientific and mathematical considerations strongly favor the use of statistical sampling over physical enumeration.
Although there are exceptions, both proponents and opponents of the use of statistical sampling in the census generally agree that two independent methodologies can be used to validate a particular statistical method known as integrated coverage measurement. Census officials, mathematicians, and statistical modeling experts contend that this approach enhances the accuracy of census data by reducing the differential undercount (a greater undercount in selected groups when compared to the count of the general population).
The first method involves determining the extent of a census undercount, which can be reasonably estimated by the use of existing demographic data reflecting birth, death, immigration, and emigration records already maintained by governmental agencies. A second method, conducted following the census in the form of post-enumeration surveys, also enables mathematicians and statisticians to draw conclusions regarding the consistency of data collected during the census itself. Both validation methodologies currently support the argument that there exists a chronic undercount of almost all groups, but a more significant, and therefore differential undercount, of minorities, children, renters, and other identifiable groups.
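The demographic-analysis check described above amounts to simple bookkeeping: build an independent population estimate from birth, death, immigration, and emigration records, then compare it with the census count. A minimal sketch in Python, with all figures hypothetical:

```python
def demographic_estimate(base_pop, births, deaths, immigration, emigration):
    """Independent population benchmark built from vital records and migration data."""
    return base_pop + births - deaths + immigration - emigration

def net_undercount_rate(census_count, independent_estimate):
    """Fraction of the independently estimated population missed by the census."""
    return (independent_estimate - census_count) / independent_estimate

# All figures hypothetical, in millions
estimate = demographic_estimate(base_pop=248.7, births=39.9, deaths=22.8,
                                immigration=9.1, emigration=2.2)
census_count = 267.7  # hypothetical enumerated total
print(f"independent estimate: {estimate:.1f}M")
print(f"net undercount: {net_undercount_rate(census_count, estimate):.1%}")
```

With these invented inputs, the benchmark comes to 272.7 million against a count of 267.7 million, a net undercount of roughly 1.8%, which is the kind of discrepancy the demographic check is designed to reveal.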
Another powerful argument favoring the use of statistical sampling for all census measurements is that modifications to the present system of enumeration (e.g., an emphasis on advertising designed to reach target undercount groups such as Spanish-speaking residents) failed, as measured by demographic analysis and post-enumeration surveys, to increase census accuracy. In fact, for the first time in census history, a decennial census (the 1990 census) proved to be less accurate than prior decennial censuses. The accuracy of the 2000 census awaits full assessment.
In response to the disappointing performance of the 1990 census methodologies, Congress passed the Decennial Census Improvement Act in 1991, providing for a study of census methodologies by the National Academy of Sciences. The Academy was specifically asked to render an opinion on the scientific and mathematical appropriateness of using sampling methods to compile and analyze census data. After extensive hearings and investigations, two of the three expert panels convened by the Academy concluded that significant reductions in the undercount (i.e., improved and more accurate census data) could not take place without the use of statistical sampling. Moreover, all three Academy panels concluded that census data would be made more accurate by an integrated coverage measurement procedure that relied on statistical sampling. One panel of the National Academy of Sciences specifically concluded that it was "fruitless to continue trying to count every last person with traditional census methods of physical enumeration."
Data examined by the panels of the National Academy of Sciences and submitted to the United States Supreme Court also indicated that there was a probable undercount of 1.8% of the general population. The differential nature of the undercount is confirmed by the fact that 5.7% of self-described "Black" or "African-American" residents were undercounted.
The debate over the use of statistical sampling is ironic in that the arguments against it ignore the record of increasing accuracy associated with statistical sampling. The current Census Act was enacted in 1954, and within three years Congress amended the act to allow limited use of statistical sampling except for the "determination of population for apportionment." In 1964, Congress again revised the Census Act to allow data collection via questionnaire in place of a personal visit by a census-taker, as long as the questionnaire was delivered and returned via the United States Postal Service. The use of statistical sampling was further expanded by a 1976 revision to the Census Act that allowed its use in gathering population and census data—but that did not specifically authorize its use for issues pertaining to apportionment. Constitutional literalists and the majority of the Supreme Court rested their arguments against the broad use of statistical sampling on this omission: the lack of specific authorization to use sampling in compiling data for apportionment.
For the year 2000 census, government officials estimated that 67% of households would voluntarily return census forms. Based upon the self-reported data provided in the forms—correlated to past data, some itself derived from statistical sampling—census officials planned to divide the population into groups with homogeneous (similar) characteristics involving economic status and residence. Instead of attempting a visit to all non-responding households by census-takers, however, census officials then planned to selectively visit households that would be statistically representative of the non-respondents, until a total of 90% of the households in the target group had been surveyed by questionnaire or interview. Integrated coverage measurement, encompassing the two validation methodologies previously outlined, would then be used to make a final adjustment to the undercount in target groups, or "strata," based upon such demographics as location, race, and ethnicity. As an additional check, randomized physical counts would be used to measure the accuracy of the statistical corrections.
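The follow-up plan described above can be sketched as a weighting exercise: within one stratum, visit only a random sample of nonresponding households until 90% of all households are covered, then let each visited household stand for its share of the unvisited ones. The sketch below is illustrative only; the function name and figures are invented, not the Bureau's actual procedure:

```python
import random

def estimate_stratum(responding_sizes, nonresponding_sizes, target_coverage=0.9, seed=1):
    """Estimate total persons in one stratum of households: full data from mail
    respondents, plus a random sample of nonrespondents visited until the target
    share of households is covered, weighted up to stand for all nonrespondents."""
    rng = random.Random(seed)
    n_total = len(responding_sizes) + len(nonresponding_sizes)
    n_visits = max(0, round(target_coverage * n_total) - len(responding_sizes))
    visited = rng.sample(nonresponding_sizes, min(n_visits, len(nonresponding_sizes)))
    if not visited:
        return float(sum(responding_sizes))
    weight = len(nonresponding_sizes) / len(visited)  # each visit stands for `weight` households
    return sum(responding_sizes) + weight * sum(visited)

# Hypothetical stratum: 67 mail responses, 33 nonresponding households
mail = [2] * 67      # household sizes reported by mail
missing = [3] * 33   # true sizes of the nonresponding households
print(estimate_stratum(mail, missing))  # ≈ 233, the true stratum total
```

Here 23 of the 33 nonrespondents are visited (bringing coverage to 90 of 100 households), and each visited household is weighted by 33/23 to represent the rest.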
The political argument over this proposed procedure—the increasing reliance on statistical sampling—proved polarizing along traditional party lines. Democrats, representing a political party that historically benefits from greater minority representation and participation in government, argued that the undercount denies accurate and effective representation of undercounted groups including minority voters, children, and those who do not have a regular or stable residence. In addition to arguments based upon the potential of statistical sampling to at least partially correct these problems, the Democrats voiced a strong social justice claim that rested upon the essential requirement that in a democracy or a republic the representatives shall be fairly drawn from the population, so that the voice of the government is indeed the voice of the governed.
Republicans, a party that historically performs poorly with regard to garnering minority votes, have typically argued for a strict and literal Constitutional interpretation that would continue a reliance on a decennial census (the major United States census, Constitutionally mandated to take place every 10 years) via physical enumeration. Republican opponents of sampling particularly resist attempts to integrate enhanced sampling techniques specifically targeted at reducing the undercount among selected groups. Because mathematical arguments and performance records related to prior census collections favor statistical sampling, a Constitutional argument relying, as statistical sampling advocates will argue, on an outdated and narrow interpretation of the Constitution is clearly the strongest case physical enumeration advocates bring to the debate. On its face, and by historical precedent, the Constitution of the United States calls for an actual physical enumeration—a physical count of residents.
In 1998, on appeal of the landmark case the Department of Commerce et al. v. the United States House of Representatives et al., a majority of the United States Supreme Court—split along well-established political and ideological lines—affirmed a lower court's interpretation of both the Constitution and federal statutes (e.g., the Census Act). According to the Supreme Court, the Constitution's census provisions authorize the appropriate Congressionally appointed agencies to "conduct an actual enumeration of the American public every 10 years (i.e. the decennial census) as a basis for apportioning Congressional representation among the States." The Court's ruling majority further cited existing provisions that narrowly authorized the limited use of statistical sampling procedures "except for the determination of population for purposes of congressional apportionment." The ruling quashed the Department of Commerce's plans to use statistical sampling in the 2000 decennial census as part of an attempt to correct chronic and well-documented undercounts.
Significant, however, to the continuing argument regarding physical enumeration versus statistical sampling, the Court's ruling inherently recognized that there would be a difference in the outcome of the two counts—and, more importantly, that such a difference would be significant enough to alter the apportionment of Congressional representatives and state legislators. In its ruling the Court held that there was a recognized likelihood that voters in areas relying on physical enumeration would have their representation "diluted vis-à-vis residents of [areas] with larger undercount rates." Justice Sandra Day O'Connor, writing the majority opinion, recognized that there was a traditional undercount of minorities, children, and other groups resulting from a census by enumeration. Although the Court's ruling majority recognized the power, utility, and validity of statistical analysis, it chose to rule on the law rather than on the acknowledged merits of statistical analysis.
In dissenting opinions (opinions written by Supreme Court justices who disagree with the majority ruling), Justices John Paul Stevens, David Souter, and Ruth Bader Ginsburg advanced the argument that the use of statistical sampling was authorized by a broader interpretation of prior authorizations allowing the use of statistical methods to compile and analyze census data. In particular, Justice Stevens argued that the planned safeguards and validation checks that are a part of the proposed integrated coverage measurement protocol were simply extensions of procedures previously authorized. Moreover, the use of integrated coverage measurement as a supplement to traditional enumeration-based procedures was "demanded" by the accuracy required by the intent and scope of the Constitution and existing census legislation.
Justice Stevens also pointed out that census officials had long used various statistical estimation techniques—including the imputation of data—to adjust counts gathered by enumeration. Imputation allows census workers to estimate missing data based upon prior data collected under similar circumstances (i.e., when information about a household is missing, it is filled in from a previously processed respondent with similar characteristics who lives in close geographic proximity).
Although opponents decry that statistical sampling favors some groups over others (e.g., increases the number of minorities and children in the overall count), it is exactly this aspect of statistical sampling that reduces the differential error of a differential undercount. Because minorities and children are undercounted by enumeration methods, statistically based corrections to enumeration data—if well designed to enhance census accuracy—should differentially correct these selected population counts. Non-differential analysis that would correct only for a general undercount (i.e., a 10% correction to all data) would, at a minimum, simply preserve under-representation based upon undercounts.
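A small numerical illustration, with hypothetical populations and undercount rates, shows why a uniform correction preserves under-representation while a differential correction removes it:

```python
# True (hypothetical) populations and group-specific undercount rates
true_pop   = {"group_a": 900_000, "group_b": 100_000}
undercount = {"group_a": 0.01,    "group_b": 0.06}   # group_b is missed more often

counted = {g: true_pop[g] * (1 - undercount[g]) for g in true_pop}

def share(pop, g):
    """Group g's fraction of the total population."""
    return pop[g] / sum(pop.values())

# Uniform correction: inflate every group by the same factor
uniform = {g: counted[g] * 1.10 for g in counted}

# Differential correction: inflate each group by its own estimated undercount
differential = {g: counted[g] / (1 - undercount[g]) for g in counted}

print(f"true share of group_b:        {share(true_pop, 'group_b'):.4f}")
print(f"share after uniform fix:      {share(uniform, 'group_b'):.4f}")       # unchanged, still too low
print(f"share after differential fix: {share(differential, 'group_b'):.4f}")  # restored
```

Multiplying every group by the same factor leaves each group's share of the total exactly where the undercount put it; only group-specific corrections restore the true proportions.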
With regard to error and error analysis, statistical sampling prevails. Statistical sampling data can be cross-checked with existing demographic data to allow a more accurate estimate of both gross omissions (failures to count) and erroneous enumerations wherein individuals are assigned to improper addresses. For example, the fraction of a target group identified in follow-up samplings is compared to initial samples in an effort to estimate the existence and extent of an undercount. Mathematically, the dual system estimate (DSE) can be explained as follows: C_n represents the number in the census count; E_n the number of erroneous enumerations; P_n the number of selected-group individuals as determined in a post-enumeration survey; and M_n the number in the post-enumeration survey accounted for in the enumeration count. The DSE is given by the following equation: DSE_n = (C_n - E_n) x (P_n / M_n).
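The dual system estimate translates directly into code. The figures in the example below are hypothetical:

```python
def dual_system_estimate(census_count, erroneous, pes_count, matched):
    """DSE_n = (C_n - E_n) * (P_n / M_n)

    census_count: persons counted in the census for the group (C_n)
    erroneous:    erroneous enumerations to subtract (E_n)
    pes_count:    group members found by the post-enumeration survey (P_n)
    matched:      PES group members also present in the census count (M_n)
    """
    return (census_count - erroneous) * (pes_count / matched)

# Hypothetical stratum: the post-enumeration survey found 500 people,
# of whom only 450 matched census records, implying the census missed
# roughly 10% of this group.
est = dual_system_estimate(census_count=10_000, erroneous=200, pes_count=500, matched=450)
print(round(est))  # 10889
```

The ratio P_n / M_n scales the cleaned census count up by the fraction of the group that the census demonstrably missed in the matched survey.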
Although it can be fairly argued that statistical sampling is not perfect—and is also subject to error based upon bias contained in designing sampling collection protocols and formulae applications—the mathematical errors and bias in statistical analysis are more easily identifiable, quantifiable, and correctable, in clear contrast to errors in the enumeration process. Bias or prejudice in enumeration methodology, which is often based upon the difficulties of interviewing certain groups, or upon the reluctance of census workers to carry out tasks in certain geographic areas, is not easily quantifiable because, in part, it is difficult to obtain data on such attitudes and fears. Moreover, any culturally pervasive bias or prejudice that exists among census workers must invariably increase the magnitude of errors in enumeration-based data.
Issues related to tabulation or potential computer programming errors affect both sides equally. It is just as likely that an error could exist in enumeration data-handling software as in programs designed to handle statistical sampling data. Moreover, there are reliable mathematical methods to estimate errors in statistical analysis. Recognizing the errors inherent in enumeration-based data, opponents of statistical sampling often argue as a last resort that neither methodology is perfect and that, therefore, the increased accuracy of statistical sampling is not worth the effort to change. This argument, however, ignores the fact that sampling is far less expensive than enumeration.
At a minimum, a strong argument for statistical sampling rests upon an enhanced ability to determine and control uncertainty and error. The use of sampling data does not eliminate error, but it does control the type of errors encountered. Statistical sampling offers enhanced sampling error and bias recognition. Although sampling errors usually average out—especially in large randomized census-sized samples—bias errors, which are more difficult to determine in enumeration-based methods, are usually directional errors that do not average out and that differentially degrade census accuracy.
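A quick simulation illustrates the distinction drawn above: zero-mean sampling noise averages toward zero as the number of sampled units grows, while a constant directional bias does not. All parameters are hypothetical:

```python
import random

def mean_error(n_tracts, noise_sd, bias, seed=42):
    """Average per-tract count error when each tract's error is zero-mean
    Gaussian sampling noise plus a constant directional bias. The noise
    component shrinks as n_tracts grows; the bias component persists."""
    rng = random.Random(seed)
    errors = [rng.gauss(0, noise_sd) + bias for _ in range(n_tracts)]
    return sum(errors) / len(errors)

for n in (10, 1_000, 100_000):
    print(n, round(mean_error(n, noise_sd=50.0, bias=-5.0), 2))
```

As the sample grows, the average error settles near the bias of -5 rather than near zero: no amount of additional data washes out a directional error, which is why bias recognition matters more than raw sample size.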
—K. LEE LERNER
The U.S. Constitution dictates that a census be taken every 10 years specifically for the purpose of apportioning representatives, and further specifies "actual Enumeration." Congress has dictated the particulars of exactly how the enumeration is to be carried out through the Census Act, which has been amended from time to time over the years (1976 being the latest version). The Secretary of Commerce oversees the "actual Enumeration" of the population. The Census Bureau was formed within the Commerce Department to conduct the decennial census. The law states that the Bureau "shall, in the year 1980 and every 10 years thereafter, take a decennial census of population as of the first day of April of such year." It further stipulates the timetable for completing the census—within nine months.
The subject of statistical sampling evolved from the problem of the undercount rate, which has plagued the census since 1940. A reasonable question is: How can an "undercount" be determined? If persons were not "counted" in the census, what evidence is there that they exist? Since 1940, the Census Bureau has been utilizing such methods as demographic analysis to produce an independent estimate of the population using birth, death, immigration and emigration records, and also a "Post-Enumeration Survey," which uses sampling to make an estimate of population. The numbers that result are compared to the actual census. Identifiable groups, which include some minorities, children, and renters, have had noticeably higher undercount rates in the census than the general population. Since the census is the basis for determining representation in Congress, some states may have been under-represented as the result of undercount.
Concerned about the discrepancy in representation, Congress passed the Decennial Census Improvement Act of 1991. The Secretary of Commerce was instructed to contract the National Academy of Sciences to study the problem and arrive at a solution. Hence comes the statistical sampling controversy. The Academy was instructed in the Improvement Act to consider "the appropriateness of using sampling methods, in combination with basic data-collection techniques or otherwise, in the acquisition or refinement of population data."
Measuring a Changing Nation: Modern Methods for the 2000 Census was published by the National Academy Press in 1999, detailing the work of the Panel on Alternative Census Methodologies, Committee on National Statistics, Commission on Behavioral and Social Sciences and Education, and the National Research Council. On the issue of sampling for nonresponse follow-up, the panel "concluded that a properly designed and well-executed sampling plan for field follow-up of census mail nonrespondents will save $100 million (assuming an overall sampling rate of 75 percent)."
Sampling for nonresponse follow-up was predicted to reduce the Census Bureau's total workload, which would permit improvements in the control and management of field operations, and which would allow more complete follow-up of difficult cases, leading to an increase in the quality of census data collected by enumerators. The combined committees added, "of course, sampling for nonresponse follow-up will add sampling variability to census counts."
In spite of the potential for added errors from sampling, the Census Bureau responded to the findings and decided to use two sampling procedures (explained below) to supplement information collected by traditional procedures in the 2000 Census. Critics of the statistical sampling procedures brought their case to the courts, and in January 1999, the Supreme Court ruled that such procedures were in violation of the Census Act of 1976. As the constitutionality of statistical sampling was not brought up in the lower courts, that issue was not decided.
The Census Bureau plan included a Non-response Followup (NRFU) program. NRFU is described in Measuring a Changing Nation as a field operation conducted by census enumerators in order to obtain interview data from people who failed to mail back their questionnaires. Data obtained from some, but not all, of these nonresponse follow-up cases are called "sampling of nonresponse follow-up" (SNRFU). It was the possible use of this data that was challenged in the courts.
Details of the Census Bureau's plans are included in the "Opinion of the Court" published by the Supreme Court. In one part of the program, the Census Bureau planned to divide the population into census tracts of approximately 4,000 people that have what they describe as "homogeneous population characteristics, economic status, and living conditions." From that list, the enumerators would visit a randomly selected sample of nonresponding households as statistically representative of the whole group. This, in effect, would create what the Chicago Tribune described as "virtual people" (October 31, 2001, editorial titled "Census and Nonsense").
The second statistical sampling procedure that was challenged was one that would have been put in place after the first statistical sampling was completed. It was called Integrated Coverage Measurement (ICM). As described in the Supreme Court's Opinion, ICM uses the statistical technique called Dual System Estimation (DSE), a system that requires the Census Bureau to classify each of the country's seven million blocks (called strata) according to defined characteristics. It is a complex sampling process.
Using the 1990 census information, the characteristics used in the classification of strata include: state, racial and ethnic composition, and the proportion of homeowners to renters. The Census Bureau planned to select 25,000 blocks at random for an estimated 750,000 housing units. Enumerators would canvass each of the 750,000 units. Where discrepancies existed between information taken before the ICM sampling and the ICM information, repeat interviews were to be conducted to resolve the differences. The information from the ICM would be used to assign each person to a poststratum, which is described in the National Academy of Sciences report as a collection of individuals in the census context who share some characteristics as to race, age, sex, region, and owner/renter status. The information was to be treated separately in estimation as part of the statistical sampling.
The Supreme Court's Opinion describes what happened next. A bill was passed in 1997 allowing the Census Bureau to move forward with their plans for the 2000 Census, but requiring the Bureau to explain any statistical methodologies that might be used. In response to the directive, the Commerce Department issued the Census 2000 Report. Congress followed with an Appropriations Act providing for the plan and also making it possible for anyone aggrieved by the Bureau's plan to bring legal action. This provision was the basis for the two suits and the ultimate denial of the use of statistical sampling by the Supreme Court. Since the denial was based on this Act, the issue of constitutionality was never decided.
The Census Bureau may have been prevented from using statistical sampling to determine population count for congressional apportionment, but it was able to use sampling to collect additional information, as it had done in previous censuses. About one in six households received a long-form questionnaire to obtain additional information. The form includes 53 questions about the person's life and lifestyle that many recipients considered too intrusive. In a May 2000 report sponsored by The Russell Sage Foundation and others, titled America's Experience with Census 2000, A Preliminary Report, the authors found that privacy concerns had a negative impact on cooperation among the households who received the long form. Questions concerning income and physical and mental disabilities ranked highest among those considered too personal for the census to ask.
In hearings in the House of Representatives prior to passing the Decennial Census Improvement Act in 1991, Congressmen expressed concerns that statistical sampling and subsequent adjustment would discourage voluntary public response through mail-back forms. The mail-back form is the most accurate, effective, and efficient source of census data, according to the General Accounting Office. The Congress also expressed concern that statistical sampling may discourage state and local participation in that it would remove their incentive for obtaining a full and complete count.
By law, within nine months of the census date, the apportionment population counts for each state are to be delivered to the President. That would be December 31, 2000, for the 2000 Census. The President then has the responsibility of delivering the information to the Clerk of the House, who, in turn, must inform each state governor of the number of representatives to which each state is entitled. Although the Supreme Court decision fixed the enumeration of the population for apportionment to actual count data, what constitutes an actual count remains fuzzy, because the count included a statistical procedure called imputation to "count" people about whom the Census Bureau had incomplete or no direct information.
It was the use of imputation-provided data that accounted for the reduction in net undercount among the groups of people missed in the 1990 census; some 2.4 million people were reinstated using imputation data. To evaluate the success of the census, and to adjust the numbers for non-political purposes if needed, the Census Bureau conducts an Accuracy and Coverage Evaluation (A.C.E.). The A.C.E. is bound to a complex set of procedures and operations much like the census itself. As measured by A.C.E., the net undercount of the population in 2000 was about 3.3 million, or 1.2% of the population. This is an improvement over the 1990 census, which had a 1.6% undercount. So, with imputation, but without statistical sampling, the troublesome undercount of the 1990 census was reduced.
The Acting Census Bureau Director William Barron said in a news conference in October 2001 that the results of the A.C.E. survey did not measure a significant number of erroneous enumerations, and so he declared that no adjustment would be made in the 2000 Census figures for any purpose. He described the net national undercount as "virtually zero in statistical terms." Congressman Dan Miller (R-FL), chairman of the Subcommittee on the Census, applauded the ruling that the census accurately shows what he describes as "real people, living in real neighborhoods and communities in a very real nation."
However, the final chapter on the counting controversies in the 2000 Census may not be written for some time. Although imputation accounted for less than half a percent of the total U.S. population, Utah has been fighting an ongoing court battle on the issue of statistical "imputation." The state of Utah claims that the use of imputation has unfairly given the state of North Carolina a seat in Congress that rightfully belongs to them. This imputation issue and the question of the constitutionality of statistical sampling are likely to need clarification before they are used in the 2010 Census. Until all such issues are resolved, statistical sampling should not be used in the U.S. census.
—M. C. NAGEL
Anderson, Margo. The American Census: A Social History. New Haven, CT: Yale University Press, 1988.
——, and Stephen Fienberg. Who Counts? The Politics of Census-Taking in Contemporary America. New York: Russell Sage Foundation, 1999.
Brown, L. D., M. L. Eaton, D. A. Freedman, et al. "Statistical Controversies in Census 2000." Technical Report 537. Department of Statistics, University of California, Berkeley, October 1998.
Cohen, Michael, et al. Measuring a Changing Nation: Modern Methods for the 2000 Census. Washington, D.C.: National Academy Press, 1999.
Draga, Kenneth. Fixing the Census Until It Breaks. Lansing, MI: Michigan Information Center, 2000.
National Research Council. Preparing for the 2000 Census: Interim Report. Ed. A. White and K. Rust. Report of Panel to Evaluate Alternative Census Methodologies. 1997.
Panel on Census Requirements in the Year 2000 and Beyond. Modernizing the U.S. Census.
Supreme Court Opinion. Department of Commerce et al. v. United States House of Representatives et al. No. 98-404. <http://supct.law.cornell.edu/supct/html/98-404.ZS.html> .
United States Department of Commerce, Bureau of the Census. Census 2000 Operational Plan. 1997.
Wright, T. "Sampling and Census 2000: The Concepts." American Scientist 86, no. 3 (May-June 1998): 495-524.
Dual system estimation (DSE): An estimation methodology that uses two independent attempts to collect information from a household in order to estimate the number of people missed by both attempts.
Enumeration: In dictionaries contemporaneous with the signing of the Constitution, the term "enumeration" refers to an actual or physical counting, not an estimation of a quantity.
Erroneous enumeration: The inclusion of a person in the census because of incorrect information or error, such as being counted twice.
Imputation: A method for filling in information about a person from a previously processed respondent with whom the person shares similar characteristics and who lives in close geographic proximity.
Poststratum: A collection of persons in the census context. These persons share some characteristics such as race, age, sex, region, or owner/renter status. The collection is treated separately in estimation.
Sampling errors: Errors in statistical analysis that result from an improper or unrepresentative sample draw.
Random sampling: A complex mathematical method for generalizing from a small sample without regard for any ordering embedded in the choice.
Bias errors: Errors in statistical analysis that result from bias, application of improper formulae, etc.
Undercount: The number of persons who should be included in the census count but were not for some reason. Persons living in group settings such as dormitories, or without permanent addresses, such as migrant workers, are likely to be missed by the census.