Congress Passes Confidentiality and Data Sharing Legislation

In 1971, the President's Commission on Federal Statistics reported as follows: "Use of the term 'confidential' should always mean that disclosure of data in a manner that would allow public identification of the respondent or would in any way be harmful to him is prohibited" and that "data are immune from legal process." The Commission further recommended that "legislation should be enacted authorizing agencies collecting data for statistical purposes to promise confidentiality as defined above ..."

Since that time, during the Administrations of Presidents Carter, Reagan, Bush I, Clinton, and most recently Bush II, efforts have been undertaken by the Executive Branch to shore up legal protection for the confidentiality of statistical information, as well as to permit some limited sharing of data for statistical purposes.

I am delighted to report that on Friday, November 15, both the House (at 2:50 a.m.) and the Senate (sometime after 8:00 p.m.) passed by unanimous consent the Confidential Information Protection and Statistical Efficiency Act of 2002. "CIPSEA," included as Title V in the E-Government Act of 2002 [H.R. 2458], will provide a uniform set of confidentiality protections and extend these protections to all individually identifiable data collected for statistical purposes under a pledge of confidentiality and will permit the sharing of business data by the Bureau of Economic Analysis, the Bureau of Labor Statistics, and the Bureau of the Census.

Office of Management and Budget Issues Final "Data Quality" Rules

On January 3, 2002, the Office of Management and Budget (OMB) issued final "data quality" rules, effective immediately. The rules were published pursuant to a rider on the FY 2001 Treasury and General Government Appropriations Act (P.L.106-554), which requires OMB to publish guidelines that "provide policy and procedural guidance to Federal agencies for ensuring and maximizing the quality, objectivity, utility, and integrity of information (including statistical information) disseminated by Federal agencies." The rider requires agencies to issue their own implementing guidelines that include "administrative mechanisms allowing affected persons to seek and obtain correction of information maintained and disseminated by the agency."

OMB proposed the data quality rules on June 28, 2001. After receiving a number of comments, OMB published interim final rules on September 28, 2001, and sought additional comments on one provision. The rules published last week are final rules providing additions and refinements to the interim final rule published on September 28.

By April 1, 2002, each agency covered by the Paperwork Reduction Act must publish a notice for comment in the Federal Register providing guidelines for implementing the rider for its agency, along with how the agency will develop the administrative mechanism that allows people to seek and obtain appropriate correction of information maintained and disseminated by the agency. After consideration of public comments, each agency must submit its plan to OMB by July 1, 2002. The plan must be put into place by the start of the next fiscal year, October 1, 2002.

The administrative mechanism shall apply to all information that the agency disseminates, regardless of when the agency first disseminated the information. The agency must institute a pre-dissemination review for data quality for new information disseminated after October 1, 2002.

The specific agency responsibilities are:

  • Issue their own information quality guidelines ensuring and maximizing the quality, objectivity, utility, and integrity of information, including statistical information, disseminated by the agency no later than one year after the date of issuance of the OMB guidelines;

  • Establish administrative mechanisms allowing affected persons to seek and obtain correction of information maintained and disseminated by the agency that does not comply with these OMB guidelines; and

  • Report to the Director of OMB the number and nature of complaints received by the agency regarding agency compliance with these OMB guidelines concerning the quality, objectivity, utility, and integrity of information. The administrative mechanisms are to be flexible, appropriate to the nature and timeliness of the disseminated information, and incorporated into agency information resources management and administrative practices.

    The full OMB guideline report can be found in the Federal Register, Volume 67, Number 2 (Thursday, January 3, 2002).

U.S. Census Bureau's Plans for the Census 2000 Public Use Microdata Sample (PUMS) Files

The U.S. Census Bureau will provide two sets of PUMS files: a 1 percent national characteristics file and 5 percent state files. These files will provide the greatest possible detail, while protecting the confidential nature of the data.

The National Characteristics 1 Percent PUMS File

The national characteristics file will provide the maximum amount of social, economic, and housing information available. The goal of this file is to come as close as possible to the amount of detail that was in the 1990 PUMS files (and, in some cases, to provide more detail). No national minimum population threshold for the identification of variable categories is planned, with the exceptions of race and Hispanic origin.

To maintain the level of detail described above, however, the minimum geographic population threshold must be raised above 100,000 - the Public Use Microdata Area (PUMA) minimum. A new geographical entity is being created–the super-PUMA. Super-PUMAS have a minimum population of 400,000 and are composed of a PUMA or PUMAS delineated on the companion state-level PUMA file. Each state will be identified, and any state with a population of 800,000 or greater can be subdivided into two or more super-PUMAS.
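
The two super-PUMA thresholds above imply a simple upper bound on how many super-PUMAs a state could contain. The sketch below is only an illustration of that arithmetic, not the Census Bureau's delineation procedure. The function name and the sample populations are hypothetical, and it assumes every super-PUMA must reach the 400,000 minimum.

```python
# Thresholds quoted in the text above.
SUPER_PUMA_MIN_POP = 400_000    # minimum population of one super-PUMA
SPLIT_MIN_STATE_POP = 800_000   # a state this large can be subdivided


def max_super_pumas(state_population: int) -> int:
    """Upper bound on the number of super-PUMAs in a state.

    Assumption: each super-PUMA must hold at least 400,000 people.
    Every state is identified, so the result is never below 1.
    """
    if state_population < SPLIT_MIN_STATE_POP:
        return 1
    return state_population // SUPER_PUMA_MIN_POP


# Hypothetical state populations, for illustration only.
for population in (600_000, 800_000, 1_250_000):
    print(population, max_super_pumas(population))
```

The actual number of super-PUMAs in any state depends on how the underlying PUMAs are grouped, so this bound says only what is possible, not what was drawn.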

The State-Level 5 Percent PUMS Files

State-level 5 percent PUMS files will provide information for PUMAS that will represent many metropolitan areas, cities, and more populous counties, as well as groups of less populous counties. In order to protect confidentiality, characteristic information for these smaller counties will be less detailed than in the national 1 percent file.

  1. Population Thresholds for PUMAS

    Each geographic unit in the 5 percent files–PUMAS–must meet a minimum population threshold of 100,000. The minimum PUMA threshold will be held at 100,000 people by increasing the degree of variable collapsing to an appropriate level to maintain confidentiality. There are two main arguments favoring this approach.

    First, from a user's standpoint, raising the minimum population threshold for PUMAS above 100,000 would greatly restrict a wide variety of local-level geographic analyses, such as studies of nonmetropolitan, metropolitan, and intra-metropolitan areas, conducted by public agencies, academic researchers, and others in the private sector.

    Second, the 100,000 minimum population threshold–the threshold set for both the 1980 and 1990 PUMS files–permits historical comparability. Users interested in time-series analysis were clearly displeased at the possibility of an increase in the threshold for Census 2000. Those users noted the difficulty in comparing the results from different decades if the PUMA threshold was raised. Additionally, the Census Bureau's use of 250,000 as the minimum threshold for PUMAS in 1970 was criticized by users–an important reason for the decision to lower the minimum threshold to 100,000 people for the 1980 PUMS files and to maintain it in the 1990 PUMS files.

  2. Minimum Population Threshold for Categorical Variables

    To maintain confidentiality, while retaining as much characteristic detail as possible, a minimum threshold of 10,000 in the national population will be set for the identification of groups within categorical variables in the state-level PUMS files. At the PUMS Users Conference held in Alexandria, Virginia, on May 22, 2000, some users suggested a minimum population threshold of 25,000 in response to concerns about confidentiality. The Census Bureau subsequently determined that a minimum threshold of 10,000 would maintain the confidentiality of responses, while providing greater detail to the user.

  3. Post-Processing

    The state-level files will require significant post-processing. Instead of identifying variable categories based upon pre-tabulation assumptions about the composition of the population, the approach develops variable collapsing requirements after the microdata samples have been drawn. Each variable will be analyzed, and only those values that do not meet the 10,000 minimum national population threshold will be collapsed into more general categories.

    Post-processing will improve the PUMS products by offering a more precise means of ensuring confidentiality. However, this procedure will increase the processing and analytic work load and delay the release of the 5 percent PUMS products to the public by approximately six months.
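
The collapsing rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the Census Bureau's actual procedure. The function names, the category labels, the parent hierarchy, and the counts are all hypothetical. Only the 10,000 threshold comes from the text.

```python
from collections import Counter

NATIONAL_MIN = 10_000  # minimum national population quoted in the text


def collapse_categories(weighted_counts, parent_of, minimum=NATIONAL_MIN):
    """Map each category to the category it would be published under.

    weighted_counts: category -> estimated national population
    parent_of: category -> more general category (hypothetical hierarchy)
    Categories at or above the minimum are kept as they are; the rest
    are folded into their parent, or into "Other" if no parent is given.
    """
    published = {}
    for category, population in weighted_counts.items():
        if population >= minimum:
            published[category] = category
        else:
            published[category] = parent_of.get(category, "Other")
    return published


def collapsed_totals(weighted_counts, published):
    """Sum the populations under their published categories."""
    totals = Counter()
    for category, population in weighted_counts.items():
        totals[published[category]] += population
    return dict(totals)


# Hypothetical values of a single categorical variable.
counts = {"A1": 250_000, "A2": 4_000, "A3": 7_500}
parents = {"A2": "A-other", "A3": "A-other"}

published = collapse_categories(counts, parents)
totals = collapsed_totals(counts, published)
print(published)
print(totals)
```

This sketch makes a single pass. A real procedure would also have to check that each collapsed group itself reaches the threshold and collapse further if it does not.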

Timetables for PUMS Files

The 1 percent national characteristics file will be the first file released to the public. It is planned for release in 2002. The 5 percent state-level files, requiring more time for post-processing, will be released to the public in 2003.


First Decennial Census Files Released

The Census Bureau has started to release the Census 2000 Summary File 1 (SF1) data on a flow basis. SF1 presents counts and basic cross-tabulations of information collected from all people and housing units. This information includes age, sex, race, Hispanic or Latino origin, household relationship, and whether the residence is owned or rented. Data will be available down to the block level for many tabulations, but only to the census-tract level for others. Summaries will also be included for other geographic areas such as Zip Code Tabulation Areas and Congressional Districts.

The SF1 state-level data released today are for Delaware and Vermont. Other states will be released on a flow basis. You will gain access to the data through links from the URL address below:

This website will also provide the tentatively scheduled release dates for other states during the month of June.

Census Bureau Begins Releasing Redistricting Data

On March 6, 2001, Secretary of Commerce Evans confirmed that unadjusted 2000 Census data would be released for redistricting purposes. Starting today, the Census Bureau will begin releasing unadjusted block level race and ethnicity population data for New Jersey and Virginia. Data for the entire nation will become available by April 1.

At his press conference Mr. Evans was noncommittal about when and if adjusted data would be released. He stated that this was "...a decision for the future." He also stated, "I wish it was possible that they could resolve some of these issues in the few weeks ahead. They tell me that that's not possible; they won't have additional information to evaluate the data for months and in some cases years."

As the Census Bureau releases the data to the states, check its web site to download the information.

Census Bureau Recommends Not to Adjust

By now you are most likely aware that the Census Bureau has advised the Secretary of Commerce that unadjusted census data be released as the Census Bureau's official redistricting data. The Census Bureau's report shows an estimated net undercount (undercount minus overcount) of 1.18%, or about 3.3 million people. And, as in 1990, there was a differential undercount. Proportionally more African Americans (2.17%) and those of Hispanic Origin (2.85%) were missed than Whites (0.67%). However, these estimates were not in agreement with an independent demographic analysis that showed a lower overall count. The actual head count was about 281 million people. Based upon the sampling evaluation the count would have risen to about 284 million, and the demographic analysis mentioned above showed a count of about 279 million. The Census Bureau states that it could not resolve these differences in the time available, given that redistricting data for every block in the nation had to be delivered to the states by April 1, 2001.
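
The figures above can be checked with simple arithmetic. The sketch below is only a back-of-the-envelope check using the rounded numbers quoted in this item. It assumes the 1.18% rate is applied to the head count, which may differ slightly from the base the Bureau used, and the function and variable names are hypothetical.

```python
def net_undercount_millions(head_count_millions, net_rate_percent):
    """People missed on net, in millions.

    Assumption: the rate is taken as a share of the head count.
    """
    return head_count_millions * net_rate_percent / 100


head_count = 281.0            # head count, in millions (from the text)
demographic_analysis = 279.0  # demographic analysis, in millions (from the text)

missed = net_undercount_millions(head_count, 1.18)
adjusted = head_count + missed
gap_vs_da = adjusted - demographic_analysis

print(f"net undercount: about {missed:.1f} million")
print(f"adjusted count: about {adjusted:.0f} million")
print(f"gap versus demographic analysis: about {gap_vs_da:.0f} million")
```

With these rounded inputs the sample-based estimate sits roughly 5 million above the demographic analysis, which is the discrepancy the Bureau said it could not resolve in the time available.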

On March 6th, Secretary of Commerce Evans will make the final decision as to whether or not to adjust. We believe it's unlikely that he will go against the professional opinion of the Census Bureau.

The next question is whether the adjusted data will become public. These data could have an effect on the $185 billion in aid distributed by the federal government each year. If 1990 is an indication of the future, this may be a very contentious issue.

To read all the relevant documents, go to the site below for the actual relevant reports in PDF format.

Census Bureau Releases Undercount Figures

Last week was an eventful one for those involved with the 2000 Census. On February 14th the Census Bureau released preliminary undercount data. For a view of all the data we suggest going to:

The data we now have give us some indication of the net undercount in the 2000 Census. Overall, the Bureau estimates a net undercount in the range of .96 percent to 1.40 percent. This compares favorably with the 1990 estimate of 1.61 percent. For non-Hispanic Blacks the range goes from 1.60 percent to 2.73 as compared to 4.57 (including those who are Black Hispanic) in 1990. For those of Hispanic Origin of any race, the range is 2.22 percent to 3.48 as compared to 4.99 in 1990. For Whites, the range is .44 percent to .90 percent, compared to .68 in 1990. The White figures, due to the addition of "some other race," are not directly comparable. But they're close enough. Note that these figures are for the net undercount. The Bureau has not as yet released the gross figures - total undercount vs. total overcount. Yet even before we get to see these figures, it's clear that there is a differential undercount when comparing White to Hispanic and Black. The situation will undoubtedly be similar for Asians. It's difficult at this stage to translate these figures into actual counts, but it looks as if the overall net will be somewhere around 4.0 million people. Who knows what the gross errors will be?

One way of evaluating the error is through the use of Demographic Analysis (DA), which uses administrative records on births, deaths, migration, and Medicare to develop an independent estimate of the population. A "first cut" was estimated by the Presidential members (those appointed by the Democrats) of the U.S. Census Monitoring Board. Their estimate based on their use of DA for the population as of April 1, 2000 is 279.2 million, or about 2.2 million below the 2000 Census count of 281.4 million. If we add another 4.0 million undercount to the Bureau's findings, then the overall estimate of the population will end up being in the realm of 285.0 million people, or somewhere around 5.0 to 6.0 million more than DA would have led us to believe. Either there's a flaw this time around with DA or there is an appreciable unexplained overcount - or both. Some of the difference will be explained by revised estimates of the number of undocumented aliens that have come into the country over the past decade. In any event, assuming the Bureau does adjust, it's full employment time for demographers.

There can be no doubt that the 2000 Census is a success in a number of ways. First, the overall response rate increased for the first time in many decades. Second, based upon the above figures, minority populations were better counted than in 1990 and the differential undercount reduced. Indeed at a Congressional Hearing on the subject, both sides of the aisle were very complimentary of the Bureau for its well deserved success. Of course, Washington being Washington, peace and harmony only lasted so long. On the Democratic side, we were reminded that millions of Americans were still missed. Therefore it was imperative that the Census Bureau adjust the 2000 Census. Further, the decision to adjust should be made by the professionals at the Bureau after a careful analysis of the Accuracy and Coverage Evaluation (ACE) Survey. The Republican message was that this was a very successful census, and that the introduction of weights yielded inaccurate data at the block level. Therefore no adjustment was called for. After months of criticism from Side A questioning how well things were going and Side B defending the processes, we saw a reversal of roles where Side B now kept reminding us of the fact that it was good but not perfect, and Side A telling us that things were so good nothing else was needed.

Secretary of Commerce Reverses Adjustment Decision

On February 28th, the internal evaluation committee of the Bureau will present its recommendation on whether or not to adjust. Acting Director William Barron would then have had one week to decide what to do, that decision coming on March 5th. If you recall, last year the Secretary of Commerce established a rule that the Census Bureau Director shall make the decision. Thus, the reasoning went, the decision would be in the hands of the professionals and free from political intrusion. However, the President has stated that he favors an unadjusted census. Further, there is still a clear call on the part of the Republicans in the House not to see the data adjusted. On February 16th Secretary of Commerce Evans rescinded the rule. It is unclear whether on February 28th the Census Bureau committee will release both adjusted and unadjusted figures to support its findings.

Recently a trial balloon was sent aloft where it was suggested that the adjusted data be used only for the allocation of funds, and not for redistricting. This didn't happen in 1990, and we're doubtful, once the unadjusted numbers become "official", that we would see it in the current decade. One thing's for sure, if I were the mayor of a city that could find itself losing millions of dollars of federal funding, I would be talking to my local member(s) of Congress today. Hope they're l

Provisional Guidance on the Implementation of the 1997 Standards for Federal Data on Race and Ethnicity

SUMMARY: OMB has announced the availability of ``Provisional Guidance for the Implementation of the 1997 Standards for Federal Data on Race and Ethnicity,'' and is soliciting public comment on any aspects of the document for a period of 60 days. The document is close to 200 pages and thus is not being reproduced in the Federal Register. It is available electronically on the OMB web site, under Data on Race and Ethnicity, or in paper form from OMB at the address below. This updated material supersedes and replaces the draft provisional guidance that OMB made available on its web site in February 1999.

DATES: Written comments should be received by OMB by March 19, 2001.

ADDRESSES: Please send comments to: Katherine K. Wallman, Chief Statistician, Office of Management and Budget, Room 10201 New Executive Office Building, 725 17th Street, N.W., Washington, DC 20503; fax: (202) 395-7245. FOR FURTHER INFORMATION CONTACT: Suzann Evinger at telephone 202-395-7315; or E-mail:

SUPPLEMENTARY INFORMATION: In the Federal Register for October 30, 1997, OMB announced ``Standards for Maintaining, Collecting, and Presenting Federal Data on Race and Ethnicity,'' which revised the standards originally adopted in 1977. This classification provides a minimum set of five categories for data on race (American Indian or Alaska Native; Asian; Black or African American; Native Hawaiian or Other Pacific Islander; and White) and two categories for data on ethnicity (Hispanic or Latino and Not Hispanic or Latino). In addition to changes in the categories and terminology, the 1997 standards require that agencies provide the opportunity for individuals to choose more than one racial category if they wish to reflect multiple racial heritages. Since this change in policy that permits reporting of more than one race was announced, considerable attention has been given to the question of how data on multiple race responses would be tabulated. Federal agencies and other users of data on race and ethnicity requested guidance on how to implement several aspects of the 1997 standards.

OMB issued some preliminary tabulation guidance as part of the October 30, 1997 Federal Register Notice. In addition, for the past several years the Tabulation Working Group of the Interagency Committee for the Review of Standards for Data on Race and Ethnicity has been considering tabulation and other implementation issues and working to develop additional guidance. This group of statistical and policy experts drawn from the Federal agencies that generate or use data on race and ethnicity produced draft provisional guidance that OMB issued on its web site in February 1999. The provisional guidance being announced today is a substantially updated version of the earlier guidance. It reflects public comments on the earlier draft as well as the Tabulation Working Group's further research and deliberations on the issues.

The guidance presented in this document is intended for any Federal agencies or organizational units that maintain, collect, or present data on race and ethnicity for Federal statistical purposes, program administrative reporting, or civil rights compliance reporting. To foster comparability across data collections carried out by various agencies, it is useful for those agencies to report responses of more than one race using some standardized tabulations or formats. The guidance briefly explains why the tabulation guidelines are needed, reviews the general guidance issued when the standards were adopted in October 1997, and provides information on the criteria used in developing the guidelines. The guidance also addresses a larger set of implementation questions that have emerged during the working group's deliberations. Thus, the guidance addresses: collecting data on race and ethnicity using the 1997 standards, including sample questions; tabulating Census 2000 data as well as data on race and ethnicity collected in surveys and from administrative records; using data on race and ethnicity in applications such as legislative redistricting, civil rights monitoring and enforcement, and population estimates; and comparing data under the 1997 and the 1977 standards when conducting trend analyses. This guidance is necessarily provisional pending the availability of data from Census 2000 and other data systems as the 1997 standards are implemented. The guidance provides a general framework and is not intended to cover every specific issue that agencies will encounter during their implementation of the 1997 standards.

Highlights From FY 2000, Our End of Year Review

Edward J. Spar, Executive Director, Council of Professional Associations on Federal Statistics

The collection and eventual release of the 2000 Decennial Census has dominated much of the federal statistical scene in 2000. Although there were some fears that the mail back rate would be a disaster based on comments from members of Congress about confidentiality, this fortunately did not turn out to be the case. The mail back response rate, somewhere around 66%, was better than what was achieved in 1990. This, coupled with an excellent field follow-up for non-response and the special sample for adjusting the undercount, has given the Census Bureau enough time to complete the delivery of the data needed for reapportionment (population counts for fifty states) and tabulate the special sample for adjusting the Decennial Census data by April 1, 2001. That's the date when all population counts by race and ethnicity will be delivered to the states for every block in the nation. These data will be used by each state for redistricting purposes.

Now comes the interesting part. Between now and April 1, 2001, the Census Bureau has to decide if the process used for developing adjusted data is statistically sound. A technical committee within the Bureau has been established to make this determination. Up until now, we have been assuming that the decision would be based solely on technical merits and that the Director of the Bureau would have the last word. As anyone breathing knows, we are about to change Administrations. President-elect Bush has indicated in the past that he believes the unadjusted data to be more accurate. Indeed a spokesman for Mr. Bush has stated that: "We need to look at the accuracy and reliability of the numbers in terms of making that decision." So all bets are off until after January 20, 2001 when we have a new President. However, if I were a betting statistician, I believe the odds are high that we will not see adjusted data for the Decennial Census. Meanwhile, on the Congressional side, we can expect the House oversight Subcommittee on the Census to be looking very closely at what the Bureau is doing. Our guess is that the issue will be the same as what's been argued in the past. Namely, adjusted data at the block level are very inaccurate due to weighting. Those in favor of adjustment will probably agree, but argue that these data are not reliable to begin with, and that the data that really matter are at the tract level where the accuracy is much better. The new Administration could kill the use of sampling by simply studying it to death so that the April 1, 2001 deadline is missed. Who knows? Stay tuned.

The other issue that has had an impact on the federal statistical agencies is the budget impasse that was only recently resolved. Of the twelve major statistical agencies that we track, seven are only now receiving their FY 2001 budget. Up until now they have been spending at FY 2000 levels. The good news is that the final budget numbers are much better than expected. At the beginning of the process it looked as if Congress would keep the budgets flat with FY 2000. Thanks to a more liberal spending approach on the part of the Senate and strong coalition work from concerned users of the information, including COPAFS and members of COPAFS, the final budget figures are, for the most part, close to the President's request for FY 2001. Take a look at our web site for the latest budget figures. Some of the numbers may be lowered based upon a last-minute, agreed-upon 2.2% across-the-board budget decrease. This may affect the Census Bureau, the Bureau of Labor Statistics, the National Centers for Education and Health Statistics and the Bureau of Justice Statistics. But for the most part, what we are showing is very close to the final figures. We are especially pleased to see the final figure for the Bureau of Economic Analysis. This agency, responsible for the national accounts as well as other important economic indicators, has not seen an increase in over five years. They will now be in much better shape to modernize their equipment and improve their overall processes.

We're continually asked if the change in the Administration is bad news for statistics. There's no reason to suspect that this would be true. There's been strong and weak support in both Democratic and Republican Administrations. Much will still depend upon the Congress as well as Department Secretaries and economists in key places. And yes, there's near certainty that the resignation (that's already submitted) of Dr. Kenneth Prewitt, current Director of the Census Bureau, will be accepted. History has shown that the Census Bureau is more often run by an acting Director, than the actual Presidential appointee. Dr. Prewitt has done an outstanding job, not only to administer the Bureau during the Decennial Census, but to bring a strong esprit de corps to the agency. Assuming his resignation is accepted, he will certainly be missed.

Where are we on agency consolidation? If you recall, the House of Representatives passed the Statistical Efficiency Act of 1999. This legislation would have enabled, under strict confidentiality controls, eight statistical agencies to share data. This would have been the first step in improving the efficiency of agencies through the potential elimination of redundant collections, and the positive effect of working together to develop more timely and useful information. We can only hope that Congressman Steve Horn (R-CA) will soon reintroduce this important legislation.

For next year expect to see more emphasis placed on the American Community Survey. The Census Bureau will release data from a 700,000 household sample that was conducted simultaneously with the Decennial Census using the American Community Survey questionnaire. Long form data will be available for states and large areas beginning next year. Beginning in 2003, and continuing every year thereafter, the Bureau plans to collect annual data from 3 million households. Once the survey is in full operation, data will be available every year for areas of 65,000 or more beginning in 2004. By 2006, there will be data available for areas of 20,000 to 65,000 based on 3 years of data. And by 2008 expect to see data available for small areas based on 5 years of data. Of course, much depends on Congress' willingness to fund this survey. It will be up to the Bureau to show that the cost savings based on the plan for the American Community Survey to replace the long form in 2010, and the added efficiencies derived from annual updated information, offset the annual cost. It will also be up to user communities and federal statistical agencies, who have the most to gain from the American Community Survey, to voice their support.

Finally, by the end of next week, the Census Bureau will deliver the population figure for 50 states and the District of Columbia. Although we're sure you'll find many places to get them, the best being the Census Bureau's site, we too will make them available as soon as we get them.

Best wishes for a happy and healthy new year.

Crimes in the Nation's Schools Declined in the 1990's According To Departments of Justice & Education

WASHINGTON, D.C. Crime in the nation's schools decreased during the last seven years, according to a new report issued today by the Justice Department's Bureau of Justice Statistics and the Department of Education's National Center for Education Statistics. The report, Indicators of School Crime and Safety 2000, indicates that between 1992 and 1998 violent victimization rates at schools dropped from 48 crimes per 1,000 students to 43 per 1,000. The percentage of students who said they were victims of crimes (including either theft or violent crimes) at school decreased between 1995 and 1999 from 10 percent to 8 percent.

Between 1993 and 1997 students in grades 9 through 12 who reported carrying a gun, knife or other weapon on school property during the previous 30 days dropped from 12 percent to 9 percent, a 25 percent reduction. During 1998, students aged 12 through 18 were victims of more than 2.7 million crimes at school, including about 253,000 serious violent crimes (rape, sexual assault, robbery and aggravated assault). In comparison, there were 550,000 such serious crimes away from school. The new report indicates there were 60 violent deaths at school between July 1, 1997 and June 30, 1998, including 47 homicides, 12 suicides and 1 teenager killed by a police officer in the line of duty.
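
The report quotes both percentage-point changes and relative reductions, which are easy to confuse. As a minimal sketch, the helper below (a hypothetical name, not part of the report) computes the relative change for the figures cited in this item.

```python
def relative_change_percent(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Weapon carrying, grades 9 through 12: 12 percent down to 9 percent.
weapons = relative_change_percent(12, 9)

# Violent victimization at school: 48 down to 43 crimes per 1,000 students.
violent = relative_change_percent(48, 43)

# Students reporting any victimization at school: 10 percent down to 8 percent.
any_crime = relative_change_percent(10, 8)

print(f"weapon carrying: {weapons:.1f}%")
print(f"violent victimization rate: {violent:.1f}%")
print(f"any victimization: {any_crime:.1f}%")
```

A drop from 12 percent to 9 percent is three percentage points, and that is the 25 percent relative reduction the report cites.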

Between 1993 and 1997, the percentage of 9th through 12th grade students who were threatened or injured with a weapon of any sort on school property remained constant between 7 and 8 percent. Additionally, the percentage of those students who reported being in a physical fight on school property was unchanged during the same period.

During the 1994-1998 period, teachers were the victims of 1,755,000 crimes at school, including 1,087,000 thefts and 668,000 serious violent crimes. This amounts to 83 crimes per 1,000 teachers annually.
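
The teacher figures can be cross-checked in a few lines. This sketch assumes the 1994-1998 period covers five full years. The implied number of teachers is a derived quantity, not a figure from the report, and the variable names are hypothetical.

```python
thefts = 1_087_000
violent = 668_000
total = thefts + violent  # should match the 1,755,000 total in the text

years = 5                 # assumption: 1994 through 1998 inclusive
rate_per_1000 = 83        # annual crimes per 1,000 teachers (from the text)

crimes_per_year = total / years
implied_teachers = crimes_per_year / rate_per_1000 * 1000

print(f"total crimes: {total:,}")
print(f"crimes per year: {crimes_per_year:,.0f}")
print(f"implied teacher population: about {implied_teachers / 1e6:.1f} million")
```

The two components sum to the published total, and under the five-year assumption the annual rate corresponds to a base of roughly 4.2 million teachers.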

Revisions in January to August 2000 CPI Data

The Bureau of Labor Statistics (BLS) is reissuing Consumer Price Index (CPI) data for the January to August 2000 period to correct an error recently uncovered in the software used to calculate the Rent of Primary Residence and Owners' Equivalent Rent of Primary Residence components of the index. Correcting this error increases previously published values for those components and for index series that include those components, in selected local areas as well as at the U.S. City Average level. The affected series include the U.S. City Average All Items CPIs for All Urban Consumers (CPI-U) and for Urban Wage Earners and Clerical Workers (CPI-W). Between December 1999 and August 2000, the corrected CPI-U rose 2.7 percent, compared with an increase of 2.6 percent in the series as originally published.

The error occurred with the introduction of the new housing sample and calculation procedures beginning in January 1999. The error was in the calculation of quality adjustments when housing units in the CPI Rent and Owners' Equivalent Rent samples reported changes in air conditioning (AC) equipment. Although some index values in the January to December 1999 period were affected by the error, no revisions to data for this period will be published. Changes to the overall, or all items, index at the national average level during this period were not large enough to warrant re-publication under BLS policy, as in no month of 1999 did the overstatement in the overall index exceed 0.1 index point.

The revised CPI series will be posted on the Internet on the CPI home page and will also be available upon request in hard copy form.

Just Seems Like Yesterday
Edward J. Spar
Executive Director, COPAFS

TerriAnn Lowenthal likes to kid me when I mention that the 2010 Decennial Census would be the sixth census that I will have had any connection with. "Are you sure it's not the ninth?" she would ask. Thanks, pal. But time indeed does fly by quickly. Here we are in FY 2004 and we're hot and heavy into the processes for the next census. When Dave Kaplan used to run it at the Census Bureau, it kinda, sorta just nicely happened. And there on our doorstep were those Advanced and Preliminary Reports. Made you feel warm and cuddly all over. Come to think of it, the concept of a purely decennial census is about to be as outdated as my beloved Monroe calculator. Given the full funding that the House has given the Census Bureau to finally start up the American Community Survey (ACS) next year, looks as if we will have a piece of the decennial every year.

There's so much new happening that the next census will be, on both the collection and dissemination side, unrecognizable from anything we've been involved with in the past. Top of the list, of course, is the ACS. We've written so much about this that I'm sure little addition is needed. One big issue will be how to make sense of three- and five-year averages, along with getting comfortable with the degree of statistical and non-statistical error. As to the former, I'm fairly confident the Bureau will give us enough options ranging from straight averages to smoothed averages putting more emphasis on the current years. As to the issue of error, statistical literacy here we come. Oh how we ignored the fact that the long form is a sample. Can't duck the issue now. In fact it's my understanding that much of the ACS data will have statistical tolerances built into the dissemination package. As far as non-sampling error, let us say based upon non-response or poor response to some questions, well, we will have to wait and see. Hopefully the Bureau will be able to give us some indication of its magnitude. Another ACS-related issue will be centered around control totals. Do you weight the ACS to independently developed sub-national figures for totals and characteristics such as age, sex, race and ethnicity? Or, are the ACS data more accurate than the county and sub-county figures that have been historically estimated and should be the input to the sub-national estimating program? I have little doubt that this issue will be hotly debated over the next year.

On the collection side of the aisle, the Bureau is planning some high-tech approaches. I believe handheld computers for data capture are being looked into, along with geographic positioning to locate addresses, more accurate computer-generated maps, and a whole host of advanced technology. One area that still, at least for me, remains in the dark is the Local Update of Census Addresses or LUCA. Won't it be necessary to keep the address file up to date in a more or less real time mode for ACS sample selection purposes? Does seem logical. What we would like to know is: how is the Bureau planning to do this? Sounds like an immense task, but a necessary one. We know that Joe Salvo in New York loves to find those questionable third units in houses zoned for two apartments.

One of the more enjoyable issues is trying to figure out what dissemination will be like in 2010. Instead of tables, I for one want to see a hologram of Jay Waite, the Associate Director for the Decennial, answering all data requests by reading them to you in living 3-D color. OK, someone better looking than Jay. (Sorry, Jay.) What will probably not change is the reality that 99% of the users will still need only 1% of what's available. Indeed, summary tables and a few printed reports will still be the mainstay of the delivery process. For the Patty Beckers of the world looking for nine-dimensional tables that send non-disclosure wonks (I will not disclose their names, tee hee) berserk, Jay will turn into a Picassoesque cubist work of art. More seriously, I hope we will finally see the decennial linked not only to the Economic Censuses, but also to data collections from other agencies both nationally and internationally. That too should keep the non-disclosure folks, lurking in the deep dark bowels of the Bureau, uncomfortable for a while.

Finally, the issue of adjustment. The Accuracy and Coverage Evaluation Revision II (ACE II) was a disappointment for a number of reasons. First, some of the conclusions made no sense. For example, the report estimated that there was an overcount for those under 17 years of age. We've been told for at least those six censuses I've been involved in that it's that cohort where substantial undercount is to be found. Further, ACE II found that the Hispanic population had a lower percentage of those undercounted than the Non-Hispanic Black population. Is this possible? It sure is counter-intuitive. Where did all the undocumented people go? Finally, there was an estimated overcount on American Indian Reservations. Hard to believe. What is clear is that there is NO reasonable way of measuring the undercount and overcount, even at the national level. And to the Bureau's credit, they have been up front about this. Indeed, the Director has clearly stated that given the current state of the statistical art, it's not realistic to assume that the 2010 Census will be adjusted and there are no plans to do so, although evaluations will take place. My only criticism of the Bureau is that given the lack of logic found in these figures, I would suggest that they shouldn't be used on the one hand to tell us why adjustment is not feasible and on the other hand use the data to tell us why the 2000 Census was the best census ever and that ACE II found an overall overcount of 1.3 million people. I can't prove it, but I'll bet it ain't so. As an aside, Jeff Passel pointed out to me recently that the concept of "undocumented" alien is a misnomer. They have plenty of documents, they're just not legitimate. "Unauthorized" would make more sense.

I think of a decennial census as a huge project put together with Scotch tape, Elmer's glue, and string. Based on enormous talent and dedication on the part of the Bureau staff it all stays together and out come very usable data. And that's what will happen again. There will be dozens of oversight committees, hearings, and hand wringing. And both in spite of and to some degree because of these, all three dimensions of Jay will emerge in 2010 on our desktops with his findings.

OMB Released Latest Metropolitan-Micropolitan Areas Definitions (June 9, 2003)

On Friday, the Office of Management and Budget released the list of revised definitions of Metropolitan Areas, and new definitions of Micropolitan and Combined Statistical Areas. The list of areas can be obtained by going to: Go to "Bulletins" (on the left hand side of the page under "Information for Agencies") and then at the bottom of the announcement, Bulletin 03-04, there is the link to the PDF Attachment.

A full text of the short press release (2003-18) can be found on the OMB web site at:

COPAFS will host a one day seminar on November 4, 2003, to assess the impact of the new areas on the public and private sectors.

American Community Survey Will Start in FY 2004 Under President's Budget Proposal for 2004 (February 11, 2003)

The Administration's Fiscal Year 2004 (FY04) budget proposal requested $64.8 million for the American Community Survey (ACS), an amount that assumes the Census Bureau would not launch the survey nationwide until the fourth quarter. The federal fiscal year runs from October 1 through September 30; the fourth quarter covers July through September.

Under the revised ACS plan, the Census Bureau would begin mailing survey questionnaires in late June 2004 (for the first monthly sample in July). The ACS will sample 250,000 new households every month (3 million a year). For each monthly sample, the Bureau will first try to contact unresponsive homes by telephone, and then send survey takers to visit a portion of households that still have not responded. However, the Bureau does not plan to start household visits until after September 2004, pushing field work into Fiscal Year 2005 and reducing FY04 ACS costs considerably. In-person household interviews are the most costly operation in censuses and surveys.

The Census Bureau has not finalized plans for release of the first annual ACS estimates for states and places with a population of 65,000 or greater. Under its original ACS plan, which assumed nationwide launch of the survey this year, the Bureau planned to release estimates based on a calendar year's worth of data collection.

Before expanding the ACS nationwide in July 2004, the Census Bureau plans to continue sampling homes in the 31 test sites and the Supplementary Survey for the first nine months of FY04. Those demonstration projects would then be rolled into the nationwide survey. The Bureau has been evaluating ACS methodology and operations, as well as results, in 31 counties around the country since 1999; initial field testing began in a handful of sites in 1996. In 2000, it launched the Supplementary Survey, a national sample of 700,000 housing units annually, to assess the ACS plan on a national scale and provide a point of comparison with the Census 2000 long form. The national sample survey produces annual estimates for states and places with a population of 250,000 or greater. The Bureau has published data from both the test sites and the Supplementary Survey each year.

Other appropriations news: The Census Bureau also is seeking $2.5 million in FY04 to conduct the 2004 Overseas Enumeration Test in France, Kuwait, and Mexico. The test, which will evaluate the feasibility of counting private American citizens living abroad in the 2010 census, was developed in response to congressional directives. In Census 2000, the Bureau counted members of the armed forces and civilian federal employees (and their dependents) stationed overseas on Census Day, using administrative records. The numbers were included only in state population totals used to reapportion the U.S. House of Representatives. The State of Utah sued the Census Bureau after post-census analyses showed that the overseas counts cost Utah an additional seat in Congress; the seat went to North Carolina instead. Utah unsuccessfully argued that Mormon missionaries and other private citizens living outside of the U.S. during the census should have been counted along with government personnel. The U.S. Supreme Court refused to hear the case after a federal appeals court sided with the Census Bureau.

Even as the FY04 budget process begins, House and Senate negotiators continue to haggle among themselves and with the White House over appropriations for non-defense government agencies for Fiscal Year 2003 (FY03). Last week, Congress passed another temporary funding measure (Continuing Resolution) to keep the government running at last year's spending levels through February 20. Legislators are scheduled to recess after this week for the Presidents' Day holiday.

Adjusted Census Data File Released by Census Bureau (December 20, 2002)

The US Census Bureau has released adjusted data as discussed below. The University of California, Los Angeles has created an ftp site for the recently released adjusted 2000 census data for all states at:

These are huge files, even for small states. There is no software associated with the files. Each record looks like this (an example of a block from Rhode Island):


The Census Bureau does not support the use of the Adjusted 2000 Census Data.

This statement came along with the file:

"The numbers were released pursuant to the order of the United States Court of Appeals for the Ninth Circuit in Carter v. Department of Commerce, 307 F.3d 1084. These numbers are not official Census 2000 counts. These numbers are estimates of the population based on a statistical adjustment method, utilizing sampling and modeling, applied to the official Census 2000 figures. These estimates utilized the results of the Accuracy and Coverage Evaluation (A.C.E.), a sample survey intended to measure net over-and undercounts in the census results. The Census Bureau has determined that the A.C.E. estimates dramatically overstate the level of undercoverage in Census 2000, and that the adjusted Census 2000 data are, therefore, not better than the unadjusted data. Accordingly, the Department of Commerce deems that these estimates should not be used for any purpose that legally requires use of data from the decennial census and assumes no responsibility for the accuracy of the data for any purpose whatsoever. The Department, including the U.S. Census Bureau, will provide no assistance in the interpretation or use of these numbers."

An End Of Year Review for 2002: At Least One Big Win and Some Perils
Edward J. Spar
Executive Director, COPAFS

To start on a positive note, the big win: I am delighted to report that Congress has passed the Confidential Information Protection and Statistical Efficiency Act of 2002. "CIPSEA," included as Title V in the E-Government Act of 2002 [H.R. 2458], will provide a uniform set of confidentiality protections and extend these protections to all individually identifiable data collected for statistical purposes under a pledge of confidentiality and will permit the sharing of business data by the Bureau of Economic Analysis, the Bureau of Labor Statistics, and the Bureau of the Census. My compliments to Katherine Wallman, Chief Statistician of the United States, on this success. Her tireless effort has made this a reality. The effort to create such an act goes back decades. Having a broad-based statement on confidentiality that covers all data and is not restricted to specific agencies is indeed a milestone for individual protection. At the same time, we have finally seen the beginning of data sharing. The current act may be limited to business data, but we can hope that this will be expanded to demographic and socio-economic data in the future. (The President signed this bill into law on December 17, 2002.)

However, all is not well in the land of federal statistics, in my opinion. If you recall, last year the USA PATRIOT Act provided that the Attorney General could gain access to individual data collected by the National Center for Education Statistics (NCES). This raised serious questions about privacy protection for respondents to surveys and to administrative data collected by NCES. A big question that has yet to be answered is whether CIPSEA, mentioned above, will override the Patriot Act provision. Put somewhat in legalese, at least as has been argued around the "Hill," does the specific (the Patriot Act) take precedence over the general (CIPSEA)? As of this writing, the stack of legal opinions grows. I think the answer is that it will be up to the courts, if it gets that far. So far there have been no requests on the part of the Justice Department for NCES data.

The budget issue is bizarre. Congress won't even tackle the FY2003 budget (it was to start on October 1, 2002) until January of 2003. By then the FY2004 budgets will be almost ready to send to the Congress. As I assume you know, most of the federal government is operating at FY2002 expenditure levels. This means that such operations as conducting the Economic Censuses, an endeavor that requires special funding for FY2003, are underfunded. They can send out the questionnaires to business establishments, but who knows if they'll have the money to tabulate them when they come back. We all assume the money will be found, but in this town making assumptions is not a smart thing to do. One would also assume that one agency that would get all the funding it needed was the National Center for Health Statistics (NCHS). But unfortunately NCHS does not collect data related to bio-terrorism. Hence, as was explained to me by the Centers for Disease Control (the parent of NCHS), no extra money can be found. This is in an era where we need to know as much about the country's health as ever. By the way, this also translates into no funding for the introduction of the American Community Survey (ACS). Hopefully, enough money will be found to continue with the 31 ACS test sites and the Supplementary Survey. I could go on, but the "bottom line" is that we may not ever see FY2003 budgets. Congress, so the rumor mill goes, may go directly to FY2004 and just keep the federal government in its current austerity.

Although certainly not new, there is continued pressure to downsize government. For federal statistical agencies there are moves toward even greater privatization. Historically, one impetus has come from OMB Circular A-76 that states:

A. Competition enhances quality, economy, and productivity. Whenever commercial sector performance of a Government operated commercial activity is permissible, in accordance with this Circular and its Supplement, comparison of the cost of contracting and the cost of in-house performance shall be performed to determine who will do the work. When conducting cost comparisons, agencies must ensure that all costs are considered and that these costs are realistic and fair.

B. Certain functions are inherently Governmental in nature, being so intimately related to the public interest as to mandate performance only by Federal employees. These functions are not in competition with the commercial sector. Therefore, these functions shall be performed by Government employees.

C. The Federal Government shall rely on commercially available sources to provide commercial products and services. In accordance with the provisions of this Circular and its Supplement, the Government shall not start or carry on any activity to provide a commercial product or service if the product or service can be procured more economically from a commercial source.

So, Section C above basically says that if the private sector can do the job more economically than the federal government, then the private sector should do it. As a former private sector "vendor," I'm a fan of this logic. However, it can go too far. There have been rumblings that the statistical staffs of entire agencies could be eliminated by simply farming the research out to the private sector. Yes, using private sector firms is fine, but let's not forget the need for sufficient in-house staff to ensure data quality and usability. Indeed, I believe the pressure to downsize government flies in the face of the reality that federal statistical agencies are already, or will shortly be, faced with large staff shortages due to retirement and recruiting difficulties.

So, it's been a mixed year. CIPSEA is certainly a major victory for the federal statistical system. I only wish I could look forward to more of them. Unfortunately, FY2003 is not shaping up as a great year. But as I've said many times, in this town you never know.

U.S. Census and Bureau of Economic Analysis Senate Appropriations Mark-Ups

The Senate Committee on Appropriations completed its mark-up of the Commerce-Justice-State Appropriations Bill. As was predicted, funding for homeland security had a high priority, along with some agencies including the National Oceanic and Atmospheric Administration. Two big losers were the Bureau of Economic Analysis (BEA) and the Census Bureau.

BEA was asking for an added $10.7 million for: 1) Generating more timely economic data; 2) Upgrading BEA's statistical processing system; 3) Meeting U.S. international obligations.

Among other requests, the Census Bureau was asking for: 1) $124 million to implement the American Community Survey; 2) $92 million for the Economic Censuses.

Based on the proposed 4.1 percent salary increases, it's clear that the above efforts are in jeopardy if not impossible to fully implement. The bill is expected to go to the Senate floor in September. It is expected that this is when the House committees will address the Commerce-Justice-State Appropriations Bill.

New BLS, EIA, BJS Agency Heads

The Senate confirmed the nominations of Kathleen Utgoff as Commissioner of Labor Statistics, Larry Greenfeld as Director of the Bureau of Justice Statistics, and Guy Caruso as Administrator of the Energy Information Administration.

Kathleen Utgoff previously served as vice president of the Center for Naval Analyses, where she was responsible for research on work force issues, the environment, health care, and infrastructure. Earlier in her career she was a senior economist for the Council of Economic Advisers, the Executive Director of the Pension Benefit Guaranty Corporation, and the chief economist and a partner in the employee benefits law firm of Groom and Nordberg.

Guy Caruso has most recently served as executive director of the Strategic Energy Initiative at the Center for Strategic and International Studies and as Director of the National Energy Strategy Project. During his earlier 32-year career with the government, he served as Director of three different offices in the Department of Energy (Oil and Natural Gas, Energy Emergency Policy, and Oil Market Analysis) and was twice posted to the International Energy Agency.

Larry Greenfeld, who started out as a probation and parole officer 33 years ago, has had a long career at the Bureau of Justice Statistics. He is the 4th confirmed director of BJS.

Together with Louis Kincannon at the Census Bureau, these confirmations complete action on four of the five presidential appointment vacancies in the statistical system. The fifth vacancy is Commissioner of Education Statistics, where no "intention to nominate" has been indicated. The sixth presidential appointment, at the Bureau of Transportation Statistics, currently is filled by Ashish Sen.