Summary of the September 10, 2004 COPAFS Meeting
COPAFS Chair Don Muff started the meeting by introducing Ed Spar for his Executive Director’s report.
Ed described the current “flap” over the Census Bureau’s provision of 2000 census data on Arab ancestry to the Department of Homeland Security, and directed our attention to the Census Bureau’s announcement concerning procedures for the provision of assistance to those requesting special tabulations and extracts. The procedures go beyond existing disclosure review practice to require special approval where the request involves a sensitive population. Even where the request is only for an extract of already published data, the Census Bureau employee is to attempt to learn the name and affiliation of the requestor, and if the requestor is with a law enforcement or intelligence agency and/or the request involves a sensitive population, the request must be reviewed and approved by senior Census Bureau staff. Ed stressed that these are interim recommendations, subject to discussion by stakeholders. A special joint meeting of the Census Advisory Committees will be held on the subject November 9.
Turning to budgets, Ed noted that ACS funding is down $19 million on the House side, and that we have no clue what the Senate will do. Nevertheless, the Census Bureau is moving ahead with the ACS, and expects to be in the field starting January – so long as Senate funding is close to the House figure. One consequence of the cuts is the deferred inclusion of group quarters in the ACS – a move that has annoyed some users. With the end of the fiscal year approaching, hope for an omnibus funding bill is dwindling, so a continuing resolution is likely. Ed discussed some preliminary numbers, but commented that there are questions all around until we see what the Senate does.
Next Ed described the Census Bureau’s announcement of the new Quarterly Services Report, which will provide data specific to the services sector. Ed commented that this is not really a new economic “indicator” (as labeled in the media), but described it as a good source of new data. Disputing Democratic charges that the release was a political move on the part of the administration, Ed noted that the new source has been in the works for a long time, and that “these are not political data.”
Ed closed with a pitch for the December 15-16 OMB seminar, and announced the dates for future COPAFS meetings: December 10 (2004), then March 11, June 10, September 16, and December 9.
Measuring Response Rates in Federal Surveys
Clyde Tucker. Bureau of Labor Statistics.
Clyde remarked that this was a difficult presentation to prepare because there is so much going on—with several federal interagency groups, AAPOR and others doing work in this area. However, he described three “most important points” for us to know: 1) that contrary to popular belief, the calculation of response and nonresponse is not straightforward; 2) that nonresponse rates are not measures of data quality; and 3) that for at least the past 40 years, nonresponse rates have been increasing.
Clyde then proved his first point by reviewing a large number of response/nonresponse measures developed for federal surveys, and describing the complications and variations associated with these measures. The measures involve a list of definitions (eligible units, interviewed units, eligible non-interviewed units, eligible units not interviewed due to refusal, etc.), and it is sobering how difficult it can be just to define “eligible units.” Response measures differ for units consisting of persons, households, consumer units, or housing units; and whether a longitudinal or panel survey is involved (where one must account for attrition and new household formation). Even data collection methods are a factor, as random digit dial (RDD) surveys must account for “telephone break offs” in addition to refusals. And business establishment surveys present a separate set of challenges—including the “births and deaths” of firms, the need to weight by size of firm, and large numbers of partial responses that increase the focus on item-nonresponse.
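As a rough illustration of why these measures vary, the sketch below computes a unit response rate and a refusal rate from a list of outcome codes. The code names, and the parameter that sets what share of unknown-eligibility units is counted as eligible, are assumptions made for this example. They are not the definitions used by any particular federal survey.

```python
from collections import Counter

# Hypothetical outcome codes; real surveys define their own code lists.
INTERVIEW = "interview"
REFUSAL = "refusal"
NONCONTACT = "noncontact"
INELIGIBLE = "ineligible"
UNKNOWN = "unknown_eligibility"


def response_rates(outcomes, unknown_eligible_share=1.0):
    """Return response and refusal rates over (estimated) eligible units.

    unknown_eligible_share is the assumed fraction of unknown-eligibility
    units treated as eligible (1.0 is the most conservative choice).
    """
    counts = Counter(outcomes)
    interviewed = counts[INTERVIEW]
    eligible_not_interviewed = counts[REFUSAL] + counts[NONCONTACT]
    est_eligible_unknown = unknown_eligible_share * counts[UNKNOWN]
    eligible = interviewed + eligible_not_interviewed + est_eligible_unknown
    if eligible == 0:
        raise ValueError("no eligible units")
    return {
        "response_rate": interviewed / eligible,
        "refusal_rate": counts[REFUSAL] / eligible,
    }


# Made-up sample: ineligible units never enter the denominator.
sample = (
    [INTERVIEW] * 70
    + [REFUSAL] * 15
    + [NONCONTACT] * 5
    + [INELIGIBLE] * 10
    + [UNKNOWN] * 10
)
conservative = response_rates(sample, unknown_eligible_share=1.0)
optimistic = response_rates(sample, unknown_eligible_share=0.0)
```

The same fieldwork gives two different response rates depending only on how units of unknown eligibility are handled, which is one of the definitional choices the presentation pointed to.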
Next, Clyde reviewed some specific nonresponse measurement issues that are being worked on. For household surveys, these include the need for greater consistency in the definition of “outcome codes” (e.g., interview vs. non-interview), the handling of cell phones in RDD surveys, the steady increase in “non-contacts” (which diminishes average time devoted to nonresponding units), and changes in technology (e.g., need to distinguish between reaching an answering machine and a straight non-contact). Issues in the calculation of establishment response rates include outcome codes, handling the births and deaths of establishments, the retention of initial sample records, and the creation of improved sampling frames.
Future work will focus on the refinement of rate calculations, improving the uniformity of outcome codes (and their application), rates specific to response mode, research on non-contacts, and rates for item-nonresponse.
When asked what constitutes a good response rate, Clyde referred us back to his second “important point,” that response rates are not a surrogate for data quality. To determine the impact of nonresponse on data quality, Clyde commented that we need measures of nonresponse bias. He closed by noting that this is something they are working on.
Policy Implications and Applications of Administrative Records Research Programs
Lisa Blumerman, Sally Obenski, and Ron Prevost. U.S. Census Bureau.
Lisa noted that the presenters are part of a new Administrative Records Opportunities team at the Census Bureau—with administrative records defined as data collected and maintained by other agencies as part of their (usually non-statistical) work. As Lisa described it, the Census Bureau’s administrative records work can be viewed in the context of its data stewardship program, and is consistent with its mission to provide quality data while protecting confidentiality, reducing respondent burden, and minimizing costs to the public.
The formal mandate is found in Title 13, which calls for the use of administrative data, where possible, instead of direct inquiries. Administrative records efforts also are guided by the Privacy Act of 1974, the E-Government Act of 2002, and the Paperwork Reduction Act.
Lisa explained that the use of administrative records requires a “bridge of trust” between the Census Bureau, the providing agencies (such as IRS), data users, and the providers of personal information. She then described a number of “data stewardship policies” designed to establish and maintain this trust. Critical among these is a policy statement on record linkage—ensuring that the critical, but sensitive, linking of individual records be conducted only when it is necessary to the mission, is the best alternative, provides benefit to the public, and is conducted with sensitivity, openness, and consistent review and tracking. Responding to a question, Lisa noted that disclosure avoidance is part of all of these objectives. Lisa concluded by describing the administrative steps to ensure the consistent application of these policies. These include centralized data acquisition and agreements, centralized project review, the removal of identifiable information, security and confidentiality training, and the Administrative Records Tracking System.
Sally Obenski described the evolution of the administrative records program. Noting that the Census Bureau has long used IRS data in its estimates, she explained that coverage issues with the 1990 census led to renewed interest in the use of administrative records, and that staff was formally dedicated to pursue administrative records work in 1997. The effort took the form of the Statistical Administrative Records System (StARS), an exercise in drawing from seven major federal files to simulate an administrative records census. StARS has been validated in five counties with coverage issues similar to those of the census, and found to perform well relative to census addresses. Improvements are still being made, but some limitations persist – such as the difficulty of getting race/ethnicity information from the Social Security transaction data. Also complicating StARS is the need to ensure that the use of Social Security numbers meets Social Security Administration requirements for validation.
Sally then described some emerging uses of administrative data. These include the identification of addresses not yet on the MAF, and the identification of characteristics of nonresponding households (relevant to Clyde Tucker’s call for measures of nonresponse bias). Administrative data also could improve the imputation of characteristics in cases of item-nonresponse. Sally concluded by stressing that the objective is not to replace the census, but to supplement and improve it.
Ron Prevost then gave us a look at the future of administrative records applications. Since administrative data provide a separate version of “truth,” the objective is to re-use this information to improve census operations – including the improvement of address frames, synthetic estimates and field and headquarters operations. To that end, he described the “new business line” proposed by the Census Bureau, in which administrative data would go beyond the mere provision of data to contribute information for program administration, policy development, and program performance measures. The objective is to show agencies that administrative data can help them improve their work.
As an example, Ron described a Maryland food stamp study, in which they demonstrated that administrative records had better coverage of recipients than the ACS 2001 Supplementary Survey. Administrative data also were less vulnerable to response error, and other respondent cognition issues. For example, the administrative data are immune to the tendency of some respondents to report that they do not use “food stamps,” because they are no longer provided in the form of stamps. Again, the potential goes beyond the provision of data to the improvement of questionnaire design, editing procedures, and survey weights.
Results from the BLS Time Use Survey
Jay Stewart and Dori Allard. Bureau of Labor Statistics.
Ed said he had hoped the first data from the American Time Use Survey (ATUS) would be available for this presentation, but instead it was a preview, as the first data are now due September 14.
Jay explained that the goal of the ATUS is to provide annual estimates of time spent in various activities, crosstabulated by demographic and labor force characteristics, and by weekday/weekend and over time. Data are collected continuously from a sample of households (currently about 2,200 per month) that have just finished participation in the CPS. Once a household is selected, a reference person is designated at random, and assigned a reference day (e.g., Tuesday the 5th). Interviews are conducted by CAPI in either English or Spanish, with basic information carried over from the CPS.
Information is collected using a core time diary, and a set of summary questions (e.g., on labor force participation). Further information is collected on other household members, and how they are related to the reference person. Respondents are asked what they did and for how long, starting at 4:00 am on the reference day. They are also asked who was with them at the time, and where they were when engaged in the activity. A set of summary questions gathers information on work activities not covered in the diary, and “income-generating activities” other than the respondent’s job. Information also is collected on child care, volunteer activity, and “missed days” (activities on trips of two or more days that would be missed in the 24-hour protocol).
Anticipated uses of ATUS data include estimating the value of non-market work, verifying the accuracy of data from other sources (such as hours worked), measuring time spent with children, and comparing U.S. time use with that of other countries. Future ATUS “modules” will ask questions on specific topics, such as food security, tools and appliances (associated with household capital and production), and care of the elderly.
Dori then described ATUS data products. Again, the first ATUS estimates are due September 14, with public use microdata to follow. The first release will consist of time-use data files – including a person file (one record for each designated person), roster file (one record for each person in the household), activity file (one record for each activity), and a “who” file (one record for each “who” present with the respondent). There will also be an ATUS-CPS file identifying information (such as race, education, marital status) carried over from the CPS. A second data release will include information of interest to methodologists – such as interview time and number of interview attempts. For further information, Dori referred us to the ATUS website www.bls.gov/tus.
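To illustrate how the person and activity files described above might be used together, the sketch below links activity records to person records and computes average minutes per person for one activity. The field names and records are hypothetical, not actual ATUS variable names or data. The average is unweighted, whereas published estimates would presumably apply survey weights.

```python
from collections import defaultdict

# Hypothetical person file: one record per designated person.
persons = [
    {"person_id": 1, "employed": True},
    {"person_id": 2, "employed": False},
]

# Hypothetical activity file: one record per reported activity.
activities = [
    {"person_id": 1, "activity": "work", "minutes": 480},
    {"person_id": 1, "activity": "sleep", "minutes": 420},
    {"person_id": 2, "activity": "sleep", "minutes": 540},
    {"person_id": 2, "activity": "child care", "minutes": 120},
]


def mean_minutes(persons, activities, activity, employed):
    """Unweighted mean minutes in an activity for one employment group."""
    group = {p["person_id"] for p in persons if p["employed"] == employed}
    if not group:
        raise ValueError("no persons in group")
    totals = defaultdict(int)
    for record in activities:
        if record["person_id"] in group and record["activity"] == activity:
            totals[record["person_id"]] += record["minutes"]
    # Persons who did not report the activity contribute zero minutes.
    return sum(totals[pid] for pid in group) / len(group)


sleep_employed = mean_minutes(persons, activities, "sleep", employed=True)
work_not_employed = mean_minutes(persons, activities, "work", employed=False)
```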
The Status of the Census Bureau’s MAF/TIGER Program
Bob LaMacchia. U.S. Census Bureau.
Bob noted that an important Census 2010 goal is to equip the field staff with hand held computers with GPS capability – thus reducing the costly and cumbersome use of paper maps. The MAF/TIGER improvement project is critical to this objective. Because TIGER was developed in the 1980s from the old GBF/DIME files, the geocoding of addresses depends on the (not always accurate) address ranges identified for street segments. The objective is to make TIGER usable with GPS technology by improving its spatial accuracy—enabling point-polygon geocoding, in which latitude/longitude coordinates locate each address within polygons defined by the boundaries of census blocks.
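The idea of point-polygon geocoding can be sketched with a standard ray-casting test: an address coordinate is assigned to the block whose boundary polygon contains it. The block identifiers and coordinates below are made up, and the code treats longitude and latitude as plane coordinates, which is a simplifying assumption. It is not a description of how MAF/TIGER itself performs geocoding.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test; polygon is a list of (lon, lat) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]  # wrap around to close the ring
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            # Longitude at which the edge crosses that line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def geocode(lon, lat, blocks):
    """Return the id of the first block containing the point, else None."""
    for block_id, polygon in blocks.items():
        if point_in_polygon(lon, lat, polygon):
            return block_id
    return None


# Two hypothetical square blocks sharing a boundary along lon = 1.0.
blocks = {
    "block_A": [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)],
    "block_B": [(1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (1.0, 1.0)],
}
```

Because the shared boundary at lon = 1.0 stands in for a street centerline, a small error in that line's position could move a nearby address into the wrong block, which is why spatial accuracy matters for this approach.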
The project dates to a 2001 RFP, which resulted in a contract to Harris Corporation – described as an engineering firm with experience in telecommunications. Phase 1 of the project consisted of “fleshing out” the work that needed to be accomplished, and establishing accuracy requirements. For example, the project established an accuracy requirement of 7.6 meters for street centerlines, expressed as CE95 (circular error at the 95 percent confidence level). These centerlines are often the boundaries between blocks, and are therefore critical to the point-polygon geocoding of address coordinates.
Phase 2, begun in January 2003, is an effort to gather information with enhanced spatial accuracy. Bob noted that this phase “is not cheap,” as it involves the acquisition of state, local and tribal GIS files, and evaluations to determine if they meet or exceed accuracy requirements. Evaluations were conducted on a sample of intersection coordinates within each area. Where no local files are available, private sector files are considered. However, few of the private files meet the accuracy requirement, as many were targeted at an accuracy of 10-12 meters (a level sufficient for in-vehicle navigation applications). Some local files cannot be used because agreements preclude the kind of public-domain redistribution the Census Bureau requires.
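One way such an evaluation could work is sketched below: compare sampled intersection coordinates from a candidate file against reference coordinates, and check whether the 95th percentile of the horizontal errors is within the 7.6 meter requirement. The nearest-rank percentile, the use of projected coordinates in meters, and the function names are all assumptions for illustration. The summary does not describe the actual evaluation procedure.

```python
import math

CE95_LIMIT_M = 7.6  # accuracy requirement cited in the summary


def horizontal_errors(file_points, reference_points):
    """Distances in meters between paired (x, y) points in projected coordinates."""
    return [
        math.hypot(fx - rx, fy - ry)
        for (fx, fy), (rx, ry) in zip(file_points, reference_points)
    ]


def ce95(errors_m):
    """Nearest-rank 95th percentile of horizontal errors (meters)."""
    if not errors_m:
        raise ValueError("need at least one error value")
    ordered = sorted(errors_m)
    # Integer arithmetic for ceil(0.95 * n), avoiding float rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_requirement(file_points, reference_points, limit_m=CE95_LIMIT_M):
    """True if the sampled CE95 is within the limit."""
    return ce95(horizontal_errors(file_points, reference_points)) <= limit_m


# Made-up sample of 20 intersections: 19 are off by 5 m, one by 50 m.
reference = [(0.0, 0.0)] * 20
candidate = [(3.0, 4.0)] * 19 + [(30.0, 40.0)]
passes = meets_requirement(candidate, reference)
```

With a second 50 meter outlier in the same sample of 20, the 95th percentile would jump to 50 meters and the file would fail, so a few badly placed intersections could be enough to disqualify a file.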
Bob reported that so far, they have evaluated data for about 1,500 counties (with data believed to be good enough to evaluate), and that about one third have failed to meet the accuracy requirement. The rate of future progress depends on congressional funding, and Bob noted that work contributing to the 2010 census would have to be wrapped up by 2008.
Bob summarized the importance of TIGER improvements – noting they will feed the national spatial data infrastructure. The improved TIGER would better support the ACS, enhance geocoding accuracy, and contribute to the boundary and annexation survey, the definition of statistical areas, and the Census 2010 LUCA program.
New products include the 2003 TIGER/Line files (already released), 2004 TIGER/Line (due October 2004), and a 2004 TIGER/Line “second edition” (six months later). A second edition of this type has not been done before, and is an effort to provide a more timely release of recent enhancements. Bob encouraged us to check the Census website for updates and further details.
Concerns of COPAFS Constituencies
No concerns were raised, and the meeting was adjourned.
- Don Muff, Muff Consulting Services
- Richard Forstall, Association of American Geographers
- Patricia Becker, APDU/SEMCC
- Ken Hodges, PAA/Claritas
- Stephen Tordella, Decision Demographics
- Linda Jacobsen, APDU
- Dorothy Harshbarger, NAPHSIS
- Robert McGuckin, AEA
- William Barron, NORC
- Brian Harris-Kojetin, OMB
- Thomas E. Brown, IASSIST
- Douglas R. Thompson, Beyond 20/20 Corp.
- Seth Grimes, Alta Plana Corp.
- Stacey Fitzsimmons, Abt Associates
- Judie Mopsik, Abt Associates
- Dick Kulka, RTI International
- Maurine Haver, NABE
- Paul R. Zelus, AUBER
- Steve Wenck, Synectics
- Sarah Zapolsky (and Robert), AARP
- Susan Schechter, OMB
- Thomas Bolle, BTS
- Mary Moien, NCHS
- Jennifer D. Williams, CRS
- Phil Philbin, Booz Allen Hamilton
- Patricia Rozaklis, IBM
- Lee Herring, American Sociological Assn.
- Susan Liss, Federal Highway Admin.
- Ralph Rector, Heritage Foundation
- Nancy Bates, AAPOR
- Marilyn Seastrom, NCES
- Carolee Bush, AAPOR
- Margaret Martin, COPAFS
- Dan Levine, Westat
- Douglas Skuta, ESRI
- Rick Ayers, ESRI
- Ed Goldfield, CNSTAT/Census
- Felice Levine, AERA
- Susan Labin, ISR/Temple University
- Tim Tang, NEA
- Diane Herz, BLS
- Enrique Lamas, Census
- Larry Cox, NCHS
- Nanda Snoivasam, FHWA
- John Munyon, SeekData, Inc.
- Lu Jeppesen, SeekData, Inc.