Minutes of the September 12, 2008 COPAFS Meeting
Ed Spar: Executive Director’s Report.
Ed Spar started the meeting with his Executive Director’s report, noting expectations that the 2009 budgets will not be finished on time, and rumors that they could be as late as March. However, there is awareness of the increase needed for 2010 census activities, and support for a Census Bureau anomaly in continuing resolutions. In contrast, the NCHS budget is a problem again, and there are plans for possible cuts in the Health Interview Survey, National Health and Nutrition Examination Survey, and vital statistics.
Commenting that it is once again “silly season” for the census, Spar said legislation will be proposed to require that the 2010 census count only legal U.S. residents, but that it will go nowhere. He then directed our attention to a document identifying the milestones and dates for the 2010 census process. The document was requested by the 2010 Census Advisory Committee, and has been posted on the COPAFS website. Spar also reported that the COPAFS Board has voted to upgrade the COPAFS website.
Spar noted the recent passing of Calvin Beale, and Katherine Smith of the Agriculture Department’s Economic Research Service made remarks recalling Cal and his career.
Looking ahead, Spar noted that a metropolitan area definitions conference is in the works. Issues include combined statistical areas, and the frequency with which areas should be defined – given the annual commuting data that the ACS will provide. Spar also recommended a presentation on the federal statistical system that former Census Director Louis Kincannon will make October 28 at the Department of Agriculture. He also said we can watch for legislation that Commerce, Treasury and Labor are working on that would promote data sharing (or “synchronization”) objectives.
The final 2008 COPAFS quarterly meeting is December 5, and dates for 2009 are March 6, June 5, September 11 and December 4.
An Update From the Economic Research Service.
Katherine Smith. Economic Research Service.
Administrator Kitty Smith explained that ERS is the economic research and policy analytical arm of the Department of Agriculture, providing economic information on topics related to food, agriculture, the environment, and rural America. Their mission is to anticipate issues related to these topics, and have information ready when policy decisions need to be made. ERS collects some of its own data, but also acquires data from other sources.
The Agricultural Resource Management Survey (ARMS) gathers information on farming practices and farm households. The survey provides a basis for calculating state and national farm income, provides weights for the Prices Paid by Farmers Index, and is used in developing the Report on the Status of Family Farms. ARMS also supports the USDA estimates and projections of farm income. Smith said ERS is proud of how it makes its data accessible – including customized data summaries on the ERS website, and limited access to microdata. She noted that data can now be accessed remotely using the NORC Data Enclave (see March 7, 2008 COPAFS meeting), but only for formal research on issues of interest to USDA and by researchers pledged to confidentiality.
Turning to data that ERS constructs, Smith described the Rural Definitions Data Product, in which ERS summarizes and provides guidance on the various definitions of “rural.” She also described their work with Food Security Measurements (the economic ability of people to consume adequate amounts of food), productivity measures for the agricultural sector, and major land uses (1945-2006, and including Population Interaction Zones for Agriculture, identifying areas most likely to be influenced by urban expansion). They also provide a Foodborne Illness Cost Calculator – an interactive tool on the ERS website that enables users to calculate cost estimates based on alternative assumptions.
ERS also purchases data from other sources, including supplements to existing surveys such as NHANES, the American Time Use Survey, and the Early Childhood Longitudinal Survey. They also acquire Nielsen HomeScan data – which provide large volumes of information on food purchases and prices.
Current data-intensive research projects are devoted to broadband Internet use (or absence) in rural America, food and agricultural security (threats to the food system), and estimating the relationship between food prices and health outcomes. Issues include the quality of the proprietary data they acquire, definitional consistency across datasets, data sharing, and opportunities for collaboration. Smith commented that ERS has been too isolated in the past, and needs to work more with other agencies.
Asked if ERS is doing research related to fuel production through agriculture, Smith confirmed that they are doing much work in this area of recent and intense interest – so much that she quipped that she has become tired of the topic. Asked if they have collaborated with FDA (in light of the recent tomato scare), Smith noted with regret that there has been insufficient coordination on foodborne illnesses.
Data Confidentiality Approaches with National Statistical Organizations.
Brian Garrett. Space-Time Research.
Garrett explained that Space-Time Research (STR) is an Australian software company, started in the mid-1980s, whose products perform “query-answer-query” data tabulations for analyses by a variety of users. Most of their clients are statistical agencies outside the U.S., but the U.S. Census Bureau is a major client, as STR software was used to build tables for the 2000 census, and will be used for 2010 as well.
Garrett’s presentation focused on confidentiality issues faced by some of these agencies. He defined “disclosure control” as methods to reduce the risk of disclosing information on persons, businesses or other organizations. The problem is common to agencies around the world, and he described two basic approaches to addressing it.
The first approach is to change cell contents – rounding or otherwise perturbing cell values. Random rounding is an example of this approach – for example, rounding cell values to the nearest multiple of 3. This approach raises issues about the impact on row and column totals. The second approach is to conceal or suppress some cell contents where numbers are very small. This approach requires “consequential suppression,” or the concealing of non-sensitive data to prevent disclosure by inference.
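As an illustration of the two approaches (not STR's implementation, which was not described in detail), the following minimal sketch assumes a common unbiased variant of base 3 random rounding, in which a count is rounded up or down to an adjacent multiple of 3 with probabilities chosen so the expected value is unchanged. The function names `random_round` and `infer_suppressed` and the sample counts are hypothetical. The second function shows why a single suppressed cell can be recovered from a published total, which is the inference that consequential suppression is meant to block.

```python
import random


def random_round(value, base=3, rng=random):
    """Unbiased random rounding of a non-negative count to a multiple of `base`.

    Assumed variant: the result is one of the two adjacent multiples, and
    the probability of rounding up is remainder/base, so the expected
    value of the rounded count equals the original count.
    """
    remainder = value % base
    if remainder == 0:
        return value  # counts already on a multiple of the base are unchanged
    lower = value - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


def infer_suppressed(row, row_total):
    """Recover a suppressed cell (marked None) from a published row total.

    Works only when exactly one cell in the row is suppressed; with two or
    more suppressed cells, subtraction alone cannot recover the values.
    """
    hidden = [cell for cell in row if cell is None]
    if len(hidden) != 1:
        return None
    return row_total - sum(cell for cell in row if cell is not None)


# Hypothetical counts for one table row.
rng = random.Random(0)
cells = [1, 4, 7, 12]
rounded_cells = [random_round(c, rng=rng) for c in cells]
rounded_total = random_round(sum(cells), rng=rng)

# One suppressed cell is exposed by the total; two are not.
exposed = infer_suppressed([None, 4, 7, 12], 24)
protected = infer_suppressed([None, None, 7, 12], 24)
```

Because each cell and the total are rounded independently in this sketch, the rounded cells need not sum to the rounded total, which is the row and column total issue noted above.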
In designing disclosure control systems, Garrett said they strive to provide a maximum amount of data, while providing maximum protection, with a system that is easy to configure, and provides good performance.
Garrett then reviewed some of the approaches taken by some of STR’s major clients.
Statistics South Africa uses graduated rounding, in which the amount of adjustment is increased for smaller cell values. The Office for National Statistics, U.K. uses several methods, but does not disclose what they are – although they acknowledge using “controlled rounding” and suppression in some products (controlled rounding being an approach that allows one to preserve original row and column totals). The Australian Bureau of Statistics also uses graduated rounding, and has a Table Builder product that enables users to produce tables from microdata. This product has data perturbation algorithms that ensure data consistency while guarding against disclosure. Statistics New Zealand uses base 3 random rounding and a set of “business rules” that limit how ambitious tabulations can be in terms of things like mean cell size.
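A business rule based on mean cell size might be sketched as follows. This is only an illustration of the idea: the function names and the threshold of 2.0 are hypothetical, and the actual rules and thresholds used by Statistics New Zealand were not given in the presentation.

```python
def mean_cell_size(table):
    """Average count per cell for a table given as a list of rows."""
    cells = [count for row in table for count in row]
    if not cells:
        raise ValueError("table has no cells")
    return sum(cells) / len(cells)


def passes_mean_cell_size_rule(table, minimum=2.0):
    """Return True if the requested tabulation is not too sparse.

    `minimum` is a hypothetical threshold chosen for illustration.
    """
    return mean_cell_size(table) >= minimum


# Hypothetical requests: a coarse table and a much sparser one.
coarse = [[40, 35], [28, 52]]
sparse = [[1, 0, 2, 0], [0, 1, 0, 0], [3, 0, 0, 1]]

coarse_ok = passes_mean_cell_size_rule(coarse)
sparse_ok = passes_mean_cell_size_rule(sparse)
```

A rule of this kind rejects a request before any table is produced, so it limits how finely users can cross-classify the data rather than altering the cell values themselves.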
Disclosure is a sensitive topic for many COPAFS attendees, so there was much discussion of disclosure avoidance methods, and the tradeoffs between confidentiality and dissemination objectives. Garrett stressed that his company’s job is to implement the methods specified by its clients, not to evaluate how effective or appropriate the methods are.
Garrett reiterated that the U.S. Census Bureau uses STR products in DADS and DADS-II, and applies a variety of measures and business rules during pre-tabulation and post-tabulation. It was noted that the American Community Survey suppresses data based on tests for statistical accuracy, and when asked if he had seen such checks applied in other countries, Garrett said he had not.
August 26, 2008 Release of the Income, Earnings & Poverty Data from the Current Population Survey and the American Community Survey.
David Johnson. U.S. Census Bureau.
In a modified version of his presentation for the August 26 release of the poverty, income and health insurance coverage estimates, Johnson noted that poverty and income data are now provided by both the CPS and ACS. The CPS remains the official source for national level estimates, but its sample size is limited so the ACS is the source for state and sub-state areas – as well as for comparisons of state estimates to the national level. Data from the two sources are not to be mixed, but the two sources can be used to get the most complete picture of recent patterns.
Johnson first reviewed the household income estimates, where the (CPS) national median now exceeds $50,000. He then turned to ACS in reviewing the state estimates, but noted that the national median is similar for both CPS and ACS. His review of metropolitan/micropolitan area estimates, and even those for principal city versus suburban areas, illustrated the ability of the ACS to provide estimates for smaller areas.
Johnson provided a similar review of the poverty estimates. As with income, the patterns were not surprising, but the provision of ACS estimates for metro areas and principal cities (vs. suburbs) highlighted long-standing issues with the use of uniform poverty thresholds across the nation. Johnson acknowledged the many possible alternative measures, and explained that the ACS will provide data enabling greater flexibility in measuring poverty. He also described an alternative poverty table creator – a tool on the Census Bureau website that allows users to calculate poverty rates based on alternative poverty definitions.
Turning to estimates of earnings and then health insurance coverage, Johnson explained that the ACS only started collecting health insurance data this year, so the CPS is still the only source for now. In the meantime, state estimates are provided from the CPS data, but require multiple years of sample – similar to the approach ACS will take with much smaller areas.
Asked what would happen if the CPS and ACS data told fundamentally different stories, Johnson cautioned that CPS and ACS estimates should be expected to differ – due to differences in question wording, income measures, reference periods, and family definitions. Even so, he noted that the national estimates and trends have been quite similar.
Throughout his presentation, Johnson cited specific numbers and trends from the recent release, but encouraged attendees to get the full report online.
Update on 2007 American Community Survey Data Releases and Education Plans for Understanding Multiyear Estimates.
Susan Schechter, Douglas Hillmer and Deborah Griffin. U.S. Census Bureau.
Susan Schechter started with a brief ACS update, saying they will be back for the December 5 COPAFS meeting to present on multi-year estimates, which then will be just days away from the December 9 release. The current focus is on the 2009 budget. In 2008, ACS funding was adequate for data collection, but the methods panel was lost, so it is encouraging that the 2009 request includes funding for the methods panel.
The Bureau is also managing changes to the ACS questionnaire. Interviewer focus groups confirm that the 2008 questionnaire (with its new questions) is working well, and two new questions are planned for 2009 – field of (bachelor’s) degree and duration of vacancy. Both questions have been approved by OMB, but with collection starting in 2009, we will not see ACS estimates until 2010.
Doug Hillmer provided an overview of the 2008 data releases, which are providing a wide range of 1-year 2007 estimates for areas of 65,000 or more population, and the first release of 3-year estimates for areas of 20,000 or more population. Because group quarters data were not collected in 2005, weighting adjustments have been made for the 3-year group quarters data.
Geographic areas for multi-year estimates will reflect the final year of the relevant period. For example, geography for the 2005-2007 3-year estimates will be as of January 1, 2007. Areas provided will include states, counties, congressional districts, PUMAs, metro/micropolitan areas, Native American areas, cities, and towns. Data also will be provided for non-metropolitan portions of states.
Looking at “what’s new and different,” Hillmer pointed to expanded Population Profiles (now with country and region of birth), the return of Comparison Profiles (that compare 2006 and 2007 estimates), and some new detailed tables on existing topics (for example, on residence 1 year ago). Hillmer also noted that they have decided to archive ACS data from the test phase – with these data no longer being available on American FactFinder.
Debbie Griffin described the “ACS Compass Products” which will be available soon, and are designed to educate users about the ACS. First will be a set of 12 handbooks tailored to specific user groups. Ten of these were contracted out and the other two produced in-house at Census. The handbooks will include material on ACS basics, but presented differently for the target groups. Second, a set of “train the trainer” materials will consist of fully scripted PowerPoint presentations for use by the Census Bureau, and also by “data intermediaries” for use in their training efforts. Third, Griffin described an “E-Learning ACS Tutorial” that will walk users through both basic and advanced ACS topics, and include a broad set of case studies drawn from the user handbooks. The Bureau also plans a web page providing guidance on comparing ACS data as well as guidelines for using multi-year estimates. The Bureau hopes to have all of these materials available on the website before the first multi-year estimates are released in December.
Next steps for ACS educational materials include soliciting user feedback on ACS Compass products, and assessing the need for additional materials. It is expected that the user handbooks will be updated based on user feedback and will provide new case studies based on actual ACS data. The extended Q and A session that followed underscored the importance of the ACS educational outreach effort.
Concerns from COPAFS Constituencies.
No issues were raised, and the meeting was adjourned.