Challenging Research Issues in Statistics and Survey Methodology at the BLS
Problem Statement: Mixed-Mode Survey Design. What Are the Effects on Data Quality?
Key Words: Mixed-mode, multiple mode, multi-mode, hybrid surveys, equivalence of instruments, dual frame surveys, data quality, survey error, nonresponse
Contact for further discussion:
Office of Survey Methods Research, PSB 1950
Bureau of Labor Statistics
2 Massachusetts Avenue NE
Washington, DC 20212
Telephone: (202) 691-7414
Fax: (202) 691-7426
Background and Definitions
As noted by de Leeuw (2005), mixed-mode surveys have been in use for a long time and have become the norm in some countries as survey managers seek to use collection procedures that produce the best possible data within existing constraints of time and budget.
"Modes" refer to approaches used either to contact or to obtain data from survey respondents. Some possible technologies that could be used in the data collection process are mail, phone, touchtone data entry, FAX, and the Internet. Modes may use the same basic technology but differ in how it is used. For example, questionnaires may be administered by an interviewer using the phone or completed by a respondent (self-administered) using the phone but with interactive voice response. Similarly, questionnaires on the Web or on a computer (laptop, tablet, pad, PC) could be administered by an interviewer or completed by a respondent without the aid of an interviewer. The decision depends on a complex interplay of factors that the survey manager must contemplate, including the complexity of the information being collected, the time burden of the interview, and the sensitivity of the data.
The mixing of modes and approaches seems to be limited at times only by the creativity of survey managers. For example, certain survey activities, such as prenotification and reminder messages, can be sent using one mode, but data collection might rely on a different, but single, mode. Or, the survey data can be collected using more than one mode (mixed-mode). For example, prenotification, reminder messages, and the initial questionnaire might be sent via mail and respondents given the option of reporting using mail, phone, the Internet, or some other approach.
Of the two basic approaches to data collection (single or multiple modes), the use of multiple modes is of most interest. As de Leeuw (2005) puts it, mixed-mode systems with unimode data collection appear to be a "win-win" situation: there does not appear to be a downside in terms of survey error. On the other hand, systems with more than one mode of data collection may increase the likelihood of measurement error because a survey question may be presented somewhat differently under different modes.
In addition to the varied use of different technologies, widely different organizational approaches may be implemented for collecting the data. For example, computer assisted telephone interviewing is typically conducted in centralized locations, but it could also be conducted in a decentralized manner (e.g., from the interviewers' homes). In fact, this occurs in surveys like the Current Population Survey (CPS), where 1st and 5th month interviews are typically done face-to-face, but subsequent interviews are commonly done from the interviewer's home using a laptop computer, which is, in essence, decentralized CATI 1.
Although survey managers have been very creative with their use of modes, there may be a hidden price in terms of data quality. According to Dillman and Christian (2003), evidence exists that survey mode can affect respondent answers to questions, even when questions are worded the same. As a result, they caution that differences observed between Time 1 and Time 2 may be due to mode changes, rather than to any actual differences in behavior or opinion. A good example of the impact of mode is the reporting of sensitive behaviors. Reporting rates and data quality differ substantially when self-administered and interviewer-administered modes of data collection are compared (Turner, Lessler, and Gfroerer, 1992). However, as with many methodological differences that affect attitude and opinion items or the reporting of potentially sensitive behaviors, it is not clear if the effects of using multiple data-collection modes will generalize to government establishment surveys, where the data are mostly factual in nature. Similarly, government household surveys that deal with topics such as work, education, and expenditures may be relatively immune to changes in data collection mode.
Mixed-mode survey approaches are widely used by the Bureau of Labor Statistics (BLS). For example, in addition to the use of mail, the Current Employment Statistics (CES) program, which collects data monthly, uses touchtone data entry (TDE), electronic data interchange (EDI), computer assisted telephone interviewing (CATI), FAX, and the Internet (Web), with the most recent addition being the Internet. However, the CES does not offer respondents a true choice of response mode. Instead, there is a hierarchy of reporting options that differ in cost. For example, if an establishment does not respond via mail or FAX, a follow-up call will be made using CATI. If respondents report satisfactorily via CATI for a pre-determined amount of time, then an attempt will be made to move them to lower-cost reporting options, such as TDE or the Internet (although respondents are not forced to make the move).
In many surveys, a basic assumption appears to be that offering multiple modes of reporting makes the reporting task easier for respondents, which will lead to higher response and better quality data. In addition, if respondents can be encouraged to use the more cost-effective modes, the costs of data collection can be significantly reduced, or at least better controlled 2. However, the question of whether offering concurrent, multiple modes of responding actually leads to higher response does not seem to have a clear answer. The evidence is much clearer that the use of sequential mixed modes (for example, conducting a telephone follow-up after an initial questionnaire mailing) does lead to improved response (de Leeuw, 2005).
From a research perspective, the impact of multiple modes on measurement error is also difficult to determine because choice of mode is often determined by the respondent either explicitly or implicitly (for example, by failing to respond via the desired mode). Therefore, self-selection can lead to differences that are confounded with the data-collection mode.
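The confounding described above can be illustrated with a small simulation. This is a minimal sketch under invented assumptions, not an analysis of any BLS data: the unit "trait", the selection probabilities, the mode effect of 2.0, and the function name estimate_mode_difference are all hypothetical. It compares a Web-versus-phone difference in means when units choose their own mode with the same difference when mode is assigned at random.

```python
import random
from statistics import mean


def estimate_mode_difference(n_units, mode_effect, self_selected, seed=0):
    """Return mean(Web reports) - mean(phone reports) for simulated units."""
    rng = random.Random(seed)
    web_reports, phone_reports = [], []
    for _ in range(n_units):
        # Hypothetical unit characteristic (e.g., size or technical capacity)
        # that shifts the true value being reported.
        trait = rng.gauss(0.0, 1.0)
        true_value = 100.0 + 10.0 * trait
        if self_selected:
            # Assumed: units with a higher trait are likelier to choose Web.
            p_web = 0.8 if trait > 0 else 0.2
        else:
            # Randomized assignment: mode is independent of the trait.
            p_web = 0.5
        uses_web = rng.random() < p_web
        noise = rng.gauss(0.0, 1.0)
        reported = true_value + (mode_effect if uses_web else 0.0) + noise
        (web_reports if uses_web else phone_reports).append(reported)
    return mean(web_reports) - mean(phone_reports)


TRUE_MODE_EFFECT = 2.0  # illustrative value, not an empirical estimate
naive = estimate_mode_difference(20_000, TRUE_MODE_EFFECT, self_selected=True)
randomized = estimate_mode_difference(20_000, TRUE_MODE_EFFECT, self_selected=False)
print(f"true mode effect:          {TRUE_MODE_EFFECT:.2f}")
print(f"self-selected comparison:  {naive:.2f}")
print(f"randomized comparison:     {randomized:.2f}")
```

Under these assumptions the self-selected comparison overstates the mode effect, because it also picks up the difference between the kinds of units that choose each mode. The randomized comparison stays close to the value built into the simulation.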
New data collection modes are usually welcomed by survey managers, especially when they offer the opportunity to reduce costs or to improve the timeliness of data. One of the most recent, significant innovations is the Internet, and federal statistical agencies are increasingly attempting to move surveys to the Internet, or to offer it as a reporting option. Since a significant number of business respondents report for more than one BLS survey, BLS offers a common portal or gateway into its Internet reporting website, called the "Internet Data Collection Facility" or IDCF. In addition to providing a secure common gateway, the IDCF requires that all survey applications meet internal standards for graphical user interfaces so that on-line questionnaires have the same look and feel.
As previously noted, the Current Employment Statistics (CES) program uses a variety of reporting modes, including mail, phone, FAX, and the Internet. Because of the low cost and improved timeliness of Internet reporting, there has been a great deal of interest in encouraging increased use of this mode within the CES. Recent research conducted by Rosen and Gomes (2004) explored the following questions:
- Will Web-eligible TDE (touchtone data entry) respondents be willing to switch to Web reporting?
- If respondents switch to the Web, how will the conversion affect response rates?
- What is the most cost-effective method (telephone, fax, or mail) for contacting and converting respondents from TDE to Web?
- Which security option do respondents prefer (account number/password or digital certificate)?
In a test conducted in April 2004, a sample of 3,000 TDE respondents was contacted using one of three contact/conversion methods (phone, FAX, or mail), with 1,000 respondents assigned to each method. Seventy-four percent of the TDE units responded. All those who agreed to report using the Web received their initial Web account information by mail.
Since accessibility to the Web has long been of interest, it is worth noting that at the time of this study 71 percent of the TDE respondents met the criteria imposed for reporting via the Web (i.e., having access to the Internet and e-mail at their desk, and currently using Internet Explorer 6.0 or higher) 3. Of those meeting the eligibility criteria for using the Web, 89 percent reported that they wanted to switch to Web reporting, and 90 percent chose the account/password option. However, only 77 percent activated their accounts, and only 59 percent actually reported data via the Web.
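The percentages above can be chained to gauge what share of responding TDE units ended up reporting via the Web. The text does not state the base for the 77 and 59 percent figures, so the sketch below works through two possible readings, both of which are assumptions: in reading A each percentage is conditional on the preceding stage, and in reading B the activation and reporting figures both take as their base the respondents who wanted to switch. The helper name chain is hypothetical.

```python
def chain(rates):
    """Multiply conditional rates to get a share of the starting group."""
    share = 1.0
    for rate in rates:
        share *= rate
    return share


ELIGIBLE = 0.71   # met the Web eligibility criteria
WANTED = 0.89     # of those eligible, wanted to switch to Web reporting
ACTIVATED = 0.77  # activated their Web accounts (base assumed, see above)
REPORTED = 0.59   # actually reported via the Web (base assumed, see above)

# Reading A (assumption): every rate is conditional on the stage before it.
reading_a = chain([ELIGIBLE, WANTED, ACTIVATED, REPORTED])

# Reading B (assumption): 59 percent is a share of those who wanted to switch.
reading_b = chain([ELIGIBLE, WANTED, REPORTED])

print(f"Reading A: {reading_a:.1%} of responding TDE units reported via Web")
print(f"Reading B: {reading_b:.1%} of responding TDE units reported via Web")
```

Under either reading, well under half of the responding TDE units ended up reporting via the Web, even though most eligible respondents said they wanted to switch.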
Another important finding was that offering the Web to TDE respondents hurt response rates. The potential reduction in response rate during the first few months of transition was about 8 percentage points in this study. Moreover, extensive follow-up procedures were needed to ensure respondents activated their Web accounts, and the response rate for the group that chose the digital certificate security option was about 12 percentage points lower than that of the group that chose the account/password approach. As it turned out, FAX was the most cost-effective contact method when converting respondents from TDE to Web reporting.
In summary, the use of multiple modes of data collection is a trend that appears to be increasing, rather than decreasing. For example, in the Consumer Expenditure Quarterly Interview Survey (CEIS), which was designed to be done by personal visit and is currently done using CAPI, the survey procedures were initially established so that a telephone interview was supposed to be a rarity, to be done only in unusual situations, for example, when a respondent demanded it. A face-to-face interview was chosen because the length and complexity of the questionnaire seemed to demand that a skilled interviewer be present to encourage response and to support the collection of high-quality data, for example, by encouraging the respondent to refer to records (an average interview lasts about 60 minutes, and sample units are interviewed every three months for a total of five times). However, a recent paper by McGrath (2005) revealed that about 42 percent of the interviews are currently being done by phone (decentralized CAPI), with unknown effects on data quality.
As noted previously, with establishment surveys the effects of using multiple modes are assumed to be benign because the surveys tend to be much shorter than household surveys and the questions asked are factual in nature. Still, the validity of this assumption remains untested.
Issue: What impacts do alternative data-collection modes have on data quality?
- Is there any evidence that different data collection modes result in data of differing quality for BLS surveys? Are there identifiable biases associated with different modes?
- As survey managers increasingly encourage respondents to use the Internet or other modes, what approaches to sample design and evaluation can be taken so that survey managers are able to measure the bias associated with different modes?
- Are there methodological conditions or procedures that are correlated with biased data? For example, do differences in questionnaire design among modes (phone vs. Web) lead to biased data?
- What steps can BLS managers take to measure and reduce bias?
References
de Leeuw, E. D. (2005). To Mix or Not to Mix Data Collection Modes in Surveys. The Journal of Official Statistics, 21(2), 233-255.
de Leeuw, E. D. (1992). Data Quality in Mail, Telephone, and Face-to-face Surveys.
Dillman, D. A. and Christian, L. M. (2003). Survey Mode as a Source of Instability in Responses across Surveys. Revised version of a paper presented at the Workshop on Stability of Methods for Collecting, Analyzing and Managing Panel Data, American Academy of Arts and Sciences, Cambridge, MA March 27, 2003. Forthcoming in the journal, Field Methods.
McGrath, D. (2005). Comparison of Data Obtained by Telephone Versus Face to Face Response in the U.S. Consumer Expenditures Survey. Paper presented at the Joint Statistical Meetings, Minneapolis, Minnesota.
Rosen, R. and Gomes, T. (2004). Converting CES Reporters from TDE to Web Data Collection. Paper presented at the Joint Statistical Meetings, Toronto, Canada.
Turner, C., Lessler, J., and Gfroerer, J. (1992). Survey Measurement of Drug Use: Methodological Studies. Washington, DC: National Institute on Drug Abuse.
1 About 15 percent of cases are also done using CATI (Computer Assisted Telephone Interviewing) from centralized facilities.
2 Although achieving lower costs by encouraging respondents to use the Web is a goal, a certain number of respondents must voluntarily use the Web before cost efficiencies can be realized. However, lower response on the Web has frustrated meeting this goal as of mid-2005 (personal communication with the CES program manager, Richard Rosen).
3 Access is expected to continue to increase over time.
Last Modified Date: January 06, 2006
Last Modified Date: July 19, 2008