Once a Satisficer, Not Always a Satisficer - Evidence from a Trap Question Experiment

Sebastian Lundmark, Department of Political Science, University of Gothenburg. Sebastian.Lundmark@gu.se
Johan Martinsson, Department of Political Science, University of Gothenburg. Johan.Martinsson@pol.gu.se

Abstract

Satisficing is a well-known data quality problem in all types of survey research. However, web surveys enable the researcher to monitor, and intervene in, the respondents’ response process more promptly than before. In this research note, we aim to detect strong satisficing behavior early in surveys to explore whether this behavior continues throughout the survey. In addition, we explore the possibility of remedying such behavior by trying to catch the respondents’ attention and to invoke altruistic motivation among the satisficing respondents. We detect satisficers by introducing an “awareness control” or “trap” question that has the same outline as other attitude questions but, instead of asking for a response on an attitude, requests the respondent to return a specific answer (e.g. “2”) on the specified scale. We also introduce two questions asking the respondent to motivate two responses they have given previously in the survey, with the aim of increasing the respondents’ attentiveness to the questions and perhaps decreasing satisficing behavior. Using a 3x2 full factorial survey-embedded experiment on 1,954 respondents, this study shows that introducing a trap question that tries to invoke altruistic motivation does not increase data quality, and neither does asking the respondents to be more attentive by motivating their answers. Most importantly, in contrast to previous belief, poor attentiveness and poor data quality among respondents at one part of the survey do not predict poor attentiveness or data quality later in the survey.

Sebastian Lundmark is a Ph.D. candidate at the Department of Political Science at the University of Gothenburg, Sweden. His dissertation focuses on the measurement of generalized trust and how to increase data quality in surveys.
Johan Martinsson is an Associate Professor at the Department of Political Science at the University of Gothenburg, Sweden. His research interests are economic voting, public opinion, and survey methodology. Johan is also the director of the Laboratory of Opinion Research (LORE) at the University of Gothenburg.

Keywords: Satisficing, Awareness Control, Instructional Manipulation Check, Web Survey

Introduction

Satisficing is a well-known data quality problem in all types of survey research (Tourangeau 1984; Krosnick 1991). In a new digital era, the introduction of web surveys has generated great opportunities to control and collect information on how respondents answer questions, how much time they devote to each question, and even to track eye movements, measuring which words the respondent actually read. Unfortunately, however, contemporary online behavior may multiply the sources of interruption (e.g. social media, online games, chatting) and thus interfere more with the response process than in other survey modes such as telephone or face-to-face interviews. At the same time, the technological advancements of recent years have enabled researchers to monitor, in a cost-effective way, and experiment with how to increase data quality and how to identify respondents who are satisficing. One proposed way is to introduce “instructional manipulation checks” (IMCs) or “awareness controls” that aim to sort out respondents who are satisficing and providing poor data quality, or to ensure that the respondent has read the full instructions in, for example, scenario experiments (Kapelner and Chandler 2010; Oppenheimer et al. 2009). When these awareness controls are used as “gatekeepers” to sort out respondents who do not answer as attentively as the researcher wants, the survey practitioner assumes that this satisficing behavior is perpetual throughout the entire survey within each respondent.
Oppenheimer and colleagues (2009) argue that respondents who fail an IMC show no significant difference from those who succeed when data quality is evaluated after the IMC. They also claim that introducing such a check will make the failing respondents (when forced to retake the question until they succeed) as good as those respondents who succeeded on their first try. However, since they evaluate this only after the introduction of the IMC, Oppenheimer and colleagues (2009) cannot conclude whether these respondents actually became better because of the manipulation check, since they do not compare how the failing respondents behaved prior to the experiment. Perhaps the respondents’ failure on the manipulation check was only due to a momentary or random lack of attention or motivation? In this research note, we introduce an awareness control that mimics the outline of other attitude questions but instead asks the respondent to give a specific answer (i.e. 2) on the rating scale. By doing so, in contrast to other manipulation checks (e.g. Oppenheimer et al. 2009; Kapelner and Chandler 2010), we can identify the satisficing respondents as well as which satisficing strategy those respondents chose to use (e.g. acquiescence or opting out). In addition, we include items that allow us to compare data quality both before and after the awareness control, and we can thus safely conclude whether the control increases data quality or helps predict future satisficing behavior among the respondents. Furthermore, we introduce two questions that ask respondents to motivate some of their previous answers, in an attempt to increase data quality even further. Our results show that, contrary to previous beliefs (Kapelner and Chandler 2010; Oppenheimer et al. 2009), introducing awareness controls or asking respondents to motivate a previous answer does not increase data quality. Failing an awareness control is not associated with poor data quality or satisficing behavior prior to the control, nor does it help predict satisficing behavior later in the survey.
On the contrary, poor attentiveness or data quality among respondents at one part of the survey does not necessarily mean poor attentiveness or data quality later in the survey.
Data and Design

The experiment was embedded in a panel survey conducted during the spring of 2014 by the Laboratory of Opinion Research (LORE) at the University of Gothenburg, Sweden (see Martinsson et al. Forthcoming). The panel consists of a mixed probability/non-probability sample of respondents participating in a citizen panel focused on experiments. This panel wave measured political attitudes and covered a wide range of questions. The embedded experiment is a 3x2 full factorial design, and all groups are presented in Figure 1. There were no statistically significant differences in dropout rate between the six groups (see Online Appendix A. Sample Description); hence, introducing our awareness control and asking the respondents to motivate some of their answers did not create enough discontent among the respondents to make them abandon our survey.

Awareness Control: The awareness control that we use mimics the outline of the other questions and aims to sort out the respondents who satisfice and do not read the entire question. In one of the experimental groups, the respondents who fail the awareness control are notified that they failed to read the entire question and are asked to answer the question again, this time hopefully more attentively. The awareness control is worded like this: “In the public debate a lot of opinions and proposals exist. In our surveys we often ask about just these sorts of opinions and proposals. These kinds of questions are often followed by a scale by which we ask you to rate to what extent you agree to a certain proposition. The following question serves the purpose to guarantee the quality of our surveys. We therefore ask you to read this whole question thoroughly and, if you read the whole question, answer number two on the following scale.”
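To make the detection logic concrete, here is a minimal sketch of how responses to such a trap question could be scored and sorted into the two satisficing strategies the paper distinguishes, acquiescence and opting out. This is not the authors’ code; the function, variable names, and example data are our own illustration.

```python
# Sketch (illustrative, not the study's code): scoring a trap question that
# asks for "2" on a 1-7 agree/disagree scale.

CORRECT_ANSWER = 2   # the answer the trap question requests
SCALE_MAX = 7        # 7 = the "strongly agree" end of the scale
MIDPOINT = 4         # the neutral mid-point of the scale

def classify_trap_response(answer: int) -> str:
    """Label a trap-question response as passed, or by likely satisficing strategy."""
    if answer == CORRECT_ANSWER:
        return "passed"
    if answer == SCALE_MAX:
        return "acquiescence"   # agreeing although there was nothing to agree to
    if answer == MIDPOINT:
        return "opt-out"        # retreating to the neutral midpoint
    return "other"

# Hypothetical responses from seven respondents
responses = [2, 2, 7, 4, 2, 7, 5]
labels = [classify_trap_response(r) for r in responses]
fail_rate = sum(label != "passed" for label in labels) / len(labels)
```

With this classification in hand, failure rates and strategy shares per experimental group can be tabulated directly, as in Tables 1 and 2.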
Motivation: To further engage the respondents’ attention, we ask them two “motivational” questions, asking them to motivate an answer they gave previously in the survey. The two questions were worded like this:

First Motivational Question: “On the previous page you answered a question whether you like or dislike political party leaders in the Swedish Parliament. On that question you answered that you generally [like/neither like nor dislike/dislike] [party leader name]. We would like to know more about your attitude towards [party leader name]. Please describe, using your own words, the main reasons why you think this way.”

Second Motivational Question: “On the previous page you answered a question on how the Swedish economy has changed over the last 12 months and whether that is first and foremost a product of the government’s politics or mainly of other factors. You answered [number] on a scale ranging from 1 to 7, where 1 means that the change is fully due to other factors and 7 means that it is fully a product of the government’s politics. Please describe, using your own words, the main reasons why you think this way.”

Most of the respondents gave a written answer to these questions: 88.6% for the first and 85.0% for the second question. The six groups analyzed in this research note are presented in Figure 1.

Results

In the upcoming analysis we first (1) present the proportion of respondents failing the awareness control, which satisficing strategy the failing respondents chose, and whether notifying respondents that they failed makes them succeed. Next, (2) we investigate whether the awareness control and asking the respondents to motivate their answers can actually help decrease satisficing behavior, by comparing the experimental groups to the control group in terms of data quality. Lastly, (3) we test whether those respondents failing the awareness control were actually satisficing also early in the survey.

As measures of data quality, we first compare the probability to straight-line, i.e. non-differentiation between responses in grid questions (Cole et al. 2012). We use two definitions of straight-lining: a weaker definition where the respondent only has to straight-line on one grid question, and a more conservative definition where the respondent must have straight-lined on all three grid questions included in the survey. Second, we evaluate data quality in terms of concurrent validity, i.e. “the degree to which a given measure can predict other variables to which it should be related” (Krosnick and Fabrigar 1997, 143).1 In other words, if one experimental group shows a stronger relationship between factors that are theoretically or logically related to each other, it would suggest that that experimental group is less prone to satisficing behavior, since satisficing behavior would produce more random or noisy data. If data quality is increased and satisficing behavior decreased by our manipulations, we should see a lower probability to straight-line and higher concurrent validity among the groups receiving the awareness control, as well as among the respondents asked to motivate their answers.

1. Descriptive Statistics of the Awareness Control

Before venturing into the data quality discussion, we start by presenting the descriptive statistics of the awareness control. The results are presented in Table 1. Although the question was asked quite early in the survey, only 92% of the respondents answered the question correctly while 8% failed. Among those who failed, two clear satisficing strategies are evident. First, the largest proportion put themselves at the right-hand side of the scale, answering “strongly agree”, even though there was nothing to agree to.
This is clear evidence of acquiescence bias, which is a common satisficing strategy (McClendon 1991). The second most common failing answer was the middle point of the scale. Choosing a mid-point alternative is also a common satisficing strategy, whereby the respondent opts out of giving an answer that is more likely to be questioned (Krosnick 1991; 1999; Alwin and Krosnick 1991). The remaining response alternatives were uncommon among the failing respondents, with almost none choosing them.

Footnote 1: To measure concurrent validity we included two pairs of related concepts in the survey: the first two factors measured willingness to act on environmental issues and the second two measured support for taxes. For a detailed description of these items, see Online Appendix B. Operationalizations.

When investigating the follow-up question (see Table 2) that the respondents in the “failure notification” condition received after being notified that they had failed our awareness control, only 47% now succeed on the question while 53% are still not attentive enough (or too reluctant) to give the correct answer. This means that over half of those who fail remain inattentive or oblivious even after receiving a notification asking them to focus and read the whole question. Evident from this question as well is that both acquiescence and opting out are present among the failing respondents. We therefore conclude that our awareness control does seem to catch and clearly identify respondents using traditional satisficing strategies. It would therefore be plausible to hypothesize that, if previous findings on awareness controls generalize to our question, those receiving the awareness control should be less prone to satisficing behavior after receiving it.

2. Does the Awareness Control Decrease Satisficing Behavior?

To examine whether our awareness control actually helps decrease satisficing behavior, as well as helps predict satisficing behavior both early and late in the survey, the next section analyzes the impact that the awareness control and asking respondents to motivate previous answers had on data quality. We start by investigating whether receiving our manipulations decreases satisficing behavior later in the survey.
Table 3 presents each group's probability of satisficing through both weak and strong straight-lining (non-differentiation).
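The two straight-lining definitions used here can be sketched in a few lines of code. This is a hypothetical illustration under our own naming; the data are made up and each inner list stands for one respondent's answers to one grid question.

```python
# Sketch (illustrative): weak vs. strong straight-lining flags for one respondent.

def straight_lined(grid: list[int]) -> bool:
    """True if every item in a grid question received the same response."""
    return len(set(grid)) == 1

def straight_line_flags(grids: list[list[int]]) -> tuple[bool, bool]:
    """Weak flag: straight-lined on at least one grid question.
    Strong flag: straight-lined on all grid questions."""
    per_grid = [straight_lined(g) for g in grids]
    return any(per_grid), all(per_grid)

# One respondent's answers to three grid questions
respondent = [[3, 3, 3, 3], [1, 5, 2, 4], [3, 3, 3, 3]]
weak, strong = straight_line_flags(respondent)  # weak=True, strong=False
```

The weak and strong flags then serve as the binary outcomes for the logistic regressions reported in Table 3.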
In Table 3, no significant decrease in the probability to straight-line could be found when comparing the groups receiving our manipulations to the control group. Hence, neither receiving the awareness control nor being asked to motivate previous answers decreases satisficing behavior later in the survey. Turning to concurrent validity, Figure 2 presents regression coefficients comparing the difference in validity between the control group and the experimental groups. Here, too, no experimental effect could be found: introducing an awareness control and asking respondents to motivate previous responses did not decrease satisficing behavior or increase data quality. In addition, we also investigated whether removing those who fail the awareness control from the data increased data quality, as well as whether those respondents who failed the control the first time but succeeded the second time started to produce better data than those who failed it twice. However, removing the failing respondents did not increase data quality, and neither did making failing respondents succeed on the awareness control. This stands in contrast to what Oppenheimer et al. (2009) found in their study. For the results from these analyses, see Online Appendix C. Additional Data Quality Analyses.

3. Does Failing the Awareness Control Mean Satisficing Before the Awareness Control?

This non-difference could, however, stem from the possibility that respondents who received the awareness control, or were asked to motivate some previous answers, actually were satisficers before the manipulation and, upon receiving our manipulations, became as good as those who were always attentive. Table 4 presents a comparison of concurrent validity between those respondents who failed and those who succeeded on the awareness control.
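For intuition, the concurrent-validity comparison amounts to asking whether the relation between two items that should be related is weaker in one group than in another. A minimal sketch of that comparison follows, using a plain least-squares slope per group rather than the full interaction model of Table 4; the data are invented for illustration and are not from the study.

```python
# Sketch (illustrative data): comparing concurrent validity between those who
# failed and those who passed a trap question, as the slope of one item on a
# theoretically related item.

def slope(x: list[float], y: list[float]) -> float:
    """Ordinary least-squares slope of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical item pairs (x = attitude item, y = related item) per group
failed_x, failed_y = [1, 2, 3, 4, 5], [5.9, 5.3, 4.5, 3.9, 3.1]
passed_x, passed_y = [1, 2, 3, 4, 5], [6.0, 5.2, 4.6, 3.8, 3.2]

# A near-zero difference in slopes would mean failing the trap question does
# not signal weaker concurrent validity elsewhere in the survey.
diff = slope(failed_x, failed_y) - slope(passed_x, passed_y)
```

In the paper itself this comparison is done with a regression including a group-by-predictor interaction, which additionally yields a standard error for the slope difference.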
However, failing the awareness control does not predict satisficing behavior or poor data quality earlier in the survey either: those who failed do not show significantly lower concurrent validity than those who succeeded (see Table 4). Hence, showing strong satisficing behavior at one point in the survey is not related to showing satisficing behavior either earlier or later in the survey.

Conclusion

In this research note we have presented a 3x2 full factorial experimental design testing the effect that awareness controls and motivational questions have on the data quality of survey responses. We find that, contrary to previous research, introducing an awareness control does not increase data quality by lowering satisficing behavior, and neither does asking respondents to motivate some of their previous answers. All in all, controlling respondents’ awareness and attentiveness at one part of the survey does not predict overall poor responses or satisficing behavior. Whether a respondent fails or succeeds on an awareness control or instructional manipulation check at the beginning of a survey does not help us predict who will generate poor responses or data quality later in the survey. To conclude on a somewhat optimistic note, our results show that once a satisficer need not mean always a satisficer.

References

Alwin, Duane F., and Jon A. Krosnick. 1991. “The Reliability of Survey Attitude Measurement: The Influence of Question and Respondent Attributes.” Sociological Methods and Research 20: 139-181.

Cole, James S., Alexander C. McCormick, and Robert M. Gonyea. 2012. “Respondent Use of Straight-Lining as a Response Strategy in Education Survey Research: Prevalence and Implications.” American Educational Research Association Annual Meeting 2012, Vancouver, BC, Canada.

Kapelner, Adam, and Dana Chandler. 2010. “Preventing Satisficing in Online Surveys: A “Kapcha” to Ensure Higher Quality Data.” CrowdConf 2010. San Francisco, CA, US.
Krosnick, Jon A. 1991. “Response Strategies for Coping with the Cognitive Demands of Attitude Measures in Surveys.” Applied Cognitive Psychology 5: 213-236.

Krosnick, Jon A. 1999. “Survey Research.” Annual Review of Psychology 50: 537-567.

Krosnick, Jon A., and Leandre R. Fabrigar. 1997. “Designing Rating Scales for Effective Measurement in Surveys.” In Survey Measurement and Process Quality, eds. Lars Lyberg, Paul Biemer, Martin Collins, Edith de Leeuw, Cathryn Dippo, Norbert Schwarz, and Dennis Trewin. New York: John Wiley & Sons, Inc.

Martinsson, Johan, Maria Andreasson, Elias Markstedt, and Karolina Riedel. Forthcoming. “Technical Report Citizen Panel 9 – 2014.” Gothenburg: University of Gothenburg, LORE.

McClendon, McKee J. 1991. “Acquiescence and Recency Response-Order Effects in Interview Surveys.” Sociological Methods & Research 20: 60-102.

Oppenheimer, Daniel M., Tom Meyvis, and Nicolas Davidenko. 2009. “Instructional Manipulation Checks: Detecting Satisficing to Increase Statistical Power.” Journal of Experimental Social Psychology 45: 867-872.

Tourangeau, Roger. 1984. “Cognitive Sciences and Survey Methods.” In Cognitive Aspects of Survey Methodology: Building a Bridge between Disciplines, eds. Thomas B. Jabine, Miron L. Straf, Judith M. Tanur, and Roger Tourangeau. Washington, DC: National Academy Press.
Figure 2. Concurrent Validity after Awareness Control
Table 1. Failed/Successful on Awareness Control (row percentages)

Group | 1 Strongly Disagree | 2 (Correct) | 3 | 4 | 5 | 6 | 7 Strongly Agree | N
Awareness Control | 0.00 | 93.23 | 0.00 | 0.97 | 0.97 | 1.61 | 3.23 | 310
Awareness Control + Failure Notification | 0.00 | 93.01 | 0.91 | 1.22 | 0.91 | 0.91 | 3.04 | 329
Awareness Control + Motivation | 0.62 | 91.36 | 0.31 | 1.54 | 0.62 | 2.47 | 3.09 | 324
Awareness Control + Failure Notification + Motivation | 0.32 | 91.77 | 0.00 | 2.53 | 0.63 | 2.53 | 2.22 | 316
Total | 0.23 | 92.34 | 0.31 | 1.56 | 0.78 | 1.88 | 2.89 | 1279
Table 2. Failure Notification Follow-Up Question (row percentages)

Group | 1 Strongly Disagree | 2 (Correct) | 3 | 4 | 5 | 6 | 7 Strongly Agree | N
Awareness Control + Failure Notification | 0.00 | 40.00 | 8.00 | 4.00 | 4.00 | 12.00 | 32.00 | 25
Awareness Control + Failure Notification + Motivation | 3.33 | 53.33 | 0.00 | 13.33 | 3.33 | 10.00 | 16.67 | 30
Total | 1.82 | 47.27 | 3.64 | 9.09 | 3.64 | 10.91 | 23.64 | 55
Table 3. Straight-Lining – Logistic Regression Within Each Experimental Group (standard errors in parentheses)

Group | (1) Weak Straight-Lining | (2) Strong Straight-Lining
Control Group | . | .
Awareness Control | -0.24 (0.23) | -0.35 (0.31)
Awareness Control + Failure Notification | -0.02 (0.22) | -0.01 (0.28)
No Awareness Control + Motivation | -0.06 (0.23) | -0.09 (0.30)
Awareness Control + Motivation | -0.36 (0.24) | -0.62 (0.33)
Awareness Control + Failure Notification + Motivation | 0.02 (0.22) | 0.15 (0.28)
Constant | -1.75*** (0.16) | -2.40*** (0.20)
N | 1889 | 1883

Comment: *p < 0.05, **p < 0.01, ***p < 0.001
Table 4. Concurrent Validity before Awareness Control

 | (1) Validity if Failed | (2) Validity if Successful | (3) Validity Interaction
Policy Support | -0.71*** (0.12) | -0.66*** (0.03) | -0.71*** (0.11)
Failing Respondents | | | 0.00 (.)
Successful Respondents | | | -0.43 (0.47)
Failing Respondents * Policy Support | | | 0.00 (.)
Successful Respondents * Policy Support | | | 0.06 (0.11)
Constant | 5.98*** (0.48) | 5.55*** (0.13) | 5.98*** (0.45)
N | 97 | 1176 | 1273

Standard errors in parentheses. *p < 0.05, **p < 0.01, ***p < 0.001
Online Appendix C. Additional Data Quality Analyses

Table 7. Removing Failers – Logistic Regression: Straight-Lining (standard errors in parentheses)

Group | With Failers, Weak | With Failers, Strong | Without Failers, Weak | Without Failers, Strong
Control Group | . | . | . | .
Awareness Control | -0.24 (0.23) | -0.35 (0.31) | -0.26 (0.24) | -0.40 (0.32)
Awareness Control + Failure Notification | -0.02 (0.22) | -0.01 (0.28) | -0.04 (0.23) | -0.07 (0.29)
No Awareness Control + Motivation | -0.06 (0.23) | -0.09 (0.30) | -0.06 (0.23) | -0.09 (0.30)
Awareness Control + Motivation | -0.36 (0.24) | -0.62 (0.33) | -0.40 (0.25) | -0.68 (0.35)
Awareness Control + Failure Notification + Motivation | 0.02 (0.22) | 0.15 (0.28) | -0.08 (0.23) | -0.00 (0.29)
Constant | -1.75*** (0.16) | -2.40*** (0.20) | -1.75*** (0.16) | -2.40*** (0.20)
N | 1889 | 1883 | 1795 | 1789

When we continue the analysis by investigating what happens to data quality when those who fail the awareness control are removed from the data (a common practice in experimental studies), we yet again find no significant impact on the probability to straight-line.
Figure 2. Concurrent Validity when Removing Failers

Removing those who fail does not increase data quality in terms of concurrent validity, as can be seen in Figure 2.

4. Does Notifying the Respondent That They Have Failed Decrease Satisficing Behavior?

Lastly, we analyze whether respondents who are notified that they failed the awareness control, and who succeed when receiving it again, start producing better data quality.

Table 8. Forcing Failers to Become Successful – Logistic Regression: Straight-Lining (standard errors in parentheses)

Failers | (1) Weak Straight-Lining | (2) Strong Straight-Lining
Failure Notification | -1.01 (0.79) | -0.56 (0.81)
Constant | -1.34*** (0.28) | -1.79*** (0.33)
N | 100 | 100

Again, no effect of the experiment could be found on the probability to straight-line; thus, notifying respondents that they are inattentive when reading our questions does not increase data quality.
Figure 3. Forcing Failers to be Successful – Concurrent Validity

In line with all our previous findings, the same is true for concurrent validity: notifying respondents that they are inattentive does not increase concurrent validity.