The best currently available evidence suggests that most public communication campaigns could substantially improve their impact. For example, large meta-analyses of climate change communication, flu vaccination encouragement, and political advertising all find considerable variation in the impact of different messages, with the most persuasive messages having roughly twice the impact of the average message. If campaigns can reliably identify such messages, they should expect to double their impact. If they additionally tailor their messaging to different audiences, they can potentially triple their impact.
The challenge is that the “space” of messages for campaigns to decide between is enormous — there are very many things a campaign could say and many different ways to say them. Unfortunately, research shows that relying on theory and expert guidance about “what works” when designing campaign messages is unlikely to be effective by itself, because “what works” is difficult to predict and can change dramatically across contexts.
To overcome this challenge, our central premise is that campaigns require new methods to decrease their reliance on expert-designed messaging and become more responsive to public opinion. In particular, we focus on two research directions we believe are at the forefront of modern message development:
- Efficient message search. We design research pipelines that allow campaigns to explore the large space of potential messages more efficiently, and to quickly zero in on the most impactful messaging strategies. Our methodology is based on a combination of large-scale adaptive online survey RCTs, Bayesian machine learning, and surrogate metrics.
- Community involvement. For campaigns interested in communicating with a particular group or community, we design scalable methods that involve community members themselves directly in the message development process. This provides intrinsic value—in the form of representation—as well as instrumental value: recent research suggests that regular people can often be far more effective than experts at predicting which messages will best resonate with others in their community.¹
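To make the adaptive-testing idea above concrete, the sketch below shows one standard approach — Thompson sampling — that steers survey respondents toward promising messages as evidence accumulates. This is only an illustration under simplifying assumptions, not our actual pipeline: the three messages and their “true” persuasion rates are hypothetical, and real outcomes are rarely a simple binary.

```python
import random

def thompson_sampling_allocation(true_rates, rounds, seed=0):
    """Illustrative adaptive allocation over candidate messages.

    Each arm is a candidate message; its (hypothetical) true persuasion
    rate is the probability that a respondent shifts attitude after
    seeing it. A Beta(1, 1) prior is placed on each rate.
    """
    rng = random.Random(seed)
    successes = [0] * len(true_rates)  # persuaded respondents per message
    failures = [0] * len(true_rates)   # unpersuaded respondents per message
    for _ in range(rounds):
        # Draw a plausible rate for each message from its Beta posterior
        samples = [rng.betavariate(s + 1, f + 1)
                   for s, f in zip(successes, failures)]
        arm = samples.index(max(samples))   # show the most promising message
        if rng.random() < true_rates[arm]:  # simulated respondent outcome
            successes[arm] += 1
        else:
            failures[arm] += 1
    # Total respondents allocated to each message
    return [s + f for s, f in zip(successes, failures)]

# Hypothetical persuasion rates for three candidate messages
counts = thompson_sampling_allocation([0.05, 0.10, 0.20], rounds=3000)
```

In expectation, the allocation concentrates on the stronger messages over time, so weak messages consume far fewer of the campaign's scarce survey responses than a fixed equal split would.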
Core to our research process are ethical checks and high standards for accuracy. We do not develop or test messages that are false, misleading, or that incite exclusionary attitudes or violence; we work only with campaigns whose goals are clearly aligned with public good; and we never share any data that could personally identify our research participants.
¹ This may be particularly important for reaching groups who are highly under-represented amongst those conducting the research. For example, Milkman et al. (2022) asked people to predict the effects of 22 different interventions designed to encourage flu shots in unvaccinated people. They found that the average predictions of behavioral scientists (96% of whom were themselves already vaccinated) were far less accurate at anticipating real-world impacts than those of laypeople.
Our research typically has a randomized controlled trial (RCT) at its core, and we believe that the trend towards online, in-survey RCT testing for public communication campaigns has been highly positive. However, this method is not a panacea for identifying impactful messages, and it suffers from two key limitations which our research aims to address:
- Feasibility. For campaigns focused on hard-to-move attitudes like vaccine hesitancy, in-survey RCTs typically require showing each message to on the order of a thousand people to detect differences between messages at conventional levels of statistical power.² This limits the number of messages that can feasibly be tested within an RCT. However, the “space” of potential messages — things campaigns could say — is often enormous. It is infeasible for a campaign to even consider all the different realizations of a message, let alone create and test them in an RCT. For this reason we combine RCT testing with other tools such as crowdsourcing and machine learning, to more efficiently narrow the field of potential messages to those with the largest probable impact.
- External validity. In-survey RCTs differ from real-world campaigns in several ways; for example, they use a captive audience (whereas real campaigns must engage people’s attention), and they ask participants’ opinions immediately after message exposure (whereas real campaigns must produce lasting attitude change). These differences could substantially and systematically bias the results of in-survey message tests, such as by overestimating the impact of priming/framing interventions. Thus, wherever possible we combine in-survey RCTs with field experiments and multi-wave panel experiments, thereby identifying and adjusting for the biases of in-survey RCTs to increase their real-world predictive value.
² e.g., a two-sample Student’s t-test with Cohen’s d = 0.1, power = 0.8, alpha = 0.05.