By Rasto Ivanic, GroupSolver
By the time all the votes are counted, the final margin of error may not be as large as it appeared on the morning after the 2020 elections. However, it is already fairly certain that pollsters continue to struggle mightily to predict election results despite conducting countless polls. This struggle is particularly apparent when we look beyond national averages and start dissecting the errors at the state and district level. I am sure that countless opinion pieces, articles and research papers will be written about why this election seems to have again defied predictions, and numerous statistical models will be built to explain the polling errors.
The purpose of this piece is not to dive into those details, but to offer a more philosophical perspective on polling in general. One has to ponder why predictions were off despite the massive volume of data we feed our models. Personally, I have a ton of respect for anyone who goes into the business of predicting political outcomes. I am also a polling outsider: a market researcher and the CEO of a market research tech firm whose technologies, including artificial intelligence and crowdsourcing, have transformed how commercial online surveys are conducted and significantly improved the depth and richness of the insights they deliver. Through the lens of these learnings, I offer four takeaways for political pollsters who are rethinking their approaches with an eye to delivering more reliable results.
Takeaway #1: Reconsider where data comes from (looking at you, phone interviews)
Phone interviews are considered the gold standard for collecting political opinions and, as such, they can carry greater weight in predictive modeling. I would challenge that status for two main reasons: the questionable quality of phone interview data and the inability of this data collection method to get deeper, beyond superficial yes-or-no answers.
Perhaps you have been asked for a political opinion over the phone before, and you may have had the same experience as I have: I was called in the middle of my grocery shopping (why did I even pick up?), hands full and in a hurry to check items off my grocery list. I was answering questions read to me by an operator who didn't seem to care much about what she was doing. The connection was poor, and I could hardly understand her questions. Focusing more on my groceries than on the survey, I kept forgetting the options I was supposed to consider for my answers, and toward the end of the interview, all I wanted was to be done with it.
Calling people may have once been the best window into voters' minds, but there is a better window now. People use phones differently today. They are much more likely to speak to each other by tapping their fingers, and polling should follow voters there. Online surveys can now ask open-ended questions and process the answers easily and efficiently, which presents groundbreaking opportunities for political pollsters.
Takeaway #2: Get a better understanding of why people vote
Doing more polling online has another advantage: we can go deeper and obtain more data on the behaviors and motivations of a voter. With the right tools, such data is much easier to capture online than through a telephone call. Understanding the voter beyond simple demographics is particularly important for those in the prediction-making business, because voting behavior is a mix of rational and emotional impulses that is impossible to understand from superficial demographic data (age, gender, income, political leaning, etc.) or simple multiple-choice questions. Communicating with a respondent online allows us to ask more interesting questions during a voter survey, so respondents can reveal more relevant information about themselves. We should be asking why they vote, what issues are important to them, and what words describe their feelings about a candidate.
Rich information from such probing questions can add more explanatory power to predictive modeling, but it simply can't be reliably captured with a traditional poll. This additional voter signal can then be used to build more accurate models of how people are likely to vote and, perhaps more importantly, how likely they are to vote – regardless of how they answer the quintessential polling question: "How likely are you to vote in the upcoming elections?" This can now be done with online surveys using AI and crowdsourcing techniques.
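To make this concrete, here is a minimal sketch (in Python, with invented feature names and synthetic data; it is not any pollster's actual model) of how richer signals, such as sentiment toward a candidate or the intensity of issue language, might be folded into a turnout model alongside traditional demographics and the self-reported likelihood to vote:

```python
# Illustrative sketch only: hypothetical features and synthetic data,
# not an actual pollster's turnout model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 5_000

# Traditional poll features: demographics plus the self-reported
# "how likely are you to vote?" answer (1-5 scale).
age = rng.integers(18, 90, n)
self_reported_likelihood = rng.integers(1, 6, n)

# Richer signals that open-ended online questions could supply,
# e.g. sentiment toward a candidate and intensity of issue language.
candidate_sentiment = rng.normal(0, 1, n)
issue_intensity = rng.normal(0, 1, n)

# Synthetic ground truth: turnout driven partly by stated intent and
# partly by the emotional/issue signal a phone poll rarely captures.
logit = (-2 + 0.03 * age + 0.4 * self_reported_likelihood
         + 0.8 * issue_intensity + 0.5 * candidate_sentiment)
voted = rng.random(n) < 1 / (1 + np.exp(-logit))

X = np.column_stack([age, self_reported_likelihood,
                     candidate_sentiment, issue_intensity])
X_train, X_test, y_train, y_test = train_test_split(X, voted, random_state=0)

# Fit a simple turnout classifier and check held-out accuracy.
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(f"Held-out accuracy: {model.score(X_test, y_test):.2f}")
```

In a sketch like this, dropping the sentiment and intensity columns and refitting shows how much predictive power the richer signal adds beyond stated intent alone.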
Takeaway #3: Get creative with modeling
In my experience distilling insights from customer data, bringing new variables into the equation (like emotions, behaviors or top-of-mind thoughts on issues) opens up a whole new universe in how we can look at consumer behavior. I don't see a reason why we could not apply some of the same ideas from market research to political polling. If we consider a voter casting a ballot the same way we consider a consumer looking to make a purchase, we can apply some of the same modeling techniques to build more accurate models of the probability of voting and understand how different segments of the population are likely to behave.
Let me share one example of what I mean. When GroupSolver was approached by an Amazon Alexa music team about a customer segmentation survey, it was because their initial effort to understand their customer segments had failed. They realized that understanding their customers from a simple set of closed-ended questions and demographic data produced a shallow, non-actionable understanding that was not useful in making business decisions. In contrast, we approached this problem through the "why do you listen to music" lens and built a segmentation model from open-ended responses that ranged from deeply spiritual, such as "I connect with God through music," to practical and behavioral, such as "music allows me to focus while I am working."
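As a minimal illustration of the mechanics (not GroupSolver's production pipeline), here is a Python sketch that vectorizes free-text answers and clusters them into candidate segments; the toy responses and the cluster count are invented:

```python
# Illustrative sketch: cluster open-ended "why do you listen to music?"
# answers into candidate segments. Toy data, not a production pipeline.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

answers = [
    "I connect with God through music",
    "music allows me to focus while I am working",
    "it helps me concentrate on my job",
    "music is a spiritual experience for me",
    "I play music in the background while working",
    "worship music brings me closer to my faith",
]

# Turn free text into TF-IDF vectors, then cluster into two segments
# (roughly: spiritual listeners vs. practical/behavioral listeners).
vectors = TfidfVectorizer(stop_words="english").fit_transform(answers)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for segment, answer in sorted(zip(labels, answers)):
    print(segment, "|", answer)
```

Real platforms use far richer language models and validation steps, but the principle is the same: segments emerge from what respondents actually say, not from boxes they tick.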
Now, think about substituting a voter for the music listener. Instead of predicting their voting behavior based on their age, gender and income, we could go deeper and translate their personal ambitions, convictions and moral compass into tangible information that would give us a more precise indication of their future voting behavior. AI and crowdsourcing techniques were the key to making this possible.
Takeaway #4: Future of polling depends on adapting to change
To be clear, I do not believe that it is enough to simply do more polling online. The volume of data is not the problem; there are plenty of traditional surveys out there. However, typical surveys suffer from the same issue as phone interviews: they are shallow. They rarely ask respondents to express their own feelings in their own words, instead relying on multiple-choice questions. Historically, there was a good reason why polls relied on shallow multiple-choice questions: results were easy to tabulate, while reading text answers to open-ended questions was not. But the inability to understand open-ended answers at scale is no longer a valid limitation.
Today's commercial online survey platforms can use AI and natural language processing (NLP) to reliably find deeper insights, faster than previous techniques and at scale. Researchers can solicit unaided answers to open-ended questions without having to manually convert all of that text into something that can be categorized and analyzed. An automatic, real-time first pass cleans up the free-text answers, and respondents are then presented with statements based on these answers with which they can agree or disagree. The process exposes researchers to a treasure trove of unfiltered, organic sentiments in each respondent's own voice, validated in real time through this crowdsourced process. Respondents enjoy doing it, and it eliminates what had previously been a prohibitively labor-intensive data preparation process.
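As a rough sketch of how such a crowdsourced loop might work under the hood (the thresholds and helper functions here are hypothetical, not any vendor's actual implementation), consider a Python fragment that cleans incoming answers, folds near-duplicates together and surfaces the most common statements for later respondents to rate:

```python
# Illustrative sketch of the crowdsourced loop described above:
# clean incoming free-text answers, fold near-duplicates together,
# and surface the most common statements for later respondents to
# rate with agree/disagree. Thresholds and helpers are hypothetical.
from collections import Counter
from difflib import SequenceMatcher

def normalize(text):
    """First-pass clean-up: trim, lowercase, collapse whitespace."""
    return " ".join(text.strip().lower().split())

def fold_near_duplicates(answers, threshold=0.85):
    """Merge answers whose string similarity exceeds the threshold."""
    counts = Counter()
    for answer in map(normalize, answers):
        for seen in counts:
            if SequenceMatcher(None, answer, seen).ratio() >= threshold:
                counts[seen] += 1
                break
        else:
            counts[answer] += 1
    return counts

raw = [
    "The economy is my top issue",
    "the economy is my top issue!",
    "I worry most about healthcare",
    "Healthcare is what I worry about most",
]

# The most frequent statements become candidates that the platform
# shows to subsequent respondents for agree/disagree validation.
for statement, votes in fold_near_duplicates(raw).most_common(3):
    print(votes, "|", statement)
```

A production system would use semantic similarity rather than raw string matching, but the shape of the loop is the same: free text in, validated statements out.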
Despite an initially healthy dose of skepticism from the commercial research and insights industry, these new online survey methodologies are making serious inroads into traditional market research because they work: they help reduce bias and increase the precision of our predictions. I see a similar opportunity in political polling.
If the 2016 election was a “fool me once, shame on you” moment for political polling, the 2020 election should be a “fool me twice, shame on me” realization. As much as sites like fivethirtyeight.com are doing an extraordinary job with modeling and simulating potential outcomes based on the polls they have at their disposal, their predictions can only be as good as the polls that are fed to them. It may be time to feed them better food.
Rasto Ivanic is a co-founder and CEO of GroupSolver, a market research tech company.