Issue salience & support
In August 2024, we surveyed thousands of voters across the country to understand how they would manage trade-offs between different policy positions when deciding which candidate to vote for. We focused on abortion access, crime, immigration, labor economics, and “wokeness” in schools. We also tested the degree to which a candidate’s party might matter more or less than their stances on a particular issue.
After conducting this survey, we used the responses to build a set of predictive models to estimate each voter’s take on the aforementioned issues – with separate models for voters’ ideological positioning on a given issue (i.e., “support”) and the importance they place on that issue relative to others (i.e., “salience”).
In this documentation, we explain how we built these models, describe how they might be used in practice, and review detailed evaluation statistics for each issue’s support and salience models.
While vote choices are motivated by a range of factors, research indicates that candidates' policy stances matter more than the conventional wisdom might suggest. However, the mechanisms behind this influence are complex. Candidates and voters each hold a range of (often conflicting) views across a given set of issues, and some of those issues carry far more weight than others.
For example, imagine a voter who supports abortion access and is highly motivated by that issue. They are facing a choice between a Democratic state legislative candidate who opposes abortion rights and a Republican who supports them. However, on all other issues, the voter aligns much more closely with the Democrat. And to complicate matters further, the voter knows the Democratic Party generally supports abortion rights while the Republican Party generally opposes them, which means it might be more strategic to help the Democrats gain a majority even if this one candidate is imperfect on the issue. Who is this voter most likely to support? Which arguments are most likely to sway them?
To better understand these dynamics, we've assembled a suite of six "support" and "salience" scores – indicating the likelihood that a voter (1) holds progressive stances in a given issue area and (2) considers a given issue area more important than others when determining which candidate to vote for.
Issue support scores have been a staple in the progressive data ecosystem for decades. They are often used to segment audiences so that voters get exposed to messages they're more likely to support. But audiences are rarely segmented based on the issues they're likely to care the most about – because we haven't had effective tools to drive that segmentation.
Let's say a given candidate is seen as too soft on crime. These scores would allow a campaign to target only the voters most likely to (1) prefer a tough-on-crime stance and (2) choose a candidate based primarily on their approach to crime. In this scenario, the campaign avoids raising the salience of an issue that isn't working for it by reaching only those for whom the issue is already salient.
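For illustration, here is a minimal sketch of that kind of two-dimensional segmentation using a hypothetical scored voter file. The column names, the 0 to 100 score scale, and the thresholds are placeholders, not the actual deliverables.

```python
import pandas as pd

# Hypothetical scored voter file; the column names and the 0-100 score scale
# are illustrative placeholders, not the actual score names we deliver.
voters = pd.read_csv("scored_voter_file.csv")

# Target only voters who (1) likely prefer a tough-on-crime stance (a low
# progressive-support score) and (2) likely weigh crime heavily when choosing
# a candidate (a high salience score). The cutoffs are arbitrary examples a
# campaign would tune to its budget and risk tolerance.
crime_persuasion_universe = voters[
    (voters["crime_support_score"] < 30)
    & (voters["crime_salience_score"] > 70)
]

print(f"Targetable voters: {len(crime_persuasion_universe):,}")
```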
These scores could also help campaigns expand their persuasion audiences in smarter ways. For example, our scores indicate that millions of Republicans support abortion access and care deeply about the issue. Those voters might appear to be deeply entrenched in their views according to a normal candidate support model – but the right message could potentially win them over.
To gather training data, we deployed a national conjoint analysis survey to understand how voters would make trade-offs between different positions held by hypothetical Democratic and Republican candidates. The issue areas included abortion access, crime, immigration, labor economics, and “wokeness” in schools. In each area, we drafted several positions a real candidate might hold and assigned those positions to a left-to-right ideological scale.
Survey respondents saw five randomly constructed match-ups between two candidates, with each candidate holding a unique position in each of the five aforementioned issue areas. We were then able to analyze which issues drove voter decisions. (Or, alternatively, whether voters seemed motivated only by party affiliation.)
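To make the design concrete, here is a minimal sketch of how match-ups like these can be generated at random. The position wordings and helper functions are illustrative stand-ins, not the actual survey items.

```python
import random

# Illustrative left-to-right position options per issue area; the real survey
# used several hand-drafted positions per issue that we don't reproduce here.
ISSUE_POSITIONS = {
    "abortion":    ["ban in nearly all cases", "allow with limits", "protect access"],
    "crime":       ["tough on crime", "balanced approach", "focus on reform"],
    "immigration": ["sharply restrict", "secure border, expand legal paths", "expand access"],
    "labor":       ["limit union power", "keep the status quo", "strengthen unions"],
    "wokeness":    ["ban 'woke' curricula", "leave it to local districts", "support inclusive curricula"],
}

def random_candidate(party: str) -> dict:
    """Assemble one hypothetical candidate profile for a conjoint match-up."""
    return {"party": party,
            **{issue: random.choice(options) for issue, options in ISSUE_POSITIONS.items()}}

def build_matchups(n_matchups: int = 5) -> list:
    """Each respondent saw five randomly constructed Democrat-vs-Republican match-ups."""
    return [(random_candidate("Democrat"), random_candidate("Republican"))
            for _ in range(n_matchups)]

for dem, rep in build_matchups():
    print(dem["party"], dem["abortion"], "| vs |", rep["party"], rep["abortion"])
```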
The chart below shows the average support for a candidate with each issue stance we tested.
After collecting the survey data, we put it through a cleaning and preprocessing phase. We summarized the complex conjoint responses into a set of numeric values for each voter indicating their ideological position on each issue area and the relative importance they appeared to attach to a candidate’s views in each issue category.
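We don't reproduce the exact summarization here, but the sketch below shows one crude way per-voter support and salience targets could be derived from conjoint choices. All column names are hypothetical, and the salience proxy in particular is a simplification of what a production pipeline would do.

```python
import pandas as pd

# Hypothetical long-format responses: one row per respondent x match-up, with
# the left-to-right position rank each candidate held on each issue and which
# candidate was chosen. Column names are illustrative.
responses = pd.read_csv("conjoint_responses.csv")
ISSUES = ["abortion", "crime", "immigration", "labor", "wokeness"]

rows = []
for respondent_id, grp in responses.groupby("respondent_id"):
    summary = {"respondent_id": respondent_id}
    for issue in ISSUES:
        # Support proxy: share of contested match-ups in which the respondent
        # chose the candidate holding the more progressive (lower-rank) position.
        rank_gap = grp[f"chosen_{issue}_rank"] - grp[f"rejected_{issue}_rank"]
        contested = rank_gap != 0
        support = (rank_gap[contested] < 0).mean() if contested.any() else 0.5
        summary[f"{issue}_support"] = support
        # Salience proxy: how far choices on this issue depart from a coin flip.
        summary[f"{issue}_salience"] = abs(support - 0.5) * 2
    rows.append(summary)

targets = pd.DataFrame(rows)  # one row per voter: the training targets
```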
We then trained dense neural networks to predict each voter's issue stances and relative prioritizations. For each model, we scanned our training data (the survey responses joined with personal traits from the Stacks Voter File) for an optimal combination of predictors using a method called Variable Selection Using Random Forests (VSURF).
The deep learning hyperparameters used to configure each model are detailed below, followed by a brief sketch of how one such configuration might be assembled in code.
| Model | Activation | Optimizer | Loss |
|---|---|---|---|
| Abortion | hard silu | nadam | mean abs. error |
| Crime | hard silu | nadam | huber |
| Immigration | softplus | nadam | mean sq. log. error |
| Labor | selu | sgd | mean sq. error |
| Party | mish | nadam | mean abs. error |
| Wokeness | hard silu | rmsprop | huber |

| Model | Activation | Optimizer | Loss |
|---|---|---|---|
| Abortion | selu | nadam | mean sq. error |
| Crime | hard silu | sgd | mean abs. error |
| Immigration | sigmoid | adam | mean sq. error |
| Labor | sigmoid | nadam | mean abs. error |
| Party | hard silu | sgd | mean abs. error |
| Wokeness | sigmoid | adamax | cosine similarity |
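As a concrete example, a model using the abortion configuration listed above with selu activation, the nadam optimizer, and a mean squared error loss could be assembled roughly as follows. Only those three settings come from the table; the layer widths, depth, output activation, and feature count are illustrative assumptions, and the predictors are assumed to have already been narrowed by the VSURF step described earlier.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_abortion_model(n_features: int, hidden_units=(64, 32)) -> keras.Model:
    """Dense network using one abortion configuration from the tables above:
    selu activation, nadam optimizer, mean squared error loss. The layer
    widths, depth, and sigmoid output are illustrative guesses, not the
    production architecture."""
    model = keras.Sequential()
    model.add(keras.Input(shape=(n_features,)))
    for units in hidden_units:
        model.add(layers.Dense(units, activation="selu"))
    # Scores are bounded, so squash the output into [0, 1].
    model.add(layers.Dense(1, activation="sigmoid"))
    model.compile(optimizer="nadam", loss="mean_squared_error", metrics=["mae"])
    return model

# Example: a model over 40 VSURF-selected predictors (the count is hypothetical).
model = build_abortion_model(n_features=40)
model.summary()
```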
To validate these models, we held out 20% of our survey respondents as a test group. (An additional 10% of responses were set aside as a validation sample during model design.) We then scored the test group with each model and compared our predictions to the respondents' actual choices.
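A respondent-level split along those lines might look like this (the ID array and random seed are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split

respondent_ids = np.arange(10_000)  # hypothetical respondent IDs

# Hold out 20% of respondents for final testing...
train_ids, test_ids = train_test_split(respondent_ids, test_size=0.20, random_state=42)
# ...then carve out 10% of the total (12.5% of the remaining 80%) for validation.
train_ids, val_ids = train_test_split(train_ids, test_size=0.125, random_state=42)
```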
While we focused on a range of evaluation metrics when deciding whether to keep a model, the metrics that mattered most to us were:
- Area under the ROC curve (AUC): The probability that the model ranks a randomly chosen positive example above a randomly chosen negative one.
- Gain captured by the model: The percentage of theoretical lift over random performance that a model achieves.
- Mean absolute error (MAE): The average absolute difference between actual values and the model's predictions.
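The sketch below shows how these metrics might be computed on the hold-out group. AUC and MAE come directly from scikit-learn; the gain-captured calculation is one plausible cumulative-gains formulation that may differ in detail from the exact definition we used, and binarizing the continuous targets at 0.5 for AUC and gain is also an assumption.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, mean_absolute_error

def gain_captured(y_true_binary, y_score):
    """Share of the theoretically possible lift over random targeting that the
    model achieves, read off a cumulative gains curve."""
    order = np.argsort(-np.asarray(y_score))
    hits = np.asarray(y_true_binary)[order]
    cum_hits = np.cumsum(hits) / hits.sum()               # model gains curve
    pop_frac = np.arange(1, len(hits) + 1) / len(hits)
    model_area = np.trapz(cum_hits, pop_frac)
    random_area = 0.5                                      # diagonal baseline
    perfect_area = 1 - hits.mean() / 2                     # all positives ranked first
    return (model_area - random_area) / (perfect_area - random_area)

# y_true: hold-out targets in [0, 1]; y_pred: model scores for the same voters.
y_true = np.random.rand(1_000)          # placeholder data for illustration
y_pred = np.clip(y_true + np.random.normal(0, 0.2, 1_000), 0, 1)
y_true_binary = (y_true > 0.5).astype(int)

print("AUC: ", roc_auc_score(y_true_binary, y_pred))
print("Gain:", gain_captured(y_true_binary, y_pred))
print("MAE: ", mean_absolute_error(y_true, y_pred))
```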
Below, we share these values for each model. Models with higher AUCs and gains, and lower MAEs, are higher performing.
You'll notice that some models are better than others. For example, the abortion salience and support models are quite good – meaning that it's relatively easy to distinguish between both (1) supporters and opponents of abortion rights and (2) those who care deeply about abortion rights and those who do not.
In other cases, salience is easier to predict than support (e.g., the salience of wokeness in schools is more differentiated than support for policies related to that topic) and vice versa (e.g., the salience of issues related to labor economics and immigration is less differentiated than the views people hold in each issue area). In these cases, weaker evaluation metrics indicate a less polarized electorate, though the models still provide some sorting utility.
Salience models:

| Model | AUC | Gain captured | MAE |
|---|---|---|---|
| Abortion | 0.69 | 0.38 | 0.26 |
| Crime | 0.68 | 0.35 | 0.26 |
| Immigration | 0.59 | 0.16 | 0.29 |
| Labor | 0.55 | 0.10 | 0.32 |
| Party | 0.57 | 0.15 | 0.20 |
| Wokeness | 0.61 | 0.21 | 0.29 |

Support models:

| Model | AUC | Gain captured | MAE |
|---|---|---|---|
| Abortion | 0.72 | 0.44 | 0.24 |
| Crime | 0.66 | 0.33 | 0.19 |
| Immigration | 0.75 | 0.51 | 0.24 |
| Labor | 0.77 | 0.54 | 0.28 |
| Party | 0.76 | 0.52 | 0.20 |
| Wokeness | 0.54 | 0.09 | 0.22 |