voters
We work with partners to collect and process data from each state's voter file. We then join that data with consumer data (including high quality email addresses and phone numbers) and our own predictive models to provide users with a dataset that can power a variety of targeting, messaging, outreach, and modeling use cases.
Our core national voter file comes in two varieties:
A record of all registered voters – taking each state's data at face value – with PII, contact information, geographic and district info, consumer data, vote history, and more joined in.
Using entity resolution, we've identified voters who have multiple registration records across different states and removed the outdated, inactive records. PII from the multiple records is combined into the most complete version of the current record.
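To illustrate the idea of combining PII from several registration records into one, here is a minimal sketch. It is not the actual entity resolution pipeline. It assumes the duplicate records for one person have already been identified, that the record with the latest `reg_date` is the current one, and that a missing field is filled from the next most recent record. The function name `consolidate` and the chosen fields are hypothetical.

```python
from datetime import date


def consolidate(records, fields=("first_name", "middle_name", "last_name", "dob")):
    """Merge duplicate registration records for one person into a single record.

    Assumption (not from the source): the newest reg_date marks the current
    record, and gaps are filled from older records, newest first.
    """
    if not records:
        raise ValueError("need at least one record")

    # Newest registration first.
    ordered = sorted(records, key=lambda r: r["reg_date"], reverse=True)
    current = ordered[0]

    merged = {"state": current["state"], "reg_date": current["reg_date"]}
    for field in fields:
        # Take the first non-empty value, scanning from newest to oldest.
        merged[field] = next(
            (r[field] for r in ordered if r.get(field) not in (None, "")),
            None,
        )
    return merged
```

In this sketch, an older Ohio record that carries a middle name would fill the gap in a newer Florida record that lacks one, while the state and registration date stay those of the Florida record.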
Column | Description |
---|---|
voter_id | Unique voter ID |
state_voter_id | State supplied voter ID, format differs from state to state |
county_voter_id | County supplied voter ID, format differs by county and state |
first_name | First name in uppercase with spaces, numbers, and special characters removed. Accented characters have been replaced with non-accented versions |
middle_name | Middle name in uppercase with spaces, numbers, and special characters removed. Accented characters have been replaced with non-accented versions |
middle_init | Middle initial in uppercase |
last_name | Last name in uppercase with spaces, numbers, and special characters removed. Accented characters have been replaced with non-accented versions |
name_suffix | Name suffix in uppercase, e.g. JR, III |
dob | Date of birth formatted as date (YYYY-MM-DD). DOB comes from voter registration data and commercial data |
myob | Month and year of birth formatted as 6 digit integer (YYYYMM). MYOB comes from voter registration data and commercial data. Some states truncate DOB to the first of the month making MYOB better for matching in those cases |
yob | Year of birth formatted as 4 digit integer (YYYY). YOB comes from voter registration data and commercial data. Some states truncate DOB to the first of the year, making YOB better for matching in those cases |
reg_date | Voter registration date formatted as date (YYYY-MM-DD). In situations where state or county updates reg date when a voter record is updated, reg_date is calculated as 30 days prior to earliest recorded vote date |
gender | Voter gender from voter registration data. M, F, or NULL |
ethnicity | Ethnicity, self-reported on the voter file where available, otherwise modeled; NULL where the model isn't confident. Values: AAPI, Black, Latino, Native American, White |
ethnicity_source | Ethnicity source: voterfile, modeled, previous registration |
modeled_race_aapi | Indigo race model, score from 0-1 with the probability that a voter is AAPI |
modeled_race_black | Indigo race model, score from 0-1 with the probability that a voter is Black |
modeled_race_latino | Indigo race model, score from 0-1 with the probability that a voter is Hispanic or Latino/a |
modeled_race_native_american | Indigo race model, score from 0-1 with the probability that a voter is Native American |
modeled_race_white | Indigo race model, score from 0-1 with the probability that a voter is White |
religion | Modeled religion from L2 based on name and census data: Buddhist, Catholic, Christian, Eastern Orthodox, Greek Orthodox, Hindu, Islamic, Jewish, Lutheran, Mormon, Protestant, Shinto, Sikh |
state_fips | Registration state fips code, two digit fips as determined by the census formatted as a string |
state | State abbreviation |
county_fips | Registration address county fips code, three digit fips as determined by the census formatted as a string |
county | Registration address county name in uppercase |
precinct | Registration precinct name |
reg_address | Registration address |
reg_city | Registration city name |
reg_state | Registration state abbreviation |
reg_zip | Registration address zip5 as string |
reg_zip4 | Registration address zip4 as string |
reg_lat | Registration address latitude |
reg_long | Registration address longitude |
reg_latlong_accuracy | Accuracy of reg_lat and reg_long columns, ordered from most to least accurate: GeoMatch9Digit, GeoMatchRooftop, GeoMatchBuilding, RangeInterpolation, ExactMatch, AverageOfApartments, ParcelCenter, GeoMatch5Digit, KnownAlternateName, DirectionPrefixRemoved, DirectionSuffixRemoved, StreetCenter, Intersection |
mailing_address | Mailing address |
mailing_city | Mailing city name |
mailing_zip | Mailing address zip5 as string |
mailing_zip4 | Mailing address zip4 as string |
mailing_state | Mailing address state abbreviation |
phone | Best phone number for voter, prioritizing cell phones over landlines, 9 digits formatted as a string |
phone_type | Type of phone: CELL, LANDLINE |
phone_confidence_code | Confidence in quality of phone number with 1 being highest confidence and 5 being lowest confidence |
phone_cell | Cell phone number, 9 digits formatted as a string |
phone_cell_confidence_code | Confidence in quality of phone_cell number with 1 being highest confidence and 5 being lowest confidence |
phone_landline | Landline phone number, 9 digits formatted as a string |
phone_landline_confidence_code | Confidence in quality of phone_landline number with 1 being highest confidence and 5 being lowest confidence |
Email address | |
party | Party identification, based on voterfile and modeled data |
party_3way | Party identification grouped into DEM, REP, and IND based on voterfile and modeled data |
party_source | Source of party and party_3way data: voterfile, modeled |
district_congressional_2020 | Congressional district, three digits zero padded, e.g. 002, 011, 024 |
district_congressional_2010 | 2010 congressional district, three digits zero padded, e.g. 002, 011, 024 |
district_congressional_proposed_2024 | Proposed 2024 congressional district where available, three digits zero padded, e.g. 002, 011, 024 |
district_stleg_upper_2020 | Upper state legislative district (state senate). For numeric districts, district names are three digits and zero padded, e.g. 003, 021, 041B. For non-numeric district names, strings are uppercase |
district_stleg_upper_2010 | 2010 upper state legislative district, three digits zero padded |
district_stleg_upper_proposed_2024 | Proposed 2024 upper state legislative district where available, three digits zero padded |
district_stleg_lower_2020 | Lower state legislative district (state house or state assembly, depending on the state). For numeric districts, district names are three digits and zero padded, e.g. 003, 021, 041B. For non-numeric district names, strings are uppercase |
district_stleg_lower_2010 | 2010 lower state legislative district, three digits zero padded |
district_stleg_lower_proposed_2024 | Proposed 2024 lower state legislative district where available, three digits zero padded |
district_stleg_floterial_2020 | Floterial district, only used in New Hampshire |
district_stleg_floterial_2010 | 2010 floterial district, only used in New Hampshire |
commercial_hh_donatestocharity | Binary if someone in the household donates to charity based on commercial data |
commercial_dwellingtype_duplex | Binary if home is a duplex based on commercial data |
commercial_dwellingtype_apartment | Binary if home is an apartment based on commercial data |
commercial_dwellingtype_singlefamilyhome | Binary if home is a single family home based on commercial data |
commercial_edu_hsonly | Binary if education level is high school or less based on commercial data |
commercial_edu_somecollege | Binary if education level is some college based on commercial data |
commercial_edu_bachdegree | Binary if education level is bachelors degree based on commercial data |
commercial_edu_graddegree | Binary if education level is graduate degree based on commercial data |
commercial_hh_income | Estimated household income based on commercial data |
commercial_homepurchasedate | Estimated home purchase date based on commercial data |
commercial_homepurchaseprice | Estimated home purchase price based on commercial data |
commercial_ispsa | Index of social position for small areas, a mix of education and income data that estimates where a voter lies on a scale from 0 to 9 on the socio-economic ladder |
commercial_gun_owner | Binary if voter is a gun owner based on gun registrations and subscriptions to hunting / gun magazines |
commercial_veteran | Binary if voter is a veteran based on commercial data |
census_block_2020 | Census block ID, 15 digits formatted as a string |
census_area_medianincome | Median household income in census block based on census data |
census_area_medianhousingvalue | Median home value in census block based on census data |
census_area_pctspanishspeaking | Pct of census block that is Spanish speaking based on census data |
fec_avg_donation_amount | Average donation dollar amount in federal races over the last four election cycles |
fec_total_donation_amount | Total dollar amount donated in federal races over the last four election cycles |
fec_last_donation_date | Date of most recent donation in a federal race over the last four election cycles |
fec_primary_party | Partisanship of candidate or organization who voter donated the largest amount to over the last four election cycles: D, R |
modeled_turnout_midterm_primary | Indigo midterm primary turnout model, score from 0-1 with the probability that a voter will turn out to vote in a midterm primary election |
modeled_turnout_midterm_general | Indigo midterm general turnout model, score from 0-1 with the probability that a voter will turn out to vote in a midterm general election |
modeled_turnout_presidential_primary | Indigo presidential primary turnout model, score from 0-1 with the probability that a voter will turn out to vote in a presidential primary election |
modeled_turnout_presidential_general | Indigo presidential general turnout model, score from 0-1 with the probability that a voter will turn out to vote in a presidential general election |
modeled_dem_partisanship | Indigo partisanship model, score from 0-1 with the probability that a voter identifies as a Democrat |
g08_voted* | 1 if voted, 0 if registered to vote but didn't vote, null if wasn't registered to vote |
g08_election_date* | Date of election formatted as date YYYY-MM-DD |
g08_ballot_type* | Indicates method of voting if voted. Note that not all states report type of ballot for all historic elections; a null value indicates lack of reporting |
*For general, primary, and presidential primary elections from 2008 to the present, we have _voted, _election_date, and _ballot_type columns for each election. We only include statewide and federal elections, and in instances where more than one election of the same type occurred in one election year, we chose the election with the highest turnout level. In situations where presidential and normal primaries are combined into a single election, they are represented as normal primaries. The naming convention for these columns is [election stage - g/p/pp][election year - 09/14/22]_[column type], e.g. pp09_voted, p14_election_date, g22_ballot_type.
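The naming convention above can be sketched as a pair of small helpers, one that builds a vote history column name and one that parses it back. The function names `election_column` and `parse_election_column` are hypothetical and not part of the dataset. Mapping a two-digit year to 20YY is an assumption based on coverage starting in 2008.

```python
import re

# Stage prefixes from the naming convention.
STAGES = {"g": "general", "p": "primary", "pp": "presidential primary"}
COLUMN_TYPES = ("voted", "election_date", "ballot_type")

# "pp" is listed before "p" so the longer prefix is tried first.
_PATTERN = re.compile(r"^(pp|p|g)(\d{2})_(voted|election_date|ballot_type)$")


def election_column(stage, year, column_type):
    """Build a column name such as 'g22_ballot_type' from its parts."""
    if stage not in STAGES:
        raise ValueError(f"unknown election stage: {stage!r}")
    if column_type not in COLUMN_TYPES:
        raise ValueError(f"unknown column type: {column_type!r}")
    # Keep only the last two digits of the year, zero padded.
    return f"{stage}{year % 100:02d}_{column_type}"


def parse_election_column(name):
    """Return (stage description, four-digit year, column type), or None
    if the name does not follow the vote history convention."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    stage, yy, column_type = match.groups()
    # Assumption: all covered elections fall in the 2000s.
    return STAGES[stage], 2000 + int(yy), column_type
```

For example, `election_column("g", 2022, "ballot_type")` yields `g22_ballot_type`, and `parse_election_column("first_name")` returns `None` because that column is not part of the vote history set.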