People

voters

We work with partners to collect and process data from each state's voter file. We then join that data with consumer data (including high quality email addresses and phone numbers), census data, and our own predictive models to provide users with a dataset that can power a variety of targeting, messaging, outreach, and modeling use cases.

Schema

Column

Description

voterbase_id

Unique voter ID

state_voter_id

State supplied voter ID, format differs from state to state

county_voter_id

County supplied voter ID, format differs by county and state

first_name

First name in uppercase with spaces, numbers, and special characters removed. Accented characters have been replaced with non-accented versions

middle_name

Middle name in uppercase with spaces, numbers, and special characters removed. Accented characters have been replaced with non-accented versions

middle_init

Middle initial in uppercase

last_name

Last name in uppercase with spaces, numbers, and special characters removed. Accented characters have been replaced with non-accented versions

name_suffix

Name suffix in uppercase, e.g. JR, III

dob

Date of birth formatted as date (YYYY-MM-DD). DOB comes from voter registration data and commercial data

myob

Month and year of birth formatted as a 6-digit integer (YYYYMM). MYOB comes from voter registration data and commercial data. Some states truncate DOB to the first of the month, making MYOB better for matching in those cases.

yob

Year of birth formatted as a 4-digit integer (YYYY). YOB comes from voter registration data and commercial data. Some states truncate DOB to the first of the year, making YOB better for matching in those cases.

age

Voter's age by end of current year, calculated by current_year - YOB

modeled_age

Voter's age by end of current year, calculated by current_year - YOB, modeled if DOB is missing

reg_date

Voter registration date formatted as date (YYYY-MM-DD). Where the state or county overwrites the registration date whenever a voter record is updated, reg_date is calculated as 30 days prior to the earliest recorded vote date

gender

Voter gender from voter registration data. M, F, or NULL

ethnicity

Ethnicity, self-reported on the voterfile where available, otherwise modeled. Values: AAPI, Black, Latino, Native American, White. NULL in cases where the model isn't confident

ethnicity_source

Ethnicity source: Voterfile, Commercial, Uncoded

modeled_race_aapi

Indigo race model, score from 0-1 with the probability that a voter is AAPI

modeled_race_black

Indigo race model, score from 0-1 with the probability that a voter is Black

modeled_race_latino

Indigo race model, score from 0-1 with the probability that a voter is Hispanic or Latino/a

modeled_race_native_american

Indigo race model, score from 0-1 with the probability that a voter is Native American

modeled_race_white

Indigo race model, score from 0-1 with the probability that a voter is White

state_fips

Registration state FIPS code: two-digit FIPS code as defined by the Census Bureau, formatted as a string

state

State abbreviation

county_fips

Registration address county FIPS code: three-digit FIPS code as defined by the Census Bureau, formatted as a string

county

Registration address county name in uppercase

precinct

Registration precinct name

reg_address

Registration address

reg_city

Registration city name

reg_state

Registration state abbreviation

reg_zip

Registration address zip5 as string

reg_zip4

Registration address zip4 as string

reg_lat

Registration address latitude

reg_long

Registration address longitude

reg_latlong_accuracy

Accuracy of reg_lat and reg_long columns, ordered from most to least accurate: GeoMatch9Digit, GeoMatchRooftop, GeoMatchBuilding, RangeInterpolation, ExactMatch, AverageOfApartments, ParcelCenter, GeoMatch5Digit, KnownAlternateName, DirectionPrefixRemoved, DirectionSuffixRemoved, StreetCenter, Intersection

mailing_address

Mailing address

mailing_city

Mailing city name

mailing_state

Mailing address state abbreviation

mailing_zip

Mailing address zip5 as string

mailing_zip4

Mailing address zip4 as string

phone

Best phone number for voter, prioritizing cell phones over landlines, 9 digits formatted as a string

phone_type

Type of phone: Cell, Landline, VOIP

phone_confidence_code

Confidence in quality of phone number. High: high confidence phone number matched at the individual level. Household: high confidence phone number matched at the household level. Low: low confidence phone number.

phone_cell

Cell phone number, 9 digits formatted as a string

phone_cell_confidence_code

Confidence in quality of phone_cell. High: high confidence phone number matched at the individual level. Household: high confidence phone number matched at the household level. Low: low confidence phone number.

phone_landline

Landline phone number, 9 digits formatted as a string

phone_landline_confidence_code

Confidence in quality of phone_landline. High: high confidence phone number matched at the individual level. Household: high confidence phone number matched at the household level. Low: low confidence phone number.

party

Party identification based on voterfile

party_3way

Party identification grouped into DEM, REP, and IND based on voterfile and modeled data

party_source

Source of party and party_3way data: Voterfile, Modeled

district_congressional_2020

Congressional district, three digits zero padded, e.g. 002, 011, 024

district_congressional_2010

2010 congressional district, three digits zero padded, e.g. 002, 011, 024

district_stleg_upper_2020

Upper state legislative district (state senate). For numeric districts, district names are zero padded to three digits, e.g. 003, 021, 041B. For non-numeric district names, strings are uppercase.

district_stleg_lower_2020

Lower state legislative district (state house or state assembly, depending on the state). For numeric districts, district names are zero padded to three digits, e.g. 003, 021, 041B. For non-numeric district names, strings are uppercase.

census_block_2020

Census block ID, 15 digits formatted as a string

ts_model_education_college_graduate

TargetSmart education model. Predicts the likelihood that an individual has attained a college-level or higher education. Scores are expressed from 0-1. A higher score represents a higher probability that a person's education level is college graduate or higher.

ts_model_education_high_school_only

TargetSmart education model. Predicts the likelihood that an individual has not attained formal education beyond high school. Scores are expressed from 0-1. A higher score represents a higher probability that a person's education level is high school or lower.

ts_model_religion_catholic

TargetSmart religion score. Predicts the likelihood that a voter identifies as Catholic. Scores are expressed from 0-1. A higher score indicates a higher likelihood to identify as Catholic.

ts_model_religion_evangelical

TargetSmart religion score. Predicts the likelihood that a voter identifies as Evangelical. Scores are expressed from 0-1. A higher score indicates a higher likelihood to identify as Evangelical.

ts_model_religion_jewish

TargetSmart religion score. Predicts the likelihood that a voter identifies as Jewish. Scores are expressed from 0-1. A higher score indicates a higher likelihood to identify as Jewish.

ts_model_religion_mormon

TargetSmart religion score. Predicts the likelihood that a voter identifies as Mormon. Scores are expressed from 0-1. A higher score indicates a higher likelihood to identify as Mormon.

ts_model_gunowner

TargetSmart gun owner score. Predicts the likelihood that an individual supports stricter gun control laws. Scores are expressed from 0-1. A higher score indicates a higher likelihood that a person supports stricter gun control laws.

ts_model_veteran

TargetSmart veteran score. Predicts the likelihood that an individual is a military veteran or an active service member. Scores are expressed from 0-1. A higher score predicts a higher likelihood that the individual is a military veteran or an active service member.

ts_model_ideology_score

TargetSmart ideology score. Predicts the likelihood that an individual supports liberal ideology. Scores are expressed from 0-1. A value of 1 represents those most likely (very liberal) and 0 represents those least likely (very conservative).

ts_model_children_present_score

TargetSmart children present score. Predicts the likelihood that a voter lives in a household with children. Scores are expressed from 0-1. A higher score represents a higher probability that a person lives in a household with children.

ts_model_marriage

TargetSmart marriage score. Predicts the likelihood that an individual is married. Scores are expressed from 0-1. A higher score represents a higher probability that a person is married.

ts_model_income_rank

TargetSmart high income score. Predicts the likelihood that an individual has an income over $100,000. Scores are expressed from 0-1. A higher score represents a higher probability that a person would self-report income greater than $100,000.

ts_model_homeowner

TargetSmart homeowner score. Predicts the likelihood that an individual owns a home. Scores are expressed from 0-1. A higher score represents a higher probability that a person owns a home.

fec_hh_contribution_count_rep

Total number of FEC contributions to Republicans in the household

fec_hh_contribution_count_dem

Total number of FEC contributions to Democrats in the household

fec_hh_contribution_count_total

Total number of FEC contributions in the household

fec_hh_contribution_pct_dem

Percent of FEC contributions in the household made to Democrats

modeled_turnout_midterm_primary

Indigo midterm primary turnout model, score from 0-1 with the probability that a voter will turn out to vote in a midterm primary election

modeled_turnout_midterm_general

Indigo midterm general turnout model, score from 0-1 with the probability that a voter will turn out to vote in a midterm general election

modeled_turnout_presidential_primary

Indigo presidential primary turnout model, score from 0-1 with the probability that a voter will turn out to vote in a presidential primary election

modeled_turnout_presidential_general

Indigo presidential general turnout model, score from 0-1 with the probability that a voter will turn out to vote in a presidential general election

modeled_race

Categorical race value based on Indigo race model. AAPI, Black, Latino, Native American, White, or NULL if model is low confidence

modeled_dem_partisanship

Indigo partisanship model, score from 0-1 with the probability that a voter identifies as a Democrat

g12_reg*

1 if voter was registered for given election, 0 if voter wasn't yet registered

g12_voted*

1 if voted, 0 if didn't vote

g12_ballot_type*

Method of voting, if the voter voted: Early, Absentee, Poll Vote, Unknown. Not all states report ballot type for all historic elections; a null value indicates lack of reporting.

p12_party*

Partisan primary the voter voted in for the given election: D - Democrat, G - Green, I - Independent, L - Libertarian, N - No Party Preference, O - Other Party, R - Republican, U - Unknown

* For general, primary, and presidential primary elections from 2012 to the present, we have _voted, _election_date, and _ballot_type columns for each election. The naming convention for these columns is [election stage - g/p/pp][2 digit election year - 14/22/23]_[column type], e.g. pp16_voted, p14_party, g22_ballot_type.
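
The naming convention in the note above can be sketched in Python as a pair of helpers that build and parse election column names. This is only an illustration: the function names, the set of accepted column types, and the assumption that every two-digit year belongs to the 2000s are not part of the dataset documentation, and the helpers do not check whether a given column actually exists for a given election.

```python
import re

# Stage prefixes from the note above: g = general, p = primary,
# pp = presidential primary.
STAGES = {"g": "general", "p": "primary", "pp": "presidential primary"}

# Column types seen in the schema or the note (assumed to be the full set).
COLUMN_TYPES = {"reg", "voted", "ballot_type", "party", "election_date"}

# "pp" is listed before "p" so the longer prefix is tried first.
_PATTERN = re.compile(r"^(pp|p|g)(\d{2})_([a-z_]+)$")


def election_column(stage: str, year: int, column_type: str) -> str:
    """Build a column name such as 'g22_ballot_type' from its parts."""
    if stage not in STAGES:
        raise ValueError(f"unknown election stage: {stage!r}")
    if column_type not in COLUMN_TYPES:
        raise ValueError(f"unknown column type: {column_type!r}")
    # Keep only the last two digits of the year, zero padded.
    return f"{stage}{year % 100:02d}_{column_type}"


def parse_election_column(name: str) -> tuple[str, int, str]:
    """Split a column name into (stage, four-digit year, column type)."""
    match = _PATTERN.match(name)
    if match is None or match.group(3) not in COLUMN_TYPES:
        raise ValueError(f"not an election column: {name!r}")
    stage, yy, column_type = match.groups()
    # Assumption: all covered elections fall in the 2000s.
    return stage, 2000 + int(yy), column_type
```

For example, `election_column("g", 2022, "ballot_type")` yields `g22_ballot_type`, and `parse_election_column("p14_party")` yields `("p", 2014, "party")`. Helpers like these are convenient for generating a list of vote-history columns to select across several election cycles.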