Ballot Measure Choices: For/Against/Yes/No
When a row in an election results file has “For” as the candidate name, it could mean two things: a person whose legal name is “For” (implausible), or a choice on a ballot measure (almost certain). The distinction cannot be made from the candidate name alone — it requires examining the contest name alongside it.
The Problem
Ballot measures appear in election data using the same schema as candidate races. The “candidate” column holds “For”, “Against”, “Yes”, or “No”. The “contest” column holds something like “BOND REFERENDUM - SCHOOL CONSTRUCTION” or “CONSTITUTIONAL AMENDMENT 3”. Nothing in the file format distinguishes a ballot measure from a candidate race.
Real examples from MEDSL 2022:
| Contest Name | Candidate Name | Votes | What It Actually Is |
|---|---|---|---|
| CONSTITUTIONAL AMENDMENT 1 | For | 1,847,312 | Ballot measure choice |
| BOND REFERENDUM COLUMBUS COUNTY SCHOOLS | Against | 4,219 | Ballot measure choice |
| COUNTY SALES TAX REFERENDUM | Yes | 31,408 | Ballot measure choice |
| CHARTER AMENDMENT - TERM LIMITS | No | 12,773 | Ballot measure choice |
If these rows enter the candidate pipeline, “For” becomes a person entity. “For” then appears in entity resolution, gets a candidate_entity_id, and shows up in the L4 canonical export as the most prolific politician in America — winning thousands of races across every state and every office level.
The L4 Audit Discovery
In our prototype, the L4 LLM entity audit examined 50 entities for plausibility. Among the 4 errors it identified:
“‘For’ is not a plausible person name. This entity appears across 347 contests in 12 states, always in contest names containing ‘amendment’, ‘bond’, ‘referendum’, or ‘proposition’. These are ballot measure choices, not candidates.”
The audit correctly identified the contamination. But detecting it at L4 is too late — the bad entity has already propagated through L2 embeddings and L3 matching. The fix is detection at L1.
Detection Logic
A candidate name of “For”, “Against”, “Yes”, or “No” is ambiguous in isolation. These are common English words, and while no real candidate in our dataset is named “For”, names like “Yes” are theoretically possible. The detection requires both signals:
Signal 1: Candidate name pattern. The candidate name is one of a small set of ballot measure choice words:
| Candidate Name | Ballot Measure Choice |
|---|---|
| For | Yes |
| Against | Yes |
| Yes | Yes |
| No | Yes |
| Bonds Yes | Yes |
| Bonds No | Yes |
| For the Tax Levy | Yes |
| Against the Tax Levy | Yes |
Signal 2: Contest name pattern. The contest name contains one or more ballot measure keywords:
| Keyword | Example Contest Name |
|---|---|
| amendment | CONSTITUTIONAL AMENDMENT 1 |
| bond | BOND REFERENDUM COLUMBUS COUNTY SCHOOLS |
| referendum | COUNTY SALES TAX REFERENDUM |
| proposition | PROPOSITION 30 - TAX ON INCOME |
| measure | MEASURE A - PARCEL TAX |
| initiative | INITIATIVE 82 - TIPPED WAGES |
| question | BALLOT QUESTION 4 |
| charter | CHARTER AMENDMENT - TERM LIMITS |
| levy | RENEWAL 2.0 MILL LEVY - FIRE |
| issue | ISSUE 1 - REPRODUCTIVE RIGHTS |
Both signals must be present. A candidate named “For” in a contest called “COUNTY COMMISSIONER” would not trigger ballot measure detection — it would be flagged as a data quality anomaly for manual review. A candidate named “John Smith” in a contest called “BOND REFERENDUM” is not a ballot measure choice — the candidate name does not match the pattern.
Routing
When both signals match, L1 routes the record to BallotMeasure contest kind with a MeasureChoice result type instead of CandidateResult:
{
"contest": {
"kind": "ballot_measure",
"raw_name": "BOND REFERENDUM COLUMBUS COUNTY SCHOOLS",
"office_level": "school_district",
"measure_type": "bond"
},
"results": [
{
"measure_choice": "against",
"votes_total": 4219,
"vote_counts_by_type": {
"election_day": 2107,
"early": 1891,
"absentee_mail": 198,
"provisional": 23
}
}
]
}
The measure_choice field replaces candidate_name. No name decomposition is performed (there is no first, middle, last, or suffix for “Against”). No entity resolution is needed — “For” in one contest is not the same entity as “For” in another contest. No embedding is generated.
Edge Cases
“For the Tax Levy” vs “For.” Some sources use complete phrases like “For the Tax Levy” rather than bare “For”. The pattern match checks for the prefix, not exact equality.
Mixed contests. A small number of records have both candidate names and ballot measure choices in the same contest. This occurs when a source reports write-in votes alongside measure choices. The L1 parser handles each row independently — “For” is routed to BallotMeasure, while “Write-in” in the same contest is routed to TurnoutMetadata.
Retention elections. Judicial retention elections ask “Shall Judge X be retained?” with choices “Yes” and “No.” These are structurally ballot measures but semantically candidate races — the “candidate” is the judge. L1 classifies these as BallotMeasure with an additional retention_candidate field preserving the judge’s name from the contest string. This is an area where the boundary between candidate races and ballot measures is genuinely blurred.
Scale
Ballot measure records account for approximately 3–5% of total rows in MEDSL 2022, varying by state. States with frequent ballot initiatives (California, Oregon, Colorado) have higher proportions. Failing to detect them does not just create bad entities — it inflates the count of “candidates” and distorts competitiveness metrics. A bond referendum with 51% “For” and 49% “Against” is not an uncontested race with one candidate named “For.”