Keyboard shortcuts

Press or to navigate between chapters

Press S or / to search in the book

Press ? to show this help

Press Esc to hide this help

Ballot Measure Choices: For/Against/Yes/No

When a row in an election results file has “For” as the candidate name, it could mean two things: a person whose legal name is “For” (implausible), or a choice on a ballot measure (almost certain). The distinction cannot be made from the candidate name alone — it requires examining the contest name alongside it.

The Problem

Ballot measures appear in election data using the same schema as candidate races. The “candidate” column holds “For”, “Against”, “Yes”, or “No”. The “contest” column holds something like “BOND REFERENDUM - SCHOOL CONSTRUCTION” or “CONSTITUTIONAL AMENDMENT 3”. Nothing in the file format distinguishes a ballot measure from a candidate race.

Real examples from MEDSL 2022:

Contest NameCandidate NameVotesWhat It Actually Is
CONSTITUTIONAL AMENDMENT 1For1,847,312Ballot measure choice
BOND REFERENDUM COLUMBUS COUNTY SCHOOLSAgainst4,219Ballot measure choice
COUNTY SALES TAX REFERENDUMYes31,408Ballot measure choice
CHARTER AMENDMENT - TERM LIMITSNo12,773Ballot measure choice

If these rows enter the candidate pipeline, “For” becomes a person entity. “For” then appears in entity resolution, gets a candidate_entity_id, and shows up in the L4 canonical export as the most prolific politician in America — winning thousands of races across every state and every office level.

The L4 Audit Discovery

In our prototype, the L4 LLM entity audit examined 50 entities for plausibility. Among the 4 errors it identified:

“‘For’ is not a plausible person name. This entity appears across 347 contests in 12 states, always in contest names containing ‘amendment’, ‘bond’, ‘referendum’, or ‘proposition’. These are ballot measure choices, not candidates.”

The audit correctly identified the contamination. But detecting it at L4 is too late — the bad entity has already propagated through L2 embeddings and L3 matching. The fix is detection at L1.

Detection Logic

A candidate name of “For”, “Against”, “Yes”, or “No” is ambiguous in isolation. These are common English words, and while no real candidate in our dataset is named “For”, names like “Yes” are theoretically possible. The detection requires both signals:

Signal 1: Candidate name pattern. The candidate name is one of a small set of ballot measure choice words:

Candidate NameBallot Measure Choice
ForYes
AgainstYes
YesYes
NoYes
Bonds YesYes
Bonds NoYes
For the Tax LevyYes
Against the Tax LevyYes

Signal 2: Contest name pattern. The contest name contains one or more ballot measure keywords:

KeywordExample Contest Name
amendmentCONSTITUTIONAL AMENDMENT 1
bondBOND REFERENDUM COLUMBUS COUNTY SCHOOLS
referendumCOUNTY SALES TAX REFERENDUM
propositionPROPOSITION 30 - TAX ON INCOME
measureMEASURE A - PARCEL TAX
initiativeINITIATIVE 82 - TIPPED WAGES
questionBALLOT QUESTION 4
charterCHARTER AMENDMENT - TERM LIMITS
levyRENEWAL 2.0 MILL LEVY - FIRE
issueISSUE 1 - REPRODUCTIVE RIGHTS

Both signals must be present. A candidate named “For” in a contest called “COUNTY COMMISSIONER” would not trigger ballot measure detection — it would be flagged as a data quality anomaly for manual review. A candidate named “John Smith” in a contest called “BOND REFERENDUM” is not a ballot measure choice — the candidate name does not match the pattern.

Routing

When both signals match, L1 routes the record to BallotMeasure contest kind with a MeasureChoice result type instead of CandidateResult:

{
  "contest": {
    "kind": "ballot_measure",
    "raw_name": "BOND REFERENDUM COLUMBUS COUNTY SCHOOLS",
    "office_level": "school_district",
    "measure_type": "bond"
  },
  "results": [
    {
      "measure_choice": "against",
      "votes_total": 4219,
      "vote_counts_by_type": {
        "election_day": 2107,
        "early": 1891,
        "absentee_mail": 198,
        "provisional": 23
      }
    }
  ]
}

The measure_choice field replaces candidate_name. No name decomposition is performed (there is no first, middle, last, or suffix for “Against”). No entity resolution is needed — “For” in one contest is not the same entity as “For” in another contest. No embedding is generated.

Edge Cases

“For the Tax Levy” vs “For.” Some sources use complete phrases like “For the Tax Levy” rather than bare “For”. The pattern match checks for the prefix, not exact equality.

Mixed contests. A small number of records have both candidate names and ballot measure choices in the same contest. This occurs when a source reports write-in votes alongside measure choices. The L1 parser handles each row independently — “For” is routed to BallotMeasure, while “Write-in” in the same contest is routed to TurnoutMetadata.

Retention elections. Judicial retention elections ask “Shall Judge X be retained?” with choices “Yes” and “No.” These are structurally ballot measures but semantically candidate races — the “candidate” is the judge. L1 classifies these as BallotMeasure with an additional retention_candidate field preserving the judge’s name from the contest string. This is an area where the boundary between candidate races and ballot measures is genuinely blurred.

Scale

Ballot measure records account for approximately 3–5% of total rows in MEDSL 2022, varying by state. States with frequent ballot initiatives (California, Oregon, Colorado) have higher proportions. Failing to detect them does not just create bad entities — it inflates the count of “candidates” and distorts competitiveness metrics. A bond referendum with 51% “For” and 49% “Against” is not an uncontested race with one candidate named “For.”