PolicyLogic tracks whether elected officials — governors, mayors, members of Congress, and others — follow through on commitments made during their campaigns and in office. We focus on three questions for each promise:
- **Did they start?** Did the official take concrete action to advance this commitment, or did it remain a talking point?
- **Did it happen?** Was the policy enacted, the program launched, the goal reached?
- **Did it work?** To the extent outcomes can be measured, did reality match the promise?
Each promise moves through three independent scoring stages, each with its own standardized rubric: specificity, action, and outcome.
The specificity rubric scores how concrete the promise was:

| Score | Specificity Level | Example |
|---|---|---|
| 0 | Values statement only | "I believe in stronger communities." |
| 1 | Directional goal, no specifics | "We will improve public safety." |
| 2 | Specific policy named | "I will pass a tenant protection bill." |
| 3 | Specific + measurable target | "Reduce violent crime 20% by 2026." |
The action rubric scores how far the official advanced the promise:

| Score | Description |
|---|---|
| 0 | No action taken |
| 1 | Public statements only, no formal action |
| 2 | Proposal introduced, bill filed, or program announced |
| 3 | Passed, signed, or formally launched |
| 4 | Fully operational and implemented |
The outcome rubric scores how much of the stated goal was achieved:

| Score | Description |
|---|---|
| 0 | Failed, reversed, or abandoned |
| 1 | Under 10% of stated goal achieved |
| 2 | 10% to under 40% achieved |
| 3 | 40% to under 70% achieved |
| 4 | 70% to under 90% achieved |
| 5 | 90% or more achieved |
Outcomes are rarely caused by one person alone. The "Their Role" modifier scales the outcome score based on how directly the official caused the result — from 1.0 (sole cause) to 0.0 (no causal connection). This prevents officials from claiming credit for conditions they didn't create, and protects them from blame for outcomes beyond their control.
Not all promises are equally hard to keep. A promise that requires only executive action is scored at 1.0x. One requiring multi-government coordination scores up to 2.5x. This rewards officials who attempt harder reforms.
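The modifiers can be combined with the rubric scores in code. A minimal sketch, assuming the "Their Role" modifier and the difficulty multiplier scale the outcome score multiplicatively; the actual aggregation formula is not specified here, and all names are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class PromiseScore:
    specificity: int   # 0-3: how concrete the promise was
    action: int        # 0-4: how far the official advanced it
    outcome: int       # 0-5: share of the stated goal achieved
    their_role: float  # 0.0 (no causal connection) to 1.0 (sole cause)
    difficulty: float  # 1.0 (executive action) up to 2.5 (multi-government)

    def adjusted_outcome(self) -> float:
        # Assumption: modifiers scale the outcome score multiplicatively.
        return self.outcome * self.their_role * self.difficulty

# A specific, measurable promise (specificity 3) that was signed into law
# (action 3) and reached 40-70% of its goal (outcome 3), where the official
# was the primary but not sole cause (0.8) of a moderately hard reform (1.5).
example = PromiseScore(specificity=3, action=3, outcome=3,
                       their_role=0.8, difficulty=1.5)
```

The multiplicative form captures both properties described above: a low causal-role value pulls credit toward zero, while a high difficulty multiplier rewards harder reforms.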
Rubric scores and modifiers roll up into an overall letter grade for each official:

| Grade | Description |
|---|---|
| A | Exceptional. Strong delivery across most promises, including difficult ones. |
| B | Strong. Solid execution on core commitments with some gaps. |
| C | Mixed. Meaningful action on some promises, significant shortfalls on others. |
| D | Weak. Most promises not delivered or substantially diminished. |
| F | Poor. Systematic non-delivery, reversals, or pattern of broken commitments. |
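A sketch of mapping an aggregate delivery score to these letter grades. The 0-100 scale and the numeric cutoffs are illustrative assumptions, not PolicyLogic's published thresholds:

```python
def letter_grade(aggregate: float) -> str:
    """Map an aggregate delivery score (assumed 0-100) to a letter grade.

    The cutoffs below are hypothetical placeholders for illustration.
    """
    for floor, grade in [(90, "A"), (75, "B"), (60, "C"), (45, "D")]:
        if aggregate >= floor:
            return grade
    return "F"
```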
The International Commitments tracker operates on a different logic than the official scorecards. International commitments — treaty obligations, multilateral funding pledges, emissions targets, assessed contributions to international organizations — are institutional. They are made by administrations, inherited by successors, honored or abandoned across decades.
Assigning a letter grade to a commitment that has passed through multiple administrations would collapse decades of contested political history into a single letter. Instead, each commitment displays a delivery ratio — the percentage of the pledge delivered as of the research date — alongside a full timeline of events. Administration color bands make clear which events happened under whose watch.
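The delivery-ratio display described above can be sketched as follows: the percentage of the pledge delivered by events on or before the research date, with each timeline event tagged by administration. Field names and the sample data are illustrative assumptions:

```python
from dataclasses import dataclass

@dataclass
class Event:
    date: str            # ISO date of the event
    administration: str  # which administration it occurred under
    delivered: float     # amount delivered at this event, in pledge units

def delivery_ratio(pledged: float, events: list[Event], as_of: str) -> float:
    """Percent of the pledge delivered by events on or before the research date."""
    total = sum(e.delivered for e in events if e.date <= as_of)
    return round(100 * total / pledged, 1)

# Hypothetical timeline spanning three administrations.
timeline = [
    Event("2016-03-01", "Admin A", 500.0),
    Event("2019-06-15", "Admin B", 0.0),    # pledge reaffirmed, nothing paid
    Event("2022-11-30", "Admin C", 250.0),
]
print(delivery_ratio(3000.0, timeline, as_of="2023-01-01"))  # 25.0
```

Because events carry an administration tag, the same data drives both the ratio and the color bands in the timeline view.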
The Presidential Foreign Policy tracker scores individual promises, not the president as a whole. Each commitment carries a status — Kept, Broken, Partial, In Progress, Reversed, Contested — based on the available record. There is no aggregate presidential grade, for three reasons:
- **Selection effects.** Trackable foreign policy promises are not a representative sample of presidential performance. An aggregate grade would reflect political communication as much as governance.
- **Causal complexity.** Foreign policy outcomes are rarely attributable to a single decision. The "Their Role" modifier partially addresses this, but causal chains in foreign policy are longer and more contested.
- **Cross-administration comparability.** The tracker intentionally shows commitments across administrations to surface reversals and patterns. A grade attached to one president would invite comparisons the underlying data may not support.
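The status vocabulary and per-commitment record described above might be represented as follows; the structure and field names are assumptions for illustration:

```python
from dataclasses import dataclass
from enum import Enum

class Status(Enum):
    KEPT = "Kept"
    BROKEN = "Broken"
    PARTIAL = "Partial"
    IN_PROGRESS = "In Progress"
    REVERSED = "Reversed"
    CONTESTED = "Contested"

@dataclass
class Commitment:
    promise: str         # the tracked promise, in the official's words
    administration: str  # administration the promise was made under
    status: Status       # current status per the available record
    # Note: deliberately no aggregate-grade field.

c = Commitment("Rejoin the climate accord", "Admin X", Status.KEPT)
```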
Each scorecard is generated in two stages. First, an AI research assistant searches the web for campaign materials, news coverage, legislative records, and outcome data. Second, a scoring model applies the PolicyLogic rubric to produce a structured draft scorecard. All scorecards are marked pending human review until verified by a researcher.
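The two-stage flow can be sketched as a simple pipeline. `research_web` and `apply_rubric` stand in for the AI research assistant and the scoring model; both names, and the dict-shaped draft, are hypothetical:

```python
def generate_scorecard(official: str, research_web, apply_rubric) -> dict:
    evidence = research_web(official)          # stage 1: gather sources
    draft = apply_rubric(evidence)             # stage 2: score against the rubric
    draft["status"] = "pending human review"   # held until a researcher verifies
    return draft

# Usage with stub stages standing in for the real models:
card = generate_scorecard(
    "Jane Doe",
    research_web=lambda name: {"official": name, "sources": []},
    apply_rubric=lambda ev: {"official": ev["official"], "scores": {}},
)
```

The key design point the sketch preserves is that every draft leaves the pipeline flagged for human review rather than published directly.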
PolicyLogic is a nonpartisan project. Scores are determined by the methodology above, not by political affiliation. Officials of both parties are evaluated on the same rubric.
If you find a factual error, a missing promise, or a score you believe is wrong, please tell us. Every submission is reviewed.
Open a Scorecard to Report an Error →