Est. 2026 · Nonpartisan Public Accountability
PolicyLogic
Draft Scorecards · Under Review
How We Score
A plain-language guide to the PolicyLogic methodology — what we measure, how we score it, and what our grades mean.
All scorecards are draft documents. Scores are generated using AI-assisted research from public web sources and reflect conditions as of the research date. They are pending human review and verification. Grades should not be cited as definitive assessments.

What We Measure

PolicyLogic tracks whether elected officials — governors, mayors, members of Congress, and others — follow through on commitments made during their campaigns and in office. We focus on three questions for each promise:

Did they start? Did the official take concrete action to advance this commitment, or did it remain a talking point?

Did it happen? Was the policy enacted, the program launched, the goal reached?

Did it work? To the extent outcomes can be measured, did reality match the promise?

The Scoring Pipeline

Each promise moves through three independent scoring stages using standardized rubrics.

Stage 1 — Specificity of Promise

Score | Specificity Level | Example
0 | Values statement only | "I believe in stronger communities."
1 | Directional goal, no specifics | "We will improve public safety."
2 | Specific policy named | "I will pass a tenant protection bill."
3 | Specific + measurable target | "Reduce violent crime 20% by 2026."

Stage 2 — Actions Taken

Score | Description
0 | No action taken
1 | Public statements only, no formal action
2 | Proposal introduced, bill filed, or program announced
3 | Passed, signed, or formally launched
4 | Fully operational and implemented

Stage 3 — Outcome & Results

Score | Description
0 | Failed, reversed, or abandoned
1 | Under 10% of stated goal achieved
2 | 10–40% of goal achieved
3 | 40–70% of goal achieved
4 | 70–90% of goal achieved
5 | 90%+ of goal achieved
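
The outcome bands above can be read as a simple lookup. The sketch below is one way to express them in Python, not the project's actual implementation. The function name `outcome_score` and the `abandoned` flag are hypothetical, and assigning each boundary value (exactly 10%, 40%, 70%, or 90%) to the higher band is an assumption, since the rubric does not say which band a boundary falls into.

```python
def outcome_score(fraction_achieved: float, abandoned: bool = False) -> int:
    """Map the share of the stated goal achieved to a Stage 3 score (0-5).

    fraction_achieved is a fraction, e.g. 0.35 for 35% of the goal.
    Assumption: boundary values are assigned to the higher band.
    """
    if abandoned:
        # Failed, reversed, or abandoned promises score 0.
        return 0
    if fraction_achieved < 0:
        raise ValueError("fraction_achieved must be non-negative")
    if fraction_achieved < 0.10:
        return 1
    if fraction_achieved < 0.40:
        return 2
    if fraction_achieved < 0.70:
        return 3
    if fraction_achieved < 0.90:
        return 4
    return 5


# A promise that reached 35% of its target lands in the 10-40% band.
print(outcome_score(0.35))  # 2
```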

Modifiers

Their Role (0.0 – 1.0)

Outcomes are rarely caused by one person alone. The "Their Role" modifier scales the outcome score based on how directly the official caused the result — from 1.0 (sole cause) to 0.0 (no causal connection). This prevents officials from claiming credit for conditions they didn't create, and protects them from blame for outcomes beyond their control.

Difficulty Multiplier (1.0 – 2.5)

Not all promises are equally hard to keep. A promise that requires only executive action is scored at 1.0x. One requiring multi-government coordination scores up to 2.5x. This rewards officials who attempt harder reforms.

Promise Score = (Specificity + Actions Taken + (Outcome × Their Role)) × Difficulty
Each promise receives an individual policy score. The overall grade reflects aggregate delivery across all tracked promises, weighted by ambition and adjusted for promise volume.
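
To make the formula concrete, here is a minimal Python sketch of the per-promise calculation. The class name `PromiseInputs`, the function name `promise_score`, and the range checks are illustrative assumptions. The ranges themselves come from the rubrics and modifiers described above. The aggregation into an overall grade is not shown because this page does not specify its weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromiseInputs:
    """Hypothetical container for the five inputs to the Promise Score."""
    specificity: int    # Stage 1 score, 0-3
    actions_taken: int  # Stage 2 score, 0-4
    outcome: int        # Stage 3 score, 0-5
    their_role: float   # modifier, 0.0-1.0
    difficulty: float   # multiplier, 1.0-2.5

    def __post_init__(self) -> None:
        # Reject values outside the ranges stated in the rubrics.
        bounds = {
            "specificity": (0, 3),
            "actions_taken": (0, 4),
            "outcome": (0, 5),
            "their_role": (0.0, 1.0),
            "difficulty": (1.0, 2.5),
        }
        for name, (low, high) in bounds.items():
            value = getattr(self, name)
            if not low <= value <= high:
                raise ValueError(f"{name}={value} is outside {low}-{high}")


def promise_score(p: PromiseInputs) -> float:
    """(Specificity + Actions Taken + (Outcome x Their Role)) x Difficulty."""
    return (p.specificity + p.actions_taken + p.outcome * p.their_role) * p.difficulty


# Worked example with made-up inputs: a measurable promise (3) that was
# signed into law (3), reached the 70-90% band (4), with the official
# judged half responsible (0.5), at a 1.5x difficulty.
example = PromiseInputs(3, 3, 4, 0.5, 1.5)
print(promise_score(example))  # (3 + 3 + 4 * 0.5) * 1.5 = 12.0
```

Under these ranges, a single promise scores between 0 and 30: the maximum is (3 + 4 + 5 × 1.0) × 2.5.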

Behavioral Flags

↩ Reversed
Official actively undid a commitment they previously made.
≠ Redefined
Goalposts moved — the promise was recharacterized after the fact.
■ Externally Blocked
Genuine external obstacle prevented delivery.
◷ Deadline Shifted
Timeline extended without explanation after original deadline passed.
▽ Scope Reduced
Promise delivered in significantly diminished form.
★ Credit Overclaimed
Official claimed credit for outcomes not attributable to their actions.

Letter Grades

A | Exceptional. Strong delivery across most promises, including difficult ones.
B | Strong. Solid execution on core commitments with some gaps.
C | Mixed. Meaningful action on some promises, significant shortfalls on others.
D | Weak. Most promises not delivered or substantially diminished.
F | Poor. Systematic non-delivery, reversals, or pattern of broken commitments.

International Commitments — Why No Grade

The International Commitments tracker operates on a different logic than the official scorecards. International commitments — treaty obligations, multilateral funding pledges, emissions targets, assessed contributions to international organizations — are institutional. They are made by administrations, inherited by successors, honored or abandoned across decades.

Assigning a letter grade to a commitment that has passed through multiple administrations would collapse decades of contested political history into a single letter. Instead, each commitment displays a delivery ratio — the percentage of the pledge delivered as of the research date — alongside a full timeline of events. Administration color bands make clear which events happened under whose watch.
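
The delivery ratio is plain arithmetic: the amount delivered divided by the amount pledged, expressed as a percentage. A minimal Python sketch follows. The function name `delivery_ratio` and the sample figures are hypothetical, and leaving over-delivery uncapped (a ratio above 100%) is an assumption rather than documented behavior.

```python
def delivery_ratio(delivered: float, pledged: float) -> float:
    """Return the percentage of a pledge delivered as of the research date.

    Both amounts must use the same unit (dollars, tonnes of CO2, etc.).
    Assumption: over-delivery is not capped, so the result can exceed 100.
    """
    if pledged <= 0:
        raise ValueError("pledged must be positive")
    if delivered < 0:
        raise ValueError("delivered must be non-negative")
    return delivered / pledged * 100.0


# Made-up figures: 250 units delivered against a 1,000-unit pledge.
print(delivery_ratio(250, 1000))  # 25.0
```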

Presidential Foreign Policy — Why No Grade

The Presidential Foreign Policy tracker scores individual promises, not the president as a whole. Each commitment carries a status (Kept, Broken, Partial, In Progress, Reversed, or Contested) based on the available record. There is no aggregate presidential grade, for three reasons:

Selection effects. Trackable foreign policy promises are not a representative sample of presidential performance. An aggregate grade would reflect political communication as much as governance.

Causal complexity. Foreign policy outcomes are rarely attributable to a single decision. The Their Role modifier partially addresses this, but causal chains in foreign policy are longer and more contested.

Cross-administration comparability. The tracker intentionally shows commitments across administrations to surface reversals and patterns. A grade attached to one president would invite comparisons the underlying data may not support.

Data Sources

Known Limitations

AI-assisted research may miss recent developments, misattribute outcomes, or lack access to local government records. Outcome data for state and local policy is often incomplete or lagged. Scorecards for officials with limited media coverage carry lower confidence ratings. We flag these in each scorecard's Data Gaps section.

Research Process

Each scorecard is generated in two stages. First, an AI research assistant searches the web for campaign materials, news coverage, legislative records, and outcome data. Second, a scoring model applies the PolicyLogic rubric to produce a structured draft scorecard. All scorecards are marked pending human review until verified by a researcher.

PolicyLogic is a nonpartisan project. Scores are determined by the methodology above, not by political affiliation. Officials of every party, and those with no party affiliation, are evaluated on the same rubric.

Found an Error?

If you find a factual error, a missing promise, or a score you believe is wrong, please tell us. Every submission is reviewed.

Open a Scorecard to Report an Error →