
How we make content — and why we show our working

M.A.R.C. — Methodology for Augmented Research Content — governs two systems at Findcraft: how we produce every article on this site, and how our Accuracy Engine verifies what AI says about businesses. This page explains both.

The problem M.A.R.C. exists to solve

The internet has a content quality problem. AI tools make it easy to produce articles at scale, and much of what gets published now is generated without original research, without fact-checking, and without any editorial standard beyond "does it rank?"

That creates a specific problem for readers: you can't tell which articles were carefully researched and which were generated in thirty seconds. The format looks identical. The claims sound equally confident.

M.A.R.C. is our attempt to solve this — not by avoiding AI tools (we use them), but by subjecting everything we publish to a set of rules that are documented, repeatable, and inspectable. AI is part of the workflow. The methodology is what makes the output trustworthy. We explain the thinking behind this decision in why we published our content methodology.

Two systems, same principles

M.A.R.C. operates as two systems. The first — M.A.R.C. for content — governs everything we publish on this site. The seven principles below ensure our research is sourced, counter-argued, and transparent. The second — M.A.R.C. for accuracy scoring — is the engine behind marcscore.com, which verifies what AI platforms say about businesses and produces a 0–100 accuracy score.

Both systems share the same intellectual commitment: source everything, counter your own findings, calibrate confidence to evidence, and make the working inspectable. The content methodology produces trustworthy articles. The accuracy methodology produces trustworthy business reports. Different outputs, same standard.

Part 1: Content methodology

Seven principles

Every article published under the M.A.R.C. methodology must satisfy these seven principles. They are non-negotiable — if an article fails any one of them, it doesn't publish.

1. Answer the question first

The direct answer to the reader's question appears within the first 50 words. No preamble, no "in today's world" filler. If someone came here with a question, we answer it before doing anything else.

2. Source everything

Every factual claim links to a verifiable source. Statistics include context — who measured it, when, and how. We use calibrated language: "research shows" for peer-reviewed findings, "industry data suggests" for commercial studies, "practitioners report" for anecdotal evidence. If we're speculating, we say so.

3. Counter our own argument

Every article that makes a case includes the strongest available counter-argument — not a weakened version, not a caricature, but the argument an intelligent sceptic would actually make. The goal is that someone who disagrees with our conclusion would still feel their position was represented fairly.

4. Disclose our incentives

Findcraft is an AEO consultancy. We sell AI visibility services. Every article that touches on topics related to our services includes an explicit disclosure of this fact, so you can factor our commercial interest into how you read our recommendations.

5. Calibrate confidence to evidence

When the evidence is strong, we say so clearly. When it's mixed, uncertain, or early-stage, we say that too. We don't hedge useful conclusions into uselessness, and we don't present tentative findings as settled fact. The strength of our language matches the strength of the evidence.

6. Link to independent sources

Every article includes links to further reading that we don't control — independent research, industry reports, primary sources. If you want to verify our claims or go deeper, we make that easy rather than trying to keep you on our site.

7. Human review is non-negotiable

No article publishes without a human reviewing it against a structured checklist. AI assists in research and drafting. A human verifies accuracy, checks sources, assesses whether the counter-argument is genuinely strong, and confirms the article serves the reader rather than just the business. This is documented per-article, not just asserted.

What M.A.R.C. does not claim

M.A.R.C. does not claim to be the only way to produce good content. Plenty of excellent writing happens without a formal methodology. What M.A.R.C. provides is a verifiable commitment: a documented, repeatable process you can inspect, so you don't have to take our word for it.

We also don't claim the methodology is finished. M.A.R.C. is versioned and amended when evidence shows it should be. Every change is logged with the reason it was made. The current version has been tested against a small number of published articles — not hundreds. We'll say more when we have more data. You can see the methodology applied in our analysis of the AI traffic measurement paradox and how AI agents respond to sponsored content.

Part 2: Accuracy scoring methodology

The M.A.R.C. Accuracy Score answers a specific question: when AI platforms talk about your business, is what they say true? The engine extracts individual factual claims from AI responses, verifies each one against assembled ground truth, and produces a 0–100 score reflecting the proportion of verified claims that are accurate.
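A simplified sketch of that arithmetic: the score is the share of verified claims that are accurate, with unverifiable claims excluded from the denominator but surfaced as coverage. The function name, rounding, and return shape here are chosen for illustration and are not the production engine:

```python
def accuracy_score(claims: list[str]) -> tuple[int, float]:
    """Score a list of claim verdicts ("accurate", "inaccurate",
    "hallucinated", "unverifiable").

    Returns (score 0-100, verification coverage 0.0-1.0). Unverifiable
    claims don't count for or against the score, but they lower coverage.
    """
    verified = [c for c in claims if c != "unverifiable"]
    coverage = len(verified) / len(claims) if claims else 0.0
    if not verified:
        return 0, 0.0
    score = round(100 * sum(c == "accurate" for c in verified) / len(verified))
    return score, coverage
```

This is why a score is never shown without its coverage: ten claims with nine accurate gives 90/100 at full coverage, but the same 90 from a response where most claims couldn't be checked means far less.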

Ground truth assembly

Before any accuracy checking happens, the engine assembles verified facts about the business from multiple independent sources:

  • Website crawling — the business's own site is crawled and parsed for pricing pages, service descriptions, contact details, and opening hours
  • Structured data extraction — JSON-LD schema markup is parsed from every crawled page, extracting machine-readable business facts
  • Google Business Profile — the Places API provides independently verified address, phone, hours, rating, and category data
  • Cross-referencing — where sources agree, confidence is high. Where they conflict, the conflict is flagged and the lower-confidence value is noted

Each business's ground truth receives a quality grade: RICH (6+ data points from 2+ sources, no conflicts), MODERATE (3+ data points from 2+ sources), THIN (3+ data points from 1 source), or MINIMAL (fewer than 3 data points). This grade is displayed prominently on every report — a score based on thin ground truth means something different from one based on rich data, and we make that distinction visible.
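The grading thresholds above are simple enough to express directly. This sketch encodes them as written (the function name and signature are ours, not the engine's internal API); note the ordering means rich-sized data with a conflict falls through to MODERATE:

```python
def grade_ground_truth(data_points: int, sources: int, has_conflicts: bool) -> str:
    """Assign a ground-truth quality grade using the published thresholds."""
    if data_points >= 6 and sources >= 2 and not has_conflicts:
        return "RICH"
    if data_points >= 3 and sources >= 2:
        return "MODERATE"      # includes 6+ points where sources conflict
    if data_points >= 3:
        return "THIN"          # enough data, but only one source
    return "MINIMAL"
```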

Three-tier verification

Once ground truth is assembled, AI responses are collected and every factual claim is verified through three tiers:

Tier 1 — Deterministic comparison. Hard facts — addresses, phone numbers, opening hours, prices — are compared using exact matching with normalisation. An address is standardised for abbreviations and postcode format before comparison. A phone number is normalised to E.164 format. Hours are parsed into structured day/time pairs with a 30-minute tolerance. No AI is involved in Tier 1 — these are machine comparisons against verified data.
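To make "exact matching with normalisation" concrete, here is a minimal sketch of two Tier 1 checks: a phone comparison after reducing both numbers to digits (real E.164 handling uses a dedicated library; the UK-centric trunk-prefix rule here is a simplifying assumption), and an opening-time comparison with the 30-minute tolerance:

```python
import re
from datetime import datetime

def normalise_phone(raw: str, country_code: str = "+44") -> str:
    """Strip formatting and apply a country code (simplified sketch,
    not full E.164 parsing)."""
    digits = re.sub(r"\D", "", raw)
    if digits.startswith("0"):          # drop the national trunk prefix
        digits = digits[1:]
    return country_code + digits

def hours_match(claimed: str, verified: str, tolerance_min: int = 30) -> bool:
    """Compare two HH:MM opening times within a tolerance window."""
    fmt = "%H:%M"
    a = datetime.strptime(claimed, fmt)
    b = datetime.strptime(verified, fmt)
    return abs((a - b).total_seconds()) / 60 <= tolerance_min
```

The point of Tier 1 is exactly this mechanical quality: the same inputs always produce the same verdict, with no model in the loop.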

Tier 2 — Semantic comparison. Softer claims — service descriptions, specialisations, reputation statements — require judgment. Tier 2 uses an adversarial pipeline: one model (Haiku) extracts individual claims from AI responses and provides exact quotes. A deterministic check verifies each quote actually appears in the original text (filtering phantom extractions). A second model (Sonnet) then compares each verified claim against the ground truth and provides reasoning for its classification. This two-model pipeline means the extractor and the verifier are independent — one challenges the other.
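The phantom-extraction filter in that pipeline is the one deterministic step, and it can be sketched in a few lines. Whether the real check normalises whitespace and case before matching is our assumption; the principle is that the extractor's quote must appear verbatim in the AI's original response:

```python
def quote_is_grounded(quote: str, source_text: str) -> bool:
    """Reject claims whose supporting quote does not actually appear
    in the original AI response (a "phantom extraction")."""
    def norm(s: str) -> str:
        # Collapse whitespace and ignore case before substring matching
        return " ".join(s.split()).lower()
    return norm(quote) in norm(source_text)
```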

Tier 3 — Escalation. Claims that neither deterministic nor semantic comparison can confidently resolve are routed for human review. This includes: Tier 2 results where extraction and comparison models disagree, claims involving subjective quality assessments, and cases where ground truth conflicts make automated classification unreliable.

Claim classification

Every extracted claim receives one of four classifications:

  • Accurate — the claim matches ground truth data
  • Inaccurate — the claim contradicts ground truth data (e.g., wrong opening hours, incorrect address)
  • Hallucinated — the claim is fabricated with no basis in any source (e.g., a location that doesn't exist, a service never offered)
  • Unverifiable — ground truth data doesn't cover this topic, so the claim can't be checked either way

The distinction between "inaccurate" and "hallucinated" matters. An inaccurate claim gets a real fact wrong — the business opened in 2012, not 2015. A hallucinated claim invents something entirely — a branch location that has never existed. Both are errors, but they have different causes and different fixes.
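One way to picture the decision order behind the four labels is the sketch below. The ground-truth shape is a deliberate simplification (a flat dict, with an explicit None meaning "a source confirms this thing does not exist"); the real engine's schema and its hallucination test are richer than this:

```python
from enum import Enum

class Verdict(Enum):
    ACCURATE = "accurate"
    INACCURATE = "inaccurate"
    HALLUCINATED = "hallucinated"
    UNVERIFIABLE = "unverifiable"

def classify(topic: str, claimed_value, ground_truth: dict) -> Verdict:
    """Illustrative classification order for a single extracted claim."""
    if topic not in ground_truth:
        return Verdict.UNVERIFIABLE          # no data either way
    expected = ground_truth[topic]
    if expected is None:
        return Verdict.HALLUCINATED          # sources confirm it doesn't exist
    if claimed_value == expected:
        return Verdict.ACCURATE
    return Verdict.INACCURATE                # real fact, wrong value
```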

Transparency in every report

Every accuracy report shows its working:

  • Verification coverage — how many claims could be checked (e.g., "24 of 57 claims verified"). A score based on 42% coverage is flagged differently from one based on 90%
  • Ground truth quality grade — RICH, MODERATE, THIN, or MINIMAL, so the reader knows how strong the comparison data is
  • Confidence levels — each finding carries a confidence indicator based on the tier that produced it and the quality of the ground truth it was compared against
  • Methodology version — reports are versioned so findings can be compared across time as the methodology improves

The score is never presented without this context. A 45/100 with rich ground truth means something very different from a 45/100 with minimal data — and we make both the score and its limitations visible.

Same principles, different application

Every principle from the content methodology maps directly to the accuracy scoring system:

  • "Source everything" → ground truth assembled from multiple verified sources before any comparison happens
  • "Counter our own argument" → the adversarial Tier 2 pipeline uses independent models to challenge its own findings
  • "Calibrate confidence" → quality grades (RICH/MODERATE/THIN/MINIMAL) and verification coverage percentages on every report
  • "Human review non-negotiable" → Tier 3 escalation routes uncertain claims to human judgment
  • "Disclose incentives" → every report includes a methodology footer and coverage caveats

How to check our work

Every article on this site carries a M.A.R.C. methodology footer confirming it was produced through this process. If you want to verify a specific claim, every factual assertion links to its source. If you find an error or a claim without adequate sourcing, let us know — corrections are part of the process.

See the methodology in action

Content: Every article on the Findcraft blog is produced using M.A.R.C. Start with our research into the AI traffic measurement paradox or learn how AI decides which businesses to recommend.

Accuracy scoring: Try a free scan at marcscore.com to see what AI gets right and wrong about your business — the M.A.R.C. methodology applied to your specific AI presence.
