Falsifiability as a Design Principle

"A theory which is not refutable by any conceivable event is non-scientific. Irrefutability is not a virtue of a theory (as people often think) but a vice." — Karl Popper

A judge in a courtroom looks down at an exhibit. The exhibit is a two-paragraph summary, generated by an AI, of a year of text messages between a parent and a case manager. The summary is signed and time-stamped. It carries an impressive seal. The judge reads it, pauses, and says: how do I know this is true?

The honest answer is that truth is not the question the system can answer. The question the system can answer is something more modest and, properly used, more useful: here is what would have to be true for this summary to be wrong, and here is what we tried, and here is what we did not try.

A summary that takes that posture is falsifiable. A summary that does not is, in Popper's sense, irrefutable — and irrefutability, in the courtroom, is a vice.

This chapter is about the discipline that makes falsifiability possible, in software, in evidence, and in the systems whose outputs appear on judges' desks. The Canon four-stage chain — Witness, Findings, Refutation, Seal — is one engineering answer. The chapter starts with the philosophy, ends with the engineering, and argues that the engineering only works because the philosophy does.

At a glance

A finding is meaningful only if it could, in principle, be shown wrong by someone who does not trust the finder.
Most software systems that emit findings — retrieval results, summaries, expert reports — fail this test, even when their engineering is sound.
The Canon four-stage chain bakes the capacity to be falsified into the artifact itself: typed claims, declared gaps, applied-and-declined challenges, all signed. Without the discipline, the cryptography signs nothing useful.

Learning objectives

By the end of this chapter, you should be able to:

Distinguish the trust-me posture from the catch-me posture and explain why evidence work requires the latter.
Map Popper's falsificationism — testable prediction, survival of attempted refutation, flagged untested aspects — onto the four mechanisms of a Canon Attestation.
Identify the three failure modes the trust-me posture invites (sycophancy, confirmation bias, selective ingestion) and describe which of them Canon mitigates cryptographically versus which it only names.
Execute the seven-step verification protocol against a sample attestation and distinguish a tampered artifact from a cryptographically valid but substantively rejected one.
Explain why falsifiability is a property of an artifact, not of a system, and articulate the implication for how Canon attestations must be treated individually.

The trust-me posture and the catch-me posture

Every claim a system makes about the world has a default trust posture. There are only two interesting ones.

The trust-me posture says: I have done my best work; my methodology is sound; rely on the result. This is the posture of conventional research summaries, expert reports, audit findings, ranked search results, generative AI outputs. It is not, in itself, dishonest. It is just a posture that places the cost of disbelief on the recipient. To doubt the claim, the recipient must produce their own work product of comparable scope.

The catch-me posture says: Here is the result; here is exactly what would have to be true to make it wrong; here is what I tested; here is what I did not test and why; verify all of it. The cost of disbelief is borne by the issuer in the form of disclosure. The recipient does not have to produce parallel work product; the recipient has only to run the falsification protocol the artifact ships with.

Most software defaults to the trust-me posture because it is cheaper for the issuer. Evidence work cannot afford to. The catch-me posture is the one Canon embodies.
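
The catch-me posture can be sketched as a data structure. What follows is a minimal illustration, not the Canon schema. Only `inference_type`, `gaps`, the rule that non-observational claims declare at least one gap, and the idea of applied and declined challenges with recorded reasons come from this book. The class names, the `"observational"` label, the challenge kind, and the function `disclosure_problems` are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Claim:
    text: str
    inference_type: str                             # "observational" is an assumed label
    gaps: list[str] = field(default_factory=list)   # what the issuer did not test


@dataclass
class Challenge:
    kind: str            # hypothetical challenge name
    target: str          # text of the claim the challenge is aimed at
    applied: bool        # False means the issuer declined to run it
    reason: str = ""     # machine-readable reason, expected when declined


def disclosure_problems(claims, challenges):
    """Return a list of missing disclosures; an empty list means none were found."""
    problems = []
    for claim in claims:
        # A non-observational claim with no declared gap says nothing about
        # what would make it wrong.
        if claim.inference_type != "observational" and not claim.gaps:
            problems.append(f"claim declares no gaps: {claim.text!r}")
    for challenge in challenges:
        # A declined challenge without a reason hides what was left untested.
        if not challenge.applied and not challenge.reason:
            problems.append(f"declined challenge has no reason: {challenge.kind}")
    return problems


claims = [
    Claim("Message 14 was sent on 2 March", "observational"),
    Claim("The parent missed three appointments", "inferential"),
]
challenges = [Challenge("counter_example_search", claims[1].text, applied=False)]

for problem in disclosure_problems(claims, challenges):
    print(problem)
```

Run as written, the sketch reports two problems: the inferential claim declares no gaps, and the declined challenge carries no reason. Both are disclosures the issuer owes the recipient under the catch-me posture; neither says anything about whether the claim is true.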

▼ Why It Matters. A judge who admits a piece of evidence on the trust-me posture has, in practice, shifted the cost of disagreement onto opposing counsel. Daubert v. Merrell Dow (1993) and the reliability-gatekeeping case law that grew from it are largely about reversing that shift: making the proponent of expert evidence prove reliability up front, not after a years-long battle of competing reports. The catch-me posture is what reliability gatekeeping looks like, mechanized into the artifact.

Popper, briefly, for engineers

Karl Popper's The Logic of Scientific Discovery (1934) framed the problem this book is built around. A scientific theory, Popper argued, is not validated by piling up confirmations: every dawn confirms the sun-rises-in-the-east theory but adds nothing to its testability. A theory earns credit (Popper's word was corroboration, not proof) only by surviving genuine attempts to refute it. Theories that cannot, in principle, be refuted are not scientific; they are something else (he called it metaphysics; for our purposes the something-else is opinion).

The same argument applies to a piece of computational evidence. A retrieval result that cannot in principle be shown wrong by anyone other than its issuer is not evidence in any useful sense. It is the issuer's opinion, dressed up.

The Canon design imports this discipline directly:

Popper's insight	Canon's mechanism
A theory must specify what would falsify it.	Each Claim declares its `inference_type` and explicit `gaps`.

enforcement in the reference implementation.

☉ In the Wild — Why XML-DSig is the cautionary tale.

XML Digital Signatures, standardized in 2002, were the technology the legal and financial industries adopted in the early 2000s to sign electronic documents. They were sound cryptographically and catastrophic epistemically. The seal proved that a specific bit-stream had been signed by a specific holder. It did not commit the signer to what the bit-stream meant, because XML-DSig allowed the signer to declare their canonicalization method outside the signed envelope: an attacker could swap the canonicalizer without invalidating the signature, and the verifier would accept a document the signer never approved. Brad Hill catalogued this class of vulnerabilities in his Black Hat 2007 presentation.

Canon's design, which declares RFC 8785 canonicalization inside the signed seal and adds the four-block discipline of Witness/Findings/Refutation/Seal, is the response. The cryptography is necessary but not sufficient. The discipline is what closes the loop.
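
Why it matters that the canonicalization method sits inside the sealed payload can be shown in a few lines. This is a sketch under stated assumptions, not Canon's seal format: the `canonicalization` and `body` keys and the helper names are hypothetical, a SHA-256 digest stands in for the signature, and `json.dumps` with sorted keys is only an approximation of RFC 8785, which also fixes number and string serialization.

```python
import hashlib
import json


def canonical_bytes(obj):
    # Approximate canonical form: sorted keys, no insignificant whitespace.
    # This is NOT a full RFC 8785 implementation.
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def seal_digest(body, canonicalization="rfc8785"):
    # The method name is part of the bytes that get hashed (and, in a real
    # system, signed), so it cannot be swapped without changing the digest.
    payload = {"canonicalization": canonicalization, "body": body}
    return hashlib.sha256(canonical_bytes(payload)).hexdigest()


body = {"findings": ["claim-1"], "witness": "source-digest"}
reordered = {"witness": "source-digest", "findings": ["claim-1"]}

# Key order does not matter once the payload is canonicalized.
same_content = seal_digest(body) == seal_digest(reordered)

# Declaring a different method yields a different digest, so a swap is visible.
swap_detected = seal_digest(body) != seal_digest(body, "other-c14n")

print(same_content, swap_detected)  # prints: True True
```

The design choice on display is narrow: whatever tells the verifier how to interpret the bytes must itself be among the bytes the issuer committed to.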

What this chapter has established

Verifiability — the hardest of the three properties from Chapter 1 — requires not just a signature but a declared set of things the issuer asserts, a declared set of things the issuer tested, and a declared set of things the issuer did not test. All of it signed. All of it replayable.

The remaining parts of this book are engineering. Each chapter in Part II installs one primitive in the chain. Each chapter in Part III describes how the primitives compose. Parts IV and V are the system in operation and in court.

At every boundary between parts, two questions apply:

Could a recipient who does not trust the issuer verify this output?
Could a recipient who does not trust the issuer catch the issuer lying?

Most systems you will encounter fail the first question. Some fail the second as well — not because they are dishonest, but because they were not designed to be caught. A system that invites scrutiny and survives it is the only kind that belongs in a record of proceedings.

Lab 2 — reading a real attestation

The labs begin in Chapter 5. For Chapter 2, there is one preparatory exercise: verify that you can run the Canon reference verifier against a test artifact.

You do not need to understand how Ed25519 signature verification works to interpret the result — a passing step means the key matched; a failing step means it did not. Chapter 6 explains the mechanism in full.

cd Meridian-Cannon
# Run the seven-step walker against the sample test attestation
python -m meridian.canon.cli verify tests/fixtures/sample_attestation.json

The output should print VALID for steps 1–6 and display the declined inventory from step 7. If it does not, consult meridian/canon/walk.py and compare the error message against the seven steps above. This is the verifier you will use throughout the labs in Part II.

> ✻ Try This. Before running the command above, manually locate
> the chain_hash field in tests/fixtures/sample_attestation.json.
> Note its value. Then locate public_key_url. Predict what steps 1
> and 3 will do with those two fields before you run the verifier.
> Run it. Were you right?
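
To get a feel for what a step walker does before you read the real one, here is a toy version. It is not the code in meridian/canon/walk.py, and its behaviour should not be taken as a prediction of the reference verifier's. It models only three of the seven steps (chain hash, supports resolution, challenge-target resolution) and omits public-key fetch, DSSE signature, and content re-hash. The attestation layout, the assumption that chain_hash commits to the three content blocks, and the reason code are all invented for the example.

```python
import hashlib
import json


def digest(obj):
    # Toy canonical form (sorted keys, compact separators); not RFC 8785.
    raw = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


def check_chain_hash(att):
    # Assumption: chain_hash commits to the witness, findings and refutation blocks.
    blocks = [att["witness"], att["findings"], att["refutation"]]
    return att["chain_hash"] == digest(blocks)


def check_supports(att):
    # Every claim must point at evidence that exists in the witness block.
    known = {item["id"] for item in att["witness"]}
    return all(s in known for claim in att["findings"] for s in claim["supports"])


def check_challenge_targets(att):
    # Every applied or declined challenge must target a claim that exists.
    claim_ids = {claim["id"] for claim in att["findings"]}
    ref = att["refutation"]
    return all(ch["target"] in claim_ids for ch in ref["applied"] + ref["declined"])


def walk(att):
    """Return (verdict, declined inventory) for the modelled steps."""
    steps = [
        ("chain hash", check_chain_hash),
        ("supports resolution", check_supports),
        ("challenge-target resolution", check_challenge_targets),
    ]
    for name, check in steps:
        if not check(att):
            return f"INVALID at {name}", []
    # The last step is not pass/fail: it surfaces what the issuer did not test.
    return "VALID", att["refutation"]["declined"]


sample = {
    "witness": [{"id": "msg-14"}],
    "findings": [{"id": "claim-1", "supports": ["msg-14"]}],
    "refutation": {
        "applied": [{"kind": "date_check", "target": "claim-1"}],
        "declined": [
            {"kind": "speaker_identity", "target": "claim-1", "reason": "NO_GROUND_TRUTH"}
        ],
    },
}
sample["chain_hash"] = digest([sample["witness"], sample["findings"], sample["refutation"]])

verdict, declined = walk(sample)
print(verdict, declined[0]["reason"])   # prints: VALID NO_GROUND_TRUTH

tampered = dict(sample, chain_hash="0" * 64)
print(walk(tampered)[0])                # prints: INVALID at chain hash
```

The two runs illustrate the distinction the learning objectives draw. The tampered artifact fails a mechanical check that any recipient can reach alone. The untampered one passes, yet its declined inventory is non-empty, and deciding whether an untested speaker identity matters is the recipient's substantive judgement, not the walker's.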

💡 Key Takeaways

- Popper's falsifiability criterion applied to evidence means every claim must specify what would make it wrong. A Canon Claim satisfies this by declaring its inference_type, its gaps, and the challenges it survived.
- An unfalsifiable claim, one that cannot in principle be shown wrong by anyone other than the issuer, has no evidentiary value regardless of how authoritative it sounds; the Canon catch-me posture is the structural answer to the trust-me posture.
- Authentication (a signed PDF) differs from verifiability (a recipient can independently check the content, the inference chain, and the challenge record without the issuer's cooperation). Canon is designed for the latter.
- A Canon Attestation embeds seven verifiable steps covering public-key fetch, DSSE signature, chain hash, content re-hash, supports resolution, challenge-target resolution, and declined-challenge review. Any failure at steps 1–6 yields a binary verdict the recipient can reach alone.
- Falsifiability is a property of an individual artifact, not of a system: each attestation must carry its own falsification harness because a future recipient cannot interrogate the system that produced it.

## Exercises

### Warm-up

1. Take any AI-generated artifact you have produced or received in the last week: a summary, a code review, a document analysis. List three claims it makes. For each claim, write one sentence describing what it would mean for that claim to be wrong.
2. Of the three, how many could be checked by a third party who did not have access to your prompt history or to you?

### Core

3. Read Popper's Conjectures and Refutations (Chapter 1, available widely; the relevant excerpt is ~10 pages). Identify the analogue, in his framework, of (a) inference type, (b) declared gaps, (c) the declined-challenge inventory.
4. The Canon spec requires that every non-observational claim declare at least one gap. Why is this requirement load-bearing? What incentives does it create on the issuer of the claim, and why is making those incentives explicit important?
5. Run meridian-canon verify on the sample attestation in docs/textbook/labs/ch25_verifier/fixtures/01_valid.json. Record the exit code. Then open the file, change one character in the chain_hash field, and run verify again. Record which step fails and write one sentence explaining why that step catches the tampering.

### Stretch

6. Construct a thought experiment in which the seven-step verification protocol is necessary but not sufficient. That is: an attestation that passes all seven steps but is nonetheless misleading. What does your example tell you about where Canon's guarantees end and the recipient's substantive judgement begins?
7. The Refutation block requires a declined list with machine-readable reasons. Why "machine-readable"? What kinds of misuse does that requirement preempt that "human-readable explanation" alone would permit?

## Build-your-own prompt

For the corpus you named at the end of Chapter 1: list two challenge types you expect would be difficult or impossible to apply to that corpus. For each, write the machine-readable reason you would record in the declined list. Save these notes; they will become part of your capstone design.

## Further reading

- Karl Popper, Conjectures and Refutations (1963), Chapter 1.
- The Canon spec, §3 (the Reading Guide). The four-stage chain is described there in roughly the form this chapter has presented it.
- Sharma et al., "Towards Understanding Sycophancy in Language Models," arXiv 2310.13548 (2023). Foundational on the LLM-specific failure mode the falsifiability discipline is meant to counter.
- W. V. O. Quine, "Two Dogmas of Empiricism" (1951), if the holism sidebar interested you.
- Daubert v. Merrell Dow Pharmaceuticals, Inc., 509 U.S. 579 (1993). The Supreme Court's reliability-gatekeeping standard for expert testimony. The catch-me posture this chapter argues for is, in procedural terms, a mechanism for pre-empting Daubert challenges before trial.
- Brad Hill, "Tricks and Treats: More Fun with XML Security" (Black Hat 2007). The XML-DSig vulnerability catalogue referenced in the ☉ sidebar. A concrete illustration of what sound cryptography without epistemic discipline produces.
- The dossier research/04_adversarial_llm_eval.md for the rest of the literature on adversarial validation.

Next: Chapter 3 — An Engineer's Tour of the Federal Rules of Evidence.