Part IV — Engineering Practice · Chapter 28
Schema Design & Migrations
Schema Design & Migrations
A migration that drops a column while a legal hold is active on that column destroys evidence. A schema is not just data engineering — it is chain-of-custody infrastructure.
ℹPrerequisites▼
Before reading this chapter, you should be comfortable with: Chapter 14 (Postgres Substrate). Migrations extend the base schema; you need to know the base before modifying it.
The Meridian-Cannon schema spans nineteen SQL files. They apply in filename order, and that order is not accidental. Every file is a contract: the tables and indexes it creates are infrastructure the files that follow will depend on. Change them carelessly and you do not just break a query — you break a chain.
This chapter explains how the schema is laid out, how to extend it without breaking anything, and what discipline makes a migration safe in an evidence context. The running case is the 2026 TPR proceeding: a schema that cannot survive a mid-proceeding update is a schema that cannot handle a case that moves the way real cases move.
At a glance
- Migrations are numbered and reversible: numeric files (00–99) establish the base schema once; alphabetic phase files (A0, B0, C0…) extend it in paired forward/down scripts.
- The alphabetic namespace sorts after 99_ and maps to Canon version increments, so phase additions never collide with the sealed base schema.
- Every migration must be idempotent —
IF NOT EXISTS,CREATE OR REPLACE, andON CONFLICT DO NOTHINGare the tools; a migration that cannot be applied twice safely is not ready for production. ## Learning objectives By the end of this chapter you should be able to: 1. Explain the two-tier naming convention (numeric base versus alphabetic phase) and state why numeric files are sealed after initial deployment. 2. Write a reversible migration for a new domain table, including the paired.down.sqlscript, and verify the round-trip in a development database. 3. ApplyIF NOT EXISTS,CREATE OR REPLACE, andON CONFLICT DO NOTHINGcorrectly to make a migration idempotent, and explain which idiom applies to each DDL statement type. 4. Add a nullable column to an existing table in a new migration file — without modifying the original file — in a way that leaves all existing workers compatible. ## 21.1 The numbered-file discipline Open theschema/directory. The files are:
00_setup.sql 10_core.sql 20_provenance.sql
30_documents.sql 40_communications.sql 50_recordings.sql
60_court.sql 70_financial_telephony.sql 80_telemetry.sql
85_legal.sql 90_workers_correlations.sql 95_views_indexes.sql
97_supabase.sql 99_rls.sql
A0_attestations.sql B0_chunks_fts.sql B1_paradedb_fts.sql
B2_pgvectorscale.sql C0_entities_resolution.sql E0_rekor.sql
The numeric prefix (00–99) is applied once: at system bootstrap. Every table
in the 00–99 block is part of the initial schema. Do not touch them after they ship. Do not add a column to 10_core.sql six months later.
Add a migration.
The alphabetic prefix (A0, B0, C0...) marks phases. Phase A is the Canon foundation. Phase B is chunk full-text search. Phase C is entity resolution. Each phase corresponds to a Canon version increment. A phase migration file lives alongside its down script:
A0_attestations.sql
A0_attestations.down.sql
The pairing is not optional. A migration without a down script is a one-way door. Evidence databases need to go back through that door — to recover from a bad deployment, to roll back after an upstream spec change. The down script is the only way back.
The critical rule: never restructure an existing file. The numeric files
are sealed. If 10_core.sql defines matters, you do not add a column to 10_core.sql's CREATE TABLE statement. You write a new migration. This rule is not stylistic preference — it is the difference between a schema whose history is legible and one that is not. > ▼ Why It Matters — The mid-proceeding schema change. > > The TPR proceeding opens in January 2026. The initial schema ships with > matters, parties, sources, documents. By March, the attorney > identifies a new category of evidence: voicemail transcripts stored by > the opposing agency's phone system. The schema needs a new table. The > migration adds it; the existing rows in documents are untouched. The > chain of custody for every document ingested before March is intact > because nothing about the documents table changed. An evidence database > that cannot be extended mid-proceeding is useless for any case that lasts > more than six months. ## 21.2 Idempotency: applying a migration twice A migration is idempotent if running it twice leaves the schema identical to running it once. In a deployment pipeline that can fail and retry, idempotency prevents the retry from failing because the table already exists. The idiom is IF NOT EXISTS:
CREATE TABLE IF NOT EXISTS attestations ( ... );
CREATE INDEX IF NOT EXISTS attestations_matter_idx ON attestations(matter_id, issued_at);
CREATE OR REPLACE FUNCTION attestations_audit_trigger() RETURNS trigger AS $$ ... $$;
CREATE TABLE IF NOT EXISTS and CREATE INDEX IF NOT EXISTS are idempotent at the DDL level. CREATE OR REPLACE FUNCTION replaces the function body if it exists and creates it if it does not. The trigger requires more care: you must DROP TRIGGER IF EXISTS before CREATE TRIGGER because PostgreSQL 15 and earlier do not support CREATE OR REPLACE TRIGGER for all trigger types. For data operations inside migrations — filling in a default value for a new NOT NULL column, for instance — the idiom is ON CONFLICT DO NOTHING or an UPDATE WHERE NOT EXISTS guard. The goal is the same: running the file a second time must not corrupt the data it already produced. > ◆ Going Deeper — Why PostgreSQL DDL is transactional. > > PostgreSQL runs DDL inside transactions. CREATE TABLE, ALTER TABLE, > CREATE INDEX — all participate in MVCC and can be rolled back. This is > unusual: MySQL (before 8.0) does not support transactional DDL, which is > part of why GitHub's migration from MySQL to Vitess was so difficult to > instrument. In PostgreSQL, a migration file can be wrapped in a single > BEGIN / COMMIT block, and if any statement fails, the entire file > rolls back automatically. The migration either fully applies or leaves > nothing behind. This is the correct model for an evidence schema. > > There is one exception: CREATE INDEX CONCURRENTLY cannot run inside a > transaction. If you need it (for a large table where a blocking index build > would cause downtime), it must be in its own file, outside a transaction > wrapper. Document this explicitly in the file header. ## 21.3 Down migrations: when trivial, when hard The down script for A0_attestations.sql is ten lines:
-- A0_attestations.down.sql
ALTER TABLE acquisitions DROP COLUMN IF EXISTS obs_attestation_id;
DROP TRIGGER IF EXISTS attestations_audit ON attestations;
DROP FUNCTION IF EXISTS attestations_audit_trigger();
DROP INDEX IF EXISTS attestations_fingerprint_idx;
DROP INDEX IF EXISTS attestations_kind_idx;
DROP INDEX IF EXISTS attestations_matter_idx;
DROP TABLE IF EXISTS attestations;
This down script is trivial because A0_attestations.sql added a new table and a new column on an existing table (acquisitions). The inverse is to drop the column and drop the table. No data migration is required. Down migrations become hard when the forward migration transforms data, not just structure. If A0_attestations.sql had also backfilled an obs_attestation_id for every existing acquisition row, the down script would need to remove those backfilled values — and would have no way to know which rows it backfilled versus which rows had the column populated by later business logic. The down script becomes ambiguous. Never mix schema migrations and data migrations in one file. Schema first, in one transaction. Data migration as a separate step, in a second file, after the schema is confirmed. The down scripts stay unambiguous because they only ever undo one kind of change. The ordering rule for multi-migration down scripts is strict: down scripts run in reverse order. If migration A0 adds attestations and migration A1 adds a foreign key from documents to attestations, then: - To roll back: run A1.down.sql first (drops the FK), then A0.down.sql (drops the table). - Running them in forward order would attempt to drop attestations while the FK from documents still references it. PostgreSQL's referential integrity enforcement would refuse. > § For the Record — Zubulake v. UBS Warburg III, 220 F.R.D. 212 (S.D.N.Y. 2003). > > "The obligation to preserve evidence arises when the party has notice that > the evidence is relevant to litigation or when a party should have known > that the evidence may be relevant to future litigation." > > A schema migration that drops a column — even a column that seems > redundant — is a deletion event on any data stored in that column. If those > values are relevant to pending or anticipated litigation, the deletion is > spoliation. The additive discipline (never remove, only deprecate) is not > pedantry. It is the minimum engineering response to Zubulake's preservation > obligation. ## 21.4 The A0_attestations.sql walkthrough
The attestations table is the most architecturally interesting migration in the repository. Walk through it with the design rationale visible.
CREATE TABLE attestations (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
attestation_id text UNIQUE NOT NULL,
CHECK (attestation_id ~ '^[A-Z0-9]+$'),
kind text NOT NULL CHECK (kind IN (
'observation', 'enrichment', 'search', 'brief', 'audit'
)),
canon_version text NOT NULL,
matter_id uuid REFERENCES matters(id),
issued_at timestamptz NOT NULL,
issuer text NOT NULL,
subject text,
chain_hash text NOT NULL,
CHECK (chain_hash ~ '^sha256:[0-9a-f]{64}$'),
signature text NOT NULL,
public_key_fingerprint text NOT NULL,
CHECK (public_key_fingerprint ~ '^sha256:[0-9a-f]{64}$'),
public_key_url text NOT NULL,
payload jsonb NOT NULL,
kind_specific jsonb,
created_at timestamptz NOT NULL DEFAULT now()
);
Three design decisions warrant explanation.
Two IDs. Every row has a uuid primary key (id) and a separate attestation_id text field. The UUID is the database row's identity — fast joins, FK references from other tables, internal Postgres efficiency. The attestation_id is the artifact's external identity: a ULID, printed in the Canon-conformant JSON payload. These two identities must never be conflated. The CHECK constraint '^[A-Z0-9]+$' validates that the external ID conforms to ULID encoding before insertion. If you coupled the external ID to the row's insertion order (using a sequence), you would leak information about how many attestations have been issued — which can be sensitive. ULIDs preserve sort order without revealing count. Denormalized crypto fields. chain_hash, signature, public_key_fingerprint, and public_key_url are extracted from payload->'seal' and stored as indexed top-level columns. This is deliberate redundancy. The canonical source is always the payload JSONB column — a verifier should run the seven-step protocol against payload, not against these columns. The denormalized fields exist for indexed lookups: "find all attestations signed by key with fingerprint X" or "check whether a chain_hash has already been recorded" without deserializing every payload in the table. The CHECK constraints enforce that these columns remain consistent with their expected formats. Why A0_attestations.sql lives after 99_rls.sql. The initial numeric sequence ends at 99_rls.sql. Row-level security is the last thing applied to the base schema because it wraps every table that precedes it. The attestations table, added in phase A, needed its own RLS policies but those are not yet written — they will arrive in A0_rls.sql or as additions to a
future migration. Placing the attestations table after RLS means it initially
has no row-level policies until the appropriate phase adds them. An
unprotected table in an otherwise RLS-protected database is a risk, but it is
a known and auditable risk, not a silent one. The file header documents the
gap.
After the main table, the migration adds three indexes and an audit trigger:
CREATE INDEX attestations_matter_idx ON attestations(matter_id, issued_at);
CREATE INDEX attestations_kind_idx ON attestations(kind, issued_at);
CREATE INDEX attestations_fingerprint_idx ON attestations(public_key_fingerprint);
CREATE TRIGGER attestations_audit
AFTER INSERT ON attestations
FOR EACH ROW EXECUTE FUNCTION attestations_audit_trigger();
The composite indexes carry issued_at as the second column because the primary access pattern is timeline queries: "all attestations for this matter ordered by issuance" and "all observation attestations ordered by issuance." A matter_id index alone would produce an unordered heap scan for timeline queries; the composite index produces a sorted result directly. The fingerprint index is narrow and dense — a single text column with no composite. It serves key-rotation queries: when a signing key is rotated, the system needs to locate every attestation signed by the old key. Without this index, that query would be a full table scan. The audit trigger fires on every INSERT. It calls the audit() function (defined in 10_core.sql) which appends a hash-chained row to the audit_log table. Every Canon emission is therefore permanently logged, with the audit row's hash incorporating the preceding row's hash. An auditor can confirm that the log is complete and unmodified by recomputing the hash chain. Finally, the migration adds a back-link to acquisitions:
ALTER TABLE acquisitions
ADD COLUMN obs_attestation_id text REFERENCES acquisitions(attestation_id);
This is a nullable FK, not a NOT NULL constraint. Most acquisitions will not
have an associated attestation at the time of ingestion. The Phase B witness
wrapper (meridian/witness/wrapper.py) populates this column as attestations are emitted. Making it nullable preserves compatibility with all existing acquisition rows. > ☉ In the Wild — GitHub's migration from MySQL to Vitess (2022). > > GitHub's database team spent two years migrating 150 million users' data > from a monolithic MySQL cluster to Vitess (a MySQL-compatible sharded > topology). Their published discipline had three rules for every migration: > (1) never lock — any DDL that takes a table lock is prohibited because a > one-minute table lock at GitHub scale causes visible user-facing downtime; > (2) never drop — columns can be deprecated and ignored by the application > layer, but the column stays in the schema until a separate verified cleanup > operation removes it; (3) always backward-compatible — the new schema must > be readable by both the old and new application code simultaneously. > > The same three rules apply here, with one addition: never delete rows that > are subject to a legal hold. GitHub's "never drop" rule is an operational > concern — the evidence database's "never drop" rule is a legal obligation. > A migration that drops a column containing communication metadata that is > relevant to a pending proceeding is not just a deployment risk. It is > evidence spoliation under Zubulake. ## 21.5 Schema and Canon co-evolution The Canon specification is versioned with semver. When the Canon spec changes, the migration strategy depends on what changed. Minor version bumps (0.1.0 → 0.1.1) typically add optional fields to the attestation payload or clarify ambiguous verification behavior. No schema change is required. The canon_version column in attestations already stores the version string, so the system can query for attestations by version without a migration. Major phase changes (Phase A → Phase B) typically add new attestation kinds, new mandatory fields, or new structural requirements that the payload schema enforces at the Pydantic layer. These may or may not require a database migration. Phase B added ObservationAttestation (a new kind) and the Phase B chunk metadata tables — those required B0_chunks_fts.sql. Phase A's attestations table was not modified; the new kind slots into the existing kind CHECK constraint. Canon v0.2.0 introduced three new optional schema files. They extend the base schema without modifying any earlier file — consistent with the additive discipline. schema/B1_paradedb_fts.sql adds a BM25 full-text index powered by the pg_search extension (ParadeDB):
DO $paradedb_check$
BEGIN
IF EXISTS (SELECT 1 FROM pg_available_extensions WHERE name = 'pg_search') THEN
CREATE INDEX IF NOT EXISTS chunks_paradedb_bm25
ON chunks USING bm25 (content, ...);
END IF;
END $paradedb_check$;
The DO block guards extension availability: if pg_search is not installed, the migration silently no-ops. The index is activated at query time by setting MERIDIAN_USE_PARADEDB=1. Running B1 on a stock PostgreSQL instance without ParadeDB is safe — it leaves no artifact and does not fail. schema/B2_pgvectorscale.sql adds a StreamingDiskANN index for approximate
nearest-neighbor vector search:
DO $pgvs_check$
BEGIN
IF EXISTS (SELECT 1 FROM pg_available_extensions WHERE name = 'pgvectorscale') THEN
CREATE INDEX IF NOT EXISTS embeddings_diskann
ON embeddings USING diskann (vector vector_cosine_ops);
END IF;
END $pgvs_check$;
When pgvectorscale is available, B2 replaces the ivfflat index as the preferred ANN index. Same guard pattern: silent no-op if the extension is absent. Apply B1 and B2 after the base schema (the A series) and the B0_chunks_fts.sql migration that establishes the chunks and embeddings tables they extend. schema/E0_rekor.sql creates the table that persists Rekor transparency log entries when MERIDIAN_REKOR_ENABLED=1:
CREATE TABLE IF NOT EXISTS rekor_entries (
id bigserial PRIMARY KEY,
seal_id uuid REFERENCES seals(id),
entry_uuid text NOT NULL,
log_index bigint,
rekor_url text NOT NULL,
integrated_time bigint,
created_at timestamptz DEFAULT now()
);
Apply E0 before enabling Rekor publishing. The E0 migration must run after the attestation and seal tables it references. All three new files use IF NOT EXISTS and extension guards — they are safe to re-run. When a new Canon phase does require a structural change to the attestations table — say, a new mandatory column that the Phase A table does not have — write a new migration, not a modification to A0: 1. Write A1_attestations_phase_b_fields.sql. 2. Use ALTER TABLE attestations ADD COLUMN ... DEFAULT <safe_default> for the new column. Make it nullable or give it a default so existing rows survive. 3. Write the corresponding A1_attestations_phase_b_fields.down.sql. 4. If the Canon version validator at the application layer needs updating, update meridian/canon/schema.py in the same commit. Schema migration and Pydantic model live in the same PR; deploy them atomically. The inverse is equally important: when you bump a Canon version, check whether the existing attestations table can store a conformant attestation of that version without schema changes. If the answer is yes, no migration is needed. If the answer is no — a new mandatory field, a new kind — write a migration. The Canon version and the migration phase are not the same versioning system, but they must remain coherent. > ◆ Going Deeper — Schema migrations vs data migrations. > > A schema migration is a DDL change: CREATE TABLE, ALTER TABLE ADD COLUMN, > CREATE INDEX. It changes structure. A data migration is a DML change: > UPDATE documents SET processed = true WHERE processed IS NULL. It changes > content. > > Mixing them in one file creates a failure class: the DDL succeeds, the DML > fails at row 50,000 due to a constraint violation on an edge case, and the > transaction rolls back. The schema is now in its pre-migration state but the > application layer expected the post-migration state. Or worse: the database > does not wrap them in a transaction, the DDL commits, the DML fails, and the > schema is in a half-migrated state. The documents table has the new column > but 30% of its rows have the old null value that the new NOT NULL constraint > should reject. > > The discipline: schema migration file, applied and verified. Data migration > file, applied and verified in a separate step. CI tests both independently. > Down scripts are written for the schema migration (structural undo) and for > the data migration (value restoration) separately. ## 21.6 Reading the schema before writing Every migration file in the repository opens with a comment block that states what it adds and which earlier files it depends on. Before writing any migration, read the dependency chain. A0_attestations.sql depends on matters (from 10_core.sql), on acquisitions (from 20_provenance.sql), and on the audit() function (from 10_core.sql). A migration that re-defines audit() or renames the acquisitions table would silently break A0_attestations.sql. The four-step read-before-write discipline: 1. Read the existing schema in dependency order (numeric prefix order for the base schema; then phase migrations in alphabetic order). 2. Identify which tables and functions your new migration will reference. 3. Confirm those tables and functions exist in an already-applied file. 4. Write the migration, then write the down script, then test both. "Test both" means: apply the migration, verify the schema, run the down script, verify the schema is back to pre-migration state, apply the migration again. A migration that does not survive this cycle is not ready. ## 21.7 The running case: adding expert_reports Six months into the TPR proceeding, the attorney retains a forensic psychologist to evaluate the opposing agency's risk-assessment methodology. Her report will be a central exhibit. The existing schema has documents, sources, matters, and actors — but no first-class representation for
expert reports as a distinct artifact type.
The migration adds a table:
-- B1_expert_reports.sql
CREATE TABLE IF NOT EXISTS expert_reports (
id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
matter_id uuid NOT NULL REFERENCES matters(id),
expert_actor_id uuid NOT NULL REFERENCES actors(id),
filed_at timestamptz NOT NULL,
source_id uuid REFERENCES sources(id),
privilege_level integer NOT NULL DEFAULT 0
CHECK (privilege_level BETWEEN 0 AND 4),
created_at timestamptz NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS expert_reports_matter_idx
ON expert_reports(matter_id, filed_at);
The corresponding down script:
-- B1_expert_reports.down.sql
DROP INDEX IF EXISTS expert_reports_matter_idx;
DROP TABLE IF EXISTS expert_reports;
The down script is nine words and a semicolon. It is trivial because the
forward migration only adds structure; it does not transform existing data.
The IF EXISTS guards make both scripts idempotent. Notice the constraint on privilege_level: a CHECK (privilege_level BETWEEN 0 AND 4) enforces the PII tier enumeration at the database layer, not just at the application layer. If a bug in the ingestion pipeline attempts to insert a privilege level of 99, the database rejects it. Defense in depth: the Pydantic model enforces the range, the database enforces the range, and the audit trigger logs every insertion so any anomaly is visible after the fact. > ✻ Try This — Verify the round trip. > > Write the DDL for expert_reports exactly as shown above. Apply it to a > development copy of the schema. Then run the down script. Run \d > expert_reports in psql before and after the down script. Confirm the > table disappears. Apply the forward migration again. Confirm it reappears > with the same structure. > > Now add a single row: INSERT INTO expert_reports (matter_id, > expert_actor_id, filed_at, privilege_level) VALUES (<uuid>, <uuid>, now(), > 0);. Run the down script. What error do you get? The down script drops the > table but does not first delete the row — this is intentional. DROP TABLE > removes both structure and data. Confirm that the row is gone after the > down. This is the expected behavior and the reason down scripts for tables > with live data require explicit confirmation in production deployments. ## 21.8 The privilege_level column The privilege_level integer in expert_reports maps to the PII tier enumeration established in 85_legal.sql: - 0: public — no restriction - 1: low — internal case staff - 2: internal — attorneys and client only - 3: privileged — attorney-client privilege asserted - 4: work_product — protected from disclosure An expert report not yet filed publicly would be 2 or 3. A filed report would be 0 or 1. The column enforces that the value is always meaningful. The column's presence in this table, rather than only in the sources table, is deliberate: an expert report's privilege level may differ from the source document's privilege level. The source might be a raw scan (level 2). The filed exhibit might be a redacted version (level 0). Two rows in expert_reports, two different privilege levels, same underlying source — this model supports that distinction without requiring a privilege assertion join on every query. The tradeoff is redundancy: privilege is tracked in both the source and the expert report. The discipline is that both must be set on insertion, and both must agree unless there is an explicit reason they differ (redaction). The ingestion worker that creates expert_reports rows is responsible for this consistency. The database cannot enforce it across tables without a trigger. Whether to add that trigger is a design decision — document it in the migration header if you decide either way.
IF NOT EXISTS for CREATE TABLE/CREATE INDEX, CREATE OR REPLACE for functions, and DROP TRIGGER IF EXISTS before CREATE TRIGGER — making each migration file safe to apply twice without corrupting the schema or the data. - The correct application order is: numeric base files (00–99) once at bootstrap, then A series (Canon foundation), then B1/B2 (optional BM25 and vector index upgrades), then E0 (Rekor entries table) — each alphabetic phase file depends only on files that sort before it. - B1_paradedb_fts.sql is a silent no-op if pg_search is not installed; B2_pgvectorscale.sql is a silent no-op if pgvectorscale is not installed — both use DO block extension guards so applying them to a stock PostgreSQL instance is safe. - Adding a column to the attestations table after attestations have been sealed breaks chain_hash reproducibility: the chain hash is computed over JCS(attestation with seal excluded), so any schema change that alters the canonical JSON shape invalidates all prior chain hashes. - Migration idempotency is a Canon compliance requirement because the pipeline must be re-runnable after any failure — a migration that cannot be applied twice safely cannot be used in a system where retries are the primary recovery mechanism. schema/ in the order they should be applied. For the alphabetic phase files (A0, B0, B1, B2, C0, E0), explain how you would determine their ordering relative to each other when there is no numeric prefix. State which files silently no-op if their required extension is not installed. 2. The attestations table has two CHECK constraints that enforce hash format. Write the constraint for a new column acquisition_hash that must also be a SHA-256 hex string with the sha256: prefix. ### Core 3. Write the DDL for expert_reports as specified above. Write the down script. Apply both in order; verify the round trip. 4. A0_attestations.sql adds a column to acquisitions. What does the down script need to do to undo this, and why must it do that step before dropping the attestations table? 5. The kind CHECK constraint in attestations lists five allowed values. Suppose Phase C adds a sixth kind: 'comparative'. Write the migration that adds it. (Hint: ALTER TABLE attestations DROP CONSTRAINT ... ADD CONSTRAINT ... is the pattern, but this is a disruptive operation. Consider whether there is a less disruptive alternative.) ### Stretch 6. Design a migration that adds full-text search to expert_reports.subject (assume a subject text column was added in a prior migration). Use tsvector and a GIN index. Write both the forward migration and the down script. 7. The additive discipline says never remove columns. Design a deprecation protocol for a column that has been superseded: what columns would you add to signal deprecation, what constraints would you add to prevent new writes, and at what point (measured in migration files) would a future team be safe to drop the column? ## Build-your-own prompt For your capstone matter, identify one evidence category that your current schema does not represent as a first-class table. Write a migration to add it. The migration must: (1) use IF NOT EXISTS; (2) add at least one composite index that matches the primary access pattern; (3) include a paired down script; (4) not modify any existing file. If your migration references a column that does not yet exist on a dependency table, write the dependency migration first, then write the main migration. ## Further reading - Ambler & Sadalage, Refactoring Databases: Evolutionary Database Design (Addison-Wesley, 2006). The canonical text on additive migration discipline. - Kleppmann, Designing Data-Intensive Applications, Ch. 4 (Encoding and Evolution). The schema evolution problem in storage and message format terms. - PostgreSQL documentation, §13.3 (Explicit Locking). Required reading before writing any migration against a production database. - GitHub Engineering, "How GitHub's Partitioning Migration Enables GitLab's Monolith" (2022). The Vitess migration post-mortem with the three-rule discipline explained in the engineers' own words. - Zubulake v. UBS Warburg III, 220 F.R.D. 212 (S.D.N.Y. 2003). The controlling case on document preservation obligations. - schema/A0_attestations.sql and schema/A0_attestations.down.sql in this
repository.
Next: Chapter 22 — Local-First Chunking & Privilege. Where data physically lives before any cloud service touches it.