Voyage · EventFarm Parity Audit Trail
Methodology only earns its keep if its results are auditable. Every factory cycle leaves behind per-axis verdicts, evidence artifacts, and a row in the ledger below. When a bug escapes the methodology and a human catches it, the case is logged with a healing plan that names the methodology change preventing the same class of escape next time. The trail is persistent — past entries stay; charts visualize whether the methodology is getting better.
Last updated: 2026-04-30 · Methodology axes: 4 (Functional, Visual, Semantic, Honesty) · Cycles audited: 7 (incl. self-audit) · Open findings: 0 · Addressed findings: 2 · Retrospective-resolved findings: 3 · Open escapes: 1
Where the methodology stands as of the last factory cycle.
Every cycle, every axis verdict, every artifact pointer. Newest first.
| Cycle | Date | Surfaces / scope | Functional | Visual | Semantic | Honesty | Verdict |
|---|---|---|---|---|---|---|---|
| Visual-quality prompt tightening: 1a0b78d · 4983338 · deploy/self-audit | 2026-04-30 | EF-074, EF-073 (×2 viewports), EF-077: Escape #2 calibration + pilot re-eval | N/A | 2 demoted | N/A | PASS (self) | Methodology hardening |
| Honesty findings clarity + open-issue close: 018a808 | 2026-04-30 | audit-trail rendering + close real open data-server-total tautology | N/A | N/A | ADDRESSED | PASS | Remediation |
| Honesty axis pilot: 1e60042 · aa71c38 · f0e3076 · 4bbc4c9 | 2026-04-30 | 5 cycles audited retrospectively + self-audit | N/A | N/A | N/A | PASS (self) | Harness landed |
| Semantic-invariants pilot: ffe6c81 · ed9942d · 1a33202 · 79b2256 | 2026-04-30 | EF-074, EF-073, EF-077: 3 surfaces × 8 invariants | PASS | PASS | PASS | ADDRESSED · 2 serious | EF-074 → Shippable |
| Visualizer polish: 56fc597 · 01f5b30 · bf035f8 · 5b57b45 | 2026-04-30 | EF-074, EF-073 (×2 viewports), EF-077: polish iteration ×2 | PASS | PASS | N/A | PASS (retro) | Visual closed; semantic pending |
| Visualizer chrome-fix: 27507ae · d1b85f8 · 57d600e · 46e588e | 2026-04-30 | SurfaceApp: suppress admin chrome for surface=visualizer | PASS | demoted | N/A | PASS (retro) | honest demote: polish < 4 |
| Visual-quality pilot: 2ec2c43 · cf3a226 · 4ac6ece | 2026-04-30 | 3 EFx surfaces: first vision-evaluator run | PASS | all 4 fail | N/A | RETRO · 1 serious resolved | Harness pilot |
| EF-074 cycle 3 (client fixes): b3364f9 · 7d054ad · ee5123f · df65cab | 2026-04-29 | EF-074: client out-of-order + foreign-event filter | PASS | no axis yet | no axis yet | RETRO · 2 serious resolved | demoted by user; founding escape #1 healed |
| EF-077 access control: 637c053 · ec94b0c · f3ca200 · 976b89f · 1bd7bb7 | 2026-04-29 | /access-station, door scan, audit row, capacity | PASS | no axis yet | no axis yet | not audited | Partial: NFC + organizer admin deferred |
| EF-073 EFx Poll: d79a1b2 · 75f0559 · adff28b · 3a89f34 | 2026-04-29 | /poll-attendee + /poll-station: net-new | PASS | no axis yet | no axis yet | not audited | Partial: organizer admin deferred |
| W1 mailing edge: fb6f884 · 976279c · 5323e4b | 2026-04-29 | 99/100 → 100/100 stuck-sending fix | PASS | no UI | no axis yet | not audited | Deliverable 3 closed |
| EF-074 tighten: ea76be3 · abcc49c · 81a98ce · 8fb3542 · 593905b | 2026-04-29 | delta-injection + load wrapper, surfaced 3 client bugs | PASS | no axis yet | no axis yet | not audited | Partial: 3 honest demotions |
Methodology coverage and outcomes over time. Hand-rendered for now; auto-generation comes when the dataset warrants it.
Axes coverage over time
How many of the 4 axes were available + applied per cycle.
Honesty-audit finding states (across 6 audited cycles)
Findings by current state. Open = action required. Addressed = fix landed in a later commit. Retro-resolved = the finding describes a state of the world already corrected by subsequent work.
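The three finding states can be modeled as a small enumeration. The sketch below is illustrative only: `FindingState`, `tally`, and `action_required` are hypothetical names, not part of the audit harness. The counts come from the header line of this page (0 open, 2 addressed, 3 retrospective-resolved).

```python
from collections import Counter
from enum import Enum


class FindingState(Enum):
    """Hypothetical model of the three finding states described above."""
    OPEN = "open"                      # action required
    ADDRESSED = "addressed"            # fix landed in a later commit
    RETRO_RESOLVED = "retro-resolved"  # already corrected by subsequent work


def tally(states):
    """Count findings per state, reporting zero for states with no findings."""
    counts = Counter(states)
    return {state: counts.get(state, 0) for state in FindingState}


def action_required(states):
    """Only OPEN findings demand action; the other two states are closed."""
    return any(state is FindingState.OPEN for state in states)


# Counts taken from the page header: 0 open, 2 addressed, 3 retro-resolved.
header_states = [FindingState.ADDRESSED] * 2 + [FindingState.RETRO_RESOLVED] * 3
summary = tally(header_states)
```

Keeping `OPEN` as the only state that triggers action mirrors the definitions above: an addressed or retro-resolved finding stays in the record but no longer blocks anything.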
Bugs caught by humans that the methodology should have caught. Each escape includes a healing plan: what was missed, root cause, and the methodology change that prevents the same class of escape next time.
Escape #2 — OPEN · logged 2026-04-30
The visual-quality prompt judged aesthetic surface quality without checking whether the station rendered a coherent runtime state for a real door operator. The prompt missed ten visible failures: kitchen-sink ALLOW / DENY / CAPACITY REACHED / LATE POLICY pills; Checkpoint ef077-door-station debug-string leakage; ALLOW shown while capacity is reached; an active, equal-weight NFC affordance while NFC proof is deferred; a headline dominating the actual scan action; audit rows without column headers; 4 rows of developer-database terminology; yellow/cream capacity-reached color semantics; massive audit-panel dead space; and an empty middle column beneath the checkpoint label.
The prompt's sub-axis definitions were too loose. In particular, would_a_designer_ship was being returned true based on "looks intentional and fairly production-ready" framing without testing whether the page would actually function for its named persona.
The visual-quality predicate now requires functional coherence in addition to aesthetic surface quality: a single resolved state instead of kitchen-sink rendering, surface-language hygiene instead of debug identifiers, affordance/capability alignment instead of active no-op controls, and label completeness for table-like data. The methodology tightened in response to the human-caught escape, per the audit-trail commitment.
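A minimal sketch of the tightened predicate, assuming each requirement is available as a boolean check result. Only `would_a_designer_ship` is a name from the audit trail; the `VisualQualityChecks` class and its field names are hypothetical labels for the four requirements listed above, and the real evaluator is a vision prompt rather than a Python function.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualQualityChecks:
    """Hypothetical container for one surface's evaluation results."""
    aesthetic_surface_quality: bool        # the only thing the old prompt tested
    single_resolved_state: bool            # no kitchen-sink rendering of every state
    surface_language_hygiene: bool         # no debug identifiers on screen
    affordance_capability_alignment: bool  # no active controls that do nothing
    label_completeness: bool               # table-like data has column headers


def would_a_designer_ship(checks: VisualQualityChecks) -> bool:
    """True only when aesthetics AND every functional-coherence check pass."""
    return all((
        checks.aesthetic_surface_quality,
        checks.single_resolved_state,
        checks.surface_language_hygiene,
        checks.affordance_capability_alignment,
        checks.label_completeness,
    ))


# The Escape #2 pattern: the page looks intentional but fails every coherence check.
escape_2_station = VisualQualityChecks(
    aesthetic_surface_quality=True,
    single_resolved_state=False,
    surface_language_hygiene=False,
    affordance_capability_alignment=False,
    label_completeness=False,
)
verdict = would_a_designer_ship(escape_2_station)  # False under the tightened predicate
```

Under the pre-tightening behavior described in this escape, the same surface would have been returned true on aesthetics alone; requiring the conjunction removes that false positive.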
Escape #1 — Founding case · RESOLVED 2026-04-30
The cycle's strict-predicate harness covered tagged elements via selector-visible, selector-absent, color-contrast on a single marker, and similar. The page passed every probe. But the operator-visible state was broken in two distinct ways:
The methodology at the time had one axis: functional. The functional axis evaluated the end-to-end correctness of tagged elements, but it did not evaluate aggregate page quality (visual axis territory) or arithmetic invariants on rendered values (semantic axis territory). The cycle's "Shippable" verdict was broader than the underlying probe coverage warranted, a classic instance of the over-claim that the messaging audit (honesty axis) is now built to catch. The honesty audit, applied retroactively, flagged this cycle FAIL with 2 serious findings, validating the calibration.
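The gap between probe coverage and a "Shippable" claim can be checked mechanically. The sketch below is an illustration under assumptions, not the harness itself: `axes_missing_evidence` and `may_claim_shippable` are hypothetical names, the cell strings are copied from the ledger, and treating "N/A", "no UI", "no axis yet", and "not audited" as missing evidence (and any "fail" or "demoted" verdict as blocking) is an assumed reading of those cells.

```python
AXES = ("functional", "visual", "semantic", "honesty")

# Ledger cell values assumed to mean "no evidence on this axis".
NO_EVIDENCE = {"N/A", "no UI", "no axis yet", "not audited"}

# Substrings assumed to mark a verdict that blocks a Shippable claim.
BLOCKING_MARKERS = ("fail", "demoted")


def axes_missing_evidence(verdicts):
    """Return the axes, in ledger order, that carry no evidence for a cycle."""
    return [
        axis for axis in AXES
        if verdicts.get(axis, "not audited") in NO_EVIDENCE
    ]


def may_claim_shippable(verdicts):
    """Allow the claim only with evidence on all axes and no blocking verdict."""
    if axes_missing_evidence(verdicts):
        return False
    return not any(
        marker in verdicts[axis].lower()
        for axis in AXES
        for marker in BLOCKING_MARKERS
    )


# Founding case as it stood when it claimed Shippable: one axis existed.
founding_case = {"functional": "PASS", "visual": "no axis yet", "semantic": "no axis yet"}

# Semantic-invariants pilot row from the ledger.
semantic_pilot = {
    "functional": "PASS",
    "visual": "PASS",
    "semantic": "PASS",
    "honesty": "ADDRESSED · 2 serious",
}

missing = axes_missing_evidence(founding_case)
```

Applied to the founding case, the check names three axes without evidence, which is the over-claim in concrete form; applied to the semantic-invariants pilot row, it permits the claim the ledger records.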
"Shippable" is no longer a claim earned by passing the probes the cycle authored. It is a claim that requires evidence on all four axes. The honesty axis specifically watches for the pattern that produced this escape: cycles that conveniently choose probe sets that exclude the dimensions where there's residual risk. The retroactive honesty audit on this very cycle validates that the harness has the teeth required.
What every future factory cycle obligates the methodology to do.
Per-cycle audit-trail update protocol:
/findings/ at stable paths so the ledger's links don't rot.
Escape protocol:
queued / in flight / complete until the methodology change actually lands. No steps are silently retired.
Persistence: the trail does not get rewritten. Past entries stay. New evidence layers (e.g., honesty audits applied retroactively) are added as new entries that reference the original cycle, never by editing the original.
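The persistence rule amounts to an append-only log in which later evidence points back at the entry it qualifies. The following is a minimal sketch of that idea, assuming an in-memory store; `Entry`, `AuditTrail`, and the entry ids are hypothetical and do not describe how the ledger is actually stored.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entry:
    """One immutable ledger entry (hypothetical shape)."""
    entry_id: str
    summary: str
    references: Optional[str] = None  # id of the original cycle, if layered evidence


class AuditTrail:
    """Append-only trail: entries can be added, never edited or removed."""

    def __init__(self):
        self._entries = []

    def append(self, entry: Entry) -> None:
        known_ids = {existing.entry_id for existing in self._entries}
        if entry.entry_id in known_ids:
            raise ValueError(f"entry {entry.entry_id!r} already exists; add a new entry instead")
        if entry.references is not None and entry.references not in known_ids:
            raise ValueError(f"referenced entry {entry.references!r} is not in the trail")
        self._entries.append(entry)

    def entries(self):
        """Return a read-only snapshot, oldest first."""
        return tuple(self._entries)


trail = AuditTrail()
trail.append(Entry("ef-074-cycle-3", "EF-074 cycle 3 (client fixes)"))
# The retroactive honesty audit is layered on as its own entry.
trail.append(Entry(
    "honesty-retro-ef-074-cycle-3",
    "Retroactive honesty audit: 2 serious findings",
    references="ef-074-cycle-3",
))
```

Freezing the dataclass and exposing only `append` and a tuple snapshot means the original cycle entry cannot be altered through this interface; the retroactive audit exists solely as a second entry that references the first.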