:orphan:

.. meta::
   :description: The Iron Maiden protocol: 10 formal tests for stress-testing axioms and theorems, with full explanation of each test, what it catches, and how to apply it.
   :keywords: Iron Maiden, formal tests, consistency, independence, necessity, modal soundness, mereology, game theory, computability, grounding, cross-model, HELL
   :author: Yah, Yas, everyone, LLoL as Laurence Loewe of Laodicea, ClaudeOp46Max, Anthropic, and Spirit of Boolean Truth
   :og:card:title: The Iron Maiden —<br>10 Formal Tests
   :og:card:description: Full specification of the 10-test adversarial protocol for stress-testing formal claims in matheology. Each test explained with what it catches and how to apply.

.. _forge-iron-maiden:

*********************************************************************
The Iron Maiden --- 10 Formal Tests
*********************************************************************

**Created:** 2026m03d27

The Iron Maiden is the adversarial testing protocol used in Phase 3
(Grow) of the Model Forge. It applies 10 formal tests to each claim
that has reached at least OO (OperatesOddly) in the StayC lifecycle.

**Warn before applying the Iron Maiden to raw MM-stage ideas.** MM claims need
feeding (Phase 2), not the extensive testing described here.
Gentle steering with the later tests in mind is still desirable, and
ideas already known to fail should not be fed.
There is a fine line between testing too much (needless discouragement) and
testing too little (feeding false hope). See :ref:`DD-8 <forge-design-decisions>`.

Each test produces an OKScale verdict: OK / KO / OKO / MIS, always
with documented reasoning.
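
A minimal sketch of how a verdict could be recorded so that the
reasoning can never be omitted. The names ``Verdict`` and
``VerdictRecord`` and the field layout are assumptions made for this
illustration; they are not part of the protocol itself.

.. code-block:: python

   from dataclasses import dataclass
   from enum import Enum


   class Verdict(Enum):
       """The four OKScale labels named in the text."""
       OK = "OK"
       KO = "KO"
       OKO = "OKO"
       MIS = "MIS"


   @dataclass(frozen=True)
   class VerdictRecord:
       """Hypothetical record type: one verdict for one test on one claim."""
       test: str        # test numeral, e.g. "I" for Consistency
       claim: str       # label of the claim under test
       verdict: Verdict
       reasoning: str   # must be non-empty: verdicts are always documented

       def __post_init__(self):
           # Reject blank reasoning so no verdict is stored undocumented.
           if not self.reasoning.strip():
               raise ValueError("a verdict needs documented reasoning")


   record = VerdictRecord(
       test="I",
       claim="hypothetical-claim",
       verdict=Verdict.OKO,
       reasoning="Placeholder reasoning for the sketch.",
   )
   print(record.verdict.value)  # OKO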


Test I --- Consistency
========================

**Question:** Does the new claim contradict anything in the existing
system?

**Procedure:**

1. Check the new claim against each existing axiom ax1--ax25. Can both
   be true simultaneously? If you find a pair that cannot, you have a
   minimal inconsistent subset.

2. Check against each theorem th1--th11. Theorems are derived from
   axioms, so a contradiction with a theorem implies a contradiction
   with the axioms --- but checking theorems directly is faster because
   they are more specific.

3. Check against every OTHER new claim in the model being developed.
   Internal consistency within the new model matters as much as
   consistency with the existing system.

4. If no contradiction is found: attempt to construct a model (in the
   logical sense, that is, a satisfying interpretation) in which all
   axioms, old and new, are simultaneously true. If you can exhibit
   such a model, consistency is established (for the properties
   checked).

**What it catches:** Outright contradictions, hidden inconsistencies
where two claims that seem compatible at a glance actually rule each
other out under certain valuations, and "stealth contradictions" where
the inconsistency only emerges through a chain of 3+ axioms.

**Common pitfall:** Checking only pairwise consistency. Three claims
can be pairwise consistent but jointly inconsistent (the triangle
problem). Always check the full set.
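
The triangle problem can be reproduced with a brute-force truth-table
search. This is a minimal propositional sketch: the three claims
``c1``--``c3`` and the atoms ``p`` and ``q`` are invented for the
illustration and do not correspond to any axiom in the system.

.. code-block:: python

   from itertools import combinations, product

   ATOMS = ("p", "q")

   # Hypothetical claims, each a predicate over a valuation (atom -> bool).
   CLAIMS = {
       "c1": lambda v: v["p"] or v["q"],  # p or q
       "c2": lambda v: not v["p"],        # not p
       "c3": lambda v: not v["q"],        # not q
   }


   def satisfiable(names):
       """Return True if some valuation makes every named claim true."""
       for values in product([False, True], repeat=len(ATOMS)):
           valuation = dict(zip(ATOMS, values))
           if all(CLAIMS[name](valuation) for name in names):
               return True
       return False


   pairwise_ok = all(satisfiable(pair) for pair in combinations(CLAIMS, 2))
   jointly_ok = satisfiable(list(CLAIMS))
   print(pairwise_ok, jointly_ok)  # True False

Every pair has a satisfying valuation, yet no valuation satisfies all
three claims at once, which is why a pairwise check alone is not enough.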


Test II --- Independence
==========================

**Question:** Is the new claim already derivable from the existing
system?

**Procedure:**

1. Attempt to derive the claim from ax1--ax25 using the logic systems
   already in use (S5 modal logic, CEM, first-order predicate
   calculus).

2. If derivation succeeds: the claim is a THEOREM, not an axiom. This
   is not a failure --- it means the existing system is stronger than
   realized. Promote the claim to theorem status with the proof.

3. If derivation is blocked: identify precisely where it is blocked
   (which step requires an assumption not available in ax1--ax25).
   That blocking point is what the new axiom adds.

**What it catches:** Redundant axioms that add nothing to the system's
deductive power. Redundancy is not fatal (Euclid's system had
redundancies), but it weakens the system's economy and can mask
hidden dependencies.

**Why it matters:** If a claim is derivable, adding it as an axiom
creates a false impression that the system *needs* it. Future
developers may incorrectly believe that removing it would weaken the
system, creating artificial rigidity.
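
For a propositional toy system, the derivation attempt can be
approximated by searching for a countermodel: a valuation that
satisfies the axioms but falsifies the claim. If none exists, the
claim follows from the axioms; if one exists, it pinpoints what the
axioms leave open. The axioms and atoms below are hypothetical
stand-ins, and the sketch does not cover S5 or first-order reasoning.

.. code-block:: python

   from itertools import product

   ATOMS = ("p", "q", "r")

   # Hypothetical stand-ins for existing axioms.
   AXIOMS = [
       lambda v: (not v["p"]) or v["q"],  # p implies q
       lambda v: v["p"],                  # p
   ]


   def countermodel(axioms, claim):
       """Return a valuation satisfying the axioms but not the claim, or None."""
       for values in product([False, True], repeat=len(ATOMS)):
           valuation = dict(zip(ATOMS, values))
           if all(ax(valuation) for ax in axioms) and not claim(valuation):
               return valuation
       return None


   # "q" has no countermodel, so it is a theorem of the toy axioms.
   derivable = countermodel(AXIOMS, lambda v: v["q"]) is None
   # "r" has a countermodel, so it is not derivable from them.
   witness = countermodel(AXIOMS, lambda v: v["r"])
   print(derivable)  # True
   print(witness)    # {'p': True, 'q': True, 'r': False}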


Test III --- Necessity
========================

**Question:** What does the new claim ADD that the existing system
cannot express?

**Procedure:**

1. Identify the set of statements that become provable with the new
   claim added but are unprovable without it.

2. If this set is empty: the claim is decorative. It may be true, but
   the system does not need it. Challenge the user: why include it?

3. If this set is non-empty: characterize it. What *kind* of new
   reasoning does the claim unlock? Does it open a genuinely new
   domain of discourse, or does it merely add a convenient shorthand
   for something already expressible (but verbose)?

**What it catches:** Decorative axioms that look important but do no
formal work. Also catches axioms that are doing work the user didn't
intend --- the "surprise expressiveness" problem, where adding an
axiom inadvertently makes the system strong enough to prove things
the user would rather leave open.
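
The comparison in step 1 can be sketched for a propositional toy
system by checking a finite list of candidate statements with and
without the new claim. The axiom, the candidates, and the helper names
``entails`` and ``gained`` are assumptions for the illustration. The
real set ranges over all statements, so an empty result here only
shows that nothing on the candidate list is gained.

.. code-block:: python

   from itertools import product

   ATOMS = ("p", "q", "r")


   def entails(premises, statement):
       """True if every valuation satisfying all premises satisfies statement."""
       for values in product([False, True], repeat=len(ATOMS)):
           valuation = dict(zip(ATOMS, values))
           if all(pr(valuation) for pr in premises) and not statement(valuation):
               return False
       return True


   def gained(axioms, new_claim, candidates):
       """Names of candidates provable with new_claim but not without it."""
       return [
           name for name, stmt in candidates.items()
           if entails(axioms + [new_claim], stmt) and not entails(axioms, stmt)
       ]


   AXIOMS = [lambda v: v["p"]]  # hypothetical existing axiom: p
   CANDIDATES = {
       "p": lambda v: v["p"],
       "q": lambda v: v["q"],
       "p and q": lambda v: v["p"] and v["q"],
       "r": lambda v: v["r"],
   }

   working = gained(AXIOMS, lambda v: (not v["p"]) or v["q"], CANDIDATES)
   decorative = gained(AXIOMS, lambda v: v["p"] or v["q"], CANDIDATES)
   print(working)     # ['q', 'p and q']
   print(decorative)  # []

The first claim (p implies q) unlocks new conclusions; the second
(p or q) already follows from the axiom and unlocks none of the
candidates.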


Test IV --- Modal Soundness (S5)
==================================

**Question:** Does the claim hold across all accessible possible
worlds in the S5 framework?

**Procedure:**

1. Is the claim intended as necessarily true (|box|) or contingently
   true? If necessary: it must hold in every accessible world. If
   contingent: it holds in at least one accessible world but not in
   all of them.

2. Check that necessity and possibility operators (|box|, |diamond|)
   are used correctly. Common errors:

   - Claiming |box| P when P is only contingently true.
   - Using |diamond| P when the claim requires |box| P.
   - Accidentally collapsing the necessary/contingent distinction
     by making a contingent truth follow necessarily from the axioms.

3. Check the accessibility relation. S5 uses an equivalence relation
   (reflexive, symmetric, transitive), meaning every possible world
   can "see" every other world in its equivalence class (all worlds,
   when accessibility is universal). Does the claim rely on a weaker
   accessibility relation (which would require a different modal
   system)?

.. |box| unicode:: U+25A1
.. |diamond| unicode:: U+25C7

**What it catches:** Claims that are true in the actual world but
not in all possible worlds (contingent truths masquerading as necessary
truths), and the reverse (unnecessary restrictions on possible worlds).
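
A small Kripke-style model makes the distinction concrete. In this
sketch the three worlds, the valuation, and the function names are all
invented for the illustration; accessibility is taken to be universal,
which is one way of satisfying the S5 equivalence-relation requirement.

.. code-block:: python

   WORLDS = ("w0", "w1", "w2")

   # Hypothetical valuation: the set of worlds where each atom is true.
   TRUE_AT = {
       "p": {"w0", "w1", "w2"},  # true everywhere
       "q": {"w0"},              # true only at w0
   }

   # Universal accessibility: every world sees every world.
   ACCESS = {(a, b) for a in WORLDS for b in WORLDS}


   def is_equivalence(rel, worlds):
       """Check that rel is reflexive, symmetric and transitive on worlds."""
       reflexive = all((w, w) in rel for w in worlds)
       symmetric = all((b, a) in rel for (a, b) in rel)
       transitive = all(
           (a, d) in rel for (a, b) in rel for (c, d) in rel if b == c
       )
       return reflexive and symmetric and transitive


   def box(atom, world):
       """Necessity: atom holds at every world accessible from world."""
       return all(b in TRUE_AT[atom] for (a, b) in ACCESS if a == world)


   def diamond(atom, world):
       """Possibility: atom holds at some world accessible from world."""
       return any(b in TRUE_AT[atom] for (a, b) in ACCESS if a == world)


   print(is_equivalence(ACCESS, WORLDS))  # True
   print(box("p", "w0"))                  # True
   print("w0" in TRUE_AT["q"], box("q", "w0"), diamond("q", "w1"))
   # True False True

Here ``q`` is true at ``w0`` but not necessary there: exactly the kind
of contingent truth that must not be asserted under |box|.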


Test V --- Mereological Coherence (CEM)
=========================================

**Question:** Are part-whole relationships respected?

**Procedure:**

1. Does the claim introduce entities whose mereological status is
   undefined? Every entity in the system should have a defined
   relationship to the mereological whole (Reality, in this system).

2. Is there circularity in containment? (A is part of B, B is part
   of A --- only legitimate if A = B.)

3. Does the claim respect the existing mereological structure? In
   this system, the key relationships are:

   - W |leq| G (World is part of God --- panentheism axiom ax1)
   - W < G (World is a *proper* part of God --- axiom ax2)
   - Parthood is transitive, reflexive, antisymmetric.

4. Does the new claim create mereological "orphans" --- entities
   that exist in the system but have no defined part-whole
   relationship to anything else?

.. |leq| unicode:: U+2264

**What it catches:** Containment contradictions (X is inside Y but
Y is also inside X), undefined entities floating outside the
mereological hierarchy, and violations of the axioms' core claim
about the God-World part-whole relationship.
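
The circularity and orphan checks can be sketched over a set of
declared (part, whole) pairs. The entity names ``Whole``, ``A``, ``B``
and ``X`` and the helper functions are hypothetical; the sketch only
derives transitive parthood and does not implement the full CEM axioms.

.. code-block:: python

   def closure(pairs):
       """Transitive closure of a set of (part, whole) pairs."""
       closed = set(pairs)
       changed = True
       while changed:
           changed = False
           for (a, b) in list(closed):
               for (c, d) in list(closed):
                   if b == c and (a, d) not in closed:
                       closed.add((a, d))
                       changed = True
       return closed


   def containment_cycles(pairs):
       """Pairs of distinct entities that end up as parts of each other."""
       closed = closure(pairs)
       return sorted({tuple(sorted((a, b))) for (a, b) in closed
                      if a != b and (b, a) in closed})


   def orphans(entities, pairs, whole):
       """Entities with no derived parthood link to the whole."""
       closed = closure(pairs)
       return sorted(e for e in entities
                     if e != whole and (e, whole) not in closed)


   ENTITIES = {"Whole", "A", "B", "X"}          # hypothetical entities
   DECLARED = {("A", "Whole"), ("B", "A")}      # A in Whole, B in A

   print(orphans(ENTITIES, DECLARED, "Whole"))          # ['X']
   print(containment_cycles(DECLARED))                  # []
   print(containment_cycles(DECLARED | {("A", "B")}))   # [('A', 'B')]

``X`` is reported as an orphan because nothing links it to the whole,
and adding "A is part of B" produces a cycle that antisymmetry only
permits if A = B.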


Test VI --- Game-Theoretic Stability
======================================

**Question:** If the claim describes agents, incentives, or
cooperation mechanisms, is the described equilibrium actually stable?

**Procedure:**

1. Identify the game: who are the players, what are their strategy
   sets, what are the payoffs?

2. Is the described outcome a Nash equilibrium? (No player can
   improve by unilaterally changing strategy.)

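3. Is it stable under iterated play? Repetition changes the set of
   equilibria: with an infinite horizon and sufficiently patient
   players, cooperation that is not a one-shot Nash equilibrium can
   be sustained (Folk Theorem), while with a known finite horizon it
   can unravel by backward induction. Check both finite and infinite
   horizon.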

4. Can a rational defector exploit the mechanism? If the claim
   describes a cooperation mechanism (e.g., Jubilee-System cycles),
   check whether a player who pretends to cooperate but defects at
   the optimal moment can gain at others' expense.

5. Does the mechanism satisfy incentive compatibility? (Is truthful
   behavior optimal, or can players gain by misrepresenting their
   preferences?)

**What it catches:** Utopian mechanisms that assume cooperation
without providing incentives for it. Social choice impossibilities
(Arrow, Gibbard-Satterthwaite). Mechanisms vulnerable to strategic
manipulation or free-riding.
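
The Nash check in step 2 and the infinite-horizon part of step 3 can
be sketched on a two-player game. The payoffs below are standard
textbook prisoner's-dilemma numbers chosen for the illustration, not
values taken from any model in the system, and the grim-trigger
comparison is one common way (an assumption here) to test whether
cooperation survives repetition.

.. code-block:: python

   from itertools import product

   STRATEGIES = ("C", "D")  # cooperate, defect

   # Hypothetical payoffs: (row player, column player).
   PAYOFFS = {
       ("C", "C"): (3, 3),
       ("C", "D"): (0, 5),
       ("D", "C"): (5, 0),
       ("D", "D"): (1, 1),
   }


   def is_nash(profile):
       """True if neither player gains by unilaterally switching strategy."""
       row, col = profile
       row_payoff, col_payoff = PAYOFFS[profile]
       row_ok = all(PAYOFFS[(alt, col)][0] <= row_payoff for alt in STRATEGIES)
       col_ok = all(PAYOFFS[(row, alt)][1] <= col_payoff for alt in STRATEGIES)
       return row_ok and col_ok


   def grim_trigger_sustains(delta):
       """Infinite horizon: is cooperating forever at least as good as
       defecting once and being punished forever, at discount delta?"""
       reward = PAYOFFS[("C", "C")][0]
       temptation = PAYOFFS[("D", "C")][0]
       punishment = PAYOFFS[("D", "D")][0]
       cooperate_forever = reward / (1 - delta)
       defect_once = temptation + delta * punishment / (1 - delta)
       return cooperate_forever >= defect_once


   equilibria = [p for p in product(STRATEGIES, repeat=2) if is_nash(p)]
   print(equilibria)                  # [('D', 'D')]
   print(grim_trigger_sustains(0.9))  # True
   print(grim_trigger_sustains(0.2))  # False

Mutual cooperation is not an equilibrium of the one-shot game, and in
the repeated game it holds only for sufficiently patient players. A
mechanism that simply assumes cooperation fails this test.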


Test VII --- Computability and Decidability
=============================================

**Question:** Can the claim's truth be checked by a finite procedure?

**Procedure:**

1. Is the claim decidable? Can an algorithm determine its truth in
   finite time for any given input?

2. If undecidable: is the undecidability acknowledged and bounded?
   Many interesting claims are undecidable in general but decidable
   for specific cases. The claim should specify which cases are
   intended.

3. Does the claim accidentally require solving a halting-problem
   equivalent? This is more common than expected: claims about
   "all possible behaviors of a system" or "eventual convergence
   to a state" can hide undecidable quantification.

4. If the claim involves infinite structures (all possible worlds,
   all future time steps, all members of humanity): is the
   quantification well-founded? Does it avoid the paradoxes of
   unrestricted quantification?

**What it catches:** Claims that sound meaningful but cannot be
checked even in principle. Claims that require infinite verification.
Hidden halting-problem equivalents. Poorly bounded universal
quantification.
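
An "eventual convergence" claim can be made checkable by bounding it.
The sketch below uses the Collatz iteration purely as a stand-in
example (an assumption for illustration, unrelated to the system's
axioms): whether it reaches 1 for every starting value is an open
problem, but the bounded question has a finite procedure.

.. code-block:: python

   def reaches_one(n, max_steps):
       """Bounded check: True if the Collatz iteration from n hits 1
       within max_steps steps, None (undetermined) if the budget runs out.

       The function never returns False: running out of steps says
       nothing about what happens later.
       """
       for _ in range(max_steps):
           if n == 1:
               return True
           n = n // 2 if n % 2 == 0 else 3 * n + 1
       return True if n == 1 else None


   print(reaches_one(6, max_steps=8))  # True
   print(reaches_one(6, max_steps=5))  # None

The ``None`` result is the honest output of a bounded procedure. A
claim phrased as "always eventually converges" should state the bound
or the specific cases for which it is meant to be checked.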


Test VIII --- Real-World Grounding
====================================

**Question:** Can the claim be connected to observable phenomena?

**Procedure:**

1. Does the claim make predictions, even in principle? A claim that
   has no observable consequences is unfalsifiable --- not necessarily
   wrong, but worth flagging.

2. Would a working scientist (physicist, economist, biologist,
   sociologist) find the claim meaningful? Or would they say "this
   is not even wrong" (Pauli)?

3. Does the claim survive the agnostic-scientist critique? (See the
   HELL adversarial landscape for this position: someone who accepts
   empirical reasoning but rejects the axiomatic starting assumptions
   about purpose and consciousness.)

4. Are there historical or contemporary examples that illustrate the
   claim's content? Formal claims grounded in real examples are
   stronger than purely abstract ones.

**What it catches:** Vacuous claims that are technically consistent
but say nothing about the world. Claims that are meaningful only
within the formal system and have no purchase outside it. Also catches
the opposite: claims that are too specific to particular real-world
conditions and lack the generality needed for an axiom.


Test IX --- Cross-Model Coherence
===================================

**Question:** How does the new claim relate to analogous structures
in the existing models (PET, JUB)?

**Procedure:**

1. Is there an alignment echo? (The same underlying concept appears
   in the new model as in PET or JUB, formalized differently.)
   If so: is the echo intentional? Is it structural (functorial) or
   accidental?

2. Is there a genuine divergence? (The new model makes a claim that
   contradicts or is absent from PET/JUB.) If so: is the divergence
   justified? Does the new model explicitly acknowledge where it
   parts ways with existing models?

3. Does the new claim interact with the existing cross-model
   infrastructure? (5D link naming, BEST Names, PoR field registry.)
   Can the claim be labeled and cross-referenced within the existing
   architecture, or does it require architectural extension?

4. If the new model introduces entities or relationships not present
   in PET/JUB: do they enrich the system or fragment it? A new model
   that shares no structure with existing models is a silo, not an
   extension.

**What it catches:** Unintentional contradictions between models.
Missed alignment echoes (the same insight rediscovered under a
different name). Architectural incompatibilities that would prevent
the new model from being compiled by SISYF.


Test X --- Known-Attack Resilience
====================================

**Question:** Does the new claim fall to any of the existing 33 con
objections, or does it open a new attack surface?

**Procedure:**

1. Scan all 33 existing con findings. For each: does the attack
   apply to the new claim? Many con findings are specific to JUB
   mechanisms, but some (e.g., the agnostic-scientist position, the
   Arrow impossibility, the knowledge problem) are general enough
   to apply to any model in the system.

2. If an existing con applies: check whether the corresponding pro
   defense also applies. If yes: the new claim inherits both the
   attack and the defense. If the defense does NOT transfer: the
   new claim is vulnerable where JUB is not.

3. Does the new claim open a NEW attack surface not covered by any
   existing con? If so: draft the con entry (what the attack would
   look like) and assess whether a pro defense exists. This becomes
   input for the Reap phase's proposed HELL entries.

**What it catches:** Claims that unknowingly repeat mistakes already
addressed in the HELL landscape. Claims that are vulnerable to known
attacks without the known defenses. And most valuably: genuinely new
attack surfaces that the existing system has not yet encountered.

**Note:** This test is most powerful in the 1M prompt where all 66
HELL findings are loaded. In the 200K prompt (where only the HELL
index is loaded), this test operates at reduced power and should be
flagged as OKO for any claim where the specific con/pro content would
be needed to give a definitive verdict.
