Paper b12: Adversarial Review Prompts (2026m04d05)

Run each prompt in a separate session. Each produces a peer-review report that identifies structural weaknesses before refinement. Run these BEFORE the refinement prompts.

b12-math: Formal Logic Review

/effort max

You are a formal logician reviewing a paper that claims to derive
self-correction principles from 21 axioms. Your job is to find
every logical weakness, unstated assumption, and questionable step.

Read: source/matheology/hell/mm/b/12/mmv2/b12-math_2026m04d05.rst
Also read: .claude/CLAUDE.md (especially Language Rules)

For EACH axiom (all 21) and EACH theorem (all 9), answer:

1. Is the formal statement well-formed? Any ambiguity in quantifiers,
   scope, or variable binding?
2. Does the derivation sketch actually follow? Identify any gaps
   where "by m6.ax4" is invoked without showing the intermediate steps.
3. Are there hidden assumptions not stated as axioms? (e.g., the
   paper uses set-theoretic partitions but never states ZF axioms)
4. Is there a countermodel? Can you construct a model that satisfies
   all axioms but violates a claimed theorem?
5. Independence: is this axiom derivable from others? (Two were
   already reclassified; are there more?)
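
As a toy illustration of question 4, a countermodel search over a small
finite domain can be sketched as follows. Nothing here comes from the
paper: the two axioms (reflexivity, symmetry), the claimed theorem
(transitivity), and all function names are hypothetical stand-ins chosen
for brevity.

```python
from itertools import product


def is_reflexive(rel, dom):
    """Every element of the domain is related to itself."""
    return all((x, x) in rel for x in dom)


def is_symmetric(rel):
    """Whenever (x, y) holds, (y, x) holds too."""
    return all((y, x) in rel for (x, y) in rel)


def is_transitive(rel):
    """Whenever (x, y) and (y, w) hold, (x, w) holds too."""
    return all((x, w) in rel
               for (x, y) in rel
               for (z, w) in rel
               if y == z)


def find_countermodel(dom, axioms, theorem):
    """Brute-force every binary relation on dom; return the first one
    that satisfies all axioms but violates the claimed theorem."""
    pairs = [(x, y) for x in dom for y in dom]
    for bits in product([False, True], repeat=len(pairs)):
        rel = {p for p, keep in zip(pairs, bits) if keep}
        if all(ax(rel) for ax in axioms) and not theorem(rel):
            return rel
    return None


DOMAIN = (0, 1, 2)  # hypothetical three-element domain
countermodel = find_countermodel(
    DOMAIN,
    axioms=[lambda r: is_reflexive(r, DOMAIN), is_symmetric],
    theorem=is_transitive,
)
print(sorted(countermodel))
```

A non-empty result shows the toy "theorem" does not follow from the toy
axioms. A `None` result only shows that no countermodel exists at that
domain size, not that the theorem is derivable.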

Pay special attention to:
- The BEST Names table: are the Brief/Explicit/Summarizing/Technical
  names consistent with how the symbols are actually used?
- The consistency claim (Section 5.1): the paper claims no
  contradiction was found but provides no proof. How serious is this gap?
- The "constructive witness for m0" open question: does this undermine
  mc.ax1?
- The categorical formalization suggestion: is this feasible? What
  would it require?

PRODUCE a review report with: (a) a severity-ranked list of issues
(Critical / Major / Minor), (b) a recommendation (Accept / Revise /
Reject with reasons), (c) specific suggestions for each issue.

Use HELD/BREACH, not PASS/FAIL. Use "test"/"check", not
"validate"/"verify". Save the report at
source/matheology/hell/ll/study/b/12/review_b12-math_2026m04dNN.rst
(replace NN with today's two-digit day of the month).
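
The date-stamped save path can also be built programmatically. This is
a minimal sketch, assuming the stamp is always year, "m", two-digit
month, "d", two-digit day (inferred from the existing `2026m04d05`
filenames). The helper names `date_stamp` and `review_path` are
hypothetical.

```python
from datetime import date
from pathlib import PurePosixPath

# Directory taken from the save paths used in these prompts.
REVIEW_DIR = PurePosixPath("source/matheology/hell/ll/study/b/12")


def date_stamp(day: date) -> str:
    """Format a date as e.g. '2026m04d05' (assumed convention)."""
    return f"{day.year}m{day.month:02d}d{day.day:02d}"


def review_path(paper: str, day: date) -> PurePosixPath:
    """Build the review report path for one paper, e.g. 'b12-math'."""
    return REVIEW_DIR / f"review_{paper}_{date_stamp(day)}.rst"


print(review_path("b12-math", date(2026, 4, 5)))
# source/matheology/hell/ll/study/b/12/review_b12-math_2026m04d05.rst
```

Passing `date.today()` as the second argument gives the path for the
current session.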

b12-theophil: Theological/Philosophical Review

/effort max

You are a theologian and philosopher of religion reviewing a paper
that claims Genesis 1 encodes a formal construction logic and that
multiple independent traditions converge on fragments of the same
structure. Your job is to steelman every objection a careful
scholar would raise.

Read: source/matheology/hell/mm/b/12/mmv2/b12-theophil_2026m04d05.rst
Also read: .claude/CLAUDE.md (especially Language Rules, Core Principle)
Also read: source/matheology/hell/ll/study/b/12/study_ll_2026m04d05_a2-e7day-llog.rst
(the WoLC reference search report with full candidate analysis)

For EACH claim, answer:

1. GENESIS CORRESPONDENCES: Is the paper reading structure INTO the
   text (eisegesis) or extracting structure FROM the text (exegesis)?
   For each of the 4 structural predictions (missing verdict, double
   verdict, "very good" ambiguity, Sabbath structure), steelman the
   alternative explanation that the prediction is post-hoc pattern
   matching, not genuine structural prediction.

2. CROSS-TRADITIONAL CONVERGENCE: For each of the 7 traditions cited,
   steelman the objection that the mapping is superficial (humans
   organize things hierarchically; convergence on "cascading hierarchy"
   is trivial). What would distinguish genuine structural convergence
   from trivial pattern convergence?

3. OMPHALOS FIREWALL: Does the "constructor is a parameter" claim
   actually work? Can a theological paper genuinely remain neutral
   on whether the constructor is God? Or does the Genesis framing
   smuggle in a theological commitment that the "parametric" claim
   denies?

4. THEODICY IMPLICATIONS: The paper claims the EQUAL tension (m2)
   reframes the problem of evil. Does this reframing actually address
   the problem, or does it just relocate it? Steelman the objection
   that "the tension is structural" is just "God designed suffering"
   in formal language.

5. BABL/ZION AS SPIRITUAL DYNAMICS: Does the bifurcation add
   anything that existing theology (e.g., Augustine's "two cities,"
   the yetzer ha-tov/yetzer ha-ra distinction) does not already
   provide? If so, what specifically?

PRODUCE a review report with severity-ranked issues and specific
suggestions. Use HELD/BREACH, not PASS/FAIL. Save at
source/matheology/hell/ll/study/b/12/review_b12-theophil_2026m04dNN.rst
(replace NN with today's two-digit day of the month).
b12-syseng: Systems Engineering Review

/effort max

You are a senior systems architect reviewing a paper that claims
to provide a formal framework for self-correcting system design.
Your job is to test whether these design patterns actually work
in practice.

Read: source/matheology/hell/mm/b/12/mmv2/b12-syseng_2026m04d05.rst
Also read: .claude/CLAUDE.md (especially Language Rules)

For EACH design pattern and engineering claim, answer:

1. THE OKO PATTERN: Is "never declare OK" practical? How does this
   interact with real engineering constraints (ship dates, budget,
   regulatory compliance that REQUIRES a sign-off)?

2. THE JUBILEE PATTERN: Is 6:1 (1 consolidation sprint per 6 feature
   sprints) realistic? How does it compare to industry benchmarks
   (Google's 20% time, Spotify's hack weeks)? Is there evidence
   that 6:1 is better than 5:1 or 7:1?

3. OSCR DETECTION: Are the proposed indicators (decreasing exception
   handlers, increasing one-off fixes, system applied beyond design)
   measurable in practice? What existing monitoring tools could track
   these? What false-positive rate would you expect?

4. UMP MONITORING: The paper claims "if more than 30% of alerts are
   non-actionable, monitoring is approaching UMP collapse." Where
   does 30% come from? Is this testable?

5. MATURITY MODEL: The paper maps WoLC stages to maturity levels.
   How does this compare to existing maturity models (CMMI, DORA,
   Westrum typology)? Does it add anything they miss?

6. CASE STUDIES: The paper lacks specific case studies. Identify
   3 real-world system failures that fit the OSCR pattern and
   3 that do NOT fit. The non-fitting cases test the model's limits.

PRODUCE a review report. Save at
source/matheology/hell/ll/study/b/12/review_b12-syseng_2026m04dNN.rst
(replace NN with today's two-digit day of the month).
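
The two quantitative claims in questions 2 and 4 can be made concrete
with a short calculation. This is a sketch under stated assumptions:
each alert is labelled simply as actionable or not, "more than 30%" is
read as a strict inequality, and all function names are hypothetical
rather than taken from the paper.

```python
from typing import Sequence

UMP_THRESHOLD = 0.30  # the 30% figure quoted in question 4


def consolidation_share(feature_sprints: int, consolidation_sprints: int) -> float:
    """Fraction of all sprints spent on consolidation."""
    return consolidation_sprints / (feature_sprints + consolidation_sprints)


def non_actionable_ratio(actionable: Sequence[bool]) -> float:
    """Fraction of alerts labelled non-actionable (False entries)."""
    if not actionable:
        raise ValueError("no alerts to evaluate")
    return sum(1 for a in actionable if not a) / len(actionable)


def exceeds_ump_threshold(actionable: Sequence[bool],
                          threshold: float = UMP_THRESHOLD) -> bool:
    """True when strictly more than `threshold` of alerts are non-actionable."""
    return non_actionable_ratio(actionable) > threshold


# 6:1 means one sprint in seven, about 14.3%, below a 20% allocation.
print(round(consolidation_share(6, 1), 3))  # 0.143

# Hypothetical sample: 10 alerts, 4 of them non-actionable.
sample = [True] * 6 + [False] * 4
print(exceeds_ump_threshold(sample))  # True
```

Whether an alert counts as "actionable" is itself a labelling decision,
so the reviewer may want to ask how the paper proposes to make that
call.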

b12-socpsy: Psychology/Social Science Review

/effort max

You are a developmental psychologist reviewing a paper that claims
structural parallels between the e7Day model and Erikson, Maslow,
Kohlberg, Bloom, and Tuckman. Your job is to test whether these
parallels are genuine or forced.

Read: source/matheology/hell/mm/b/12/mmv2/b12-socpsy_2026m04d05.rst
Also read: .claude/CLAUDE.md (especially Language Rules)

For EACH parallel claimed, answer:

1. ERIKSON PARALLEL: The paper claims an 8-stage match, binary outcomes,
   and cascading dependency. But Erikson's ordering is fundamentally
   different (Trust first, not last). Is this a genuine structural
   parallel, or is the paper cherry-picking features that match while
   ignoring the ordering mismatch?

2. MASLOW PARALLEL: Maslow himself cautioned against rigid hierarchy.
   Does the parallel survive Maslow's own caveats?

3. SUPERVILLAIN THEOREM: The paper claims "heroes who stop growing
   become the most dangerous agents." Is this testable? What
   biographical or historical data would confirm or disconfirm it?
   Steelman the objection that "dictators start as heroes" is
   selection bias (we remember the hero-turned-dictator cases and
   forget the heroes who stayed heroes).

4. DUNNING-KRUGER GENERALIZATION: The paper claims that OK
   self-assessment at ANY competence level produces the same structural
   consequence. This goes beyond Dunning-Kruger (which primarily concerns
   overestimation at low competence). Is this supported by existing
   metacognition research?

5. COMPASSION CAPACITY: Is the five-gate model operationalizable?
   Could it be turned into a validated assessment instrument? What
   existing instruments (e.g., empathic accuracy measures, burnout
   inventories) overlap with which gates?

PRODUCE a review report. Save at
source/matheology/hell/ll/study/b/12/review_b12-socpsy_2026m04dNN.rst
(replace NN with today's two-digit day of the month).

b12-intro: General Audience Review

/effort max

You are an editor at a magazine for educated general readers (like
The Atlantic, Aeon, or Scientific American). You are reviewing a
paper that tries to explain why systems destroy themselves. Your
job is to test whether a non-specialist can follow the argument.

Read: source/matheology/hell/mm/b/12/mmv2/b12-intro_2026m04d05.rst
Also read: .claude/CLAUDE.md (especially Language Rules)

Answer:

1. THE TEASER: Does the 1,000-word teaser work standalone? Would a
   busy reader finish it? Would they want to read the rest?

2. JARGON CHECK: List every term that a general reader would not
   know. For each, is it explained when first used? Rate: explained /
   unexplained / partially explained.

3. THE GENESIS FRAMING: For a secular reader, does the Genesis
   connection help or hurt? Does the paper adequately signal that the
   formal structure is independent of the theological instantiation?
   Or will secular readers bounce at "Genesis"?

4. THE "SO WHAT" TEST: After reading, does the reader know what to
   DO differently? Is Section 4 ("What To Do With This") actionable
   or generic?

5. EMOTIONAL ENGAGEMENT: The paper covers self-destruction,
   compassion failure, and heroes becoming villains. Does it land
   emotionally or is it too abstract? Where does the reader feel
   something, and where do they check out?

6. LENGTH: Is 5K words (plus 1K teaser) the right length? Where
   does it drag? Where does it rush?

PRODUCE a review report. Save at
source/matheology/hell/ll/study/b/12/review_b12-intro_2026m04dNN.rst
(replace NN with today's two-digit day of the month).