:orphan:

*******************************************************************************
Extraction Lessons Learned (cumulative)
*******************************************************************************

.. note::

   **Naming convention (2026m04d05):** The series uses b-numbers (b11--b17)
   not a-numbers (a1--a7). Papers formerly called a1--a7 are now b11--b17.
   This aligns with the ``hell/mm/b/NN/`` and ``hell/ll/study/b/NN/`` folder
   structure. See ``BB/series-status.rst`` for the full mapping.


Section 1: From b12 Extraction (2026m04d04)
=============================================

Recommendations for preparing extraction prompts for papers b13 (e7He),
b14 (JUB), and beyond. Based on the b12 extraction session that walked
24,945 lines across three FORGE logs.


What Worked Well (Keep These)
-------------------------------

1. **Orient-Extract-Build sequence.** Reading the model source and writing
   the prompt BEFORE reading the forge logs is essential. Without knowing
   what the paper needs, extraction is unfocused.

2. **Six extraction categories (A--F).** Categories A (Design Rationales) and
   F (OSCR/BABL) yielded the richest material. B (TEMPER Refinements) was
   most valuable for the paper writer. All six were useful; none were
   redundant.

3. **Organize KB by model structure, not by category.** The paper writer works
   through submodels (or stages, or axiom groups) sequentially. Organizing the
   KB to match this flow is more efficient than category-first organization.

4. **Parallel extraction agents for large files.** Two agents reading 14K and
   10K lines simultaneously cut wall-clock time in half.

5. **Scope rule with exception clause.** "Only b12 material, EXCEPT where b13
   material illuminates b12" was clear and practical. The boundary was sharp.

6. **Restate rather than point (80/20 ratio).** The KB restates roughly 80% of
   the material and points to the source for the rest, so the paper writer
   should rarely need to go back to the original forge logs.


What to Improve
-----------------

1. **Rejected Alternatives (Category C) needs better source material.** Most
   rejections in forge logs are implicit (ideas that simply disappeared). Future
   forge sessions should explicitly flag "explored and rejected" decisions.
   Workaround: in the extraction prompt, tell the agent to look for ideas that
   appeared in HEAT but NOT in STRIKE --- the absence IS the rejection.

2. **Line number precision is approximate (~50-line window).** Acceptable for
   orientation but not for precise citation. If exact citation is needed, the
   agent should record the first distinctive text string near the reference
   point so the paper writer can search for it.

3. **Cross-cutting material needs its own section.** BABL/ZION for b12,
   stopping-outcomes for b13 --- any material that spans multiple structural
   units needs a dedicated KB section.
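
The search-string workaround in item 2 can be sketched as follows. This is a
minimal illustration, not the procedure used in the b12 session: the function
name ``find_anchor``, the sample log, and the anchor text are all hypothetical.
Only the ~50-line window comes from the notes above.

```python
def find_anchor(lines, anchor, approx_line, window=50):
    """Return the 1-based line number of the occurrence of `anchor` that is
    closest to `approx_line`, searching only within +/- `window` lines.
    Returns None when the anchor does not occur in that window."""
    lo = max(1, approx_line - window)
    hi = min(len(lines), approx_line + window)
    hits = [n for n in range(lo, hi + 1) if anchor in lines[n - 1]]
    if not hits:
        return None
    return min(hits, key=lambda n: abs(n - approx_line))


# Hypothetical 200-line log with one distinctive string on line 130.
log = [f"filler line {i}" for i in range(1, 201)]
log[129] = "HYPOTHETICAL: first distinctive sentence near the reference"

# The KB recorded "about line 100" plus the anchor string.
resolved = find_anchor(log, "first distinctive sentence", approx_line=100)
print(resolved)
```

Recording the anchor string together with the approximate line number lets the
paper writer resolve the exact location even after the log has been edited and
its line numbers have shifted.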


Specific Recommendations
==========================


For Paper b13 (e7He)
----------------------

- **Source:** sa3 (b/13/llog.rst, 10K lines) is nearly all b13-relevant.
  sa2 has some b13-relevant material (th7 Gate 5 development, ~lines 8650--8920).
- **Organizational axis:** The 7 hero journey stages (like submodels for b12).
- **Extra category:** Add "Stopping Outcomes" alongside the standard A--F.
  This is the e7He-specific material analogous to b12's OSCR/BABL.
- **Straddle material:** The Ie framework, 4D scope, and ridge dynamics
  straddle b12/b13. The b12 KB already has pointers; the b13 KB should own the
  full development.
- **Single agent is sufficient.** 10K lines is manageable in one systematic
  walk, unlike b12's 25K across three files.


For Paper b14 (JUB)
---------------------

- **Source:** ~50 llog entries across ``hell/ll/jub/`` (multiple files).
  The extraction prompt must provide a file list, not one file.
- **Pre-FORGE material:** JUB logs lack HEAT/STRIKE/TEMPER/QUENCH structure.
  Categories A--F still apply, but the agent should expect less structured
  source material.
- **Bridge from b12:** The PERFECT/PERFIDE -> Jubilee connection (b12 KB, m2
  section) is the primary bridge. Start from there.
- **Extra category:** Consider "Formal Gaps" --- JUB has more acknowledged
  gaps than e7Day (proto-formal predicates, undefined agency semantics, etc.).
- **Flag b15 material:** The ax11/ax11b fork belongs to b15, not b14.


For All Future Extractions
----------------------------

- Always include the ``WHAT NOT TO DO`` section --- it prevents the most
  common errors (using "validate," writing the paper, duplicating content).
- Always include ``BUILD CHECK: Run "make dev"`` --- RST errors caught early
  save time.
- The retrospective step (Phase 4) pays for itself: each extraction improves
  the next. Keep it.


Section 2: From b13 Extraction (2026m04d06)
=============================================

Recommendations for preparing extraction prompts for papers b14 (JUB),
b15 (Divine Simplicity), b16 (RiskyMADorMAP), and b17 (Full Synthesis).
Based on the b13 extraction session that walked 10,242 lines of sa3 plus
secondary sources (sa2 th7 Gate 5, PROMY pipeline/audit).


What Worked Well (Keep These)
-------------------------------

1. **b12 lessons learned directly improved b13.** Organizing by stage, adding
   category G (Stopping Outcomes), and the HEAT-vs-STRIKE approach for finding
   rejected alternatives all worked as recommended.

2. **Parallel extraction agents (4 agents, ~2000 lines each).** Effective for
   the 10K-line primary source. Wall-clock time ~60% of sequential reading.

3. **Category G (Stopping Outcomes) added substantial value.** This
   model-specific cross-cutting category captured the central narrative device
   of e7He (why perpetual cycling is necessary, what happens when the hero
   stops). Without it, material would have been scattered.

4. **b12 KB "Notes for Other Papers" section was useful as a starting pointer.**
   All 4 bullet points for b13 were accurate and useful, with no false leads.
   They confirmed that sa3 is ~90% e7He-relevant, identified the Ie/scope
   straddle, and flagged the rest-vs-stopping paradox.

5. **High restate ratio (~85%) appropriate.** The paper writer should rarely
   need to return to the forge log. Dense source material justifies restating
   over pointing.
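
The HEAT-vs-STRIKE approach from item 1 amounts to a set difference: ideas seen
in HEAT minus ideas seen in STRIKE. The sketch below assumes, purely for
illustration, that each phase starts with a line containing only the phase name
and that ideas carry a ``[idea:...]`` tag. Neither convention is taken from the
actual forge logs, and in practice the agent identifies ideas by reading, not
by tags.

```python
import re

PHASES = ("HEAT", "STRIKE", "TEMPER", "QUENCH")
# Hypothetical tag syntax, assumed only for this sketch.
IDEA_TAG = re.compile(r"\[idea:([a-z0-9-]+)\]")


def ideas_by_phase(lines):
    """Collect the set of idea tags mentioned under each phase header."""
    found = {phase: set() for phase in PHASES}
    current = None
    for line in lines:
        stripped = line.strip()
        if stripped in PHASES:
            current = stripped  # a new phase begins here
            continue
        if current is not None:
            found[current].update(IDEA_TAG.findall(line))
    return found


def implicit_rejections(lines):
    """Ideas that appear in HEAT but never in STRIKE."""
    found = ideas_by_phase(lines)
    return sorted(found["HEAT"] - found["STRIKE"])


sample = [
    "HEAT",
    "Consider [idea:option-a] or [idea:option-b].",
    "STRIKE",
    "Adopt [idea:option-b].",
]
print(implicit_rejections(sample))  # ['option-a']
```

The result is a candidate list for Category C only. Each entry still needs a
human or agent check, since an idea can vanish from STRIKE because it was
renamed or merged, not rejected.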


What to Improve
-----------------

1. **Front-loading problem in parallel extraction.** sa3's first 1500 lines
   (IGNITE + SEED) contain the richest conceptual material; later TEMPER
   rounds are more mechanical. Uniform 2000-line segments gave equal weight
   to all sections. Future extractions should weight segments by conceptual
   density, not line count.

2. **Theorem section organization.** The KB organized theorems by number. The
   paper writer would benefit from dependency-graph ordering (th1 depends on
   sp1; th3 depends on th1 + m0.ax3 + m7; th6 depends on m0.ax5; th7 depends
   on th6). Future KBs should organize theorems by derivation chain.
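
The dependency-graph ordering in item 2 is a topological sort. A minimal sketch
using the Python standard library (``graphlib``, Python 3.9+) and only the
dependencies listed above follows. The function name ``derivation_order`` is
hypothetical, and the real theorem graph may contain more edges than the four
quoted here.

```python
from graphlib import TopologicalSorter

# Each key depends on every item in its value set (from item 2 above).
DEPENDS_ON = {
    "th1": {"sp1"},
    "th3": {"th1", "m0.ax3", "m7"},
    "th6": {"m0.ax5"},
    "th7": {"th6"},
}


def derivation_order(depends_on):
    """Return all nodes so that every item comes after its prerequisites."""
    return list(TopologicalSorter(depends_on).static_order())


order = derivation_order(DEPENDS_ON)
theorems = [name for name in order if name.startswith("th")]
print(theorems)
```

Any valid order places th1 before th3 and th6 before th7. The relative order of
the two independent chains is not fixed, so the KB author is still free to
choose which chain to present first.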


Specific Recommendations
==========================


For Paper b14 (JUB)
----------------------

- **Source structure differs fundamentally.** JUB material is spread across
  many files in ``hell/ll/jub/`` WITHOUT structured HEAT/STRIKE/TEMPER/QUENCH
  phases. The extraction prompt must provide a file list, not rely on a single
  forge log.
- **Pre-extraction inventory.** Consider creating an inventory of JUB-relevant
  llogs before starting extraction itself. This scoping step was unnecessary
  for b12 (three forge logs) and b13 (one primary file, sa3) but will be
  essential for b14's distributed sources.
- **Bridge from b13.** The PD-to-Assurance transformation (th6) has
  system-level consequences for JUB. Start from the b13 KB's "Notes for Other
  Papers / For Paper b14" section: ax19 vulnerability, ax25/m0.ax5 macro/micro
  echo.
- **Model-specific cross-cutting category:** Consider "Recalibration
  Mechanisms" (the Jubilee cycle's reset function), analogous to b13's
  "Stopping Outcomes."
- **Extra standard category:** Consider "Formal Gaps" --- JUB has more
  acknowledged gaps than e7Day or e7He (proto-formal predicates, undefined
  agency semantics).


For Paper b16 (RiskyMADorMAP)
--------------------------------

- **Key input from b13:** th5 metastability refinement (BABL quasi-absorbing
  with exit rate lambda_ISMR > 0) and AA-th8-Metastable-a1 are direct inputs.
- **Model-specific cross-cutting category:** Consider "Attractor Dynamics"
  (CTMC absorbing-state analysis).


For All Future Extractions
----------------------------

- Continue including "Notes for Other Papers" --- the b12 and b13 sections
  both proved their value as starting pointers.
- Identify the model-specific cross-cutting category BEFORE starting
  extraction, not after. This ensures the extraction agents know to collect
  cross-cutting material from the start.
- When using parallel agents, brief them on which segments are likely
  conceptually dense vs. mechanically structured, so they can adjust
  attention accordingly.
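
One way to act on the density point is to cut segments by weighted load
instead of by line count. The sketch below is an assumption-laden illustration:
the function name ``weighted_segments`` and the 3:1 density weight are invented
for the example. Only the 10,242-line total and the dense first 1500 lines come
from the b13 notes above.

```python
def weighted_segments(regions, n_agents):
    """Split a file into contiguous line ranges of roughly equal weight.

    regions:  list of (n_lines, weight_per_line) in file order.
    Returns a list of (start, end) 1-based inclusive line ranges.
    """
    total = sum(n * w for n, w in regions)
    target = total / n_agents
    cuts, acc, line, next_cut = [], 0.0, 0, target
    for n_lines, weight in regions:
        for _ in range(n_lines):
            line += 1
            acc += weight
            # Close a segment once its share of the weight is reached.
            if acc >= next_cut and len(cuts) < n_agents - 1:
                cuts.append(line)
                next_cut += target
    bounds = [0] + cuts + [line]
    return [(bounds[i] + 1, bounds[i + 1]) for i in range(len(bounds) - 1)]


# Assumed weights: first 1500 lines count 3x, the remaining 8742 lines 1x.
segments = weighted_segments([(1500, 3.0), (8742, 1.0)], n_agents=4)
for start, end in segments:
    print(start, end, end - start + 1)
```

Under these assumed weights the first agent receives a much shorter range than
the others, which gives the dense opening material proportionally more
attention. The weights themselves have to come from a quick pre-scan or from
the person who ran the forge session.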


Section 3: From b14 Extraction (2026m04d08)
=============================================

Recommendations for preparing extraction prompts for papers b15 (Divine
Simplicity), b16 (RiskyMADorMAP), b17 (h* Theorem), and b18 (Call to
Action). Based on the b14 extraction session that walked ~100 files across
``hell/ll/jub/b/11/`` through ``b/50/``.


What Worked Well (Keep These)
-------------------------------

1. **Parallel batch extraction (4 agents).** The largest corpus in the series
   (~100 files) was handled by splitting into 4 batches by file number range.
   Wall-clock time approximately 4 minutes for all batches.

2. **Axiom-group organization.** Organizing by axiom group (Agency,
   Delegation, Volunteer/Mediator, Preference, Innovation/Jubilee) rather
   than by extraction category made the KB directly usable by the paper
   writer. Confirmed b12 recommendation.

3. **Category H (Formal Gaps) added substantial value.** The JUB corpus has
   the most explicit gap-acknowledgment thanks to 3-round adversarial critique
   + 3-angle stress-test. The consolidated "Top 5 Mathematical + Top 5
   Feasibility + Resolution Grading" section provides a complete honesty
   checklist for the paper writer.

4. **Category I (Call-to-Action Material) worked well.** Captured material
   that would otherwise have been scattered across A and E. The register
   assessment (academic vs. urgent) directly informs b18 strategy.

5. **Steelmanning section organized by stakeholder type.** Economists,
   socialists, libertarians, theologians, determinists --- each with their
   strongest objection and the JUB response. Directly usable for the paper's
   discussion section.

6. **High restate ratio (~90%).** Necessary for a multi-file corpus, where
   pointing requires the reader to open many different files. Higher than
   b12 (80%) and b13 (85%) but appropriate for the distributed source
   structure.


What to Improve
-----------------

1. **Pre-sorted file priority list.** The original prompt provided a KEY FILES
   list but it was incomplete. Future multi-file extractions should include
   three tiers: HIGH (read completely), MEDIUM (scan for specific categories),
   LOW (skip unless time permits). For b14, the actual priority was:

   - HIGH: b/11, b/16, b/17, b/18, b/19, b/20, b/21, b/22, b/23, b/27--b/35,
     b/42--b/44
   - MEDIUM: b/12--b/15, b/24, b/25, b/49, b/50
   - LOW: b/36--b/41, b/45--b/48 (infrastructure, naming, migration)

2. **Redundancy handling for parallel compilations.** The b/11--b/16 range
   contains 4 parallel compilations of the same session (Sonnet
   details/overview, Opus details/overview, plus 2 final-memory logs). Future
   extractions with parallel compilations should identify the unique material
   in each compilation FIRST, then extract only unique material.

3. **Infrastructure files need explicit filtering.** b/36--b/50 mix
   JUB-relevant design decisions with BEST Names technical implementation
   detail. The extraction prompt should explicitly state: "From b/36--b/50,
   extract ONLY material about axiom/theorem content, design rationales, or
   adversarial findings. Skip naming conventions, label syntax, migration
   mechanics."
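
The three-tier list in item 1 can be kept as data and checked for overlaps and
omissions before it goes into a prompt. The sketch below transcribes the b14
tiers from item 1. The names ``TIERS``, ``expand``, and ``tier_map`` are
hypothetical, and the 11--50 folder range is taken from the ``b/11/`` through
``b/50/`` span mentioned in the section introduction.

```python
# Folder numbers under hell/ll/jub/b/, transcribed from item 1 above.
TIERS = {
    "HIGH": "11, 16, 17, 18, 19, 20, 21, 22, 23, 27-35, 42-44",
    "MEDIUM": "12-15, 24, 25, 49, 50",
    "LOW": "36-41, 45-48",
}


def expand(spec):
    """Turn '12-15, 24' into the set {12, 13, 14, 15, 24}."""
    numbers = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            numbers.update(range(int(lo), int(hi) + 1))
        else:
            numbers.add(int(part))
    return numbers


def tier_map(tiers):
    """Map each folder number to its tier; reject double assignment."""
    assigned = {}
    for tier, spec in tiers.items():
        for number in expand(spec):
            if number in assigned:
                raise ValueError(f"b/{number} is in two tiers")
            assigned[number] = tier
    return assigned


assigned = tier_map(TIERS)
unassigned = sorted(set(range(11, 51)) - set(assigned))
print(unassigned)  # [26]
```

With the tiers transcribed as above, the check reports b/26 as unassigned. This
document does not say whether that folder exists or was simply left out of the
list, so the result is a prompt for a manual check, not a correction.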


Specific Recommendations
==========================


For Paper b16 (RiskyMADorMAP)
--------------------------------

- **Source:** RiskyMADorMAP CTMC model material is spread across b/19
  (Reply 1b, Part B), b/21 (Reply 2, competitive inhibitor), b/30
  (restructuring 2d), and b/44 (math stress-test). The b14 KB's
  "RiskyMADorMAP Model" section has pre-digested cross-references.
- **Bridge from b14:** Competitive-inhibitor model, commons-tragedy
  convergence, N=1 credibility limitations, survivorship bias.
- **Model-specific cross-cutting category:** Consider "Attractor Dynamics"
  (CTMC absorbing-state analysis, metastability timescales, technological
  amplification).
- **Extra source:** The b12 foundation test summary may have relevant
  CTMC formalization material.


For Paper b17 (h* Theorem)
-----------------------------

- **Source:** ax19 and th6 material is concentrated in b/11, b/13 (opus-regen
  details), b/18 (Reply 1, fitness analogy), b/21 (Reply 2, scalar
  projection), b/27 (restructuring 2a), b/30 (restructuring 2d).
- **Bridge from b14:** Fitness analogy defense, ontological vs. epistemic
  distinction, h* does not require traditional privilege.
- **The epistemic/ontological distinction is critical:** ax19 claims h*
  exists, NOT that anyone can identify h* in real time. This distinction
  must be front and center in b17.


For All Future Extractions
----------------------------

- Multi-file corpora benefit from parallel batch extraction (4 agents for
  ~100 files, 2 for ~20 files, 1 for ~10 files). Size the batches by
  conceptual density, not file count.
- The adversarial critique/reply structure (b14's Quest format) yields
  richer TEMPER Refinement and Formal Gap material than FORGE logs. Future
  models with adversarial rounds should preserve the critique-reply pairing
  in the extraction.
- Resolution grading (P/S/L/A) from stress-tests is directly usable as
  paper material. Include grading definitions and distribution in the KB.
- When a corpus has both content files and infrastructure files, provide
  explicit filtering instructions to prevent infrastructure details from
  diluting content extraction.
