Extraction Lessons Learned (cumulative)

Note

Naming convention (2026m04d05): The series uses b-numbers (b11–b17) not a-numbers (a1–a7). Papers formerly called a1–a7 are now b11–b17. This aligns with the hell/mm/b/NN/ and hell/ll/study/b/NN/ folder structure. See BB/series-status.rst for the full mapping.

Section 1: From b12 Extraction (2026m04d04)

Recommendations for preparing extraction prompts for papers b13 (e7He), b14 (JUB), and beyond. Based on the b12 extraction session that walked 24,945 lines across three FORGE logs.

What Worked Well (Keep These)

Orient-Extract-Build sequence. Reading the model source and the writing prompt BEFORE the forge logs is essential. Without knowing what the paper needs, extraction is unfocused.
Six extraction categories (A–F). Categories A (Design Rationales) and F (OSCR/BABL) yielded the richest material. B (TEMPER Refinements) was most valuable for the paper writer. All six were useful; none were redundant.
Organize KB by model structure, not by category. The paper writer works through submodels (or stages, or axiom groups) sequentially. Organizing the KB to match this flow is more efficient than category-first organization.
Parallel extraction agents for large files. Two agents reading 14K and 10K lines simultaneously cut wall-clock time in half.
Scope rule with exception clause. “Only a2 material, EXCEPT where a3 material illuminates a2” (a2 is now b12 and a3 is now b13 under the current naming convention) was clear and practical. The boundary was sharp.
Restate > point (80/20 ratio). The paper writer should rarely need to go back to the original forge logs.

What to Improve

Rejected Alternatives (Category C) needs better source material. Most rejections in forge logs are implicit (ideas that simply disappeared). Future forge sessions should explicitly flag “explored and rejected” decisions. Workaround: in the extraction prompt, tell the agent to look for ideas that appeared in HEAT but NOT in STRIKE — the absence IS the rejection.
Line number precision is approximate (~50-line window). Acceptable for orientation but not for precise citation. If exact citation is needed, the agent should record the first distinctive text string near the reference point so the paper writer can search for it.
Cross-cutting material needs its own section. BABL/ZION for b12, stopping-outcomes for b13 — any material that spans multiple structural units needs a dedicated KB section.
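The two workarounds above (treating absence from STRIKE as the rejection, and recording a distinctive text string instead of a bare line number) can be sketched in Python. This is a minimal illustration, not part of the extraction tooling: the function names are hypothetical, and it assumes the ideas from each phase have already been reduced to short labels and that the log is available as a list of lines.

```python
def implicit_rejections(heat_ideas, strike_ideas):
    """Return ideas raised in HEAT that never reappear in STRIKE.

    Both arguments are lists of idea labels (hypothetical: the labelling
    itself is an editorial step done by the extraction agent).
    Order of first appearance in HEAT is preserved; duplicates are dropped.
    """
    kept = set(strike_ideas)
    seen = set()
    rejected = []
    for idea in heat_ideas:
        if idea not in kept and idea not in seen:
            seen.add(idea)
            rejected.append(idea)
    return rejected


def locate_anchor(log_lines, anchor_text):
    """Return the 1-based number of the first line containing anchor_text.

    Returns None when the string is not found, which signals that the
    recorded anchor was not distinctive enough or the log has changed.
    """
    for number, line in enumerate(log_lines, start=1):
        if anchor_text in line:
            return number
    return None
```

Storing the anchor string alongside the approximate line number lets the paper writer recover the exact location with a plain text search even if the log is later edited.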

Specific Recommendations

For Paper b13 (e7He)

Source: sa3 (b/13/llog.rst, 10K lines) is nearly all b13-relevant. sa2 has some b13-relevant material (th7 Gate 5 development, ~lines 8650–8920).
Organizational axis: The 7 hero journey stages (like submodels for b12).
Extra category: Add “Stopping Outcomes” alongside the standard A–F. This is the e7He-specific material analogous to b12’s OSCR/BABL.
Straddle material: The Ie framework, 4D scope, and ridge dynamics straddle b12/b13. The b12 KB already has pointers; the b13 KB should own the full development.
Single agent is sufficient. 10K lines is manageable in one systematic walk, unlike b12’s 25K across three files.

For Paper b14 (JUB)

Source: ~50 llog entries across hell/ll/jub/ (multiple files). The extraction prompt must provide a file list, not one file.
Pre-FORGE material: JUB logs lack HEAT/STRIKE/TEMPER/QUENCH structure. Categories A–F still apply, but the agent should expect less structured source material.
Bridge from b12: The PERFECT/PERFIDE -> Jubilee connection (b12 KB, m2 section) is the primary bridge. Start from there.
Extra category: Consider “Formal Gaps” — JUB has more acknowledged gaps than e7Day (proto-formal predicates, undefined agency semantics, etc.).
Flag b15 material: The ax11/ax11b fork belongs to b15, not b14.

For All Future Extractions

Always include the WHAT NOT TO DO section — it prevents the most common errors (using “validate,” writing the paper, duplicating content).
Always include BUILD CHECK: Run "make dev" — RST errors caught early save time.
The retrospective step (Phase 4) pays for itself: each extraction improves the next. Keep it.

Section 2: From b13 Extraction (2026m04d06)

Recommendations for preparing extraction prompts for papers b14 (JUB), b15 (Divine Simplicity), b16 (RiskyMADorMAP), and b17 (Full Synthesis). Based on the b13 extraction session that walked 10,242 lines of sa3 plus secondary sources (sa2 th7 Gate 5, PROMY pipeline/audit).

What Worked Well (Keep These)

b12 lessons learned directly improved b13. Organizing by stage, adding category G (Stopping Outcomes), and the HEAT-vs-STRIKE approach for finding rejected alternatives all worked as recommended.
Parallel extraction agents (4 agents, ~2000 lines each). Effective for the 10K-line primary source. Wall-clock time ~60% of sequential reading.
Category G (Stopping Outcomes) added substantial value. This model-specific cross-cutting category captured the central narrative device of e7He (why perpetual cycling is necessary, what happens when the hero stops). Without it, material would have been scattered.
b12 KB “Notes for Other Papers” section useful as starting pointer. All 4 bullet points for b13 were accurate and useful. No false leads. Confirmed sa3 is ~90% e7He-relevant, identified the Ie/scope straddle, flagged rest-vs-stopping paradox.
High restate ratio (~85%) appropriate. The paper writer should rarely need to return to the forge log. Dense source material justifies restating over pointing.

What to Improve

Front-loading problem in parallel extraction. The first 1500 lines of sa3 (IGNITE + SEED) contain the richest conceptual material; later TEMPER rounds are more mechanical. Uniform 2000-line segments gave every part of the log the same attention regardless of density. Future extractions should weight segments by conceptual density, not line count.
Theorem section organization. The KB organized theorems by number. The paper writer would benefit from dependency-graph ordering (th1 depends on sp1; th3 depends on th1 + m0.ax3 + m7; th6 depends on m0.ax5; th7 depends on th6). Future KBs should organize theorems by derivation chain.
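The derivation-chain ordering described above can be computed with a topological sort. The sketch below uses Python's standard-library graphlib module (Python 3.9 or later) and only the dependencies listed in the bullet; treating sp1, m0.ax3, m0.ax5, and m7 as roots with no prerequisites of their own is an assumption made for illustration.

```python
from graphlib import TopologicalSorter

# Each key depends on every item in its set (taken from the list above).
DEPENDENCIES = {
    "th1": {"sp1"},
    "th3": {"th1", "m0.ax3", "m7"},
    "th6": {"m0.ax5"},
    "th7": {"th6"},
}

# static_order() yields every node after all of its prerequisites.
derivation_order = list(TopologicalSorter(DEPENDENCIES).static_order())

# Keep only the theorems, in an order the paper writer can follow
# without meeting a result before the results it relies on.
theorem_order = [name for name in derivation_order if name.startswith("th")]
print(theorem_order)
```

The relative order of independent chains (for example th1 versus th6) is not fixed by the sort, so the KB author still chooses which chain to present first.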

Specific Recommendations

For Paper b14 (JUB)

Source structure differs fundamentally. JUB material is spread across many files in hell/ll/jub/ WITHOUT structured HEAT/STRIKE/TEMPER/QUENCH phases. The extraction prompt must provide a file list, not rely on a single forge log.
Pre-extraction inventory. Consider creating an inventory of JUB-relevant llogs before starting extraction itself. This scoping step was unnecessary for b12 (single sa2 file) and b13 (single sa3 file) but will be essential for b14’s distributed sources.
Bridge from b13. The PD-to-Assurance transformation (th6) has system-level consequences for JUB. Start from the b13 KB’s “Notes for Other Papers / For Paper b14” section: ax19 vulnerability, ax25/m0.ax5 macro/micro echo.
Model-specific cross-cutting category: Consider “Recalibration Mechanisms” (the Jubilee cycle’s reset function), analogous to b13’s “Stopping Outcomes.”
Extra standard category: Consider “Formal Gaps” — JUB has more acknowledged gaps than e7Day or e7He (proto-formal predicates, undefined agency semantics).

For Paper b16 (RiskyMADorMAP)

Key input from b13: th5 metastability refinement (BABL quasi-absorbing with exit rate lambda_ISMR > 0) and AA-th8-Metastable-a1 are direct inputs.
Model-specific cross-cutting category: Consider “Attractor Dynamics” (CTMC absorbing-state analysis).

For All Future Extractions

Continue including “Notes for Other Papers” — the b12 and b13 sections both proved their value as starting pointers.
Identify the model-specific cross-cutting category BEFORE starting extraction, not after. This ensures the extraction agents know to collect cross-cutting material from the start.
When using parallel agents, brief them on which segments are likely conceptually dense vs. mechanically structured, so they can adjust attention accordingly.
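One way to turn the density briefing into concrete segment boundaries is to split the log by weighted load rather than by line count. The following is a minimal sketch under stated assumptions: the function name is hypothetical, the density values are editorial estimates rather than measurements, and densities are assumed to be positive.

```python
def weighted_segments(regions, n_agents):
    """Split a log into n_agents contiguous line ranges of similar weighted load.

    regions: list of (line_count, density) pairs in file order, where density
    is a relative per-line weight chosen by whoever prepares the prompt.
    Returns a list of (first_line, last_line) tuples, 1-based and inclusive.
    """
    # One cost entry per line of the log.
    costs = [density for line_count, density in regions for _ in range(line_count)]
    total = sum(costs)
    segments = []
    start = 1
    running = 0.0
    for line_no, cost in enumerate(costs, start=1):
        running += cost
        # Close the current segment once the cumulative load reaches its share.
        target = total * (len(segments) + 1) / n_agents
        if running >= target and len(segments) < n_agents - 1:
            segments.append((start, line_no))
            start = line_no + 1
    # The last agent takes whatever remains.
    segments.append((start, len(costs)))
    return segments


# Hypothetical briefing: a dense opening region weighted 3x, then mechanical rounds.
plan = weighted_segments([(1500, 3.0), (8500, 1.0)], n_agents=4)
```

With weights like these, the agent assigned to the dense opening receives a much shorter range than the agents covering the mechanical rounds, which is the adjustment the recommendation asks for.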

Section 3: From b14 Extraction (2026m04d08)

Recommendations for preparing extraction prompts for papers b15 (Divine Simplicity), b16 (RiskyMADorMAP), b17 (h* Theorem), and b18 (Call to Action). Based on the b14 extraction session that walked ~100 files across hell/ll/jub/b/11/ through b/50/.

What Worked Well (Keep These)

Parallel batch extraction (4 agents). The largest corpus in the series (~100 files) was handled by splitting into 4 batches by file number range. Wall-clock time approximately 4 minutes for all batches.
Axiom-group organization. Organizing by axiom group (Agency, Delegation, Volunteer/Mediator, Preference, Innovation/Jubilee) rather than by extraction category made the KB directly usable by the paper writer. Confirmed b12 recommendation.
Category H (Formal Gaps) added substantial value. The JUB corpus has the most explicit gap-acknowledgment thanks to 3-round adversarial critique + 3-angle stress-test. The consolidated “Top 5 Mathematical + Top 5 Feasibility + Resolution Grading” section provides a complete honesty checklist for the paper writer.
Category I (Call-to-Action Material) worked well. Captured material that would otherwise have been scattered across A and E. The register assessment (academic vs. urgent) directly informs b18 strategy.
Steelmanning section organized by stakeholder type. Economists, socialists, libertarians, theologians, determinists — each with their strongest objection and the JUB response. Directly usable for the paper’s discussion section.
High restate ratio (~90%). Necessary for a multi-file corpus, where pointing requires the reader to open many different files. Higher than b12 (80%) and b13 (85%) but appropriate for the distributed source structure.

What to Improve

Pre-sorted file priority list. The original prompt provided a KEY FILES list but it was incomplete. Future multi-file extractions should include three tiers: HIGH (read completely), MEDIUM (scan for specific categories), LOW (skip unless time permits). For b14, the actual priority was:
- HIGH: b/11, b/16, b/17, b/18, b/19, b/20, b/21, b/22, b/23, b/27–b/35, b/42–b/44
- MEDIUM: b/12–b/15, b/24, b/25, b/49, b/50
- LOW: b/36–b/41, b/45–b/48 (infrastructure, naming, migration)
Redundancy handling for parallel compilations. The b/11–b/16 range contains 4 parallel compilations of the same session (Sonnet details/overview, Opus details/overview, plus 2 final-memory logs). Future extractions with parallel compilations should identify the unique material in each compilation FIRST, then extract only unique material.
Infrastructure files need explicit filtering. b/36–b/50 mix JUB-relevant design decisions with BEST Names technical implementation detail. The extraction prompt should explicitly state: “From b/36–b/50, extract ONLY material about axiom/theorem content, design rationales, or adversarial findings. Skip naming conventions, label syntax, migration mechanics.”
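The three-tier priority list above can be kept as data so that an extraction prompt is generated from it and checked for gaps. This is a hedged sketch: the structure and function names are hypothetical, and the numbers are the b/NN folder numbers exactly as listed above.

```python
# Inclusive (first, last) folder-number ranges per tier, copied from the list above.
TIERS = {
    "HIGH": [(11, 11), (16, 23), (27, 35), (42, 44)],
    "MEDIUM": [(12, 15), (24, 25), (49, 50)],
    "LOW": [(36, 41), (45, 48)],
}


def tier_of(folder_number):
    """Return the tier name for a b/NN folder number, or None if unlisted."""
    for tier, ranges in TIERS.items():
        if any(first <= folder_number <= last for first, last in ranges):
            return tier
    return None


def unassigned(first=11, last=50):
    """List folder numbers in the corpus range that no tier mentions."""
    return [n for n in range(first, last + 1) if tier_of(n) is None]


print(unassigned())
```

Run against the list as written, the check reports b/26 as the only number in the b/11 to b/50 range without a tier. Whether that folder exists or was skipped deliberately is not stated here, so the prompt author should confirm before extraction starts.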

Specific Recommendations

For Paper b16 (RiskyMADorMAP)

Source: RiskyMADorMAP CTMC model material is spread across b/19 (Reply 1b, Part B), b/21 (Reply 2, competitive inhibitor), b/30 (restructuring 2d), and b/44 (math stress-test). The b14 KB’s “RiskyMADorMAP Model” section has pre-digested cross-references.
Bridge from b14: Competitive-inhibitor model, commons-tragedy convergence, N=1 credibility limitations, survivorship bias.
Model-specific cross-cutting category: Consider “Attractor Dynamics” (CTMC absorbing-state analysis, metastability timescales, technological amplification).
Extra source: The b12 foundation test summary may have relevant CTMC formalization material.

For Paper b17 (h* Theorem)

Source: ax19 and th6 material is concentrated in b/11, b/13 (opus-regen details), b/18 (Reply 1, fitness analogy), b/21 (Reply 2, scalar projection), b/27 (restructuring 2a), b/30 (restructuring 2d).
Bridge from b14: Fitness analogy defense, ontological vs. epistemic distinction, h* does not require traditional privilege.
The epistemic/ontological distinction is critical: ax19 claims h* exists, NOT that anyone can identify h* in real time. This distinction must be front and center in b17.

For All Future Extractions

Multi-file corpora benefit from parallel batch extraction (4 agents for ~100 files, 2 for ~20 files, 1 for ~10 files). Size the batches by conceptual density, not file count.
The adversarial critique/reply structure (b14’s Quest format) yields richer TEMPER Refinement and Formal Gap material than FORGE logs. Future models with adversarial rounds should preserve the critique-reply pairing in the extraction.
Resolution grading (P/S/L/A) from stress-tests is directly usable as paper material. Include grading definitions and distribution in the KB.
When a corpus has both content files and infrastructure files, provide explicit filtering instructions to prevent infrastructure details from diluting content extraction.