BEST Names Architecture: Technical Design

This page documents the technical design decisions behind the BEST Names system. It is written for software developers, system architects, and future ResearchCity contributors who need to understand why the system works the way it does — not just what it does.

For the “what,” see the expert reference. For the authoritative specification, see the AHA design document.


1. The .. include:: problem

The BEST Names system exists because of a Sphinx limitation.

Sphinx’s .. include:: directive pastes content into the including document before label resolution. If file A defines .. _pet-ax5: and file B includes file A, then pet-ax5 appears in both documents. Sphinx emits a duplicate-label warning and resolves the label nondeterministically.

This means you cannot build a compilation system (combining axioms from multiple source files into a single overview page) using .. include:: if the source files contain cross-reference labels. The BEST Names architecture solves this by requiring that compiled downstream pages contain their OWN content with their OWN suffixed labels — never including labeled content from upstream PoR sources.

Rule: RST files that define site-wide labels (.. _label:) MUST NOT be included via .. include:: into other Sphinx documents. The compilation skill copies and transforms content; it does not include it.
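The rule above can be checked mechanically. The following is a minimal sketch, not part of the BEST Names tooling: `find_include_violations` is a hypothetical helper, it assumes include targets are written relative to the including file, and it detects labels with a simple regular expression rather than a full RST parse.

```python
import posixpath
import re

# Explicit RST label such as ".. _pet-ax5:" (no URL after the colon).
LABEL_RE = re.compile(r"^[ \t]*\.\. _([^:`\n]+):[ \t]*$", re.MULTILINE)
INCLUDE_RE = re.compile(r"^[ \t]*\.\. include::[ \t]*(\S+)[ \t]*$", re.MULTILINE)


def find_include_violations(files):
    """Return (including_file, included_file, labels) for each banned include.

    `files` maps a path relative to the source root to its RST text.
    """
    violations = []
    for path, text in sorted(files.items()):
        for target in INCLUDE_RE.findall(text):
            # Assumption: the target is relative to the including file.
            resolved = posixpath.normpath(
                posixpath.join(posixpath.dirname(path), target)
            )
            labels = LABEL_RE.findall(files.get(resolved, ""))
            if labels:
                violations.append((path, resolved, labels))
    return violations
```

A check like this could run before the Sphinx build so that a labeled file pulled in through .. include:: fails fast instead of surfacing as a duplicate-label warning.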


2. The LL(1) parsing proof

The label grammar is designed to be parseable by an LL(1) parser with a symbol table (registry lookups). This means any label can be unambiguously decomposed into its dimensional components by reading left to right, one token at a time, without backtracking.

2.1 Why hyphens, not underscores

The sole separator is - (hyphen). Reasons:

  • URL standard: Hyphens are the web convention for URL word separators (Google SEO guidelines, W3C best practices).

  • LaTeX safety: Underscores are special characters in LaTeX (subscript). Axiom labels appear in LaTeX math contexts.

  • Visibility: Underscores are hidden by link underline decoration in browsers.

  • Parsing clarity: With - as the sole separator, a parser can tokenize any label by splitting on hyphens. Each token is then classified against dimension registries.

Within dimension values: No internal separators. Use concatenation: oov2 not oo-v2, prod not pro-d.

2.2 Why this dimension order

The grammar enforces a fixed token order: D1 (model), D2 (element chain), D3 (version), D4 (depth), D5 (view/source/language). This order was chosen so that:

  1. D1 is always first because it is always required and anchors the parse.

  2. D2 follows D1 because element identification is the primary purpose of a label. D2 tokens are read greedily: the parser continues consuming D2 type IDs until it encounters a token matching D3, D4, or D5.

  3. D3–D5 follow in decreasing permanence. Version (D3) changes least often, depth (D4) changes with audience, view (D5) changes with perspective. This ordering means the most stable coordinates appear first, and the most variable appear last — analogous to how postal addresses go from country to street.

2.3 How the parser works

For a label like pet-ax5-oov2-easy-vjud:

  1. D1: First token before the first hyphen: pet. Validate against the D1 model registry. Match.

  2. D2: Remaining tokens are tested as D2 type IDs. ax5 splits at the letter/digit boundary: type=ax, number=5. Match. The parser checks the next token: oov2. This does NOT match any D2 code: D2 codes are letter-only, and registry lookup places oov2 in D3 (no code may appear in both). Transition out of D2.

  3. D3: oov2 is in the version registry. Match. Consume and advance.

  4. D4: easy is in the depth registry. Match. Consume and advance.

  5. D5: vjud starts with v and has 4 characters (v + 3 letters). Structural match for a view code. Validate against D5 view registry. Match.

  6. End of label. Parse complete.

The parser is deterministic because each dimension’s codes can be told apart, either by structure or by registry lookup:

  • D2 codes are letter-only ([a-z]+), distinguished from D3–D5 by registry lookup.

  • D3 codes are registered version strings (oov1, oov2, etc.).

  • D4 codes are a closed registry (prod, easy, math, hu, ma, dump).

  • D5 codes have mandatory prefixes: v (views), s (sources), l (languages).

Constraint for collision freedom: D4 codes MUST NOT begin with v, s, or l (reserved for D5). This was identified as an unstated invariant during adversarial stress-testing (Attack 1.4) and has since been codified.
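The walk-through above can be sketched as a small left-to-right parser. This is an illustration under stated assumptions, not the project's implementation: the registries below are small subsets copied from the codes mentioned on this page (the authoritative lists live in the AHA design document), `parse_label` is a hypothetical name, and D2 through D5 are all treated as optional.

```python
import re

# Illustrative registry subsets; the real registries are in the AHA design document.
D1_MODELS = {"pet", "jub"}
D2_TYPES = {"ax", "th", "lm", "cr", "con", "pro", "proof",
            "logic", "limit", "needs", "feeds", "net", "diff",
            "stayc", "conv", "bib", "his", "aa", "ff", "kk"}
D3_VERSIONS = {"oov1", "oov2"}
D4_DEPTHS = {"prod", "easy", "math", "hu", "ma", "dump"}
D5_CODES = {"vjud", "lde", "lar", "lhe"}

# A D2 token is a letter-only type code with an optional number (ax5, limit).
D2_TOKEN = re.compile(r"([a-z]+)(\d*)")


def parse_label(label):
    """Decompose a label left to right, one token at a time, no backtracking."""
    tokens = label.split("-")
    if tokens[0] not in D1_MODELS:
        raise ValueError(f"unknown D1 model: {tokens[0]!r}")
    result = {"model": tokens[0], "elements": [],
              "version": None, "depth": None, "d5": None}
    i = 1
    # D2 is read greedily until a token belongs to D3, D4, or D5.
    while i < len(tokens):
        token = tokens[i]
        if token in D3_VERSIONS or token in D4_DEPTHS or token in D5_CODES:
            break
        match = D2_TOKEN.fullmatch(token)
        if match is None or match.group(1) not in D2_TYPES:
            raise ValueError(f"unregistered token: {token!r}")
        number = int(match.group(2)) if match.group(2) else None
        result["elements"].append((match.group(1), number))
        i += 1
    # D3, D4, D5 follow in fixed order; each is consumed at most once.
    for key, registry in (("version", D3_VERSIONS),
                          ("depth", D4_DEPTHS),
                          ("d5", D5_CODES)):
        if i < len(tokens) and tokens[i] in registry:
            result[key] = tokens[i]
            i += 1
    if i != len(tokens):
        raise ValueError(f"unexpected token: {tokens[i]!r}")
    return result
```

With these registries, parse_label("pet-ax5-oov2-easy-vjud") yields model pet, one element (ax, 5), version oov2, depth easy, and D5 code vjud. Because every lookup is against a closed set, an unregistered token such as oov3 raises an error rather than being absorbed into D2.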


3. The collision-free proof for D2

D2 contains three sub-namespaces that must not collide with each other or with other dimensions:

Doubled-letter codes (POST operational fields): Pattern ^([a-z])\1$ — exactly 2 identical lowercase letters (aa, ff, kk). These are reserved by the Evolvix POST System.

Structural/analytical codes (3+ letter codes): logic, limit, needs, feeds, net, diff, stayc, conv, bib, his, etc. Length >= 3, and by registry discipline they do not collide with any D3/D4/D5 code.

Formal element codes (2–5 letters, non-doubled): ax, th, lm, cr, con, pro, proof, etc.

2-character disambiguation (D2 vs. D4):

  • D2 field codes at 2 characters MUST have char[0] == char[1] (doubled: ff, aa). Formal element types such as ax are the stated exception (see 8.2).

  • D4 codes at 2 characters MUST have char[0] != char[1] (distinct: hu, ma).

This makes the parser deterministic for 2-character tokens without a registry lookup: doubled = D2 field, distinct = D4 depth.
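As a minimal sketch of that rule, the hypothetical function below classifies a bare 2-character token by structure alone. It assumes formal element codes such as ax never appear as bare two-letter tokens (in the examples on this page they carry a number, as in ax5); the text does not state this explicitly.

```python
import re

TWO_LETTERS = re.compile(r"[a-z]{2}")


def classify_two_char(token):
    """Classify a bare 2-character token without any registry lookup."""
    if TWO_LETTERS.fullmatch(token) is None:
        raise ValueError(f"not a 2-letter lowercase token: {token!r}")
    # Doubled letters are POST fields (D2); distinct letters are depths (D4).
    return "D2 field" if token[0] == token[1] else "D4 depth"
```

For example, classify_two_char("ff") gives "D2 field" and classify_two_char("hu") gives "D4 depth".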

Cross-dimension safety: No D2 code may equal any D1 model code, D3 version code, D4 depth code, or D5 view/source/language code. The registries in the AHA design document are the single source of truth. Before adding any new code to any dimension, ALL other registries must be checked for collisions.
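The cross-registry check can be automated with a few lines. The sketch below is an assumption about how such a check might look; `find_collisions` is a hypothetical helper and the registry contents in the usage note are illustrative subsets only.

```python
def find_collisions(registries):
    """Map each code found in more than one registry to the registries holding it.

    `registries` maps a dimension name (for example "D1") to its set of codes.
    """
    owners = {}
    for name, codes in registries.items():
        for code in codes:
            owners.setdefault(code, []).append(name)
    return {code: sorted(names)
            for code, names in owners.items() if len(names) > 1}
```

Calling find_collisions({"D1": {"pet", "net"}, "D2": {"ax", "net"}}) reports net as present in both D1 and D2; an empty result means the proposed set of codes is collision-free.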


4. The compilation skill architecture

4.1 Modes

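===============  =========================  ========================================================
Mode             When to use                What it does
===============  =========================  ========================================================
CURRENT-Replace  PoR changed substantially  Regenerate ALL downstream pages from current PoR sources
CURRENT-Append   New model added            Add new elements without regenerating existing entries
MakeNew-Archive  Major milestone            Replace first, then copy to vv/{version}/
MIGRATE          Structure change           Transform existing files to new naming structure
===============  =========================  ========================================================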

4.2 The extraction matrix

The extraction matrix is a table mapping PoR fields (rows) to audience depths (columns). Each cell contains a keyword (full, brief, top1, rewrite, ref, stub) or is empty (field omitted at that depth).

The matrix is stored in a dedicated file so the compilation skill can read it programmatically. At build time, the skill reads the matrix to decide which fields to extract for each depth.

Example: At easy depth, the tctx (technical context) field is omitted, the sum (summary) field is included in full, and the stor (Torah) field is included as top1 (single strongest citation). At dump depth, every field is included in full.
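The lookup described above might be modeled as follows. This is a sketch only: the in-memory `MATRIX` holds just the three fields and two depths from the example, the real matrix lives in its own file whose format is not described here, and `fields_for_depth` is a hypothetical name.

```python
KEYWORDS = {"full", "brief", "top1", "rewrite", "ref", "stub"}

# Rows are PoR fields, columns are depths; None means "omitted at this depth".
MATRIX = {
    "sum":  {"easy": "full", "dump": "full"},
    "stor": {"easy": "top1", "dump": "full"},
    "tctx": {"easy": None,   "dump": "full"},
}


def fields_for_depth(matrix, depth):
    """Return {field: keyword} for every field extracted at the given depth."""
    selected = {}
    for field, by_depth in matrix.items():
        keyword = by_depth.get(depth)
        if keyword is None:
            continue  # empty cell: the field is left out at this depth
        if keyword not in KEYWORDS:
            raise ValueError(f"unknown keyword {keyword!r} for field {field!r}")
        selected[field] = keyword
    return selected
```

At easy depth this selects sum in full and stor as top1 and leaves out tctx; at dump depth all three fields come back as full.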

4.3 Stub policy

When Generate stubs = No (the default), pages that would contain only stub content are not generated and no links to them are created. This prevents dead pages. Stubs are generated only when explicitly requested to prepare scaffolding for the next research round.
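Expressed as a predicate, the policy could look like the sketch below. The function name and its arguments are hypothetical; the only behavior taken from the text is that stub-only pages are skipped unless stubs are explicitly requested.

```python
def should_generate_page(cell_keywords, generate_stubs=False):
    """Decide whether a page (and any link to it) should be produced.

    `cell_keywords` lists the extraction keywords of the fields on the page.
    """
    if not cell_keywords:
        return False  # nothing to render at all
    only_stubs = all(keyword == "stub" for keyword in cell_keywords)
    return generate_stubs or not only_stubs
```

Using the same predicate for both page generation and link creation keeps the two in step, so a skipped page can never be the target of a generated link.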


5. Integration with Sphinx

5.1 Labels and cross-references

RST labels (.. _pet-ax5:) are global identifiers. They survive file moves because Sphinx resolves them by name, not by file path. This is the foundation of link stability: a :ref:`pet-ax5` reference works regardless of where pet/axioms.rst lives in the directory tree.

Consequence: Labels are permanent. Never remove a label. If content is deprecated, the label stays as a redirect to the HistoryHeap page.
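One way to guard the permanence rule is to compare the labels of two revisions of a file. The sketch below is an assumption about how such a guard might be written; `labels_in` and `removed_labels` are hypothetical helpers, and labels are detected with a regular expression rather than through Sphinx itself.

```python
import re

# Explicit RST label such as ".. _pet-ax5:" on a line of its own.
LABEL_RE = re.compile(r"^[ \t]*\.\. _([^:`\n]+):[ \t]*$", re.MULTILINE)


def labels_in(rst_text):
    """Return the set of explicit labels defined in a piece of RST text."""
    return set(LABEL_RE.findall(rst_text))


def removed_labels(previous_text, current_text):
    """Return labels present in the previous revision but missing now."""
    return sorted(labels_in(previous_text) - labels_in(current_text))
```

A non-empty result signals that a label was dropped and should be restored, for instance as a redirect target.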

5.2 Internationalization

Sphinx i18n uses .po translation files with language-neutral labels. The label pet-ax5 resolves to the translated page in whatever language the reader is browsing. This is the standard mechanism for straight translations.

D5 language codes (lde, lar, lhe) are for a different purpose: content that is structurally different because the language itself contributes something untranslatable. A German concept like Aufgehobenheit that captures a deeper truth belongs under lde, not under Sphinx i18n.

The boundary: Sphinx i18n = same content in another language. D5 language codes = different content born from another language’s unique perspective.

5.3 Toctrees and build

Each model has its own directory (pet/, jub/). Compiled views have dedicated directories (axioms/easy/, axioms/expert/). Toctrees in index.rst files connect the pages for navigation.

The build system (make html) compiles English only by default. make build-all compiles all 10 languages. The CI pipeline runs build-all for production.


6. Integration with Evolvix

The BEST Names system is a domain-specific application of the Evolvix naming convention:

  • BEST: Brief, Explicit, Summarizing, Title — four naming layers from compact code to full display title.

  • POST codes: The doubled-letter operational fields (aa through zz) are reserved by the Evolvix POST System (Project Organization Stabilizing Toolkit System). Only codes registered in the Evolvix POST specification may be assigned meanings.

  • StayVS: The maturity code system (stayc) tracks element stability using the Evolvix versioning framework.

  • VVN (Versioned Variant Number): Each frozen snapshot has a VVN like iv_LLoL_PPv1r0p0_2026m03d25. The VVN encodes author, version, and date.
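The text says a VVN encodes author, version, and date but does not spell out the field layout. The sketch below therefore assumes only what the single example shows: a trailing group of numbers after v, r, and p, followed by a date written as YYYYmMMdDD. Everything before that is kept as an opaque prefix, and `parse_vvn` is a hypothetical name.

```python
import re
from datetime import date

# Assumed layout: <opaque prefix>v<N>r<N>p<N>_<YYYY>m<MM>d<DD>
VVN_RE = re.compile(
    r"(?P<prefix>.+)v(?P<v>\d+)r(?P<r>\d+)p(?P<p>\d+)"
    r"_(?P<year>\d{4})m(?P<month>\d{2})d(?P<day>\d{2})"
)


def parse_vvn(vvn):
    """Split a VVN into (prefix, (v, r, p) numbers, snapshot date)."""
    match = VVN_RE.fullmatch(vvn)
    if match is None:
        raise ValueError(f"not a recognizable VVN: {vvn!r}")
    numbers = tuple(int(match[key]) for key in ("v", "r", "p"))
    snapshot = date(int(match["year"]), int(match["month"]), int(match["day"]))
    return match["prefix"], numbers, snapshot
```

Applied to the example above, this returns the prefix iv_LLoL_PP, the numbers (1, 0, 0), and the date 2026-03-25. Which part of the prefix identifies the author is left to the Evolvix specification.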



8. How to add new elements

8.1 Adding a new model

  1. Choose a unique D1 code (lowercase, 2–5 characters). Check it against ALL D2, D3, D4, and D5 registries for collisions.

  2. Register the code in the D1 Model Registry section of the AHA design document.

  3. Create the model directory: source/matheology/{model}/.

  4. Define the model’s axioms in {model}/axioms.rst with labels following the grammar: {model}-ax1, {model}-ax2, etc.

  5. Run the compilation skill in CURRENT-Append mode.
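Step 4 can be scaffolded with a short generator. This is a sketch: `axioms_skeleton` is a hypothetical helper, and the section title text ("Axiom N") and underline character are placeholders, not a project convention.

```python
def axioms_skeleton(model, count):
    """Return RST text with one labeled, titled stub section per axiom."""
    lines = []
    for number in range(1, count + 1):
        label = f"{model}-ax{number}"   # follows the {model}-ax{N} grammar
        title = f"Axiom {number}"       # placeholder title
        lines += [f".. _{label}:", "", title, "-" * len(title), ""]
    return "\n".join(lines)
```

For a hypothetical model code abc, axioms_skeleton("abc", 2) produces two sections labeled abc-ax1 and abc-ax2, ready to be filled in before the compilation skill runs.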

8.2 Adding a new element type

  1. Choose a unique D2 code. Must be letter-only ([a-z]+). Must not equal any D1, D3, D4, or D5 code. If 2 characters, must be doubled (^([a-z])\1$) unless it is a formal element type.

  2. Register in the D2 Type ID Registry.

  3. The new type chains like any other: pet-{newtype}5, pet-ax5-{newtype}.
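The constraints in step 1 can be collected into a single validation function. The sketch below is illustrative; `validate_new_type_code` and its `formal` flag are hypothetical, and the registries passed in are whatever subsets the caller supplies.

```python
import re

DOUBLED = re.compile(r"([a-z])\1")


def validate_new_type_code(code, other_registries, formal=False):
    """Return a list of problems with a proposed D2 code (empty if none).

    `other_registries` maps a dimension name (D1, D3, D4, D5) to its codes.
    """
    problems = []
    if re.fullmatch(r"[a-z]+", code) is None:
        problems.append("must be letter-only")
    if len(code) == 2 and not formal and DOUBLED.fullmatch(code) is None:
        problems.append("2-character non-formal codes must be doubled")
    for name, codes in sorted(other_registries.items()):
        if code in codes:
            problems.append(f"collides with {name}")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets a contributor fix a proposed code in one pass.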

8.3 Adding a new worldview

  1. Choose a v + 3-letter code. Check against all D5 registries.

  2. Register in the D5 View Registry with name, scope, and related views.

  3. The new view appears as a D5 suffix: pet-ax5-{vnew}.

  4. Optionally add corresponding PoR support fields.

8.4 Adding a new language

  1. Use the ISO 639-1 two-letter code. If the code is a doubled letter (aa, ee, ff, ii, kk, nn, ss, tt), use the ISO 639-3 three-letter form instead to avoid collision with POST codes.

  2. Prefix the ISO code with l to form the D5 language code: lde, lar, lhe, etc.

  3. This is for language-specific cultural content, not translations. Translations use Sphinx i18n (.po files).
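Steps 1 and 2 can be sketched as one function. `language_code` is a hypothetical name, and the three-letter forms in the fallback table are the ISO 639-3 codes for the eight doubled-letter languages as best known here; they should be verified against the ISO registry before being registered.

```python
import re

# ISO 639-3 forms for the eight doubled ISO 639-1 codes (verify before use).
ISO_639_3_FALLBACK = {
    "aa": "aar", "ee": "ewe", "ff": "ful", "ii": "iii",
    "kk": "kaz", "nn": "nno", "ss": "ssw", "tt": "tat",
}


def language_code(iso_639_1):
    """Build the D5 language code for an ISO 639-1 two-letter code."""
    if re.fullmatch(r"[a-z]{2}", iso_639_1) is None:
        raise ValueError(f"not a two-letter lowercase code: {iso_639_1!r}")
    if iso_639_1[0] == iso_639_1[1]:
        # Doubled letters are reserved for POST codes: use the 3-letter form.
        if iso_639_1 not in ISO_639_3_FALLBACK:
            raise ValueError(f"no three-letter form listed for {iso_639_1!r}")
        return "l" + ISO_639_3_FALLBACK[iso_639_1]
    return "l" + iso_639_1
```

So language_code("de") gives lde, while language_code("kk") gives lkaz instead of a code built on the doubled kk.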


9. Known limitations and future work

The following issues were identified during the adversarial stress-test (29 attacks, 17 HELD, 12 BREACH):

Critical (1):

  • D7/D6 collision analysis incomplete (Attack 1.6). The original design listed 3 ISO 639-1 / POST code collisions; there are actually 8, including kk (Kazakh, ~13M speakers) which collides with the live KnownKiller POST code. Fix: all 8 languages must use ISO 639-3 forms.

Major (4):

  • Quest label grammar gap (Attack 1.8). The round-based pattern {N}r{M} (e.g., jub-con2r1) is not accommodated by the formal grammar. 28+ live labels are affected.

  • D6 Set B catch-all (Attack 2.2). The structural fields namespace absorbs unregistered codes silently (e.g., a misspelled oov3 is parsed as a structural field, not flagged as an error). Fix: close the registry and require lookup-based matching.

  • Include ban violation (Attack 5.2). The compiled axioms index uses .. include:: with labeled content, violating the AHA’s own rule. Marked with a TODO for Phase 3.

  • D7/Sphinx i18n boundary (Attack 5.3). The D5 language suffix and Sphinx i18n are two parallel multilingual mechanisms with undocumented interaction. The boundary is now clarified: Sphinx i18n for translations, D5 language codes for untranslatable cultural content.
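For the quest label gap (Attack 1.8), one possible grammar extension is to allow an optional r{M} round suffix on a numbered element token. The sketch below shows that idea only; it is not the adopted fix, and `split_element_token` is a hypothetical name.

```python
import re

# type letters, element number, optional round suffix: con2r1, ax5
ROUND_TOKEN = re.compile(r"(?P<type>[a-z]+)(?P<number>\d+)(?:r(?P<round>\d+))?")


def split_element_token(token):
    """Split a numbered element token into (type, number, round or None)."""
    match = ROUND_TOKEN.fullmatch(token)
    if match is None:
        raise ValueError(f"not a numbered element token: {token!r}")
    round_part = match.group("round")
    return (match.group("type"),
            int(match.group("number")),
            int(round_part) if round_part else None)
```

Under this sketch, con2r1 splits into type con, number 2, round 1, while ax5 keeps its current reading with no round. The type part would still need a lookup in the D2 registry, as for any other token.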

Minor (6): Unstated 2-letter invariant, D4 prefix restriction, unidirectional D1/D2 constraint, D2 letter-only rule not explicit, no label-testing tooling, ref overloaded across 3 contexts.

Cosmetic (1): No quick-start guide existed for beginner onboarding (this set of pages addresses that breach).

For the complete test results and proposed fixes, see the adversarial stress-test report.


Cross-references