AHA: Translation System Manual#
All Help Available for running, reviewing, and improving translations on this site.
Overview#
This site uses the standard Sphinx internationalization (i18n) (ADD LINK) pipeline combined with a custom AI translation script. The workflow has three stages:
Extract: Sphinx extracts translatable strings from the English .rst files into .po files (one per page, per language).
Translate: An AI script fills in the translations, respecting any existing work, especially work by humans.
Build: Sphinx builds each language’s HTML from the translated .po files.
All translation data lives in the local Git repository
in the folder locale/<lang>/LC_MESSAGES/*.po. These are all
plain text files. No proprietary tools are needed.
The .po file format#
Each .po file contains entries like this:
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr ""
msgid: the English original (never edit this)
msgstr: the translation (this is what translators fill in or review)
An empty msgstr "" means “not yet translated”; Sphinx shows the English original as fallback.
The .po file format is defined by GNU gettext. It specifies
exactly six comment prefixes (Layer 1). Sphinx adds one convention on top (Layer 2). This
project’s translation script adds further conventions (Layer 3, see below) using only the
standard prefixes for human comments, which are not touched by the tools.
Understanding these three layers is important for
knowing what you can do when editing .po files if you don’t want to break the code.
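As a rough illustration of Layer 1, the sketch below classifies a single .po line by its gettext comment prefix. The function name `gettext_comment_kind` and the returned labels are hypothetical and chosen only for this example; the prefixes themselves are the ones defined by the GNU gettext PO format.

```python
from typing import Optional

# Five of the six gettext comment prefixes. The sixth, "#" followed by
# whitespace (translator comment), is handled separately below.
# The labels are illustrative names, not official gettext terminology.
GETTEXT_PREFIXES = {
    "#.": "extracted",   # comments extracted from the source by tools
    "#:": "reference",   # source file and line references
    "#,": "flags",       # flags such as "fuzzy"
    "#|": "previous",    # previous msgid, written by msgmerge
    "#~": "obsolete",    # entries no longer present in the source
}


def gettext_comment_kind(line: str) -> Optional[str]:
    """Return the comment kind of a .po line, or None if it is not a comment."""
    if not line.startswith("#"):
        return None
    kind = GETTEXT_PREFIXES.get(line[:2])
    if kind is not None:
        return kind
    # A bare "#" or "#" followed by whitespace is a translator comment,
    # the kind used for human notes and for the Layer 3 conventions.
    if line.rstrip("\n") == "#" or line[1:2].isspace():
        return "translator"
    return "unknown"
```

Anything reported as "unknown" (for example `#TODO` without a space) is the kind of line that may confuse the tools and is worth fixing by hand.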
An extra thank you for the patience of all non-programming human contributors who have to put up with such idiosyncrasies for now if they wish to help translate. The cost of avoiding such codes by providing a shiny web-interface is so prohibitive at the moment that this simple system is all LLoL can offer for now.
What you CAN do freely as a human editor of .po files#
Given all the above constraints, here is what you can safely do when editing
.po files:
Add ``#`` translator comments above any entry. These survive all tool runs. Use them for notes, questions, alternative suggestions, rejection reasons, or any other human commentary (albeit please follow the conventions defined below as Layer 3):
# LLoL 2026-03-17: Q: Is "Prognose" better here than "Vorhersage"?
# AIMS k3 s3: verify against climate science terminology in Potsdam papers.
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
Edit ``msgstr``: this is the whole point. Change translations freely. To help your editors, use the StabilityCodes to self-assess the quality of your own work, so that others can see what needs further improvement. How to coordinate the human side of editing is described elsewhere.
Add or remove the ``fuzzy`` flag on #, lines. Remove it to mark an entry as human-approved. Add it (#, fuzzy) to flag an entry for re-review.
Write multi-line ``#`` comments: each line must start with # at column 1:
# This is the first line of a longer note.
# The Haiku version was surprisingly good here.
Use any text after ``#`` as long as the ``#`` is followed by at least one space. Apart from the key-line conventions below (“Layer 3”), there is no reserved syntax within translator comments. Those conventions document archived translations for reference and record from whom they came, whether HUman or MAchine, to encourage Negotiations that lead efficiently to a truly HUMAN translation. Please familiarize yourself with the StabilityCodes (MM … SS) to understand the maturity life-cycle of translations in this project, and how they support self-stabilizing versioning in the StayVS system (see elsewhere for how to compose versioned variant names).
To get started, you can use any style you find helpful. Once many others contribute as well, it helps to decide as a translation community “on which side of the road to drive.” The syntax extensions proposed below for this purpose were developed by LLoL based on his experience with the StayVS self-stabilizing versioning system that drives his vision for a ResearchCity, where the integrity of information is properly guarded.
What you should NOT do#
Never edit msgid. It is the key that links the English source to the translation. Changing it will orphan the entry.
Never edit #: source references. These are auto-generated and will be overwritten.
Never put comments after the last msgstr line of an entry. They will either attach to the wrong entry or be discarded.
Never append comments on the same line as msgid or msgstr. This is a syntax error.
Avoid editing #| previous-string lines. These are auto-generated by msgmerge and exist only to help you see what changed in the English source.
Never put a space before # and always put a space after # at the beginning of the line. This is the pattern that defines a safe comment. All else confuses the tools.
Do not ignore the Balospe translation guidelines below. They may seem like overhead, but keeping things consistent from the start is the key to scaling up.
Layer 3 — Balospe syntax conventions for translating with AI#
The translation script (scripts/translate-po.py) for this website adds conventions
using only the standard prefixes (as vetted above)— by defining its own convention
syntax within the freedom provided by the existing tool syntax.
Specifically, for effective and HuMaNE HUman-MAchine Negotiation Encouraging, the following syntax conventions are used on this site.
Note
In the prefix patterns below, trailing spaces are significant but
invisible. The symbol ␣ (open box) is used to make them visible
where needed. In actual .po files, ␣ is a normal ASCII space.
(This is a workaround, because having a space before closing `` is not
allowed in reStructuredText).
``# #␣`` for active provenance of a current HUman translation: If human editors wish to override an AI translation, they pick their preferred human translation and copy it from the translation archive (see below) to the respective msgstr, while also copying the respective human metadata to a new line starting with # #␣ immediately above msgid. Thus, human translations hide translation metadata and StabilityCodes behind # #␣, which looks like a normal .po comment #␣ to all other tools. The second # marks HUman content in contrast to MAchine content (see next), making it easy to distinguish AI translations (%) from human translations (#). Example:
# # MM_LLoL_v1r0p0_2026m03d17 - Laurence Loewe of Laodicea, first try
``# %␣`` for active provenance of a currently used MAchine AI translation: For AI translations the script hides its translation metadata and StabilityCode behind # %␣ on the line immediately above msgid, as close to the active msgstr as possible. It looks like a human comment to other .po tools, because it has a space after the # that starts the new line. But the AI translation script used here and all human editors for balospe.com hereby know that the # %␣ at the beginning of a line signals work by AI, so it is visually easy to spot in contrast to entirely human lines. To safeguard it against accidental human editing, the active entry is also added to the translation archive as defined below. Example:
# % PP_ClaudeOpHi_v1r0p0_2026m03d17_11h00-0500 - AI-translated (Claude Opus, high)
``# # #␣`` for archived HUman translation metadata: To ensure that past translation work isn’t accidentally overwritten by hasty humans or sloppy code, work towards translating is documented in a local translation store for each msgid. This store follows the opening #: codes that define the message to be translated and precedes the active message metadata as defined above. Whenever a new translation is made, by humans or machines, it is added to the translation store as a two-line construct, where the first line starts with # # #␣ followed by the last stability assessment (e.g. by an editor), before ending with the original metadata produced at translation time. The actual text x of this archived translation entry is preserved in a line starting with # # # msgstr "x" to distinguish it from other explanatory notes and to make it easy to activate it again (only drop the leading # # #).
``# # %␣`` for archived MAchine translation metadata: use % as above.
``# # % msgstr “x”`` for archived MAchine translation: use % as above.
``# # # #␣NN_…`` for rejected HUman translation notes: Human reviewers mark rejected alternatives by changing their StabilityCode to NN and adding an additional layer of #␣ to make it clear that this alternative is only kept to avoid re-inventing this variant. The reason for rejecting (if not obvious anyway) is given in the metadata (possibly on a separate line to help automate processing such insights). As experience is gained with this system, more conventions may evolve for documenting such lessons.
``# # # # msgstr “x”`` for rejected HUman translation data: following conventions above.
``# # # %␣NN_…`` for rejected MAchine translation notes: Human reviewers mark rejected AI alternatives using the conventions above. This may help track if an AI tool is worth using.
``# # # % msgstr “x”`` for rejected MAchine translation data: following conventions above.
``#␣`` for any other comment or whitespace within comments: All other lines that start with #␣ (including the space!) are also preserved as they are by the .po tools, but are currently not interpreted in any special way here.
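The prefix conventions above can be read mechanically. The following is a minimal sketch, not the code of scripts/translate-po.py: the names `Layer3Tag`, `LAYER3_PREFIXES`, and `classify_layer3` are hypothetical, and the sketch assumes that longer prefixes must be tested before shorter ones so that a rejected line is not mistaken for an active one.

```python
from typing import NamedTuple, Optional


class Layer3Tag(NamedTuple):
    status: str   # "active", "archived", or "rejected"
    origin: str   # "human" or "machine"
    payload: str  # text after the prefix (metadata or an archived msgstr)


# Longest prefixes first, so "# # # # " is not mistaken for "# # ".
LAYER3_PREFIXES = [
    ("# # # # ", "rejected", "human"),
    ("# # # % ", "rejected", "machine"),
    ("# # # ", "archived", "human"),
    ("# # % ", "archived", "machine"),
    ("# # ", "active", "human"),
    ("# % ", "active", "machine"),
]


def classify_layer3(line: str) -> Optional[Layer3Tag]:
    """Return the Layer 3 reading of a comment line, or None for other lines."""
    for prefix, status, origin in LAYER3_PREFIXES:
        if line.startswith(prefix):
            return Layer3Tag(status, origin, line[len(prefix):].rstrip("\n"))
    return None  # plain "# " comment, gettext comment, msgid, msgstr, ...
```

Because all six prefixes begin with `# `, every line recognized here is still an ordinary translator comment to other .po tools.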
Given the importance of StabilityCodes everywhere, the above formal syntax was assessed to be at the level of OperatesOddly by LLoL on 2026-03-18. It has seen some review by Claude Opus, but has not yet been used extensively in practice. The StabilityCodes themselves are assessed to be at the level of ReviewedRelease version 2; they have been extensively reviewed and used by LLoL since introduced by LLoL in 2017 (see introduction elsewhere and below). For a summary of how StayC codes map to translation quality stages, see the StayC translation table.
The lower stability of the syntax here is in part due to the lack of experience with how to best apply the general and abstract StabilityCodes to the fine granularity of translation quality required within this project.
Claude Opus had these key comments:
The # vs % distinction is good. Clean visual separation of human vs machine provenance, easy to grep, and it all stays within safe # translator-comment syntax. Redundancy between #/% and metadata is intentional and fine — it aids visual scanning even when you can’t read the full metadata string.
Nesting depth concern. Active human = # # , archived = # # # , rejected = # # # # — that’s 4 nesting levels and 7-8 prefix characters before content starts. With full metadata strings this will make long lines. Workable but be aware.
# # # msgstr “x” is clever — makes reactivation trivial (strip prefix). Minor risk: naive grep msgstr will match these archive lines too, but any proper .po parser won’t be confused since the # prefix makes them comments.
To which LLoL replied (OO_LLoL_v1_2026m03d17):
Redundancy is indeed intentional for HUman MAchine Negotiation Encouraging.
Line length is a concern, but there is no more nesting after these 4 and many msgstr lines will likely be either very long anyway or be broken down to follow the 80-character/line rule (if only to make version control easier). In that case, conventions suggest code:
msgstr ""
"This is a very long translation that needs to be "
"wrapped to follow the 80 character convention."
which can then be transformed as follows (e.g. for a deactivated message text):
# # # # msgstr ""
# "This is a very long translation that needs to be "
# "wrapped to follow the 80 character convention."
How confusing it is to get too many hits in naive grep searches remains to be seen. There is also a possibility that this could be turned into a feature by annotating in some helpful way.
Practical work with this proposed solution will undoubtedly turn up more questions. However, to get started, this OO-quality solution ought to be sufficient for now.
How to compose VVNs: Versioned Variant Numbers for StayVS#
The core idea of StayVS (see elsewhere on this site) is to include a stability assessment of the versioned solution, so users don’t have to worry too much about breaking code and other problems from using immature information.
To this end, authors of code or any information relied upon by others are asked to self-assess the reliability of the data and code they provide. That is necessarily a qualitative assessment and to arrive at it is necessarily subjective. Yet to keep this useful, LLoL developed StayVS, the Stabilizing Versioning System of Evolvix, which he has been using and developing in some form since 2017.
Here is not the place to properly introduce it, but to get translators up and running fast, the following bits may be useful.
The most important info in the .po file when assessing a translation is its maturity in the information life-cycle as encoded by StayC, the StabilityCode. This is annotated in a VVN by starting it with the StayC DoubleCaps, ranging from MM to SS (reserving TT for special Jubilee purposes). See below for a table with how these codes are translated into the quality of a translation.
Immediately after StayC comes the information of who made that assessment. Different people may assess stability differently. By associating a nickname with StayC assessments people build over time reputations for being reliable in how they assess reliability (or not).
After that comes the VRP number documenting version-release-patch in the StayVS variant of complete rewrite (version, breaks backwards compatibility), major new feature (release that does not break backwards compatibility), and minor bugfix (patch that is recommended to resolve an obvious problem).
Lastly, a date is added to help people orient themselves in how that VVN is anchored in time. To keep it all standardized and easy to double-click as a unit, all elements are concatenated by “_” underscores and the date is given in the modified ISO format of YYYYmMMdDD (appending _HHhMMmSS in case seconds are needed, which is unlikely here; day-resolution may suffice).
Example for a MockupModel produced by LLoL (nickname for Laurence Loewe of Laodicea):
MM_LLoL_v1r0p0_2026m03d17
Example for code that OperatesOften (but not always) as assessed by LLoL:
OO_LLoL_v1_2026m03d17
As all VRP positions with a 0 can be dropped by default (or added in as needed), the v1 is equivalent to v1r0p0.
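The composition rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `VVN`, `compose_vvn`, and `parse_vvn` are hypothetical, nicknames are assumed to be alphanumeric, the leading `v` part is assumed to be always present, and the optional time suffix is not handled.

```python
import re
from datetime import date
from typing import NamedTuple


class VVN(NamedTuple):
    stayc: str      # StabilityCode DoubleCaps, e.g. "MM"
    nickname: str   # who made the assessment, e.g. "LLoL"
    version: int
    release: int
    patch: int
    day: date


VVN_PATTERN = re.compile(
    r"^(?P<letter>[A-Z])(?P=letter)"                   # DoubleCaps: same capital twice
    r"_(?P<nickname>[A-Za-z0-9]+)"
    r"_v(?P<v>\d+)(?:r(?P<r>\d+))?(?:p(?P<p>\d+))?"    # r and p may be dropped when 0
    r"_(?P<year>\d{4})m(?P<month>\d{2})d(?P<dayno>\d{2})$"
)


def compose_vvn(stayc: str, nickname: str, day: date,
                version: int = 1, release: int = 0, patch: int = 0) -> str:
    """Build the long form, e.g. MM_LLoL_v1r0p0_2026m03d17."""
    vrp = f"v{version}r{release}p{patch}"
    stamp = f"{day.year:04d}m{day.month:02d}d{day.day:02d}"
    return "_".join([stayc, nickname, vrp, stamp])


def parse_vvn(text: str) -> VVN:
    """Parse a VVN; dropped r/p positions are read as 0."""
    m = VVN_PATTERN.match(text)
    if m is None:
        raise ValueError(f"not a VVN: {text!r}")
    return VVN(
        stayc=m["letter"] * 2,
        nickname=m["nickname"],
        version=int(m["v"]),
        release=int(m["r"] or 0),
        patch=int(m["p"] or 0),
        day=date(int(m["year"]), int(m["month"]), int(m["dayno"])),
    )
```

With this reading, `parse_vvn("OO_LLoL_v1_2026m03d17")` and `parse_vvn("OO_LLoL_v1r0p0_2026m03d17")` give the same result, which mirrors the equivalence of v1 and v1r0p0 described above.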
Please find below a table with predefined StabilityCodes and nicknames for the various AI agents that LLoL may be using for assisting in the initial translations work.
Full lifecycle of a translation entry#
The following example shows how a single entry evolves through all quality levels, from first AI draft to human-edited release.
Stage 0 — Empty (untranslated)#
After running make update-po-de:
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr ""
Sphinx shows the English original on the German page.
Stage 1 — Fast AI draft (MM)#
After running make translate-de (fast/Haiku, cost ~1x):
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
# # % MM_ClaHai_v1_2026m03d17
# # % msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
# % MM_ClaHai_v1_2026m03d17
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
The script first adds the translation to the archive (# # % metadata
+ # # % msgstr) and then sets it as the active translation with a
# % provenance line above msgid. The VVN MM_ClaHai_v1_2026m03d17
encodes: StayC MM, model nickname ClaHai (Claude Haiku), version 1,
date 2026-03-17.
Stage 2 — Medium AI pass (OO)#
After running make translate-de LEVEL=medium (Sonnet, cost ~5x),
the previous MM translation stays in the archive and the new OO
translation is added and becomes the active msgstr:
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
# # % MM_ClaHai_v1_2026m03d17
# # % msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
# # % OO_ClaSon_v1_2026m03d18
# # % msgstr "Prognostizierte Wartezeiten bis zum versehentlichen nuklearen Winter"
# % OO_ClaSon_v1_2026m03d18
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr "Prognostizierte Wartezeiten bis zum versehentlichen nuklearen Winter"
The archive now contains both the MM and OO translations. A reviewer can always see what the cheaper model proposed — sometimes Haiku produces a brilliantly concise phrasing that is worth keeping.
Stage 3 — High AI pass (PP)#
After running make translate-de LEVEL=high (Opus, cost ~25x):
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
# # % MM_ClaHai_v1_2026m03d17
# # % msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
# # % OO_ClaSon_v1_2026m03d18
# # % msgstr "Prognostizierte Wartezeiten bis zum versehentlichen nuklearen Winter"
# # % PP_ClaOpHi_v1_2026m03d19
# # % msgstr "Wartezeit-Prognose bis zum versehentlichen Ausbruch eines nuklearen Winters"
# % PP_ClaOpHi_v1_2026m03d19
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr "Wartezeit-Prognose bis zum versehentlichen Ausbruch eines nuklearen Winters"
Each new AI translation is appended to the archive. The # % active
provenance line always reflects which translation is currently in
msgstr. No previous work is ever deleted.
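The "archive first, then activate" order seen in Stages 1 to 3 can be sketched as follows. This is an assumption-laden illustration rather than the actual script: the `Entry` class and the functions `add_machine_translation` and `render` are hypothetical, only single-line strings are handled, and quotes or backslashes inside the text are not escaped.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Entry:
    references: List[str]                              # the "#:" lines, never edited
    msgid: str
    msgstr: str = ""
    archive: List[str] = field(default_factory=list)   # "# # %" / "# # #" lines
    active: List[str] = field(default_factory=list)    # "# %" / "# #" lines


def add_machine_translation(entry: Entry, vvn: str, text: str) -> None:
    # 1. Archive first, so the proposal survives even if a later step fails.
    entry.archive.append(f"# # % {vvn}")
    entry.archive.append(f'# # % msgstr "{text}"')
    # 2. Then activate: replace the provenance line and the live msgstr.
    entry.active = [f"# % {vvn}"]
    entry.msgstr = text


def render(entry: Entry) -> str:
    """Write the entry in the order used in the stage examples."""
    lines = entry.references + entry.archive + entry.active
    lines.append(f'msgid "{entry.msgid}"')
    lines.append(f'msgstr "{entry.msgstr}"')
    return "\n".join(lines)
```

Calling `add_machine_translation` twice, first with the MM and then with the OO proposal, leaves both in `entry.archive` while only the OO one is active, as in Stage 2.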
Stage 4 — Max AI pass with review (QQ)#
After running make translate-de LEVEL=max (Opus 2-pass, cost ~50x):
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
# # % MM_ClaHai_v1_2026m03d17
# # % msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
# # % OO_ClaSon_v1_2026m03d18
# # % msgstr "Prognostizierte Wartezeiten bis zum versehentlichen nuklearen Winter"
# # % PP_ClaOpHi_v1_2026m03d19
# # % msgstr "Wartezeit-Prognose bis zum versehentlichen Ausbruch eines nuklearen Winters"
# # % QQ_ClaOpMax_v1_2026m03d20
# # % msgstr "Aktuarielle Wartezeit-Prognose bis zum versehentlichen nuklearen Winter"
# % QQ_ClaOpMax_v1_2026m03d20
# % Review note: verified "nuklearer Winter" matches German climate science terminology
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr "Aktuarielle Wartezeit-Prognose bis zum versehentlichen nuklearen Winter"
At QQ level, the review pass may add notes (as additional # % lines)
explaining terminology decisions. All four AI attempts are visible in the
archive.
Stage 5 — Human reviewer rejects one alternative (NN)#
A human reviewer examines all proposals and determines that the Sonnet (OO) version is misleading — “Prognostizierte” implies the waiting times were already predicted rather than being a forecast model. They mark it NN:
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
# # % MM_ClaHai_v1_2026m03d17
# # % msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
# # # % NN_LLoL_v1_2026m03d21 — was OO_ClaSon_v1_2026m03d18.
# # # % "Prognostizierte" implies completed prediction, not ongoing forecast model.
# # # % msgstr "Prognostizierte Wartezeiten bis zum versehentlichen nuklearen Winter"
# # % PP_ClaOpHi_v1_2026m03d19
# # % msgstr "Wartezeit-Prognose bis zum versehentlichen Ausbruch eines nuklearen Winters"
# # % QQ_ClaOpMax_v1_2026m03d20
# # % msgstr "Aktuarielle Wartezeit-Prognose bis zum versehentlichen nuklearen Winter"
# % QQ_ClaOpMax_v1_2026m03d20
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr "Aktuarielle Wartezeit-Prognose bis zum versehentlichen nuklearen Winter"
The NN entry gets an extra # nesting level (# # # % instead of
# # %) to push it visually away from active candidates. The rejection
documents why — preventing future reviewers from re-proposing the same
phrasing. The useful insight (that “Prognose” is better than
“Prognostizierte”) is preserved in the rejection note.
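In this workflow a human makes the rejection by hand, but the edit itself is regular enough to sketch. The function below is hypothetical (`reject_archived` is not part of the project's script) and assumes a single-line archived machine msgstr; it rewrites one archived proposal into the three-line NN form shown in Stage 5.

```python
from typing import List


def reject_archived(archive: List[str], old_vvn: str,
                    nn_vvn: str, reason: str) -> List[str]:
    """Return a copy of the archive lines with one machine proposal marked NN."""
    head = f"# # % {old_vvn}"
    out: List[str] = []
    i = 0
    while i < len(archive):
        line = archive[i]
        has_text = i + 1 < len(archive) and archive[i + 1].startswith("# # % msgstr ")
        if line == head and has_text:
            # \u2014 is the dash used as separator in the Stage 5 example.
            out.append(f"# # # % {nn_vvn} \u2014 was {old_vvn}.")
            out.append(f"# # # % {reason}")
            out.append("# " + archive[i + 1])   # one extra nesting level
            i += 2
        else:
            out.append(line)
            i += 1
    return out
```

All other archive lines are copied unchanged, so the rejection adds information without deleting anything.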
Stage 6 — Human editor makes the release decision (RR)#
The human editor reviews all remaining candidates (MM, PP, QQ) and the
active msgstr, then decides on the final phrasing for the first
official release. They may combine the best elements from multiple
proposals:
#: ../../source/crisis/science.rst:47
#: 383386ba62344ee8bd45def28980a26f
# # % MM_ClaHai_v1_2026m03d17
# # % msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
# # # % NN_LLoL_v1_2026m03d21 — was OO_ClaSon_v1_2026m03d18.
# # # % "Prognostizierte" implies completed prediction, not ongoing forecast model.
# # # % msgstr "Prognostizierte Wartezeiten bis zum versehentlichen nuklearen Winter"
# # % PP_ClaOpHi_v1_2026m03d19
# # % msgstr "Wartezeit-Prognose bis zum versehentlichen Ausbruch eines nuklearen Winters"
# # % QQ_ClaOpMax_v1_2026m03d20
# # % msgstr "Aktuarielle Wartezeit-Prognose bis zum versehentlichen nuklearen Winter"
# # # RR_LLoL_v1_2026m03d22 — Combined Haiku's concise "Prognose der Wartezeiten"
# # # with Opus's "nuklearen Winter" (without "Ausbruch" which added unnecessary
# # # length). Dropped "Aktuarielle" from QQ as too technical for a heading.
# # # msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
# # RR_LLoL_v1_2026m03d22
msgid "Accidental Nuclear Winter forecast of waiting times"
msgstr "Prognose der Wartezeiten bis zum versehentlichen nuklearen Winter"
In this case, the editor chose a phrasing very close to the original Haiku
draft — sometimes the simplest version wins. The RR comment in the archive
(# # #) documents the reasoning, so future editors understand why
this phrasing was chosen over the alternatives.
Note: the active provenance uses # # (human marker) instead of # %
(machine marker) because this is a human decision. Once an entry has a
human # # active provenance with RR or higher StabilityCode and no
fuzzy flag, the AI translation script will never overwrite it (unless
the --ai-override flag is used, in which case the AI translation is
added to the archive but does not replace the active msgstr).
Key principles#
Nothing is ever deleted. Every translation is added to the archive. The full history of proposals is always available.
VVN at the front. Every provenance line starts with its Versioned Variant Name (e.g., MM_ClaHai_v1_2026m03d17) for easy visual scanning. The StabilityCode (MM, OO, PP, …) is always the first element.
``# %`` = machine, ``# #`` = human. The second character after # distinguishes AI (%) from human (#) provenance, both for active lines and for archived entries.
Archive before activate. Every new translation is first written to the archive (# # % or # # #), then set as active (# % or # #). This ensures no work is ever lost, even if the script crashes mid-operation.
NN entries are documented rejections, not deletions. They get an extra # nesting level (# # # % or # # # #) and include the reason so the same mistake is not re-proposed.
Even the weakest contributor can win. As shown in Stage 6, a Haiku draft at ~$0.001 per string can end up being the final release if its phrasing happens to be the most concise and accurate.
Quick start#
Prerequisites#
Python 3.9+ and the project’s .venv (run make setup once)
For AI translation: an Anthropic API key set as the ANTHROPIC_API_KEY environment variable
For cost estimation and token counting: no API key needed
Step 1 — Generate .po files for a language#
make update-po-de # German only
make update-po-fr # French only
make update-po # all 9 non-English languages
This creates/updates locale/de/LC_MESSAGES/*.po with empty msgstr
entries for every translatable string.
Step 2 — Estimate cost before translating#
Always run a dry-run first:
make translate-de-dry-run # all German, fast level
make translate-de-dry-run LEVEL=medium # all German, medium level
make translate-de-dry-run LEVEL=high PATH=matheology # matheology only, high
This prints the number of strings, estimated tokens, and estimated cost without calling the API.
Step 3 — Run the translation#
export ANTHROPIC_API_KEY=sk-ant-... # set once per terminal session
make translate-de # all German, fast ($~1)
make translate-de LEVEL=medium PATH=matheology # matheology, medium ($~14)
make translate-fr LEVEL=fast # all French, fast ($~1)
The script asks for confirmation before spending money.
Step 4 — Build and preview#
make build-de
python3 -m http.server 8000 -d build/html
# open http://localhost:8000/de/
Translation quality levels#
| Level | Model Nickname | StayC | Cost | Description |
|---|---|---|---|---|
| fast | ClaHai | MM | ~1x | Claude Haiku: 1 pass, basic prompt. Literal but correct. Serviceable quality. Good starting point. Occasionally produces brilliant short phrasings worth preserving even when upgrading. |
| medium | ClaSon | OO | ~5x | Claude Sonnet: 1 pass, rich prompt with domain context. Proper diacritics, natural prose, target-language quotation marks, idiomatic scripture abbreviations. |
| high | ClaOpHi | PP | ~25x | Opus high effort: 1 pass, highest single-pass quality. Best prose, scripture-aware. Worth consulting even for human reviewers. |
| max | ClaOpMax | QQ | ~50x | Opus ultra-high effort: 2 passes: translate + review for consistency and scripture verification against established Bible editions. |
NN (NimbleNonsense) is reserved for rejected translations — whether from AI or humans. NN entries are preserved as comments to document the work and the reason for rejection, but are deactivated from competing to become the final translation.
Recommendation: Start with fast to get as many languages covered as possible
at a relative cost of 1x. Upgrade to medium for the most important pages
as soon as possible (cost ~5x). Note that theologically and technically
dense content will likely require high at a cost of ~25x and may also benefit
from max, as will final review (~50x). Then a human release manager
selects the best QQ variant for the first official release (RR).
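To compare upgrade plans before spending money, the relative cost factors can be combined with simple arithmetic. The sketch below is illustrative only: the `LEVELS` mapping restates the levels, nicknames, StabilityCodes, and cost factors given in this section, while `relative_cost` and the example shares are hypothetical and assume that cost scales linearly with the share of strings sent through each level.

```python
from typing import Dict

# Level name -> model nickname, StabilityCode, and relative cost factor.
LEVELS = {
    "fast":   {"nickname": "ClaHai",   "stayc": "MM", "relative_cost": 1},
    "medium": {"nickname": "ClaSon",   "stayc": "OO", "relative_cost": 5},
    "high":   {"nickname": "ClaOpHi",  "stayc": "PP", "relative_cost": 25},
    "max":    {"nickname": "ClaOpMax", "stayc": "QQ", "relative_cost": 50},
}


def relative_cost(plan: Dict[str, float]) -> float:
    """Cost of a plan in units of 'the whole language at fast level'.

    plan maps a level name to the share of all strings sent through it.
    """
    return sum(LEVELS[level]["relative_cost"] * share
               for level, share in plan.items())


# Hypothetical plan: everything at fast, the most important 20% again at
# medium, and the densest 5% again at high.
example_plan = {"fast": 1.0, "medium": 0.2, "high": 0.05}
```

For this made-up plan the total comes to roughly 3.25 times the cost of a single fast pass. The real figures should always be taken from the dry-run output.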
Please note that this workflow also assumes that the English original is already RR stable. However, given that this site is being written by one person at the fastest possible speed due to the extreme urgency of the subject matter (see The Crisis!), the quality of the pages may vary. The FeedbackFlow system was designed to help improve both the original and the translations.
How all these various updating systems will interact remains to be seen in a real-life test. Therefore, LLoL wishes to thank everyone using this site for their patience, and especially all who consider contributing. Ultimately the answer to solve the information-processing problems discussed here is to be developed in the ResearchCity proposed by LLoL on Balospe.com.
Targeting specific content#
The --path flag (or PATH= in Make) lets you target any subfolder or
individual file:
# Translate only the crisis section
make translate-de LEVEL=medium PATH=crisis
# Translate a single file
make translate-de LEVEL=high PATH=matheology/heaven/axioms/pet/axioms
# Estimate cost for just one section
make translate-de-dry-run LEVEL=medium PATH=matheology
The path is relative to locale/<lang>/LC_MESSAGES/. Partial matches work —
matheology matches everything under that directory.
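One plausible reading of this path matching is sketched below. It is an assumption, not the script's actual logic: the function name `select_po_files` and the `locale_root` parameter are hypothetical, a directory path is taken to select every .po file beneath it, and any other path is taken to name a single file without its .po extension.

```python
from pathlib import Path
from typing import List


def select_po_files(locale_root: Path, lang: str, path: str = "") -> List[Path]:
    """Return the .po files selected by a path relative to LC_MESSAGES."""
    base = locale_root / lang / "LC_MESSAGES"
    target = base / path if path else base
    if target.is_dir():
        # A folder selects everything under it, at any depth.
        return sorted(target.rglob("*.po"))
    # Otherwise treat the path as a single file given without ".po".
    single = target.with_name(target.name + ".po")
    return [single] if single.is_file() else []
```

For example, `select_po_files(Path("locale"), "de", "matheology")` would collect all German .po files below the matheology folder, and an unknown path yields an empty list instead of an error.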
Token counting#
To see exactly how many strings and characters exist without any cost estimate:
make count-tokens # all languages
make count-tokens LANG=de # German only
make count-tokens LANG=de PATH=matheology # German matheology only
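What such a report counts can be approximated with a few lines of standard-library Python. The sketch below is not the project's counter: `count_msgids` is a hypothetical helper that counts non-empty msgid strings and their characters in the text of one .po file, treats escape sequences such as \n as two raw characters, and ignores plural forms.

```python
from typing import List, Optional, Tuple


def count_msgids(po_text: str) -> Tuple[int, int]:
    """Return (number of strings, total characters) over non-empty msgids."""
    msgids: List[List[str]] = []
    current: Optional[List[str]] = None
    for raw in po_text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            current = [line[len("msgid "):]]
            msgids.append(current)
        elif line.startswith('"') and current is not None:
            current.append(line)      # continuation of a wrapped msgid
        else:
            current = None            # msgstr, comment, or blank line
    # Strip the surrounding quotes of each piece and join wrapped pieces.
    texts = ["".join(part[1:-1] for part in parts) for parts in msgids]
    texts = [t for t in texts if t]   # skip the header entry (empty msgid)
    return len(texts), sum(len(t) for t in texts)
```

Archived lines such as `# # % msgstr "x"` are comments and therefore do not affect the count.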
How human review works#
The translation script has a strict safety rule: it never overwrites a human-reviewed translation. Here is how it decides what to touch:
| Entry state | AI action | Why |
|---|---|---|
| Empty msgstr | Translates | No existing work to protect |
| Marked fuzzy | Re-translates | Fuzzy means the English source changed; old translation is preserved as a comment |
| Translated, not fuzzy | Skips | This is human-reviewed work. AI will not touch it. |
This means a human reviewer can work on any .po file at any time. Once
they remove the fuzzy flag (or fill in an entry that was empty), their
translation is permanently protected from AI overwriting.
If the --ai-override flag is used, the script will still add an AI
translation to the archive, but it will not replace the active
msgstr or the human # # provenance line. This allows running
higher-quality AI passes for comparison without disturbing human work.
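The safety rule can be written down as a small decision function. This is a sketch of the rule as described in this section, not the script's code: `ai_action` and its return labels are hypothetical, and the three entry states are assumed to be "empty msgstr", "marked fuzzy", and "translated without fuzzy flag".

```python
def ai_action(msgstr: str, fuzzy: bool, ai_override: bool = False) -> str:
    """Decide what an AI pass may do with one entry."""
    if msgstr == "":
        return "translate"        # no existing work to protect
    if fuzzy:
        return "retranslate"      # English source changed; old text is archived
    # Translated and not fuzzy: treated as human-reviewed work.
    # With the override, a proposal goes to the archive only.
    return "archive-only" if ai_override else "skip"
```

Note that in no branch does the override allow the active msgstr of a reviewed entry to be replaced.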
Review workflow for human translators#
Step 1 — Claim a file or section#
Pick a .po file to review. The # % provenance lines tell you
which entries have been AI-translated and at what quality level.
Step 2 — Review and improve#
For each entry:
Read the msgid (English original)
Read the msgstr (current translation)
If the translation is correct: leave it. Remove the fuzzy flag if present. This marks it as human-approved.
If the translation needs improvement: edit the msgstr and change the # % (machine) provenance to # # (human) with your own VVN. The AI translation remains in the archive for reference.
If you are unsure: add a # TODO: comment and move on.
Step 3 — Build and check#
make build-de
# preview in browser
Step 4 — Commit#
Your changes are in locale/<lang>/LC_MESSAGES/*.po. Commit them normally.
The next make update-po will merge any new English strings without
disturbing your reviewed translations.
Upgrading translations#
As funds or quality requirements increase, you can upgrade translations
progressively. Higher-quality AI runs never delete lower-quality results.
Each new translation is added to the archive; only the active msgstr
and # % provenance line are updated.
MM → OO (Haiku → Sonnet): Run make translate-de LEVEL=medium. The MM translation stays in the archive. The new OO translation becomes the active msgstr. Entries with a human # # provenance are untouched (unless --ai-override is used, see below).
OO → PP (Sonnet → Opus): Run make translate-de LEVEL=high for higher-quality prose on important pages. The MM and OO translations remain in the archive.
PP → QQ (Opus → Opus 2-pass): Run make translate-de LEVEL=max for automated review and scripture verification.
QQ → RR (AI → Human release): A human release manager reviews the QQ translations and all archive candidates, selects the best variant, and marks it as RR, the first official release without a “translation in progress” warning.
After English content changes: Run make update-po-de. Changed English strings get their translations marked as fuzzy, signaling that the translation may need updating. AI or human can then update just those entries.
The VVN on the # % or # # active provenance line tells you what
level of work has been done. The archive (# # % and # # # lines)
provides the full audit trail, so reviewers can always see all proposals
and choose the best phrasing from any level.
The ``--ai-override`` flag: By default, the script skips entries that
have a human # # provenance (i.e., a human has reviewed or written the
translation). If you pass --ai-override, the script will still run the
AI translation and add it to the archive, but it will not replace
the active msgstr or the human provenance. This is useful for generating
AI alternatives for comparison without disturbing human-reviewed work.
Ensuring reviewer work is preserved#
Multiple reviewers can work on the same language safely:
Reviewer A works on matheology/*.po
Reviewer B works on crisis/*.po
Both commit their changes. No conflicts (different files).
If two reviewers edit the same file:
Git’s merge will handle non-overlapping changes automatically.
For overlapping changes (same msgstr edited differently), Git flags a merge conflict. The reviewers discuss and choose the best translation.
The .po format makes merge conflicts easy to read: you see both proposed translations side by side.
Rule of thumb: a native speaker with domain expertise always outranks a native speaker without it. When in doubt, choose the translation that most faithfully preserves the theological/mathematical precision of the English original while reading naturally in the target language.
Handling scripture quotations#
Scripture quotes deserve special care because established translations exist in most languages (Luther Bible for German, Louis Segond for French, etc.).
When reviewing scripture entries:
Check if the AI translation matches the established Bible translation in your language
If the original English uses a paraphrase rather than a literal quote, the translation should also paraphrase — but from the target-language Bible, not from the English paraphrase
For Quran quotations, use the most widely accepted translation in your language
For Sanskrit/Hindu scripture, preserve IAST transliteration and provide the established target-language translation
The max AI level attempts this verification automatically, but a
human reviewer with access to the actual Bible text will always do better.
Command reference#
| Command | What it does |
|---|---|
| | Create/update German .po files from English source |
| | Estimate cost for German translation (no API key needed) |
| | Translate all German (default: fast level) |
| | Translate German at medium quality |
| | Translate German matheology at high quality |
| | Build German HTML |
| | Token report across all languages |
| | Token report for German crisis section |
| | Build all 10 languages |
| | Delete all build output |
Troubleshooting#
- “No locale directory for ‘de’”
Run make update-po-de first to generate the .po files.
- “ANTHROPIC_API_KEY environment variable not set”
Run export ANTHROPIC_API_KEY=sk-ant-... in your terminal. Get a key at https://console.anthropic.com/
- API errors mid-translation
The script continues with the next batch. Already-translated entries are saved. Re-run the same command to fill in any gaps (entries that got translated won’t be re-done).
- Translation seems wrong
Edit the msgstr in the .po file, remove the fuzzy flag if present, rebuild with make build-de. Your edit is permanently protected.
Comment prefixes: Standard syntax inherited here#
Here is how Claude Opus summarized the constraints inherited by Balospe for using the international translation infrastructure offered by the reStructuredText sphinx-book-theme, which uses the .po file format defined by GNU gettext.
It specifies exactly six comment prefixes. Understanding these is important for knowing what you can and cannot do when editing .po files.
Layer 1: GNU gettext standard (all .po tools respect these)#
| Prefix | Name | Meaning |
|---|---|---|
| # (space) | Translator comment | Free-form human notes. Written by translators to communicate with other translators. These are the safest place for human commentary: all tools (msgmerge, msgfmt, polib) preserve them across updates. Multiple lines allowed. Example: # Phrasing uses Luther 1912 Bible. |
| #. | Extracted comment | Auto-generated from source code. In C projects, xgettext extracts these from marked comments adjacent to translatable strings. Warning: msgmerge may overwrite #. comments when it regenerates the .po file from a new .pot template. This means #. is not a safe place for information you want to survive make update-po. |
| #: | Source reference | File path and line number where the string appears in the source. Generated and updated automatically by xgettext and msgmerge. Never edit these by hand. |
| #, | Flags | Special markers. The most common is fuzzy, meaning the English source has changed and the translation may need updating. msgfmt (which compiles .po to .mo) treats fuzzy entries as untranslated. Other flags like python-format are set by xgettext. Translators may manually add or remove fuzzy. |
| #\| | Previous string | The old msgid before it was changed. Added by msgmerge when it fuzzy-matches an updated English string to an existing translation. Shows the translator what changed. Format: #\| msgid "old wording". Never edit these by hand. |
| #~ | Obsolete entry | An entry that no longer exists in the source. Added by msgmerge when a translatable string is removed from the English original. The entire entry (including msgid and msgstr) is prefixed with #~ . Preserved in case the string returns. Not compiled into the .mo file. |

Layer 2: Added by Sphinx#
Sphinx uses the standard #: prefix but adds a UUID on a second reference line (the #: 383386ba62344ee8bd45def28980a26f line in the example entry near the top of this manual).
This UUID (enabled by gettext_uuid = True in conf.py) helps Sphinx detect when the English original has been modified, even if the line number changes. It is generated automatically; never edit it.
Sphinx also generates #. extracted comments in some cases (e.g., for image alt-text or directive content). These follow the standard gettext behavior.
Debug of formatting rules and constraints by Claude Opus#
The .po format is strictly line-oriented. Several things that might seem natural are actually impossible or dangerous:
No end-of-line comments. You cannot append a comment after a translation on the same line. The .po format has no end-of-line comment syntax at all. Everything after the closing " on a msgstr or msgid line must be empty.
No comments after msgstr. All comments for an entry must appear above the msgid. If you place a comment line after msgstr, the .po parser treats it as belonging to the next entry.
Trailing comments at end of file are silently discarded by polib. If the last entry in a .po file is followed by comment lines, they will be lost on the next save.
Comments between msgid and msgstr are non-standard. polib accepts them during parsing, but when it saves the file it moves all comments above the msgid, so a comment written between msgid and msgstr reappears above the msgid after the next save. This is harmless but surprising. GNU tools (msgmerge, msgfmt) may not accept this layout at all. Best practice: always put comments above the msgid line.
Column 1. The # must be the first character on the line. Most parsers strip leading whitespace, so indented comments will technically work, but all tools write # at column 1 and you should too.
Non-standard comment prefixes. Only the six prefixes listed above (# , #., #:, #,, #|, #~) are part of the .po standard. Here is what happens with other patterns (tested empirically with polib 1.2.0 and GNU msgmerge):

| You write | polib reads | polib saves as | msgmerge result |
|---|---|---|---|
| ## comment | Translator comment: "comment" | # comment (extra # stripped) | # # comment (space inserted) |
| ### comment | Translator comment: "comment" | # comment (extra ## stripped) | # ## comment (space inserted) |
| # #QQv1 text | Translator comment: "#QQv1 text" | # #QQv1 text (survives intact) | # #QQv1 text (survives intact) |
| #+ comment | Syntax error (crashes polib) | - | Silently dropped |
| #- comment | Syntax error (crashes polib) | - | Silently dropped |
| #! comment | Syntax error (crashes polib) | - | Silently dropped |

Key takeaways:
#+, #-, #! are dangerous: they will crash polib and break the translation script. Never use them.
## and ### are unstable: they parse correctly but lose their extra # characters on save. polib normalizes them to #, and msgmerge inserts a space (## x → # # x). If you need a visual separator, use something like # --- or # === instead.
# #QQv1 is safe: it is a standard translator comment whose text happens to start with #QQv1. Both polib and msgmerge preserve it exactly. This pattern could be used as a compact StabilityCode tag within translator comments if desired.
The safe rule is simple: only ever use # (hash-space) for human comments. Put whatever text you want after the space, including # characters as part of the text content (like # #QQv1). But never use any prefix other than the six standard ones.
Blank lines between entries are conventional and improve readability but are optional. Parsers skip them. A blank line inside an entry has no effect in polib (it skips blank lines) but may confuse stricter GNU tools.
Important caveat about #. and msgmerge: Originally Claude Opus proposed using #. to add machine metadata. But tests have shown that running make update-po (which calls msgmerge) drops all #. comments. Hence, provenance markers written as #. are lost at the next make update-po. The system above avoids that problem by staying strictly within the .po syntax that guarantees respect for human comments (#␣, i.e. # followed by a space at the start of a new line). Only these translator comments are guaranteed to survive any tool run.
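The prefix rules in this section can be checked mechanically before committing a hand-edited file. The following is a minimal sketch using only the Python standard library. The function name check_comment_line is hypothetical, and treating every untested prefix as dangerous is a conservative assumption: this manual only reports test results for #+, #- and #!.

```python
STANDARD_PREFIXES = ("#.", "#:", "#,", "#|", "#~")


def check_comment_line(line: str) -> str:
    """Classify one .po line by its comment prefix, following the rules in this section."""
    line = line.lstrip()  # tools write '#' at column 1, but parsers tolerate indentation
    if not line.startswith("#"):
        return "not a comment"
    if line.rstrip() == "#" or line.startswith("# "):
        return "ok: translator comment"
    if line.startswith(STANDARD_PREFIXES):
        return "ok: standard prefix " + line[:2]
    if line.startswith("##"):
        return "unstable: extra # is rewritten on save"
    return "dangerous: non-standard prefix " + line[:2]


for sample in (
    "# Phrasing uses Luther 1912 Bible.",
    "# #QQv1 text",
    "#, fuzzy",
    "## comment",
    "#+ comment",
    'msgid "Accidental Nuclear Winter forecast of waiting times"',
):
    print(f"{sample!r:65} {check_comment_line(sample)}")
```

Note that #. passes this check because it is a standard prefix, even though, as explained above, it does not survive make update-po. The check only covers syntax, not survivability.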