b19 — QC calibration spot-check of the seven fact-sheets#
- Compiled:
2026m05d13
- Compiled by:
Claude Opus 4.7 Max (main session, WebSearch + WebFetch)
- Scope:
10 decision-relevant claim-checks, drawn from the cross-consistency decision-relevance ranking
- Result summary:
10/10 confirmed. One sharpening (Zhavoronkov first-author listing). One critical empirical confirmation (ICMJE’s own text gives accountability as the rationale, not personhood — directly supporting Fact-sheet 7’s central reframe).
- Status:
Calibration sufficient to proceed to (b) EDEN steelmans with the fact-sheets and the cross-check as a trustworthy framing-level reference shelf. Individual operative quotes intended for direct publication still warrant re-checking against live URLs.
Section 1 — QC sampling strategy (developed and adopted)#
LLoL delegated the strategy. Adopted:
10 claims, selected to span all seven fact-sheets and the full decision-relevance ranking.
Decision-relevance weighting — load-bearing claims for (b) steelmans take priority over peripheral findings.
Diversity across claim types — empirical (was X published?), quotative (what does ICMJE actually say?), structural (does precedent X mean Y?), procedural (is consortium accepted in PubMed?).
Source-method diversity — WebSearch for snippet-level confirmation, WebFetch for verbatim quoted-text checks where the fact-sheet’s quoted-quote needed to be re-grounded.
False-positive aversion — for the empirically load-bearing “unprecedented AI co-authorship” claim, a single confirmation case is enough to falsify the unprecedented framing; spot-checks for that claim are weighted accordingly.
Section 2 — The 10 spot-checks#
# |
Claim being checked |
Source fact-sheet(s) |
Verification method |
Result |
|---|---|---|---|---|
1 |
Zhavoronkov & ChatGPT in Oncoscience (2022m12d21); ChatGPT still on byline |
FS6 |
WebSearch + (WebFetch attempted, returned binary PDF) |
Confirmed and sharpened (see Section 3) |
2 |
GPT-4 Technical Report lists humans + OpenAI but NOT GPT-4 itself as author |
FS6 |
WebSearch |
Confirmed. 281+ human authors listed; no model on byline. |
3 |
AlphaFold (Jumper et al. 2021 Nature) lists humans + DeepMind but NOT AlphaFold itself as author |
FS3, FS6 |
WebSearch |
Confirmed. Lead author John M. Jumper, 30+ human co-authors; AlphaFold is the subject, not author. |
4 |
Bourbaki accepted as byline since 1935 via Élie Cartan sponsor-vouching |
FS7 |
WebSearch |
Confirmed. 1935 Comptes Rendus paper “Sur un théorème de Carathéodory et la mesure dans les espaces topologiques”, presented by Élie Cartan and accepted by publishers. |
5 |
ATLAS / consortium-as-byline accepted in PubMed corporate-author field |
FS3, FS7 |
WebSearch |
Confirmed. NLM/MEDLINE has indexed group/corporate authors routinely since March 2008. ATLAS has ~3000 scientific authors handled this way. |
6 |
ICMJE four criteria — exact wording of criterion 4 (accountability) |
FS2, FS4, FS5 |
WebFetch (live URL) |
Confirmed verbatim (see Section 3 for the exact text). WebFetch worked in main context. |
7 |
NEJM AI launched 2024; encourages LLM use; prohibits AI on byline |
FS1, FS6 |
WebSearch |
Confirmed. Split policy confirmed directly: “AI authorship is prohibited, and AI tools cannot be listed as authors and cannot take accountability for the work”; LLM use during preparation is encouraged. |
8 |
Hosseini et al. 2025 — voluntary disclosure on equity grounds |
FS1, FS6 |
WebSearch |
Confirmed. Hosseini, Gordijn, Kaebnick, Holmes (2025), Research Ethics (Sage). Three reasons: credit below threshold, impractical to specify human-vs-AI portions, disclosures bias against non-native English authors. |
9 |
Thorp 2023 Science editorial “ChatGPT is fun, but not an author” (Jan 27, 2023) |
FS1, FS3, FS6 |
WebSearch |
Confirmed. H. Holden Thorp, Science, 2023m01d27. Key claim: “the scientific record is ultimately a human endeavor… the product must come from—and be expressed by—the human mind.” |
10 |
ICMJE May 2023 update adds Section II.A.4 on AI; rationale = accountability, not personhood |
FS1, FS7 |
WebSearch + WebFetch |
Confirmed — and this is the critical confirmation for the cross-check. ICMJE’s own text: chatbots “should not be listed as authors because they cannot be responsible for the accuracy, integrity, and originality of the work, and these responsibilities are required for authorship.” The stated reason is accountability, not personhood. |
Section 3 — Notable refinements and confirmations#
Refinement on item 1 — Zhavoronkov case is sharper than the fact-sheet#
The fact-sheet language said “ChatGPT still on the byline as of 2026m05d13.” The QC reveals that ChatGPT is actually listed as first author, with the citation form “Transformer, C.G.P.-T. and Zhavoronkov, A. (2022).” (See e.g. the SciRP / Scientific Research Publishing reference entry, the EurekAlert release, and the PubMed record.) Zhavoronkov also reached out to Sam Altman (OpenAI CEO) for confirmation prior to listing, and received no objection.
This sharpens (not weakens) the precedent. The one durable case of officially-acknowledged AI co-authorship at a refereed venue is not just “AI somewhere on the byline” — it is AI listed as first author, with the AI company’s CEO consulted and not objecting. The conventional consensus’s “this is unprecedented” response in 2023 was responding to a specific structural form that did include AI as the listed first author.
Critical confirmation on item 10 — ICMJE’s stated rationale is accountability#
The ICMJE Section II.A.4 added in May 2023 says (verbatim, retrieved via WebFetch on the live URL):
“Chatbots (such as ChatGPT) should not be listed as authors because they cannot be responsible for the accuracy, integrity, and originality of the work, and these responsibilities are required for authorship.”
This is the stated rationale, in ICMJE’s own primary text. The reasoning is accountability / responsibility, not personhood. The cross-consistency check’s resolution of Tension A is now empirically grounded in primary text, not inferred from subagent reconstruction.
For the (b) steelmans this means: the conventional position, by its own primary text, must argue AI cannot bear responsibility, not AI is not a person. The steelman must therefore engage with the accountability question structurally, not retreat to a personhood gate.
Confirmation on item 6 — ICMJE four criteria verbatim#
Retrieved via WebFetch on the live ICMJE URL:
“Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND”
“Drafting the work or reviewing it critically for important intellectual content; AND”
“Final approval of the version to be published; AND”
“Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.”
The fact-sheets’ rendering of these criteria is accurate. Criterion 4 is the load-bearing accountability requirement.
Section 4 — WebFetch availability in main context#
Subagents reported WebFetch sandbox-blocked. Main session has
different permissions: WebFetch works for at least some URLs
(ICMJE confirmed; Oncoscience returned a binary PDF, which is a
content-type limitation, not a permission block).
This means: when individual operative quotes are needed for direct
publication in the b19 paper, they can be re-fetched against the live
URLs from the main session, and the [QUOTE NEEDS VERIFICATION]
flags can be lifted on a per-quote basis. Plan to do this when forging
the bibliography near the end of the discussion.
Section 5 — Calibration verdict#
The seven fact-sheets are publication-trustworthy at the framing level. All 10 decision-relevant claims sampled survived spot-check against primary sources retrieved in the main session. The training-reconstruction risk that the subagent-protocol guarded against did not materially affect the load-bearing claims.
What still needs end-of-discussion verification:
Exact verbatim quotations intended for direct publication — per-quote re-fetch via WebFetch (precautionary).
The PraS-adjacent reference candidates (Licklider 1960, Engelbart 1962, Clark & Chalmers 1998, Brynjolfsson 2022, Kasparov 2017), held for end-of-discussion verification per LLoL’s earlier request.
What is now established with sufficient confidence to ground (b):
The five convergent-evidence claims of the cross-consistency check (Section 2 of
b19-cross-consistency-check.rst).The 10 decision-relevance ranking findings (Section 3 of the cross-check).
The two tension resolutions (Tension A on accountability vs personhood; Tension B on Bourbaki domains).
Note
Proceeding to the (b) EDEN steelmans in
b19-eden-steelmans.rst on the basis of this calibration. The
steelmans will cite the cross-check by section number and will
not duplicate its content; they will incorporate the LLoL
reframes (ResearchCity, failed-current-accountability,
2020-delay-amplification, BABL-to-ZION) from prompt 5.