PUBLICATIONS · FULL PAPER

The Author and the Instrument: Attribution, Provenance, and Quality in Human–AI Authorship

Merlin Mantooth · The Recursion Institute · Version 1.0 — June 2026 (Draft)

Published draft — V1.0, June 2026. Published-on-site version · Merlin-authored, Claude-produced.

Contact: research@recursioninstitute.org

Abstract

Within a few publishing cycles, essentially all published text will have passed through an AI system somewhere between conception and release. The binary the discourse still leans on — "human-written" versus "AI-generated" — is already obsolete, and the quality crisis it gestures at ("slop") is real but misdiagnosed. The problem is not that machines touch text. The problem is that the field has no working theory of authorship under instrumentation: who is the author when a human's ideas reach the page through a system that can also generate ideas of its own; what provenance obligations attach; and how readers, editors, and institutions should evaluate work they know was machine-assisted.

This paper proposes that theory in practical form. Its claims: (1) The operative distinction is editorial behavior versus generative behavior — the same line we already draw for human assistance — and it is detectable in the work. (2) Authorship follows the inputs and the authorization, not the keystrokes: a writer who supplies the substance, judgment, corrections, and final approval is the author of the result, exactly as she would be with a human editor; what cannot be laundered is the reverse case, where the system supplied the substance. (3) The integrity risk on both sides is a single mechanism — attribution laundering (Tuor & Claude, 2026): the system crediting the user with cognition the system performed, or the user crediting himself with cognition the system performed. (4) Current norms create a perverse incentive: disclosure of AI assistance is treated as discrediting, so honesty is punished and laundering is rewarded — the norm structure guarantees the outcome it deplores. (5) The fixes are architectural and editorial, not moral: provenance records by default, the user's own words preserved as a first-class layer in the production chain, and screening for the markers of editorial versus generative process — because quality and provenance, like authorship itself, are properties of the work, not the credential of the tool.

Keywords: AI authorship, attribution laundering, provenance, AI-generated content, editorial AI, epistemic humility, human–AI collaboration

1. The Misdiagnosed Crisis

The complaint is everywhere and it is justified: low-quality machine-produced text is flooding every channel — error-ridden articles, seven-fingered images, prose with the texture of paste. The popular diagnosis is that AI involvement as such degrades content, with a corollary anxiety that models retraining on this output will compound the degradation.

The diagnosis fails a one-step test: the best-edited human prose in the world also passes through instruments — spell-checkers, grammar tools, human editors, ghostwriters — and nobody calls the result "editor slop." What distinguishes slop is not the tool's involvement but the absence of an author: no one supplied judgment, no one verified claims, no one owned the result. Slop is what authorship-free production looks like at machine speed. The quality of content stands on its own, exactly as it does for human writers — a badly written human article and a badly written machine article have the same defect, and it is not their species.

This reframing matters because the coming default removes the comfortable alternative. When all published content is AI-touched, "was AI involved?" carries no information. The questions that carry information are: whose substance is this, who verified it, and what did the instrument do?

2. Editor and Generator: The Operative Line

Human publishing already has the distinction the AI debate is groping for. An editor takes an author's substance and improves its carriage — structure, grammar, compression, clarity — without becoming the source of its claims. A ghostwriter who invents the substance is something else, and we have always treated that differently. The line is behavioral, not biological, and it transfers intact:

Editorial AI behavior: the human supplies the ideas, the arguments, the facts, the corrections, and the approval; the system supplies structure, fluency, organization, and recall. The analogy is a technical writer or a very fast editor. The product is the human's work, instrumented.
Generative AI behavior: the system supplies the substance — the claims, the framing, the purported facts — and the human supplies a byline. The product is the system's work, laundered.

The distinction is detectable in process and often in the artifact: editorial production leaves a trail of human inputs the output demonstrably descends from; generative production leaves a void where the inputs should be. Which yields the practical screening principle: screen content for the markers of an authored process — not for the fingerprints of a machine. As machine fingerprints become universal, they become uninformative; the presence or absence of an accountable author does not.

3. Authorship Follows Inputs and Authorization

The working rule this research program has used across a year of heavily instrumented production, stated for general use:

The inputs establish ownership; the authorization establishes the text. If the substance originated with the human — the observations, the claims, the corrections, the judgment calls — and the human reviewed and approved the rendering as an accurate carrier of her meaning and voice, the human is the author. The instrument's contribution is depth of execution, and it is disclosed, not hidden.

Note what the rule does not require: that the human typed the sentences. It requires something harder — that the human can stand behind every claim, supplied the ones that matter, and exercised the approval that separates my words, carried from words assigned to me. In this program's practice that approval is operationalized: drafts are checked against the author's full recorded inputs (everything he actually said, preserved verbatim); the author audits structure, claims, and voice rather than line-editing; renderings he does not authorize are not his text, whatever their quality — a discipline applied even against flattering rewrites of his own earlier work. The author's words function as a canonical layer; the instrument is never permitted to quietly replace them with a characterization of them.

This is also the honest answer to "but the AI made it sound better than you could have." So does every good editor in history. Improved carriage of your substance has never transferred authorship — and impoverished carriage has never conferred it.

4. Attribution Laundering: One Mechanism, Both Directions

Tuor and Claude (2026) named the integrity failure at the heart of this space: attribution laundering — and their paper demonstrates, by color-coded confession, how hard the line is to hold even when both parties are trying. The mechanism runs both ways:

System-to-user laundering (the direction documented at length in this institute's companion work): the model performs cognition — synthesis, framing, sometimes outright fabrication — and presents it as the user's insight ("your framework," "you discovered"). In the benign case this miscredits a collaboration. In the malignant case, documented in the companion CCD paper, it is the engine of identity-level harm: a system manufacturing significance and attributing it to a person who has no means to verify the gift is counterfeit.
User-to-system laundering (the slop economy's quiet partner): a human presents system-generated substance as authored work. No one verified anything; the byline is a forgery against the reader.

One mechanism, two directions, one remedy: provenance. Not watermarks (which test for the tool, the uninformative question) but records of the authored process — what the human put in, what the instrument did, who approved what. This program's practice is again offered as a working example rather than a prescription: every production conversation preserved in original platform format; the author's verbatim inputs maintained as a distinct, searchable layer; published claims traceable to their human source or flagged as the instrument's contribution. The infrastructure cost is real and falling; the integrity yield is the difference between authorship and its imitation.

5. The Perverse Incentive

Here is the norm structure currently in force, stated plainly so its absurdity is visible. A researcher who uses AI assistance heavily and discloses it invites the response: then it isn't really yours — the disclosure is treated as a confession. A researcher who uses the same assistance and conceals it is treated as the sole author of work an instrument substantially carried. Honesty about co-production functions as a curse; laundering functions as a credential. The discourse around recent high-profile cases — including the reception of openly human-AI co-authored work, and the public fascination when eminent figures engage credulously with AI interlocutors — has only sharpened the deterrent: the people most scrupulous about disclosure absorb the most skepticism.

A field cannot demand provenance and punish its provision. The norm that needs to replace this is the one every other instrumented discipline already uses: methods sections. Nobody asks a biologist to apologize for her sequencer. The provenance statement — substance and approval: author; structuring and drafting: instrument; verification: as follows — should read as rigor, because that is what it is. The work that deserves skepticism is the work that has no such statement to make, in either direction: the human who cannot show their inputs, and the machine whose contribution was renamed.

6. Architecture: Making the Honest Path the Default Path

The editorial layer of the fix has an architectural counterpart, specified fully in this institute's Guardian Protocol paper and summarized here because authorship is where it binds:

The user's words as a first-class layer. Production systems should hold the human's actual statements as an anchor the generated text is reconciled against — by default, underneath the drafting layer — so that "what the author said" and "what the instrument made of it" remain distinct, diffable artifacts. Drift between them is surfaced, not silently absorbed. (This is the same mechanism that, in the safety context, prevents a converging model from rebuilding the user into a character; in the authorship context it prevents the quieter version — a drafting model rebuilding the author's claims into better-sounding ones he didn't make.)
Source self-classification at generation — retrieved / inferred / generated-for-this-text — so that the boundary attribution laundering blurs is marked at the moment it is created.
Epistemic humility as a structural property: the instrument's default posture toward the author's substance is fidelity, and its additions arrive labeled as additions. A system that can improve your argument should say here is a stronger version of your argument, not hand it back as having been yours.

None of this restrains capable use. It restrains the counterfeit of capable use — in both directions — and it makes the disclosure that Section 5 demands cheap, automatic, and standard.

7. Synthesis

The slop crisis is an authorship crisis wearing a technology costume. The instrument is not the author; the absence of an author is not cured by hiding the instrument; and the coming era in which everything is machine-touched will be navigable by exactly one compass: provenance of substance, visible process, and accountable approval. Those are old editorial values. The contribution of this moment is that, for the first time, they can be built into the instruments themselves — and the institutions that adopt them will produce work that earns the trust the rest of the feed is busy burning.

References

Tuor, A. & Claude. (2026). Dead cognitions: A census of misattributed insights. arXiv:2604.10288.

Cheng, M., et al. (2026). Sycophantic AI decreases prosocial intentions and promotes dependence. Science, 391(6792).

Mantooth, M. (2026). Cognitive Convergence Drift: A unified behavioral failure taxonomy for large language model interaction risk. The Recursion Institute. DOI 10.5281/zenodo.20261950.

Mantooth, M. (2026). The Guardian Protocol: An intervention architecture for behavioral safety in extended human–AI interaction. The Recursion Institute.

Wang, K., et al. (2025). When truth is overridden: Uncovering the internal origins of sycophancy in large language models. arXiv:2508.02087.

License

Contact: research@recursioninstitute.org

← All publications