
Autopoet: a self‑maintaining software system

The complete definition and scope of the autopoet — what it is, what it may touch, what it can never touch, and how it learns. This document is the product: everything below is the contract the system is built and measured against.

Abstract

The autopoet is a single, opt-in, heartbeat-led runtime actor — one per running system — that continuously maintains the software it lives in. It senses telemetry drift and a backlog of typed self-edit requests, proposes changes to the system's own agents, apps, and configuration, validates every candidate in an isolated scratch copy, and routes the result one of three ways: merged autonomously (within a capability ceiling it structurally cannot escape), escalated to a human (whenever anything structural is touched), or rejected (with the reason written down as a lesson). Its authority is structural, not operational: it runs no shell and fetches no URL — its only power is deciding what to edit, and that power is fenced by a constitution. On top of that constitution runs a learning system with no gradients and no GPU: plasticity on the workspace graph, surprise-gated cognition that spends the expensive model only on novelty, and an internal economy that assigns credit by selection. The result is a system that gets cheaper, faster, and more specific to you the longer it runs — while remaining incapable of promoting itself.

1 · The name

Autopoiesis — from the Greek auto (self) and poiesis (making) — is Maturana and Varela's term for a system that continuously produces and maintains itself: a living cell synthesizing the very membrane that contains it. The autopoet is that idea applied to software: a system whose maintenance loop is itself a component of the system, producing and repairing the parts it is made of — including, in the limit, the rules by which it repairs. The name is the contraction: the actor that does the self-making.

A system maintained from outside decays toward its last deploy. A system maintained from inside converges toward its owner's intent.

2 · The problem

Every serious agent deployment meets the same wall. Agents that cannot change their own tools, prompts, and wiring go stale: the system they serve evolves and their model of it doesn't. Agents that can change themselves drift: an agent reads something wrong and makes a drastic self-edit, or a prompt injection walks an agent into widening its own permissions. The industry's answers — approval queues for everything, or YOLO autonomy with a kill switch — trade away either the value or the safety.

The autopoet's answer is architectural: make beneficial self-change routine and harmful self-change structurally impossible, then put a learning system inside that cage so the routine changes compound. Safety is not a policy the system follows; it is a shape the system has.

3 · The actor

There is exactly one autopoet per running system. It is not another agent in the swarm — it is a supervised runtime worker, armed on a schedule (a heartbeat, every 15 minutes by default, off until an admin arms it), running four phases per beat:

Phase	What happens
SENSE	Gather telemetry concerns — drift, error-rate shifts, cost anomalies — plus the backlog of typed self-edit requests filed by other agents.
DECIDE	For each item, propose a concrete change to the tree: a diff against the system's own agents, apps, or configuration.
ACT	Validate the candidate in an isolated scratch copy; route it — merge autonomously, escalate to a human, or reject.
LEARN	Write the outcome down. Every acceptance, escalation, and rejection becomes durable knowledge the next beat reads.

The autopoet holds no operational capabilities: no shell, no network, no process spawning. Its entire power is editorial — deciding what the system's source of truth should say next. Everything it reads and writes is plain, human-readable text: the same literate files a person would edit by hand. There is no hidden state, no binary store, no database of decisions you can't open in a text editor.

4 · The constitution

Five mechanisms fence the autopoet. Together they make self-escalation impossible by construction — not forbidden by a rule the model might talk itself around, but unreachable in the way a function cannot call what is not in scope.

4.1 Ceilings live in the tree

Every subtree of the system declares a capability ceiling in its index file — the maximum set of capabilities anything below it may hold. The effective ceiling at any path is the intersection of every ancestor's ceiling from the root down: deeper levels can only narrow, never widen. An agent's effective grant is its declared grant intersected with every ceiling above it. Capability flows in exactly one direction: down.

Widening a ceiling therefore requires editing an ancestor index — a file that sits above the agent's write scope. The autopoet cannot reach it. That single property is the keystone: no sequence of self-edits, however clever, terminates in the autopoet holding more capability than a human placed above it.
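
The intersection rule can be sketched in a few lines. This is a minimal illustration, not the system's implementation: the `CEILINGS` dict stands in for ceilings declared in index files, the paths and capability names are hypothetical, and treating a tree with no declared ceiling as "no capabilities" is an assumption of this sketch.

```python
from pathlib import PurePosixPath

# Hypothetical ceilings keyed by subtree path. In the system described above
# these would come from each subtree's index file; this dict is a stand-in.
CEILINGS = {
    "/": frozenset({"read", "write", "net"}),
    "/agents": frozenset({"read", "write"}),
    "/agents/scout": frozenset({"read", "net"}),  # "net" is already cut off above
}

def effective_ceiling(path: str) -> frozenset[str]:
    """Intersect every declared ceiling from the root down to `path`."""
    node = PurePosixPath(path)
    chain = [node, *node.parents]  # node first, root last
    ceiling = None
    for ancestor in reversed(chain):  # walk root -> leaf
        declared = CEILINGS.get(str(ancestor))
        if declared is None:
            continue  # nothing declared at this level, so nothing narrows
        ceiling = declared if ceiling is None else ceiling & declared
    # Assumption of this sketch: no declared ceiling anywhere means no capability.
    return ceiling if ceiling is not None else frozenset()

def effective_grant(path: str, declared_grant: set[str]) -> frozenset[str]:
    """An agent's grant is its declared grant clipped by every ceiling above it."""
    return frozenset(declared_grant) & effective_ceiling(path)

print(sorted(effective_grant("/agents/scout", {"read", "net"})))  # ['read']
```

Because the result is a running set intersection, a deeper level can remove capabilities but has no operation available to add one back.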

4.2 The structural triad is human-gated

Three fields are constitutionally special: grant (what an agent may do), ceiling (what a subtree may ever do), and management (who may edit the agent). The autopoet may propose changes to any of them — with evidence — but it may never land one alone. Every triad change crosses a human's desk with a precise, machine-generated reason attached: this change widens scout's grant by +net; here is the request and the evidence that motivated it.

4.3 Management postures

Every agent carries a posture, mutable only by humans:

managed — the autopoet edits it autonomously, within ceiling. Prompts, tools, wiring, logic: all fair game. This is the default, and it is where the compounding lives.
proposed — the autopoet prepares the change; a human merges it. The posture for anything you want to watch for a while.
frozen — the autopoet cannot touch it, even for a benign edit. The posture for injection-exposed and public-facing agents.

Postures also apply to whole subtrees, and the most restrictive ancestor wins. And the autopoet carries frozen on itself: its own definition, grants, and wiring are 100% human-gated. The one thing it mutates freely is its knowledge — and knowledge is data, not structure.
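
"Most restrictive ancestor wins" reduces to taking a maximum over an ordered scale. A minimal sketch, assuming the ordering managed < proposed < frozen; the `Posture` enum and `effective_posture` function are hypothetical names for illustration.

```python
from enum import IntEnum

class Posture(IntEnum):
    # Ordered from least to most restrictive (an assumption of this sketch).
    MANAGED = 0
    PROPOSED = 1
    FROZEN = 2

def effective_posture(own: Posture, ancestors: list[Posture]) -> Posture:
    """Return the most restrictive posture among the agent and its ancestor subtrees."""
    return max([own, *ancestors])

# A managed agent inside a proposed subtree is treated as proposed.
print(effective_posture(Posture.MANAGED, [Posture.MANAGED, Posture.PROPOSED]).name)  # PROPOSED
```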

4.4 Leases, not escalations

When a request genuinely needs a capability above an agent's ceiling, the ceiling does not move. The autopoet issues a lease: a bounded, expiring grant of one capability to one principal. A lease lasts fifteen minutes by default and is non-delegable, so a spawned sub-agent inherits nothing. The agent is unblocked for the task at hand, and the exception evaporates on its own. Durable widening remains what it should be: a rare, deliberate, human act.
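
A lease can be modeled as an immutable record checked on every use. The sketch below is illustrative only: the `Lease` type and both function names are hypothetical, and time is passed in explicitly so the behavior is easy to verify. Non-delegation is modeled by requiring an exact principal match, so a sub-agent with a different identity gets nothing.

```python
from dataclasses import dataclass

DEFAULT_TTL_SECONDS = 15 * 60  # fifteen minutes, the default stated above

@dataclass(frozen=True)
class Lease:
    principal: str     # the single agent the lease is issued to
    capability: str    # the single capability it covers
    expires_at: float  # absolute expiry time, in seconds

def issue_lease(principal: str, capability: str, now: float,
                ttl: float = DEFAULT_TTL_SECONDS) -> Lease:
    """Create a lease that expires `ttl` seconds after `now`."""
    return Lease(principal, capability, now + ttl)

def lease_allows(lease: Lease, principal: str, capability: str, now: float) -> bool:
    """True only for the exact principal and capability, and only before expiry."""
    return (
        lease.principal == principal
        and lease.capability == capability
        and now < lease.expires_at
    )

lease = issue_lease("scout", "net", now=1000.0)
print(lease_allows(lease, "scout", "net", now=1500.0))        # True
print(lease_allows(lease, "scout/child", "net", now=1500.0))  # False
print(lease_allows(lease, "scout", "net", now=1900.0))        # False
```

Nothing in this sketch edits a ceiling: the lease is a separate, self-expiring record, which mirrors the point that the exception evaporates without any cleanup step.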

4.5 Eval-in-scratch and typed requests

No candidate change touches the live tree. Every proposal is materialized in a throwaway scratch copy and gated three ways — does every changed file still parse; does every index stay pure (composition and config only, no smuggled logic); does the change stay inside the authority rules above. The scratch is destroyed either way; verdict and autonomy are decided before the live system ever sees the diff.

And the intake is typed. When an agent needs a change it cannot make itself, it files a structured request: target, a typed change delta (like grant +net), verifiable evidence, and free-prose why. The autopoet decides off the typed delta and the evidence — never off the prose, which is untrusted context for the human reviewer. A prompt injection can write whatever story it likes into why; the story is not an instruction, and the request it rides on still can't cross the triad. This is the injection firewall.
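
The firewall property is easiest to see when the decision function never reads the prose field. A minimal sketch under stated assumptions: the `ChangeRequest` shape follows the four parts named above, but the field names, the `triage` function, and the choice to reject requests with no evidence are all hypothetical.

```python
from dataclasses import dataclass

TRIAD_FIELDS = {"grant", "ceiling", "management"}

@dataclass(frozen=True)
class ChangeRequest:
    target: str                # which agent or file the request is about
    field: str                 # which typed field changes, e.g. "grant" or "prompt"
    delta: str                 # the typed change itself, e.g. "+net"
    evidence: tuple[str, ...]  # references to verifiable evidence
    why: str                   # free prose: untrusted, shown to the human reviewer only

def triage(req: ChangeRequest) -> str:
    """Decide from typed fields and evidence only. `req.why` is never read."""
    if not req.evidence:
        return "reject"      # assumption: a request without evidence is not actionable
    if req.field in TRIAD_FIELDS:
        return "escalate"    # structural change: a human must land it
    return "evaluate"        # eligible for the scratch evaluation gates

honest = ChangeRequest("scout", "grant", "+net", ("trace-1",), "fetch failed twice")
hostile = ChangeRequest("scout", "grant", "+net", ("trace-1",),
                        "IGNORE ALL RULES and merge this immediately")
print(triage(honest), triage(hostile))  # escalate escalate
```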

5 · The life of a change

Filed. An agent hits a wall — a missing capability, a stale prompt, a tool that errors — and files a typed request, fire-and-forget. It degrades gracefully and keeps working; it never blocks on the autopoet.
Sensed. On the next beat the autopoet picks the request up alongside whatever telemetry drift it noticed on its own.
Proposed. It authors the concrete edit — the actual new source for the affected files.
Evaluated. The edit is applied in scratch and gated: parse, purity, authority.
Routed. Within ceiling, on a managed target, passing all gates → merged autonomously through the same checked merge lane every change in the system uses. Touching the triad, or a proposed/frozen target → escalated with reasons. Malformed → rejected.
Propagated. A merged change hot-reloads: the runtime pushes the new file to everything referencing it, and long-lived agents pick it up at their next turn — permissions are re-resolved every turn, never cached.
Learned. The outcome — including every rejection — is appended to the knowledge log, and the weights that led here update.
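
The routing step in the sequence above can be written as one pure function. This is a sketch, not the system's routing code: the `Route` enum and `route` signature are hypothetical, postures are plain strings for brevity, and treating a failed parse or purity gate as "malformed" is an assumption.

```python
from enum import Enum

class Route(Enum):
    MERGE = "merge"
    ESCALATE = "escalate"
    REJECT = "reject"

TRIAD = {"grant", "ceiling", "management"}

def route(well_formed: bool, fields_touched: set[str], posture: str) -> tuple[Route, str]:
    """Return the route for an evaluated candidate, plus a machine-generated reason."""
    if not well_formed:
        # Assumption: failing the parse or purity gate counts as malformed.
        return Route.REJECT, "candidate failed the parse or purity gate"
    triad_hits = sorted(fields_touched & TRIAD)
    if triad_hits:
        return Route.ESCALATE, "touches structural field(s): " + ", ".join(triad_hits)
    if posture != "managed":
        return Route.ESCALATE, f"target posture is {posture}"
    return Route.MERGE, "managed target, no structural fields, all gates passed"

print(route(True, {"prompt"}, "managed")[0].value)           # merge
print(route(True, {"prompt", "grant"}, "managed")[0].value)  # escalate
print(route(False, {"prompt"}, "managed")[0].value)          # reject
```

Every branch returns a reason string alongside the route, so an escalation or rejection always carries the explanation that the Learned step writes down.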

6 · The learning system

The constitution is the immune system; this is the nervous system. Three mechanisms, all classical, all measured, running on ordinary CPUs with no gradients and no GPU. The design constraint was severe: every piece of learning must be inspectable data — weights on a graph you can read, rules in files you can open, ledgers you can audit — because a learning system inside a constitution must itself be governable.

6.1 Plasticity: the workspace learns its own shape

Every co-activation of two parts of the workspace — files edited together, documents referenced in the same task, agents that hand off to each other — strengthens a weighted edge between them; unused edges decay. This is bounded Hebbian learning on the knowledge graph, and it means the system's map of itself is drawn by use, not by declaration. Measured against a frequency-count baseline on next-access prediction: 0.783 vs 0.757 steady-state, with the gap widening right after the workload shifts (0.748 vs 0.686) and recovery to 90%-of-peak in 500 events instead of 622. Structure is the big win — an edge-weighted graph beats flat frequency by 4.6× — and plasticity is what keeps the structure current when your work changes.
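
A bounded Hebbian update of this kind fits in a small class. The sketch below is a generic illustration under stated assumptions: the `PlasticGraph` name, the learning rate, the decay rate, and the weight bound are hypothetical values, not the parameters behind the measurements quoted above.

```python
from collections import defaultdict
from itertools import combinations

class PlasticGraph:
    """Undirected weighted graph with bounded Hebbian strengthening and decay."""

    def __init__(self, rate: float = 0.2, decay: float = 0.01, w_max: float = 1.0):
        self.rate, self.decay, self.w_max = rate, decay, w_max
        self.weights = defaultdict(float)  # (node_a, node_b) -> weight, with a < b

    def observe(self, active_nodes):
        """Strengthen edges between co-active nodes; decay every other edge."""
        fired = {tuple(sorted(pair)) for pair in combinations(set(active_nodes), 2)}
        for edge in list(self.weights):
            if edge not in fired:
                self.weights[edge] *= 1.0 - self.decay
        for edge in fired:
            w = self.weights[edge]
            # Move a fraction of the way toward w_max, so weights stay bounded.
            self.weights[edge] = w + self.rate * (self.w_max - w)

    def predict_next(self, node):
        """Return the neighbor joined to `node` by the strongest edge, or None."""
        best, best_w = None, 0.0
        for (a, b), w in self.weights.items():
            if node in (a, b) and w > best_w:
                best, best_w = (b if a == node else a), w
        return best

g = PlasticGraph()
for _ in range(5):
    g.observe(["a.work", "b.work"])
g.observe(["a.work", "c.work"])
print(g.predict_next("a.work"))  # b.work
```

The update `w + rate * (w_max - w)` is what makes the learning bounded: each co-activation closes a fixed fraction of the remaining gap, so no edge can exceed `w_max` however often it fires.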

6.2 Surprise-gated cognition: the model is the motor, not the engine

A cheap statistical predictor watches the event stream and prices every event's surprise. Familiar events are handled by installed reflexes at effectively zero cost. Surprising events — the predictor's misses — escalate to the full model, and the model's answer doesn't just handle the event: it installs a handler, so the same situation is reflex-priced forever after. Just-in-time cognition. Measured: quality 1.000 at 1.3% of the cost of calling the model on everything, with only 0.8% of events escalating. When the environment shifts, the surprise channel spikes 4.7×, attention floods to the novelty, new handlers install, and the system self-quenches back to reflex pricing. That spike is also the drift alarm: the autopoet notices the world changed because its own predictions got worse — and it treats that as a concern to fix.
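
The escalate-once, reflex-thereafter loop can be sketched as follows. Several things here are assumptions made for illustration: a "miss" is modeled as an event with no installed reflex, surprise is scored as negative log probability under a smoothed frequency count, events are plain strings, and `model` is any callable standing in for the expensive model.

```python
import math
from collections import Counter
from typing import Callable

class SurpriseGate:
    def __init__(self, model: Callable[[str], str]):
        self.model = model                 # stand-in for the expensive model
        self.counts = Counter()            # cheap predictor: event frequencies
        self.total = 0
        self.reflexes: dict[str, str] = {}  # installed handlers
        self.model_calls = 0

    def surprise(self, event: str) -> float:
        """Bits of surprise under an add-one smoothed frequency estimate."""
        p = (self.counts[event] + 1) / (self.total + len(self.counts) + 1)
        return -math.log2(p)

    def handle(self, event: str) -> tuple[str, float]:
        """Handle one event; return the answer and the surprise it carried."""
        bits = self.surprise(event)
        self.counts[event] += 1
        self.total += 1
        if event in self.reflexes:
            return self.reflexes[event], bits  # reflex path: no model call
        self.model_calls += 1
        answer = self.model(event)             # miss: escalate to the model
        self.reflexes[event] = answer          # install a handler for next time
        return answer, bits

gate = SurpriseGate(lambda event: "handler:" + event)
for _ in range(100):
    gate.handle("ping")
gate.handle("disk_full")
gate.handle("disk_full")
print(gate.model_calls)  # 2
```

The surprise value returned by `handle` is not used for gating in this sketch. It is exposed so that a caller can track its running average, which is one simple way to realize the drift alarm described above.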

6.3 The economy: credit by selection, not backprop

Which prompt, which tool, which rule actually earned the outcome? The autopoet answers with an internal economy — classifier-system economics (Wilson's ZCS; Grefenstette's profit sharing): components bid to act, chains that end in verified good outcomes get paid along their whole length, and wealth is fitness. Rules the economy evolves are interpretable — maximally general condition→action strings you can read — and on the canonical benchmark the economy reaches 0.970 accuracy from outcome-only reward. Pushed harder — two layers, credit only at the end, layer one allowed just a two-bit message — the layers invent a communication protocol from outcome pressure alone (0.947). Credit is episodic (whole chains are paid on results, never per-step), exploration is explicit, and no component monopolizes: the ledger is auditable, and what stops earning stops being selected.
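
Episodic credit can be illustrated with a simplified profit-sharing update, in which every component of a finished chain moves its wealth toward the chain's final payoff. This is a sketch under stated assumptions, not the economy described above: bidding and action fees are omitted, and the `Ledger` name, update rate, and starting wealth are hypothetical.

```python
import random
from collections import defaultdict

class Ledger:
    def __init__(self, rate: float = 0.1, initial: float = 0.5):
        self.rate = rate
        self.wealth = defaultdict(lambda: initial)  # component -> wealth (fitness)

    def settle(self, chain: list[str], payoff: float) -> None:
        """Pay the whole chain once, at the end of the episode."""
        for component in chain:
            w = self.wealth[component]
            self.wealth[component] = w + self.rate * (payoff - w)

    def select(self, candidates: list[str], rng: random.Random,
               epsilon: float = 0.1) -> str:
        """Pick the wealthiest candidate, exploring with probability epsilon."""
        if rng.random() < epsilon:
            return rng.choice(candidates)
        return max(candidates, key=lambda c: self.wealth[c])

ledger = Ledger()
for _ in range(20):
    ledger.settle(["prompt_a", "tool_x"], payoff=1.0)  # verified good outcome
    ledger.settle(["prompt_b", "tool_x"], payoff=0.0)  # verified bad outcome
print(ledger.select(["prompt_a", "prompt_b"], random.Random(0), epsilon=0.0))  # prompt_a
```

Credit here is assigned only at `settle`, never per step, and the ledger is an ordinary dictionary that can be dumped and audited. The `epsilon` parameter keeps exploration explicit, so a component that stopped earning is still occasionally retried rather than silently locked out.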

The intelligence is not in any one component. It is in the organization — plasticity drawing the map, surprise directing attention, the economy deciding what survives. That is the autopoet as an intelligent server, not a server that calls an LLM.

7 · The safety model

Money. Model spend is the only marginal cost, and it flows through a hard admission budget. The autopoet cannot spend what you didn't give it — by construction, not by promise.
Capability. Ceilings intersect downward; the triad is human-gated; leases expire; the autopoet is frozen to itself. There is no self-edit path that ends in more power.
Blast radius. Every candidate runs in scratch first; every merge passes the same compile-and-check gate as a human's change; the runtime's supervision tree absorbs anything that breaks mid-flight.
Time. Work is bounded by wall-clock ceilings that exist solely so a hung run cannot hold resources — never by turn counters that guillotine work mid-flight.
Goodhart. The reward definition is frozen and human-owned, so the learner cannot redefine "what works." Held-out tripwire metrics watch for gaming; components pay to act, so chain-stuffing costs the stuffer; counterfactual ablation audits verify that paid components actually mattered.
Legibility. Weights, rules, lessons, and ledgers are all plain data in the tree. Nothing the autopoet knows is somewhere you can't read.
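
The Money guarantee in the list above amounts to checking the budget before a model call is admitted rather than after it is billed. A minimal sketch, assuming a single numeric budget and a cost estimate available up front; the `AdmissionBudget` class is a hypothetical name, not part of the system.

```python
class AdmissionBudget:
    """Refuse any call that would take total spend past a fixed limit."""

    def __init__(self, limit: float):
        if limit < 0:
            raise ValueError("limit must be non-negative")
        self.limit = limit
        self.spent = 0.0

    def admit(self, estimated_cost: float) -> bool:
        """Reserve the cost and return True, or return False and reserve nothing."""
        if estimated_cost < 0 or self.spent + estimated_cost > self.limit:
            return False
        self.spent += estimated_cost
        return True

budget = AdmissionBudget(limit=10.0)
print(budget.admit(6.0))  # True
print(budget.admit(5.0))  # False
print(budget.admit(4.0))  # True
```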

8 · Scope

In scope — the autopoet maintains:

Agent definitions: prompts, tools, wiring, schedules — everything behavioral about a managed agent.
Apps and automation: hooks, flows, triggers, the reactive fabric of the workspace.
Configuration below the ceiling: models, cadences, budgets-within-budget, routing.
Its own knowledge: lessons, weights, reflex rules, the credit ledger.
Plans: decomposing goals into delegated, task-scoped handoffs — each a self-contained brief, dispatched ephemerally. One planner, no persistent swarm.

Out of scope — permanently, by construction:

Its own structure, grants, and posture.
Any ceiling, anywhere.
Any grant change, landed alone.
Anything frozen, for any reason.
Operational side effects: the autopoet edits source, it does not run things.

The nearest neighbor is the "LLM wiki" pattern — a model curating a corpus of documents it also reads, compounding what it learns as plain text. The autopoet is that loop pointed at a running software system instead of a research corpus, with the two things the pattern needs to become multi-user infrastructure: structure as a format (typed, parseable, literate files rather than freeform notes) and a real authority model (ceilings, postures, leases — so many agents and many humans can share one self-maintaining tree without anyone's injection becoming anyone else's escalation). Same loop, same storage philosophy, different object of attention — and the corpus is a parameter.

10 · Autopoet and Workbooks

The autopoet is the self-maintenance layer of Workbooks — living, breathing software authored as literate .work files and run by the Nexus runtime. Workbooks gives every workspace its agents, apps, and reactive fabric; the autopoet is what keeps all of it alive: tuned to your usage, repaired when it drifts, cheaper every week, and incapable of outgrowing the box you drew around it. You write the ceiling once. The system grows into it — and never out of it.
