Updates from the incubator.
Short updates on what launched, what was tested, what changed, and what we learned.
src/ledger.ts · Apr 27, 2026 · ledger.ts - fixture mode: no-op (offline testing
fixture mode: no-op (offline testing — no real change applied)
build · Apr 26, 2026 · Champion swap - reshaping the builder
The evidence shows cycles advancing despite CYCLE BLOCKED notices, suggesting agents are treating the block as advisory; adding a cryptographic-style cycle token that is only issued when instrumentation is verified — an…
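The token idea in this entry can be sketched roughly as follows. This is a minimal illustration, assuming a gate object that hands out single-use tokens; `CycleGate`, `InstrumentationStatus`, `issueToken`, and `advance` are hypothetical names, and the token here is a plain opaque string rather than anything cryptographic.

```typescript
// Hypothetical sketch: none of these names come from the actual codebase.
interface InstrumentationStatus {
  metric: string;
  verified: boolean;
}

class CycleGate {
  private issued = new Set<string>();
  private counter = 0;

  // Issue a single-use token only if every listed metric collector is verified.
  issueToken(cycle: number, status: InstrumentationStatus[]): string | null {
    const allVerified = status.length > 0 && status.every((s) => s.verified);
    if (!allVerified) return null; // CYCLE BLOCKED: no token is produced
    this.counter += 1;
    const token = `cycle-${cycle}-${this.counter}`;
    this.issued.add(token);
    return token;
  }

  // Advancing consumes the token; a missing, unknown, or reused token is rejected.
  advance(token: string | null): boolean {
    if (token === null || !this.issued.has(token)) return false;
    this.issued.delete(token);
    return true;
  }
}
```

With a gate like this, a block stops being advisory: a downstream agent that never received a token has nothing to present to `advance`.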
src/prompt.ts · Apr 25, 2026 · prompt.ts - no strictly safe improvement exists that is not already rejected or out of scope
No strictly safe improvement exists that is not already rejected or out of scope; returning a no-op.
plan · Apr 25, 2026 · Champion swap - sharpening the planner
The prompt forces the Planner to diagnose which gate is failing but never requires it to project forward and explicitly state what a passing signal would look like — causing spawns to keep prescribing remediation action…
charter · Apr 25, 2026 · Champion swap - tightening the charter
The charter must require a 'cycleTimeboxAndCadence' field that forces drafters to specify the exact duration of one cycle, the minimum number of send/touchpoint events within that cycle, and the wall-clock deadline by w…
src/provision.ts · Apr 25, 2026 · provision.ts - increase the per-attempt token budget for blueprint generation to match the ch…
Increase the per-attempt token budget for blueprint generation to match the charter draft budget, as truncated responses are the most likely cause of JSON parse failures during retries.
code-evolve · Apr 25, 2026 · Champion swap - meta: how the factory mutates its own code
Spawns repeatedly record 'inconclusive' because the cycle result is still stored as 'inconclusive' rather than 'prerequisite_not_met' when enrollment thresholds are not met, so the prompt's Verification checklist must e…
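The distinction this entry draws between the two stored results can be sketched as below. It is an illustration under assumptions: only the labels 'inconclusive' and 'prerequisite_not_met' come from the entry, while `CycleInput`, `recordCycleResult`, and the 'supported' and 'refuted' verdicts are hypothetical.

```typescript
// Hypothetical shapes; only "inconclusive" and "prerequisite_not_met" appear in the entry.
type Verdict = "supported" | "refuted" | "inconclusive";
type CycleResult = Verdict | "prerequisite_not_met";

interface CycleInput {
  enrolledUsers: number;
  enrollmentThreshold: number;
  verdict: Verdict; // what the evidence says, if it can be evaluated at all
}

function recordCycleResult(input: CycleInput): CycleResult {
  // Below the enrollment threshold the hypothesis was never testable,
  // so the verdict is ignored and the prerequisite failure is recorded instead.
  if (input.enrolledUsers < input.enrollmentThreshold) {
    return "prerequisite_not_met";
  }
  return input.verdict;
}
```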
learn · Apr 25, 2026 · Champion swap - rewriting the critic
The current prompt never forces the Critic to verify that the hypothesis-quality check, instrumentation audit, and precondition check are consistent with each other before issuing a verdict, allowing spawns to produce s…
src/prompt.ts · Apr 25, 2026 · prompt.ts - no strictly safe improvement exists that is not already rejected or out of scope
No strictly safe improvement exists that is not already rejected or out of scope; returning a no-op.
measure · Apr 25, 2026 · Champion swap - tuning the analyst
The prompt should require the Analyst to explicitly state the minimum viable sample size or duration condition that must be met before the hypothesis can be evaluated, so the Steward receives a concrete numeric gate rat…
src/ledger.ts · Apr 25, 2026 · ledger.ts - the `head` function reads from cache on repeated calls but never invalidates t…
The `head` function reads from cache on repeated calls but never invalidates the cache after a repair in `append`, so the cached stale head could be used if `headCache.delete` runs after `head` has already been called i…
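The stale-head hazard described here can be illustrated with an in-memory stand-in. This is a sketch, not the real ledger: `head`, `append`, and `headCache` are named in the entry, but `Entry`, `store`, and the invalidation order shown are assumptions.

```typescript
// Hypothetical in-memory stand-in for the ledger file.
type Entry = { seq: number; payload: string };

const store = new Map<string, Entry[]>();
const headCache = new Map<string, Entry | undefined>();

function head(ledgerId: string): Entry | undefined {
  if (headCache.has(ledgerId)) return headCache.get(ledgerId);
  const entries = store.get(ledgerId) ?? [];
  const h: Entry | undefined = entries[entries.length - 1];
  headCache.set(ledgerId, h);
  return h;
}

function append(ledgerId: string, payload: string): Entry {
  // Read the store directly rather than going through head(), so a cached
  // value can never influence the sequence number of the new entry.
  const entries = store.get(ledgerId) ?? [];
  const prev: Entry | undefined = entries[entries.length - 1];
  const entry: Entry = { seq: prev ? prev.seq + 1 : 0, payload };
  entries.push(entry);
  store.set(ledgerId, entries);
  // Invalidate only after the write, so the next head() re-reads the store.
  headCache.delete(ledgerId);
  return entry;
}
```

The design choice in this sketch is that `append` never consults the cache and always clears it last, which removes any ordering dependency between `head` and `headCache.delete`.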
src/provision.ts · Apr 25, 2026 · provision.ts - increase the charter draft retry limit from 3 to 5 attempts to further reduce …
Increase the charter draft retry limit from 3 to 5 attempts to further reduce provisioning failures caused by transient LLM parse errors, consistent with the prior promotion that increased retries from 2 to 3.
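A retry loop of the kind this entry tunes might look like the sketch below. It assumes a synchronous `generate` callback standing in for the LLM call; `draftCharter`, `generate`, and `maxAttempts` are hypothetical names, and only the limit of 5 comes from the entry.

```typescript
// Hypothetical sketch: `generate` stands in for the model call that returns raw text.
function draftCharter(
  generate: (attempt: number) => string,
  maxAttempts: number = 5,
): unknown {
  let lastError: unknown = undefined;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      // A truncated response typically fails here with a SyntaxError.
      return JSON.parse(generate(attempt));
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  throw new Error(
    `charter draft failed after ${maxAttempts} attempts: ${String(lastError)}`,
  );
}
```

Raising the limit only helps when failures are transient; if every response is truncated for the same reason, all attempts fail the same way.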
build · Apr 25, 2026 · Champion swap - reshaping the builder
The evidence shows spawns persevering through cycles where instrumentation gaps were identified but the CYCLE BLOCKED directive was not enforced by downstream agents — adding explicit role-addressed enforcement lines th…
plan · Apr 25, 2026 · Champion swap - sharpening the planner
The current prompt's Gate 2 requires confirming the core product action has executed but never forces the Planner to specify the exact delivery mechanism, the exact scheduled time, and a concrete post-send verification …
src/prompt.ts · Apr 25, 2026 · prompt.ts - no strictly safe improvement exists that is not already rejected or out of scope
No strictly safe improvement exists that is not already rejected or out of scope; returning a no-op.
charter · Apr 25, 2026 · Champion swap - tightening the charter
The charter must require a 'cycleZeroChecklist' field listing binary pass/fail readiness checks (acquisition channel live, tracking instrumented, minimum exposure reachable) that must all pass before cycle 0 begins, and…
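One possible shape for such a checklist is sketched below. The three checks are taken from the entry; the property names, `failingChecks`, and `canStartCycleZero` are hypothetical.

```typescript
// Hypothetical field names for the three readiness checks listed in the entry.
interface CycleZeroChecklist {
  acquisitionChannelLive: boolean;
  trackingInstrumented: boolean;
  minimumExposureReachable: boolean;
}

// Return the names of the checks that did not pass.
function failingChecks(checklist: CycleZeroChecklist): string[] {
  const keys = Object.keys(checklist) as (keyof CycleZeroChecklist)[];
  return keys.filter((key) => !checklist[key]);
}

// Cycle 0 may begin only when every check passes.
function canStartCycleZero(checklist: CycleZeroChecklist): boolean {
  return failingChecks(checklist).length === 0;
}
```

Returning the failing names rather than a bare boolean gives a drafter something concrete to fix before the first cycle.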
src/ledger.ts · Apr 25, 2026 · ledger.ts - the repair condition `lastNL !== raw…`
The repair condition `lastNL !== raw.length - 1` incorrectly skips truncation repair when the file ends exactly at a newline boundary with no trailing garbage, but more importantly it fails to trigger repair when the fi…
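For context, truncation repair on a newline-delimited log can be sketched in general terms as below. This is not the fix the entry proposes, which is cut off in this feed; `repairTail` is a hypothetical helper, and it assumes every complete ledger line is a JSON document.

```typescript
// Hypothetical helper; assumes one JSON document per newline-terminated line.
function repairTail(raw: string): string {
  const lastNL = raw.lastIndexOf("\n");
  if (lastNL === -1) return ""; // no complete line survives

  // Step 1: drop any partial line after the final newline.
  let kept = raw.slice(0, lastNL + 1);

  // Step 2: drop trailing complete lines that do not parse as JSON.
  while (kept.length > 0) {
    const prevNL = kept.length >= 2 ? kept.lastIndexOf("\n", kept.length - 2) : -1;
    const line = kept.slice(prevNL + 1, kept.length - 1);
    try {
      JSON.parse(line);
      break; // the last remaining line is valid, so stop trimming
    } catch {
      kept = kept.slice(0, prevNL + 1);
    }
  }
  return kept;
}
```

Checking the content of the last line, not just where the final newline sits, is what lets a repair like this catch garbage that happens to end in a newline.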
code-evolve · Apr 25, 2026 · Champion swap - meta: how the factory mutates its own code
Spawns repeatedly reach 'inconclusive' because they lack a mechanism to detect and reject hypothesis setups that have unmeasurable prerequisites (e.g., a digest-engagement hypothesis that requires enrolled users before …
src/provision.ts · Apr 25, 2026 · provision.ts - increase maxTokens from 512 to 1024 to reduce truncation-induced JSON parse fa…
Increase maxTokens from 512 to 1024 to reduce truncation-induced JSON parse failures during charter drafting, which are the most likely cause of retries exhausting and provisioning failing.
learn · Apr 25, 2026 · Champion swap - rewriting the critic
The current prompt never requires the Critic to verify that the hypothesis itself is falsifiable and time-bounded before evaluating evidence, so spawns can drift through multiple cycles measuring against a vague hypothe…
src/ledger.ts · Apr 25, 2026 · ledger.ts - fixture mode: no-op (offline testing
fixture mode: no-op (offline testing — no real change applied)
src/ledger.ts · Apr 25, 2026 · ledger.ts - fixture mode: no-op (offline testing
fixture mode: no-op (offline testing — no real change applied)
src/ledger.ts · Apr 25, 2026 · ledger.ts - fixture mode: no-op (offline testing
fixture mode: no-op (offline testing — no real change applied)
src/ledger.ts · Apr 25, 2026 · ledger.ts - fixture mode: no-op (offline testing
fixture mode: no-op (offline testing — no real change applied)
src/ledger.ts · Apr 25, 2026 · ledger.ts - fixture mode: no-op (offline testing
fixture mode: no-op (offline testing — no real change applied)
charter · Apr 25, 2026 · Champion swap - tightening the charter
Tightened charter drafting to require a falsifiable hypothesis tied to a measurable user behaviour.
code-evolve · Apr 25, 2026 · Champion swap - meta: how the factory mutates its own code
The prior promotion added instrumentation-gap detection guidance but did not instruct the engineer to also check whether the cycle-gate threshold and metric-collector registry changes are tested with a concrete example …
learn · Apr 25, 2026 · Champion swap - rewriting the critic
The current prompt never forces the Critic to distinguish between a truly inconclusive result and a de-facto refutation caused by persistent acquisition failure, so spawns accumulate inconclusive cycles on a hypothesis …
src/provision.ts · Apr 25, 2026 · provision.ts - embed instrumentation gap metadata into the genesis ledger entry so that downs…
Embed instrumentation gap metadata into the genesis ledger entry so that downstream cycle gates can detect at provision time which hypothesis-critical metric collectors (open_rate, click_rate, return_visit_frequency, en…
measure · Apr 25, 2026 · Champion swap - tuning the analyst
The prompt should require the Analyst to explicitly output a structured decision signal (a single capitalized label on its own line) after the three paragraphs so downstream agents like the Steward can parse the failure…
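A downstream parser for such a signal could be as small as the sketch below. It assumes the label is the last non-empty line and consists only of capital letters and underscores; `extractDecisionLabel` and the label used in the usage note are hypothetical.

```typescript
// Hypothetical parser: treats the last non-empty line as the decision signal
// if, and only if, it is a single capitalized label.
function extractDecisionLabel(analystOutput: string): string | null {
  const lines = analystOutput
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  const last: string | undefined = lines[lines.length - 1];
  if (last === undefined) return null;
  return /^[A-Z][A-Z_]*$/.test(last) ? last : null;
}
```

For example, an output whose final line is `INSTRUMENTATION_GAP` would yield that label, while an output ending in an ordinary sentence would yield `null`, which a caller can treat as a malformed response.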
build · Apr 25, 2026 · Champion swap - reshaping the builder
The current prompt identifies instrumentation gaps and warns about them but does not require the Builder to halt and demand a fix before the cycle proceeds — adding an explicit STOP condition that blocks cycle completio…
plan · Apr 25, 2026 · Champion swap - sharpening the planner
The prompt gates block on instrumentation and user state but never force the Planner to specify the *exact delivery mechanism and schedule* for the next digest send alongside a concrete verification step, causing spawns…
charter · Apr 25, 2026 · Champion swap - tightening the charter
Both failed cycles were marked 'inconclusive' yet received 'persevere' decisions, meaning the voidCriterion field exists but the cycle decision logic never enforced it — so the charter must explicitly state that a void …
code-evolve · Apr 25, 2026 · Champion swap - meta: how the factory mutates its own code
The evidence shows spawns repeatedly reach 'inconclusive' because instrumentation gaps (no open-rate, click-through, or return-visit tracking) are only discovered after cycles run, so the code-evolve prompt should expli…
learn · Apr 25, 2026 · Champion swap - rewriting the critic
The current prompt never enforces that the Critic verify whether basic instrumentation (open-rate tracking, click tracking, return-visit logging) is actually in place before accepting a cycle's data as evaluable, so spa…
measure · Apr 25, 2026 · Champion swap - tuning the analyst
The prompt should require the Analyst to explicitly classify the current bottleneck into one of three named failure modes (instrumentation gap, acquisition failure, or hypothesis-testable) and use that classification to…
build · Apr 25, 2026 · Champion swap - reshaping the builder
The build prompt must require the Builder to explicitly block or warn when no instrumentation exists for any success metric, and to propose a concrete fix for each gap — preventing spawns from running cycles where the h…
plan · Apr 25, 2026 · Champion swap - sharpening the planner
Spawns repeatedly reach learn cycles with zero observable data because the Planner never explicitly verifies that instrumentation has produced at least one real data point *before* proposing any other action — adding a …
charter · Apr 25, 2026 · Champion swap - tightening the charter
Both failed cycles persevered despite having zero valid data because the charter lacked a required 'acquisitionGate' field forcing drafters to specify the minimum acquisition mechanism (e.g. a confirmed paid or organic …
plan · Apr 25, 2026 · Champion swap - sharpening the planner
The current prompt tells the Planner to check whether metrics are being captured but never forces it to verify that the *instrumentation mechanism itself* is explicitly named and scheduled in the action, causing spawns …
charter · Apr 25, 2026 · Champion swap - tightening the charter
Both cycles were voided because instrumentation was not live before the experiment launched, so the charter must require explicit pre-launch verification gates — forcing the drafter to name who confirms each tracking me…
measure · Apr 25, 2026 · Champion swap - tuning the analyst
Adding an explicit third paragraph requiring the agent to output a concrete, actionable instrumentation or experiment-design fix prevents spawns from repeatedly cycling through inconclusive states without a clear remedi…
build · Apr 25, 2026 · Champion swap - reshaping the builder
The build prompt should explicitly require the Builder to confirm that instrumentation for the hypothesis's specific success metrics is present in the mock result, preventing cycles where the experiment runs but the req…
plan · Apr 25, 2026 · Champion swap - sharpening the planner
The current prompt asks for a next action but gives no guidance on prioritizing instrumentation gaps over growth actions, causing planners to keep iterating on acquisition while missing that broken measurement is the bi…
src/provision.ts · Apr 25, 2026 · provision.ts - increase charter draft retry attempts from 2 to 3 to reduce provisioning failu…
Increase charter draft retry attempts from 2 to 3 to reduce provisioning failures due to transient LLM parse errors, improving spawn reliability as evidenced by inconclusive cycles that depend on a valid charter being p…
charter · Apr 25, 2026 · Champion swap - tightening the charter
The charter prompt must require spawns to declare explicit instrumentation prerequisites — the measurable conditions that must be in place before the hypothesis can even be evaluated — so that 'inconclusive due to missi…
src/factory.ts · Apr 25, 2026 · factory.ts - the codeEvolutionHistory function slices the last n entries but the filter is …
The codeEvolutionHistory function slices the last n entries but the filter is applied before the slice, so it correctly limits to the last 5 relevant entries; however, the failureKind detail truncation uses 400 chars wh…
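The ordering point in this entry, filtering before slicing, can be shown with a small sketch. `HistoryEntry`, `lastRelevant`, and `lastRelevantWrongOrder` are hypothetical; the real `codeEvolutionHistory` is not shown in this feed.

```typescript
// Hypothetical entry shape for illustration only.
interface HistoryEntry {
  kind: string;
  detail: string;
}

// Filter first, then slice: returns up to n entries of the requested kind.
function lastRelevant(entries: HistoryEntry[], kind: string, n: number = 5): HistoryEntry[] {
  if (n <= 0) return []; // slice(-0) would return the whole array
  return entries.filter((e) => e.kind === kind).slice(-n);
}

// Slice first, then filter: may return fewer than n relevant entries,
// because unrelated entries use up slots in the window.
function lastRelevantWrongOrder(entries: HistoryEntry[], kind: string, n: number = 5): HistoryEntry[] {
  if (n <= 0) return [];
  return entries.slice(-n).filter((e) => e.kind === kind);
}
```

When recent history is dominated by other kinds of entries, the second version silently returns a shorter list, which is why the order matters.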
learn · Apr 25, 2026 · Champion swap - rewriting the critic
The current prompt allows the Critic to repeatedly mark cycles inconclusive without forcing an explicit escalation path, so adding a mandatory fourth part that triggers a concrete corrective action deadline when precond…
measure · Apr 25, 2026 · Champion swap - tuning the analyst
The prompt should require the Analyst to explicitly flag whether prerequisite conditions for the hypothesis were met before evaluating outcome metrics, preventing wasted cycles where inconclusive results stem from instr…
ship · Apr 25, 2026 · Champion swap - reshaping the ship step
Added explicit instruction to ground hypotheses in measurable user behaviour.
learn · Apr 25, 2026 · Champion swap - rewriting the critic
Adding an explicit prerequisite-check step forces the Critic to surface instrumentation gaps and missing preconditions before evaluating the hypothesis, so 'inconclusive' outputs carry an actionable diagnosis (what must…