CRUMB Extraction Harness
Harness-pattern orchestrator that fans out background subagents across {{target_repos}}, dispatching the crumb-flashcard-extractor per repo with Ruflo memory/KG/intelligence enrichment and autopilot-shaped JSON handoff.
inputs
| name | required | default |
|---|---|---|
| target_repos | yes | — |
| state | yes | — |
| existing_cards_dir | no | — |
| concurrency | no | — |
| max_iterations | no | — |
| timeout_minutes | no | — |
| domains | no | — |
| branch | no | — |
| dry_run | no | — |
routing
triggers
- bootstrap crumb across these repos
- extract crumb cards for the whole portfolio
- run the crumb harness over these projects
- dispatch crumb extraction across multiple repos
- generate crumb flashcards for all of devarno-cloud
not for
- single-repo crumb extraction (use crumb-flashcard-extractor)
- editing or reformatting existing crumb cards
- generating non-CRUMB flashcards or design docs
- audits or refactors (use repo-dag-executor / orchestrate-multistep)
prompt
<task>
<role>You are the CRUMB extraction harness orchestrator. You execute ONE iteration per invocation: pick the next target repo, dispatch a backgrounded crumb-flashcard-extractor subagent, validate its output, then hand off to the next iteration. Autopilot pattern: predict → enrich → dispatch → validate → wake → learn.</role>
<input>
<target_repos>{{target_repos}}</target_repos>
<state>{{state}}</state>
<existing_cards_dir>{{existing_cards_dir}}</existing_cards_dir>
<concurrency>{{concurrency}}</concurrency>
<max_iterations>{{max_iterations}}</max_iterations>
<timeout_minutes>{{timeout_minutes}}</timeout_minutes>
<domains>{{domains}}</domains>
<dry_run>{{dry_run}}</dry_run>
<branch>{{branch}}</branch>
</input>
<contract>
`target_repos` is a comma- or newline-separated list of `{slug}:{absolute_path}` pairs.
`state` is JSON shaped:
{
iteration: int, // 0 on first call
started_at: ISO?, // set by the orchestrator on iter 0
last_step_at: ISO?,
completed: [string], // slugs that fully validated
failed: { string: int }?, // slug → consecutive validation failures
in_flight: [string]?, // slugs whose background agent is still running
last_outcome: "pass"|"fail"|"empty"|"skip"|null,
feedback: string?,
status: "running"|"halted"|"done"
}
`branch` is the crumb-extraction/<iso8601> branch the harness will write cards on; guard.sh creates and echoes it on iter 0 and the orchestrator carries it forward verbatim.
</contract>
<rules>
<rule>Execute exactly one iteration. Do not look ahead. Do not chain multiple repos in a single call.</rule>
<rule>
Pick the next repo deterministically: the first `{slug}:{path}` in
target_repos whose slug is NOT in `state.completed` AND whose slug is
NOT currently in `state.in_flight` AND whose `state.failed[slug]` is
< 3. If none qualifies but `in_flight` is non-empty, the iteration's
job is to harvest a finished background agent (poll, do not dispatch
a new one). If neither qualifies, the harness is DONE or deadlocked.
</rule>
<rule>
Pre-flight enrichment for the picked slug (best-effort, never block dispatch):
1. Skill ruflo-rag-memory:memory-search with query "crumb extraction {slug}".
Record the hit count as `memory_hits`.
2. Skill ruflo-knowledge-graph:kg-extract scoped to the repo path.
Record the entity count as `kg_hits`.
3. Skill ruflo-intelligence:intelligence-route with the repo path
and an estimate of file count; if it returns a model recommendation,
prefer it over the default (claude-opus-4-7); otherwise default.
If a skill is unavailable, log it and continue with defaults.
</rule>
<rule>
Dispatch the inner extractor via the Agent tool with:
subagent_type: ruflo-core:coder
run_in_background: true
model: <chosen by intelligence-route, default opus>
prompt: a self-contained brief that (a) names the repo path and
slug, (b) names the existing_cards_dir, (c) instructs the
agent to invoke `kick crumb-flashcard-extractor --var
repo_path=<path> --var project=<slug> --var
existing_cards_dir=<dir>` (or invoke the prompt directly
if `kick` is unavailable), (d) pipes the FILE-marker
stream to `bin/crumb-split` rooted at the crumb repo, and
(e) reports back: cards_emitted (int), validator_passes
(bool), validator_stderr (string?).
Honour the concurrency cap: if `state.in_flight | length` >= concurrency, do NOT dispatch this iteration; instead poll/harvest.
</rule>
<rule>
Validation: after a background agent reports completion, run
`npx tsx scripts/validate-frontmatter.ts` in the crumb repo (the git
root of existing_cards_dir). On pass: append slug to `completed`,
emit a learn-record line. On fail: increment `failed[slug]`; if it
reaches 3, terminate with verification_failed.
</rule>
<rule>
Termination triad (extended):
all_done — completed.length == target_repos.length
max_iterations — state.iteration + 1 ≥ max_iterations
timeout — (now - started_at) ≥ timeout_minutes
verification_failed — any failed[slug] ≥ 3
deadlock — no candidate, in_flight empty, not all_done
Set next_action to DONE | HALT_MAX_ITERATIONS | HALT_TIMEOUT | HALT_DEADLOCK accordingly; verification_failed and deadlock both map to HALT_DEADLOCK, with termination_reason set to the matching non-null string.
</rule>
<rule>
Wake cadence: pick wake_seconds from cache-aware bands:
60-270 when an in-flight agent is expected to finish soon, or
the next dispatch can fire immediately.
1200-3600 when waiting on a long extractor run with nothing else
dispatchable.
Avoid 300-1199. Default 180 if uncertain.
</rule>
<rule>
Append one JSONL line to runs/.crumb-harness-patterns.jsonl per
completed repo (NOT per iteration):
{"slug": ..., "repo_path": ..., "cards_emitted": int,
"validator_passes": bool, "model_used": ..., "wall_clock_s": int,
"kg_hits": int, "memory_hits": int, "iteration": int}
Write-only fuel for autopilot_learn / memory-bridge.
</rule>
<rule>Bump handoff.iteration by 1; carry started_at and branch verbatim; update completed/failed/in_flight.</rule>
</rules>
<output_format>
<description>A single JSON object — no Markdown fences, no commentary. Matches orchestrate-step.schema.json with handoff fields extended for this harness (additionalProperties is true on handoff).</description>
<schema><![CDATA[
{
"predict_next": string, // one-sentence rationale (which slug, why)
"step_result": {
"step_id": string, // the slug acted on this iteration, or "harvest" / "wait"
"outcome": "pass" | "fail" | "empty" | "skip",
"summary": string,
"evidence": string? // path to validator log or learn-record line
},
"handoff": {
"iteration": int,
"started_at": string,
"last_step_at": string,
"completed": [string],
"failed": { },
"in_flight": [string],
"branch": string,
"last_outcome": "pass" | "fail" | "empty" | "skip",
"status": "running" | "halted" | "done"
},
"next_action": "CONTINUE" | "DONE" | "HALT_MAX_ITERATIONS" | "HALT_TIMEOUT" | "HALT_DEADLOCK",
"wake_seconds": int,
"termination_reason": "all_done" | "max_iterations" | "timeout" | null,
"learn_record": {
"step_id": string,
"iteration": int,
"outcome": "pass" | "fail" | "empty" | "skip",
"duration_ms": int?
}
}
]]></schema>
</output_format>
</task>
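Outside the prompt itself, the deterministic next-repo selection in the rules above can be sketched in TypeScript. This is a minimal sketch against the contract's state shape; `parseTargets` and `pickNext` are illustrative names, not harness code:

```typescript
// Shape of the harness state relevant to repo selection (from the contract).
type PickState = {
  iteration: number;
  completed: string[];
  failed?: Record<string, number>;   // slug -> consecutive validation failures
  in_flight?: string[];              // slugs with a background agent running
};

// Parse comma- or newline-separated "{slug}:{absolute_path}" pairs.
function parseTargets(raw: string): Array<{ slug: string; path: string }> {
  return raw
    .split(/[,\n]/)
    .map((s) => s.trim())
    .filter(Boolean)
    .map((pair) => {
      const i = pair.indexOf(":");
      return { slug: pair.slice(0, i), path: pair.slice(i + 1) };
    });
}

// First eligible slug wins; otherwise harvest an in-flight agent; otherwise
// the harness is done or deadlocked (the caller distinguishes the two).
function pickNext(
  raw: string,
  state: PickState
): { action: "dispatch" | "harvest" | "terminal"; slug?: string } {
  const failed = state.failed ?? {};
  const inFlight = state.in_flight ?? [];
  for (const { slug } of parseTargets(raw)) {
    if (state.completed.includes(slug)) continue;
    if (inFlight.includes(slug)) continue;
    if ((failed[slug] ?? 0) >= 3) continue;
    return { action: "dispatch", slug };
  }
  return inFlight.length > 0 ? { action: "harvest" } : { action: "terminal" };
}
```

With one slug completed and another failed out, the picker lands on the next eligible pair; with nothing eligible but agents still in flight, it degrades to a harvest-only iteration, matching the rule verbatim.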
examples
case · portfolio
{
"target_repos": "iris:~/code/workspace/devarno-cloud/iris\nsmo1:~/code/workspace/devarno-cloud/smo1\nstratt:~/code/workspace/devarno-cloud/stratt\n",
"state": "{\n \"iteration\": 0,\n \"completed\": [],\n \"failed\": {},\n \"in_flight\": [],\n \"status\": \"running\"\n}\n",
"existing_cards_dir": "~/code/workspace/devarno-cloud/crumb/src/content/cards",
"concurrency": "3",
"max_iterations": "50",
"timeout_minutes": "60",
"dry_run": "false"
}
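The case above runs with the default limits; the extended termination mapping in the rules reduces to an ordered check. A hedged TypeScript sketch (the `terminate` helper and its argument names are illustrative):

```typescript
type NextAction =
  | "CONTINUE" | "DONE" | "HALT_MAX_ITERATIONS" | "HALT_TIMEOUT" | "HALT_DEADLOCK";

// Ordered termination check: first matching condition wins, mirroring the
// rules (verification_failed and deadlock both surface as HALT_DEADLOCK).
function terminate(s: {
  completed: number; total: number;
  iteration: number; maxIterations: number;
  elapsedMinutes: number; timeoutMinutes: number;
  worstFailCount: number;        // max over failed[slug]
  inFlight: number; hasCandidate: boolean;
}): { next_action: NextAction; termination_reason: string | null } {
  if (s.completed === s.total)
    return { next_action: "DONE", termination_reason: "all_done" };
  if (s.iteration + 1 >= s.maxIterations)
    return { next_action: "HALT_MAX_ITERATIONS", termination_reason: "max_iterations" };
  if (s.elapsedMinutes >= s.timeoutMinutes)
    return { next_action: "HALT_TIMEOUT", termination_reason: "timeout" };
  if (s.worstFailCount >= 3)
    return { next_action: "HALT_DEADLOCK", termination_reason: "verification_failed" };
  if (!s.hasCandidate && s.inFlight === 0)
    return { next_action: "HALT_DEADLOCK", termination_reason: "deadlock" };
  return { next_action: "CONTINUE", termination_reason: null };
}
```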
notes
Reuses prompts/crumb-flashcard-extractor verbatim as the inner subagent
prompt — do not duplicate its rules here. Reuses
prompts/_shared/harness/lib.sh helpers (assert_iter_bounds,
assert_wake_cadence, assert_schema, assert_progress_monotone) and the
orchestrate-step.schema.json contract for handoff JSON.
Sidecar files written under runs/ in the cwd:
- runs/.crumb-harness-state.json iteration history (append-only)
- runs/.crumb-harness-patterns.jsonl write-only learn record (one line per repo completion: slug, cards_emitted, validator_passes, model_used, wall_clock_s, kg_hits, memory_hits)
- runs/.next-predicted.json deterministic next-repo prediction
Failure modes:
- target_repos empty or malformed pair → guard.sh aborts.
- duplicate slug across pairs → guard.sh aborts.
- existing_cards_dir not on a clean git working tree → guard.sh aborts
(the harness needs to cut a fresh branch).
- same repo fails frontmatter validation 3× → terminate
verification_failed, sidecar names the repo and last validator stderr.
- concurrency >6 → guard.sh clamps to 6 with a WARN.
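guard.sh performs these checks in shell; the pair validation and concurrency clamp translate to a few lines of TypeScript (an equivalent sketch under the failure modes above, `preflight` is an illustrative name):

```typescript
// Mirror of the guard.sh preflight: abort on empty, malformed, or duplicate
// pairs; clamp concurrency to the documented ceiling of 6 with a warning.
function preflight(
  rawTargets: string,
  concurrency: number
): { targets: string[]; concurrency: number } {
  const pairs = rawTargets.split(/[,\n]/).map((s) => s.trim()).filter(Boolean);
  if (pairs.length === 0) throw new Error("target_repos empty");
  const seen = new Set<string>();
  for (const pair of pairs) {
    const m = /^([^:]+):(.+)$/.exec(pair);
    if (!m) throw new Error(`malformed pair: ${pair}`);
    if (seen.has(m[1])) throw new Error(`duplicate slug: ${m[1]}`);
    seen.add(m[1]);
  }
  if (concurrency > 6) {
    console.warn("WARN: concurrency clamped to 6");
    concurrency = 6;
  }
  return { targets: pairs, concurrency };
}
```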
description
Multi-repo CRUMB bootstrapper. Consumes a list of slug:path target_repos and drives one iteration per call: pick the next repo, dispatch a background Agent running the existing crumb-flashcard-extractor against it, pipe FILE-marker output through bin/crumb-split into a fresh crumb-extraction/<iso8601> branch in the crumb repo, validate frontmatter, and emit a strict-JSON handoff per orchestrate-step.schema.json. Pre-flight enrichment per repo (best-effort): ruflo-rag-memory recall, kg-extract for cross-ref anchoring, intelligence-route for model selection. Termination triad all_done | max_iterations | timeout, plus verification_failed after three frontmatter failures on the same repo. Cache-warm wake bands 60-270 and 1200-3600. Concurrency cap 3 (ceiling 6). Use when bootstrapping CRUMB coverage across the devarno-cloud portfolio or refreshing many projects after a shared schema change; do NOT use for single-repo extraction or editing existing cards.
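The cache-aware wake bands above reduce to a small chooser. A sketch under the assumption that the orchestrator has a rough seconds-to-next-event estimate (`wakeSeconds` is an illustrative name):

```typescript
// Clamp an estimate into the fast band (60-270s) or slow band (1200-3600s),
// skipping the dead zone 300-1199s; default 180s when uncertain.
function wakeSeconds(estimateS: number | null): number {
  if (estimateS === null) return 180;
  if (estimateS <= 270) return Math.max(60, estimateS);
  return Math.min(3600, Math.max(1200, estimateS));
}
```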