test-cycle-orchestrate¶
Drives the iterative test cycle (determine → execute → analyse → adapt → re-execute) for a feature, dispatching each phase and looping to green under the no-cheating invariant.
Drives one or more turns of the iterative test cycle defined in spec/project/test-cycle-foundation/ — case determination → execution → result analysis → code adaptation → re-execute — dispatching each phase to its capability (test-case-extractor and the per-tier test-generators for cases, quality-gate for execution, test-result-analyzer for analysis, test-code-adapter for the fix) and looping until an explicit exit condition holds. Enforces the cycle's integrity rules: a regression case before fixing a defect, a flake quarantined not retried-until-green, and the no-cheating invariant (never weaken/skip a test to force a pass). Invoke when the user asks to run the test cycle for a feature, drive a feature to green, or iterate determine-execute-analyse-fix. Don't use to run the gate once (use quality-gate), to scaffold tests without the loop (use a tier generator), or to classify results without acting (use test-result-analyzer). Supports resume per spec/claude/resumable-work/.
- Plugin: nolte-engineering
- Phase: 6 Quality (quality)
- Tags: quality-gate
- Source: skills/test-cycle-orchestrate/SKILL.md
Use when¶
- you want the test cycle run for a feature, looping determine-execute-analyse-fix to green
- you want a confirmed failure driven through analysis and a verified code fix
Don't use when¶
- you want to run the lint/typecheck/test gate once, not the loop → quality-gate
- you want to scaffold tests for a tier without running the loop → unit-test-generator
Test Cycle Orchestrate: $ARGUMENTS¶
Drive the iterative test cycle for $ARGUMENTS (a feature, module, or failing case): determine cases → execute → analyse results → adapt code → re-execute, looping until an explicit exit condition holds. This skill orchestrates; each phase is performed by its capability, dispatched in turn, and the results flow back here so you decide whether to loop again.
Governed by spec/project/test-cycle-foundation/ (the loop, the inter-phase contracts, the exit conditions, and the no-cheating invariant) and the four phase specs it references.
Why this is a skill, not an agent¶
- Orchestrator that chains other capabilities: the cycle dispatches test-case-extractor / per-tier generators, quality-gate, test-result-analyzer, and test-code-adapter in a loop; the skill-orchestrates pattern (per skill-vs-agent) defaults the orchestrator to skill form.
- Mid-flow gating lives in the conversation: the no-cheating decision, the "is this exit-ready or loop again" decision, and the wrong-test-routes-to-a-reviewable-case-change decision are per-turn user-visible gates an agent's fire-and-forget shape would lose.
- Output flows back into the main conversation: each phase's result (the classification, the applied fix, the re-run verdict) surfaces so the operator can steer.
- Counter-dimension considered: each individual phase is a self-contained agent (the generators, analyzer, adapter); the loop that sequences them is the orchestration, which stays a skill.
The cycle¶
Run the loop for $ARGUMENTS. One turn:
Phase 1 — Determine cases¶
Ensure the cases that should be green exist and currently fail or are absent. Dispatch test-case-extractor to derive abstract cases from a requirement, and the per-tier generator (unit-test-generator, component-test-generator, integration-test-generator, contract-test-generator, or e2e-test-generator) to scaffold the runnable test at the tier the foundation's lowest-tier-that-gives-confidence rule selects. When the trigger is a confirmed defect, a failing regression case is written first.
Phase 2 — Execute¶
Run the cases. Dispatch quality-gate (the fast tiers) and the project's tier runners; collect the structured per-case results. Honour the staged-execution model — fast tiers gate, slow/broad tiers run where the project places them.
Phase 3 — Analyse results¶
Dispatch test-result-analyzer to classify each non-pass into a routed category (real defect / flake / test bug / infra / stale dep / config drift) with evidence. Route a flake to quarantine, a test bug back to phase 1 as a reviewable case change, and infra/stale/config to the environment (via workflow-health-triage).
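The routing described above can be pictured as a lookup from category to destination. The following is a minimal sketch only: the category names come from this section, but the `Category` enum, the `ROUTES` table, and the destination labels are hypothetical illustrations, not identifiers defined by the foundation spec or by test-result-analyzer.

```python
from enum import Enum


class Category(Enum):
    """Routed categories for a non-pass, as listed in phase 3."""
    REAL_DEFECT = "real defect"
    FLAKE = "flake"
    TEST_BUG = "test bug"
    INFRA = "infra"
    STALE_DEP = "stale dep"
    CONFIG_DRIFT = "config drift"


# Destination labels are illustrative assumptions, not spec identifiers.
ROUTES = {
    Category.REAL_DEFECT: "phase-4-adapt-code",
    Category.FLAKE: "quarantine",
    Category.TEST_BUG: "phase-1-reviewable-case-change",
    Category.INFRA: "environment",
    Category.STALE_DEP: "environment",
    Category.CONFIG_DRIFT: "environment",
}


def route(category: Category) -> str:
    """Return the destination for one classified non-pass."""
    return ROUTES[category]
```

Keeping the table exhaustive over the enum means an unrouted category surfaces as a `KeyError` rather than silently falling through to a default.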
Phase 4 — Adapt code¶
For each confirmed real failure, dispatch test-code-adapter to apply the minimal correct change that satisfies the asserted behaviour, fixing the root cause. The fix re-enters phase 2 (re-execute); never assume green without re-running.
Loop or exit¶
Repeat from phase 2 until the exit conditions hold: every required case is green, no previously-green case regressed, and the coverage/mutation signal is acceptable per the foundation's coverage governance. Surface the per-turn state each round; stop when exit-ready, or hand back when a turn cannot make progress (for example a fix needs a product decision).
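As a rough illustration of the exit check and the loop shape, the sketch below models the three exit conditions as set comparisons. Everything named here (`TurnState`, `exit_ready`, `run_cycle`, the `max_turns` cap, and the callables passed in) is a hypothetical stand-in for the dispatched capabilities; the authoritative definitions live in the foundation spec.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TurnState:
    required: set[str]          # cases that must be green to exit
    green: set[str]             # cases green after the latest execution
    previously_green: set[str]  # cases that were green before this run
    coverage_ok: bool           # coverage/mutation signal judged acceptable


def exit_ready(state: TurnState) -> bool:
    """All three exit conditions from this section must hold together."""
    all_required_green = state.required <= state.green
    no_regression = state.previously_green <= state.green
    return all_required_green and no_regression and state.coverage_ok


def run_cycle(
    execute: Callable[[], TurnState],
    analyse: Callable[[TurnState], list[str]],
    adapt: Callable[[list[str]], bool],
    max_turns: int = 5,  # assumed safety cap, not part of the spec
) -> str:
    for _ in range(max_turns):
        state = execute()             # phase 2: always re-run before judging
        if exit_ready(state):
            return "exit-ready"
        failures = analyse(state)     # phase 3: classify the non-passes
        if not adapt(failures):       # phase 4: False means no progress was possible
            return "handed-back"
    return "handed-back"
```

Because every iteration starts with `execute`, a fix is never counted as green without a re-run; if the cap is reached right after an adaptation, the sketch hands back instead of assuming success.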
Hard rules¶
- Never make a case pass by weakening, deleting, skipping, or hard-coding to its expected value; resolve a red case by a code adaptation (phase 4) or, when the test was wrong, a reviewable case change (phase 1) — never a silent escape. This is the cycle's central integrity rule.
- Always write a failing regression case before fixing a confirmed defect, so the cycle accumulates coverage of real failures over time.
- Never retry a flaky test until it goes green; route a confirmed flake to quarantine-and-track, not to the gating signal.
- Never declare a turn complete without re-execution: a code change re-enters phase 2 and all cases must be green with no regression before exit.
- Never restate a phase's internals here; dispatch its capability (test-case-extractor / tier generators, quality-gate, test-result-analyzer, test-code-adapter) and orchestrate the loop. When a phase spec disagrees with this skill, the spec wins; propose a skill update rather than diverging.
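Part of the first rule can be checked mechanically by comparing per-case statuses before and after a fix. The sketch below is an assumption-laden illustration: `cheating_signals` and the status strings are hypothetical, and it only catches deleted or newly skipped cases. Weakened assertions and hard-coded expected values still need review of the change itself.

```python
def cheating_signals(before: dict[str, str], after: dict[str, str]) -> list[str]:
    """Flag cases that disappeared or became skipped between two runs.

    Both arguments map a case name to a status string such as
    "passed", "failed", or "skipped" (hypothetical status vocabulary).
    """
    signals = []
    for case, status in sorted(before.items()):
        if case not in after:
            signals.append(f"deleted: {case}")
        elif after[case] == "skipped" and status != "skipped":
            signals.append(f"skipped: {case}")
    return signals
```

An empty result does not prove the invariant held; it only rules out the two escapes this check can see.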
Resumability¶
Per spec/claude/resumable-work/, this skill is resumable: true. State is persisted to .resume/test-cycle-orchestrate/<run-id>.yml after every successful phase boundary and user-approval gate. On re-invocation, scan that directory for files with status: in_progress whose inputs: snapshot matches the current invocation; if one matches, prompt the operator with Resume run <run_id> from phase <phase> (last checkpoint <last_checkpoint_at>)? [resume / start-new / discard]. The state-file envelope and fail-closed semantics are load-bearing in the spec; don't duplicate those rules here.
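The re-invocation check can be sketched as a match over already-parsed state files. This is a minimal sketch under stated assumptions: the keys `status`, `inputs`, `run_id`, `phase`, and `last_checkpoint_at` are taken from the wording above, but the real envelope is defined by the resumable-work spec, and `find_resumable` and `resume_prompt` are hypothetical helpers. Loading the `.yml` files is left out, and so are the fail-closed semantics.

```python
from typing import Any, Optional


def find_resumable(
    states: list[dict[str, Any]], inputs: dict[str, Any]
) -> Optional[dict[str, Any]]:
    """Return the first in-progress state whose inputs snapshot matches."""
    for state in states:
        if state.get("status") == "in_progress" and state.get("inputs") == inputs:
            return state
    return None


def resume_prompt(state: dict[str, Any]) -> str:
    """Format the operator prompt quoted in this section."""
    return (
        f"Resume run {state['run_id']} from phase {state['phase']} "
        f"(last checkpoint {state['last_checkpoint_at']})? "
        "[resume / start-new / discard]"
    )
```

When `find_resumable` returns `None`, the sketch implies a fresh run; how the real skill handles unreadable or ambiguous state files is governed by the spec, not by this example.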