agent-review¶

Prüft einen Claude-Code-Agent gegen die Spec und erzeugt einen umsetzbaren Review-Plan unter .audits/agent-review/.

Reviews a Claude Code agent against spec/claude/agent-management/ and spec/claude/skill-vs-agent/, and emit an actionable review plan per spec/claude/review-plan/ under .audits/agent-review/ keyed by the target agent's name. Invoke when the user asks "review this agent", "audit a specific agent file", "check whether this agent is spec-compliant", or "agent review for a specific agent". Also handles closing an existing review plan once every item is addressed — "close the agent review plan for a specific agent". Also handles equivalent German-language requests. Do NOT use for skill review (use skill-review) or for pull-request-level review (review skill). Supports resume on re-invocation per spec/claude/resumable-work/.

Plugin: nolte-shared
Phase: 5 Review (review)
Tags: review
Quelle: skills/agent-review/SKILL.md

Anwenden wenn¶

you want to review a specific agent file for spec compliance
you want an actionable review plan for an agent
you want to close an existing agent-review plan once every item is addressed

Nicht anwenden wenn¶

You want to review a skill (SKILL.md), not an agent → skill-review

Siehe auch¶

skill-review

Referenziert von¶

Agent Review Skill¶

Operationalizes spec/claude/agent-review/ — reviews one Claude Code agent against its authoring specs and persists the result as a processable plan under .audits/agent-review/. The plan is the deliverable; the skill is the procedure that produces and, later, retires it.

German trigger phrases¶

This skill also triggers on equivalent German-language requests, including:

"prüfe diesen Agent"
"Agent-Review für X"
"Audit von agents/"
"ist dieser Agent spec-konform"
"schließe den Agent-Review-Plan"

Why this is a skill, not an agent¶

Mid-flow interactivity — scope confirmation (which agent, narrowed aspect?) and item-closure decisions happen with the user in the loop; an agent's fire-and-forget contract would lose that.
Persistent on-disk output is the contract — the review-plan artifact under .audits/ must survive past the current turn and be worked off incrementally; skills own persistent state, agents return structured reports.
Orchestrator, potentially chains to other skills — after the plan is closed, the user may run pull-request-create in the same thread; the skill-orchestrates pattern (see skill-vs-agent) defaults the orchestrator to skill form.
Counter-dimension considered: context-window impact from reading three specs plus the target agent would bias toward an agent, but an agent file is a single markdown and each spec is one file — the read volume is bounded and the user wants visibility.
This skill is the sibling of skill-review; the two share the review-plan output shape and differ only in which authoring spec drives the checks.

User-language policy¶

Detect the user's language from their message and reply in that language.
The plan file uses English section headings (## Scope, ## Summary, ## Findings, ## Processing log) regardless of conversation language — review-plan requirement so downstream tooling can grep deterministically.
Finding prose (the one-line statement, Where, Fix, Verify) may follow the conversation language.
Commit messages produced by this skill stay in English for portfolio consistency.

Preconditions¶

Before any operation, verify with Read that these files exist in the current repository:

spec/claude/agent-management/<canonical>.md
spec/claude/skill-vs-agent/<canonical>.md
spec/claude/review-plan/<canonical>.md
spec/claude/agent-review/<canonical>.md

<canonical> comes from spec/.spec-config.yml (canonical_language); fall back to en if the config is absent. If any of the four is missing, stop and tell the user — the specs are the input; without them there is nothing to review against. Do not improvise replacements, do not read a translation when the canonical file exists.

Also verify .audits/ exists and is tracked by git. If absent, create .audits/agent-review/.gitkeep as part of the first plan write so the folder is not lost.

Operations¶

1. `run <agent-name>` — produce a review plan¶

Interactive. Confirm each decision with the user before acting on it.

Resolve the target. Accept agents/<name>.md or a runtime path .claude/agents/<name>.md / ~/.claude/agents/<name>.md. If the user gave only a name without a path prefix, default to agents/<name>.md. If multiple candidates match, list them back and ask.
Check for an existing plan at .audits/agent-review/<name>.md. If present:
If status is open or in-progress, tell the user a live plan exists and ask: resume it, supersede it (overwrite), or abort. Do not silently overwrite.
If status is complete, ask whether to rerun (overwrite — per review-plan one-plan-per-target invariant).
Narrow scope if the user asks ("frontmatter only", "tools only", "rationale only"). Record the narrowing verbatim — it goes into the plan's ## Scope.
Read the review surface in this order: YAML frontmatter → markdown body (role, output format, procedure, rationale) → any external supporting asset referenced from the body. Agents are single self-contained files: per agent-management §Structure there is no agents/<name>/ sibling folder, and a nested companion markdown file is itself a finding (recursive discovery registers it as a phantom agent). A missing referenced file is a finding, not a stop.
Apply the checks from spec/claude/agent-review/, in the spec's declared order:
Frontmatter fields: name matches filename, description names concrete triggers, distribution is exactly plugin or project.
tools scoping: declared-vs-used bidirectional check (declared-unused → Warning, used-undeclared → Critical), read-only-agent invariant (if the description verbs are review / audit / research / lint / report, tools must NOT contain Edit, Write, Bash, or NotebookEdit → otherwise Critical).
System-prompt body: single responsibility, output shape stated, no hard-coded absolute paths, English-only frontmatter and body.
No-Skill-dispatch check: Grep the body for Skill(, Skill tool, or equivalent phrasings → any match is a Critical per skill-vs-agent.
Subagent-boundary check: Grep the body for Agent(, subagent_type, Task(, or equivalent dispatch phrasings → any match is a Critical (Claude Code subagents can't spawn subagents) per agent-review §Subagent-boundary checks.
Model-choice check (agent-review §Model-choice checks): when model is declared, its value is exactly opus, sonnet, or haiku → otherwise Critical; a pinned model without a stated rationale is a Warning; an implausible pin (read-only/reporting agent on opus, or complex audit/planning agent on haiku) without rationale is a Suggestion.
Spec-anchor check (agent-review §spec-driven-development): the body cites at least one spec/... path → absence is a Critical per spec/project/spec-driven-development/.
Plugin-distribution constraint check (agent-review §Plugin-distribution constraint checks): when distribution: plugin, the frontmatter does NOT set hooks, mcpServers, or permissionMode → any of those is a Critical; for distribution: project those fields are valid. Flag a Warning when the body describes behavior requiring those fields but the distribution is plugin, or when a distribution: project agent references a plugin-co-located asset (${CLAUDE_PLUGIN_ROOT}).
Rationale section: at least one decisive dimension named → absence is Critical; at least one counter-dimension named → absence is Suggestion.
Referenced assets exist.
Duplicate-prevention: Grep the description: line of every other agents/*.md and skills/*/SKILL.md for semantic overlap with the target — keyword hits are candidates, not verdicts; read each candidate and judge.
Info observations for body-length or asset-factoring opportunities.
Map severities. MUST failure → Critical, SHOULD failure → Warning, applicable MAY → Suggestion, observation without a rule → Info. The severity vocabulary itself is fixed by spec/claude/review-plan/ §Severity scale — Title Case, no abbreviations, no portfolio-local extensions. Never promote Vale / markdown-style observations above Info.
Draft the plan from templates/plan.template.md, filling every field. repo-revision is git rev-parse HEAD (or unknown). created is today's ISO date.
Write the plan to .audits/agent-review/<name>.md. Confirm the path back to the user. Do not mark any item - [x] on creation.
Stage and commit only if the user asks; otherwise leave as a working-tree change.

2. `update <agent-name>` — check off processed items¶

When the user reports closures:

Read .audits/agent-review/<name>.md.
Verify each claimed closure by re-running the specific check (re-read the file, re-grep, re-inspect tools list). If verification fails, leave the item - [ ] and say so.
Mark verified items - [x] in place.
Append one line to ## Processing log per closure: YYYY-MM-DD — <item-shorthand> — <action> — verified: <method>.
Flip status to in-progress on the first closure if it was open.
Do not commit automatically — show the diff and let the user commit.

3. `close <agent-name>` — delete the plan after full processing¶

Read .audits/agent-review/<name>.md.
Refuse if any open - [ ] Critical remains. Warning / Suggestion / Info may be closed via → deferred: <issue-url>. Offer to help open tracking issues if missing.
Count findings at creation time per severity (from ## Summary, not current state).
Delete the plan file.
Compose the deletion commit message exactly: review(agent-review): close <agent-name>—<C>C/<W>W/<S>S/<I>I in the subject; body lists deferred-issue URLs and the repo-revision. No hook bypass, no signing skip.
Run the commit only if the user confirms. Show the message first.

Output — plan shape¶

Reference spec/claude/review-plan/<canonical>.md for the authoritative format. Never restate its rules in the plan itself. The template at templates/plan.template.md is the starting point. Every finding uses the four-line structure (statement + Where / Fix / Verify) and cites a spec requirement in the bracketed prefix.

Examples¶

Read examples/01-fresh-review.md when running the first end-to-end run on a new agent target.
Read examples/02-update-after-fix.md when closing individual findings after the author has pushed fixes.
Read examples/03-close-plan.md when all items are resolved and you are ready to delete the plan.

Gotchas¶

Plan file is local-only until committed: the .audits/agent-review/<name>.md plan survives only in the working tree until the user explicitly commits it; if the session ends or the branch is switched before committing, the plan is silently lost — stage and commit early or remind the user.
Declared-vs-used check requires the current file version: the bidirectional tools check (declared-unused → Warning, used-undeclared → Critical) must be run against the on-disk agent file at review time; stale cached reads produce false positives or missed Critical findings — always re-read the agent file immediately before running step 5.2.
spec/.spec-config.yml must be read first: <canonical> depends on canonical_language in that file; skipping the read and defaulting to en silently misroutes reviews in repos where the canonical language differs — read the config before resolving any spec path.

Resumability¶

Per spec/claude/resumable-work/, this skill is resumable: true. State is persisted to .resume/agent-review/<run-id>.yml after every successful user-approval gate and after each named phase boundary. On re-invocation, scan that directory for files with status: in_progress whose inputs: snapshot matches the current invocation; if one matches, prompt the operator with Resume run <run_id> from phase <phase> (last checkpoint <last_checkpoint_at>)? [resume / start-new / discard]. The state-file envelope (schema_version, run_id, inputs, phase, decisions[], status, ...) and the fail-closed semantics on schema or YAML errors are load-bearing in the spec; don't duplicate those rules here.

Hard rules¶

One plan per target. A rerun supersedes; never edit a previous run's plan into a new one.
No finding without a spec citation. Real issues that no agent-management / skill-vs-agent rule covers are recorded as Info with a note that the spec may need to grow — never promoted to Warning / Critical.
Never delete a plan with an open Critical. Warning / Suggestion / Info may be deferred; Critical must land or be downgraded (which requires a spec change, not a reviewer's choice).
No invention in frontmatter. Unknown git SHA → unknown. Missing reference → the finding describes the gap.
English section headings, English commit messages. Prose inside findings may follow the user's language; the structural contract stays English-only.
No cross-target batching. One agent per plan, one plan per run. For "review all agents", loop and emit one plan each.
This skill reviews the agent artifact, not its live behavior. Do not dispatch the agent under review to see what it does — out of scope and risks side effects.
Tool-scope check is bidirectional and strict. Declared-but-unused is a Warning; used-but-undeclared is a Critical (the agent will fail to run). Do not soften either side.