yaml-json-schema

Authors, audits, refactors, and validates YAML-encoded JSON Schema 2020-12 documents per spec/project/yaml-json-schema/. Scaffolds a new <slug>-v1.0.schema.yaml with all mandatory skeleton entries in spec order; audits existing *.schema.yaml files for dialect drift, missing identity, inline duplicates, and missing property descriptions; refactors duplicates into $defs; runs meta-validation and data-conformance validation. Invoke for: "scaffold a JSON Schema in YAML", "audit our schemas", "extract a $defs entry", "validate this YAML against a schema", or German equivalents ("YAML-Schema anlegen", "Schemata auditieren", "Daten gegen Schema validieren"). Skip for: OpenAPI/AsyncAPI Schema Objects, JSON-encoded schemas, feature-frontmatter rules (spec/project/feature/), or project-structure scaffolding (project-structure-apply). Supports resume on re-invocation per spec/claude/resumable-work/.

Use when

  • you want to scaffold a new JSON Schema in YAML
  • you want to audit existing schemas for dialect drift or missing identity
  • you want to refactor inline duplicates into a $defs entry
  • you want to validate YAML data against a schema

YAML JSON Schema

Operationalises spec/project/yaml-json-schema/<canonical_language>.md. The skill covers the full lifecycle of a YAML-encoded JSON Schema 2020-12 document — authoring a new schema, auditing an existing one against the spec, refactoring duplicates into $defs, and validating both the schema itself and any data files it governs — through one explicit, per-step interactive surface.

Why this is a skill, not an agent

  • Per-step user approval is the contract. Every file change — a new schema, a $defs extraction, a $ref rewrite, a meta-validation fix — is shown to the user before it lands. The audit step is read-only; the apply steps are sequences of approvals an agent's fire-and-forget shape can't carry.
  • Mid-flow type and identity choices are load-bearing. Picking type: object vs oneOf, choosing the right $id slug and version segment, and deciding whether a property warrants $defs extraction are interactive judgment calls against the spec. Skill form preserves the dialogue surface; an autonomous agent would obscure it.
  • The output flows back into the main conversation. The audit table, refactor proposals, and validation reports surface inline so the user can redirect the slug, the version bump, or the choice of validator without re-entering an agent boundary.
  • Counter-dimension considered: a narrower audit-only agent could specialise on reading every *.schema.yaml and producing a structured report, gaining on context-window protection. The high-impact part is the authoring and refactoring dialogue, not the report; skill wins.

When the spec isn't present in the target repository, fall back to the copy shipped by the nolte-shared plugin (read it at runtime from the plugin install path, or from the nolte/claude-shared repository). Never invent requirements that aren't in the spec.

User-language policy

Detect the user's language and respond in it. Generated content is split:

  • Schema file content ($schema, $id, title, property names, description strings, examples) is always English. JSON Schema is an interlingua and the spec treats title/description as technical-identifier territory so that consumers across the portfolio read them the same way regardless of conversation language.
  • Surrounding prose (audit findings, refactor proposals, validation reports surfaced to the user) follows the conversation language.
  • YAML header comments generated by the skill (# Schema for <object name>; consumed by <consumer>) are English so they read consistently in git diff and IDE views.

Preconditions

Before doing anything:

  • Confirm the working directory is a git repository (git rev-parse --is-inside-work-tree).
  • Locate spec/project/yaml-json-schema/. The spec lives either in the target repo or in the nolte-shared plugin; if neither is reachable, stop and ask the user which spec source to use.
  • Read the spec's canonical-language version (en.md) and treat it as authoritative. The translation (de.md) is structure-identical but not the source of truth.
  • Check whether Taskfile.yml exposes a lint target. The skill plugs schema meta-validation and data-conformance validation into the same target rather than inventing a new one; if no target exists, ask the user before scaffolding one.
  • Check for uncommitted changes under any directory the skill will write to (**/schemas/, Taskfile.yml, README.md or schemas/README.md). If the tree is dirty, report and ask whether to stash, commit, or abort — never overwrite uncommitted work.
  • Confirm a validator binary is available: check-jsonschema --version, ajv --version, or python -m jsonschema --version. If none is installed, stop and emit an install hint based on the repository's ecosystem (pip install check-jsonschema, npm install -g ajv-cli); the spec forbids silent skip.

Operations

Read references/operations.md when executing any of the operations below in detail.

1. Author a new schema (<slug>-v1.0.schema.yaml)

Gather slug, target object, consuming spec path, and on-disk location. Compose the $id URI deterministically. Draft the file with the ten top-level skeleton entries in spec order. Walk properties one at a time; scan for $defs extraction candidates. Validate with an example before writing. Update the repo README as a separate approved edit.
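
The deterministic $id composition can be sketched as follows. This is a minimal illustration, assuming the https://github.com/nolte/<repo>/blob/main/ namespace stated in the hard rules; the function name compose_schema_id, the repository name example-repo, and the example path are hypothetical.

```python
from pathlib import PurePosixPath

# Namespace taken from the hard rules; everything else here is illustrative.
ID_PREFIX = "https://github.com/nolte"


def compose_schema_id(repo: str, repo_relative_path: str) -> str:
    """Build an $id whose path after /blob/main/ mirrors the on-disk location."""
    path = PurePosixPath(repo_relative_path)
    if path.is_absolute() or ".." in path.parts:
        raise ValueError("path must be repository-relative")
    if not path.name.endswith(".schema.yaml"):
        raise ValueError("schema files must end in .schema.yaml")
    if path.parts[0] == "spec":
        raise ValueError("schemas must not be stored under spec/")
    return f"{ID_PREFIX}/{repo}/blob/main/{path.as_posix()}"


print(compose_schema_id("example-repo", "schemas/widget-v1.0.schema.yaml"))
# → https://github.com/nolte/example-repo/blob/main/schemas/widget-v1.0.schema.yaml
```

Deriving the $id from the file path, rather than asking for both, keeps the two from drifting apart.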

2. Audit existing schemas

Read-only walk of every **/*.schema.yaml. Classify each spec requirement as pass, missing, or drift per file. Checks: dialect, identity, skeleton order, file layout, description anchor, property descriptions, $ref shape, $defs naming, inline duplication, examples, README listing. Also runs operation 4 on every schema.
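
As an illustration of the pass / missing / drift classification, here is a minimal sketch covering only the dialect check and the first two skeleton positions. It assumes the schema has already been loaded into a dict that preserves key order; the function names are hypothetical and the remaining checks from the list above are not modelled.

```python
DIALECT = "https://json-schema.org/draft/2020-12/schema"


def classify_dialect(schema: dict) -> str:
    """Classify the $schema entry as pass, missing, or drift."""
    value = schema.get("$schema")
    if value is None:
        return "missing"
    return "pass" if value == DIALECT else "drift"


def classify_skeleton_head(schema: dict) -> str:
    """$schema must be the first keyed entry and $id the second."""
    if "$schema" not in schema or "$id" not in schema:
        return "missing"
    keys = list(schema)  # dicts keep insertion order, so this mirrors file order
    return "pass" if keys[:2] == ["$schema", "$id"] else "drift"


example = {
    "$id": "https://github.com/nolte/example-repo/blob/main/schemas/widget-v1.0.schema.yaml",
    "$schema": "https://json-schema.org/draft/2020-12/schema#",
}
print(classify_dialect(example), classify_skeleton_head(example))
# → drift drift
```

Both functions only read the dict, which matches the read-only contract of the audit step.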

3. Refactor (apply audit findings)

Walk audit findings one at a time; propose the minimal fix and ask for per-item confirmation. Handles dialect fix, $id composition, skeleton reorder, $defs extraction, relative-$ref rewrite, missing property descriptions, forbidden top-level keys, and additionalProperties justification.

4. Meta-validation (schema-against-meta-schema)

Validate every *.schema.yaml against the JSON Schema 2020-12 meta-schema. Validator preference order: task lint, then check-jsonschema --check-metaschema, then ajv compile --spec=draft2020, then python -m jsonschema. Report pass / fail (verbatim output) / blocked (no validator). Never silence failures.
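
The preference order above can be sketched as a small selection function. This is a hedged sketch, not the skill's actual implementation: pick_validator and its parameters are hypothetical, and the lookups are injectable so the logic can be exercised without any validator installed.

```python
import importlib.util
import shutil
from typing import Callable, Optional


def _module_available(name: str) -> bool:
    return importlib.util.find_spec(name) is not None


def pick_validator(
    has_lint_target: bool,
    which: Callable[[str], Optional[str]] = shutil.which,
    has_module: Callable[[str], bool] = _module_available,
) -> Optional[str]:
    """Return the label of the first available validator, or None if blocked."""
    if has_lint_target and which("task"):
        return "task lint"
    if which("check-jsonschema"):
        return "check-jsonschema"
    if which("ajv"):
        return "ajv"
    if has_module("jsonschema"):
        return "python -m jsonschema"
    return None  # caller reports the gate as "blocked", never as a pass


# Example with a fake PATH lookup where only ajv is installed.
fake_which = lambda exe: "/usr/bin/ajv" if exe == "ajv" else None
print(pick_validator(False, which=fake_which, has_module=lambda name: False))
# → ajv
```

Returning the label makes it straightforward to record which validator ran in the result table.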

5. Data validation (data-against-schema)

Associate every data file (*.yaml, *.yml, *.json) with its schema via sidecar comment, Refs comment, or .schemas-config.yaml mapping. Run the validator per association; report pass / fail / skipped / unassociated. Never invent associations.
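
A minimal sketch of sidecar-comment association follows. The exact comment syntax is defined by the spec and is not reproduced in this document, so the sketch assumes the widely used yaml-language-server modeline (# yaml-language-server: $schema=<path>) purely for illustration; find_sidecar_schema and associate are hypothetical helper names.

```python
import re
from typing import Optional

# ASSUMPTION: the sidecar comment uses the yaml-language-server modeline form.
SIDECAR_RE = re.compile(
    r"^#\s*yaml-language-server:\s*\$schema=(\S+)\s*$", re.MULTILINE
)


def find_sidecar_schema(text: str) -> Optional[str]:
    """Return the schema reference declared in a sidecar comment, if any."""
    match = SIDECAR_RE.search(text)
    return match.group(1) if match else None


def associate(text: str) -> str:
    """Report the association status without ever guessing a schema."""
    schema = find_sidecar_schema(text)
    return f"associated: {schema}" if schema else "unassociated"


data = "# yaml-language-server: $schema=schemas/widget-v1.0.schema.yaml\nname: demo\n"
print(associate(data))
# → associated: schemas/widget-v1.0.schema.yaml
print(associate("name: demo\n"))
# → unassociated
```

A file with no declared schema stays unassociated; the sketch never falls back to a filename-based guess.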

6. Lifecycle bump (revise an existing schema)

Diff proposed change; classify as minor (backward-compatible) or major (breaking). Compose the new filename and $id segment. Write alongside the previous file; never edit in place for externally-referenced schemas. Surface consumer migration list; delegate release-note entry to release-notes-curate.
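
The filename side of a bump can be sketched as below, assuming the <slug>-v<major>.<minor>.schema.yaml naming used throughout this skill. The function name bump_filename and the slug character set (lowercase alphanumerics separated by hyphens) are assumptions made for the example.

```python
import re

# ASSUMPTION: slugs are lowercase alphanumeric words joined by hyphens.
NAME_RE = re.compile(
    r"^(?P<slug>[a-z0-9]+(?:-[a-z0-9]+)*)-v(?P<major>\d+)\.(?P<minor>\d+)\.schema\.yaml$"
)


def bump_filename(filename: str, breaking: bool) -> str:
    """Return the filename for the next version; the old file stays untouched."""
    match = NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a versioned schema filename: {filename!r}")
    major, minor = int(match["major"]), int(match["minor"])
    if breaking:
        major, minor = major + 1, 0  # major bump resets the minor segment
    else:
        minor += 1  # backward-compatible change
    return f"{match['slug']}-v{major}.{minor}.schema.yaml"


print(bump_filename("widget-v1.0.schema.yaml", breaking=False))
# → widget-v1.1.schema.yaml
print(bump_filename("widget-v1.3.schema.yaml", breaking=True))
# → widget-v2.0.schema.yaml
```

The same version segment would then be substituted into the $id, so filename and $id move together.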

7. Re-audit

Re-run operations 2, 4, and 5 end-to-end; present a fresh grouped summary. Call out any remaining missing, drift, fail, or unassociated items.

Examples

  • Read examples/01-meta-validation.md when running a meta-validation pass for the first time on a new schema.
  • Read examples/02-data-against-schema.md when wiring up sidecar-comment-based schema associations for the first time.
  • Read examples/03-schema-lifecycle.md when revising an existing schema (minor or major bump).

Gotchas

  • Validator precedence: When multiple validators are installed (task lint, check-jsonschema, ajv, python -m jsonschema), the skill uses the first available in that order. A task lint target with check-jsonschema configured overrides direct check-jsonschema invocation; mention which validator ran in the result table so a re-run can detect drift.
  • YAML 1.1 vs YAML 1.2 booleans: Yes/No/On/Off parse as booleans under YAML 1.1 but as strings under YAML 1.2. Schemas authored under loaders that default to 1.1 (older Ruby tooling) can land with additionalProperties: No silently re-typed; emit a YAML 1.2 directive (%YAML 1.2\n---) at the top of any file the skill writes to make the version explicit.
  • $schema URI trailing fragment: https://json-schema.org/draft/2020-12/schema and https://json-schema.org/draft/2020-12/schema# identify the same document, but some validators compare dialect URIs as plain strings and fail to recognise the # form. The skill always emits the no-fragment form; never append #.
  • #/$defs/ JSON Pointer escaping: a $defs key containing ~ or / must be escaped (~0, ~1) in the $ref fragment. The skill rejects $defs names containing those characters at authoring time rather than silently emitting escape sequences.
  • format keyword is annotation-only by default in JSON Schema 2020-12. Validators differ in whether they enforce it: ajv, for example, checks format only when the ajv-formats plugin is loaded (with ajv-cli, -c ajv-formats); --all-errors and --strict=false do not turn format checking on by themselves. Don't promise the user that format: email will fail invalid data unless the validator config enables enforcement; record the gap in the validation report.
  • unevaluatedProperties vs additionalProperties: additionalProperties: false only recognises properties declared by properties and patternProperties in the same schema object. Under allOf composition it cannot see the properties contributed by the allOf members, so it rejects data that uses them; unevaluatedProperties: false does take allOf members into account. The spec settles this: a closed object schema that composes via allOf MUST express the closed-shape guarantee with unevaluatedProperties: false, and the meta-validation step rejects an allOf-composed schema that relies on additionalProperties: false alone. Flag any such schema during the audit and refactor it to unevaluatedProperties: false rather than surfacing it as an unresolved choice.
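  • enum does not distinguish integer from number: enum values are matched by JSON equality, and JSON Schema 2020-12 compares numbers by mathematical value, so an enum entry 1 matches 1.0 in data (and 1.0 also satisfies type: integer). A quoted "1" is a string and never matches either. Watch for this when refactoring inline enums into $defs, and keep the scalar style of each enum entry unchanged.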
  • Relative $ref works locally and breaks remotely: a validator run from the schema's directory will resolve $ref: ../other.schema.yaml fine and pass meta-validation. The same schema imported by $id from a different repo will silently fail to resolve the $ref. The skill rejects relative $ref at authoring time, regardless of local test success.
  • title and description are not validation keywords — they are annotations. The skill still treats their presence as MUST per the spec, but emit a clear "annotation, not enforced" note when the user asks why the validator passes a schema missing a description.
  • Meta-validation passes a schema with no properties defined under type: object. That's valid JSON Schema (an open object schema) but almost never the author's intent. The audit reports this as a SHOULD finding ("schema describes nothing"), not a MUST violation.
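
The JSON Pointer escaping gotcha above can be illustrated with a short sketch. The escape rules (~ becomes ~0, / becomes ~1) come from RFC 6901; the helper names escape_pointer_token and defs_ref are hypothetical, and rejecting such names mirrors the authoring-time behaviour described in the list.

```python
def escape_pointer_token(token: str) -> str:
    """Escape one JSON Pointer reference token per RFC 6901."""
    # "~" must be escaped first, otherwise the "~" in "~1" would be re-escaped.
    return token.replace("~", "~0").replace("/", "~1")


def defs_ref(name: str) -> str:
    """Build a local $ref, rejecting names that would need escaping."""
    if "~" in name or "/" in name:
        raise ValueError(
            f"$defs name {name!r} needs escaping "
            f"(#/$defs/{escape_pointer_token(name)}); choose another name"
        )
    return f"#/$defs/{name}"


print(escape_pointer_token("a/b~c"))
# → a~1b~0c
print(defs_ref("Address"))
# → #/$defs/Address
```

Surfacing the escaped form in the error message shows the user what the alternative would have looked like without ever writing it to a file.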

Hard rules

  • Never emit a schema file without $schema: https://json-schema.org/draft/2020-12/schema as the first keyed entry. Other dialects are forbidden by the spec.
  • Never emit a schema file without an $id URI under https://github.com/nolte/<repo>/blob/main/ as the second keyed entry, whose path after /blob/main/ matches the file's repository-relative on-disk location. A schema with no $id, or one carrying a non-GitHub namespace, is unreferenceable and forbidden by the spec.
  • Never emit a schema whose description lacks the literal substring Refs spec/<topic>/<slug>/. Traceability to the consuming spec is a MUST.
  • Never emit a relative-path $ref ($ref: ../other.schema.yaml#/$defs/Foo). Only #/$defs/<Name> and absolute https://github.com/nolte/… URIs are permitted.
  • Never mix dialects within a single schema document. A $ref from a 2020-12 schema into a draft-07 schema is forbidden; transcribe the relevant fragment instead.
  • Never store a schema under spec/. The spec/ tree is governance-only.
  • Never edit a schema in place once its $id is referenced from outside its repository by absolute URI. Bump the version (-v<major>.<minor+1> or -v<major+1>.0) and migrate consumers explicitly.
  • Never silence a meta-validation or data-validation failure with --no-fail, || true, or equivalent. A failing schema or data file is a stop-and-fix event.
  • Never treat absence of an installed validator as a passing gate. Emit an install hint and report the gate as blocked.
  • Never perform silent writes. Every file change requires explicit per-item user confirmation; every audit finding is presented before any fix is written.
  • Never take on project-structure-apply work. If the audit reveals that schemas/ directories are absent at the structural level, the README is missing entirely, or the Taskfile has no lint target, stop and route the user to project-structure-apply — don't silently scaffold those out of scope.
  • Never modify the spec while applying it. If a real-world need conflicts with spec/project/yaml-json-schema/, report it and ask the user to update the spec via the nolte-shared:spec skill before proceeding.
  • Never invoke this skill on OpenAPI Schema Objects, AsyncAPI Schema Objects, or JSON-encoded JSON Schema documents. Those formats are explicitly out of scope per the spec's §Delimitation.
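
Two of the hard rules above, the permitted $ref shapes and the Refs anchor in description, lend themselves to a mechanical check. The following is a minimal sketch over an already-loaded schema dict; the function names and the regular expressions are assumptions for illustration and are not taken from the spec.

```python
import re
from typing import Iterator, List

LOCAL_REF = re.compile(r"^#/\$defs/[^/~]+$")
ABSOLUTE_PREFIX = "https://github.com/nolte/"
# ASSUMPTION: <topic> and <slug> are single path segments without whitespace.
REFS_ANCHOR = re.compile(r"Refs spec/[^/\s]+/[^/\s]+/")


def collect_refs(node) -> Iterator[str]:
    """Yield every string-valued $ref found anywhere in the schema."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "$ref" and isinstance(value, str):
                yield value
            else:
                yield from collect_refs(value)
    elif isinstance(node, list):
        for item in node:
            yield from collect_refs(item)


def hard_rule_violations(schema: dict) -> List[str]:
    """Return human-readable violations; an empty list means both rules hold."""
    problems = []
    for ref in collect_refs(schema):
        if not (LOCAL_REF.match(ref) or ref.startswith(ABSOLUTE_PREFIX)):
            problems.append(f"forbidden $ref shape: {ref}")
    if not REFS_ANCHOR.search(str(schema.get("description", ""))):
        problems.append("description lacks 'Refs spec/<topic>/<slug>/'")
    return problems


schema = {
    "description": "A widget. Refs spec/project/widget/",
    "properties": {
        "size": {"$ref": "#/$defs/Size"},
        "owner": {"$ref": "../other.schema.yaml#/$defs/Foo"},
    },
}
print(hard_rule_violations(schema))
# → ['forbidden $ref shape: ../other.schema.yaml#/$defs/Foo']
```

The check only reports; any fix would still go through the per-item confirmation the hard rules require.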

Multi-model testing

Examples and operations in this skill are verified on Claude Sonnet 4.6 as the default model; spot-checked on Haiku 4.5 for cost-sensitive runs; Opus 4.7 is appropriate for high-stakes audits that require deeper reasoning. The skill body has no model-specific assumptions beyond standard tool-call semantics.

Resumability

Per spec/claude/resumable-work/, this skill is resumable: true. State is persisted to .resume/yaml-json-schema/<run-id>.yml after every successful per-step user-approval gate (each property authored in the Author walk, each finding fixed in the Refactor walk, the separately approved README edit) and after each named phase boundary (audit pass complete). On re-invocation, scan that directory for files with status: in_progress whose inputs: snapshot matches the current invocation; if one matches, prompt the operator with Resume run <run_id> from phase <phase> (last checkpoint <last_checkpoint_at>)? [resume / start-new / discard]. The state-file envelope (schema_version, run_id, inputs, phase, decisions[], status, …) and the fail-closed semantics on schema or YAML errors are load-bearing in the spec; don't duplicate those rules here.
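
The re-invocation scan described above can be sketched as follows. The sketch works on state envelopes that have already been parsed into dicts (YAML loading and the fail-closed error handling are left to the spec); the field names come from the envelope listed above, while find_resumable, resume_prompt, and the sample values are hypothetical.

```python
from typing import List


def find_resumable(states: List[dict], inputs: dict) -> List[dict]:
    """Keep envelopes that are in progress and whose inputs snapshot matches."""
    return [
        state
        for state in states
        if state.get("status") == "in_progress" and state.get("inputs") == inputs
    ]


def resume_prompt(state: dict) -> str:
    """Format the operator prompt for one matching run."""
    return (
        f"Resume run {state['run_id']} from phase {state['phase']} "
        f"(last checkpoint {state['last_checkpoint_at']})? "
        "[resume / start-new / discard]"
    )


# Illustrative envelopes; the values are made up for the example.
states = [
    {"run_id": "r1", "status": "done", "inputs": {"op": "audit"},
     "phase": "audit", "last_checkpoint_at": "2025-01-01T10:00:00Z"},
    {"run_id": "r2", "status": "in_progress", "inputs": {"op": "audit"},
     "phase": "refactor", "last_checkpoint_at": "2025-01-02T09:30:00Z"},
]
for match in find_resumable(states, {"op": "audit"}):
    print(resume_prompt(match))
# → Resume run r2 from phase refactor (last checkpoint 2025-01-02T09:30:00Z)? [resume / start-new / discard]
```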