backstage-catalog-generate¶
Generates a schema-valid Backstage catalog-info.yaml from an existing project by inferring per-kind MUST-floor fields from repo signals, confirming the rest with the operator, and self-validating.
Generates a schema-valid Backstage catalog-info.yaml from an existing software project, conforming to spec/project/backstage-catalog-generation/. Inspects the repo (language/structure, remote slug, CODEOWNERS, OpenAPI/AsyncAPI/GraphQL/gRPC specs, colocated docs), infers the per-kind MUST-floor fields, confirms what it cannot infer (owner, spec.system, dependsOn, lifecycle) with the operator, writes the descriptor at the repo root, and self-validates. Invoke when the user asks to \"generate a catalog-info.yaml\", \"onboard this repo into Backstage\", \"create a Backstage Software Catalog entity\", \"add this service to our developer portal\", or German equivalents (\"erzeuge eine catalog-info.yaml\", \"nimm dieses Repo in Backstage auf\"). Optionally emits a Tech Radar TechRadarLoaderResponse JSON (never a catalog entity). Don't use to configure a Backstage backend / catalog.providers, install the Tech Radar UI plugin, or ingest Group/User org entities. Single-shot; resume not applicable.
- Plugin:
nolte-shared - Phase: 4 Build (
build) - Tags:
scaffolding - Source: skills/backstage-catalog-generate/SKILL.md
Use when¶
- you want to onboard an existing repository into a Backstage Software Catalog
- you want a schema-valid catalog-info.yaml inferred from a project's signals
- you want a Tech Radar TechRadarLoaderResponse JSON generated from a tech inventory
See also¶
Backstage Catalog Generate¶
Turns an existing software project into a schema-valid Backstage catalog-info.yaml at the repository root — inferring what the repo reveals, asking the operator for what it does not, and self-validating before it presents the result. The authoritative rules live in spec/project/backstage-catalog-generation/; this skill operationalises that spec. When the spec and this body disagree, the spec wins.
Why this is a skill, not an agent¶
- Operator-invoked, conversational. Reached as
/nolte-shared:backstage-catalog-generateagainst the current repo; the operator drives it directly. - Mid-flow confirmation is the contract. Owner resolution,
spec.system,dependsOn, and an uncertainlifecyclecannot be inferred safely from one repo; the skill surfaces each as a confirmation gate rather than guessing. An agent's fire-and-forget shape would lose those gates. - The result flows back into the conversation. The written descriptor path and the list of inferred-vs-confirmed values land in the operator's context for the next step (committing, opening the register-component PR).
- Counter-dimension: the repo inspection could run as an isolated reader, but the load-bearing dimensions — operator invocation and the per-field confirmation gates — make this a skill.
German trigger phrases¶
- „erzeuge eine catalog-info.yaml", „nimm dieses Repo in Backstage auf", „erzeuge einen Backstage-Catalog-Eintrag für dieses Projekt", „füge diesen Service unserem Developer-Portal hinzu", „generiere ein Tech-Radar-JSON"
Repo signals → fields¶
The skill reads these signals and maps them to descriptor fields (every inferred value is recorded as inferred, distinct from confirm):
| Signal | Field | Notes |
|---|---|---|
| repo / project name | metadata.name |
slugify, strip leading/trailing non-alphanumerics, cap 63 → must satisfy isValidObjectName |
| primary language / structure | spec.type |
service / website / library (Component); convention, not a fixed enum |
git remote slug |
github.com/project-slug, backstage.io/source-location |
org/repo; source-location is url:…/<repo>/ with trailing slash |
CODEOWNERS / team slug |
spec.owner |
bare slug → Group; an individual needs an explicit user: prefix |
openapi.yaml / asyncapi.yaml / *.graphql / *.proto |
an API entity + spec.providesApis |
API spec.definition via $text: ./<relative-path>, spec.type = openapi/asyncapi/graphql/grpc |
colocated docs/ / mkdocs.yml |
backstage.io/techdocs-ref: dir:. |
only when docs are colocated |
README description / package.json etc. |
metadata.description, metadata.tags |
tags match ^[a-z0-9:+#]+(\-[a-z0-9:+#]+)*$ |
spec.lifecycle, spec.system, spec.dependsOn, and spec.domain are never inferred silently — default lifecycle only on a justifying signal, otherwise confirm; system/dependsOn/domain always require operator input.
Operations¶
Generate a catalog-info.yaml (default)¶
- Inspect. Read the repo signals above. Determine the entity kind(s) — almost always a single
Component, plus anAPIentity when an interface definition is found. Consult the per-kind tables inspec/project/backstage-catalog-generation/§"Entity kinds" for the required/optional fields of any other kind. - Infer. Fill the MUST-floor for each kind. For a Component that is:
apiVersion: backstage.io/v1alpha1,kind: Component, a validmetadata.name,spec.type,spec.lifecycle,spec.owner. Mark each value inferred or needs-confirm. - Confirm. Present the inferred descriptor and the open questions (owner resolution,
lifecycle, optionalsystem/dependsOn) to the operator in one pass. Never emit a guessedsystem/dependsOn/domain; never emit an owner that cannot be confirmed to resolve — flag it as operator-action-required instead. - Emit. Write
catalog-info.yamlat the repository root (matching the discovery default path). Use multi-entity---separators when emitting a Component plus its API. Quote any numeric/boolean-looking annotation value so it stays a string. - Self-validate. Run the offline
@roadiehq/backstage-entity-validatorwhen available (npx @roadiehq/backstage-entity-validator catalog-info.yaml); otherwise apply the §Hard rules checklist below as a deterministic pre-flight. Treat reference-target existence (owner/system/API) as unverified by the offline check — surface those references as claims to confirm, not validated facts. - Report. Tell the operator the written path, the inferred-vs-confirmed split, and any references still requiring an existing Group/User/System in the target catalog.
Generate a Tech Radar JSON (optional secondary)¶
When the operator asks for a Tech Radar output: emit a TechRadarLoaderResponse JSON (quadrants, rings, entries) per spec/project/backstage-catalog-generation/ §"The Tech Radar data model" — not a catalog-info.yaml, and never modelled as a catalog entity. Express each entry's ring placement through its timeline (a snapshot with ringId and a coercible date), not as a direct field. Target the @backstage-community package model; never reference the deprecated @backstage Tech Radar package or the dead backstage.io/docs/features/techradar/ URLs.
Gotchas¶
Backstage's real validation is stricter and less obvious than its prose docs — these are the quirks that bite a generator:
- The offline validator does not check reference-target existence.
@roadiehq/backstage-entity-validatorvalidates schema and field-format only; anowner/system/API reference can pass and still dangle at processing time. Always surface references as claims to confirm against the target catalog, never as validated facts. - Numeric/boolean-looking annotation values must be YAML-quoted. Annotation values are checked as
typeof string; an unquotedgithub.com/user-id: 123456orbackstage.io/orphan: trueparses as a number/bool and fails. Quote them. metadata.nameis stricter than "alphanumerics separated by[-_.]". The first and last character must be alphanumeric — a slug that ends in-(common after naive slugification) is rejected byFieldFormatEntityPolicyat processing time, after it passed coarse ingestion.Resourcehas nolifecycleeven thoughComponentandAPIdo — emitting it is invalid. Symmetrically,Group.childrenandUser.memberOfmust be present even when empty ([]).- There is no official
backstage-clivalidate command. Validation is either the@roadiehqvalidator or aPOSTto a running backend's/api/catalog/validate-entity(wherelocationgoes in the JSON body, not an HTTP header). - The Tech Radar moved to
@backstage-community. The old@backstageTech Radar package and thebackstage.io/docs/features/techradar/URLs are dead — never cite them.
Hard rules¶
- Never author auto-set or output-only fields:
backstage.io/managed-by-location,backstage.io/managed-by-origin-location,backstage.io/orphan,metadata.uid,metadata.etag,relations,status. - Never emit
spec.lifecycleon Resource, System, Domain, Group, or User — only Component and API carry it. - Always include empty-but-required keys:
spec.children: []on a Group with no known children;spec.memberOf: []on a User with no known memberships. - API entities must carry a non-empty
spec.definition, supplied via a$text:placeholder, and the providing Component lists the API underprovidesApis. metadata.namemust satisfyisValidObjectName: length 1–63, first and last character alphanumeric, separators[-_.]interior only.metadata.namespaceis a DNS label (lowercase, no underscore/dot).- Owner disambiguation: a bare owner reference resolves to a Group; emit an explicit
user:-prefixed reference for an individual. Prefer the fullkind:namespace/nameform for cross-system robustness. - Never use a deprecated annotation:
backstage.io/github-actions-id→github.com/project-slug;backstage.io/definition-at-location→ placeholder substitution;jenkins.io/github-folder→jenkins.io/job-full-name. - Never branch on the old-vs-new frontend system —
catalog-info.yamlis a backend, frontend-agnostic concern. - Distinguish inferred from confirmed in every output; never silently emit a value the operator must own.
- When
spec/project/backstage-catalog-generation/disagrees with this body, the spec wins; propose updating this skill rather than diverging.