The salvageable idea in tech-debt-audit is not 'ask an LLM to find technical debt.' That version fails fast. It becomes generic, finds the obvious, and leaves a list nobody wants to triage.
The better idea is a protocol. Before judging, the skill reads the repo shape, manifests, docs, churn, hot paths, largest files, and high-change files. Then it audits across named debt dimensions. Every concrete finding needs path:line. The final output is TECH_DEBT_AUDIT.md, a file the team can review, commit, and rerun against later. That is worth saving as a magazine article because it turns a loose code-review prompt into an inspectable audit workflow.
Why this seed is worth saving
The old headline used the wrong frame. Technical debt audits do not need grand language. They need proof. The README and SKILL.md define a small but opinionated Claude Code skill with enough structure to change how a team audits a repo.
The strong part is not the slash command. It is the constraints: orient first, cite every finding, avoid rewrites, avoid filler, record non-findings, and write a persistent artifact. Those choices attack common LLM review failures directly.
Install it as a skill, not a magic reviewer
The personal install path is explicit:
mkdir -p ~/.claude/skills/tech-debt-audit
curl -o ~/.claude/skills/tech-debt-audit/SKILL.md https://raw.githubusercontent.com/ksimback/tech-debt-skill/main/SKILL.md
For one repository, vendor it into the project:
mkdir -p .claude/skills/tech-debt-audit
curl -o .claude/skills/tech-debt-audit/SKILL.md https://raw.githubusercontent.com/ksimback/tech-debt-skill/main/SKILL.md
Then verify the skill list:
claude --print "/skills" | grep tech-debt-audit
For team use, the project-level path is usually better. It makes the audit protocol reviewable in the same repo it judges.
Run it only after choosing the scope
The default interface is simple:
/tech-debt-audit
That runs against the current repository and writes TECH_DEBT_AUDIT.md in the root. For very large repos, the README supports a subtree:
/tech-debt-audit src/payments
Use the full repo when you need a baseline. Use a subtree when the monorepo is too large, when one domain has most of the risk, or when a first pass would burn context before findings get good. The important choice is made before the slash command: what boundary will produce findings someone can act on?
Phase 1 is the quality gate
The skill's first phase is the reason this candidate has value. It requires reading the README, package manifest, architecture docs, directory structure, entry points, hot paths, and cold corners before any judgment. It also requires churn checks:
git log --oneline -200
git log --stat --since="6 months ago"
Then it asks for the 20 largest files and 20 most frequently modified files. The overlap is where debt often hides. A real adoption test should pause after Phase 1 and ask: what surprised the audit, and what should it investigate that the standard dimensions do not mention? The README recommends that interruption on first run, and it is the right habit.
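The two rankings and their overlap can be approximated with plain git and standard Unix tools. This is a minimal sketch, not the skill's own implementation: the function names top_churn, top_largest, and overlap are hypothetical, and it assumes a POSIX shell, a git work tree, and paths without newlines.

```shell
# Hypothetical helpers; the skill does not ship these.

# Rank paths by how often they appear on stdin (one path per line).
# Typical input:
#   git log --since="6 months ago" --name-only --pretty=format:
top_churn() {
  grep -v '^$' | sort | uniq -c | sort -rn | head -n "${1:-20}" |
    sed 's/^ *[0-9]* //'
}

# Rank tracked files by line count. Run inside a git work tree.
# Binary files are counted too, so treat the result as a rough signal.
top_largest() {
  git ls-files -z | xargs -0 wc -l | grep -v ' total$' |
    sort -rn | head -n "${1:-20}" | sed 's/^ *[0-9]* //'
}

# Print the paths from list file $2 that also appear in list file $1.
overlap() {
  grep -Fxf "$1" "$2"
}

# Example session:
#   git log --since="6 months ago" --name-only --pretty=format: |
#     top_churn 20 > /tmp/churn.txt
#   top_largest 20 > /tmp/large.txt
#   overlap /tmp/churn.txt /tmp/large.txt
```

Running this by hand before the first audit gives you something to compare against the Phase 1 summary: if the agent's hot-file list looks nothing like the overlap, that is worth asking about during the pause.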
Citations are the non-negotiable part
The protocol requires path/to/file.ext:LINE for every concrete finding. That sounds strict because it is. A debt finding without a location is hard to falsify, hard to assign, and easy to ignore.
When reviewing the output, reject rows that say the codebase generally has a problem. Keep rows that point to a file, line, failure mode, severity, effort, and specific recommendation. If a finding cannot survive a maintainer opening the cited line, it should not be in TECH_DEBT_AUDIT.md.
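A first mechanical pass on the citations can be scripted before a maintainer reads anything. The sketch below is an assumption-heavy illustration: check_citations is a hypothetical helper, it assumes citations look like src/app/main.py:42 relative to the repository root, and it only confirms that the file and line exist, not that the finding is correct.

```shell
# Hypothetical helper: list path:LINE citations in a report that do not
# resolve to a real line. $1 is the report, $2 the repository root.
check_citations() (
  report=$1
  root=${2:-.}
  # Assumed citation shape: a path with an extension, a colon, a line number.
  grep -oE '[A-Za-z0-9_./-]+\.[A-Za-z0-9]+:[0-9]+' "$report" | sort -u |
  while IFS= read -r cite; do
    path=${cite%:*}
    line=${cite##*:}
    if [ ! -f "$root/$path" ]; then
      echo "missing file: $cite"
    else
      # grep -c '' counts every line, including an unterminated last one.
      lines=$(grep -c '' "$root/$path")
      if [ "$lines" -lt "$line" ]; then
        echo "line out of range: $cite"
      fi
    fi
  done
)

# Example: check_citations TECH_DEBT_AUDIT.md .
```

The pattern will also catch strings such as a hostname with a port, so treat the output as a list to inspect rather than a verdict. An empty result only means every citation points somewhere real; the maintainer still has to open the line.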
The nine dimensions keep it broad enough
Phase 2 covers architectural decay, consistency drift, type and contract debt, test debt, dependency and config debt, performance and resource hygiene, error handling and observability, security hygiene, and documentation drift.
That spread is useful because technical debt rarely lives in one category. A brittle module may combine churn, weak tests, loose contracts, and bad observability. The dimensions give the agent a checklist, but a checklist invites padding. The skill itself guards against that: if a category has nothing material, say so and move on.
Tool output is evidence, not the audit
The skill lists stack-specific tools: npm audit, npx knip, npx madge --circular, npx depcheck, and tsc --noEmit for TypeScript and JavaScript; pip-audit, ruff, vulture, pydeps, and mypy --strict for Python; cargo audit, cargo udeps, and cargo machete for Rust; govulncheck, staticcheck, and golangci-lint for Go.
That is a good pattern. The tools produce raw signals. The skill's job is to connect those signals to architecture, churn, and file-cited findings. If a tool is missing, the protocol says to note it and move on rather than install global tools. That keeps the audit moving without mutating a developer machine.
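The "note it and move on" rule is easy to mirror by hand before a run. A minimal sketch, assuming a POSIX shell; note_missing_tools is a hypothetical helper, and it only checks whether a command is on PATH without installing or running anything.

```shell
# Hypothetical helper: report which audit tools are on PATH.
# Pass the tool names you expect for the stack.
note_missing_tools() {
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "available: $tool"
    else
      echo "missing: $tool (note it in the report and move on)"
    fi
  done
}

# Example for a Python repo, using the tool names listed above:
#   note_missing_tools pip-audit ruff vulture pydeps mypy
```

Knowing in advance which signals will be absent helps when reading the report: a quiet dependency section may mean a clean repo or simply a missing scanner.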
The deliverable is the product
TECH_DEBT_AUDIT.md is not a transcript. It has a fixed shape: executive summary, architectural mental model, findings table, top five priorities, quick wins, things that look bad but are fine, and open questions.
The findings table matters, but the top five is where action starts. The quick wins are for low-effort fixes with meaningful severity. The open questions are where the agent admits uncertainty instead of turning missing domain knowledge into confident claims. That structure is why the skill is stronger than a normal review prompt.
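Because the shape is fixed, its presence can be checked mechanically. The sketch below is illustrative only: missing_sections is a hypothetical helper, and it assumes the report uses markdown headings whose text contains the phrases shown, which may not match the exact titles the skill writes.

```shell
# Hypothetical helper: print the expected sections that have no matching
# markdown heading in the report. Heading phrases are assumptions.
missing_sections() (
  report=$1
  for name in "executive summary" "mental model" "findings" \
              "top 5|top five" "quick wins" "look bad" "open questions"; do
    # Case-insensitive match against heading lines only.
    grep -qiE "^#+ .*($name)" "$report" || echo "missing section: $name"
  done
)

# Example: missing_sections TECH_DEBT_AUDIT.md
```

If the real headings differ, adjust the phrases rather than the report. The point is to notice when a run silently dropped the non-findings or open questions.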
The non-findings section is the best guardrail
The required 'things that look bad but are actually fine' section is the most interesting design choice. It forces the agent to show restraint. A deeply nested callback might preserve ordering. A duplicated path might exist because two modules have different failure domains. A missing abstraction might be intentional.
If that section is empty, the audit is probably shallow. During review, ask whether each non-finding teaches something about the repo. Good non-findings reduce false positives and make the real findings more credible.
Do not let it recommend rewrites
The protocol explicitly bans rewrite recommendations. That is the right constraint. LLMs often escape hard diagnosis by proposing a broad rewrite. A useful debt audit should say what to extract, what contract to tighten, what test gap to close, or which dependency path to remove.
When reviewing TECH_DEBT_AUDIT.md, treat 'rewrite this subsystem' as a failure unless it is broken into scoped changes with cited lines and a migration path. The goal is maintainable triage, not a dramatic plan nobody will start.
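A quick text search surfaces candidate rows for that check. This is a rough sketch: flag_rewrites is a hypothetical helper, and the phrase list is an assumption about how a rewrite recommendation might be worded.

```shell
# Hypothetical helper: show report lines that mention a rewrite, with
# line numbers. Exits non-zero when nothing matches.
flag_rewrites() {
  grep -niE 'rewrite|re-write|from scratch|greenfield' "$1"
}

# Example: flag_rewrites TECH_DEBT_AUDIT.md
```

A match is not automatically a failure, since the report may mention rewrites only to rule them out. Each hit still deserves a look for cited lines and a migration path.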
Where it fits in an agent workflow
Use this skill when inheriting a repo, preparing a refactor budget, planning a stabilization sprint, or deciding where a coding agent should spend sustained effort. It is less useful for a single PR, a known bug, or a small function cleanup. Those belong to review, debug, or simplify workflows.
For large repos, use the protocol's subagent rule. If the repo is above roughly 50k LOC or has more than five top-level modules, split by module and synthesize. Serial reading in one context window is how whole-repo audits become shallow.
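The threshold can be estimated before the run. A minimal sketch under stated assumptions: should_split is a hypothetical helper, total lines across tracked files stand in for LOC, and top-level directories stand in for modules. Both are approximations of what the protocol means.

```shell
# Hypothetical helper: succeed when the repo crosses either threshold
# from the text above (more than 50k lines or more than five modules).
should_split() (
  loc=$1
  modules=$2
  [ "$loc" -gt 50000 ] || [ "$modules" -gt 5 ]
)

# Inside a git work tree:
#   loc=$(git ls-files -z | xargs -0 cat | wc -l)
#   modules=$(git ls-files | grep / | cut -d/ -f1 | sort -u | wc -l)
#   should_split "$loc" "$modules" && echo "split by module"
```

The line count includes generated and vendored files if they are tracked, so a result near the boundary is a judgment call, in keeping with the "roughly" in the rule.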
Repeat runs make it more valuable
Repeat-run mode is important. If TECH_DEBT_AUDIT.md already exists, the skill reads it first, marks resolved findings as RESOLVED, updates stale findings, and tags new ones as NEW.
That turns the audit into a living baseline. The first run tells the team what debt exists. The second run tells the team whether the debt changed. That is the difference between a report and an operating artifact.
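The movement between runs can be summarized with a simple count. This sketch assumes the RESOLVED and NEW tags appear as whole words on the lines of the findings they mark; status_counts is a hypothetical helper, and the real report layout may differ.

```shell
# Hypothetical helper: count report lines carrying each repeat-run tag.
status_counts() (
  report=$1
  for tag in RESOLVED NEW; do
    # -w matches whole words only; -c counts matching lines.
    printf '%s: %s\n' "$tag" "$(grep -cw "$tag" "$report")"
  done
)

# Example: status_counts TECH_DEBT_AUDIT.md
```

Committing the report after each run also lets git diff show exactly which findings changed, which is usually more informative than the counts alone.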
Save tech-debt-audit as a Claude Code skill for file-cited repo debt baselines. It is useful when a team needs a persistent TECH_DEBT_AUDIT.md with ranked findings, scoped recommendations, non-findings, and repeat-run tracking; it is a bad fit when someone wants a quick PR review, a security audit, or a tool that can infer business logic without maintainer input.
Practical takeaway
Install it project-local, run /tech-debt-audit on a bounded repo or subtree, interrupt after Phase 1 to inspect the mental model, reject uncited findings, review the top five and quick wins, then commit TECH_DEBT_AUDIT.md only if the report contains useful non-findings and maintainer questions.