Workflow · June 1, 2026 · 9 min read

When a Claude Code File Hook Should Die

claude-code-md-hook is worth saving as a postmortem, not as an install recommendation: the benchmark shows why hook overhead, platform-native behavior, and raw-output inspection matter.


EDITOR'S NOTE: The rescue angle is not token-saving hype. It is a field guide for deciding when a Claude Code hook has become more expensive than the behavior it wraps.

The original article set out to rescue a markdown hook on token-efficiency grounds. That is the wrong angle for a magazine piece. The source is more useful because the maintainer killed the hook after testing it.

That makes claude-code-md-hook one of the better rescue candidates in the queue. It is not a polished tool launch. It is a benchmark-backed autopsy of a Claude Code hook that looked plausible, produced a strong marketing claim, then failed under real measurement. For Codex and Claude Code users, the lesson is not 'install this hook.' The lesson is how to decide when a hook should exist at all.

Start with the Retraction

The README opens by retracting the original claim: the hook was supposed to save 10-20x tokens on file reads. After 228 benchmark sessions, the median improvement was 0%. V1 had 12 losing cells, 3 winning cells, and 4 neutral cells.

That is why this source should be published. Most tool writeups stop at the promise. This one shows the promise breaking. A serious Claude Code workflow needs that kind of negative evidence because hooks sit in front of tools. A bad hook does not only waste code. It changes what the agent is allowed to read.

What the Hook Actually Did

install.sh copies md-convert.sh, installs /index and /noconvert, and patches .claude/settings.json with a PreToolUse hook on Read. The hook reads JSON from stdin, inspects the requested file path, and targets PDF, DOCX, XLSX, PPTX, HTML, and markdown files.

For Office-style files it uses MarkItDown, writes markdown into a sibling .cache directory, and redirects the Read input to the converted file when it is small enough. For files above the line threshold, it denies the full read and returns a structural index of headings with line numbers. The intended behavior is simple: make Claude read only the relevant section instead of the entire document.

That design is plausible. It is also exactly why the benchmark is useful.
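The routing described in this section can be sketched in a few lines. This is an illustration of the described behavior, not the maintainer's code: the `tool_input.file_path` key, the single `LINE_THRESHOLD` value, the cache file naming, and the action labels are all assumptions made for the example.

```python
from pathlib import Path

# Extensions the hook targets, per the description above.
CONVERT_EXTS = {".pdf", ".docx", ".xlsx", ".pptx", ".html"}
MARKDOWN_EXTS = {".md"}
LINE_THRESHOLD = 300  # hypothetical; the real hook uses per-type thresholds


def route_read(payload: dict, line_count: int) -> dict:
    """Decide what a Read-intercepting hook would do with one request.

    `payload` mimics the JSON the hook receives on stdin; the
    `tool_input.file_path` key is an assumption for this sketch.
    `line_count` is the length of the markdown (converted or original).
    """
    path = Path(payload["tool_input"]["file_path"])
    ext = path.suffix.lower()
    if ext not in CONVERT_EXTS | MARKDOWN_EXTS:
        # Not a targeted format: leave the native Read alone.
        return {"action": "passthrough", "path": str(path)}
    if line_count > LINE_THRESHOLD:
        # Too long: deny the full read and return a heading index instead.
        return {"action": "deny_with_index", "path": str(path)}
    if ext in CONVERT_EXTS:
        # Small enough: redirect the read to the converted copy in .cache
        # (the "<name>.md" naming is hypothetical).
        cached = path.parent / ".cache" / (path.name + ".md")
        return {"action": "redirect", "path": str(cached)}
    # Short markdown needs no conversion.
    return {"action": "passthrough", "path": str(path)}
```

Even in this reduced form, every targeted read passes through three decisions before Claude sees any content, and each branch is a place where thresholds or cache state can go wrong.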

Turn Inflation Is the Hidden Cost

The hook's core mistake was assuming smaller file slices automatically mean fewer tokens. The README shows the opposite failure mode: index-and-fetch can turn a native 2-turn read into 14-23 turns. Each new tool call carries prior conversation state back into the model as cached input. A 300-line index that forces repeated targeted reads can cost more than one full read.

That is the first operating rule to steal: measure total session tokens, not the size of the file fragment. For Codex-style work, also measure turn count. A hook that reduces bytes but multiplies turns can make the agent slower, more expensive, and more likely to lose the thread.
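A back-of-the-envelope model makes the turn-inflation effect concrete. It assumes every turn re-sends a fixed base context plus all tool results so far, and it ignores output tokens and any pricing discount on cached input. The token figures are invented for illustration and are not taken from the README.

```python
def session_input_tokens(base_context: int, tool_results: list[int]) -> int:
    """Total input tokens over a session in which every turn re-sends
    the base context plus all tool results accumulated so far.

    One model turn happens before the first tool call and one after
    each tool result, so there are len(tool_results) + 1 turns.
    """
    total = 0
    accumulated = 0
    for result in [0] + tool_results:
        accumulated += result
        total += base_context + accumulated
    return total


# Native: one full read of a hypothetical 20,000-token file (2 turns).
native = session_input_tokens(5_000, [20_000])

# Hooked: one 1,500-token index, then twelve 800-token targeted reads (14 turns).
hooked = session_input_tokens(5_000, [1_500] + [800] * 12)

print(native, hooked)  # 30000 151900
```

Under these assumptions the hooked session reads fewer file tokens (11,100 versus 20,000) yet sends roughly five times as many input tokens overall, because the base context is replayed on every extra turn.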

Check the Platform Before Wrapping It

The README says PDF and HTML produced zero improvement in the maintainer's tests, while markdown was actively worse in v1. That does not prove every Claude Code version will behave the same forever, but it proves the right precondition: test what the platform already does before adding a hook.

Claude Code hook docs make it easy to intercept tool events. That power should raise the bar, not lower it. If native Read already handles a format well enough, a PreToolUse conversion hook becomes a liability. It can add cache drift, dependency failures, threshold tuning, and confusing denial messages without giving the user a better answer.

The Benchmark Lied in Two Useful Ways

The maintainer found two benchmark bugs that are more valuable than the original token table.

First, the PPTX result was not a true hook-versus-native comparison. Native reading was mostly refusal behavior, so the token baseline measured Claude spending turns failing to open the file. The hook did fill a real PPTX gap, but the ratio was not a clean efficiency comparison.

Second, the DOCX quality rescue was partly a wrong-file bug. In the off condition, Claude sometimes fell back to shell workarounds and opened a sibling S2.pdf instead of files/docx/S2.docx. Aggregate quality made the hook look better than it was. Raw outputs showed the actual failure.

That is a practical lesson for agent benchmarking: always inspect transcripts. Median scores can hide refusal loops, wrong-file navigation, and accidental task changes.
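A transcript check for the sibling-file mistake can be automated as a first pass before reading the raw output by hand. The sketch below assumes a hypothetical transcript shape, a list of tool calls with a `tool` name and either a `path` or a `command` string. Real transcript formats will differ, and the shell command in the example is invented.

```python
import re
from pathlib import PurePosixPath


def sibling_file_events(tool_calls: list[dict], expected: str) -> list[str]:
    """Flag tool calls that mention a file with the expected stem but a
    different extension, e.g. S2.pdf when the task was about S2.docx."""
    target = PurePosixPath(expected)
    pattern = re.compile(rf"\b{re.escape(target.stem)}\.(\w+)\b")
    events = []
    for call in tool_calls:
        # Hypothetical schema: Read-style calls carry "path", shell calls "command".
        text = call.get("path") or call.get("command") or ""
        for match in pattern.finditer(text):
            if "." + match.group(1).lower() != target.suffix.lower():
                events.append(f'{call["tool"]}: {match.group(0)}')
    return events


calls = [
    {"tool": "Read", "path": "files/docx/S2.docx"},
    {"tool": "Bash", "command": "pdftotext S2.pdf -"},  # invented workaround
]
print(sibling_file_events(calls, "files/docx/S2.docx"))  # ['Bash: S2.pdf']
```

A check like this only catches one failure class. It does not replace reading the transcript, since refusal loops and task drift leave no filename to match.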

Replace the Hook with One Instruction First

The final README conclusion is harsh: the useful PPTX behavior does not justify 236 lines of Bash/Python hook infrastructure, 14 deployed copies, auto-install logic, .cache management, a .noconvert toggle, and per-type thresholds.

The lower-cost replacement is a project instruction:

When you need to read `.pptx` files, first convert them with `markitdown <file_path>`.
Prerequisite: `pipx install markitdown`.

That is not as automatic as a hook. That is the point. If one instruction covers the real gap, the hook should die. Keep automation for cases where deterministic interception is necessary, not merely convenient.

When a Hook Is Still Worth Keeping

The article should not turn this postmortem into an anti-hook slogan. Hooks are useful when the behavior must happen at a lifecycle boundary: block writes to sensitive files, run a formatter after edits, inject mandatory context before prompts, or log tool use for audit.

A file conversion hook has to meet a higher bar. It should be kept only when four checks pass:

1. Native Claude Code behavior is inadequate for this exact format.
2. The hook improves answer quality or total session cost on real tasks.
3. The raw transcripts show correct files, not refusal loops or sibling-file mistakes.
4. A simpler instruction or slash command cannot cover the same workflow.

If any check fails, narrow the hook or remove it.

A Local Acceptance Test

A team can run a smaller version of the maintainer's benchmark before adopting any file-read hook:

1. Pick 3 real documents: one PDF, one PPTX, one DOCX.
2. Ask 5 factual questions per file with known answers.
3. Run native Claude Code once.
4. Run the hook once.
5. Record total turns, total input tokens, answer score, and wrong-file events.
6. Read the raw transcript before looking at the table.

The pass condition is not 'tokens went down once.' The hook must improve the exact file class it claims to improve without causing wrong-file reads, refusal loops, or excessive turn count. Otherwise it is not ready for a global PreToolUse hook.
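The pass condition can be written down as a small comparison so the team agrees on it before running anything. This is a sketch under stated assumptions: the `Run` fields mirror the metrics listed above, while `refusals` and the `max_turn_ratio` cutoff of 2.0 are hypothetical choices, not values from the README.

```python
from dataclasses import dataclass


@dataclass
class Run:
    turns: int
    input_tokens: int
    score: int              # correct answers out of the questions asked
    wrong_file_events: int  # counted from the raw transcript
    refusals: int           # counted from the raw transcript


def hook_passes(native: Run, hooked: Run, max_turn_ratio: float = 2.0) -> bool:
    """Return True only if the hooked run beats native without side effects."""
    # Any wrong-file read or refusal loop disqualifies the hook outright.
    if hooked.wrong_file_events or hooked.refusals:
        return False
    # Reject excessive turn inflation even if tokens happened to go down.
    if hooked.turns > native.turns * max_turn_ratio:
        return False
    better_quality = hooked.score > native.score
    cheaper_same_quality = (
        hooked.score == native.score
        and hooked.input_tokens < native.input_tokens
    )
    return better_quality or cheaper_same_quality
```

Run it per file class rather than on the pooled results, so that a win on one format cannot mask a regression on another.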

Operational Checklist

Before a Claude Code hook goes into a shared setup, ask these questions:

What native behavior are we replacing?
Which tool event does the hook intercept?
What happens when the dependency is missing?
Where does it write cache or state?
Can the user bypass it for visual or native reads?
What is the uninstall path?
What raw transcript proves it improves a real task?

claude-code-md-hook answers those questions better after being scrapped than it did as a tool launch. It has an installer, an uninstaller, a toggle, and a clear implementation. The missing part is a reason to keep it in front of every read.

Verdict

claude-code-md-hook should not be published as a recommended token-saving hook. It should be published as a field guide for killing a hook. The maintainer measured the claim, found turn inflation, caught benchmark bugs, and reduced the real use case to a simpler MarkItDown instruction.

That is directly useful for Claude Code and Codex practitioners. Every team adding hooks, MCP servers, or wrapper scripts needs a kill path. If the platform already handles the case, remove the wrapper. If the wrapper only helps one format, narrow it. If a one-line instruction works, prefer it until the workflow proves it needs automation.

claude-code-md-hook is publishable as a postmortem: use it to decide when a Claude Code hook should be narrowed or removed, not as a token-saving install recommendation.

Practical takeaway

Before installing a file-read hook, test native behavior, measure total turns and input tokens, inspect raw transcripts for refusal or wrong-file events, try a markitdown <file_path> instruction for PPTX, and keep the hook only if it beats that simpler path on real documents.

