The original article set out to promote a markdown-conversion hook as a token-efficiency win. That is the wrong angle. The source is more useful because the maintainer killed the hook after testing it.
That makes claude-code-md-hook one of the better rescue candidates in the queue. It is not a polished tool launch. It is a benchmark-backed autopsy of a Claude Code hook that looked plausible, produced a strong marketing claim, then failed under real measurement. For Codex and Claude Code users, the lesson is not 'install this hook.' The lesson is how to decide when a hook should exist at all.
Start with the Retraction
The README opens by retracting the original claim: the hook was supposed to save 10-20x tokens on file reads. After 228 benchmark sessions, the median improvement was 0%. V1 had 12 losing cells, 3 winning cells, and 4 neutral cells.
That is why this source should be published. Most tool writeups stop at the promise. This one shows the promise breaking. A serious Claude Code workflow needs that kind of negative evidence because hooks sit in front of tools. A bad hook does more than waste code: it changes what the agent is allowed to read.
What the Hook Actually Did
install.sh copies md-convert.sh, installs /index and /noconvert, and patches .claude/settings.json with a PreToolUse hook on Read. The hook reads JSON from stdin, inspects the requested file path, and targets PDF, DOCX, XLSX, PPTX, HTML, and markdown files.
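The first step of that flow, reading the event from stdin and deciding whether the path is one the hook cares about, can be sketched in a few lines. This is a minimal Python illustration, not the maintainer's md-convert.sh. It assumes the PreToolUse payload carries `tool_name` and `tool_input.file_path`, and that markdown means a `.md` suffix. The function names are hypothetical.

```python
import json
from pathlib import PurePosixPath
from typing import Optional

# Extensions matching the formats listed above (".md" for markdown is an assumption).
TARGET_EXTENSIONS = {".pdf", ".docx", ".xlsx", ".pptx", ".html", ".md"}


def requested_path(stdin_text: str) -> Optional[str]:
    """Return the file path of a Read event, or None if the payload is anything else.

    Assumption: the payload is a JSON object with `tool_name` and
    `tool_input.file_path`; check the field names against the hook docs.
    """
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        return None
    if not isinstance(payload, dict) or payload.get("tool_name") != "Read":
        return None
    tool_input = payload.get("tool_input")
    if not isinstance(tool_input, dict):
        return None
    path = tool_input.get("file_path")
    return path if isinstance(path, str) else None


def is_targeted(path: str) -> bool:
    """True when the file extension is one the hook would intercept."""
    return PurePosixPath(path).suffix.lower() in TARGET_EXTENSIONS
```

A real hook would call `requested_path(sys.stdin.read())` and exit without output whenever the result is `None` or `is_targeted` is false, so unrelated reads pass through untouched.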
For Office-style files it uses MarkItDown, writes markdown into a sibling .cache directory, and redirects the Read input to the converted file when it is small enough. For files above the line threshold, it denies the full read and returns a structural index of headings with line numbers. The intended behavior is simple: make Claude read only the relevant section instead of the entire document.
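The index-or-read decision described above can be sketched as follows. This is an illustrative Python version, assuming ATX-style `#` headings and a single line threshold. The names `heading_index` and `plan_read` and the threshold value are hypothetical, not taken from the hook.

```python
import re

FENCE = "`" * 3  # fenced-code marker, so headings inside code blocks are skipped
HEADING = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


def heading_index(markdown_text):
    """Return (line_number, level, title) for each heading, 1-indexed."""
    index = []
    in_fence = False
    for number, line in enumerate(markdown_text.splitlines(), start=1):
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            index.append((number, len(match.group(1)), match.group(2)))
    return index


def plan_read(markdown_text, max_lines):
    """Return ("full", []) for small files, else ("index", headings)."""
    if len(markdown_text.splitlines()) <= max_lines:
        return "full", []
    return "index", heading_index(markdown_text)
```

The index is cheap to build. The open question, which the benchmark answers, is what the agent does after receiving it.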
That design is plausible. It is also exactly why the benchmark is useful.
Turn Inflation Is the Hidden Cost
The hook's core mistake was assuming smaller file slices automatically mean fewer tokens. The README shows the opposite failure mode: index-and-fetch can turn a native 2-turn read into 14-23 turns. Each new tool call carries prior conversation state back into the model as cached input. A 300-line index that forces repeated targeted reads can cost more than one full read.
That is the first operating rule to steal: measure total session tokens, not the size of the file fragment. For Codex-style work, also measure turn count. A hook that reduces bytes but multiplies turns can make the agent slower, more expensive, and more likely to lose the thread.
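A toy cost model makes the turn-inflation argument concrete. The Python sketch below assumes every turn re-sends the full accumulated context as input, and it counts raw input tokens only, ignoring any cache discount. All numbers are invented for illustration and are not the maintainer's measurements.

```python
def session_input_tokens(base_context, per_turn_added):
    """Total input tokens when each turn re-sends all prior context.

    base_context: tokens already in the conversation before the first turn.
    per_turn_added: tokens each turn's tool output adds to the context.
    """
    total = 0
    context = base_context
    for added in per_turn_added:
        total += context   # the model re-reads the whole context on this turn
        context += added   # this turn's tool output joins the context
    return total


# Hypothetical native read: one full 12,000-token file, then the answer.
native = session_input_tokens(5_000, [12_000, 0])

# Hypothetical hooked read: a 1,500-token index, twelve 800-token slices, then the answer.
hooked = session_input_tokens(5_000, [1_500] + [800] * 12 + [0])
```

With these made-up inputs the 2-turn read totals 22,000 input tokens and the 14-turn run totals 151,900, even though the hooked run pulled fewer file tokens into context (11,100 versus 12,000). The slices are smaller, but the growing context is re-sent on every one of the extra turns.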
Check the Platform Before Wrapping It
The README says PDF and HTML produced zero improvement in the maintainer's tests, while markdown was actively worse in v1. That does not prove every Claude Code version will behave the same way, but it does establish the right precondition: test what the platform already does before adding a hook.
Claude Code hook docs make it easy to intercept tool events. That power should raise the bar, not lower it. If native Read already handles a format well enough, a PreToolUse conversion hook becomes a liability. It can add cache drift, dependency failures, threshold tuning, and confusing denial messages without giving the user a better answer.
The Benchmark Lied in Two Useful Ways
The maintainer found two benchmark bugs that are more valuable than the original token table.
First, PPTX was not hook versus native reading. Native reading was mostly refusal behavior, so the token baseline measured Claude spending turns failing to open the file. The hook did fill a real PPTX gap, but the ratio was not a clean efficiency comparison.
Second, the DOCX quality rescue was partly a wrong-file bug. In the off condition, Claude sometimes fell back to shell workarounds and opened a sibling S2.pdf instead of files/docx/S2.docx. Aggregate quality made the hook look better than it was. Raw outputs showed the actual failure.
That is a practical lesson for agent benchmarking: always inspect transcripts. Median scores can hide refusal loops, wrong-file navigation, and accidental task changes.
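Part of that transcript inspection can be automated as a first pass. The Python sketch below flags document paths that differ from the file the task asked about. The record shape (`tool`, `path`, `command`) is a hypothetical simplification, so a real transcript would need its own adapter, and the flagged calls still have to be read by hand.

```python
import posixpath
import shlex

DOC_EXTENSIONS = (".pdf", ".docx", ".xlsx", ".pptx", ".html", ".md")


def touched_documents(call):
    """Document paths referenced by one tool call (hypothetical record shape)."""
    if call.get("tool") == "Read":
        candidates = [call.get("path", "")]
    elif call.get("tool") == "Bash":
        command = call.get("command", "")
        try:
            candidates = shlex.split(command)
        except ValueError:  # unbalanced quotes: fall back to whitespace splitting
            candidates = command.split()
    else:
        candidates = []
    return [c for c in candidates if c.lower().endswith(DOC_EXTENSIONS)]


def wrong_file_events(calls, expected):
    """Every document path touched that is not the expected file."""
    want = posixpath.normpath(expected)
    return [
        path
        for call in calls
        for path in touched_documents(call)
        if posixpath.normpath(path) != want
    ]
```

Run against a session that was supposed to read `files/docx/S2.docx`, a shell command that opens a sibling `S2.pdf` shows up as an event instead of disappearing into an aggregate score.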
Replace the Hook with One Instruction First
The final README conclusion is harsh: the useful PPTX behavior does not justify 236 lines of Bash/Python hook infrastructure, 14 deployed copies, auto-install logic, .cache management, a .noconvert toggle, and per-type thresholds.
The lower-cost replacement is a project instruction:
When you need to read `.pptx` files, first convert them with `markitdown <file_path>`.
Prerequisite: `pipx install markitdown`.
That is not as automatic as a hook. That is the point. If one instruction covers the real gap, the hook should die. Keep automation for cases where deterministic interception is necessary, not merely convenient.
When a Hook Is Still Worth Keeping
The article should not turn this postmortem into an anti-hook slogan. Hooks are useful when the behavior must happen at a lifecycle boundary: block writes to sensitive files, run a formatter after edits, inject mandatory context before prompts, or log tool use for audit.
A file conversion hook has to meet a higher bar. It should be kept only when four checks pass:
1. Native Claude Code behavior is inadequate for this exact format.
2. The hook improves answer quality or total session cost on real tasks.
3. The raw transcripts show correct files, not refusal loops or sibling-file mistakes.
4. A simpler instruction or slash command cannot cover the same workflow.
If any check fails, narrow the hook or remove it.
A Local Acceptance Test
A team can run a smaller version of the maintainer's benchmark before adopting any file-read hook:
1. Pick 3 real documents: one PDF, one PPTX, one DOCX.
2. Ask 5 factual questions per file with known answers.
3. Run native Claude Code once.
4. Run the hook once.
5. Record total turns, total input tokens, answer score, and wrong-file events.
6. Read the raw transcript before looking at the table.
The pass condition is not 'tokens went down once.' The hook must improve the exact file class it claims to improve without causing wrong-file reads, refusal loops, or excessive turn count. Otherwise it is not ready for a global PreToolUse hook.
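The pass condition can be written down before the runs so it cannot drift afterward. This Python sketch is one possible encoding, not a standard: the `refusals` field and the `max_turn_ratio` default are assumptions added here, and a team should choose its own threshold.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    turns: int
    input_tokens: int
    score: float            # fraction of known-answer questions answered correctly
    wrong_file_events: int  # counted from the raw transcript
    refusals: int           # counted from the raw transcript (assumed extra field)


def hook_passes(native, hooked, max_turn_ratio=2.0):
    """True only if the hooked run is clean and beats native on quality or cost."""
    if hooked.wrong_file_events or hooked.refusals:
        return False
    if hooked.turns > native.turns * max_turn_ratio:
        return False  # turn inflation disqualifies the hook even if the answers look fine
    better_quality = hooked.score > native.score
    cheaper = hooked.input_tokens < native.input_tokens and hooked.score >= native.score
    return better_quality or cheaper
```

Evaluate it per file class rather than on a pooled average, since the claim under test is about a specific format.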
Operational Checklist
Before a Claude Code hook goes into a shared setup, ask these questions:
What native behavior are we replacing?
Which tool event does the hook intercept?
What happens when the dependency is missing?
Where does it write cache or state?
Can the user bypass it for visual or native reads?
What is the uninstall path?
What raw transcript proves it improves a real task?
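For the missing-dependency question, one defensible answer is to fail open: if the converter is absent or errors out, let the native read proceed instead of blocking it. The Python sketch below illustrates that policy under stated assumptions. It relies on `markitdown <file_path>` printing markdown to stdout, the function name and 60-second timeout are hypothetical, and the `which` and `run` parameters exist only so the behavior can be tested without the tool installed.

```python
import shutil
import subprocess


def convert_or_fall_through(path, which=shutil.which, run=subprocess.run):
    """Return converted markdown, or None to let the native read proceed."""
    if which("markitdown") is None:
        return None  # dependency missing: never block the read
    try:
        result = run(
            ["markitdown", path],
            capture_output=True,
            text=True,
            timeout=60,
        )
    except (OSError, subprocess.TimeoutExpired):
        return None
    if result.returncode != 0 or not result.stdout.strip():
        return None  # conversion failed or produced nothing usable
    return result.stdout
```

A hook that fails closed here turns a missing `pipx install` into an agent that cannot read documents at all, which is a worse outcome than no hook.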
claude-code-md-hook answers those questions better after being scrapped than it did as a tool launch. It has an installer, an uninstaller, a toggle, and a clear implementation. The missing part is a reason to keep it in front of every read.
Verdict
claude-code-md-hook should not be published as a recommended token-saving hook. It should be published as a field guide for killing a hook. The maintainer measured the claim, found turn inflation, caught benchmark bugs, and reduced the real use case to a simpler MarkItDown instruction.
That is directly useful for Claude Code and Codex practitioners. Every team adding hooks, MCP servers, or wrapper scripts needs a kill path. If the platform already handles the case, remove the wrapper. If the wrapper only helps one format, narrow it. If a one-line instruction works, prefer it until the workflow proves it needs automation.
claude-code-md-hook is publishable as a postmortem: use it to decide when a Claude Code hook should be narrowed or removed, not as a token-saving install recommendation.
Practical takeaway
Before installing a file-read hook, test native behavior, measure total turns and input tokens, inspect raw transcripts for refusal or wrong-file events, try a `markitdown <file_path>` instruction for PPTX, and keep the hook only if it beats that simpler path on real documents.