Tools · June 1, 2026 · 8 min read

Use budget-aware-mcp as a Codex Context Gate

The practical use is a Codex pre-edit context gate: index the repo, check scope, walk a bounded graph, then read the selected source files.


EDITOR'S NOTE: Use budget-aware-mcp as a pre-edit context gate: index the repo, bound the graph walk, check scope, then read source. Do not treat its benchmark claims or regex fallback as proof of semantic understanding.

The old article should not be thrown away. The project behind it is useful. What failed was the angle: it sounded like a generic code-review speed claim instead of explaining how an agent should use a bounded graph before touching a repo.

budget-aware-mcp is a local MCP server for code memory. It indexes a repository into SQLite, exposes symbol and graph queries over stdio, and lets an agent ask for a connected slice of the codebase under a token budget. That is the real Codex use case. Before an agent edits, it can ask: does this task name symbols that exist, which files are connected, what calls what, what might break, and how much context will the selected neighborhood cost?

That does not replace reading source. It changes the order of operations. Use the graph to choose the source files, then read the files before generating a patch.

The useful idea is a context gate

The README describes the project as model-agnostic code memory for AI agents, with deterministic graph retrieval, token budgeting, no embeddings, no vector database, and no API keys. Those words matter less than the workflow they enable.

A coding agent often starts work with blind exploration: list directories, grep names, open several files, then discover that half the context was not needed. budget-aware-mcp gives the agent a narrower first move. Index the repo, name an anchor symbol or task, walk one or two hops through the graph, and cap the returned context by token budget. The output should become a reading list, not a final answer.

What the MCP server exposes

The inspected src/index.ts registers tools across indexing, retrieval, discovery, context, code access, session stats, and memory. The important set for daily Codex work is small: index_repo, check_scope, fuzzy_find_symbol, graph_walk, search_graph, trace_call_path, analyze_impact, suggest_files, get_file_context, explain_symbol, get_code_snippet, and get_session_stats.

That is a practical shape. check_scope can test whether a user request maps to known identifiers. fuzzy_find_symbol can find an anchor when the user gives a partial name. graph_walk can return connected symbols and files under a budget. trace_call_path and analyze_impact are the review hooks: they turn a proposed edit into a call-path or blast-radius question before the patch is written.

Graph walks are deterministic, not magic

The graph walker is plain enough to audit. It performs a breadth-first walk from an anchor symbol, sorts each hop alphabetically, follows both outgoing and incoming edges, estimates token cost from byte size, and skips additions once the requested budget would be exceeded. It groups selected symbols by file and reports an estimated saving against a full read.
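The walk described above can be sketched in a few lines. This is a minimal illustration, not the project's implementation: the function names, the four-bytes-per-token ratio, the per-symbol floor of 10, and the choice not to expand a skipped symbol are all assumptions made for the example.

```python
def estimate_tokens(byte_size, bytes_per_token=4, floor=10):
    # Assumed heuristic: roughly 4 bytes per token, with a minimum cost per symbol.
    return max(floor, byte_size // bytes_per_token)


def budgeted_walk(anchor, edges, sizes, hop_depth, max_tokens):
    """Breadth-first walk over outgoing and incoming edges under a token budget."""
    # Treat edges as undirected so both callers and callees are reachable.
    neighbors = {}
    for src, dst in edges:
        neighbors.setdefault(src, set()).add(dst)
        neighbors.setdefault(dst, set()).add(src)

    selected, spent = [], 0
    seen = {anchor}
    frontier = {anchor}
    for _ in range(hop_depth + 1):
        next_frontier = set()
        # Sorting each hop alphabetically keeps the result deterministic.
        for symbol in sorted(frontier):
            cost = estimate_tokens(sizes[symbol])
            if spent + cost > max_tokens:
                continue  # skip this symbol; a cheaper one may still fit
            selected.append(symbol)
            spent += cost
            for n in neighbors.get(symbol, ()):
                if n not in seen:
                    seen.add(n)
                    next_frontier.add(n)
        frontier = next_frontier
    return selected, spent
```

Because the order depends only on the edge set, the symbol names, and the budget, two runs over the same index return the same slice. That makes the output easy to diff between sessions.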

That is good engineering for a context selector. It is also a boundary. The system is not claiming semantic understanding. It follows indexed symbols and edges. If the index missed a dynamic call, a framework convention, or a generated route, the graph walk can miss relevant code. A review-grade article needs that caveat in the main body, not in a footnote.

The first workflow I would run

A useful adoption test is small and mechanical:

npm install -g budget-aware-mcp
budget-aware-mcp install
budget-aware-mcp

Then give Codex a fixed pre-edit checklist. The tool schemas in src/index.ts use path for indexing, task_description for scope checks, query for fuzzy search, and anchor, hop_depth, and max_tokens for graph walks:

{"tool":"index_repo","arguments":{"path":"/work/app","name":"app"}}
{"tool":"check_scope","arguments":{"task_description":"change PaymentIntent retry handling"}}
{"tool":"fuzzy_find_symbol","arguments":{"query":"PaymentIntent","max_results":5}}
{"tool":"graph_walk","arguments":{"anchor":"PaymentIntentService","hop_depth":2,"max_tokens":8000}}
{"tool":"get_code_snippet","arguments":{"symbol":"PaymentIntentService","max_lines":80}}

Only after those calls should the agent edit. The acceptance check is simple: did the graph path select the files a senior engineer would inspect first, and did it avoid large unrelated directories?
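One way to make that acceptance check repeatable is to diff the graph's file list against the files a reviewer says they would open first. The sketch below is a hypothetical helper, not part of budget-aware-mcp; the function name and the returned fields are assumptions.

```python
def compare_file_sets(selected, reviewer):
    """Compare graph-selected files with a reviewer's own reading list."""
    selected, reviewer = set(selected), set(reviewer)
    hits = selected & reviewer
    return {
        # Files the reviewer wanted but the graph did not select.
        "missed": sorted(reviewer - selected),
        # Files the graph selected that the reviewer would not have opened.
        "noise": sorted(selected - reviewer),
        "recall": len(hits) / len(reviewer) if reviewer else 1.0,
        "precision": len(hits) / len(selected) if selected else 1.0,
    }
```

A missed file is the more serious signal: it means the graph would have let the agent patch without context a human considered necessary.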

Scope checks belong before generation

scope_check.ts extracts identifiers from quoted strings, PascalCase or camelCase names, snake_case names, and dotted paths. It compares those names with indexed symbols and returns a feasibility level: full, partial, or unknown. The thresholds are simple, but that simplicity is useful.
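To make the extraction step concrete, here is a rough sketch of the same idea. The regular expressions, the function names, and the full/partial/unknown thresholds are assumptions chosen for illustration; the real scope_check.ts may use different patterns and cutoffs.

```python
import re

# Assumed patterns, one per identifier style named in the text.
IDENT_PATTERNS = [
    r'"([^"]+)"',                                # quoted strings
    r"\b[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+\b",      # dotted paths
    r"\b[a-z]+(?:_[a-z0-9]+)+\b",                # snake_case
    r"\b[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)+\b",    # PascalCase
    r"\b[a-z]+(?:[A-Z][a-z0-9]+)+\b",            # camelCase
]


def extract_identifiers(task):
    found = set()
    for pattern in IDENT_PATTERNS:
        for m in re.finditer(pattern, task):
            # Use the captured group for quoted strings, else the whole match.
            found.add(m.group(1) if m.lastindex else m.group(0))
    return found


def check_scope(task, indexed_symbols):
    names = extract_identifiers(task)
    matched = sorted(n for n in names if n in indexed_symbols)
    missing = sorted(names - set(matched))
    # Assumed thresholds: all found -> full, some -> partial, none -> unknown.
    if names and not missing:
        level = "full"
    elif matched:
        level = "partial"
    else:
        level = "unknown"
    return {"feasibility": level, "matched": matched, "missing": missing}
```

Note that a plain word such as "retry" matches none of these patterns, so a task written entirely in ordinary prose comes back as unknown in this sketch. That is a reason to quote or name exact symbols in the task description.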

Before Codex writes code for "update PaymentIntent retry handling," a scope check can show whether PaymentIntent, retry, or related dotted paths exist in the repo graph:

{"tool":"check_scope","arguments":{"task_description":"update PaymentIntent retry handling without changing checkout creation"}}

If the answer is unknown, the agent should ask for clarification or run broader source search. If the answer is partial, it should name which identifiers were found and which were guessed before opening files. That is better than confidently patching the wrong subsystem.

Impact analysis is the review hook

The project should be positioned as a review aid as much as a coding aid. trace_call_path walks outgoing edges between two symbols. analyze_impact starts from symbols in changed files and walks incoming edges to find possible dependents. Those are the exact questions reviewers ask after a change: how does this function get reached, and who depends on this file?

In a Codex workflow, I would run impact analysis before finalizing a non-trivial patch:

{"tool":"trace_call_path","arguments":{"from_symbol":"checkoutRoute","to_symbol":"PaymentIntentService","max_hops":8}}
{"tool":"analyze_impact","arguments":{"changed_files":["src/payments/payment-intent.ts"],"hop_depth":2}}

The output should be copied into the agent's handoff with source reads and tests. If the tool says the impact radius is small, that is a hypothesis to verify, not a reason to skip tests.
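The incoming-edge walk behind an impact query can be illustrated as follows. This is a simplified sketch under stated assumptions: edges are (caller, callee) pairs, every symbol maps to exactly one file, and the function signature is invented for the example rather than taken from the project.

```python
def analyze_impact(changed_files, symbol_files, edges, hop_depth):
    """Return files that may depend on the changed files, walking edges backwards."""
    # Map each callee to the set of symbols that reference it.
    callers = {}
    for src, dst in edges:
        callers.setdefault(dst, set()).add(src)

    # Start from every symbol defined in a changed file.
    start = {s for s, f in symbol_files.items() if f in changed_files}
    seen = set(start)
    frontier = start
    for _ in range(hop_depth):
        frontier = {
            c for s in frontier for c in callers.get(s, ()) if c not in seen
        }
        seen |= frontier

    dependents = seen - start
    return sorted({symbol_files[s] for s in dependents} - set(changed_files))
```

The walk only sees edges that exist in the index. A dependent wired up through reflection or a framework convention will not appear in this list, which is why the result is a hypothesis to test rather than a clearance.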

The indexer path changes trust level

The README leans on CodeGraphContext for tree-sitter indexing across many languages. The inspected indexer looks for a CodeGraphContext binary in platform paths or PATH, imports its SQLite graph when found, and falls back to a built-in parser when it is missing or fails.

That fallback is useful because the server still works, but it lowers the trust level. The built-in path is regex-oriented and covers a broad language list, while tree-sitter indexing should capture more reliable structure. A team using budget-aware-mcp should log which indexer path ran. If a critical edit was planned from the fallback graph, treat the graph as a hint and lean harder on direct source reads.
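A team-side wrapper could record which path ran before any graph query is trusted. The sketch below is hypothetical: the binary name is passed in by the caller because this article does not confirm the exact executable name, and the "structural" and "hint" trust labels are invented for the example.

```python
import logging
import shutil

log = logging.getLogger("index-gate")


def choose_indexer(binary_name, which=shutil.which):
    """Report whether an external indexer is on PATH or the fallback will run."""
    # binary_name is supplied by the caller; it is not a confirmed executable name.
    path = which(binary_name)
    if path:
        log.info("external indexer found at %s", path)
        return {"indexer": "external", "trust": "structural", "path": path}
    log.warning("external indexer missing; regex fallback graph is a hint only")
    return {"indexer": "fallback", "trust": "hint", "path": None}
```

Writing the chosen path into the agent's handoff lets a reviewer see at a glance whether the plan came from a tree-sitter graph or from the regex fallback.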

Install reach is real, but config still needs review

The README says budget-aware-mcp install auto-detects and configures Kiro, Claude Code, Cursor, VS Code, Windsurf, Zed, Codex CLI, Gemini CLI, Aider, and OpenCode. The CLI source backs that general story by writing client-specific MCP config entries.

That is valuable reach for mixed-agent teams. It also means installation is not just a package step. Review the generated MCP config before trusting it in a shared repo or workstation. Confirm the server path, Node version, project root behavior, and whether the index database should live under .code-graph/graph.db or another local path.

The benchmark claims need boundaries

The README performance table claims very fast graph walks, fuzzy search, scope checks, and indexing on a 108-file project. Those numbers explain why the tool is worth a look, but they should be framed as maintainer-side evidence.

The inspected benchmark files include local developer-machine assumptions, including hardcoded Windows paths. That does not make the project bad. It means an article should not present the numbers as independent proof. The better test is local: benchmark on your repo after indexing, record p95 query time, and compare the selected source set against what engineers actually needed to read.
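A local p95 measurement needs very little code. In this sketch, run_query is a placeholder for whatever callable issues one graph query against your own index; the helper names and the default of 50 repeats are assumptions.

```python
import statistics
import time


def p95(samples):
    # quantiles(n=20) returns 19 cut points; the last is the 95th percentile.
    return statistics.quantiles(samples, n=20)[-1]


def p95_latency_ms(run_query, repeats=50):
    """Time repeated calls to run_query and return the p95 latency in milliseconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return p95(samples)
```

Record the number alongside the repo size and the indexer path that produced the graph, so later runs are compared like for like.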

Choose it over maps only when queries matter

budget-aware-mcp is not the only way to reduce blind repo reading. A committed repo map is easier to review and works in any agent that can read files. A full-content packer is simpler for small repos and one-shot analysis. Embedding search can find semantically similar text that a symbol graph will never connect. Plain rg is still the fastest truth probe when you already know the term.

The case for budget-aware-mcp appears when the next question depends on graph structure: what imports this, what calls this, what would this changed file affect, which files sit near this symbol, and how much connected context can the agent read under a budget? If a team only needs a static overview, a map may be enough. If it needs repeatable pre-edit queries from Codex or Claude Code, the MCP surface is the point.

Failure cases to write into the rule

There are four failure cases worth writing into the agent rule. First, an incomplete graph can hide important code. Dynamic dispatch, framework routes, generated files, reflection-heavy code, and convention-based wiring can sit outside the indexed edge set. Second, token estimates are approximations; the graph walker estimates from byte size with a floor per symbol. Third, the fallback parser is not the same trust level as CodeGraphContext. Fourth, the session tracker records query count and token totals, but the inspected code initializes total_tokens_saved without updating it in the query path.

Those issues do not kill the idea. They define the contract. The graph narrows exploration. It does not certify correctness, prove savings, or replace a source-backed review. A good AGENTS.md rule should say this plainly: if the graph output and direct source reads disagree, source wins.

A rollout I would accept

Start with one large repo and one repeated task type, such as controller-to-service changes, CLI command edits, or test helper refactors. Install the MCP server locally, index the repo, and ask the agent to run the same pre-edit checklist each time:

1. Run check_scope on the user's task and report full, partial, or unknown.
2. Run fuzzy_find_symbol for any uncertain anchor.
3. Run graph_walk with hop_depth=2 and max_tokens=8000.
4. Read the selected source files directly with get_code_snippet or normal file reads.
5. After editing, run analyze_impact on changed_files and then run the relevant tests.
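Steps 1 to 3 of the checklist can be wrapped in a small gate so the agent cannot skip ahead. This is a sketch only: call_tool is a hypothetical callable that sends one MCP tool request and returns a dict, and the response fields "feasibility" and "files" are assumptions about the result shape, not documented output. The argument names follow the tool schemas quoted earlier in this article.

```python
def run_pre_edit_gate(call_tool, task, anchor, hop_depth=2, max_tokens=8000):
    """Run scope check, anchor lookup, and graph walk; return a reading list."""
    scope = call_tool("check_scope", {"task_description": task})
    level = scope.get("feasibility", "unknown")  # assumed response field
    if level == "unknown":
        # Stop early: the task names nothing the index recognizes.
        return {"proceed": False, "scope": level, "files": []}

    candidates = call_tool(
        "fuzzy_find_symbol", {"query": anchor, "max_results": 5}
    )
    walk = call_tool(
        "graph_walk",
        {"anchor": anchor, "hop_depth": hop_depth, "max_tokens": max_tokens},
    )
    return {
        "proceed": True,
        "scope": level,
        "candidates": candidates,
        # The reading list for step 4; source reads still come before edits.
        "files": sorted(walk.get("files", [])),  # assumed response field
    }
```

The gate returns a reading list, not a patch plan. Steps 4 and 5 stay outside it on purpose, because direct source reads and tests are the parts that should not be automated away.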

A useful AGENTS.md rule can be only two lines: for non-trivial edits, run budget-aware-mcp before broad file reads; never patch from graph output without direct source reads. Keep the first week manual. Compare the tool's selected files with reviewer judgment. If it repeatedly picks the right entrypoints and avoids noise, keep the checklist. If it misses framework wiring, restrict it to the subsystems where the graph is reliable.

Read this as a source-backed review of budget-aware-mcp. The tool is useful when treated as a bounded context selector before source reads and edits; it is risky when benchmark numbers, token-saved counters, or regex fallback graphs are treated as proof.

Practical takeaway

Try budget-aware-mcp on one repeated workflow before making it a default agent dependency. Install it, index the repo with index_repo, run check_scope, select an anchor with fuzzy_find_symbol, call graph_walk with hop_depth=2 and max_tokens=8000, then read the selected files directly. After the edit, run analyze_impact on the changed files and run tests. Promote it only if the graph's file choices match reviewer judgment on your own codebase.

