심화 분석2026년 6월 1일 · 11 min read

Claude Code From Scratch as an Agent Runtime Map

The useful article is not that someone rebuilt Claude Code. It is how the repository separates agent runtime mechanisms that Codex and Claude Code teams can inspect one by one.

𝕏 in

EDITOR'S NOTEThe useful story is not cloning Claude Code. It is using a teaching agent runtime to inspect the mechanisms a real coding agent needs before teams copy patterns blindly.

The old article framed this source as modular AI development. That is too vague to publish. The stronger version is narrower and more useful: claude-code-from-scratch is a source-backed map of the control layer around a coding model. It does not need to prove that it is Claude Code. It needs to help a Codex or Claude Code practitioner ask which runtime mechanisms their own workflow actually has.

That distinction matters. The repository has a real teaching structure: a minimal perception-action loop, a shared core.py, on-demand skills, context compaction, agent teams, worktree isolation, permissions, hooks, MCP runtime, and production mailbox examples. The rescue article should preserve those parts and remove the vague promise. Treat the repo as an inspection path, not as a product clone.

Start with the Runtime Claim

The bundled agent-builder skill gives the best editorial frame: the model is the agent; the surrounding runtime supplies tools, knowledge, observation, action, and permissions. That is exactly why this source belongs on vibe4g. Codex and Claude Code users keep running into the same question: which behavior belongs in the model prompt, which belongs in a skill, which belongs in an MCP server, and which belongs in a permission layer?

The README's 23-session structure is valuable because it separates those decisions. core.py is the shared foundation. Session files add one mechanism at a time. A reader can inspect a single layer without swallowing a giant framework.

Use core.py as the Baseline

core.py is the first file to inspect. It owns the Anthropic client, model selection, default system prompt, file snapshots, bash/read/write/grep/glob/revert tools, synchronous and async dispatch maps, permission helpers, and loop helpers. It also blocks a short list of dangerous shell fragments such as rm -rf /, sudo, device writes, and fork bombs before running subprocess commands.

That baseline teaches a practical rule: before adding subagents or MCP, make the basic tool surface legible. Can each tool timeout? Is output capped? Are write operations reversible? Does the model get plain text it can reason about? If the answer is no, later architecture layers only make the failure harder to debug.

Skills Are Progressive Disclosure

s05_skill_loading.py is the cleanest source-backed lesson in the repo. It scans skills/<name>/SKILL.md, builds a lightweight index from each skill's description, and exposes list_skills plus load_skill. The full skill text enters context only after the model asks for it.

That maps directly to the operational reason skills exist in Claude Code and Codex-style systems: do not stuff every domain rule into standing instructions. The article should recommend a trial with the bundled skills:

python s05_skill_loading.py

Then ask for a code review task and confirm the agent loads the code-review skill only when the task fits. If it loads every skill for every task, the runtime has recreated the context bloat it was supposed to avoid.

Compaction Is a Contract, Not a Cleanup

s06_context_compact.py implements three layers: recent messages remain verbatim, older turns are summarized by the model, and the result is persisted to .agent_memory.md. The script triggers around 40,000 characters and keeps six recent messages intact.

The useful editorial judgment is that compaction is not just shrinking text. It is deciding what evidence survives. A serious agent workflow needs a compacted summary to preserve file paths, decisions, test results, rejected approaches, and pending tasks. If those are missing, the compressed context can make the next turn confident and wrong.

A good local trial is simple: run a long toy task, force compaction, open .agent_memory.md, and check whether it contains decisions and pending work rather than generic prose.

Agent Teams Need a Mailbox Boundary

s09_agent_teams.py introduces persistent explorer and writer teammates. They run in background threads, communicate through JSONL mailboxes under .mailboxes, and return results to a lead agent. The teaching note is honest: JSONL is visible and simple, but production mailboxes should move to a stronger broker such as Redis or RabbitMQ.

That makes this a useful article section because it turns multi-agent enthusiasm into a boundary. A specialist agent needs a role, an inbox, a timeout, a return format, and a synthesis step. Without those, subagents become expensive parallel monologues.

Use this repo to design the handoff shape, not to copy JSONL polling into a production team workflow.

Worktrees Are the Isolation Layer

s12_worktree_task_isolation.py is a better lesson than the broad 'parallel agents' claim. It creates a task/<id> branch, adds a worktree in a sibling directory, changes cwd before tool execution, restores the original directory after each tool call, and removes the worktree and branch in cleanup.

For Codex-style work, the adoption question is concrete: can two agents edit without touching the same working tree? If not, parallelism is mostly theater. The repo's implementation is intentionally educational, but the pattern is right: branch, worktree, scoped prompt, isolated tool execution, result review, cleanup.

The missing production checks are also worth naming: dirty tree handling, branch conflict policy, artifact retention, and merge review. The README later points to s23_worktree_advanced.py for more lifecycle handling, which is the right direction.

Permissions Belong Outside Tool Code

s15_permissions.py wraps every tool in a guarded dispatch function. config/permissions.yaml defines always_deny, always_allow, and ask_user tiers. The examples block root deletion, privilege escalation, system power commands, device writes, fork bombs, and pipe-to-shell downloads; they require confirmation for deletion, package installs, git writes, permissions changes, process killing, and .env access.

The source-backed principle is strong: separate what a tool does from whether the action is allowed. Official Claude Code docs have their own permission modes and rules, but the repo is still useful because it shows the shape of a policy gate in code.

The caveat is that regex rules are only a starting point. Do not call the runtime safe until the policy is tested against real commands, path traversal, shell quoting, and project-specific secrets.

MCP Is Runtime Discovery

s21_mcp_runtime.py reads config/mcp_config.yaml, starts stdio MCP servers, initializes client sessions, calls list_tools, prefixes tool names as mcp__<server>__<tool>, and routes calls back to the right server. That is a concrete mental model for MCP: the agent does not need every integration hardcoded if the runtime can discover tool schemas.

For readers, the adoption path is not to copy the file into production. It is to inspect the registry boundary. How are tool names namespaced? What happens when a server fails? How much output can a remote tool return? Which MCP servers should be project-scoped, user-scoped, or blocked?

After reading this file, a Claude Code user should be able to run claude mcp list and claude mcp get <name> with a sharper eye for what the client is actually loading.

Use It as a Four-Hour Inspection

A practical reader should not run all 23 sessions in order. Use the repo as a targeted inspection:

0-30 min: Read README and core.py. Mark the tool surface and dispatch path.
30-60 min: Run s05. Decide which local workflows deserve a skill.
60-90 min: Run s06. Inspect .agent_memory.md for useful retained decisions.
90-120 min: Read s09 and s12. Decide whether your parallel-agent work needs mailboxes, worktrees, or both.
120-150 min: Read s15 and config/permissions.yaml. Compare rules with your actual forbidden commands.
150-180 min: Read s21 and config/mcp_config.yaml. Check how MCP tool names and failures are routed.
180-240 min: Convert one lesson into a real artifact: a skill, permission rule, MCP config, or worktree runbook.

That is enough to rescue the source without overstating it.

Adoption Verdict

claude-code-from-scratch is worth publishing because it gives practitioners a vocabulary for the control layer around a coding model. The value is not the claim that Claude Code can be rebuilt from a repository. The value is the inspection sequence: loop, tools, skills, compaction, subagents, worktrees, permissions, hooks, MCP, and production mailboxes.

Use it when a team needs to understand agent architecture before adding another plugin or writing another standing instruction. Skip it when the team wants a supported product, a security-reviewed runtime, or a drop-in replacement for Claude Code.

The best outcome is one local improvement, not a cloned system: a better AGENTS.md, a narrower skill, a safer permission rule, a clearer MCP config, or a worktree protocol for parallel Codex sessions.

claude-code-from-scratch is publishable as an agent runtime map: use it to inspect the layers around a coding model, not as proof that a teaching repo can replace Claude Code.

Practical takeaway

Read core.py, run s05_skill_loading.py and s06_context_compact.py, inspect s09, s12, s15, and s21, then convert exactly one lesson into a local Codex or Claude Code artifact: a skill, permission rule, MCP config, worktree protocol, or review checklist.

SOURCES

[1] Primary sourcegithub.com

[2] Core harnessgithub.com

[3] Skill loading sessiongithub.com

[4] Context compaction sessiongithub.com

[5] Agent teams sessiongithub.com

[6] Worktree isolation sessiongithub.com

[7] Permission governance sessiongithub.com

[8] Permission rulesgithub.com

[9] MCP runtime sessiongithub.com

[10] Agent builder skillgithub.com

[11] Claude Code subagents docscode.claude.com

[12] Claude Code skills docscode.claude.com

[13] Claude Code permissions docscode.claude.com

[14] Claude Code MCP docscode.claude.com

[15] MCP configgithub.com

[16] Code review skillgithub.com

[17] Requirementsgithub.com

claude codeagent architectureskillsmcpworktrees

Claude Code 생태계를 앞서가세요

MCP 서버, 스킬, 에이전트 패턴, 바이브코딩 인사이트를 매주 전해드립니다.

무료 구독

OpenCode 전환 후 /code-review · /security-review 공백: opencode-power-pack의 SKILL.md 포팅 구조와 도입 조건

Anthropic 공식 Claude Code 플러그인의 code-review, security-review, feature-dev는 OpenCode에서 그대로 동작하지 않는다. opencode-power-pack은 이 워크플로우들을 OpenCode 네이티브 SKILL.md 포맷으로 번역하고, ~/.config/opencode/opencode.json 한 줄 설정으로 11개 스킬을 적재한다.

2026년 6월 11일

opencodeskillscode-review

도구8분 읽기

guard-skills: Claude Code diff에 catch-all 오류·환각 import·HPOS 패턴을 잡는 5개 리뷰 Skill

guard-skills는 Claude Code, Codex, Cursor가 생성한 코드·테스트·문서에 second-pass 리뷰를 수행하는 5개 Skill 패키지입니다. clean-code-guard는 GitClear·USENIX 연구에 근거한 14가지 AI 실패 패턴을 검사하고, woo-guard는 AI가 반복 생성하는 pre-HPOS WooCommerce 코드 패턴을 직접 겨냥합니다.

2026년 6월 11일

skillsclaude-codecode-review

도구7 min read

baoyu-design: What Changes When You Run claude.ai/design Locally in Claude Code

baoyu-design ports the claude.ai/design methodology to Claude Code by detecting the agent environment at runtime and loading a tool substitution table from `references/claude.md`. The design system pipeline — `compile-design-system.mjs` produces static lint config, `import-design-system.mjs` generates a token allowlist as `_ds_prompt.md` — enforces the same constraint at two layers. Neither is optional if you want session-to-session consistency.

2026년 6월 10일

skillsclaude-codedesign-systems