The Claude Coworker article should be saved, but not in its original form. The old draft leaned on cost savings and generic productivity language. The repo contains a more useful idea: make delegation explicit. Claude Code decides what needs to happen. A cheaper OpenAI-compatible worker model handles bounded bulk I/O, boilerplate drafts, or transcript extraction.
That is a real workflow pattern for Codex and Claude Code users. It is also easy to misuse. If the worker model writes code without review, reads private transcripts without a data policy, or takes over architectural decisions, the pattern becomes a liability. The rescued article should teach the boundary.
The useful idea is a routing rule
Claude Coworker is not an MCP server or a replacement for Claude Code. It is a small set of CLI tools plus a CLAUDE.md.template that tells the primary agent when to delegate. The README states the split plainly: Claude handles reasoning and architecture; the worker handles token-heavy I/O.
That distinction is the article. A team does not need a new agent framework to try the pattern. It needs a written policy for what can leave the primary model, what must stay inside Claude Code, and how worker output gets reviewed before it changes the repo.
The three tools map to three task classes
ask-kimi is for bulk reading. It ingests paths, sends file contents to an OpenAI-compatible endpoint, and asks for concise structured bullets. kimi-write is for drafts: tests, config, docs, or repetitive code that can follow a reference file. extract-chat is for Claude Code JSONL transcripts; it strips tool calls, tool results, thinking blocks, hooks, permissions, and snapshots so the worker sees only user and assistant text.
Those task classes are sensible because they are bounded. The worker can summarize many files, mirror a style, or extract a chat history. It should not decide whether the architecture is right or whether a security fix is safe.
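To make the extract-chat idea concrete, here is a minimal sketch of that kind of filtering. It is not the repo's implementation. It assumes each JSONL line is an object whose top-level "type" is "user" or "assistant" and whose "message" carries "content" as either a string or a list of typed blocks; that schema is an assumption and should be checked against a real transcript before relying on it.

```python
import json


def extract_dialogue(jsonl_text):
    """Return (role, text) pairs, keeping only user and assistant text."""
    turns = []
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines instead of aborting the run
        if not isinstance(record, dict):
            continue
        role = record.get("type")
        if role not in ("user", "assistant"):
            continue  # assumed: hooks, snapshots, etc. use other record types
        message = record.get("message")
        content = message.get("content") if isinstance(message, dict) else None
        if isinstance(content, str):
            parts = [content]
        elif isinstance(content, list):
            # Keep plain text blocks; drop tool calls, tool results, thinking.
            parts = [
                block.get("text", "")
                for block in content
                if isinstance(block, dict) and block.get("type") == "text"
            ]
        else:
            parts = []
        text = "\n".join(part for part in parts if part.strip())
        if text:
            turns.append((role, text))
    return turns
```

A turn that contains only tool results produces no text and is dropped entirely, which is the behavior you want before anything is sent to a worker.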
Run one tool at a time
Use a small test matrix before adding the rules to a real project. First, run ask-kimi on two non-sensitive files and ask for facts that are easy to verify. Accept the result only if it names file paths and you can confirm the claims by opening the source. Second, run extract-chat on one local Claude Code transcript and inspect the output for private data before sending it to any external worker. Third, run kimi-write only against a temporary target path, then compare the generated file with the context file before moving it into the repo.
A practical order is:
ask-kimi --paths README.md CLAUDE.md --question "List routing rules with source evidence"
extract-chat session.jsonl -o /tmp/chat.txt
kimi-write --spec "Draft pytest cases for auth.py" --context tests/test_main.py --target /tmp/test_auth.py
If any of these commands produces vague prose, missing file references, or code that cannot pass the existing test command, keep that tool out of CLAUDE.md until the prompt and provider are fixed.
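The "missing file references" check can be partly automated. The sketch below flags bullets in a worker answer that mention none of the files that were sent. The function name is hypothetical, and it assumes the worker returns bullets that start with "-" or "*"; it only proves a path was named, not that the claim is true, so opening the source is still required.

```python
def uncited_bullets(answer, sent_paths):
    """Return bullet lines that mention none of the files that were sent."""
    missing = []
    for line in answer.splitlines():
        stripped = line.strip()
        if not stripped.startswith(("-", "*")):
            continue  # only bullets are checked; headings and prose are ignored
        if not any(path in stripped for path in sent_paths):
            missing.append(stripped)
    return missing
```

An empty list is the minimum bar for accepting the answer into review, not a sign that the answer is correct.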
Setup is simple, but review the installation boundary
The reference path is short:
git clone https://github.com/imkunal007219/claude-coworker-model.git
cd claude-coworker-model
./setup.sh
export WORKER_API_KEY="your-key"
export WORKER_BASE_URL="https://api.moonshot.ai/v1"
export WORKER_MODEL="kimi-k2.5"
setup.sh creates a Python virtual environment under ~/.local/share/claude-coworker, installs the OpenAI dependency, writes ask-kimi and kimi-write into ~/.local/bin with a venv shebang, and symlinks extract-chat. That means the first review is local: where do the binaries land, which Python dependency is installed, and which shell profile exports the worker key?
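A small script can answer the first of those questions. This is a sketch, not part of the repo: it resolves the three tool names on PATH with the standard library and reads the first line of a script so you can see which interpreter it points at. The helper names are hypothetical.

```python
import shutil

TOOLS = ("ask-kimi", "kimi-write", "extract-chat")


def locate_tools(which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in TOOLS}


def missing_tools(found):
    """Return the names of tools that did not resolve."""
    return sorted(name for name, path in found.items() if path is None)


def read_shebang(path):
    """Return the first line of a script if it is a shebang, else None."""
    with open(path, "rb") as handle:
        first = handle.readline().decode("utf-8", errors="replace").strip()
    return first if first.startswith("#!") else None
```

Calling read_shebang on the resolved ask-kimi path should show an interpreter under the virtual environment described above; if it shows the system Python instead, the install did not land where the README says it does.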
Provider choice is an operational decision
The repo supports Kimi through Moonshot, DeepSeek, and local Ollama examples because the tools only need an OpenAI-compatible base URL, key, and model name. That flexibility is useful, but it should not be treated as a free swap. Each provider has different privacy terms, latency, reasoning behavior, context limits, and failure modes.
For a team, the worker model should start on non-sensitive files and read-only summaries. If the provider is external, do not send customer data or private transcripts until the data policy is explicit. If the provider is local Ollama, measure output quality before trusting it with docs or tests.
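One way to make that rule mechanical is to classify the configured endpoint before any file is sent. The sketch below reads the three WORKER_ variables and treats a localhost base URL as local. The function names and the "sensitive input only goes to a local endpoint" rule are my assumptions, and deliberately stricter than the text; a team with an explicit data policy for an external provider would relax it.

```python
import os
from urllib.parse import urlparse

REQUIRED = ("WORKER_API_KEY", "WORKER_BASE_URL", "WORKER_MODEL")
LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def load_worker_config(env=None):
    """Read worker settings and note whether the endpoint is on this machine."""
    env = os.environ if env is None else env
    missing = [key for key in REQUIRED if not env.get(key)]
    if missing:
        raise ValueError("missing worker settings: " + ", ".join(missing))
    host = urlparse(env["WORKER_BASE_URL"]).hostname or ""
    return {
        "base_url": env["WORKER_BASE_URL"],
        "model": env["WORKER_MODEL"],
        "is_local": host in LOCAL_HOSTS,
    }


def may_send(config, sensitive):
    """Assumed policy: sensitive input is allowed only for a local endpoint."""
    return config["is_local"] or not sensitive
```

The API key is checked for presence but never returned, so the config dictionary is safe to print or log.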
Copy the template, then tighten it
The CLAUDE.md.template gives the right first policy. Delegate multi-file analysis or search to ask-kimi. Delegate boilerplate, tests, config files, docstrings, or repetitive code patterns to kimi-write. Delegate session history review to extract-chat.
The same template also says what not to delegate: architecture decisions, complex logic debugging, safety-critical code, tasks under about 2,000 tokens of work, and exact-line editing. Keep that list. In many projects I would add two more rules: never delegate secrets, and never let kimi-write output land without a diff review by Claude Code or a human.
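The "never delegate secrets" rule is easier to keep with a pre-send scan. The sketch below is a heuristic, not a secret scanner: the three patterns are illustrative assumptions, they will miss many secret formats, and they will sometimes flag harmless lines. Treat a hit as a reason to stop and look, and a clean result as no guarantee.

```python
import re

# Illustrative patterns only; a real project should use a dedicated scanner.
SECRET_PATTERNS = {
    "private key block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "AWS access key id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "key-like assignment": re.compile(
        r"(api[_-]?key|secret|token|password)\w*\s*[:=]\s*\S{8,}",
        re.IGNORECASE,
    ),
}


def find_secret_hints(text):
    """Return (line_number, label) for every line that matches a pattern."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```

Run it on extract-chat output and on any file passed to ask-kimi; if the list is not empty, the file stays with the primary agent.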
Copy this first rule set
For the first week, do not copy every delegation rule blindly. Start with a small policy in CLAUDE.md:
When a task needs summaries across three or more files, use ask-kimi and ask for file-backed bullets.
When a task needs a docs update from a long Claude Code session, run extract-chat first, then ask the worker for proposed changes.
Do not use kimi-write on repository files yet. Write only to /tmp until the workflow is reviewed.
Do not delegate architecture, security review, complex debugging, secret-bearing files, or exact-line edits.
After every worker result, verify the answer against source files before editing.
That policy is intentionally narrower than the template. It lets the team learn whether the worker provider is reliable before any generated file lands in the repo. After a week, expand one rule at a time: first docs drafts, then tests, then other boilerplate.
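The first-week policy is small enough to write down as a routing function, which also makes it easy to see what changes when a rule is expanded. This is a sketch of the rules above, not something the repo ships; the task labels and the function name are hypothetical.

```python
import posixpath

PRIMARY = "claude-code"
NEVER_DELEGATE = {
    "architecture",
    "security-review",
    "complex-debugging",
    "secrets",
    "exact-line-edit",
}


def route(task_kind, file_count=0, target=None):
    """Return the ordered tools for a task under the first-week policy."""
    if task_kind in NEVER_DELEGATE:
        return (PRIMARY,)
    if task_kind == "summary" and file_count >= 3:
        return ("ask-kimi",)
    if task_kind == "docs-from-session":
        # Extract the transcript first, then ask the worker for proposals.
        return ("extract-chat", "ask-kimi")
    if task_kind == "draft" and target is not None:
        # Week one: kimi-write may only write under /tmp, never into the repo.
        if posixpath.normpath(target).startswith("/tmp/"):
            return ("kimi-write",)
    return (PRIMARY,)
```

Anything the function does not recognize falls through to the primary agent, so a new task class is never delegated by accident.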
A Codex-style workflow
Inside a Codex-style repo session, the pattern should look like a small subroutine, not a hidden second agent. Ask the worker for a bounded artifact:
ask-kimi --paths auth.py database.py utils.py \
  --question "List unvalidated inputs with file paths and likely risk"
kimi-write \
  --spec "Write pytest tests for auth.py covering OAuth2 flow" \
  --context tests/test_main.py \
  --target tests/test_auth.py
Then the primary agent reads the result, checks source lines, edits the final patch, and runs tests. The worker's answer is evidence to inspect, not a decision to obey.
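If the call is scripted rather than typed, keeping it a visible subroutine might look like the sketch below. It uses only the --paths and --question flags shown above. The helper names and the timeout are hypothetical, and it assumes ask-kimi prints its answer to standard output and exits non-zero on failure, which should be confirmed against the installed tool.

```python
import subprocess


def ask_kimi_argv(paths, question):
    """Build the ask-kimi command line from the flags shown in the article."""
    if not paths:
        raise ValueError("at least one path is required")
    return ["ask-kimi", "--paths", *paths, "--question", question]


def run_worker(argv, timeout=300):
    """Run a worker command and return its output for the primary agent to check."""
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=timeout, check=False
    )
    if result.returncode != 0:
        raise RuntimeError(f"{argv[0]} failed: {result.stderr.strip()}")
    return result.stdout
```

The return value is plain text handed back to the caller; nothing in the wrapper edits files or acts on the worker's answer.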
The implementation sequence
Treat the first setup as four gates, not as one install command.
Gate one is local install review. Run ./setup.sh, confirm ~/.local/bin/ask-kimi, ~/.local/bin/kimi-write, and ~/.local/bin/extract-chat exist, and confirm ask-kimi --help runs with the venv Python.
Gate two is provider configuration. Export WORKER_API_KEY, WORKER_BASE_URL, and WORKER_MODEL in a shell profile or project-local environment file that is not committed.
Gate three is routing. Copy CLAUDE.md.template into a branch and remove any delegation rule you do not trust yet.
Gate four is verification. Run one read-only ask-kimi command, compare its answer against source files, and log whether the answer saved enough primary-model context to keep using the pattern.
Only after those gates pass should kimi-write be allowed to create files. Even then, write to a scratch branch or temporary target first, inspect the diff, and run the repo's test command before accepting the output.
Documentation is the strongest use case
The template's mandatory documentation workflow is the most persuasive part of the repo. It says not to write docs directly. Instead, extract the latest Claude Code session, ask the worker to compare the transcript with existing docs, and have Claude apply exact changes.
That division is practical. Transcripts are long and noisy. extract-chat reduces them to readable user and assistant text. A worker can summarize drift or propose doc updates. Claude Code can then apply the actual patch with repo context. This is where worker delegation feels like a useful cost boundary rather than a stunt.
Do not sell the cost table as a benchmark
The README includes a strong results table: weekly limits not hit, session reading reduced, and low worker API cost. Treat those as the author's anecdote unless you reproduce them. A magazine article should not promise the same savings to every reader.
The responsible claim is narrower: if a session repeatedly spends most of its context on bulk reading or boilerplate, worker delegation can move some of that load out of the primary model. Whether it saves money depends on provider pricing, retry rate, output quality, and how often the primary model has to repair worker mistakes.
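That dependency can be written as simple arithmetic. The sketch below is a rough model under stated assumptions: prices are per million tokens, every retry resends the full input, and the tokens the primary model spends reading the worker's answer are folded into the repair figure. All inputs are yours to measure; none of them come from the README.

```python
def net_saving(offloaded_tokens, primary_price, worker_price,
               retry_rate, repair_tokens):
    """Estimate net saving from moving tokens to a worker.

    Prices are per million tokens; the result is in the same currency.
    A negative result means delegation cost more than it saved.
    """
    avoided = offloaded_tokens * primary_price / 1_000_000
    worker = offloaded_tokens * (1 + retry_rate) * worker_price / 1_000_000
    repair = repair_tokens * primary_price / 1_000_000
    return avoided - worker - repair
```

With placeholder numbers (not real provider prices), offloading 1,000,000 tokens at a primary price of 10 and a worker price of 1, with a 50% retry rate and 100,000 repair tokens, gives 10 - 1.5 - 1 = 7.5. Raise the repair load to 1,000,000 tokens and the same call returns -1.5, which is the case the README's table cannot rule out for your project.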
Failure modes are easy to miss
The obvious failure mode is lower-quality output. The less obvious one is misrouting. If CLAUDE.md says to delegate too broadly, Claude may send debugging, architecture, or security review to a model that lacks the full project context. If kimi-write writes directly to a target file, a weak prompt can create plausible but wrong code. If the extract-chat output for a full session is passed to an external provider, private project details may leave the machine.
Those risks do not kill the idea. They define the adoption boundary. Keep worker tasks small, use read-only summaries first, require diff review for generated files, and log which provider handled which class of work.
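The logging habit can start as one JSON line per worker result. The sketch below is one possible shape, with hypothetical field names and helper names; the point is only that provider, task class, and the accept or reject decision are recorded together.

```python
import json
from datetime import datetime, timezone


def log_entry(provider, task_class, accepted, note=""):
    """Build one JSON line describing a reviewed worker result."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "provider": provider,
        "task_class": task_class,
        "accepted": bool(accepted),
        "note": note,
    }
    return json.dumps(record, sort_keys=True)


def append_log(path, entry):
    """Append one entry to a local log file, one JSON object per line."""
    with open(path, "a", encoding="utf-8") as handle:
        handle.write(entry + "\n")
```

A week of these lines is enough to see which task class a given provider handles well and which one keeps getting rejected.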
A rollout I would accept
Start with one repo and one task class. I would begin with documentation, not code generation. Install the tools, configure a local or low-risk provider, copy the template into a branch, and run extract-chat plus ask-kimi against one finished session and one existing doc:
extract-chat ~/.claude/projects/my-project/session.jsonl -o /tmp/chat.txt
ask-kimi --paths /tmp/chat.txt docs/README.md \
  --question "List exact documentation updates with source evidence"
Have Claude Code apply only the changes it can verify. After that, try ask-kimi for multi-file reading. Leave kimi-write for last because it writes files. Once the pattern proves useful, add project-specific rules to CLAUDE.md: allowed paths, forbidden data, provider choice, review command, and the exact tests required before accepting worker output.
Save Claude Coworker as a methodology guide. It is valuable when worker models handle bounded bulk I/O and drafts under review; it is risky when teams turn anecdotal cost savings into a blanket rule for architecture, debugging, or safety-critical code.
Practical takeaway
Use Claude Coworker for one bounded workflow first. Install it, set WORKER_API_KEY, WORKER_BASE_URL, and WORKER_MODEL, copy the template into CLAUDE.md, and allow only ask-kimi for multi-file summaries during the first week. Verify each worker answer against at least two source files, log one accepted and one rejected worker result, and keep architecture, complex debugging, security review, and exact edits in Claude Code. Add kimi-write only after you have a diff-review habit and a test command that catches bad generated output.