The shallow version of this article would say Claude Code forgets and Claude Soul remembers. That misses the real engineering question.

The hard question is what kind of memory should be allowed to steer an agent. Claude Soul is not just a folder of notes. It installs an MCP server, writes files under ~/.soul, registers Claude Code hooks, stores memory in SQLite, extracts correction patterns, builds behavioral frameworks, and injects selected context at the start of later sessions.

That can be valuable. It can also make stale personal history louder than the repo in front of the agent. The article is worth saving because the source gives enough detail to write a real operating guide: where the state lives, which hooks run, what gets injected, how evidence tiers work, and when a team should refuse the automation.

Separate facts from behavior

Claude Soul has two different memory jobs. The first is factual memory: decisions, preferences, lessons, architecture notes, and journals stored in ~/.soul/data/memory.db. The second is behavioral memory: frameworks, corrections, shadows, lessons, tensions, and state files that influence how Claude responds.

Those should not be treated the same. A saved decision about an auth flow can be checked against the current code. A behavioral framework such as 'be concise' or 'question assumptions' changes the agent's stance before it reads the current task. That is more powerful and more dangerous.

A team adopting Claude Soul should write this rule first: factual memories are evidence; behavioral frameworks are suggestions; current repo instructions and current source code win during conflict.

Install it as a runtime change

The quick start is one command, but the command does real work. The init code creates ~/.soul, writes identity and correction files, writes config, registers an MCP server, copies hooks, and edits Claude Code settings.

Use a controlled first install:

node --version
claude --version
npx claude-soul init --starter --skip-identity
ls ~/.soul/files/
cat ~/.soul/config.json | head -20
claude mcp list | grep claude-soul
cat ~/.claude/settings.json | grep -A30 hooks

That last command matters. Do not trust the docs, the terminal success message, or the article. Verify what hooks were actually registered on this machine.

Troubleshoot before using memory

A failed install should stop the trial. Do not continue with half-registered memory.

# Claude CLI missing
claude --version
# If this fails, install Claude Code first.

# MCP server missing
claude mcp list | grep claude-soul
# If empty, register manually:
claude mcp add --scope user claude-soul -- npx claude-soul-server

# Soul files missing
ls ~/.soul/files/
# Expected: CORRECTIONS.md, SHADOW.md, SOUL.md, STORY.md

# Hook config missing
cat ~/.claude/settings.json | grep -A30 hooks
# If no Soul hooks appear, rerun init or add hooks manually from the docs.

# Memory database native dependency issue
npx claude-soul status
# If better-sqlite3 fails, run npm rebuild better-sqlite3 or use non-memory features until fixed.

Only after those checks pass should the user add the CLAUDE.md instruction that calls soul_context at session start.

Hooks are the real integration point

The README says Claude Soul runs through Claude Code extension points: MCP tools and hooks. The source shows why that matters. The init command registers Stop hooks for session processing, journaling, follow-up detection, memory indexing, and correction extraction. It registers a PostToolUse hook for scratchpad logging and a PreToolUse hook for Write and Edit protection around soul files.

That is a serious integration. The tool is not waiting for the user to remember to save notes. It watches session endings, tool usage, and file writes. That is exactly why it can build memory without much user effort.

It is also why a team needs an owner. Hooks can break, duplicate, or conflict with other Claude Code automation. A weekly check should include ~/.claude/settings.json, not only claude-soul status.

Context budget is not the same as control

The default config sets contextBudget.maxTokens to 4500. That sounds like a hard guardrail. The token-budget code is more subtle: tier-1 blocks are always included, then tier-2 blocks are packed if there is room, then tier-3 material is added if possible.

Tier 1 includes framework vocabulary, process frameworks, recent patterns, SOUL.md, CORRECTIONS.md, transformed SHADOW.md, and STATE.md. If those grow too large, the system can spend the whole context allowance before the lower tiers get a chance.

The practical rule is simple:

{
  "contextBudget": { "maxTokens": 3000 },
  "lessons": { "maxInjectCount": 1 },
  "exemplars": { "maxInjectCount": 1 }
}

Then inspect the actual soul_context output in a new Claude Code session. The metric is not the config value. The metric is what Claude sees before it starts the task.

Ship a conservative first config

Do not start with every automatic path enabled. After install, edit ~/.soul/config.json for the first week:

{
  "signals": { "enabled": true, "maxLogSizeKb": 50 },
  "stateEngine": { "enabled": true },
  "reflection": {
    "enabled": false,
    "quickSignalThreshold": 20,
    "deepSignalThreshold": 100,
    "quickModel": "haiku",
    "deepModel": "sonnet"
  },
  "lessons": { "enabled": true, "maxCount": 100, "maxInjectCount": 1 },
  "exemplars": { "enabled": true, "maxCount": 50, "maxInjectCount": 0 },
  "contextBudget": { "maxTokens": 3000 },
  "writeProtection": { "enabled": true }
}

That config keeps signal capture, state tracking, memory, and write protection on, but prevents reflection from changing frameworks before the team has read the first week of journals and corrections. After the review, enable quick reflection first. Leave deep reflection for later.

Evidence tiers are the main safety feature

The framework model is more interesting than the memory claim. Frameworks start as hypotheses, move to observed after external confirmation, and become validated after three or more external confirmations across different contexts. The docs also discount self-referential evidence at half weight so the system cannot fully certify itself.

That is the right shape for agent learning. Without evidence tiers, a memory system can turn one bad interaction into permanent instruction. With tiers, a team can ask whether a behavioral rule has real confirmation or is still a guess.

The adoption rule should be: no framework with only hypothesis evidence gets to override repo policy. Observed frameworks can guide tone and pacing. Validated frameworks can become explicit CLAUDE.md rules after human review.

Reflections mutate state

Reflection is not a report. The reflection runner creates snapshots, reads unconsumed signals, calls Claude programmatic mode, applies framework changes, writes lessons, updates tensions, appends reflection logs, and marks signals consumed. That means reflection changes the memory layer that future sessions will load.

That is fine for a personal assistant. It is risky in a shared engineering setup unless someone owns the resulting files. Before team use, pick a mode:

{
  "reflection": { "enabled": false },
  "stateEngine": { "enabled": true },
  "writeProtection": { "enabled": true }
}

Run manually for a week. If the journals and corrections are accurate, enable quick reflection. Save deep reflection for after the team has reviewed what the system is learning.

Semantic memory is optional, not magic

The README says semantic search uses local Ollama with nomic-embed-text. Without it, keyword search still works. The database code uses better-sqlite3 under ~/.soul/data/memory.db and throws a clear native dependency error if SQLite support cannot load.

That gives a clean rollout path:

npx claude-soul init --starter --skip-identity
npx claude-soul status
# optional semantic search
ollama pull nomic-embed-text
npx claude-soul index

Do not start by selling semantic recall. Start by proving that journals are written, memory search returns sane entries, and recall does not surface stale decisions as if they were current repo truth.

Correction tracking needs humility

The correction extractor is useful because it names failure modes: premature_done, robot_mode, permission_asking, scope_creep, confabulation, over_explaining, quality, authenticity, frustration, independence, and redirect. That vocabulary can make recurring agent problems visible.

It is still heuristic. It scans text patterns, scores directed corrections, filters some false positives, and writes correction logs. A single correction label should not become a verdict about a user, a developer, or an assistant. Trends matter more than isolated events.

Use it this way: if premature_done appears across many sessions, add a verification rule. If confabulation appears across many sessions, add a source-check rule. If one angry line gets classified, read the transcript before changing policy.

A one-week trial

A safe trial has explicit gates:

# day 0
npx claude-soul init --starter --skip-identity
claude mcp list | grep claude-soul
cat ~/.claude/settings.json | grep -A30 hooks

# day 1-2
npx claude-soul status
ls ~/.soul/journals/

# day 3-5
npx claude-soul shadow --brief
find ~/.soul/reflections -type f | tail

# day 7
npx claude-soul status

The after-one-week guide gives good health checks: 4-8 active frameworks, 40-70 percent survival rate, 0-20 pending signals, 0-5 unresolved follow-ups, and 5-7 journal entries. If active frameworks are zero, nothing is learning. If everything survives, the system is not discriminating. If pending signals stay high, reflection is not keeping up.

Promote memory in three steps

Treat the first week as data collection, not adoption. Promotion should happen in three steps.

Step one: inspect what was captured.

ls ~/.soul/journals/
cat ~/.soul/files/CORRECTIONS.md
npx claude-soul shadow --brief
npx claude-soul status

Step two: run a new Claude Code session and ask for soul_context. Copy the returned context into a review note. Mark every item as one of four types: identity, correction, framework, or fact.

Step three: promote only reviewed items into repo policy:

Promoted from Soul, 2026-06-01:
- Verified pattern: premature_done repeated across 6 sessions.
- Repo rule: before claiming done, run the named verification command and cite the output.
- Not promoted: tone preference from SHADOW.md; keep it personal, not repo policy.

This gives teams a clean path from private memory to shared engineering rule.

Write conflict rules into CLAUDE.md

The minimum README snippet tells Claude to call soul_context() at the start of every conversation. That is necessary but incomplete. A production repo should add conflict rules:

## Soul System
Call `soul_context()` at session start.
Treat Soul memories as historical evidence, not current truth.
When Soul context conflicts with AGENTS.md, CLAUDE.md, tests, current source, or user instructions, the current repo wins.
Do not promote a Soul framework into repo policy until a human reviews its evidence tier and examples.
If Soul recalls an old decision, cite it as past context and verify it against current files before acting.

That turns memory from hidden influence into an inspectable input.

Where it fits beside Codex

Claude Soul is built for Claude Code, but Codex readers should still care because it models a pattern every agent runtime needs: local memory as a managed subsystem. The portable lesson is not the exact MCP tool names. It is the boundary design.

A Codex memory layer should have the same pieces: visible storage path, explicit recall command, context budget, source labels, conflict rules, hook ownership, and a way to retire stale lessons. Without those, persistent memory becomes a second invisible prompt that nobody reviews.

When not to use it

Skip Claude Soul on shared machines where one user's memory can leak into another user's work. Skip it in regulated or client-sensitive repos until the team has reviewed what goes into ~/.soul/journals, memory.db, and reflection logs. Skip automatic reflection if the team cannot review framework changes.

Also skip it for short-lived experiments. A static CLAUDE.md is often enough for a one-day prototype. Claude Soul earns its cost when the same human, repo, or agent workflow repeats long enough for corrections and decisions to matter across sessions.

Save the Claude Soul article as a memory-layer operating guide. It is valuable when readers learn how to verify hooks, control context, label evidence, and set conflict rules; it is weak if reduced to a claim that Claude Code can remember more.

Practical takeaway

Do not install Claude Soul straight into a shared repo and trust the default story. First run npx claude-soul init --starter --skip-identity in a personal environment, verify ~/.soul, ~/.claude/settings.json, and the claude-soul MCP entry, then add CLAUDE.md conflict rules. Keep reflection conservative for one week, inspect journals and correction trends, and only promote validated frameworks into repo policy after human review.